Commit Graph

54 Commits

Author SHA1 Message Date
marfrit 125f800513 docs/PHASE3: re-review NIT fold-in — pipe-to-sh EOL, ci= note, §12 sync
Re-review surfaced one new BLOCKER + two CONCERNs + four NITs. Folded:

N1 BLOCKER: `|%s*sh%f[%s]` missed `curl x | sh` (end-of-string canonical
   wrapper-bypass — Lua's `%f[%s]` requires transition INTO whitespace,
   which doesn't happen at EOL). Replaced with two patterns each for
   sh and bash: `|%s*sh%s` (followed by whitespace/args) and
   `|%s*sh%s*$` (end-of-string). Same for bash. Verified against 18
   wrapper-bypass test cases — all canonical idioms now HALT.

N2 CONCERN: `ci=true` rule flag had no implementation note. Added one
   sentence to §5 explaining the matcher lowercases the input string
   when ci is set.

N3 CONCERN: §12 commit #5 description was stale — still said
   "extends interactive CMD: extraction to consult is_destructive"
   which contradicts the R-B3 resolution (Norris-only). Rewrote
   commit #5 description to match R-B3, and bundled the
   ffi/readline.lua `_bound[seq]:free()` removal into commit #5's
   scope with explicit "Phase 1 amendment" callout. Same for the
   §12 risk note that still referenced the dropped behavior change.

Other NITs (N4 skip threshold, N5 approved-turn mention, N6 :model
swap interaction, N7 commit-attribution wording) are cosmetic and
will fold in-flight during implement if material.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 22:45:25 +00:00
marfrit 91ddcb005d docs/PHASE3: review fold-in — security-layer BLOCKERs resolved
Independent review surfaced 3 BLOCKERs + 6 CONCERNs + 7 NITs against
the analyze-tier draft. Resolutions applied:

BLOCKERs:
  B1 Shell-wrapper bypass — static patterns leaked on bash -c, sh -c,
     eval, pipe-to-shell, python -c, xargs|rm. Added 9 wrapper
     patterns to §5. Norris HALTs on any wrapper invocation; user
     reads the inner before proceed. The patterns are the
     conservative floor against the wrapper bypass class.
  B2 LLM second-opinion was self-policing — same model class
     generating actions then judging them. Switched probe model
     from `fast` to `deep` (qwen3-30b). Added re-roll inversion:
     if first probe says NO, ask "is this SAFE?". Disagreement
     between two probes → HALT. Cheap independent-class insurance.
  B3 `is_destructive` would have run on interactive CMD: extraction
     — a PHASE0 §6/§10 substrate amendment in disguise. Resolved
     Q24: heuristic runs ONLY when norris_active == true. No
     substrate change; interactive `confirm_cmd` semantics unchanged.

CONCERNs:
  C1 Skip-budget: consecutive_user_skips counter; 3+ similar skips
     escalate to abort/force-proceed prompt.
  C2 Algorithm-vs-Q25-resolution contradiction: §4 reordered to
     dispatch ALL pending actions before checking GOAL: complete.
  C3 Norris-goal eviction: goal embedded directly in the dynamic
     system-prompt suffix; survives sliding-window eviction.
  C4 Readline use-after-free window: M.bind no longer frees old
     callbacks; pin for process lifetime (bounded memory cost).
  C5 GOAL: complete matcher: line-level scan, exact match after
     trim — substrate-aligned with CMD: rigor.
  C6 §4 step 4 tightened: auto_approve does NOT bypass destructive
     heuristic; tool_call without auto_approve still HALTs even
     when destructive-clear (Norris conservative).

NITs deferred or rolled into pattern table:
  - chown root-path pattern tightened (NIT 2 in-line)
  - Test corpus expansion noted in §12 commit #1 risk
  - Other NITs are wording-level

Status: Plan (review folded). Ready for commit #1 (safety static
patterns) once another review pass clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 22:42:58 +00:00
marfrit cf4d79dd9d docs/PHASE3: analyze + baseline — \C-n mechanics, LLM latency, module pre-state
Analyze findings folded into the manifest:

  A1. \C-n binding can't toggle mid-prompt without rl_insert_text /
      rl_redisplay. Solution: bind those (one cdef + 2 wrappers in
      ffi/readline.lua) so \C-n inserts ":norris " at the cursor; user
      types goal + Enter. Routes through existing meta dispatch.

  A2. broker has no max_tokens passthrough. Add opts.max_tokens for
      the LLM second-opinion path (terminates at ~2 tokens; verified
      proxy honors it).

  A3. Phase 2 tool-sub-loop pattern IS the planner shape. safety.norris_step
      is the per-iteration extraction; driver loop in repl.lua.

Module-changes table (§3) updated with the rl_insert_text and
max_tokens rows.

Baseline doc (PHASE3-baseline.md, 80 lines) captures:
  - LLM second-opinion latency: 425-1162ms per probe, all 5 test
    cases correct. Worst-case 16-step Norris = ~20s overhead; with
    static-pattern fast-path + session cache, ~5s realistic.
  - Module pre-state at commit f26cbd9 (Phase 2 tip): LOC + state
    per file before Phase 3 edits.
  - Six static-pattern Lua-match sanity checks (all correct).
  - Carries: aish#15 (still open), aish#14, aish#32/#33.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 22:37:58 +00:00
marfrit b58a842e49 docs/PHASE3: formulate — Norris autonomous mode + destructive-op gate
Phase 3 formulate manifest. Three pillars per PHASE0.md §11 row 3:
Chuck Norris autonomous mode (planning loop), destructive-op heuristic
(static patterns + LLM second-opinion), and HALT/confirm protocol.

Resolutions baked in via §2:
  Q2  iterative re-plan after each action (not top-down tree)
  Action sources    CMD: lines AND MCP tool_calls — Phase 2 contract honored
  HALT trigger      static-pattern hit OR LLM-second-opinion flag
  HALT shape        3-way: proceed / skip / abort
  Auto-approve under Norris  honors Phase 2 auto_approve policy
                             EXCEPT destructive-op heuristic always wins
  LLM second-opinion model   the `fast` preset (cheapest)
  Norris prompt suffix       appended to system prompt while active;
                             "GOAL: complete" sentinel for done

Key extensions:
  - safety.is_destructive: ~20 static shell-idiom patterns + LLM probe;
    runs on interactive CMD: extraction too (§9 — replaces bare
    confirm_cmd for known-destructive cases). Q24 worth challenging
    at analyze.
  - safety.norris_step: single-iteration of the planner. Driver loop
    in repl.lua. \C-n toggle (real binding, replaces Phase 1
    placeholder); :norris <goal> explicit launch.
  - renderer.norris_begin/step/halt/end: visual parity with exec
    and tool_call frames. Prompt becomes [aish:fast ]> per
    PHASE0.md §9.
  - context.to_messages dynamically appends NORRIS MODE suffix
    when norris_active.

New open questions (Q23–Q30) tracked in §11:
  Q23 LLM second-opinion latency budget (caching mitigation)
  Q24 interactive CMD: also subject to is_destructive? (proposal: yes)
  Q25 GOAL: complete + pending actions in same response — dispatch first
  Q26 context preservation on abort/done/budget — all preserve
  Q27 :norris continue (resume after abort) — deferred to v2
  Q28 side-effect MCP tools not in *__shell/*__write_file patterns
  Q29 goal-implies-authorization for destructive ops — no, always confirm
  Q30 :norris no-arg vs \C-n share goal-prompt path — yes, trivial

Module-layout (PHASE0 §4) untouched — all changes are growth of
existing files. 6 commits expected at implement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:45:03 +00:00
marfrit f26cbd9a3a phase2 amend: __ separator (Bedrock-safe) + post_sse error diagnostics
Phase 7 verify finding from TC #26 against :model cloud:
  HTTP 400 from openrouter→Amazon Bedrock:
  "tools.0.custom.name: String should match pattern
   '^[a-zA-Z0-9_-]{1,128}$'"

Anthropic via Bedrock validates tool names against that regex and
rejects dots. PHASE2 originally chose "." as the namespace separator
("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not.

Separator switched to "__" (two underscores) everywhere — internal
API matches on-wire shape, no transformation layer:

  - repl.lua:
    - tools_schema builds "alias__name"
    - dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __)
    - :mcp tool parser uses same split
    - :mcp tools formatter prints "alias__name"
    - HELP block shows <alias__name>
  - safety.lua confirm_tool_call: alias.* glob → alias__* glob
  - config.lua example block: keys rewritten
  - docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua
    row, §5 wire-shape JSON examples, §6 auto_approve schema, §7
    meta-cmd table, §12 plan all updated. Original "." references
    preserved in commit history.

Constraint: aliases must not themselves contain "__" so the parse
stays unambiguous. Tool names from MCP servers may have underscores
freely.

Second fix bundled — uninformative broker error:
  Previously "broker error: transport: HTTP response code said error"
  Now      "broker error: transport: HTTP 400: {full body snippet}"

ffi/curl.lua M.post_sse changes:
  - FAILONERROR no longer set (was hiding the response body).
  - raw_body accumulator added alongside the SSE buffer; captures
    every byte regardless of SSE shape.
  - After perform, check status_code via curl_easy_getinfo. On >=400,
    return (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged.
  - End-of-stream SSE flush only runs on 2xx (no false event on
    error bodies that aren't SSE-shaped).
  - Phase 1 callers reading just first return slot stay correct.

End-to-end verified:
  - :model cloud + tools=[boltzmann__read_file ...] +
    "Use boltzmann__read_file with path=/etc/hostname" →
    Claude emits tool_call with name="boltzmann__read_file",
    args='{"path": "/etc/hostname"}'. ok=true, transport clean.
  - Force-bad tool name "bad.name.with.dots" → err string carries
    the full bedrock 400 with the regex-pattern message visible.

TC #26 (sub-loop end-to-end) is now testable against cloud — the
error that blocked it is resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:04:57 +00:00
marfrit 3fa6279f5b repl: :mcp tool — disambiguate "no alias" vs "unknown alias" errors
Surfaced by Phase 7 verify test case #29: typing :mcp tool list_dir (no
dot) printed "unknown alias: nil" instead of a useful diagnostic. The
parse failure was being conflated with the alias-not-found case.

Now:
  :mcp tool list_dir          -> tool name missing alias prefix: list_dir
  :mcp tool unknown_alias.x   -> unknown alias: unknown_alias
  :mcp tool known_alias.bogus -> unknown tool: known_alias.bogus

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 18:55:01 +00:00
marfrit 09800d192a config: Phase 2 mcp example block + deep model switch
Phase 2 commit #7 (final) per docs/PHASE2.md §12. Two changes bundled:

(1) commented-out mcp = {...} example block (~40 lines) at the end of
    config.lua showing the Phase 2 schema:
      - mcp.servers — alias → {url, auth_token | auth_env}
      - mcp.auto_approve — "<alias>.<tool>" or "<alias>.*" globs
      - mcp.max_tool_depth — sub-loop budget per ask_ai turn
    The block is OFF by default; uncomment + adjust per fleet to
    activate. Documentation-only; no behavior change to existing
    configs (mcp_sessions stays empty, tools_schema() returns [],
    broker omits the field — full Phase 1 compatibility).

(2) User-authored: deep model preset switched from
    mistral-nemo-12b-instruct to qwen3-30b-a3b-instruct, with a 10-min
    timeout_ms accommodating the larger model's RK3588 inference time.
    Reason: nemo backend is dormant per the proxy /v1/models discovery
    (aish#23 now returns 404 cleanly for unknown models instead of
    silent fallback); qwen3-30b is the practical "deep" alternative.

Phase 2 implementation is now complete — 7 of 7 commits landed:
  #1 6c194de  mcp.lua + ffi/curl status_code + PHASE0 §4 amendment
  #2 0fde77f  safety.lua confirm_tool_call
  #3 7c221a8  context.lua tool turns + use_tool_role fallback
  #4 c736d0e  renderer.lua tool-call frames
  #5 efdc728  broker.lua opts.tools + tool_call accumulator
  #6 7e9cfff  repl.lua sub-loop + :mcp meta + system-prompt block
  #7 (this)   config.lua example + deep model switch

Next phase-loop step: verify (Phase 7). Files written are wired and
isolated-tested; end-to-end model-driven verification waits on either
a more compliant model or explicit forcing of tool_calls from the
prompt — known to be marginal with the loaded qwen-1.5b but proven
correct against direct probes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 15:40:21 +00:00
marfrit 7e9cfff04d repl: tool-call sub-loop + :mcp meta + system-prompt augmentation
Phase 2 commit #6 per docs/PHASE2.md §12. End-to-end wiring of the MCP
tool-call flow on top of broker/safety/context/renderer/mcp.

repl.lua additions:
  - mcp_sessions table populated from config.mcp.servers at startup.
    connect_mcp() helper does initialize + caches tools/list. Failures
    status-logged once; absent from mcp_sessions until manual reconnect
    (C4 — no auto-retry).
  - tools_schema() flattens connected sessions' tools into the OpenAI
    {type:"function", function:{name,description,parameters}} shape with
    "<alias>.<name>" namespacing.
  - flatten_content() concatenates content[type="text"] blocks; one-shot
    status warning when non-text blocks (image/resource) are dropped
    (§4 normative spec, v1 only handles text).
  - dispatch_tool_call(name, args_table) splits alias.tool, looks up
    session, calls. Returns (content_string, is_error). Errors of every
    flavor (missing alias, no session, rpc_error, transport_error)
    yield a synthesized "[aish] ..." string so callers always have a
    body for the role:"tool" turn — alternation preserved per C5/C7.
  - ask_ai rewritten as a sub-loop that re-issues the broker request
    until the model returns pure text or max_tool_depth (default 8) is
    hit. Each iteration: stream response → if tool_calls present,
    confirm-gate each → dispatch → append role:"tool" turn → continue.
    Argument-JSON parse failure produces a synthesized tool turn (C7).
    Decline at confirm produces "[aish] tool call declined by user"
    tool turn (alternation guarantee).
  - :mcp meta with sub-commands: list / tools / tool <a.n> / connect
    <url> [alias] / disconnect <alias>. HELP block extended.

context.lua: DEFAULT_SYSTEM_PROMPT grows by ~4 lines per PHASE2.md §8
(hybrid prompt: static frame about MCP + dynamic tools list in the
request body). Block is always present even when no MCP servers
configured — ~60 tokens for clarity that 'CMD:' remains the fallback.

CMD: extraction unchanged — runs on the FINAL pure-text response only
(not on intermediate iterations of the tool sub-loop). Substrate §3
invariant preserved.

End-to-end verified two ways:
  (1) Direct broker probe: aish's tools_schema fed through
      broker.chat_stream against hossenfelder → qwen-1.5b emits one
      tool_call payload with correct id + name="boltzmann.list_dir"
      + args='{"path":"/tmp"}'. Accumulator stitched the JSON-string
      across fragmented deltas.
  (2) Mocked-broker sub-loop test: ask_ai feeds 'list /tmp', mock
      emits text + tool_call, sub-loop dispatches against LIVE
      boltzmann lmcp (auto_approve via policy), 80+ files rendered
      inside the tool_call frame, broker re-invoked with the
      extended context, mock returns pure text, sub-loop terminates.
      Total broker invocations: 2.

Known: the loaded fast model (qwen-1.5b) tends to emit "CMD: ..."
suggestions even when an MCP tool is the better path; the small
model's system-prompt compliance is weak. Larger models and the
analyze-time direct probe confirm the tools_schema and tool_calls
flow is wire-correct — Phase 7 verify will exercise this against
qwen3-30b or cloud models when available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 15:20:42 +00:00
marfrit efdc7281c7 broker: opts.tools passthrough + streaming tool_call accumulator
Phase 2 commit #5 per docs/PHASE2.md §12. Streaming broker grows
tool-call support without taking a dependency on mcp.lua (caller
supplies the tools array — B5 from review).

chat_stream signature widens to (cfg, msgs, on_delta, opts):
  opts.tools  - optional array, passed to the request body as the
                OpenAI-shape tools field. OMITTED entirely when nil or
                empty (#tools == 0) — some servers reject "tools": [].

on_delta callback shape widens to (kind, payload):
  kind = "text",      payload = string         (Phase 1 path; unchanged
                                                semantics, signature
                                                changes from (delta) to
                                                ("text", delta))
  kind = "tool_call", payload = {id, name, arguments}
                                                emitted ONCE per call on
                                                finish_reason "tool_calls"
                                                after the streaming
                                                accumulator pulls
                                                fragmented JSON-string
                                                arguments together.

Accumulator behavior:
  - Keyed by delta.tool_calls[i].index.
  - If index is absent on a delta (some llama.cpp builds omit it on
    single-call streams; C2 in review), default to 0 with a one-shot
    stderr debug status per stream.
  - id and name captured from the opening delta of each slot.
  - function.arguments concatenated across all deltas as the raw
    JSON-string; caller (repl.lua / future Phase 2 commit #6) does
    dkjson.decode.
  - On finish_reason "tool_calls" the accumulator emits all collected
    calls in index order and resets.

M.chat external contract unchanged (C1): wrapper now uses the new
(kind, payload) shape internally but exposes the same text-string
return. No caller of M.chat passes opts.tools so tool_call kinds are
silently dropped.

repl.lua minimal companion edit: ask_ai's chat_stream callback updated
to the new shape. Text path unchanged; tool_call kinds are no-op
placeholders until commit #6 lands the sub-loop. Keeps Phase 1 streaming
functional between #5 and #6.

Smoke-tested against hossenfelder/8082 (post-#23 fix):
  - text-only: ok=true, kind="text" deltas received
  - with opts.tools: model emitted one tool_call,
    accumulator collected id + name=get_weather + args={"city":"Paris"}
    correctly across fragmented deltas
  - opts.tools={}: server accepted (field omitted as required)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:20:32 +00:00
marfrit c736d0e129 renderer: tool-call begin/end frames
Phase 2 commit #4 per docs/PHASE2.md §12. Adds M.tool_call_begin(name, args)
and M.tool_call_end(content, is_error) for visual parity with the existing
exec_begin/exec_end frame.

Visual cadence:
  ─── tool: <name (cyan)> ───
  <args, dim, truncated at 200 chars; omitted if empty/"{}">
  <content>
  ─── ok ───            (dim, success)
  ─── error ───         (red status word inside dim rule, on is_error=true)

Same rule glyph (━) and ANSI palette as the exec frame so the user reads
tool dispatch and shell dispatch the same way.

Smoke-tested all five shapes: success with args / empty args / error /
long args truncated / empty content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:11:42 +00:00
marfrit 7c221a8aae context: tool turns + tool_calls on assistant; use_tool_role fallback
Phase 2 commit #3 per docs/PHASE2.md §12. Three concrete edits per §3
context.lua row (the BLOCKER-fold-in from review):

  (a) Loosen Context:append shape-per-role: assistant may carry empty
      content if tool_calls is non-empty; role:"tool" requires
      tool_call_id + content.

  (b) Preserve tool_calls / tool_call_id on store (Phase 1 :append
      built {role, content} only and silently dropped extras).

  (c) Extend to_messages() with two emission modes selected by
      use_tool_role:
        true  (default) — OpenAI-standard role:"tool" + assistant
          turns with tool_calls (wrapped as {id, type:"function",
          function:{name, arguments}}).
        false (fallback) — collapse assistant-with-tool_calls + its
          following role:"tool" turns into a single assistant text
          turn with synthesized "[tool: name]\n<args>\n[result]\n
          <content>" body; merge consecutive assistant turns so the
          trailing post-tool-result text doesn't yield asst/asst
          back-to-back (same strict-template gotcha PHASE0.md §6
          warned about for user/user).

Alternation assert added (N4): role:"tool" turns must trace back
through zero-or-more prior tool turns to an assistant-with-tool_calls.
Catches sub-loop bugs at append time. Orphan tool turns rejected.

pending_exec_output behavior unchanged per §3 row: buffer persists
across tool-call sub-loops, flushes on next genuine user turn (B4).

Smoke-tested §12 verify-row #3:
  (i)   default mode round-trip — 5 OpenAI-shape messages, tool_calls
        + tool_call_id preserved.
  (ii)  fallback mode round-trip — collapsed into 3 messages
        (system/user/assistant), tool_calls + role:"tool" not emitted.
  (iii) multi-call: 2 tool_calls in one assistant turn followed by 2
        tool replies, both modes render correctly.
  (iv)  orphan tool turn after user — assertion fires.
  (v)   B4: pending_exec_output survives a tool sub-loop, flushes on
        next :append_user.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:10:47 +00:00
marfrit 0fde77fe35 safety: confirm_tool_call gate with auto-approve policy
Phase 2 commit #2 per docs/PHASE2.md §12. Implements just the per-call
confirm-gate surface; Phase 3 stubs (is_destructive, norris_step) stay
unimplemented with their error() bodies.

M.confirm_tool_call(name, args, cfg) checks cfg.mcp.auto_approve for:
  - exact match on "<alias>.<tool>"
  - "<alias>.*" glob covering a whole server

Miss falls back to a [y/N] readline prompt. Empty or non-"y" answer
rejects (matches the existing confirm_cmd UX from PHASE0 §10).

Pretty-printing renders args as compact JSON, truncated at 80 chars
with "..." suffix so one-line prompts stay readable.

Smoke-test passes all eight cases per §12 verify-row #2:
  exact match / alias glob → auto-approve, no prompt
  miss + y / n / empty / nil-cfg → prompt shown, expected verdict
  empty args / long args → clean rendering, truncation works

Note: PHASE0 §4 module-layout had a "lands in Phase 2" hint on the
norris_step stub; the actual landing is Phase 3 per PHASE0 §11 row 3.
Comment in safety.lua updated to clarify.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:07:57 +00:00
marfrit 6c194deea0 mcp: JSON-RPC client + ffi/curl status_code; PHASE0 §4 amended
First commit of Phase 2 per docs/PHASE2.md §12. Three changes bundled:

mcp.lua (new, 153 lines):
  - M.connect(url, opts) returns a Session.
  - Session:initialize() round-trips initialize + notifications/initialized
    + tools/list. Caches tools for session lifetime (lmcp announces
    capabilities.tools.listChanged = false; no refetch).
  - Session:list_tools() returns the cached tool list.
  - Session:call_tool(name, args) returns (result_table, kind) where
    kind ∈ {"ok", "handler_error", "rpc_error", "transport_error"} per
    the §4 error split. Folded HTTP-level failure into transport_error.
  - Per-server Bearer auth via opts.auth_token or opts.auth_env env-var
    indirection.
  - Captures protocolVersion mismatch as a warning string rather than
    aborting (lmcp doesn't negotiate — N3 in review).

ffi/curl.lua extension:
  - Add curl_easy_getinfo to ffi.cdef.
  - Pre-cast as getinfo_long; helper get_response_code() fetches
    CURLINFO_RESPONSE_CODE (decimal 2097154 = CURLINFOTYPE_LONG | 2).
  - M.post now returns (body, status_code) on transport success;
    (nil, errmsg) on libcurl failure stays unchanged. Phase 1 callers
    reading only the first slot are unaffected.

docs/PHASE0.md §4:
  - Insert `mcp.lua` between broker.lua and router.lua per PHASE2.md §9.
  - Module-stability invariant clarified: rename prohibition is what
    matters; adding new files is additive.

Smoke-test passes for all four kinds against boltzmann lmcp v0.5.4:
  - initialize: ok (7 tools cached)
  - list_dir /tmp: ok (1.2KB content)
  - read_file /nonexistent: ok (boltzmann's baseline §3 quirk —
    isError:false even on failure; content is authoritative)
  - nope_tool: rpc_error (code=-32601)
  - wrong auth: transport_error (HTTP 401)
  - unreachable host: transport_error (DNS failure)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:06:39 +00:00
marfrit f5daa6afc0 docs/PHASE2: re-review NITs — M.post shape, getinfo cdef, content flattening normative
Three follow-up NITs from the post-fold-in review:

  (1) Disambiguate M.post return shape: (body, status_code) on transport
      success regardless of status; (nil, errmsg) on libcurl failure
      stays unchanged. Phase 1 callers reading only the first slot are
      unaffected.

  (2) Note that the M.post extension requires extending ffi.cdef to
      include curl_easy_getinfo + CURLINFO_RESPONSE_CODE (decimal
      2097154, CURLINFOTYPE_LONG | 2) and a long[1] out-param shim.
      Implementation detail the commit #1 author will need.

  (3) Move the tool-result content-flattening rule from §12 risk note
      into §4 normative spec (forward-referenced both ways) — §4 is
      where a future reader looking for the tool-invocation contract
      will scan.

No design changes; clarifications only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:02:35 +00:00
marfrit d3570ccea4 docs/PHASE2: review fold-in — 5 BLOCKERs + 7 CONCERNs + key NITs
Independent review of the formulate+analyze+plan draft surfaced design
gaps that would have shipped as silent bugs. Resolutions applied:

BLOCKERs:
  B1 context.lua impact widened — Phase 1 :append asserts content and
     discards extra fields. Need (a) shape-per-role assert, (b) preserve
     tool_calls/tool_call_id on store, (c) emit from to_messages().
  B2 ffi/curl.M.post extended to return (body, status_code). lmcp's
     401 returns a non-JSON-RPC body that would have been mis-decoded.
  B3 §3 typo schema -> inputSchema.
  B4 pending_exec_output × tool-call sub-loop interaction specified.
  B5 §3/§12 broker dependency contradiction — broker takes opts.tools
     from caller; no layering inversion.

CONCERNs:
  C1 M.chat return polymorphism dropped (no consumer).
  C2 tool_calls[].index absent fallback: default to 0.
  C3 Re-injection stores accumulated text, not hard-coded empty.
  C4 :mcp connect failure: no auto-retry, status-log once.
  C5/C7 JSON-RPC error AND argument-parse failure both synthesize a
     role:"tool" turn — keeps strict-template alternation legal
     exactly the way PHASE0 §6 demanded for exec output.
  C6 §9 confirms §4 amendment is additive (preserves §3 invariant).

NITs:
  N3 protocolVersion fallback (lmcp doesn't negotiate).
  N4 Alternation assert in Context:append.
  N7 Model-routing bug filed as aish#23.
  N8 Day-one fallback test for use_tool_role=false in commit #3.

Manifest status: Plan (review folded). Status line and Resolutions
sections updated; commit-by-commit roadmap reflects revised specs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:00:07 +00:00
marfrit 447e430254 docs/PHASE2 §12: implementation plan — 7-commit roadmap
Bottom-up: mcp.lua → safety.lua → context.lua → renderer.lua → broker.lua
→ repl.lua → config.lua. Same cadence as Phase 0/1.

Risks called out explicitly:
- Empty tools array → omit field entirely (some servers reject [])
- isError:false on actual failure (baseline §3 finding) → pass content
  through regardless; let model read error text
- JSON-RPC error from tools/call → aish status only, no tool turn
  appended, no model recovery
- max_tool_depth=8 cap on tool-call sub-loop
- Argument JSON streaming may yield malformed JSON → status warn + skip
- Q18 fallback (use_tool_role=true default; prefix-injection plumbed
  but dead-coded; verify can flip)
- Connect-at-startup is sequential (~30ms × N); fine for N≤3

Two items left open for review: Q18 default flip vs ship-true-flip-on-fail,
and whether :mcp connect should re-fetch tools after the initial cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 12:37:27 +00:00
marfrit c5116bf129 docs/PHASE2-baseline: pre-implementation measurements
Phase 7 (verify) anchor. Captures:

- MCP RPC round-trip timings against boltzmann lmcp v0.5.4 (all sub-100ms
  on LAN; LLM is the latency floor, not the transport).
- 6 fixture responses saved to /tmp/aish-baseline/ covering initialize,
  notifications/initialized, tools/list, tools/call success, isError,
  and JSON-RPC unknown-tool error.
- Baseline design finding: boltzmann's read_file returns isError:false
  even on failure (error text in content). aish should treat content as
  authoritative, isError as advisory; feed both to the model. PHASE2.md
  §4's "pass-through" stance already accommodates; no manifest amendment
  needed.
- Streaming tool_calls delta shape verified against hossenfelder; matches
  PHASE2.md §5.
- Pre-MCP aish behavior snapshot: loaded model emits markdown code-fence
  ignoring the CMD: contract — once MCP tools exist the model gets a
  structured path that doesn't depend on prose-formatting compliance.
- Module pre-state at Phase 1 head 5878f73: LOC + capability snapshot
  per module so Phase 2 diff has a reference frame.
- Two boltzmann-proxy blockers (SSE buffering, model-field routing)
  carried explicitly into Phase 7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 12:34:32 +00:00
marfrit 5878f7347b docs/PHASE2: analyze — lmcp v0.5.4 probed, transport simplified
Live-probed against lmcp v0.5.4 (boltzmann) + hossenfelder broker proxy:

Transport simpler than spec:
- lmcp only implements POST-per-RPC with Connection: close; no held-open
  SSE channel. Combined with capabilities.tools.listChanged=false, no
  client-side listener is needed in v1. Drops the planned M.get_sse
  addition to ffi/curl.lua — Phase 1's M.post covers MCP.

Bearer auth is universal across the fleet — config schema grew
auth_token (literal) and auth_env (env-var indirection) fields per
server, mirroring PHASE0 §10's key_env convention.

Streaming tool_calls delta shape verified — accumulator by `index`,
function.arguments arrives as chunked JSON-string. Matches the
formulate-phase assumption in §5.

Resolutions:
  Q17 transport abstraction — POST-only, no SSE channel for lmcp.
  Q21 error mapping       — result.isError (model-recoverable, feed
                             back as tool turn) vs JSON-RPC error
                             (unknown method/tool, transport-level).
  Q18 role:"tool" turn    — accepted at protocol level (live-probed).
                             Mistral-nemo template verification
                             blocked by the hossenfelder model-field
                             routing bug; full closure carried to
                             Phase 7 verify.

Open-end recorded in §11: the hossenfelder proxy routes every request
to the loaded fast model regardless of model field, blocking Phase 2
testing against mistral-nemo specifically. Parallel to the SSE
buffering issue at marfrit/aish#15; same root (boltzmann proxy code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 09:51:03 +00:00
marfrit ec6793c93c docs/PHASE2: formulate — MCP client + tool-calling bridge
Phase 2 formulate manifest. Three pillars per PHASE0.md §11 row 2:
mcp.lua (JSON-RPC 2.0 over HTTP+SSE, target: lmcp), tool-calling bridge
(OpenAI tools field <-> MCP tools/call), and the safety.lua
authorization gate (per-call confirm + auto_approve policy).

Resolves PHASE0.md §13 Q6–Q10:
  Q6  CMD: + tool-calls coexist; substrate §3 unchanged
  Q7  config-declared servers + runtime :mcp connect
  Q8  per-call confirm default, auto_approve policy in config
  Q9  hybrid system prompt: static frame + dynamic tools body field
  Q10 streaming-from-day-one on Phase 1 SSE; on_delta widens to (kind, payload)

New questions tracked in §11 (Q17–Q22): transport abstraction, role:tool
vs prefix injection (mistral-nemo template verification needed), large
tool-result handling, parallel dispatch, error mapping, aish-as-MCP-server
(parked).

§4 module layout amended: mcp.lua slots between broker.lua and router.lua.
The amendment is documented in this manifest; the actual §4 table edit
lands when implementation starts (Phase 2 implement phase).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 09:23:53 +00:00
marfrit f7c3c32aa2 .claude: project-shared permission allowlist for read-only MCP/Bash
Adds .claude/settings.json — 10 read-only entries (mcp__*__read_file,
mcp__hub-tools__remote_list_hosts, Bash(ping *), Bash(dig *)) auto-allowed
in any aish session, reducing per-call permission prompts during routine
file-reading and host probing. Generated via /fewer-permission-prompts.

settings.local.json stays user-private (per-user ad-hoc grants); .gitignore
now covers it so it doesn't accidentally land in commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 08:08:26 +00:00
marfrit 7d62eb5659 review followups: pcall shield, :resume guard, shell quoting, nits
CONCERNs from the Phase 1 review pass:

ffi/curl.lua:
  - SSE write_cb body is now pcall-wrapped. A Lua error in on_event (or
    in the parse loop itself) is captured into cb_error and surfaced
    after curl_easy_perform rather than propagating across the FFI
    callback boundary (which LuaJIT documents as process-fatal). The
    EOS flush path gets the same shield. Errors return
    (nil, "callback: <msg>") from post_sse.

history.lua:
  - sh_singlequote() escapes shell metacharacters; the mkdir -p and
    ls -1 shell-outs no longer double-quote (where $(...) and $VAR
    still expand) — single-quote with embedded-' escaping is the
    safe form.
  - M.load now returns (turns, meta) instead of (meta, turns). turns
    is ALWAYS a table on success, never nil-when-no-header; failure
    path is the unambiguous (nil, err). Callers can `if not turns
    then` without the previous ambiguity. repl.lua :resume updated
    to the new shape.

repl.lua :resume:
  - Refuse to resume into a non-empty ctx — silent overwrite was the
    Q15 default, but the review surfaced the no-undo / no-warning
    failure mode. User must :reset (or :save then re-launch) to
    express intent. The current session's on-disk log is unaffected
    either way.

NITs:
  - ffi/libc.lua READ_BUF: comment noting it's module-shared and
    Phase 1 has no reentrant readers; revisit when that changes.
  - PHASE1.md §7: \C-x\C-c reservation pinned to Phase 3 ("deferred
    from Phase 1 — no consumer here") rather than the previous
    dangling "(or here)".

Regression suite verifies:
  - history.load new signature on success + failure paths
  - shell-quoted history.dir with $ doesn't trip
  - aish scripted run: ctx with 2 turns refuses :resume anchor with
    a clear status; user must :reset first

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:05:23 +00:00
marfrit 1f1065157e review BLOCKER: PTY input forwarding + raw mode toggle
Phase 1 review caught a structural gap: executor.exec only drained the
PTY master fd, never forwarded user keystrokes — vim/less/htop/nano
would render and hang on input. PHASE1.md §5 specified bidirectional
multiplex but only the read leg landed. tcgetattr/tcsetattr were also
missing, so even with input forwarding the parent's line discipline
would buffer until newline (breaking single-key UIs).

ffi/libc:
  - struct termios opaque buffer + tcgetattr/tcsetattr + cfmakeraw
  - M.set_raw(fd) saves termios + applies cfmakeraw; returns saved or
    (nil, err) when fd isn't a tty (scripted / piped-stdin runs)
  - M.restore_termios(fd, saved)
  - struct pollfd + M.poll (POLLIN constant)

executor:
  - multiplex(sess): poll(stdin, master); reads master on any revents
    (POLLHUP fires when child closes its slave end, not POLLIN — the
    revents != 0 check catches both); forwards stdin keystrokes to
    master; loop exits when master read returns 0 (EOF / child gone)
  - stdin polling is only enabled when stdin_is_tty (set_raw succeeded);
    piped-stdin runs (tests / scripted) would otherwise drain queued
    aish commands into the child of the *current* cmd, swallowing them
  - raw mode is restored before returning so the user lands back at the
    aish prompt in canonical mode

renderer + repl:
  - exec_output(out, code) split into exec_begin() (top rule, before
    spawn) + exec_end(code) (closing rule with exit, after wait). PTY
    multiplex streams the body live to stdout in between; the renderer
    never re-prints the body.

PHASE1.md §3:
  - tcgetattr/tcsetattr changed from "optional" to "required for
    single-key UIs to work — done-criteria #2"; poll added to the libc
    row description.

Verified:
  - non-interactive smoke (echo / false / exit 7 / ls /nonexistent /
    printf multi-line) — all exit codes correct, output streamed live,
    a\nb\nc\n preserved byte-for-byte
  - scripted-stdin run reaches all expected lines (no stdin draining
    into a non-interactive child)
  - aish prompt + framed exec block + exit-code line all render in
    correct order

Live interactive verification (vim / less / htop in a real terminal)
still needs a user-test pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:00:53 +00:00
marfrit a75118b2ae readline: bind() via rl_bind_keyseq; repl reserves \C-n no-op
Phase 1 readline binding wiring per PHASE1.md §7.

ffi/readline:
  M.bind(seq, lua_fn) -> bool
    Wraps lua_fn as a C callback (signature `int (int, int)` per
    readline's rl_command_func_t) and registers it via
    rl_bind_keyseq(seq, cb). Returns true on success (rl returns 0).
    Trampolines are pinned in module-local state so they outlive the
    bind call — readline retains the function pointer for the process
    lifetime. Rebinding the same seq frees the previous trampoline.
    Bound handlers are pcall-wrapped so a Lua error doesn't crash
    readline's input loop.

repl:
  Binds \C-n to a no-op that emits
    "[aish] Norris mode not yet implemented (Phase 3)"
  Verifies the mechanism end-to-end; Phase 3 (Norris autonomous mode)
  replaces the body with the actual toggle.

Smoke covers bind / rebind-same-seq (exercises the :free path) /
bind-different-seq with no errors. Live keyboard verification waits
on user-test.

Phase 1's 8(+1) inner loop is now functionally through `implement`;
next inner phase is `verify` (review pass) followed by memory-update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:26:58 +00:00
marfrit 9d586870e8 repl: session persistence wiring — auto-log, :save, :resume, :sessions
Phase 1 session log integration per PHASE1.md §6.

On every M.run(), open a session file at
  <config.history.dir>/sessions/<utc-iso8601>.jsonl
with a meta header (started, model, aish_version). If history.dir is
unset or unwritable, status-log the disable and continue without
persistence.

ask_ai logs the merged user turn (after pending exec output is folded
in) and the assistant turn (after streaming completes). run_shell does
NOT log [exec output] — that becomes part of the next user turn when
ctx.pending_exec_output is flushed.

New meta commands:
  :sessions       list session files; "*" marks the active one
  :save <name>    rename current session log to <name>.jsonl (auto-
                  appends .jsonl); reopens for continued append
  :resume <name>  load <name>.jsonl into ctx (replaces current turns
                  via ctx:reset + append loop). The current process's
                  own session log is unaffected — Phase 1 chooses
                  per-process logs over chained continuations.

:quit and EOF (Ctrl-D) both close the session file via shutdown_session
before exiting.

HELP text updated (no longer "Phase 0:" header since meta set has
grown). Q15 noted in PHASE1.md §10 (resume into non-empty context) is
resolved by the ctx:reset() in :resume — silent overwrite for Phase 1,
revisit if anyone cares.

End-to-end live verified: chat -> auto-log; :save renames; :sessions
listings; :resume + :history shows the round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:23:05 +00:00
marfrit 87316f8345 history: JSONL session log — open, append, load, list_sessions
Phase 1 persistence per PHASE1.md §6.

  history.open(path, meta?) -> session | (nil, err)
    parent dir auto-created; meta line written iff file is new/empty so
    reopening a session doesn't duplicate the header
  session:append(turn)
    JSON-encoded line, fh:flush after every write (no fsync — Q16
    tracks the policy if it ever bites)
  session:close()
  history.load(path) -> meta, turns | (nil, err)
    skips unparseable lines (e.g. partial trailing write from a crash);
    distinguishes the meta-header line from role/content turn lines
  history.list_sessions(dir) -> [basename, ...]
    sorted (ISO 8601 names lex-sort chronologically); no mtime / turn
    counts in Phase 1 — that's a Phase 4 :sessions UI concern

Smoke:
  - open, append 3 turns, close, list_sessions sees 1 file
  - load returns meta (model="fast") and 3 turns in order
  - corrupt tail (partial JSON line appended) is silently skipped on load
  - reopen with different meta does NOT duplicate the header line

Repl wiring (`:save`, `:resume`, `:sessions`, auto-write on quit) lands
in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:19:21 +00:00
marfrit a722f576ac repl + renderer: streaming assistant output (Phase 1)
repl.ask_ai now drives broker.chat_stream and pumps each delta into
renderer.assistant_delta(delta) as it arrives. renderer.assistant_flush
is called when the stream ends to add a trailing newline if missing.
The full reassembled response is then handed to executor.extract_cmd_lines
for the CMD: confirm-and-execute path (unchanged from Phase 0).

renderer.assistant() is kept for non-streaming callers (none in tree
right now, but cheap to keep around). assistant_delta/flush share no
state with assistant(); they use a module-local stream_buf that tracks
the in-progress streamed block.

Q12 deferred: incremental CMD: highlighting (cursor-positioning re-
render on flush) is not implemented in Phase 1 — deltas emit raw. The
§6 CMD: marker is still extractable on the reassembled string post-
stream, which is what executor cares about. Renderer's bold+cyan
treatment for CMD: lines stays available via M.assistant().

Broker error / SSE-framed api-error path still pops the user turn and
restores ctx.pending_exec_output. Order: assistant_flush always runs
(even on error) so the cursor lands on a fresh line before the broker-
error status renders.

Live verification: `Count one to ten` against hossenfelder fast streams
deltas through to stdout incrementally; CMD: extraction works on the
reassembled string; confirm gate intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:17:27 +00:00
marfrit e46a5c385d broker: chat_stream over post_sse; chat is now a buffering wrapper
Phase 1 streaming consumer per PHASE1.md §3.

  broker.chat_stream(model_cfg, messages, on_delta) -> true | (nil, err)
  broker.chat(model_cfg, messages)                  -> content | (nil, err)
                                                       (now a thin buffer over
                                                        chat_stream)

The HTTP shape unifies on stream:true. on_event from ffi/curl.post_sse
decodes each event's JSON, extracts choices[1].delta.content, and calls
on_delta(content) for non-empty string deltas. The `[DONE]` sentinel is
filtered. SSE-framed error envelopes ({"error":{"message":...}} arriving
as data:) surface as "api: ..." errors.

build_request is factored out so chat_stream and (future) any
non-streaming consumer share URL/body/header construction.

Live verification against hossenfelder fast preset:
  - chat_stream("Count one to five..."): 9 incremental deltas streamed
    token-by-token, assembled to "1 2 3 4 5"
  - chat("Reply with exactly: pong"): "pong" returned via buffer

Error envelope path is correct by inspection but not exercised live —
hossenfelder passes through bogus model names rather than rejecting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:16:07 +00:00
marfrit 2e36381576 ffi/curl: SSE streaming via post_sse — incremental data: events
Phase 1 streaming substrate per PHASE1.md §4.

  curl.post_sse(url, body, headers, on_event, timeout_ms)
    -> true | (nil, errmsg)

Reuses the Phase 0 WRITEFUNCTION hook. Each chunk delivery accumulates
into a per-request buffer; the buffer is drained for complete events
(\n\n-terminated). Each event's `data: ...` field(s) are joined per the
SSE spec and passed to on_event(data_string) synchronously. `:` comment
lines (keepalives) are filtered.

The `[DONE]` sentinel is passed through to on_event as-is (broker.lua
filters it — this module stays HTTP-layer only, no JSON / OpenAI shape
knowledge).

Two robustness items:
  - End-of-stream flush: the final event may lack \n\n if the server
    closes-on-EOF immediately after the last data: line (some llama.cpp
    builds, plain HTTP/1.0 close-on-EOF feeds). Post-perform, any
    remaining buffer is parsed as one last event.
  - FAILONERROR: a non-2xx response surfaces as a CURLcode error rather
    than silently feeding the error body into the SSE parser.

Smoke:
  [1] canned events via nc listener: 3 events parsed in order
  [2] chunk-split mid-event ("Hel" + sleep + "lo..."): correctly
      reassembled across two WRITEFUNCTION deliveries
  [3] LIVE against hossenfelder.fritz.box:8082 fast preset with
      stream:true: response "pong" assembled from incremental deltas;
      4 raw events (role + 1 content + finish_reason + [DONE])

Next: broker.lua chat_stream that decodes the OpenAI delta shape on
top of this and exposes on_delta(content_string) for renderer streaming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:14:54 +00:00
marfrit ee4d7f86d6 executor: swap popen+sentinel for pty.spawn (Phase 1)
Replaces the Phase 0 io.popen + sentinel-echo exit-code recovery with
forkpty + waitpid via ffi/pty. The §7 amendment paragraph on PHASE0.md
is rewritten to point at PHASE1.md §5 — the workaround is gone, not
just renamed.

User-visible behavioral changes:
  - Interactive commands (vim, less, htop, top) now work via $cmd /
    :exec / known-command shell paths because the child has a real
    PTY for line discipline.
  - Exit codes are accurate: `false` -> 1, `exit 7` -> 7, signal kill
    -> 128+N (bash convention), shell parse error -> sh's 2.
  - Broken-shell-syntax cmd now shows the actual sh diagnostic
    (e.g. "Syntax error: end of file unexpected") instead of Phase 0's
    "(no output — possible shell parse error)" guess.
  - Output normalization: PTY emits CR LF; executor collapses \r\n
    -> \n to keep the Phase 0 contract ("output uses \n separators").

Code path:
  pty.spawn(cmd) -> drain master_fd until EOF
                 -> wait() returns ("exit", N) | ("signal", N) | ...
                 -> exit_code mapped: exit -> N, signal -> 128+N, else -1

Phase 0 invariants intact: `cd` interception unchanged (still libc.chdir
per §3 + §7), `CMD: ` extraction unchanged.

PHASE0.md §7: the "LuaJIT 2.1 popen-close caveat" paragraph is rewritten
to "Superseded by Phase 1" — points at PHASE1.md §5 for the live model.
The illustrative sketch is left in place as historical context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:08:27 +00:00
marfrit 10d2fc5ac1 ffi/pty: forkpty-backed spawn + session handle
Phase 1 PTY substrate per PHASE1.md §5. Replaces Phase 0's io.popen
sentinel-echo path with a real PTY so interactive cmds (vim, less,
htop) work and exit-status comes from waitpid instead of parsing a
sentinel out of stdout.

API:
  pty.spawn(cmd) -> session | (nil, err)
  session:read(count)   -> (data, n)   ; n == 0 means EOF
  session:write(data)   -> bytes
  session:close()                       ; closes master_fd; child gets SIGHUP
  session:wait(options) -> (kind, val)  ; "exit"/"signal"/"other"/nil
  session:signal(sig)   -> ok           ; kill(pid, sig)

Child branch execs `/bin/sh -c cmd`, preserving Phase 0's shell-
interpretation semantics (quoting, redirection, pipes still work).
The PTY makes vim/less/htop functional because the child gets a real
tty for line discipline instead of a pipe.

Loader uses the versioned-soname fallback idiom (util / util.so.1 /
util.so.0) so a runtime-only host without libutil-dev works.

Smoke covers: echo hello (exit 0), false (1), exit 7, bogus binary
(sh's 127), multi-line printf, cat bidirectional (write ping -> read
echo+cat output -> close master -> child exits via SIGHUP).

Next: executor.lua swap from popen+sentinel to pty.spawn. That commit
also retires the §7 amendment paragraph (no longer needed once popen
is gone).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:03:19 +00:00
marfrit 113f87125a ffi/libc: phase 1 syscalls — waitpid + raw fd I/O + kill
Extends Phase 0's chdir/errno/strerror with the syscalls that ffi/pty
needs to drive a forkpty'd child: waitpid (with WIFEXITED / WEXITSTATUS
/ WIFSIGNALED / WTERMSIG decoders), read, write, close, kill.

Status-word macros are reproduced from glibc bits/waitstatus.h using
the LuaJIT `bit` library. M.waitpid returns a structured (kind, value)
rather than the raw status word — callers don't have to know the
encoding:
  "exit",   N    — normal exit, N is exit code
  "signal", N    — killed by signal N
  "other",  raw  — stopped/continued (Phase 1 doesn't trace those)
  nil, err       — syscall failure

M.read / M.write / M.close / M.kill mirror their syscall return shape
with errno-string surfacing on failure. Read uses a shared 4 KiB
buffer for the common case; larger reads allocate a fresh buffer.

Smoke covers the chdir regression (still works), all four status
decoders against known status words, pipe round-trip for read/write/
close, EOF -> ("", 0), invalid-fd close -> false, kill(self, 0)
success, kill(bogus, 0) failure.

waitpid is not exercised by the smoke (needs a real child); that
arrives with ffi/pty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:58:35 +00:00
marfrit 539408f480 phase1 formulate: scope, tech decisions, module changes, open questions
Inner-loop Phase 1 (formulate) deliverable for the milestone Phase 1 of
the aish project. Drafts docs/PHASE1.md to specify what lands on top of
the Phase 0 substrate — no code changes, no §3 invariant amendments.

Phase 1 milestone scope per PHASE0.md §11:
  1. SSE streaming via libcurl FFI (existing WRITEFUNCTION hook)
  2. PTY-backed exec via forkpty(3); replaces popen + retires the §7
     sentinel exit-code workaround in favor of waitpid
  3. Session persistence as append-only JSONL under
     <config.history.dir>/sessions/<utc>.jsonl
  4. Readline custom bindings (rl_bind_keyseq); Phase 1 reserves \C-n
     as a no-op for Phase 3's Norris consumer

Module growth (no new file names beyond the §4-stubs):
  ffi/curl     -> M.post_sse(url, body, headers, on_event)
  ffi/pty      -> M.spawn / read / write / close / wait
  ffi/libc     -> waitpid + WEXITSTATUS + tcgetattr/tcsetattr
  ffi/readline -> M.bind(seq, fn)
  broker       -> M.chat_stream; M.chat becomes a buffering wrapper
  executor     -> PTY path; sentinel hack deleted
  repl         -> :save, :resume <name>, :sessions; streaming render
  renderer     -> assistant_delta + assistant_flush
  history      -> open / append / load / list_sessions

Open questions Q11–Q16 (six new) tracked in §10:
  - SSE shape uniformity across OpenRouter routes (Q11, Phase 7)
  - CMD: highlight-on-stream strategy (Q12, plan phase)
  - tty raw-mode recovery on Lua error (Q13, plan phase)
  - bind \C-n now or defer to Phase 3 (Q14, plan phase)
  - :resume into non-empty context (Q15, plan phase)
  - session-log fsync policy (Q16, default close-only; tracked)

Next inner phase is "analyze": for each module change, identify
dependencies + risks + per-commit ordering. Then baseline (capture
Phase 0 behaviors we want to preserve), plan, review, implement, verify,
memory-update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:56:20 +00:00
marfrit 16490e6905 fix: buffer exec output for next user turn; alternation for strict templates
User-test surfaced the bug: with `deep` (mistral-nemo-12b) active,
running `list files` -> y on `CMD: ls` -> `Are there directory entries
beginning with "lor"?` returned a Jinja exception:

    api: ... Error: Jinja Exception: After the optional system message,
    conversation roles must alternate user/assistant/user/assistant/...

Cause: §6 specified "exec output injected into context uses role 'user'
with a prefix tag '[exec output]'." This works for permissive templates
(qwen2.5-coder-1.5b, the `fast` preset) but produces a back-to-back
user/user pair on strict templates that enforce the OpenAI alternation
contract — `[exec output]` user turn followed by the user's actual
follow-up question.

Fix:

context.lua:
  - new field `pending_exec_output` (initially nil)
  - new method `:append_exec_output(out)` buffers (concat on subsequent
    captures so multi-shell-then-ai still merges everything)
  - new method `:append_user(content)` flushes buffered exec output as
    a `[exec output]\n...\n\n` prefix and appends a user turn
  - `:reset()` also clears the buffer

repl.lua:
  - run_shell calls ctx:append_exec_output(out) instead of
    ctx:append({role="user", content="[exec output]\n"..out})
  - ask_ai calls ctx:append_user(text) instead of raw :append; saves
    prev_pending so a broker error can restore the buffer for retry

PHASE0.md §6:
  - amended the role-injection paragraph to describe the buffer-and-
    prepend policy; the §3 invariants list is untouched (this was a §6
    design detail, not a locked invariant)

Verification:
  - context unit tests cover: alternation after the failing sequence,
    multi-shell merge, reset clears buffer, broker-error retry path
  - live reproduction against `deep` (mistral-nemo) of the exact
    user-reported sequence succeeds; model responds with a sensible
    `CMD: ls | grep '^lor'` instead of a Jinja exception

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:41:21 +00:00
marfrit 8870eb0451 config: route all presets through hossenfelder per issue #12
Resolves issue #12 by partial-accept of the recommendation.

What landed:
  - Single broker URL: http://hossenfelder.fritz.box:8082 for all three
    presets (fast / deep / cloud). Server-side model-aware routing; no
    client-side cloud auth (proxy holds the OpenRouter bearer).
  - Models from hossenfelder's /v1/models inventory:
      fast  -> qwen2.5-coder-1.5b-q4_k_m.gguf  (boltzmann local)
      deep  -> mistral-nemo-12b-instruct        (boltzmann local)
      cloud -> anthropic/claude-haiku-4.5       (OpenRouter route)
  - `cloud` was already pointing at hossenfelder but with https://; flipped
    to http:// so it matches the proxy's actual scheme.

What deferred:
  - Schema rename `models` -> `brokers` (and the 5-cloud-preset shape
    suggested in #12) — would touch repl.lua + broker.lua. Not blocking
    Phase 7. If multi-preset becomes useful in practice, file a separate
    issue for the rename then.

Phase 7 verification (live broker test):
  - broker.chat(fast, [user="say pong"]) -> "CMD: echo pong" in ~3s
  - multi-turn arithmetic (7*8=56, *2=112) preserved across turns

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:06:08 +00:00
marfrit a76ff664b3 phase0 amendment: §3/§7/§10 close review-surfaced manifest gaps
Three additions to PHASE0.md, all surfaced by the Phase 5 review of
the Phase 0 implementation. No invariant changes; manifest now matches
implementation reality.

§3 — FFI loader fallback paragraph. ffi.load("name") needs the
unversioned `libname.so` symlink that comes with the -dev package.
Phase 0 loaders try unversioned first then versioned sonames so
runtime-only hosts (no -dev) work as-is. Documents the actual
behavior in ffi/readline.lua and ffi/curl.lua.

§7 — LuaJIT 2.1 popen-close caveat paragraph. The §7 sketch had been
showing Lua 5.2's three-return io.popen():close() shape; LuaJIT 2.1
follows the Lua 5.1 ABI and returns just `true`. Phase 0 recovers
the exit status with a sentinel echo (`echo __AISH_EXIT_<tag>__$?`).
Phase 1 PTY+waitpid replaces the hack and the sketch becomes
accurate. Sketch left as-is (it's the right shape conceptually);
caveat now explicit.

§10 — cwd-relative package.path note. Phase 0 prepends `./?.lua;
./vendor/?.lua`, so aish must run from the repo root. Cwd-independent
resolution is a later concern. Also clarifies that --config is strict
(no fallback if the path is unopenable) — matches main.lua post the
review-followup commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 17:44:20 +00:00
marfrit abc993aa49 review followup: empty-input guards, ~/ symmetry, CMD: filter
Addresses three concerns + one nit from the Phase 0 review pass.

executor.lua:
  - M.exec guards empty / whitespace-only cmd up front, returns
    "(empty command)" / -1 instead of running the wrapper on nothing.
  - On sentinel-parse failure with empty output (typical of shell
    parse errors — the syntax error itself escapes to the popen
    parent's stderr because 2>&1 is inside the unparsable subshell),
    surface "(no output — possible shell parse error)" rather than
    a silent empty frame.
  - extract_cmd_lines now skips whitespace-only / empty bodies; a
    bare `CMD: ` line in assistant output no longer turns into an
    "execute ''? [y/N]" prompt.
  - "what" comments cleaned in maybe_chdir.

router.lua:
  - path_like now matches `~` and `~/foo` so `~/scripts/build.sh`
    classifies as shell (was: ai). Restores symmetry with executor's
    maybe_chdir, which already expands `~` on `cd`.

repl.lua:
  - :exec and :ask trim args and renderer.status a usage line on
    empty rather than running an empty cmd / sending an empty turn
    to broker.

Regression: full prior smoke suite still passes — known_commands
shell paths, all maybe_chdir branches, CMD: extraction with non-empty
bodies, exec exit-code recovery, all router branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 17:41:35 +00:00
marfrit a18e530c03 main: --config/--help arg parsing, vendor on package.path, REPL start
Phase 0 entry point per PHASE0.md §4, §10.

Resolves the §10 config search:
  --config <path>          (explicit; failure if not openable, no fallback)
  $AISH_CONFIG
  ~/.config/aish/config.lua
  ./config.lua

The explicit form now hard-fails instead of silently falling through to
the next candidate — caught in smoke (`--config /nonexistent` was loading
./config.lua).

Pre-pends `./?.lua;./vendor/?.lua` to package.path so `require("dkjson")`
finds vendor/dkjson.lua and project requires resolve from the repo root.
Run from the repo root; cwd-independent resolution lands later.

`--help` prints the usage block. Unrecognized arg exits 2 with a
diagnostic on stderr.

Phase 0 done-criteria (PHASE0.md §2):
  ✓ shell command execution with framed output
  ✓ :meta commands (full §5.2 set)
  ✓ in-memory conversation history with sliding-window eviction
  ✓ codebase layout matches §4 — every module name stable for Phase 1+
   live AI exchange — structurally wired; live test deferred per
     issue #12 (broker endpoint hostname not resolvable from noether)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 17:12:25 +00:00
marfrit e0e69f839b repl: readline loop, dispatch, all Phase 0 meta commands
Phase 0 implementation per PHASE0.md §5, §9.

Wires the lower-half modules into a single REPL:
  ffi/readline -> input + history
  router       -> classify(line) -> meta/shell/ai
  executor     -> run_shell with cd interception, frame output, capture
  broker       -> ask_ai, then extract+confirm CMD: lines from response
  context      -> turn list + eviction; status line on evict
  renderer     -> assistant text + exec frame + status

Prompt format `[aish:<model>]> ` per §9.

Meta commands all wired (§5.2): :quit/:q, :clear, :reset, :model <name>,
:models, :history, :exec <cmd>, :ask <text>, :help. Unknown meta names
report via renderer.status rather than crashing.

End-of-input (Ctrl-D on empty line) breaks the loop cleanly. Empty /
whitespace-only lines are skipped silently before dispatch — router
would otherwise classify them as ai with empty payload and pollute
context.

`CMD: ` extraction + confirm-and-execute is wired: when broker returns
an assistant turn, the response is scanned for §6 CMD: lines; each is
prompted via readline ("execute '...'? [y/N]") when config.shell
.confirm_cmd is true (default), else auto-executed.

On broker error, the user turn just appended is popped so the context
isn't polluted with a turn that has no assistant response.

Smoke covers :help, :models, shell exec via known_commands allowlist,
and Ctrl-D break. Live broker exchange deferred per issue #12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:17:40 +00:00
marfrit f22a3b33c8 renderer: assistant text, exec output frame, status line
Phase 0 minimal output formatting per PHASE0.md skeleton.

  M.assistant(text)              — line-by-line; `CMD: ` lines bold+cyan
  M.exec_output(output, code)    — top/bottom rules; exit code on closing
                                   rule (red on non-zero)
  M.status(line)                 — dim "[aish] ..." single-liner

ANSI table is local to the module (no external dep). Trailing-sentinel
pattern ((text..\"\\n\"):gmatch(\"([^\\n]*)\\n\")) preserves blank lines
in assistant output rather than squashing them, at the cost of one
extra trailing newline — acceptable for Phase 0. Real syntax-aware
formatting (tree-sitter) lands in Phase 6.

Smoke verifies escape codes are emitted (od -c shows \\033[1m\\033[36m
around CMD: line) and the visual layout looks right.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 14:42:56 +00:00
marfrit f9f8b0370c broker: blocking POST /v1/chat/completions via ffi/curl + dkjson
Phase 0 implementation per PHASE0.md §6.

  M.chat(model_cfg, messages) -> content_string | (nil, errmsg)

Builds the OpenAI-compat JSON body:
  { model, messages, stream: false, temperature: model_cfg.temperature ?? 0.2 }

Sends Content-Type and (optionally) Authorization Bearer pulled from
model_cfg.key_env's process environment. Default timeout 60s; overridable
per-model via model_cfg.timeout_ms.

Error surfaces split:
  "transport: ..."  curl-side (TCP/TLS/timeout)
  "decode: ..."     non-JSON response body
  "api: ..."        OpenAI-style { error: { message } } envelope
  "broker.chat: no choices[1].message.content..."  shape miss

Tested against four canned mock responses (nc -lN listener feeding
HTTP/1.0 + Connection: close so EOF terminates the body): happy path,
api error envelope, raw-text non-JSON, empty choices[]. The on-wire
request body verified as well: POST path, headers, model/messages/
temperature/stream JSON.

Live test against a real llama.cpp/hossenfelder endpoint deferred per
issue #12 (broker endpoint configuration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 14:10:00 +00:00
marfrit 91187d2302 router: classify(line, config) -> (kind, payload)
Phase 0 implementation per PHASE0.md §5.

Pure function. Three kinds:
  "meta"  — line starts with ":", payload is the rest
  "shell" — line starts with "$" (override, $ stripped), OR first word
            is in config.shell.known_commands, OR first word is
            path-like (`./`, `../`, `/`)
  "ai"    — everything else (including empty / whitespace-only; the
            repl loop skips empty payloads before dispatching)

Path-like detection is deliberately conservative in Phase 0: anchored
prefixes only, no quoted-path or shell-glob handling. Q4 in §13 tracks
multi-command CMD: blocks; this router doesn't see those (it only
classifies user input lines, not assistant output).

Smoke covers all branches plus a nil-config fallthrough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 12:05:25 +00:00
marfrit 5fb4023c55 executor: io.popen wrapper, cd interception, CMD: extraction
Phase 0 implementation per PHASE0.md §6, §7.

  M.exec(cmd)              -> (output, exit_code)
  M.maybe_chdir(cmd)       -> nil | true | false, errmsg
  M.extract_cmd_lines(text)-> { "ls -la", "echo hi", ... }

Two non-obvious bits:

1. LuaJIT 2.1's io.popen():close() follows the Lua 5.1 ABI and returns
   only `true` — no child exit status. The §7 manifest sketch assumes
   Lua 5.2's three-return form, which doesn't apply here. Recover the
   exit code by appending `; echo __AISH_EXIT_<tag>__$?` after the
   command and parsing the sentinel-prefixed integer back out. Phase 1
   replaces this with waitpid via libc FFI when PTY support lands.

2. `cd` interception is a §3 invariant: must not delegate to popen
   (popen forks; a child cd evaporates). maybe_chdir parses the line,
   ~ expands, calls libc.chdir, returns success/failure separate from
   "not a cd" (nil) so the caller can distinguish.

CMD: extraction is anchored at start-of-line per the §3 "exact prefix,
single space" invariant — leading whitespace before CMD: does not match.

Smoke covers: echo capture (code=0), failed ls (code!=0), `false`
(code=1), multi-line output preserved, all maybe_chdir branches
(non-cd / bare / explicit / ~ expansion / failure), CMD extraction
including the leading-whitespace-rejection case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 12:03:19 +00:00
marfrit 10848645af context: in-memory turn list + max_turns sliding-window eviction
Phase 0 implementation per PHASE0.md §6, §8.

Context.new(opts) constructs with the §6 default system prompt (the
`CMD: ` extraction contract is hard-coded in there per §3 — locked
substrate, do not edit). opts overrides: system_prompt, max_turns
(default 40), token_budget (default 4096; visibility only in Phase 0
per Q1, deferred to Phase 3 for accurate tokenization).

API:
  ctx:append({role, content})    record a turn
  ctx:to_messages()              [{system,...}, ...turns] for broker.chat
  ctx:enforce_budget()           evict pairs (user+assistant) until
                                 #turns <= max_turns; returns count
  ctx:estimate_tokens()          char/4 heuristic
  ctx:reset()                    drop all turns (system_prompt kept)

System prompt is the §6 phrasing verbatim including the `CMD: ` clause
— stored on the context, NOT in self.turns, so it is prepended freshly
on every to_messages() call.

Smoke covers basic ops, no-evict-at-max, evict-on-overflow, bulk
eviction (14 turns -> 4), reset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:59:25 +00:00
marfrit 5fd7c7ac63 ffi/curl: blocking POST with header list and response capture
Phase 0 binding per PHASE0.md §6. M.post(url, body, headers, timeout_ms)
uses CURLOPT_{URL, POST, POSTFIELDS, HTTPHEADER, WRITEFUNCTION, NOSIGNAL,
TIMEOUT_MS, USERAGENT} on a fresh easy handle, capturing the response
into a Lua string via a closure-based WRITEFUNCTION callback.

curl_easy_setopt is variadic; LuaJIT's variadic FFI dispatch needs
ffi.new() per argument otherwise. Pre-cast to three concrete signatures
(long / void* / const char*) bypasses that — cleaner and matches the
lua-curl idiom.

Robust loader: tries `curl`, `curl.so.4`, `curl-gnutls.so.4` so a
runtime-only host (no libcurl-dev installed) just works. Same idiom
as ffi/readline.

Smoke against a local nc listener: request was correctly framed
(POST path, Content-Type + X-Test headers, Content-Length matches
JSON body length) and the canned response was captured into the
returned Lua string.

SSE streaming for Phase 1 reuses this same WRITEFUNCTION hook —
chunks arrive incrementally, the closure consumes them as they come.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:54:36 +00:00
marfrit c9116c9bbf ffi/readline: blocking readline() + add_history(), nil on EOF
Phase 0 binding per PHASE0.md §9. M.readline(prompt) returns the line
as a Lua string (the C buffer is freed via libc free immediately after
ffi.string copies it) or nil on EOF. M.add_history skips empty lines.

Loader handles the case where libreadline-dev's unversioned
`libreadline.so` symlink isn't installed — falls through to
`readline.so.8` (current Debian/Arch ALARM) and `.so.7` (older)
before giving up. This trips on noether-the-LXD: only the runtime
package is present.

Smoke (stdin from heredoc, two lines + EOF):

  p1> hello world  -> "hello world"
  p2> second line  -> "second line"
  p3>              -> nil (EOF)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:44:38 +00:00
marfrit fd63dff65e ffi/libc: implement chdir, errno, strerror
Smallest Phase 0 module per CLAUDE.md §4 implementation order.
M.chdir(path) returns (true) or (false, errmsg) — errmsg via
strerror(__errno_location()[0]). Glibc errno is thread-local
behind __errno_location() rather than a plain global, hence the
indirect access.

Verified against PHASE0.md §7 expectation: a libc.chdir() persists
across subsequent io.popen() calls (popen's child inherits the
parent's wd), which is the property executor.lua relies on for `cd`
interception. Smoke:

  libc.chdir("/tmp"); io.popen("pwd"):read("*l")  --> /tmp

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:35:17 +00:00
marfrit 2704edd57d phase0 amendment: vendor dkjson 2.8 under vendor/
Captures the JSON-library decision noted as open in CLAUDE.md §6.
dkjson is pure Lua (preserves §3's "no compiled extensions" invariant),
single file, redistributable (MIT/X11). Sourced from Debian's `lua-dkjson`
package (/usr/share/lua/5.1/dkjson.lua, version 2.8) — Debian's curated
copy of the upstream at dkolf.de.

Vendoring (rather than relying on a system lua-dkjson install) keeps
aish self-contained per the §3 "no luarocks packages" invariant: any
host with luajit can run the tree as-is.

PHASE0.md §3 grows one row recording the choice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:30:16 +00:00
marfrit 7b5d58686e docs: codify contribution flow — issues for features, PRs for review
Captures two carve-outs to aish's "non-PR-flow repo" default:

- Feature requests and bugs go to git.reauktion.de/marfrit/aish/issues
  rather than direct-implement-in-band. Tag `architecture` for cross-
  phase concerns. Aligns with the fleet-wide bug-filing convention from
  the `his` cheatsheet; this row extends it to features for aish.
- Review-required iteration opens a PR (authored as claude-<host>,
  marfrit reviews, self-approval forbidden). PR #1 was the precedent.

Both are opt-in; direct-to-main remains the default for autonomous
work that doesn't need a feedback loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:04:51 +00:00
marfrit fcfc23eef2 phase0 review: tighten phase 2 row + add Q9, Q10, sharpen Q6 (#1) 2026-05-10 11:00:35 +00:00
claude-noether e1d1931006 phase0 review: tighten phase 2 row + add Q9, Q10, sharpen Q6
Captures three findings from the review of 013c625 ("phase0 amendment:
insert MCP phase 2"). Opening as a PR rather than direct-to-main: the
non-PR-flow convention works fine for autonomous work, but feedback-
required iteration needs a readable medium that isn't the Claude Code
transcript.

§11 phase 2 row: spell out two scope items the original row left implicit —
the system-prompt rewrite to declare the tools schema (Phase 0's `CMD:`
contract is hard-coded into the prompt) and `safety.lua` extension to
gate tool calls (per Q8).

§13 Q6: explicit note that choosing "retire `CMD:`" requires a §3
invariant amendment in the same commit — keeps the substrate-vs-phase
boundary honest. Adds (§3 if retiring) to the impact column.

§13 Q9 (new): MCP system-prompt augmentation locus — static block in
broker.lua / per-request assembly from connected servers / hybrid.
Real architectural call with token-cost tradeoff per option.

§13 Q10 (new): tool-call streaming vs the Phase 1 SSE substrate —
phase-ordering question. Either Phase 2 lands on the blocking Phase 0
broker and refits when SSE arrives, or Phase 1 SSE moves before MCP
so tool-call deltas stream from day one.
2026-05-10 06:06:14 +00:00