# aish - Phase 2 Manifest

**Project:** aish (AI-augmented conversational shell)
**Document:** Phase 2 Requirements, Architecture & Design Decisions
**Status:** Analyze (formulate complete; live-probed against lmcp v0.5.4 + hossenfelder proxy)
**Date:** 2026-05-12

PHASE0.md is the locked substrate; PHASE1.md is layered on top. This manifest specifies what Phase 2 adds. Section numbers reference back to PHASE0.md / PHASE1.md where relevant.

---

## 1. Scope of Phase 2

Three pillars per PHASE0.md §11 row 2:

1. **MCP client** (`mcp.lua`): JSON-RPC 2.0 over the MCP streamable-HTTP transport (POST-only in v1; see §2 and §4). Target reference implementation: `lmcp`. Operations needed for v1: `initialize`, `tools/list`, `tools/call`. Multiple servers may be connected concurrently; tools are namespaced `<alias>.<tool>`.
2. **Tool-calling protocol bridge**: the broker sends OpenAI-compatible `tools` in the request body; the model emits `tool_calls` in the response; `mcp.lua` dispatches each call to the right server; the tool result is fed back as a `role:"tool"` turn in `context.lua` and the chat continues.
3. **Authorization gate**: `safety.lua` (PHASE0.md §4 stub) finally gets implemented. Every tool call is confirmed by the user by default, with per-tool and per-server `auto_approve` policies in `config.lua`.

**Phase 2 is done when:**

- aish can connect to at least one local `lmcp` server declared in `config.lua` and one connected via `:mcp connect <url>` at runtime.
- `:mcp list` shows connected servers; `:mcp tools` shows discovered tools across all servers.
- A model conversation can invoke a tool: the broker request carries the live tools schema; the response's `tool_calls` are confirmed by the user; each call dispatches to the right MCP server; the result re-enters the chat; the model continues with the result available.
- `CMD:` extraction (PHASE0.md §6 substrate invariant) still works unchanged. Phase 2 is additive, not a replacement.
- A tool with `auto_approve = true` (in config) executes without the confirm prompt; a non-approved tool still prompts.

---

## 2. Technology Decisions (delta from Phase 1)

| Decision | Choice | Rationale |
|---|---|---|
| MCP transport | HTTP POST per RPC, `Connection: close` per response, **no long-lived SSE GET channel** in v1 | Analyze finding (2026-05-12): lmcp v0.5.4 only implements the trivial POST-and-respond flavor of the spec's streamable-HTTP transport. Its GET /mcp endpoint announces the POST endpoint then closes, so there is no server→client notification channel to listen on. Combined with lmcp's `capabilities.tools.listChanged = false`, aish doesn't need an SSE GET listener at all for lmcp. Stdio transport is left for a possible Phase 2.1 if a stdio-only MCP server becomes necessary. |
| MCP protocol version | `2025-03-26` (confirmed by live probe of boltzmann:8080/mcp) | lmcp pins this in `MCP_VERSION`. aish sends the same in `initialize`; future version bumps are negotiated at connect time. |
| MCP auth | Bearer token via `Authorization: Bearer <token>` header, per-server | Analyze finding: every lmcp deployment in mfritsche's fleet (boltzmann/hertz/pve*/nc/etc.) requires Bearer auth. Phase 2 config supports `auth_token` literal and `auth_env` env-var indirection per server (mirrors `key_env` in the models registry). lmcp servers without auth (broglie/higgs LAN-only) just leave the field nil. |
| Tool-call wire format | OpenAI `tools` field on `/v1/chat/completions` body; `tool_calls` on assistant deltas; `role:"tool"` turn with `tool_call_id` for results | Standard, supported by llama.cpp and OpenRouter. Aligns with the existing `/v1/chat/completions` substrate invariant. |
| Tool namespacing | `<alias>.<tool>` for both the wire-level tool name and `:mcp tools` listing | Avoids name collisions across servers. The alias comes from the config key or the connect URL hash. |
| `CMD:` coexistence with tool-calls | Both stay live, no policy preference. Substrate invariant §3 unchanged. | Resolves Q6 (see §11). `CMD:` is the local-shell route; MCP tools are structured-API routes; they serve different purposes. Future phases (Norris, Phase 3) may prefer tools when both are available, but Phase 2 doesn't enforce. |
| Authorization default | Per-call confirm (mirrors PHASE0.md §10 `confirm_cmd` for shell) | Conservative default; user can opt into auto-approval per tool or per server via config. Resolves Q8. |
| System prompt augmentation | Hybrid: static frame in `broker.lua` system prompt + dynamic `tools` array in the request body | Tool list goes in the API field where it belongs; the system prompt only mentions that tools exist and how to use them. Per-request body cost is bounded (tools change rarely; small schemas). Resolves Q9. |
| Tool-call streaming | Streaming-from-day-one: `broker.chat_stream`'s on_delta callback widens to handle `tool_calls` deltas in addition to text deltas | Resolves Q10. Phase 1 SSE landed first, so we're not retrofitting; we just extend the parser. **Wire shape confirmed at analyze** (2026-05-12 probe vs hossenfelder): `delta.tool_calls[]` arrives indexed; id+type+function.name appear on the opening delta; `function.arguments` is a JSON-string that arrives in character-fragment chunks; finish_reason "tool_calls" closes the call. Accumulator strategy matches §5. |
| Tool-call concurrency | Sequential dispatch in Phase 2 v1: process `tool_calls[0]` to completion, then `[1]`, etc. | Simpler error handling; tool effects are often order-dependent (e.g. write-then-read). Parallel dispatch deferred (see Q20). |
| MCP server lifecycle | aish does not manage MCP server processes (parallel to PHASE0.md §12 llama.cpp rule) | Declared in config or connected by URL; aish is a client only. |

---

## 3. Module Changes

| File | State after Phase 1 | Phase 2 changes |
|---|---|---|
| `mcp.lua` | **New file** (not in PHASE0 §4 layout; this Phase amends the layout to add it) | Implement: `M.connect(url, opts) -> session` (opts: `alias`, `auth_token`, `auth_env`), `session:initialize()`, `session:list_tools() -> [{name, description, schema}]`, `session:call_tool(name, args) -> result \| tool_error`, `session:close()`. JSON-RPC 2.0 over HTTP POST (`Content-Type: application/json`, `Accept: application/json`, `Authorization: Bearer <token>`). Per-session state: alias, base-url, auth, tools-cache, request-ID counter. No persistent SSE channel; POST is one-shot per RPC. |
| `safety.lua` | Stub | Implement Phase 2 surface only: `M.confirm_tool_call(tool_name, args, cfg) -> bool`. Reads `config.mcp.auto_approve` (per-tool and per-server) before prompting. Norris destructive-op heuristic and HALT gate stay Phase 3. |
| `broker.lua` | Streaming `chat_stream(cfg, msgs, on_delta)` | Request body grows `tools = mcp.tools_schema()` (assembled from all connected sessions). on_delta callback widens to `on_delta(kind, payload)` where `kind ∈ {"text", "tool_call"}`; tool_call payload includes id+name+arguments-delta. `M.chat` wrapper updates to buffer both. |
| `context.lua` | turns = {{role, content}, ...} + `pending_exec_output` | New role: `"tool"`. Assistant turns may carry `tool_calls = [{id, name, arguments}]`. `to_messages()` flattens these into OpenAI-shape messages. Alternation rules: assistant-with-tool_calls is followed by N tool turns (one per call), then assistant text. |
| `repl.lua` | meta cmds + ask_ai stream loop | After ask_ai sees `tool_calls`, enter a tool-execution sub-loop: confirm-gate each call via `safety.confirm_tool_call`, dispatch via `mcp.session:call_tool`, append tool turn to context, re-issue the broker request. Loop until assistant emits text without tool_calls. New meta: `:mcp connect <url> [alias]`, `:mcp list`, `:mcp tools`, `:mcp disconnect <alias>`. |
| `renderer.lua` | streaming text + exec frame | Add `tool_call_begin(name, args)`, `tool_call_end(result, ok)`. Visual style: indented, dim, parallel to the exec frame. |
| `config.lua` | example with models/shell/context/history | Schema additions: `mcp = { servers = { alias = { url = "..." } }, auto_approve = { ["alias.tool"] = true } }`. Documented in §6 below. |
| `ffi/curl.lua` | post + post_sse | **No additions in v1**: the analyze finding ruled out the long-lived SSE GET channel for lmcp. Phase 1 `M.post` (already does sync POST with response capture) is sufficient for MCP JSON-RPC. |
| `history.lua` | JSONL session log | Tool turns are logged like any other turn: `{role:"tool", tool_call_id:"...", content:"..."}`. Resume reconstructs them via `ctx:append` like user/assistant turns. |

§4 module-layout amendment: `mcp.lua` slots between `broker.lua` and `router.lua` in the §4 table. Same commit lands the manifest amendment.

---

## 4. MCP Transport (analyze findings, lmcp v0.5.4)

lmcp implements only the **synchronous POST** flavor of the MCP streamable-HTTP spec. Each RPC is one HTTP transaction:

```
client → server:  POST /mcp
                  Content-Type: application/json
                  Accept: application/json
                  Authorization: Bearer <token>
                  Body: { jsonrpc:"2.0", id, method, params }
                  Returns: { jsonrpc, id, result | error }
                  Connection: close
```

lmcp's `GET /mcp` exists but only sends a one-shot `event: endpoint` announcing the POST URL, then closes; there is no held-open server→client channel. Combined with the `listChanged: false` capability lmcp announces in `initialize`, **aish does not open a persistent SSE channel** to lmcp servers in v1. Notifications-from-server are out of scope here; track for v2 if a richer server appears.

### Handshake

1. `initialize` request: `{ protocolVersion: "2025-03-26", capabilities: {}, clientInfo: { name: "aish", version: "..." } }`.
2. Server response (lmcp): `{ protocolVersion: "2025-03-26", capabilities: { tools: { listChanged: false } }, serverInfo: { name, version } }`.
3. `notifications/initialized` POST (one-way; lmcp returns HTTP 202 with no body).

### Tool discovery

1. `tools/list` RPC → `{ tools: [{ name, description, inputSchema }] }`.
2. Cache per-session **for the session lifetime**. lmcp announces `listChanged: false`, so there's no need to refetch or listen for change notifications.

### Tool invocation

`tools/call` with `{ name, arguments }`. lmcp distinguishes two failure modes:

- **Tool-handler exception** → JSON-RPC `result` with `isError: true` and `content: [{ type:"text", text: "Error: ..." }]`. **Model-recoverable**: feed it back to the model as the next `role:"tool"` turn content and let it react.
- **Unknown method or unknown tool** → JSON-RPC `error` with `code: -32601` ("Method not found" / "Tool not found"). **Transport level**: surface as an aish status, do not feed to model.

This split resolves Q21.

### Lifecycle

- Connect on startup (from `config.mcp.servers`) is best effort; failures are status-logged and don't abort aish. "Connect" here means: do the `initialize` round-trip + cache `tools/list` results.
- `:mcp connect <url>` adds a session at runtime; alias auto-derived from hostname or supplied as second arg.
- `:mcp disconnect <alias>` drops cached state. There's no long-lived HTTP connection to close (every RPC was already `Connection: close`).
- On aish quit, sessions are just forgotten; there is nothing to clean up server-side.

---

## 5. Tool-Call Bridge

### Broker request body (delta from Phase 1)

```json
{
  "model": "...",
  "messages": [...],
  "stream": true,
  "temperature": 0.2,
  "tools": [
    { "type": "function",
      "function": { "name": "<alias>.<tool>", "description": "...", "parameters": <inputSchema> } },
    ...
  ]
}
```

The `tools` array is assembled by `mcp.tools_schema()`, which flattens `tools/list` results from every connected session, namespacing each tool as `<alias>.<tool>`.
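
The flattening step could look roughly like the sketch below. It is an illustration under stated assumptions, not the `mcp.lua` implementation: the explicit `sessions` argument, the `alias` and `tools` fields on each session, and the `split_name` helper are hypothetical names introduced for the example (the manifest's `mcp.tools_schema()` takes no arguments and presumably reads module state).

```lua
-- Hypothetical sketch: build the OpenAI-style `tools` array from the cached
-- tools/list results of every session. The `alias` and `tools` field names
-- on each session are assumptions for this example.
local function tools_schema(sessions)
  local out = {}
  for _, s in ipairs(sessions) do
    for _, t in ipairs(s.tools or {}) do
      out[#out + 1] = {
        type = "function",
        -- `function` is a reserved word in Lua, so the key must be quoted.
        ["function"] = {
          name = s.alias .. "." .. t.name,   -- namespaced as alias.tool
          description = t.description or "",
          parameters = t.inputSchema,        -- MCP inputSchema passed through
        },
      }
    end
  end
  return out
end

-- Reverse mapping for dispatch: split "alias.tool" at the first dot.
local function split_name(qualified)
  return qualified:match("^([^.]+)%.(.+)$")
end

local tools = tools_schema({
  { alias = "boltzmann",
    tools = { { name = "read_file", description = "Read a file",
                inputSchema = { type = "object" } } } },
  { alias = "broglie", tools = {} },  -- a session with no tools adds nothing
})
```

Splitting at the first dot means a tool name may itself contain dots, while an alias may not; that constraint is a property of this sketch rather than something the manifest states.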
### Response handling (streaming)

llama.cpp / OpenAI deltas may include:

```json
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_…", "function":{"name":"alias.tool","arguments":"{\"a\":"}}]}}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0, "function":{"arguments":"1}"}}]}}]}
data: {"choices":[{"finish_reason":"tool_calls",...}]}
```

`broker.chat_stream` accumulates tool-call deltas keyed by `index`; the `arguments` field is a JSON-string that arrives chunked and is concatenated. On `finish_reason: tool_calls`, the accumulated calls are emitted to on_delta as `kind="tool_call"` with full payloads.

### Re-injection into context

```lua
-- After tool execution.
-- args_json and result_text are placeholders: the accumulated JSON-string
-- arguments of the call and the tool's result text, respectively.
ctx:append({
  role = "assistant",
  content = "",  -- or any model-emitted text
  tool_calls = { { id = "call_…", name = "alias.tool", arguments = args_json } },
})
ctx:append({
  role = "tool",
  tool_call_id = "call_…",
  content = result_text,
})
```

`to_messages()` renders both shapes for the next broker request. The strict-alternation issue from PHASE0.md §6 (mistral-nemo Jinja) is handled differently here: tool turns ARE expected to follow assistant tool_calls per the OpenAI chat-template convention. If a model's template still rejects this shape, fall back to the `[tool: X]` prefix strategy used for exec output (Q18 below).

### Re-issuing the broker request

After tool turns are appended, the broker is called again with the extended messages array. The model may emit more `tool_calls`, more text, or both. Loop until the response has no `tool_calls` (i.e. a plain text assistant turn).

Budget: the `max_tool_depth` setting (default 8; see the config schema in §6) prevents runaway loops. Hitting the cap surfaces as a status: `[aish] tool-call depth limit reached`.

---

## 6. Authorization (safety.lua Phase 2 surface)

```lua
-- safety.confirm_tool_call(tool_name, args_table, config) -> bool
-- `rl` is the readline binding used for prompts (assumed to be in scope).
function M.confirm_tool_call(name, args, cfg)
  local policy = cfg.mcp and cfg.mcp.auto_approve or {}
  if policy[name] then return true end
  -- Per-server prefix check: "alias.*" entries
  local alias = name:match("^([^.]+)%.")
  if alias and policy[alias .. ".*"] then return true end
  -- Otherwise prompt. args is a key/value table, so test for emptiness with
  -- next(); the # operator only counts array entries and would report 0.
  local has_args = type(args) == "table" and next(args) ~= nil
  local pretty = name .. "(" .. (has_args and "..." or "") .. ")"
  local ans = rl.readline(("call '%s'? [y/N] "):format(pretty)) or ""
  return ans:lower():sub(1,1) == "y"
end
```

Config schema (analyze-revised, Bearer auth fields added):

```lua
mcp = {
  servers = {
    boltzmann = {
      url = "http://boltzmann.fritz.box:8080/mcp",
      auth_env = "BOLTZMANN_MCP_TOKEN",  -- read from env at startup
    },
    broglie = {
      url = "http://broglie.fritz.box:8080/mcp",
      -- no auth (LAN-only deployment)
    },
    nc = {
      url = "https://nc.reauktion.de:8080/mcp",
      auth_token = "literal-token-if-not-using-env",  -- alternative
    },
  },
  auto_approve = {
    ["boltzmann.read_file"] = true,  -- specific tool
    ["broglie.*"] = true,            -- whole server
  },
  max_tool_depth = 8,
}
```

Auth precedence per server: `auth_token` literal > `auth_env` indirection > nil (no Authorization header sent). Mirrors PHASE0 §10's `key_env` convention for cloud model API keys.

Norris mode (Phase 3) will extend this: when autonomous, the destructive-op heuristic decides; for non-destructive tools, auto_approve. Outside scope here.

---

## 7. Meta Commands (Phase 2 additions)

| Command | Action |
|---|---|
| `:mcp connect <url> [<alias>]` | Open a session; perform initialize + tools/list; add to active set |
| `:mcp disconnect <alias>` | Close one session |
| `:mcp list` | Show connected sessions (alias, url, tool count, status) |
| `:mcp tools` | List tools across all sessions (`alias.name`: short description) |
| `:mcp tool <alias>.<tool>` | Show one tool's full inputSchema (debug aid) |

Existing `:help` updated to list these.

---

## 8. System Prompt Augmentation

`broker.lua`'s default system prompt grows by ~4 lines:

```
You may have access to MCP tools; they appear in this request's `tools` field.
Call a tool by emitting a tool_call; the result will be supplied in the next turn.
Use tools for structured operations (file reads, queries, etc.) and `CMD:` lines for local shell commands.
Prefer tools when available; fall back to `CMD:` for anything not exposed as a tool.
```

The actual tool list is in the `tools` request-body field, not the prompt. This avoids per-turn token bloat for the full schema.

§3 substrate invariants are unchanged. The `CMD:` extraction marker stays the local-shell route; tools are the additive structured route.

---

## 9. Migration from Phase 1

User-visible changes:

- New `:mcp …` meta commands when MCP servers are configured or connected at runtime.
- Assistant responses may now invoke tools: the user sees a confirm prompt (similar to the `CMD:` execution gate) followed by an indented tool-call frame with the result.
- `CMD:` lines still work exactly as before for shell.

Substrate (PHASE0.md §3) invariants: unchanged. Module layout (§4) amended to add `mcp.lua`; that amendment ships in the manifest commit.

`config.lua`: existing configs without an `mcp` section continue to work. No MCP servers means no tools sent in the broker request body, no auth checks, no behavior change.

---

## 10. Out of Scope (Phase 2)

Per PHASE0.md §11, these belong elsewhere:

- Chuck Norris autonomous mode (Phase 3). Even though tool-calls enable richer autonomy, the *autonomous policy* is Phase 3's.
- Destructive-op heuristic in safety.lua (Phase 3). Phase 2 only implements the per-call confirm-prompt surface.
- `memory.jsonl` summarization across sessions (Phase 4).
- Multi-model routing / cloud fallback (Phase 5).
- Tree-sitter syntax highlighting (Phase 6).

Specifically out of Phase 2 scope despite proximity:

- Stdio-transport MCP servers (Q17 below).
- Parallel tool-call dispatch (Q20).
- MCP `resources/list` and `prompts/list` capabilities. Phase 2 v1 only implements `tools/*`. Resources/prompts deferred (probably Phase 4 alongside memory).
- Server-sent `notifications/progress` for long-running tool calls: ignored in v1; status surface comes later.

---

## 11. Open Questions

| # | Question | Impact | Resolve by |
|---|---|---|---|
| Q17 | ~~MCP transport abstraction: stdio vs HTTP+SSE~~ | mcp.lua API shape | **Resolved at analyze.** Hard-code POST-only HTTP for v1. lmcp doesn't use the long-lived SSE channel and `listChanged: false` removes any v1 need for it. Stdio transport tracked as Phase 2.1 / out-of-scope here. |
| Q18 | Tool-result re-injection: standard OpenAI `role:"tool"` turn, or `[tool: X]` prefix to next user turn (matching the §6 exec-output pattern)? | context.lua + broker.lua | **Partly resolved.** Live probe (2026-05-12, hossenfelder) shows `role:"tool"` accepted by the proxy + the loaded model (qwen2.5-coder-1.5b). Mistral-nemo-specific template testing is **blocked** by the hossenfelder proxy routing all `model` field values to the loaded fast model (see open-end below). Default v1 path: `role:"tool"` (standard); fallback to `[tool: X]` prefix is plumbed but unused unless a strict template rejects it during Phase 7 verify. |
| Q19 | Large tool-result payloads: pass-through, truncate at N chars, or summarize via fast model? | context.lua + executor of tool-result | Phase 2 (plan); Phase 4 may refine with memory.jsonl |
| Q20 | Parallel `tool_calls`: sequential v1 is safe; spec allows parallel. Move to parallel when both calls are read-only? | mcp.lua dispatch | Phase 2 (verify); track for v2 |
| Q21 | ~~MCP error mapping~~ | mcp.lua + broker.lua | **Resolved at analyze.** lmcp distinguishes: `result.isError=true` (handler exception, model-recoverable, feed back as tool turn content) vs JSON-RPC `error` (unknown method/tool, transport-level, surface as aish status). See §4. |
| Q22 | aish's own command surface as an MCP server | scope expansion | **Out of Phase 2.** Parked for Phase 4+ if interest stays. |

Open-end carried forward to Phase 7 (verify):

- **Hossenfelder proxy `model`-field bug** (separate from aish): the proxy at `:8082` routes all requests to the loaded fast model regardless of the request's `model` field. Chunks return `"model":"qwen2.5-coder-1.5b-q4_k_m.gguf"` even when `mistral-nemo-12b-instruct` was asked for. This **blocks live-verification of mistral-nemo's chat-template tool-role behavior**. Fix lives in boltzmann (parallel to the SSE-buffering bug tracked at [aish#15](https://git.reauktion.de/marfrit/aish/issues/15)). Phase 7 needs the proxy fix to fully close Q18.

Resolved at formulate (above in §2 table):

- Q6 (CMD: vs tools coexistence): both, no policy preference, substrate unchanged.
- Q7 (MCP discovery): both, config-declared default + runtime `:mcp connect`.
- Q8 (authorization): per-call confirm default, per-tool/per-server `auto_approve` policy.
- Q9 (system-prompt augmentation): hybrid, static frame + dynamic `tools` body field.
- Q10 (tool-call streaming): streaming-from-day-one on top of Phase 1 SSE.

Resolved at analyze (2026-05-12, live probes vs lmcp v0.5.4 + hossenfelder):

- Q17 (transport abstraction): POST-only, no SSE channel needed for lmcp.
- Q21 (error mapping): isError vs JSON-RPC error split per §4.

---

*End of Phase 2 Manifest - aish*