marfrit ec6793c93c docs/PHASE2: formulate — MCP client + tool-calling bridge

Phase 2 formulate manifest. Three pillars per PHASE0.md §11 row 2:
mcp.lua (JSON-RPC 2.0 over HTTP+SSE, target: lmcp), tool-calling bridge
(OpenAI tools field <-> MCP tools/call), and the safety.lua
authorization gate (per-call confirm + auto_approve policy).

Resolves PHASE0.md §13 Q6–Q10:
  Q6  CMD: + tool-calls coexist; substrate §3 unchanged
  Q7  config-declared servers + runtime :mcp connect
  Q8  per-call confirm default, auto_approve policy in config
  Q9  hybrid system prompt: static frame + dynamic tools body field
  Q10 streaming-from-day-one on Phase 1 SSE; on_delta widens to (kind, payload)

New questions tracked in §11 (Q17–Q22): transport abstraction, role:tool
vs prefix injection (mistral-nemo template verification needed), large
tool-result handling, parallel dispatch, error mapping, aish-as-MCP-server
(parked).

§4 module layout amended: mcp.lua slots between broker.lua and router.lua.
The amendment is documented in this manifest; the actual §4 table edit
lands when implementation starts (Phase 2 implement phase).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-12 09:23:53 +00:00


aish — Phase 2 Manifest

Project: aish (AI-augmented conversational shell)
Document: Phase 2 Requirements, Architecture & Design Decisions
Status: Formulate (pre-analysis)
Date: 2026-05-12

PHASE0.md is the locked substrate; PHASE1.md is layered on top. This manifest specifies what Phase 2 adds. Section numbers reference back to PHASE0.md / PHASE1.md where relevant.

1. Scope of Phase 2

Three pillars per PHASE0.md §11 row 2:

MCP client (mcp.lua) — JSON-RPC 2.0 over HTTP+SSE transport. Target reference implementation: lmcp. Operations needed for v1: initialize, tools/list, tools/call. Multiple servers may be connected concurrently; tools are namespaced <server>.<tool>.
Tool-calling protocol bridge — the broker sends OpenAI-compatible tools in the request body; the model emits tool_calls in the response; mcp.lua dispatches each call to the right server; the tool result is fed back as a role:"tool" turn in context.lua and the chat continues.
Authorization gate — safety.lua (PHASE0.md §4 stub) finally gets implemented. Every tool call is confirmed by the user by default, with per-tool and per-server auto_approve policies in config.lua.

Phase 2 is done when:

aish can connect to at least one local lmcp server declared in config.lua and one connected via :mcp connect <url> at runtime.
:mcp list shows connected servers; :mcp tools shows discovered tools across all servers.
A model conversation can invoke a tool: the broker request carries the live tools schema; the response's tool_calls are confirmed by the user; each call dispatches to the right MCP server; the result re-enters the chat; the model continues with the result available.
CMD: extraction (PHASE0.md §6 substrate invariant) still works unchanged — Phase 2 is additive, not replacing.
A tool with auto_approve = true (in config) executes without the confirm prompt; a non-approved tool still prompts.

2. Technology Decisions (delta from Phase 1)

Decision	Choice	Rationale
MCP transport	HTTP+SSE per the MCP spec's HTTP transport flavor	Matches `lmcp`'s native transport. Reuses the libcurl easy interface + the Phase 1 SSE parser in `ffi/curl.lua`. Avoids spawning child processes (stdio transport requires Phase 1-style PTY plumbing per server, more moving parts). Stdio is left for a possible Phase 2.1 if a stdio-only MCP server becomes necessary.
MCP protocol version	`2025-03-26` (or whatever `lmcp` announces in its `initialize` response)	Track lmcp's spec target. Negotiated at connect time; aish caps at the spec version it knows.
Tool-call wire format	OpenAI `tools` field on `/v1/chat/completions` body; `tool_calls` on assistant deltas; `role:"tool"` turn with `tool_call_id` for results	Standard, supported by llama.cpp and OpenRouter. Aligns with the existing `/v1/chat/completions` substrate invariant.
Tool namespacing	`<server-alias>.<tool-name>` for both the wire-level tool name and `:mcp tools` listing	Avoids name collisions across servers. The alias comes from the config key or the connect URL hash.
`CMD:` coexistence with tool-calls	Both stay live, no policy preference. Substrate invariant §3 unchanged.	Resolves Q6 (see §11). `CMD:` is the local-shell route; MCP tools are structured-API routes; they serve different purposes. Future phases (Norris, Phase 3) may prefer tools when both are available, but Phase 2 doesn't enforce a preference.
Authorization default	Per-call confirm (mirrors PHASE0.md §10 `confirm_cmd` for shell)	Conservative default; user can opt into auto-approval per tool or per server via config. Resolves Q8.
System prompt augmentation	Hybrid: static frame in `broker.lua` system prompt + dynamic `tools` array in the request body	Tool list goes in the API field where it belongs; the system prompt only mentions that tools exist and how to use them. Per-request body cost is bounded (tools change rarely; small schemas). Resolves Q9.
Tool-call streaming	Streaming-from-day-one — `broker.chat_stream`'s on_delta callback widens to handle `tool_calls` deltas in addition to text deltas	Resolves Q10. Phase 1 SSE landed first, so we're not retrofitting; we just extend the parser.
Tool-call concurrency	Sequential dispatch in Phase 2 v1 — process `tool_calls[0]` to completion, then `[1]`, etc.	Simpler error handling; tool effects often order-dependent (e.g. write-then-read). Parallel dispatch deferred (see Q20).
MCP server lifecycle	aish does not manage MCP server processes (parallel to PHASE0.md §12 llama.cpp rule)	Declared in config or connected by URL; aish is a client only.

3. Module Changes

File	State after Phase 1	Phase 2 changes
`mcp.lua`	New file (not in PHASE0 §4 layout; this Phase amends the layout to add it)	Implement: `M.connect(url, alias) -> session`, `session:initialize()`, `session:list_tools() -> [{name, description, schema}]`, `session:call_tool(name, args) -> result`, `session:close()`. JSON-RPC 2.0 over HTTP POST for client→server, SSE GET for server→client notifications. Per-session state: connected, tools-cache, pending request ID counter.
`safety.lua`	Stub	Implement Phase 2 surface only: `M.confirm_tool_call(tool_name, args, policy) -> bool`. Reads `config.mcp.auto_approve` (per-tool and per-server) before prompting. Norris destructive-op heuristic and HALT gate stay Phase 3.
`broker.lua`	Streaming `chat_stream(cfg, msgs, on_delta)`	Request body grows `tools = mcp.tools_schema()` (assembled from all connected sessions). on_delta callback widens to `on_delta(kind, payload)` where `kind ∈ {"text", "tool_call"}`; tool_call payload includes id+name+arguments-delta. `M.chat` wrapper updates to buffer both.
`context.lua`	turns = {{role, content}, ...} + `pending_exec_output`	New role: `"tool"`. Assistant turns may carry `tool_calls = [{id, name, arguments}]`. `to_messages()` flattens these into OpenAI-shape messages. Alternation rules: assistant-with-tool_calls is followed by N tool turns (one per call), then assistant text.
`repl.lua`	meta cmds + ask_ai stream loop	After ask_ai sees `tool_calls`, enter a tool-execution sub-loop: confirm-gate each call via `safety.confirm_tool_call`, dispatch via `mcp.session:call_tool`, append tool turn to context, re-issue the broker request. Loop until assistant emits text without tool_calls. New meta: `:mcp connect <url> [alias]`, `:mcp list`, `:mcp tools`, `:mcp disconnect <alias>`.
`renderer.lua`	streaming text + exec frame	Add `tool_call_begin(name, args)`, `tool_call_end(result, ok)`. Visual style: indented, dim, parallel to the exec frame.
`config.lua`	example with models/shell/context/history	Schema additions: `mcp = { servers = { alias = { url = "..." } }, auto_approve = { ["alias.tool"] = true } }`. Documented in §6 below.
`ffi/curl.lua`	post + post_sse	One probable addition: GET request helper (for the server→client SSE channel of MCP transport). Or wrap the SSE-GET inside `mcp.lua` directly if it's tight enough. Decided at analyze.
`history.lua`	JSONL session log	Tool turns are logged like any other turn — `{role:"tool", tool_call_id:"...", content:"..."}`. Resume reconstructs them via `ctx:append` like user/assistant turns.

§4 module-layout amendment: mcp.lua slots between broker.lua and router.lua in the §4 table. The amendment is recorded in this manifest commit; the §4 table edit itself lands when implementation starts.

4. MCP Transport

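lmcp exposes an HTTP+SSE transport. The single-endpoint shape shown here (POST and GET on the same URL) corresponds to what MCP revision 2025-03-26 calls Streamable HTTP. From a client perspective there are two channels per session: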

client → server:   POST <base-url>     Content-Type: application/json
                                       Body: { jsonrpc:"2.0", id, method, params }
                                       Returns: { jsonrpc, id, result | error }

server → client:   GET  <base-url>     Accept: text/event-stream
                                       (held open; receives notifications/...)

Handshake (per MCP spec)

initialize request: { protocolVersion, capabilities, clientInfo }.
Server response: { protocolVersion, capabilities, serverInfo }. Cache the announced server capabilities — only invoke RPCs the server says it supports (e.g. tools capability).
notifications/initialized (one-way notification) signals end of handshake.
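
The three handshake steps can be sketched as plain Lua tables, before any JSON encoding or HTTP is involved. This is a minimal sketch, not the mcp.lua API: the helper names (`new_rpc`, `negotiate`), the `clientInfo` values and the supported-version table are assumptions made for illustration.

```lua
-- Hypothetical sketch: JSON-RPC 2.0 envelopes for the MCP handshake.
-- Encoding to JSON and the HTTP POST are left to the caller.

local SUPPORTED = { ["2025-03-26"] = true }   -- versions this client knows (assumption)

-- Per-session envelope builder with a monotonically increasing request id.
local function new_rpc()
  local next_id = 0
  local rpc = {}

  -- A request carries an id and expects a response.
  function rpc.request(method, params)
    next_id = next_id + 1
    return { jsonrpc = "2.0", id = next_id, method = method, params = params }
  end

  -- A notification has no id and gets no response.
  function rpc.notification(method, params)
    return { jsonrpc = "2.0", method = method, params = params }
  end

  return rpc
end

-- Accept the server's announced version only if the client knows it.
local function negotiate(server_result)
  local v = server_result.protocolVersion
  if SUPPORTED[v] then return v end
  return nil, "unsupported protocol version: " .. tostring(v)
end

local rpc = new_rpc()
local init = rpc.request("initialize", {
  protocolVersion = "2025-03-26",
  capabilities = {},   -- many Lua JSON encoders need a hint to emit {} rather than []
  clientInfo = { name = "aish", version = "0.2" },   -- illustrative values
})
local done = rpc.notification("notifications/initialized")
```

Keeping the id counter inside the per-session closure matches the "pending request ID counter" listed as per-session state in §3.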

Tool discovery

tools/list RPC → { tools: [{ name, description, inputSchema }] }.
Cache per-session. Re-fetch on notifications/tools/list_changed from the SSE channel.

Tool invocation

tools/call with { name, arguments } → { content: [{type, text}], isError }.
The full text payload flows back to the model as the next role:"tool" turn's content.
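
A possible way to flatten a tools/call result into the text of a tool turn is sketched below. The function name `result_to_text` and the placeholder used for non-text items are assumptions; the manifest only specifies that the text payload flows back to the model.

```lua
-- Hypothetical sketch: turn a decoded tools/call result into role:"tool" content.
-- Returns the joined text and an ok flag derived from isError.
local function result_to_text(result)
  local parts = {}
  for _, item in ipairs(result.content or {}) do
    if item.type == "text" and item.text then
      parts[#parts + 1] = item.text
    else
      -- Assumption: non-text items are summarized rather than inlined.
      parts[#parts + 1] = "[" .. tostring(item.type) .. " content omitted]"
    end
  end
  return table.concat(parts, "\n"), not result.isError
end
```

The ok flag could feed the `ok` argument of `tool_call_end(result, ok)` in renderer.lua.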

SSE channel

Held open in the style of post_sse in ffi/curl.lua, but using a GET; Phase 2 likely needs an M.get_sse helper mirroring the POST variant. Each event is a JSON-RPC notification. Phase 2 v1 only listens for notifications/tools/list_changed; other notification types (progress, log) are ignored and tracked for later phases.
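
The interaction between the per-session tools cache and the SSE channel could look roughly like this. It is a sketch under assumptions: `new_tool_cache` and its `fetch` callback are hypothetical stand-ins for the tools/list RPC, and notifications are assumed to arrive already decoded into Lua tables.

```lua
-- Hypothetical sketch: per-session tools cache invalidated by an SSE notification.
-- `fetch` stands in for the tools/list RPC and returns a list of tool tables.
local function new_tool_cache(fetch)
  local cache = { tools = nil, fetches = 0 }

  -- Return cached tools, calling tools/list only when the cache is empty.
  function cache.get()
    if not cache.tools then
      cache.tools = fetch()
      cache.fetches = cache.fetches + 1
    end
    return cache.tools
  end

  -- Handle one decoded JSON-RPC notification from the SSE channel.
  -- Returns true if the notification was acted on.
  function cache.on_notification(msg)
    if msg.method == "notifications/tools/list_changed" then
      cache.tools = nil          -- next get() re-fetches
      return true
    end
    return false                 -- progress, log, etc. are ignored in v1
  end

  return cache
end
```

Invalidating lazily, instead of re-fetching inside the notification handler, keeps the SSE callback free of blocking RPCs; whether that trade-off suits aish is a design choice for the analyze phase.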

Lifecycle

Connect on startup (from config.mcp.servers) — best effort; failures are status-logged, don't abort aish.
:mcp connect <url> adds a session at runtime; alias auto-derived from hostname or supplied as second arg.
:mcp disconnect <alias> closes both channels.
On aish quit, all sessions are closed cleanly (best effort).
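
The best-effort startup rule can be illustrated with pcall. In this sketch `connect_all` is a hypothetical helper, `connect` stands in for mcp.connect and is assumed to raise on failure, and the log line format is illustrative only.

```lua
-- Hypothetical sketch: best-effort startup connects.
-- servers: config.mcp.servers (alias -> { url = ... })
-- connect(url, alias): stand-in for mcp.connect, may raise on failure
-- log(line): status logger
local function connect_all(servers, connect, log)
  local sessions = {}
  for alias, spec in pairs(servers or {}) do
    local ok, res = pcall(connect, spec.url, alias)
    if ok then
      sessions[alias] = res
    else
      -- A failed server is reported and skipped; startup continues.
      log(("[aish] mcp: %s unavailable: %s"):format(alias, tostring(res)))
    end
  end
  return sessions
end
```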

5. Tool-Call Bridge

Broker request body (delta from Phase 1)

{
  "model": "...",
  "messages": [...],
  "stream": true,
  "temperature": 0.2,
  "tools": [
    { "type":"function",
      "function": { "name":"<alias>.<tool>",
                    "description":"...",
                    "parameters": <inputSchema> } },
    ...
  ]
}

The tools array is assembled by mcp.tools_schema() — flattens tools/list results from every connected session, namespacing each tool as <alias>.<name>.
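
The flattening step might be sketched as follows. The input shape (alias mapped to a table with a `tools` list) and the `split_name` helper are assumptions for illustration; only the output shape comes from the request body above.

```lua
-- Hypothetical sketch of the flattening step behind mcp.tools_schema().
-- sessions: alias -> { tools = { { name, description, inputSchema }, ... } }
local function tools_schema(sessions)
  local aliases = {}
  for alias in pairs(sessions) do aliases[#aliases + 1] = alias end
  table.sort(aliases)                      -- deterministic order across requests

  local out = {}
  for _, alias in ipairs(aliases) do
    for _, tool in ipairs(sessions[alias].tools or {}) do
      out[#out + 1] = {
        type = "function",
        ["function"] = {                   -- `function` is a Lua keyword
          name = alias .. "." .. tool.name,
          description = tool.description,
          parameters = tool.inputSchema,
        },
      }
    end
  end
  return out
end

-- Reverse mapping used at dispatch time: "alias.tool" -> alias, tool.
local function split_name(qualified)
  return qualified:match("^([^.]+)%.(.+)$")
end
```

One point to verify against the target backends: OpenAI documents function names as limited to letters, digits, underscores and dashes (up to 64 characters), so a server that enforces that rule may reject the dot separator.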

Response handling (streaming)

llama.cpp / OpenAI deltas may include:

data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_…",
                "function":{"name":"alias.tool","arguments":"{\"a\":"}}]}}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,
                "function":{"arguments":"1}"}}]}}]}
data: {"choices":[{"finish_reason":"tool_calls",...}]}

broker.chat_stream accumulates tool-call deltas keyed by index; the arguments field is a JSON-string that arrives chunked and is concatenated. On finish_reason: tool_calls, the accumulated calls are emitted to on_delta as kind="tool_call" with full payloads.
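
The accumulation described above can be sketched independently of the SSE parser. `new_accumulator` is a hypothetical name, and each delta is assumed to be the already-decoded `delta` table from one event.

```lua
-- Hypothetical sketch: accumulate streamed tool_calls deltas keyed by index.
local function new_accumulator()
  local calls = {}   -- wire index (0-based) -> { id, name, arguments }
  local acc = {}

  -- Feed one decoded delta table; text-only deltas are ignored here.
  function acc.feed(delta)
    for _, tc in ipairs(delta.tool_calls or {}) do
      local slot = calls[tc.index]
      if not slot then
        slot = { id = nil, name = nil, arguments = "" }
        calls[tc.index] = slot
      end
      if tc.id then slot.id = tc.id end
      local fn = tc["function"] or {}
      if fn.name then slot.name = fn.name end
      -- arguments is a JSON string that arrives in chunks; concatenate them.
      if fn.arguments then slot.arguments = slot.arguments .. fn.arguments end
    end
  end

  -- Call on finish_reason == "tool_calls": returns calls in index order.
  function acc.finish()
    local indices = {}
    for i in pairs(calls) do indices[#indices + 1] = i end
    table.sort(indices)
    local out = {}
    for _, i in ipairs(indices) do out[#out + 1] = calls[i] end
    return out
  end

  return acc
end
```

The arguments string is only valid JSON once the stream finishes, so decoding it belongs after `finish()`, not inside `feed()`.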

Re-injection into context

-- After tool execution. call_id and args_json come from the accumulated
-- tool call; result_text is the text of the MCP tools/call result.
ctx:append({
  role = "assistant",
  content = "",                -- or any model-emitted text
  tool_calls = { { id = call_id, name = "alias.tool", arguments = args_json } },
})
ctx:append({
  role = "tool",
  tool_call_id = call_id,
  content = result_text,
})

to_messages() renders both shapes for the next broker request. The strict-alternation issue from PHASE0.md §6 (mistral-nemo Jinja) is handled differently here — tool turns ARE expected to follow assistant tool_calls per the OpenAI chat-template convention. If a model's template still rejects this shape, fall back to the [tool: X] prefix strategy used for exec output (Q18 below).
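
A sketch of how to_messages() might render the two new shapes into OpenAI-style messages follows. It is written as a free function over a list of turns, which is an assumption; the real method lives on the context object.

```lua
-- Hypothetical sketch of to_messages() for the new turn shapes.
-- turns: list of { role, content, tool_calls?, tool_call_id? }
local function to_messages(turns)
  local msgs = {}
  for _, t in ipairs(turns) do
    local m = { role = t.role, content = t.content }
    if t.role == "assistant" and t.tool_calls then
      -- Internal { id, name, arguments } becomes the nested wire shape.
      m.tool_calls = {}
      for i, c in ipairs(t.tool_calls) do
        m.tool_calls[i] = {
          id = c.id,
          type = "function",
          ["function"] = { name = c.name, arguments = c.arguments },
        }
      end
    elseif t.role == "tool" then
      m.tool_call_id = t.tool_call_id
    end
    msgs[#msgs + 1] = m
  end
  return msgs
end
```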

Re-issuing the broker request

After tool turns are appended, the broker is called again with the extended messages array. The model may emit more tool_calls, more text, or both. Loop until the response has no tool_calls (i.e. a plain text assistant turn).

Budget: the mcp.max_tool_depth setting (default 8) caps tool-call depth and prevents runaway loops. Hitting the cap surfaces as a status: [aish] tool-call depth limit reached.
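
The re-issue loop and its depth cap might be structured like this. The sketch assumes depth counts broker round trips that returned tool_calls; `tool_loop`, `chat` and `run_tool` are hypothetical names, with the confirm gate and MCP dispatch hidden behind `run_tool`.

```lua
-- Hypothetical sketch of the re-issue loop with a depth cap.
-- chat(turns)    -> { content = string, tool_calls = list or nil }
-- run_tool(call) -> result text (confirm gate + MCP dispatch live here)
local function tool_loop(turns, chat, run_tool, max_depth)
  max_depth = max_depth or 8
  local depth = 0
  while true do
    local reply = chat(turns)
    local calls = reply.tool_calls
    if not calls or #calls == 0 then
      -- Plain text assistant turn: the loop is done.
      turns[#turns + 1] = { role = "assistant", content = reply.content }
      return reply.content
    end
    if depth >= max_depth then
      return nil, "[aish] tool-call depth limit reached"
    end
    depth = depth + 1
    turns[#turns + 1] = { role = "assistant", content = reply.content or "", tool_calls = calls }
    for _, call in ipairs(calls) do            -- sequential dispatch, in order
      turns[#turns + 1] = { role = "tool", tool_call_id = call.id, content = run_tool(call) }
    end
  end
end
```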

6. Authorization (safety.lua Phase 2 surface)

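-- safety.confirm_tool_call(tool_name, args_table, config) -> bool
-- `rl` is assumed to be the readline binding already in scope for prompts.
function M.confirm_tool_call(name, args, cfg)
    local policy = cfg.mcp and cfg.mcp.auto_approve or {}
    -- Exact "alias.tool" entry
    if policy[name] then return true end
    -- Per-server prefix check: "alias.*" entries
    local alias = name:match("^([^.]+)%.")
    if alias and policy[alias .. ".*"] then return true end
    -- Otherwise prompt. args is a decoded JSON object with string keys, so
    -- #args is always 0; next() is the reliable emptiness test.
    local has_args = type(args) == "table" and next(args) ~= nil
    local pretty = name .. "(" .. (has_args and "..." or "") .. ")"
    local ans = rl.readline(("call '%s'? [y/N] "):format(pretty)) or ""
    return ans:lower():sub(1,1) == "y"
end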

Config schema:

mcp = {
    servers = {
        local_fs = { url = "http://localhost:9700/" },
        gitea    = { url = "http://localhost:9701/" },
    },
    auto_approve = {
        ["local_fs.read_file"] = true,    -- specific tool
        ["gitea.*"]            = true,    -- whole server
    },
    max_tool_depth = 8,
}
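
As a worked example of how the auto_approve entries resolve, the policy lookup can be isolated from the prompt and exercised directly. `is_auto_approved` is a hypothetical helper, and the tool names other than local_fs.read_file are illustrative.

```lua
-- Hypothetical sketch: resolve auto_approve entries without prompting.
local function is_auto_approved(name, policy)
  policy = policy or {}
  if policy[name] then return true end                      -- exact "alias.tool"
  local alias = name:match("^([^.]+)%.")
  if alias and policy[alias .. ".*"] then return true end   -- whole server
  return false
end

local policy = {
  ["local_fs.read_file"] = true,
  ["gitea.*"]            = true,
}
local results = {
  is_auto_approved("local_fs.read_file", policy),   -- exact entry
  is_auto_approved("local_fs.write_file", policy),  -- no entry: would prompt
  is_auto_approved("gitea.create_issue", policy),   -- covered by "gitea.*"
}
```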

Norris mode (Phase 3) will extend this: in autonomous mode the destructive-op heuristic decides, and non-destructive tools are auto-approved. Out of scope here.

7. Meta Commands (Phase 2 additions)

Command	Action
`:mcp connect <url> [<alias>]`	Open a session; perform initialize + tools/list; add to active set
`:mcp disconnect <alias>`	Close one session
`:mcp list`	Show connected sessions (alias, url, tool count, status)
`:mcp tools`	List tools across all sessions (`alias.name` — short description)
`:mcp tool <alias.name>`	Show one tool's full inputSchema (debug aid)

The existing :help output is updated to list these.
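
Parsing the new meta commands could be as small as the following sketch. `parse_mcp`, its return shape and the error strings are assumptions; how repl.lua actually routes meta commands is not specified here.

```lua
-- Hypothetical sketch: parse a :mcp meta command into (action, args).
-- On failure returns nil plus a message.
local function parse_mcp(line)
  local rest = line:match("^:mcp%s+(.+)$")
  if not rest then
    return nil, "usage: :mcp <connect|disconnect|list|tools|tool> ..."
  end
  local words = {}
  for w in rest:gmatch("%S+") do words[#words + 1] = w end
  local action = words[1]
  if action == "connect" and words[2] then
    return action, { url = words[2], alias = words[3] }   -- alias is optional
  elseif action == "disconnect" and words[2] then
    return action, { alias = words[2] }
  elseif action == "tool" and words[2] then
    return action, { name = words[2] }
  elseif action == "list" or action == "tools" then
    return action, {}
  end
  return nil, "unknown or incomplete :mcp command"
end
```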

8. System Prompt Augmentation

broker.lua's default system prompt grows by ~5 lines:

You may have access to MCP tools — they appear in this request's `tools`
field. Call a tool by emitting a tool_call; the result will be supplied
in the next turn. Use tools for structured operations (file reads,
queries, etc.) and `CMD:` lines for local shell commands. Prefer tools
when available; fall back to `CMD:` for anything not exposed as a tool.

The actual tool list is in the tools request-body field, not the prompt. This avoids per-turn token bloat for the full schema.

§3 substrate invariants are unchanged. The CMD: extraction marker stays the local-shell route; tools are the additive structured route.

9. Migration from Phase 1

User-visible changes:

New :mcp … meta commands when MCP servers are configured or connected at runtime.
Assistant responses may now invoke tools — user sees a confirm prompt (similar to CMD: execution gate) followed by an indented tool-call frame with the result.
CMD: lines still work exactly as before for shell.

Substrate (PHASE0.md §3) invariants: unchanged. Module layout (§4) amended to add mcp.lua; the amendment is recorded in this manifest commit, and the §4 table edit itself lands when implementation starts.

config.lua: existing configs without an mcp section continue to work — no MCP servers means no tools sent in the broker request body, no auth checks, no behavior change.
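
One way to keep the no-MCP case unchanged is to attach the tools field only when it is non-empty. `build_body` is a hypothetical helper; the fixed fields mirror the request body in §5.

```lua
-- Hypothetical sketch: attach `tools` only when there is something to send,
-- so a config without an mcp section produces the Phase 1 body shape.
local function build_body(model, messages, tools)
  local body = { model = model, messages = messages, stream = true, temperature = 0.2 }
  if tools and #tools > 0 then body.tools = tools end
  return body
end
```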

10. Out of Scope (Phase 2)

Per PHASE0.md §11, these belong elsewhere:

Chuck Norris autonomous mode (Phase 3) — even though tool-calls enable richer autonomy, the autonomous policy is Phase 3's.
Destructive-op heuristic in safety.lua (Phase 3) — Phase 2 only implements the per-call confirm-prompt surface.
memory.jsonl summarization across sessions (Phase 4).
Multi-model routing / cloud fallback (Phase 5).
Tree-sitter syntax highlighting (Phase 6).

Specifically out of Phase 2 scope despite proximity:

Stdio-transport MCP servers (Q17 below).
Parallel tool-call dispatch (Q20).
MCP resources/list and prompts/list capabilities — Phase 2 v1 only implements tools/*. Resources/prompts deferred (probably Phase 4 alongside memory).
Server-sent notifications/progress for long-running tool calls — ignored in v1; status surface comes later.

11. Open Questions

#	Question	Impact	Resolve by
Q17	MCP transport abstraction: design `mcp.lua` from day one for both HTTP+SSE and stdio transports (transport_t interface), or hard-code HTTP+SSE and refit if a stdio-only server appears?	mcp.lua API shape	Phase 2 (plan)
Q18	Tool result re-injection: standard OpenAI `role:"tool"` turn, or `[tool: X]` prefix to next user turn (matching the §6 exec-output pattern)? Strict chat templates may reject `tool` role — needs verification against mistral-nemo specifically.	context.lua + broker.lua	Phase 2 (analyze)
Q19	Large tool-result payloads (file reads, query dumps): pass-through, truncate at N chars, or summarize via fast model? Token-budget pressure scales with tool use.	context.lua + executor of tool-result	Phase 2 (plan); Phase 4 may refine with memory.jsonl
Q20	Parallel `tool_calls`: sequential v1 is safe; spec allows parallel. Move to parallel when both calls are read-only (declared in tool metadata)?	mcp.lua dispatch	Phase 2 (verify) — track for v2
Q21	MCP server error mapping: JSON-RPC `error` response → tool-result content with `isError=true` (model can react), or aish-level transport error (broker aborts)?	mcp.lua + broker.lua	Phase 2 (plan)
Q22	aish's own command surface as an MCP server (eat your own dog food: expose `aish.exec`, `aish.read_file`, etc. via MCP so other clients can drive aish)?	scope expansion / new module	Out of Phase 2. Tracked for "maybe Phase 4 or later"; flagging here so it's not silently lost.

Resolved at formulate (above in §2 table):

Q6 (CMD: vs tools coexistence) — both, no policy preference, substrate unchanged.
Q7 (MCP discovery) — both, config-declared default + runtime :mcp connect.
Q8 (authorization) — per-call confirm default, per-tool/per-server auto_approve policy.
Q9 (system-prompt augmentation) — hybrid: static frame + dynamic tools body field.
Q10 (tool-call streaming) — streaming-from-day-one on top of Phase 1 SSE.

End of Phase 2 Manifest — aish
