repl: tool-call sub-loop + :mcp meta + system-prompt augmentation

Phase 2 commit #6 per docs/PHASE2.md §12. End-to-end wiring of the MCP
tool-call flow on top of broker/safety/context/renderer/mcp.

repl.lua additions:

- mcp_sessions table populated from config.mcp.servers at startup.
  connect_mcp() helper does initialize + caches tools/list. Failures
  status-logged once; absent from mcp_sessions until manual reconnect
  (C4: no auto-retry).
- tools_schema() flattens connected sessions' tools into the OpenAI
  {type:"function", function:{name,description,parameters}} shape with
  "<alias>.<name>" namespacing.
- flatten_content() concatenates content[type="text"] blocks; one-shot
  status warning when non-text blocks (image/resource) are dropped (§4
  normative spec, v1 only handles text).
- dispatch_tool_call(name, args_table) splits alias.tool, looks up
  session, calls. Returns (content_string, is_error). Errors of every
  flavor (missing alias, no session, rpc_error, transport_error) yield
  a synthesized "[aish] ..." string so callers always have a body for
  the role:"tool" turn; alternation preserved per C5/C7.
- ask_ai rewritten as a sub-loop that re-issues the broker request
  until the model returns pure text or max_tool_depth (default 8) is
  hit. Each iteration: stream response → if tool_calls present,
  confirm-gate each → dispatch → append role:"tool" turn → continue.
  Argument-JSON parse failure produces a synthesized tool turn (C7).
  Decline at confirm produces "[aish] tool call declined by user" tool
  turn (alternation guarantee).
- :mcp meta with sub-commands: list / tools / tool <a.n> /
  connect <url> [alias] / disconnect <alias>. HELP block extended.

context.lua: DEFAULT_SYSTEM_PROMPT grows by ~4 lines per PHASE2.md §8
(hybrid prompt: static frame about MCP + dynamic tools list in the
request body). Block is always present even when no MCP servers
configured; ~60 tokens for clarity that 'CMD:' remains the fallback.

CMD: extraction unchanged: runs on the FINAL pure-text response only
(not on intermediate iterations of the tool sub-loop). Substrate §3
invariant preserved.

End-to-end verified two ways:

(1) Direct broker probe: aish's tools_schema fed through
    broker.chat_stream against hossenfelder → qwen-1.5b emits one
    tool_call payload with correct id + name="boltzmann.list_dir" +
    args='{"path":"/tmp"}'. Accumulator stitched the JSON-string across
    fragmented deltas.
(2) Mocked-broker sub-loop test: ask_ai feeds 'list /tmp', mock emits
    text + tool_call, sub-loop dispatches against LIVE boltzmann lmcp
    (auto_approve via policy), 80+ files rendered inside the tool_call
    frame, broker re-invoked with the extended context, mock returns
    pure text, sub-loop terminates. Total broker invocations: 2.

Known: the loaded fast model (qwen-1.5b) tends to emit "CMD: ..."
suggestions even when an MCP tool is the better path; the small model's
system-prompt compliance is weak. Larger models and the analyze-time
direct probe confirm the tools_schema and tool_calls flow is
wire-correct; Phase 7 verify will exercise this against qwen3-30b or
cloud models when available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 15:20:42 +00:00
parent efdc7281c7
commit 7e9cfff04d
2 changed files with 330 additions and 42 deletions
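
The alternation guarantee described in the commit message (every tool_call is answered by exactly one role:"tool" turn, including on decline or argument-parse failure) can be illustrated with a small checker. This is a sketch for review purposes only, not code from the commit: `check_alternation`, the `call_1` id, and the sample contents are hypothetical. The field names (`role`, `content`, `tool_calls`, `tool_call_id`, `id`, `name`, `arguments`) follow the turns that repl.lua appends in the diff below.

```lua
-- Hypothetical helper (not part of this commit): verifies that each
-- assistant tool_call is followed by exactly one role:"tool" turn
-- before any other user/assistant turn appears.
local function check_alternation(turns)
    local pending = {}    -- tool_call ids still waiting for a tool turn
    local n_pending = 0
    for i, t in ipairs(turns) do
        if t.role == "tool" then
            if not pending[t.tool_call_id] then
                return false, ("turn %d: unexpected tool turn"):format(i)
            end
            pending[t.tool_call_id] = nil
            n_pending = n_pending - 1
        else
            if n_pending > 0 then
                return false, ("turn %d: unanswered tool_call"):format(i)
            end
            if t.role == "assistant" and t.tool_calls then
                for _, call in ipairs(t.tool_calls) do
                    pending[call.id] = true
                    n_pending = n_pending + 1
                end
            end
        end
    end
    if n_pending > 0 then
        return false, "trailing unanswered tool_call"
    end
    return true
end

-- Sample context after one tool round (contents are made up).
local turns = {
    { role = "user", content = "list /tmp" },
    { role = "assistant", content = "",
      tool_calls = { { id = "call_1", name = "boltzmann.list_dir",
                       arguments = '{"path":"/tmp"}' } } },
    { role = "tool", tool_call_id = "call_1",
      content = "[aish] tool call declined by user" },
    { role = "assistant", content = "Understood, nothing was listed." },
}
print(check_alternation(turns))  --> true
```

A declined or failed call still passes the check because the synthesized "[aish] ..." body occupies the tool turn.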
@@ -10,12 +10,22 @@ local M = {}

 -- The §6 default system prompt. The `CMD: ` (exact prefix, single space)
 -- contract is locked per §3 invariants — do not edit without amending PHASE0.
+-- Phase 2 appends ~4 lines about MCP tools per PHASE2.md §8 (hybrid:
+-- static frame here + dynamic tools list in the request body). The block
+-- is always present even when no MCP servers are configured — the cost
+-- is ~60 tokens and the model just sees instructions that don't apply.
 local DEFAULT_SYSTEM_PROMPT = [[
 You are aish, an AI-augmented shell assistant. You help the user execute shell
 commands, write and debug code, and re-engineer software. When suggesting shell
 commands, output them on a line beginning with exactly "CMD: " so aish can
 identify and optionally execute them. Be concise. Prefer concrete actions over
-explanations unless asked.]]
+explanations unless asked.
+
+You may have access to MCP tools — they appear in this request's `tools` field.
+Call a tool by emitting a tool_call; the result will be supplied in the next
+turn. Use tools for structured operations (file reads, queries, etc.) and
+`CMD:` lines for local shell commands. Prefer tools when available; fall back
+to `CMD:` for anything not exposed as a tool.]]

 local Context = {}
 Context.__index = Context
@@ -9,12 +9,15 @@ local broker   = require("broker")
 local renderer = require("renderer")
 local Context  = require("context")
 local history  = require("history")
+local mcp      = require("mcp")
+local safety   = require("safety")
+local json     = require("dkjson")

 local M = {}

 local HELP = [[
 Meta commands:
-  :quit / :q       exit aish (current session is flushed and closed)
+  :quit / :q              exit aish (session flushed and closed)
  :clear                  clear screen (history kept)
  :reset                  clear in-memory conversation history
  :model <name>           switch active model
@@ -25,6 +28,11 @@ Meta commands:
  :sessions               list session log files
  :save <name>            rename current session log to <name>.jsonl
  :resume <name>          load <name>.jsonl turns into the in-memory context
+  :mcp list               show connected MCP servers
+  :mcp tools              list tools across all sessions
+  :mcp tool <alias.name>  show one tool's inputSchema
+  :mcp connect <url> [a]  open an MCP session at runtime
+  :mcp disconnect <alias> drop an MCP session
  :help                   this message
 ]]

@@ -39,6 +47,117 @@ function M.run(config)

    local ctx = Context.new(config.context or {})

+    -- Phase 2: MCP sessions. Populated from config.mcp.servers at startup
+    -- (best-effort — failures are status-logged once, session absent from
+    -- mcp_sessions until manual :mcp connect; no auto-retry per PHASE2.md
+    -- §4 Lifecycle). Tools cached per-session for the session lifetime
+    -- (lmcp announces capabilities.tools.listChanged = false).
+    local mcp_sessions = {}  -- { [alias] = session }
+    local function connect_mcp(alias, server_cfg)
+        local sess = mcp.connect(server_cfg.url, {
+            alias      = alias,
+            auth_token = server_cfg.auth_token,
+            auth_env   = server_cfg.auth_env,
+        })
+        local ok, kind, err = sess:initialize()
+        if not ok then
+            renderer.status(("mcp %s: %s (%s)")
+                :format(alias, tostring(err), kind))
+            return false
+        end
+        mcp_sessions[alias] = sess
+        if sess.version_warning then
+            renderer.status("mcp " .. alias .. ": " .. sess.version_warning)
+        end
+        return true, #sess:list_tools()
+    end
+
+    if config.mcp and config.mcp.servers then
+        for alias, server_cfg in pairs(config.mcp.servers) do
+            local ok, n = connect_mcp(alias, server_cfg)
+            if ok then
+                renderer.status(("mcp %s: %d tools"):format(alias, n))
+            end
+        end
+    end
+
+    -- Assemble OpenAI-shape `tools` array across all live sessions, with
+    -- alias.name namespacing per PHASE2.md §5. Empty array → broker omits
+    -- the field entirely (§12 risk row 1).
+    local function tools_schema()
+        local out = {}
+        for alias, sess in pairs(mcp_sessions) do
+            for _, t in ipairs(sess:list_tools()) do
+                out[#out + 1] = {
+                    type = "function",
+                    ["function"] = {
+                        name        = alias .. "." .. t.name,
+                        description = t.description or "",
+                        parameters  = t.inputSchema
+                                        or { type = "object", properties = {} },
+                    },
+                }
+            end
+        end
+        return out
+    end
+
+    -- §4 "Content flattening": tool results may carry multiple blocks; v1
+    -- concatenates text and ignores non-text with a one-shot status.
+    local non_text_warned = false
+    local function flatten_content(content)
+        local parts = {}
+        local saw_non_text = false
+        for _, b in ipairs(content or {}) do
+            if b.type == "text" then
+                parts[#parts + 1] = b.text or ""
+            else
+                saw_non_text = true
+            end
+        end
+        if saw_non_text and not non_text_warned then
+            non_text_warned = true
+            renderer.status("tool returned non-text content blocks "
+                            .. "(image/resource ignored in v1)")
+        end
+        return table.concat(parts, "\n")
+    end
+
+    -- Split <alias>.<tool>, look up session, call. Returns (content_string,
+    -- is_error). Errors of all flavors (rpc, transport, missing alias)
+    -- yield a synthesized "[aish] ..." string so the caller always has a
+    -- body for the role:"tool" turn; this is the strict-template
+    -- alternation rationale per PHASE0.md §6 and the C5/C7 fold in
+    -- PHASE2.md §4.
+    local function dispatch_tool_call(name, args)
+        local alias, tool_name = name:match("^([^.]+)%.(.+)$")
+        if not alias then
+            return ("[aish] tool name has no alias prefix: %s"):format(name), true
+        end
+        local sess = mcp_sessions[alias]
+        if not sess then
+            return ("[aish] no MCP server connected for alias '%s'")
+                    :format(alias), true
+        end
+        local result, kind, err = sess:call_tool(tool_name, args)
+        if not result then
+            if kind == "rpc_error" then
+                local msg = (type(err) == "table" and err.message)
+                              or tostring(err)
+                return ("[aish] tool dispatch failed: %s"):format(msg), true
+            else
+                return ("[aish] tool transport error: %s")
+                        :format(tostring(err)), true
+            end
+        end
+        -- result has content[] and possibly isError=true. flatten_content
+        -- handles the text-blocks-only flattening. We pass through the
+        -- content body regardless of isError (per PHASE2-baseline.md §3:
+        -- some tools set isError=false on actual failures, content text
+        -- is authoritative).
+        return flatten_content(result.content), (kind == "handler_error")
+    end
+
    -- Session log (PHASE1.md §6). Always open one on startup; auto-write
    -- every user/assistant turn; close on :quit. If history.dir is set but
    -- unwritable, log a status and continue without persistence.
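
The two pure helpers in the hunk above are easiest to review by looking at their edge cases. The following is a standalone sketch under stated assumptions: `split_tool_name` and `flatten_text` are hypothetical names that reuse the same Lua pattern and the same text-only concatenation as the diff, with the renderer warning replaced by a returned count so the block runs without aish loaded.

```lua
-- Same pattern as dispatch_tool_call: the alias is everything before
-- the FIRST dot; the tool name is the remainder (dots included).
local function split_tool_name(name)
    return name:match("^([^.]+)%.(.+)$")
end

-- Text-only flattening as in flatten_content, but returning the number
-- of dropped non-text blocks instead of emitting a status line.
local function flatten_text(content)
    local parts, dropped = {}, 0
    for _, block in ipairs(content or {}) do
        if block.type == "text" then
            parts[#parts + 1] = block.text or ""
        else
            dropped = dropped + 1
        end
    end
    return table.concat(parts, "\n"), dropped
end

local alias, tool = split_tool_name("boltzmann.list_dir")
-- alias == "boltzmann", tool == "list_dir"

local a2, t2 = split_tool_name("srv.fs.read")
-- a2 == "srv", t2 == "fs.read" (only the first dot separates the alias)

local a3 = split_tool_name("nodot")
-- a3 == nil, which dispatch_tool_call reports as a missing alias prefix

local body, dropped = flatten_text({
    { type = "text",  text = "first" },
    { type = "image" },                  -- ignored in v1
    { type = "text",  text = "second" },
})
-- body == "first\nsecond", dropped == 1
```

One consequence worth noting: an alias can never contain a dot, while a tool name may, so names produced by tools_schema() always split back to the alias they came from.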
@@ -104,40 +223,122 @@ function M.run(config)
        end
    end

-    -- Send user text to the active model, render the response token-by-token
-    -- via broker.chat_stream, and (per §6 + config.shell.confirm_cmd) optionally
-    -- execute extracted CMD: lines on the reassembled full text.
+    -- Send user text to the active model and process the response. If MCP
+    -- tools are connected and the model emits tool_calls, dispatch each
+    -- call (with safety confirm gate), append role:"tool" turns, and
+    -- re-call the broker — looping until the model returns pure text or
+    -- max_tool_depth is hit. CMD: extraction runs ONCE on the final
+    -- pure-text response (the §6 substrate invariant is unchanged).
+    local max_tool_depth = (config.mcp and config.mcp.max_tool_depth) or 8
+
    local function ask_ai(text)
        local prev_pending = ctx.pending_exec_output
-        ctx:append_user(text)  -- flushes any pending [exec output] as prefix
-        log_turn(ctx.turns[#ctx.turns])  -- merged user turn (may include exec)
+        ctx:append_user(text)
+        log_turn(ctx.turns[#ctx.turns])

-        local parts = {}
+        local depth = 0
+        local final_resp = ""
+        local first_iteration = true
+
+        while true do
+            local text_parts      = {}
+            local tool_calls_seen = {}
            local ok, err = broker.chat_stream(active_cfg, ctx:to_messages(),
                function(kind, payload)
-                -- Phase 2: callback shape widened to (kind, payload).
-                -- tool_call kinds are handled by the sub-loop landing in
-                -- commit #6; this commit ships only the text path so Phase
-                -- 1 streaming stays functional between #5 and #6.
                    if kind == "text" then
-                    parts[#parts + 1] = payload
+                        text_parts[#text_parts + 1] = payload
                        renderer.assistant_delta(payload)
+                    elseif kind == "tool_call" then
+                        tool_calls_seen[#tool_calls_seen + 1] = payload
                    end
-            end)
+                end,
+                { tools = tools_schema() })
            renderer.assistant_flush()

            if not ok then
                renderer.status("broker error: " .. tostring(err))
-            table.remove(ctx.turns)              -- back out the merged user turn
-            ctx.pending_exec_output = prev_pending  -- restore buffered exec output
+                if first_iteration then
+                    -- Back out the user turn so :resume / retry is clean.
+                    table.remove(ctx.turns)
+                    ctx.pending_exec_output = prev_pending
+                end
                return
            end
-        local resp = table.concat(parts)
-        ctx:append({ role = "assistant", content = resp })
+            first_iteration = false
+
+            local resp_text = table.concat(text_parts)
+
+            if #tool_calls_seen == 0 then
+                -- Pure text response — end of this AI turn.
+                ctx:append({ role = "assistant", content = resp_text })
                log_turn(ctx.turns[#ctx.turns])
+                final_resp = resp_text
+                break
+            end
+
+            -- Record the assistant turn with text AND tool_calls. Content
+            -- may be "" (C3: model often emits no prose before a call).
+            ctx:append({
+                role       = "assistant",
+                content    = resp_text,
+                tool_calls = tool_calls_seen,
+            })
+            log_turn(ctx.turns[#ctx.turns])
+
+            -- Process each tool_call. Every iteration appends EXACTLY one
+            -- role:"tool" turn per call (keeps alternation legal even on
+            -- decline/error per C5/C7).
+            for _, call in ipairs(tool_calls_seen) do
+                local args_table, args_err
+                if call.arguments and call.arguments ~= "" then
+                    args_table, _, args_err = json.decode(call.arguments)
+                else
+                    args_table = {}
+                end
+
+                local tool_content, is_error
+                if args_err then
+                    tool_content = ("[aish] tool arguments not parseable as "
+                                    .. "JSON: %s"):format(tostring(args_err))
+                    is_error = true
+                    renderer.tool_call_begin(call.name, call.arguments)
+                    renderer.tool_call_end(tool_content, true)
+                elseif not safety.confirm_tool_call(call.name, args_table,
+                                                   config) then
+                    tool_content = "[aish] tool call declined by user"
+                    is_error = true
+                    renderer.status(tool_content)
+                else
+                    renderer.tool_call_begin(call.name, call.arguments)
+                    local content, errflag = dispatch_tool_call(call.name,
+                                                                args_table)
+                    tool_content = content
+                    is_error = errflag
+                    renderer.tool_call_end(content, errflag)
+                end
+
+                ctx:append({
+                    role         = "tool",
+                    tool_call_id = call.id,
+                    content      = tool_content,
+                })
+                log_turn(ctx.turns[#ctx.turns])
+            end
+
+            depth = depth + 1
+            if depth >= max_tool_depth then
+                renderer.status(("tool-call depth limit reached (%d); "
+                                 .. "stopping sub-loop"):format(max_tool_depth))
+                final_resp = resp_text
+                break
+            end
+            -- loop body re-runs broker.chat_stream with the now-extended ctx
+        end
+
        status_evictions(ctx:enforce_budget())

-        for _, cmd in ipairs(executor.extract_cmd_lines(resp)) do
+        -- CMD: extraction on final_resp (last reply's text if depth-capped).
+        for _, cmd in ipairs(executor.extract_cmd_lines(final_resp)) do
            local doit
            if config.shell and config.shell.confirm_cmd then
                local ans = rl.readline(("execute '%s'? [y/N] "):format(cmd)) or ""
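
The control flow of the rewritten ask_ai can be reduced to a few lines once streaming, logging, and the confirm gate are stripped away. The sketch below is an approximation for illustration, not the commit's code: `run_subloop`, `stub_chat`, and the `demo.echo` tool are hypothetical, and the stub returns a whole response table instead of streaming (kind, payload) deltas.

```lua
-- Hypothetical reduction of the tool sub-loop. `chat(messages)` returns
-- { text = string, tool_calls = array }; `dispatch(call)` returns the
-- tool body string. Returns the final text and the depth reached.
local function run_subloop(chat, dispatch, messages, max_depth)
    local depth = 0
    while true do
        local resp = chat(messages)
        if #resp.tool_calls == 0 then
            -- Pure text: the AI turn ends here.
            messages[#messages + 1] =
                { role = "assistant", content = resp.text }
            return resp.text, depth
        end
        messages[#messages + 1] = {
            role = "assistant", content = resp.text,
            tool_calls = resp.tool_calls,
        }
        -- Exactly one role:"tool" turn per call.
        for _, call in ipairs(resp.tool_calls) do
            messages[#messages + 1] = {
                role = "tool", tool_call_id = call.id,
                content = dispatch(call),
            }
        end
        depth = depth + 1
        if depth >= max_depth then
            return resp.text, depth   -- depth cap: stop re-issuing
        end
    end
end

-- Stub broker: one tool_call on the first request, pure text afterwards.
local calls = 0
local function stub_chat(_messages)
    calls = calls + 1
    if calls == 1 then
        return { text = "", tool_calls = {
            { id = "call_1", name = "demo.echo", arguments = "{}" } } }
    end
    return { text = "done", tool_calls = {} }
end

local messages = { { role = "user", content = "list /tmp" } }
local text, depth = run_subloop(stub_chat,
    function(_call) return "[stub] ok" end, messages, 8)
print(text, depth, calls, #messages)  --> done	1	2	4
```

The two broker invocations match the count reported for the mocked-broker test in the commit message. When the depth cap fires, the context ends on a tool turn and the returned text is that of the last tool-calling response.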
@@ -255,6 +456,83 @@ function M.run(config)
            for _, t in ipairs(turns) do ctx:append(t) end
            renderer.status(("resumed %d turns from %s"):format(#turns, name))
        end,
+        mcp = function(args)
+            local sub, sub_args = args:match("^%s*(%S*)%s*(.*)$")
+            if sub == "list" or sub == "" then
+                if next(mcp_sessions) == nil then
+                    renderer.status("(no MCP sessions)"); return
+                end
+                for alias, sess in pairs(mcp_sessions) do
+                    io.write(("  %s  %s  (%d tools)\n"):format(
+                        alias, sess.url, #sess:list_tools()))
+                end
+            elseif sub == "tools" then
+                local any = false
+                for alias, sess in pairs(mcp_sessions) do
+                    for _, t in ipairs(sess:list_tools()) do
+                        any = true
+                        local desc = (t.description or ""):gsub("\n", " ")
+                        io.write(("  %s.%-18s %s\n"):format(
+                            alias, t.name, desc:sub(1, 60)))
+                    end
+                end
+                if not any then renderer.status("(no tools)") end
+            elseif sub == "tool" then
+                local name = sub_args:match("^%s*(%S+)")
+                if not name then
+                    renderer.status("usage: :mcp tool <alias.name>"); return
+                end
+                local alias, tname = name:match("^([^.]+)%.(.+)$")
+                local sess = alias and mcp_sessions[alias]
+                if not sess then
+                    renderer.status("unknown alias: " .. tostring(alias))
+                    return
+                end
+                local found
+                for _, t in ipairs(sess:list_tools()) do
+                    if t.name == tname then found = t; break end
+                end
+                if not found then
+                    renderer.status("unknown tool: " .. name); return
+                end
+                io.write(("  %s.%s\n"):format(alias, found.name))
+                io.write(("  description: %s\n"):format(found.description or "(none)"))
+                io.write("  inputSchema:\n    ")
+                io.write((json.encode(found.inputSchema or {}, {indent = true})
+                          :gsub("\n", "\n    ")))
+                io.write("\n")
+            elseif sub == "connect" then
+                local url, alias = sub_args:match("^%s*(%S+)%s*(%S*)")
+                if not url or url == "" then
+                    renderer.status("usage: :mcp connect <url> [alias]"); return
+                end
+                if alias == "" then
+                    alias = url:match("https?://([^:/]+)") or url
+                end
+                if mcp_sessions[alias] then
+                    renderer.status("already connected: " .. alias); return
+                end
+                local ok, n = connect_mcp(alias, { url = url })
+                if ok then
+                    renderer.status(("mcp %s: connected (%d tools)")
+                                    :format(alias, n))
+                end
+            elseif sub == "disconnect" then
+                local alias = sub_args:match("^%s*(%S+)")
+                if not alias then
+                    renderer.status("usage: :mcp disconnect <alias>"); return
+                end
+                local sess = mcp_sessions[alias]
+                if not sess then
+                    renderer.status("not connected: " .. alias); return
+                end
+                sess:close()
+                mcp_sessions[alias] = nil
+                renderer.status("disconnected " .. alias)
+            else
+                renderer.status("usage: :mcp {list|tools|tool|connect|disconnect}")
+            end
+        end,
        help = function() io.write(HELP) end,
    }
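
The `:mcp` handler relies on two Lua patterns: one to peel off the sub-command and one to read `<url> [alias]` for `connect`. The standalone sketch below shows how they behave on typical input. `parse_mcp`, `parse_connect`, and the example host are hypothetical; the patterns and the default-alias rule are copied from the handler above.

```lua
-- Split ":mcp" arguments into sub-command and remainder.
local function parse_mcp(args)
    local sub, rest = args:match("^%s*(%S*)%s*(.*)$")
    return sub, rest
end

-- Read "<url> [alias]"; when no alias is given, default to the host
-- part of an http(s) URL, or to the whole string otherwise.
local function parse_connect(rest)
    local url, alias = rest:match("^%s*(%S+)%s*(%S*)")
    if not url then
        return nil   -- caller would print the usage line
    end
    if alias == "" then
        alias = url:match("https?://([^:/]+)") or url
    end
    return url, alias
end

local sub, rest = parse_mcp("connect http://host:8080/mcp")
-- sub == "connect", rest == "http://host:8080/mcp"

local url, alias = parse_connect(rest)
-- url == "http://host:8080/mcp", alias == "host"

local _, named = parse_connect("http://host:8080/mcp box")
-- named == "box"

local bare_sub = parse_mcp("")
-- bare_sub == "", which the handler treats the same as "list"
```

Because the default alias is the bare host name, two servers on the same host with different ports collide unless an explicit alias is supplied; the handler reports that case as "already connected".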