repl: tool-call sub-loop + :mcp meta + system-prompt augmentation

Phase 2 commit #6 per docs/PHASE2.md §12. End-to-end wiring of the MCP
tool-call flow on top of broker/safety/context/renderer/mcp.

repl.lua additions:
  - mcp_sessions table populated from config.mcp.servers at startup.
    connect_mcp() helper does initialize + caches tools/list. Failures
    status-logged once; absent from mcp_sessions until manual reconnect
    (C4 — no auto-retry).
  - tools_schema() flattens connected sessions' tools into the OpenAI
    {type:"function", function:{name,description,parameters}} shape with
    "<alias>.<name>" namespacing.
  - flatten_content() concatenates content[type="text"] blocks; one-shot
    status warning when non-text blocks (image/resource) are dropped
    (§4 normative spec, v1 only handles text).
  - dispatch_tool_call(name, args_table) splits alias.tool, looks up
    session, calls. Returns (content_string, is_error). Errors of every
    flavor (missing alias, no session, rpc_error, transport_error)
    yield a synthesized "[aish] ..." string so callers always have a
    body for the role:"tool" turn — alternation preserved per C5/C7.
  - ask_ai rewritten as a sub-loop that re-issues the broker request
    until the model returns pure text or max_tool_depth (default 8) is
    hit. Each iteration: stream response → if tool_calls present,
    confirm-gate each → dispatch → append role:"tool" turn → continue.
    Argument-JSON parse failure produces a synthesized tool turn (C7).
    Decline at confirm produces "[aish] tool call declined by user"
    tool turn (alternation guarantee).
  - :mcp meta with sub-commands: list / tools / tool <a.n> / connect
    <url> [alias] / disconnect <alias>. HELP block extended.

context.lua: DEFAULT_SYSTEM_PROMPT grows by ~4 lines per PHASE2.md §8
(hybrid prompt: static frame about MCP + dynamic tools list in the
request body). Block is always present even when no MCP servers
configured — ~60 tokens for clarity that 'CMD:' remains the fallback.

CMD: extraction unchanged — runs on the FINAL pure-text response only
(not on intermediate iterations of the tool sub-loop). Substrate §3
invariant preserved.

End-to-end verified two ways:
  (1) Direct broker probe: aish's tools_schema fed through
      broker.chat_stream against hossenfelder → qwen-1.5b emits one
      tool_call payload with correct id + name="boltzmann.list_dir"
      + args='{"path":"/tmp"}'. Accumulator stitched the JSON-string
      across fragmented deltas.
  (2) Mocked-broker sub-loop test: ask_ai feeds 'list /tmp', mock
      emits text + tool_call, sub-loop dispatches against LIVE
      boltzmann lmcp (auto_approve via policy), 80+ files rendered
      inside the tool_call frame, broker re-invoked with the
      extended context, mock returns pure text, sub-loop terminates.
      Total broker invocations: 2.

Known: the loaded fast model (qwen-1.5b) tends to emit "CMD: ..."
suggestions even when an MCP tool is the better path; the small
model's system-prompt compliance is weak. Larger models and the
analyze-time direct probe confirm the tools_schema and tool_calls
flow is wire-correct — Phase 7 verify will exercise this against
qwen3-30b or cloud models when available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-12 15:20:42 +00:00
parent efdc7281c7
commit 7e9cfff04d
2 changed files with 330 additions and 42 deletions
+11 -1
View File
@@ -10,12 +10,22 @@ local M = {}
-- The §6 default system prompt. The `CMD: ` (exact prefix, single space)
-- contract is locked per §3 invariants — do not edit without amending PHASE0.
-- Phase 2 appends ~4 lines about MCP tools per PHASE2.md §8 (hybrid:
-- static frame here + dynamic tools list in the request body). The block
-- is always present even when no MCP servers are configured — the cost
-- is ~60 tokens and the model just sees instructions that don't apply.
local DEFAULT_SYSTEM_PROMPT = [[
You are aish, an AI-augmented shell assistant. You help the user execute shell
commands, write and debug code, and re-engineer software. When suggesting shell
commands, output them on a line beginning with exactly "CMD: " so aish can
identify and optionally execute them. Be concise. Prefer concrete actions over
explanations unless asked.]]
explanations unless asked.
You may have access to MCP tools — they appear in this request's `tools` field.
Call a tool by emitting a tool_call; the result will be supplied in the next
turn. Use tools for structured operations (file reads, queries, etc.) and
`CMD:` lines for local shell commands. Prefer tools when available; fall back
to `CMD:` for anything not exposed as a tool.]]
local Context = {}
Context.__index = Context
+296 -18
View File
@@ -9,12 +9,15 @@ local broker = require("broker")
local renderer = require("renderer")
local Context = require("context")
local history = require("history")
local mcp = require("mcp")
local safety = require("safety")
local json = require("dkjson")
local M = {}
local HELP = [[
Meta commands:
:quit / :q exit aish (current session is flushed and closed)
:quit / :q exit aish (session flushed and closed)
:clear clear screen (history kept)
:reset clear in-memory conversation history
:model <name> switch active model
@@ -25,6 +28,11 @@ Meta commands:
:sessions list session log files
:save <name> rename current session log to <name>.jsonl
:resume <name> load <name>.jsonl turns into the in-memory context
:mcp list show connected MCP servers
:mcp tools list tools across all sessions
:mcp tool <alias.name> show one tool's inputSchema
:mcp connect <url> [a] open an MCP session at runtime
:mcp disconnect <alias> drop an MCP session
:help this message
]]
@@ -39,6 +47,117 @@ function M.run(config)
local ctx = Context.new(config.context or {})
-- Phase 2: MCP sessions. Populated from config.mcp.servers at startup
-- (best-effort — failures are status-logged once, session absent from
-- mcp_sessions until manual :mcp connect; no auto-retry per PHASE2.md
-- §4 Lifecycle). Tools cached per-session for the session lifetime
-- (lmcp announces capabilities.tools.listChanged = false).
local mcp_sessions = {} -- { [alias] = session }
local function connect_mcp(alias, server_cfg)
local sess = mcp.connect(server_cfg.url, {
alias = alias,
auth_token = server_cfg.auth_token,
auth_env = server_cfg.auth_env,
})
local ok, kind, err = sess:initialize()
if not ok then
renderer.status(("mcp %s: %s (%s)")
:format(alias, tostring(err), kind))
return false
end
mcp_sessions[alias] = sess
if sess.version_warning then
renderer.status("mcp " .. alias .. ": " .. sess.version_warning)
end
return true, #sess:list_tools()
end
if config.mcp and config.mcp.servers then
for alias, server_cfg in pairs(config.mcp.servers) do
local ok, n = connect_mcp(alias, server_cfg)
if ok then
renderer.status(("mcp %s: %d tools"):format(alias, n))
end
end
end
-- Assemble OpenAI-shape `tools` array across all live sessions, with
-- alias.name namespacing per PHASE2.md §5. Empty array → broker omits
-- the field entirely (§12 risk row 1).
local function tools_schema()
local out = {}
for alias, sess in pairs(mcp_sessions) do
for _, t in ipairs(sess:list_tools()) do
out[#out + 1] = {
type = "function",
["function"] = {
name = alias .. "." .. t.name,
description = t.description or "",
parameters = t.inputSchema
or { type = "object", properties = {} },
},
}
end
end
return out
end
-- §4 "Content flattening": tool results may carry multiple blocks; v1
-- concatenates text and ignores non-text with a one-shot status.
local non_text_warned = false
local function flatten_content(content)
local parts = {}
local saw_non_text = false
for _, b in ipairs(content or {}) do
if b.type == "text" then
parts[#parts + 1] = b.text or ""
else
saw_non_text = true
end
end
if saw_non_text and not non_text_warned then
non_text_warned = true
renderer.status("tool returned non-text content blocks "
.. "(image/resource ignored in v1)")
end
return table.concat(parts, "\n")
end
-- Split <alias>.<tool>, look up session, call. Returns (content_string,
-- is_error). Errors of all flavors (rpc, transport, missing alias)
-- yield a synthesized "[aish] tool ... failed: ..." string so the
-- caller always has a body for the role:"tool" turn — the strict-
-- template alternation rationale per PHASE0.md §6 and the C5/C7 fold
-- in PHASE2.md §4.
local function dispatch_tool_call(name, args)
local alias, tool_name = name:match("^([^.]+)%.(.+)$")
if not alias then
return ("[aish] tool name has no alias prefix: %s"):format(name), true
end
local sess = mcp_sessions[alias]
if not sess then
return ("[aish] no MCP server connected for alias '%s'")
:format(alias), true
end
local result, kind, err = sess:call_tool(tool_name, args)
if not result then
if kind == "rpc_error" then
local msg = (type(err) == "table" and err.message)
or tostring(err)
return ("[aish] tool dispatch failed: %s"):format(msg), true
else
return ("[aish] tool transport error: %s")
:format(tostring(err)), true
end
end
-- result has content[] and possibly isError=true. flatten_content
-- handles the text-blocks-only flattening. We pass through the
-- content body regardless of isError (per PHASE2-baseline.md §3:
-- some tools set isError=false on actual failures, content text
-- is authoritative).
return flatten_content(result.content), (kind == "handler_error")
end
-- Session log (PHASE1.md §6). Always open one on startup; auto-write
-- every user/assistant turn; close on :quit. If history.dir is set but
-- unwritable, log a status and continue without persistence.
@@ -104,40 +223,122 @@ function M.run(config)
end
end
-- Send user text to the active model, render the response token-by-token
-- via broker.chat_stream, and (per §6 + config.shell.confirm_cmd) optionally
-- execute extracted CMD: lines on the reassembled full text.
-- Send user text to the active model and process the response. If MCP
-- tools are connected and the model emits tool_calls, dispatch each
-- call (with safety confirm gate), append role:"tool" turns, and
-- re-call the broker — looping until the model returns pure text or
-- max_tool_depth is hit. CMD: extraction runs ONCE on the final
-- pure-text response (the §6 substrate invariant is unchanged).
local max_tool_depth = (config.mcp and config.mcp.max_tool_depth) or 8
local function ask_ai(text)
local prev_pending = ctx.pending_exec_output
ctx:append_user(text) -- flushes any pending [exec output] as prefix
log_turn(ctx.turns[#ctx.turns]) -- merged user turn (may include exec)
ctx:append_user(text)
log_turn(ctx.turns[#ctx.turns])
local parts = {}
local depth = 0
local final_resp = ""
local first_iteration = true
while true do
local text_parts = {}
local tool_calls_seen = {}
local ok, err = broker.chat_stream(active_cfg, ctx:to_messages(),
function(kind, payload)
-- Phase 2: callback shape widened to (kind, payload).
-- tool_call kinds are handled by the sub-loop landing in
-- commit #6; this commit ships only the text path so Phase
-- 1 streaming stays functional between #5 and #6.
if kind == "text" then
parts[#parts + 1] = payload
text_parts[#text_parts + 1] = payload
renderer.assistant_delta(payload)
elseif kind == "tool_call" then
tool_calls_seen[#tool_calls_seen + 1] = payload
end
end)
end,
{ tools = tools_schema() })
renderer.assistant_flush()
if not ok then
renderer.status("broker error: " .. tostring(err))
table.remove(ctx.turns) -- back out the merged user turn
ctx.pending_exec_output = prev_pending -- restore buffered exec output
if first_iteration then
-- Back out the user turn so :resume / retry is clean.
table.remove(ctx.turns)
ctx.pending_exec_output = prev_pending
end
return
end
local resp = table.concat(parts)
ctx:append({ role = "assistant", content = resp })
first_iteration = false
local resp_text = table.concat(text_parts)
if #tool_calls_seen == 0 then
-- Pure text response — end of this AI turn.
ctx:append({ role = "assistant", content = resp_text })
log_turn(ctx.turns[#ctx.turns])
final_resp = resp_text
break
end
-- Record the assistant turn with text AND tool_calls. Content
-- may be "" (C3: model often emits no prose before a call).
ctx:append({
role = "assistant",
content = resp_text,
tool_calls = tool_calls_seen,
})
log_turn(ctx.turns[#ctx.turns])
-- Process each tool_call. Every iteration appends EXACTLY one
-- role:"tool" turn per call (keeps alternation legal even on
-- decline/error per C5/C7).
for _, call in ipairs(tool_calls_seen) do
local args_table, args_err
if call.arguments and call.arguments ~= "" then
args_table, _, args_err = json.decode(call.arguments)
else
args_table = {}
end
local tool_content, is_error
if args_err then
tool_content = ("[aish] tool arguments not parseable as "
.. "JSON: %s"):format(tostring(args_err))
is_error = true
renderer.tool_call_begin(call.name, call.arguments)
renderer.tool_call_end(tool_content, true)
elseif not safety.confirm_tool_call(call.name, args_table,
config) then
tool_content = "[aish] tool call declined by user"
is_error = true
renderer.status(tool_content)
else
renderer.tool_call_begin(call.name, call.arguments)
local content, errflag = dispatch_tool_call(call.name,
args_table)
tool_content = content
is_error = errflag
renderer.tool_call_end(content, errflag)
end
ctx:append({
role = "tool",
tool_call_id = call.id,
content = tool_content,
})
log_turn(ctx.turns[#ctx.turns])
end
depth = depth + 1
if depth >= max_tool_depth then
renderer.status(("tool-call depth limit reached (%d); "
.. "stopping sub-loop"):format(max_tool_depth))
final_resp = resp_text
break
end
-- loop body re-runs broker.chat_stream with the now-extended ctx
end
status_evictions(ctx:enforce_budget())
for _, cmd in ipairs(executor.extract_cmd_lines(resp)) do
-- CMD: extraction on the final pure-text response only.
for _, cmd in ipairs(executor.extract_cmd_lines(final_resp)) do
local doit
if config.shell and config.shell.confirm_cmd then
local ans = rl.readline(("execute '%s'? [y/N] "):format(cmd)) or ""
@@ -255,6 +456,83 @@ function M.run(config)
for _, t in ipairs(turns) do ctx:append(t) end
renderer.status(("resumed %d turns from %s"):format(#turns, name))
end,
mcp = function(args)
local sub, sub_args = args:match("^%s*(%S*)%s*(.*)$")
if sub == "list" or sub == "" then
if next(mcp_sessions) == nil then
renderer.status("(no MCP sessions)"); return
end
for alias, sess in pairs(mcp_sessions) do
io.write((" %s %s (%d tools)\n"):format(
alias, sess.url, #sess:list_tools()))
end
elseif sub == "tools" then
local any = false
for alias, sess in pairs(mcp_sessions) do
for _, t in ipairs(sess:list_tools()) do
any = true
local desc = (t.description or ""):gsub("\n", " ")
io.write((" %s.%-18s %s\n"):format(
alias, t.name, desc:sub(1, 60)))
end
end
if not any then renderer.status("(no tools)") end
elseif sub == "tool" then
local name = sub_args:match("^%s*(%S+)")
if not name then
renderer.status("usage: :mcp tool <alias.name>"); return
end
local alias, tname = name:match("^([^.]+)%.(.+)$")
local sess = alias and mcp_sessions[alias]
if not sess then
renderer.status("unknown alias: " .. tostring(alias))
return
end
local found
for _, t in ipairs(sess:list_tools()) do
if t.name == tname then found = t; break end
end
if not found then
renderer.status("unknown tool: " .. name); return
end
io.write((" %s.%s\n"):format(alias, found.name))
io.write((" description: %s\n"):format(found.description or "(none)"))
io.write(" inputSchema:\n ")
io.write((json.encode(found.inputSchema or {}, {indent = true})
:gsub("\n", "\n ")))
io.write("\n")
elseif sub == "connect" then
local url, alias = sub_args:match("^%s*(%S+)%s*(%S*)")
if not url or url == "" then
renderer.status("usage: :mcp connect <url> [alias]"); return
end
if alias == "" then
alias = url:match("https?://([^:/]+)") or url
end
if mcp_sessions[alias] then
renderer.status("already connected: " .. alias); return
end
local ok, n = connect_mcp(alias, { url = url })
if ok then
renderer.status(("mcp %s: connected (%d tools)")
:format(alias, n))
end
elseif sub == "disconnect" then
local alias = sub_args:match("^%s*(%S+)")
if not alias then
renderer.status("usage: :mcp disconnect <alias>"); return
end
local sess = mcp_sessions[alias]
if not sess then
renderer.status("not connected: " .. alias); return
end
sess:close()
mcp_sessions[alias] = nil
renderer.status("disconnected " .. alias)
else
renderer.status("usage: :mcp {list|tools|tool|connect|disconnect}")
end
end,
help = function() io.write(HELP) end,
}