Phase 2 commit #6 per docs/PHASE2.md §12. End-to-end wiring of the MCP
tool-call flow on top of broker/safety/context/renderer/mcp.
repl.lua additions:
- mcp_sessions table populated from config.mcp.servers at startup.
connect_mcp() helper runs initialize and caches tools/list. A failed
connect is status-logged once, and the server stays absent from
mcp_sessions until a manual reconnect (C4: no auto-retry).
- tools_schema() flattens connected sessions' tools into the OpenAI
{type:"function", function:{name,description,parameters}} shape with
"<alias>.<name>" namespacing.
- flatten_content() concatenates content[type="text"] blocks; one-shot
status warning when non-text blocks (image/resource) are dropped
(§4 normative spec, v1 only handles text).
- dispatch_tool_call(name, args_table) splits alias.tool, looks up
session, calls. Returns (content_string, is_error). Errors of every
flavor (missing alias, no session, rpc_error, transport_error)
yield a synthesized "[aish] ..." string so callers always have a
body for the role:"tool" turn — alternation preserved per C5/C7.
- ask_ai rewritten as a sub-loop that re-issues the broker request
until the model returns pure text or max_tool_depth (default 8) is
hit. Each iteration: stream response → if tool_calls present,
confirm-gate each → dispatch → append role:"tool" turn → continue.
Argument-JSON parse failure produces a synthesized tool turn (C7).
Decline at confirm produces "[aish] tool call declined by user"
tool turn (alternation guarantee).
- :mcp meta with sub-commands: list / tools / tool <a.n> / connect
<url> [alias] / disconnect <alias>. HELP block extended.
context.lua: DEFAULT_SYSTEM_PROMPT grows by ~4 lines per PHASE2.md §8
(hybrid prompt: static frame about MCP + dynamic tools list in the
request body). The block is always present, even when no MCP servers
are configured: it costs ~60 tokens and makes clear that 'CMD:'
remains the fallback.
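A minimal sketch of the three repl.lua helpers listed above, which
produce the dynamic tools list and the role:"tool" bodies. The session
shape (a `tools` array plus a `call` method) and passing mcp_sessions as
an explicit argument are assumptions made for illustration; the join
between text blocks and the exact "[aish] ..." wording are also guesses,
not the shipped strings.

```lua
-- Sketch only. Assumed session shape (not confirmed by the source):
--   { tools = { {name=..., description=..., inputSchema=...}, ... },
--     call  = function(self, tool_name, args) return { content = {...}, isError = bool } end }

-- Concatenate text blocks; count non-text blocks so the caller can warn once.
local function flatten_content(blocks)
  local parts, dropped = {}, 0
  for _, block in ipairs(blocks or {}) do
    if block.type == "text" then
      parts[#parts + 1] = block.text or ""
    else
      dropped = dropped + 1
    end
  end
  return table.concat(parts), dropped
end

-- Flatten every connected session's tools into the OpenAI function shape.
local function tools_schema(mcp_sessions)
  local aliases = {}
  for alias in pairs(mcp_sessions) do aliases[#aliases + 1] = alias end
  table.sort(aliases) -- deterministic order (an assumption of this sketch)
  local schema = {}
  for _, alias in ipairs(aliases) do
    for _, tool in ipairs(mcp_sessions[alias].tools or {}) do
      schema[#schema + 1] = {
        type = "function",
        ["function"] = { -- "function" is a reserved word, hence the bracket key
          name = alias .. "." .. tool.name,
          description = tool.description or "",
          parameters = tool.inputSchema or { type = "object" },
        },
      }
    end
  end
  return schema
end

-- Always returns (content_string, is_error), even when the call fails.
local function dispatch_tool_call(mcp_sessions, name, args)
  local alias, tool = tostring(name):match("^([^.]+)%.(.+)$")
  if not alias then
    return "[aish] malformed tool name: " .. tostring(name), true
  end
  local session = mcp_sessions[alias]
  if not session then
    return "[aish] no MCP session for alias: " .. alias, true
  end
  local ok, result = pcall(session.call, session, tool, args)
  if not ok then
    return "[aish] tool call failed: " .. tostring(result), true
  end
  -- flatten_content returns two values; only the first is kept here.
  return flatten_content(result.content), result.isError == true
end
```

Because every failure path still returns a string, the caller can append
a role:"tool" turn unconditionally.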
CMD: extraction unchanged — runs on the FINAL pure-text response only
(not on intermediate iterations of the tool sub-loop). Substrate §3
invariant preserved.
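The sub-loop and the "CMD: on final text only" rule can be sketched as
follows. This is an illustration under assumptions, not the ask_ai body:
the `deps` table (chat, decode_args, confirm, dispatch) is a hypothetical
injection point, the flat {id, name, arguments} call shape is assumed,
the behaviour at max_tool_depth is a guess, and the extract_cmd pattern
is a stand-in for the real extraction contract.

```lua
-- Hypothetical CMD: extractor; the real contract lives in the substrate doc.
local function extract_cmd(text)
  return (text or ""):match("CMD:%s*([^\n]+)")
end

-- Re-issue the request until the model answers with pure text.
-- deps.chat(messages)    -> { content = string, tool_calls = {...} or nil }
-- deps.decode_args(json) -> table, or nil plus an error message
-- deps.confirm(call)     -> boolean (user gate)
-- deps.dispatch(name, a) -> content string
local function run_tool_loop(messages, deps, max_tool_depth)
  max_tool_depth = max_tool_depth or 8
  for _ = 1, max_tool_depth do
    local reply = deps.chat(messages)
    local calls = reply.tool_calls
    if not calls or #calls == 0 then
      return reply.content -- final pure-text answer; CMD: extraction runs on this only
    end
    messages[#messages + 1] = {
      role = "assistant", content = reply.content or "", tool_calls = calls,
    }
    for _, call in ipairs(calls) do
      local body
      local args, err = deps.decode_args(call.arguments)
      if not args then
        body = "[aish] could not parse tool arguments: " .. tostring(err)
      elseif not deps.confirm(call) then
        body = "[aish] tool call declined by user"
      else
        body = deps.dispatch(call.name, args)
      end
      -- Every tool_call gets a tool turn, so alternation is never broken.
      messages[#messages + 1] = { role = "tool", tool_call_id = call.id, content = body }
    end
  end
  return nil, "max_tool_depth reached" -- assumption: the source does not say what happens here
end
```

A caller would run extract_cmd() on the returned text only, never on the
intermediate replies that carried tool_calls.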
End-to-end verified two ways:
(1) Direct broker probe: aish's tools_schema fed through
broker.chat_stream against hossenfelder → qwen-1.5b emits one
tool_call payload with correct id + name="boltzmann.list_dir"
+ args='{"path":"/tmp"}'. Accumulator stitched the JSON-string
across fragmented deltas.
(2) Mocked-broker sub-loop test: ask_ai feeds 'list /tmp', mock
emits text + tool_call, sub-loop dispatches against LIVE
boltzmann lmcp (auto_approve via policy), 80+ files rendered
inside the tool_call frame, broker re-invoked with the
extended context, mock returns pure text, sub-loop terminates.
Total broker invocations: 2.
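For reference, stitching a fragmented arguments string (as seen in probe
(1)) could look like the sketch below. It assumes the OpenAI streaming
shape where each delta entry carries a zero-based `index` and optional
`id`, `function.name` and `function.arguments` fragments; the
new_accumulator name and its feed/finish API are hypothetical, not the
broker's actual accumulator.

```lua
-- Collect streamed tool_call fragments, keyed by the delta's zero-based index.
local function new_accumulator()
  local calls = {}
  local acc = {}

  -- Feed one chunk's delta.tool_calls array (may be nil for text-only chunks).
  function acc.feed(delta_tool_calls)
    for _, d in ipairs(delta_tool_calls or {}) do
      local slot = (d.index or 0) + 1 -- Lua arrays are 1-based
      local call = calls[slot]
      if not call then
        call = { arg_parts = {} }
        calls[slot] = call
      end
      if d.id then call.id = d.id end
      local fn = d["function"]
      if fn then
        if fn.name then call.name = fn.name end
        if fn.arguments then call.arg_parts[#call.arg_parts + 1] = fn.arguments end
      end
    end
  end

  -- Join the fragments; the result is still a JSON string, not a decoded table.
  -- Assumes indices arrive without gaps (ipairs stops at the first hole).
  function acc.finish()
    local out = {}
    for i, call in ipairs(calls) do
      out[i] = { id = call.id, name = call.name, arguments = table.concat(call.arg_parts) }
    end
    return out
  end

  return acc
end
```

The arguments string is only decoded after finish(), which is where a
parse failure would surface as a synthesized tool turn.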
Known issue: the loaded fast model (qwen-1.5b) tends to emit "CMD: ..."
suggestions even when an MCP tool is the better path; the small
model's system-prompt compliance is weak. Larger models and the
analyze-time direct probe confirm the tools_schema and tool_calls
flow is wire-correct; Phase 7 verify will exercise this against
qwen3-30b or cloud models when available.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 commit #3 per docs/PHASE2.md §12. Three concrete edits per §3
context.lua row (the BLOCKER-fold-in from review):
(a) Loosen Context:append shape-per-role: assistant may carry empty
content if tool_calls is non-empty; role:"tool" requires
tool_call_id + content.
(b) Preserve tool_calls / tool_call_id on store (Phase 1 :append
built {role, content} only and silently dropped extras).
(c) Extend to_messages() with two emission modes selected by
use_tool_role:
true (default) — OpenAI-standard role:"tool" + assistant
turns with tool_calls (wrapped as {id, type:"function",
function:{name, arguments}}).
false (fallback) — collapse assistant-with-tool_calls + its
following role:"tool" turns into a single assistant text
turn with synthesized "[tool: name]\n<args>\n[result]\n
<content>" body; merge consecutive assistant turns so the
trailing post-tool-result text doesn't yield asst/asst
back-to-back (same strict-template gotcha PHASE0.md §6
warned about for user/user).
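The fallback (use_tool_role = false) collapse in (c) can be sketched as
below. This is an illustration, not the to_messages() body: it assumes
stored tool_calls are flat {id, name, arguments} records, that merged
assistant text is joined with a newline, and the collapse_tool_turns name
is hypothetical.

```lua
-- Collapse assistant-with-tool_calls + tool turns into plain assistant text.
local function collapse_tool_turns(turns)
  local out = {}
  local pending = {} -- tool_call_id -> stored call record

  -- Append assistant text, merging into a preceding assistant message so the
  -- output never contains two assistant turns back to back.
  local function push_assistant(text)
    local last = out[#out]
    if last and last.role == "assistant" then
      if last.content ~= "" then
        last.content = last.content .. "\n" .. text
      else
        last.content = text
      end
    else
      out[#out + 1] = { role = "assistant", content = text }
    end
  end

  for _, turn in ipairs(turns) do
    if turn.role == "assistant" and turn.tool_calls and #turn.tool_calls > 0 then
      for _, call in ipairs(turn.tool_calls) do pending[call.id] = call end
      if turn.content and turn.content ~= "" then push_assistant(turn.content) end
    elseif turn.role == "tool" then
      local call = pending[turn.tool_call_id] or { name = "?", arguments = "" }
      push_assistant(("[tool: %s]\n%s\n[result]\n%s"):format(
        call.name, call.arguments, turn.content))
    elseif turn.role == "assistant" then
      push_assistant(turn.content)
    else
      out[#out + 1] = { role = turn.role, content = turn.content }
    end
  end
  return out
end
```

With a user turn, one tool-calling assistant turn, its tool reply and a
trailing assistant answer, this yields two messages (user, assistant),
which matches the three-message result once the system prompt is
prepended.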
Alternation assert added (N4): role:"tool" turns must trace back
through zero-or-more prior tool turns to an assistant-with-tool_calls.
Catches sub-loop bugs at append time. Orphan tool turns rejected.
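A minimal sketch of the append-time checks from (a) and the alternation
assert. The check_turn name and the error messages are hypothetical; the
rule itself (walk back over tool turns until an assistant turn with
tool_calls is found) follows the description above.

```lua
local function has_tool_calls(turn)
  return type(turn.tool_calls) == "table" and #turn.tool_calls > 0
end

-- Validate `turn` against the already-stored `turns`; raises on violation.
local function check_turn(turns, turn)
  if turn.role == "tool" then
    assert(turn.tool_call_id and turn.content,
      "tool turn needs tool_call_id and content")
    -- Walk back through zero or more tool turns to the originating assistant turn.
    for i = #turns, 1, -1 do
      local prev = turns[i]
      if prev.role == "assistant" and has_tool_calls(prev) then
        return true
      end
      if prev.role ~= "tool" then break end
    end
    error("orphan tool turn: no preceding assistant turn with tool_calls")
  elseif turn.role == "assistant" then
    -- Empty content is allowed only when tool_calls is non-empty.
    local empty = turn.content == nil or turn.content == ""
    assert(not empty or has_tool_calls(turn),
      "assistant turn needs content or tool_calls")
  end
  return true
end
```

Calling this from :append before storing the turn is what would make a
sub-loop bug fail immediately instead of at request time.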
pending_exec_output behavior unchanged per §3 row: buffer persists
across tool-call sub-loops, flushes on next genuine user turn (B4).
Smoke-tested §12 verify-row #3:
(i) default mode round-trip — 5 OpenAI-shape messages, tool_calls
+ tool_call_id preserved.
(ii) fallback mode round-trip — collapsed into 3 messages
(system/user/assistant), tool_calls + role:"tool" not emitted.
(iii) multi-call: 2 tool_calls in one assistant turn followed by 2
tool replies, both modes render correctly.
(iv) orphan tool turn after user — assertion fires.
(v) B4: pending_exec_output survives a tool sub-loop, flushes on
next :append_user.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-test surfaced the bug: with `deep` (mistral-nemo-12b) active,
running `list files` -> y on `CMD: ls` -> `Are there directory entries
beginning with "lor"?` returned a Jinja exception:
api: ... Error: Jinja Exception: After the optional system message,
conversation roles must alternate user/assistant/user/assistant/...
Cause: §6 specified "exec output injected into context uses role 'user'
with a prefix tag '[exec output]'." This works for permissive templates
(qwen2.5-coder-1.5b, the `fast` preset) but produces a back-to-back
user/user pair on strict templates that enforce the OpenAI alternation
contract — `[exec output]` user turn followed by the user's actual
follow-up question.
Fix:
context.lua:
- new field `pending_exec_output` (initially nil)
- new method `:append_exec_output(out)` buffers the output (concatenating
on subsequent captures, so several shell runs followed by an AI turn
still merge into one block)
- new method `:append_user(content)` flushes buffered exec output as
a `[exec output]\n...\n\n` prefix and appends a user turn
- `:reset()` also clears the buffer
repl.lua:
- run_shell calls ctx:append_exec_output(out) instead of
ctx:append({role="user", content="[exec output]\n"..out})
- ask_ai calls ctx:append_user(text) instead of raw :append; saves
prev_pending so a broker error can restore the buffer for retry
PHASE0.md §6:
- amended the role-injection paragraph to describe the buffer-and-
prepend policy; the §3 invariants list is untouched (this was a §6
design detail, not a locked invariant)
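The buffer-and-prepend policy can be sketched as follows. This is a
stripped-down illustration, not the real context.lua: the Context table
here has only the fields needed for the example, successive captures are
joined with plain concatenation (an assumption), and the ask_ai shown
drops the just-appended user turn on broker error, which is also an
assumption about how the retry path works.

```lua
local Context = {}
Context.__index = Context

function Context.new()
  return setmetatable({ turns = {}, pending_exec_output = nil }, Context)
end

-- Buffer shell output instead of emitting a user turn straight away.
function Context:append_exec_output(out)
  if self.pending_exec_output then
    self.pending_exec_output = self.pending_exec_output .. out
  else
    self.pending_exec_output = out
  end
end

-- Flush the buffer as a prefix of the next genuine user turn.
function Context:append_user(content)
  local pending = self.pending_exec_output
  if pending then
    content = "[exec output]\n" .. pending .. "\n\n" .. content
    self.pending_exec_output = nil
  end
  self.turns[#self.turns + 1] = { role = "user", content = content }
end

function Context:reset()
  self.turns = {}
  self.pending_exec_output = nil
end

-- `chat` is a hypothetical broker call that returns the reply text or raises.
local function ask_ai(ctx, text, chat)
  local prev_pending = ctx.pending_exec_output
  ctx:append_user(text)
  local ok, reply = pcall(chat, ctx.turns)
  if not ok then
    table.remove(ctx.turns)                -- drop the failed user turn (assumption)
    ctx.pending_exec_output = prev_pending -- restore the buffer for a retry
    return nil, reply
  end
  ctx.turns[#ctx.turns + 1] = { role = "assistant", content = reply }
  return reply
end
```

Since exec output never becomes its own turn, the stored sequence stays
user/assistant/user/... regardless of how many shell commands ran in
between.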
Verification:
- context unit tests cover: alternation after the failing sequence,
multi-shell merge, reset clears buffer, broker-error retry path
- live reproduction against `deep` (mistral-nemo) of the exact
user-reported sequence succeeds; model responds with a sensible
`CMD: ls | grep '^lor'` instead of a Jinja exception
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 0 implementation per PHASE0.md §6, §8.
Context.new(opts) constructs a context with the §6 default system prompt
(the `CMD: ` extraction contract is hard-coded in there per §3: locked
substrate, do not edit). opts overrides: system_prompt, max_turns
(default 40), token_budget (default 4096; informational only in Phase 0
per Q1, with accurate tokenization deferred to Phase 3).
API:
ctx:append({role, content})   record a turn
ctx:to_messages()             [{system,...}, ...turns] for broker.chat
ctx:enforce_budget()          evict pairs (user+assistant) until
                              #turns <= max_turns; returns count
ctx:estimate_tokens()         char/4 heuristic
ctx:reset()                   drop all turns (system_prompt kept)
System prompt is the §6 phrasing verbatim including the `CMD: ` clause
— stored on the context, NOT in self.turns, so it is prepended freshly
on every to_messages() call.
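A minimal sketch of the Phase 0 API described above. Several details are
assumptions: the placeholder prompt stands in for the §6 text,
enforce_budget returns the number of evicted turns (the source only says
"count"), and estimate_tokens includes the system prompt and rounds down.

```lua
local Context = {}
Context.__index = Context

-- Placeholder only; the real default is the locked prompt from the manifest.
local PLACEHOLDER_PROMPT = "placeholder system prompt"

function Context.new(opts)
  opts = opts or {}
  return setmetatable({
    system_prompt = opts.system_prompt or PLACEHOLDER_PROMPT,
    max_turns = opts.max_turns or 40,
    token_budget = opts.token_budget or 4096,
    turns = {},
  }, Context)
end

function Context:append(turn)
  self.turns[#self.turns + 1] = { role = turn.role, content = turn.content }
end

-- The system prompt is not stored in self.turns; it is prepended on each call.
function Context:to_messages()
  local msgs = { { role = "system", content = self.system_prompt } }
  for _, turn in ipairs(self.turns) do
    msgs[#msgs + 1] = { role = turn.role, content = turn.content }
  end
  return msgs
end

-- Evict the oldest user+assistant pair until the turn count fits.
function Context:enforce_budget()
  local evicted = 0
  while #self.turns > self.max_turns do
    table.remove(self.turns, 1)
    table.remove(self.turns, 1)
    evicted = evicted + 2
  end
  return evicted
end

-- Rough char/4 heuristic; not a real tokenizer.
function Context:estimate_tokens()
  local chars = #self.system_prompt
  for _, turn in ipairs(self.turns) do chars = chars + #turn.content end
  return math.floor(chars / 4)
end

function Context:reset()
  self.turns = {}
end
```

With max_turns = 4 and 14 stored turns, enforce_budget() in this sketch
removes five pairs and leaves the four most recent turns, mirroring the
bulk-eviction smoke case.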
Smoke test covers basic ops, no eviction at exactly max_turns, eviction
on overflow, bulk eviction (14 turns -> 4), and reset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README, .gitignore, CLAUDE.md (project conventions)
- docs/PHASE0.md — full Phase 0 manifest (locked substrate)
- 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented
with module-scoped responsibilities matching the manifest
- config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b
snappy/32k + cloud via OpenRouter through hossenfelder)
File names match docs/PHASE0.md §4 exactly. Module bodies fill in across
later phases; the tree shape is locked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>