marfrit/aish - aish - marfrit's space

Author	SHA1	Message	Date
marfrit	7e9cfff04d	repl: tool-call sub-loop + :mcp meta + system-prompt augmentation Phase 2 commit #6 per docs/PHASE2.md §12. End-to-end wiring of the MCP tool-call flow on top of broker/safety/context/renderer/mcp. repl.lua additions: - mcp_sessions table populated from config.mcp.servers at startup. connect_mcp() helper does initialize + caches tools/list. Failures status-logged once; absent from mcp_sessions until manual reconnect (C4: no auto-retry). - tools_schema() flattens connected sessions' tools into the OpenAI {type:"function", function:{name,description,parameters}} shape with "<alias>.<name>" namespacing. - flatten_content() concatenates content[type="text"] blocks; one-shot status warning when non-text blocks (image/resource) are dropped (§4 normative spec, v1 only handles text). - dispatch_tool_call(name, args_table) splits alias.tool, looks up session, calls. Returns (content_string, is_error). Errors of every flavor (missing alias, no session, rpc_error, transport_error) yield a synthesized "[aish] ..." string so callers always have a body for the role:"tool" turn; alternation preserved per C5/C7. - ask_ai rewritten as a sub-loop that re-issues the broker request until the model returns pure text or max_tool_depth (default 8) is hit. Each iteration: stream response → if tool_calls present, confirm-gate each → dispatch → append role:"tool" turn → continue. Argument-JSON parse failure produces a synthesized tool turn (C7). Decline at confirm produces "[aish] tool call declined by user" tool turn (alternation guarantee). - :mcp meta with sub-commands: list / tools / tool <a.n> / connect <url> [alias] / disconnect <alias>. HELP block extended. context.lua: DEFAULT_SYSTEM_PROMPT grows by ~4 lines per PHASE2.md §8 (hybrid prompt: static frame about MCP + dynamic tools list in the request body). Block is always present even when no MCP servers configured; ~60 tokens for clarity that 'CMD:' remains the fallback. CMD: extraction unchanged; it runs on the FINAL pure-text response only (not on intermediate iterations of the tool sub-loop). Substrate §3 invariant preserved. End-to-end verified two ways: (1) Direct broker probe: aish's tools_schema fed through broker.chat_stream against hossenfelder → qwen-1.5b emits one tool_call payload with correct id + name="boltzmann.list_dir" + args='{"path":"/tmp"}'. Accumulator stitched the JSON-string across fragmented deltas. (2) Mocked-broker sub-loop test: ask_ai feeds 'list /tmp', mock emits text + tool_call, sub-loop dispatches against LIVE boltzmann lmcp (auto_approve via policy), 80+ files rendered inside the tool_call frame, broker re-invoked with the extended context, mock returns pure text, sub-loop terminates. Total broker invocations: 2. Known: the loaded fast model (qwen-1.5b) tends to emit "CMD: ..." suggestions even when an MCP tool is the better path; the small model's system-prompt compliance is weak. Larger models and the analyze-time direct probe confirm the tools_schema and tool_calls flow is wire-correct; Phase 7 verify will exercise this against qwen3-30b or cloud models when available. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 15:20:42 +00:00
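The namespacing and error-synthesis behavior described for tools_schema() and dispatch_tool_call() in commit 7e9cfff04d can be illustrated with a minimal sketch. This is not the code from repl.lua. The session shape (a `tools` array plus a `call` function), the explicit `sessions` parameter, the mapping of `inputSchema` to `parameters`, and the exact wording of the "[aish] ..." messages are all assumptions made for the example.

```lua
-- Hypothetical sketch, not the project's repl.lua.
-- Assumed session shape: { tools = { {name, description, inputSchema}, ... },
--                          call  = function(tool_name, args) ... end }

-- Flatten every connected session's tools into OpenAI function-tool entries,
-- namespaced as "<alias>.<name>".
local function tools_schema(sessions)
  local aliases = {}
  for alias in pairs(sessions) do aliases[#aliases + 1] = alias end
  table.sort(aliases) -- pairs() order is unspecified; sort for stable output

  local out = {}
  for _, alias in ipairs(aliases) do
    for _, tool in ipairs(sessions[alias].tools) do
      out[#out + 1] = {
        type = "function",
        ["function"] = { -- "function" is a reserved word, so bracket syntax
          name = alias .. "." .. tool.name,
          description = tool.description,
          parameters = tool.inputSchema,
        },
      }
    end
  end
  return out
end

-- Split "alias.tool", look up the session, call it.
-- Always returns (content_string, is_error) so the caller can append a
-- role:"tool" turn even on failure. Message texts are illustrative only.
local function dispatch_tool_call(sessions, name, args)
  local alias, tool = tostring(name):match("^([^.]+)%.(.+)$")
  if not alias then
    return "[aish] malformed tool name: " .. tostring(name), true
  end
  local session = sessions[alias]
  if not session then
    return "[aish] no MCP session for alias: " .. alias, true
  end
  local ok, result = pcall(session.call, tool, args)
  if not ok then
    return "[aish] tool call failed: " .. tostring(result), true
  end
  return result, false
end

-- Usage with a stub session standing in for a live MCP server.
local sessions = {
  boltzmann = {
    tools = { { name = "list_dir", description = "List a directory",
                inputSchema = { type = "object" } } },
    call = function(tool, args) return tool .. ":" .. args.path end,
  },
}
local schema = tools_schema(sessions)
local body, is_error = dispatch_tool_call(sessions, "boltzmann.list_dir", { path = "/tmp" })
```

Returning a string body on every path, including failures, is what keeps the assistant/tool alternation intact when a call goes wrong.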
marfrit	7c221a8aae	context: tool turns + tool_calls on assistant; use_tool_role fallback Phase 2 commit #3 per docs/PHASE2.md §12. Three concrete edits per §3 context.lua row (the BLOCKER-fold-in from review): (a) Loosen Context:append shape-per-role: assistant may carry empty content if tool_calls is non-empty; role:"tool" requires tool_call_id + content. (b) Preserve tool_calls / tool_call_id on store (Phase 1 :append built {role, content} only and silently dropped extras). (c) Extend to_messages() with two emission modes selected by use_tool_role: true (default) — OpenAI-standard role:"tool" + assistant turns with tool_calls (wrapped as {id, type:"function", function:{name, arguments}}). false (fallback) — collapse assistant-with-tool_calls + its following role:"tool" turns into a single assistant text turn with synthesized "[tool: name]\n<args>\n[result]\n <content>" body; merge consecutive assistant turns so the trailing post-tool-result text doesn't yield asst/asst back-to-back (same strict-template gotcha PHASE0.md §6 warned about for user/user). Alternation assert added (N4): role:"tool" turns must trace back through zero-or-more prior tool turns to an assistant-with-tool_calls. Catches sub-loop bugs at append time. Orphan tool turns rejected. pending_exec_output behavior unchanged per §3 row: buffer persists across tool-call sub-loops, flushes on next genuine user turn (B4). Smoke-tested §12 verify-row #3: (i) default mode round-trip — 5 OpenAI-shape messages, tool_calls + tool_call_id preserved. (ii) fallback mode round-trip — collapsed into 3 messages (system/user/assistant), tool_calls + role:"tool" not emitted. (iii) multi-call: 2 tool_calls in one assistant turn followed by 2 tool replies, both modes render correctly. (iv) orphan tool turn after user — assertion fires. (v) B4: pending_exec_output survives a tool sub-loop, flushes on next :append_user. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 13:10:47 +00:00
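The append-time checks described in commit 7c221a8aae (shape-per-role plus the N4 alternation assert) might look roughly like the following. This is a sketch under assumptions, not the code in context.lua: the free-function form, the error messages, and the field handling are hypothetical.

```lua
-- Hypothetical sketch of append-time validation; not the project's context.lua.

-- A role:"tool" turn must trace back, through zero or more earlier tool
-- turns, to an assistant turn that carries a non-empty tool_calls list.
local function check_tool_turn(turns, turn)
  assert(turn.tool_call_id and turn.content,
         "tool turn requires tool_call_id + content")
  local i = #turns
  while i >= 1 and turns[i].role == "tool" do
    i = i - 1
  end
  local anchor = turns[i] -- nil when i == 0
  assert(anchor and anchor.role == "assistant"
         and anchor.tool_calls and #anchor.tool_calls > 0,
         "orphan tool turn")
end

local function append(turns, turn)
  if turn.role == "tool" then
    check_tool_turn(turns, turn)
  elseif turn.role == "assistant" then
    -- Empty content is allowed only when tool_calls is non-empty.
    local has_calls = turn.tool_calls ~= nil and #turn.tool_calls > 0
    assert(has_calls or (turn.content and turn.content ~= ""),
           "assistant turn needs content or tool_calls")
  else
    assert(turn.content, "turn needs content")
  end
  -- Preserve tool_calls / tool_call_id instead of storing {role, content} only.
  turns[#turns + 1] = {
    role = turn.role,
    content = turn.content,
    tool_calls = turn.tool_calls,
    tool_call_id = turn.tool_call_id,
  }
end

-- Usage: two tool_calls in one assistant turn, followed by two tool replies.
local turns = {}
append(turns, { role = "user", content = "list /tmp" })
append(turns, { role = "assistant", content = "",
                tool_calls = { { id = "c1" }, { id = "c2" } } })
append(turns, { role = "tool", tool_call_id = "c1", content = "a.txt" })
append(turns, { role = "tool", tool_call_id = "c2", content = "b.txt" })

-- An orphan tool turn directly after a user turn is rejected.
local orphan_ok = pcall(append, { { role = "user", content = "hi" } },
                        { role = "tool", tool_call_id = "x", content = "y" })
```

Validating at append time, rather than when messages are emitted, surfaces a sub-loop bug at the point where the bad turn is created.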
marfrit	16490e6905	fix: buffer exec output for next user turn; alternation for strict templates User-test surfaced the bug: with `deep` (mistral-nemo-12b) active, running `list files` -> y on `CMD: ls` -> `Are there directory entries beginning with "lor"?` returned a Jinja exception: api: ... Error: Jinja Exception: After the optional system message, conversation roles must alternate user/assistant/user/assistant/... Cause: §6 specified "exec output injected into context uses role 'user' with a prefix tag '[exec output]'." This works for permissive templates (qwen2.5-coder-1.5b, the `fast` preset) but produces a back-to-back user/user pair on strict templates that enforce the OpenAI alternation contract: an `[exec output]` user turn followed by the user's actual follow-up question. Fix: context.lua: - new field `pending_exec_output` (initially nil) - new method `:append_exec_output(out)` buffers (concat on subsequent captures so multi-shell-then-ai still merges everything) - new method `:append_user(content)` flushes buffered exec output as a `[exec output]\n...\n\n` prefix and appends a user turn - `:reset()` also clears the buffer repl.lua: - run_shell calls ctx:append_exec_output(out) instead of ctx:append({role="user", content="[exec output]\n"..out}) - ask_ai calls ctx:append_user(text) instead of raw :append; saves prev_pending so a broker error can restore the buffer for retry PHASE0.md §6: - amended the role-injection paragraph to describe the buffer-and-prepend policy; the §3 invariants list is untouched (this was a §6 design detail, not a locked invariant) Verification: - context unit tests cover: alternation after the failing sequence, multi-shell merge, reset clears buffer, broker-error retry path - live reproduction against `deep` (mistral-nemo) of the exact user-reported sequence succeeds; model responds with a sensible `CMD: ls | grep '^lor'` instead of a Jinja exception Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:41:21 +00:00
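The buffer-and-prepend policy from commit 16490e6905 can be sketched as below. The method and field names follow the commit message. The constructor, the plain `:append` helper, and the newline used when concatenating multiple captures are assumptions, not the project's actual context.lua.

```lua
-- Hypothetical sketch of buffer-and-prepend; not the project's context.lua.
local Context = {}
Context.__index = Context

function Context.new()
  return setmetatable({ turns = {}, pending_exec_output = nil }, Context)
end

function Context:append(turn)
  self.turns[#self.turns + 1] = { role = turn.role, content = turn.content }
end

-- Buffer shell output instead of emitting a user turn right away.
-- Later captures are concatenated (the "\n" separator is an assumption).
function Context:append_exec_output(out)
  if self.pending_exec_output then
    self.pending_exec_output = self.pending_exec_output .. "\n" .. out
  else
    self.pending_exec_output = out
  end
end

-- Flush the buffer as a prefix of the next genuine user turn, so strict
-- chat templates never see two user turns back to back.
function Context:append_user(content)
  if self.pending_exec_output then
    content = "[exec output]\n" .. self.pending_exec_output .. "\n\n" .. content
    self.pending_exec_output = nil
  end
  self:append({ role = "user", content = content })
end

function Context:reset()
  self.turns = {}
  self.pending_exec_output = nil
end

-- Usage: the sequence that previously triggered the Jinja exception.
local ctx = Context.new()
ctx:append_user("list files")
ctx:append({ role = "assistant", content = "CMD: ls" })
ctx:append_exec_output("lorem.txt\nnotes.md")
ctx:append_user('Are there directory entries beginning with "lor"?')
```

After this sequence the turn list is user, assistant, user, with the shell output folded into the last user turn.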
marfrit	10848645af	context: in-memory turn list + max_turns sliding-window eviction Phase 0 implementation per PHASE0.md §6, §8. Context.new(opts) constructs with the §6 default system prompt (the `CMD: ` extraction contract is hard-coded in there per §3: locked substrate, do not edit). opts overrides: system_prompt, max_turns (default 40), token_budget (default 4096; visibility only in Phase 0 per Q1, deferred to Phase 3 for accurate tokenization). API: ctx:append({role, content}): record a turn; ctx:to_messages(): [{system,...}, ...turns] for broker.chat; ctx:enforce_budget(): evict pairs (user+assistant) until #turns <= max_turns, returns count; ctx:estimate_tokens(): char/4 heuristic; ctx:reset(): drop all turns (system_prompt kept). System prompt is the §6 phrasing verbatim including the `CMD: ` clause; it is stored on the context, NOT in self.turns, so it is prepended freshly on every to_messages() call. Smoke covers basic ops, no-evict-at-max, evict-on-overflow, bulk eviction (14 turns -> 4), reset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 11:59:25 +00:00
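The Phase 0 API listed in commit 10848645af can be sketched as follows. This is a minimal illustration, not the project's context.lua. The placeholder system prompt stands in for the §6 text, which is not reproduced here. Whether enforce_budget() counts evicted turns or evicted pairs is not stated in the commit, and this sketch assumes turns.

```lua
-- Hypothetical sketch of the Phase 0 context; not the project's context.lua.
local Context = {}
Context.__index = Context

function Context.new(opts)
  opts = opts or {}
  return setmetatable({
    -- Placeholder only; the real default is the locked §6 prompt.
    system_prompt = opts.system_prompt or "placeholder system prompt",
    max_turns = opts.max_turns or 40,
    turns = {},
  }, Context)
end

function Context:append(turn)
  self.turns[#self.turns + 1] = { role = turn.role, content = turn.content }
end

-- The system prompt lives on the context, not in self.turns, and is
-- prepended freshly on every call.
function Context:to_messages()
  local msgs = { { role = "system", content = self.system_prompt } }
  for _, t in ipairs(self.turns) do
    msgs[#msgs + 1] = t
  end
  return msgs
end

-- Evict the oldest user+assistant pair until #turns <= max_turns.
-- Returns the number of turns removed (assumption: turns, not pairs).
function Context:enforce_budget()
  local evicted = 0
  while #self.turns > self.max_turns do
    for _ = 1, 2 do
      if #self.turns > 0 then
        table.remove(self.turns, 1)
        evicted = evicted + 1
      end
    end
  end
  return evicted
end

-- Rough size estimate: total characters divided by four.
function Context:estimate_tokens()
  local chars = #self.system_prompt
  for _, t in ipairs(self.turns) do
    chars = chars + #t.content
  end
  return math.floor(chars / 4)
end

function Context:reset()
  self.turns = {} -- system_prompt is kept
end

-- Usage: the bulk-eviction case from the smoke test (14 turns -> 4).
local ctx = Context.new({ max_turns = 4 })
for i = 1, 7 do
  ctx:append({ role = "user", content = "u" .. i })
  ctx:append({ role = "assistant", content = "a" .. i })
end
local evicted = ctx:enforce_budget()
```

Evicting in pairs keeps the remaining list starting on a user turn, which matters for chat templates that enforce user/assistant alternation.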
claude-noether	4310207738	Phase 0: scaffold tree + manifest - README, .gitignore, CLAUDE.md (project conventions) - docs/PHASE0.md — full Phase 0 manifest (locked substrate) - 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented with module-scoped responsibilities matching the manifest - config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b snappy/32k + cloud via OpenRouter through hossenfelder) File names match docs/PHASE0.md §4 exactly. Module bodies fill in across later phases; the tree shape is locked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:16:07 +00:00

5 Commits