6 Commits

Author SHA1 Message Date
marfrit f26cbd9a3a phase2 amend: __ separator (Bedrock-safe) + post_sse error diagnostics
Phase 7 verify finding from TC #26 against :model cloud:
  HTTP 400 from openrouter→Amazon Bedrock:
  "tools.0.custom.name: String should match pattern
   '^[a-zA-Z0-9_-]{1,128}$'"

Anthropic via Bedrock validates tool names against that regex and
rejects dots. PHASE2 originally chose "." as the namespace separator
("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not.

Separator switched to "__" (two underscores) everywhere — internal
API matches on-wire shape, no transformation layer:

  - repl.lua:
    - tools_schema builds "alias__name"
    - dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __)
    - :mcp tool parser uses same split
    - :mcp tools formatter prints "alias__name"
    - HELP block shows <alias__name>
  - safety.lua confirm_tool_call: alias.* glob → alias__* glob
  - config.lua example block: keys rewritten
  - docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua
    row, §5 wire-shape JSON examples, §6 auto_approve schema, §7
    meta-cmd table, §12 plan all updated. Original "." references
    preserved in commit history.

Constraint: aliases must not themselves contain "__" so the parse
stays unambiguous. Tool names from MCP servers may have underscores
freely.

Second fix bundled — uninformative broker error:
  Previously "broker error: transport: HTTP response code said error"
  Now      "broker error: transport: HTTP 400: {full body snippet}"

ffi/curl.lua M.post_sse changes:
  - FAILONERROR no longer set (was hiding the response body).
  - raw_body accumulator added alongside the SSE buffer; captures
    every byte regardless of SSE shape.
  - After perform, check status_code via curl_easy_getinfo. On >=400,
    return (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged.
  - End-of-stream SSE flush only runs on 2xx (no false event on
    error bodies that aren't SSE-shaped).
  - Phase 1 callers reading just first return slot stay correct.

End-to-end verified:
  - :model cloud + tools=[boltzmann__read_file ...] +
    "Use boltzmann__read_file with path=/etc/hostname" →
    Claude emits tool_call with name="boltzmann__read_file",
    args='{"path": "/etc/hostname"}'. ok=true, transport clean.
  - Force-bad tool name "bad.name.with.dots" → err string carries
    the full bedrock 400 with the regex-pattern message visible.

TC #26 (sub-loop end-to-end) is now testable against cloud — the
error that blocked it is resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:04:57 +00:00
marfrit f5daa6afc0 docs/PHASE2: re-review NITs — M.post shape, getinfo cdef, content flattening normative
Three follow-up NITs from the post-fold-in review:

  (1) Disambiguate M.post return shape: (body, status_code) on transport
      success regardless of status; (nil, errmsg) on libcurl failure
      stays unchanged. Phase 1 callers reading only the first slot are
      unaffected.

  (2) Note that the M.post extension requires extending ffi.cdef to
      include curl_easy_getinfo + CURLINFO_RESPONSE_CODE (decimal
      2097154, CURLINFOTYPE_LONG | 2) and a long[1] out-param shim.
      Implementation detail the commit #1 author will need.

  (3) Move the tool-result content-flattening rule from §12 risk note
      into §4 normative spec (forward-referenced both ways) — §4 is
      where a future reader looking for the tool-invocation contract
      will scan.

No design changes; clarifications only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:02:35 +00:00
marfrit d3570ccea4 docs/PHASE2: review fold-in — 5 BLOCKERs + 7 CONCERNs + key NITs
Independent review of the formulate+analyze+plan draft surfaced design
gaps that would have shipped as silent bugs. Resolutions applied:

BLOCKERs:
  B1 context.lua impact widened — Phase 1 :append asserts content and
     discards extra fields. Need (a) shape-per-role assert, (b) preserve
     tool_calls/tool_call_id on store, (c) emit from to_messages().
  B2 ffi/curl.M.post extended to return (body, status_code). lmcp's
     401 returns a non-JSON-RPC body that would have been mis-decoded.
  B3 §3 typo schema -> inputSchema.
  B4 pending_exec_output × tool-call sub-loop interaction specified.
  B5 §3/§12 broker dependency contradiction — broker takes opts.tools
     from caller; no layering inversion.

CONCERNs:
  C1 M.chat return polymorphism dropped (no consumer).
  C2 tool_calls[].index absent fallback: default to 0.
  C3 Re-injection stores accumulated text, not hard-coded empty.
  C4 :mcp connect failure: no auto-retry, status-log once.
  C5/C7 JSON-RPC error AND argument-parse failure both synthesize a
     role:"tool" turn — keeps strict-template alternation legal
     exactly the way PHASE0 §6 demanded for exec output.
  C6 §9 confirms §4 amendment is additive (preserves §3 invariant).

NITs:
  N3 protocolVersion fallback (lmcp doesn't negotiate).
  N4 Alternation assert in Context:append.
  N7 Model-routing bug filed as aish#23.
  N8 Day-one fallback test for use_tool_role=false in commit #3.

Manifest status: Plan (review folded). Status line and Resolutions
sections updated; commit-by-commit roadmap reflects revised specs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:00:07 +00:00
marfrit 447e430254 docs/PHASE2 §12: implementation plan — 7-commit roadmap
Bottom-up: mcp.lua → safety.lua → context.lua → renderer.lua → broker.lua
→ repl.lua → config.lua. Same cadence as Phase 0/1.

Risks called out explicitly:
- Empty tools array → omit field entirely (some servers reject [])
- isError:false on actual failure (baseline §3 finding) → pass content
  through regardless; let model read error text
- JSON-RPC error from tools/call → aish status only, no tool turn
  appended, no model recovery
- max_tool_depth=8 cap on tool-call sub-loop
- Argument JSON streaming may yield malformed JSON → status warn + skip
- Q18 fallback (use_tool_role=true default; prefix-injection plumbed
  but dead-coded; verify can flip)
- Connect-at-startup is sequential (~30ms × N); fine for N≤3

Two items left open for review: Q18 default flip vs ship-true-flip-on-fail,
and whether :mcp connect should re-fetch tools after the initial cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 12:37:27 +00:00
marfrit 5878f7347b docs/PHASE2: analyze — lmcp v0.5.4 probed, transport simplified
Live-probed against lmcp v0.5.4 (boltzmann) + hossenfelder broker proxy:

Transport simpler than spec:
- lmcp only implements POST-per-RPC with Connection: close; no held-open
  SSE channel. Combined with capabilities.tools.listChanged=false, no
  client-side listener is needed in v1. Drops the planned M.get_sse
  addition to ffi/curl.lua — Phase 1's M.post covers MCP.

Bearer auth is universal across the fleet — config schema grew
auth_token (literal) and auth_env (env-var indirection) fields per
server, mirroring PHASE0 §10's key_env convention.

Streaming tool_calls delta shape verified — accumulator by `index`,
function.arguments arrives as chunked JSON-string. Matches the
formulate-phase assumption in §5.

Resolutions:
  Q17 transport abstraction — POST-only, no SSE channel for lmcp.
  Q21 error mapping       — result.isError (model-recoverable, feed
                             back as tool turn) vs JSON-RPC error
                             (unknown method/tool, transport-level).
  Q18 role:"tool" turn    — accepted at protocol level (live-probed).
                             Mistral-nemo template verification
                             blocked by the hossenfelder model-field
                             routing bug; full closure carried to
                             Phase 7 verify.

Open-end recorded in §11: the hossenfelder proxy routes every request
to the loaded fast model regardless of model field, blocking Phase 2
testing against mistral-nemo specifically. Parallel to the SSE
buffering issue at marfrit/aish#15; same root (boltzmann proxy code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 09:51:03 +00:00
marfrit ec6793c93c docs/PHASE2: formulate — MCP client + tool-calling bridge
Phase 2 formulate manifest. Three pillars per PHASE0.md §11 row 2:
mcp.lua (JSON-RPC 2.0 over HTTP+SSE, target: lmcp), tool-calling bridge
(OpenAI tools field <-> MCP tools/call), and the safety.lua
authorization gate (per-call confirm + auto_approve policy).

Resolves PHASE0.md §13 Q6–Q10:
  Q6  CMD: + tool-calls coexist; substrate §3 unchanged
  Q7  config-declared servers + runtime :mcp connect
  Q8  per-call confirm default, auto_approve policy in config
  Q9  hybrid system prompt: static frame + dynamic tools body field
  Q10 streaming-from-day-one on Phase 1 SSE; on_delta widens to (kind, payload)

New questions tracked in §11 (Q17–Q22): transport abstraction, role:tool
vs prefix injection (mistral-nemo template verification needed), large
tool-result handling, parallel dispatch, error mapping, aish-as-MCP-server
(parked).

§4 module layout amended: mcp.lua slots between broker.lua and router.lua.
The amendment is documented in this manifest; the actual §4 table edit
lands when implementation starts (Phase 2 implement phase).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 09:23:53 +00:00