phase2 amend: __ separator (Bedrock-safe) + post_sse error diagnostics
Phase 7 verify finding from TC #26 against :model cloud: HTTP 400 from
openrouter→Amazon Bedrock:

  "tools.0.custom.name: String should match pattern '^[a-zA-Z0-9_-]{1,128}$'"

Anthropic via Bedrock validates tool names against that regex and rejects
dots. PHASE2 originally chose "." as the namespace separator
("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not.

Separator switched to "__" (two underscores) everywhere. The internal API
matches the on-wire shape, so there is no transformation layer:

- repl.lua:
  - tools_schema builds "alias__name"
  - dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __)
  - :mcp tool parser uses same split
  - :mcp tools formatter prints "alias__name"
  - HELP block shows <alias__name>
- safety.lua confirm_tool_call: alias.* glob → alias__* glob
- config.lua example block: keys rewritten
- docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua row,
  §5 wire-shape JSON examples, §6 auto_approve schema, §7 meta-cmd table,
  §12 plan all updated. Original "." references preserved in commit history.

Constraint: aliases must not themselves contain "__" so the parse stays
unambiguous. Tool names from MCP servers may have underscores freely.

Second fix bundled: uninformative broker error.

  Previously: "broker error: transport: HTTP response code said error"
  Now:        "broker error: transport: HTTP 400: {full body snippet}"

ffi/curl.lua M.post_sse changes:

- FAILONERROR no longer set (was hiding the response body).
- raw_body accumulator added alongside the SSE buffer; captures every byte
  regardless of SSE shape.
- After perform, check status_code via curl_easy_getinfo. On >=400, return
  (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged.
- End-of-stream SSE flush only runs on 2xx (no false event on error bodies
  that aren't SSE-shaped).
- Phase 1 callers reading just first return slot stay correct.

End-to-end verified:

- :model cloud + tools=[boltzmann__read_file ...] + "Use
  boltzmann__read_file with path=/etc/hostname" → Claude emits tool_call
  with name="boltzmann__read_file", args='{"path": "/etc/hostname"}'.
  ok=true, transport clean.
- Force-bad tool name "bad.name.with.dots" → err string carries the full
  Bedrock 400 with the regex-pattern message visible.

TC #26 (sub-loop end-to-end) is now testable against cloud; the error that
blocked it is resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
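
The post_sse error shaping described above can be sketched in plain Lua. This is a minimal illustration, not the code in ffi/curl.lua: `shape_result` and `SNIPPET_MAX` are hypothetical names, and the status code is assumed to have already been read via curl_easy_getinfo.

```lua
-- Hypothetical helper: shapes the return values after the transfer finishes.
-- status_code is assumed to come from curl_easy_getinfo(CURLINFO_RESPONSE_CODE);
-- raw_body is every byte the write callback saw, SSE-shaped or not.
local SNIPPET_MAX = 400

local function shape_result(status_code, raw_body)
  if status_code >= 400 then
    -- Surface the status and the first 400 bytes of the body to the caller.
    return nil, ("HTTP %d: %s"):format(status_code, raw_body:sub(1, SNIPPET_MAX))
  end
  -- Success path: callers that read only the first slot see a truthy value.
  return true
end
```

Because the failure case still returns `nil` in the first slot, a caller that ignores the second slot behaves as before; only callers that print the error string see the richer message.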
+31
-18
@@ -2,9 +2,22 @@
 
 **Project:** aish — AI-augmented conversational shell
 **Document:** Phase 2 Requirements, Architecture & Design Decisions
-**Status:** Plan (review pass folded in 2026-05-12)
+**Status:** Verify (Phase 7) — implementation complete; live testing in progress
 **Date:** 2026-05-12
 
+**Amendments since formulate:**
+- 2026-05-12 (review fold-in): see §12 "Review fold-in" subsection.
+- 2026-05-12 (Phase 7 verify, separator switch): tool-name namespace
+  delimiter changed from `.` to `__` because Anthropic via Bedrock
+  validates tool names against `^[a-zA-Z0-9_-]{1,128}$` — dots are
+  rejected with `HTTP 400 tools.0.custom.name: String should match
+  pattern '...'`. Discovered when `:model cloud` exercised TC #26
+  against the real cloud path. Internal API matches on-wire shape so
+  there's no transformation layer. Constraint: aliases must not
+  themselves contain `__` so the parse stays unambiguous (leftmost
+  `__` is the split point). Tool names from MCP servers may contain
+  underscores freely. All §3/§5/§6/§7/§12 references updated below.
+
 PHASE0.md is the locked substrate; PHASE1.md is layered on top. This
 manifest specifies what Phase 2 adds. Section numbers reference back to
 PHASE0.md / PHASE1.md where relevant.
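
The amendment's naming constraint can be checked mechanically. The sketch below is an illustration only; `is_valid_wire_name`, `is_valid_alias`, and `wire_name` are hypothetical helper names. The trailing-underscore check is an extra precaution that the amendment does not state: an alias such as `a_` contains no `__`, yet `a_` joined with `__x` gives `a___x`, which a leftmost-`__` split would read as alias `a`.

```lua
-- Lua-pattern equivalent of Bedrock's ^[a-zA-Z0-9_-]{1,128}$.
-- Lua patterns have no {m,n} quantifier, so the length is checked separately.
local function is_valid_wire_name(name)
  return #name >= 1 and #name <= 128
    and name:match("^[A-Za-z0-9_%-]+$") ~= nil
end

-- Aliases must not contain "__" (plain find, no pattern magic).
-- Assumption beyond the source: also reject a trailing "_".
local function is_valid_alias(alias)
  return is_valid_wire_name(alias)
    and not alias:find("__", 1, true)
    and alias:sub(-1) ~= "_"
end

-- Build the on-wire tool name; tool names may contain underscores freely.
local function wire_name(alias, tool)
  return alias .. "__" .. tool
end
```

Validating aliases at connect or config-load time, rather than at dispatch, would keep a bad alias from ever reaching the request body.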
@@ -18,7 +31,7 @@ Three pillars per PHASE0.md §11 row 2:
 1. **MCP client** (`mcp.lua`) — JSON-RPC 2.0 over HTTP+SSE transport.
    Target reference implementation: `lmcp`. Operations needed for v1:
    `initialize`, `tools/list`, `tools/call`. Multiple servers may be
-   connected concurrently; tools are namespaced `<server>.<tool>`.
+   connected concurrently; tools are namespaced `<server>__<tool>`.
 2. **Tool-calling protocol bridge** — the broker sends OpenAI-compatible
    `tools` in the request body; the model emits `tool_calls` in the
    response; `mcp.lua` dispatches each call to the right server; the
@@ -53,7 +66,7 @@ Three pillars per PHASE0.md §11 row 2:
 | MCP protocol version | `2025-03-26` (confirmed by live probe of boltzmann:8080/mcp) | lmcp pins this in `MCP_VERSION` and **does not negotiate** — it returns its compiled-in version regardless of what the client sends (lmcp.lua:80-91). aish sends `2025-03-26` in `initialize` and accepts whatever the server returns; on mismatch it logs `[aish] mcp <alias>: protocol version mismatch (sent X, got Y); proceeding` and continues. v1 has no version-gated behavior to abort on. |
 | MCP auth | Bearer token via `Authorization: Bearer <token>` header, per-server | Analyze finding: every lmcp deployment in mfritsche's fleet (boltzmann/hertz/pve*/nc/etc.) requires Bearer auth. Phase 2 config supports `auth_token` literal and `auth_env` env-var indirection per server (mirrors `key_env` in the models registry). lmcp servers without auth (broglie/higgs LAN-only) just leave the field nil. |
 | Tool-call wire format | OpenAI `tools` field on `/v1/chat/completions` body; `tool_calls` on assistant deltas; `role:"tool"` turn with `tool_call_id` for results | Standard, supported by llama.cpp and OpenRouter. Aligns with the existing `/v1/chat/completions` substrate invariant. |
-| Tool namespacing | `<server-alias>.<tool-name>` for both the wire-level tool name and `:mcp tools` listing | Avoids name collisions across servers. The alias comes from the config key or the connect URL hash. |
+| Tool namespacing | `<server-alias>__<tool-name>` for both the wire-level tool name and `:mcp tools` listing (was `.` at formulate; switched 2026-05-12 — see Amendments above) | Avoids name collisions across servers. The alias comes from the config key or the connect URL hash. `__` (two underscores) is within Bedrock's tool-name regex `^[a-zA-Z0-9_-]{1,128}$` whereas `.` is not. Aliases must not themselves contain `__`. |
 | `CMD:` coexistence with tool-calls | Both stay live, no policy preference. Substrate invariant §3 unchanged. | Resolves Q6 (see §10). `CMD:` is the local-shell route; MCP tools are structured-API routes; they serve different purposes. Future phases (Norris, Phase 3) may prefer tools when both are available, but Phase 2 doesn't enforce. |
 | Authorization default | Per-call confirm (mirrors PHASE0.md §10 `confirm_cmd` for shell) | Conservative default; user can opt into auto-approval per tool or per server via config. Resolves Q8. |
 | System prompt augmentation | Hybrid: static frame in `broker.lua` system prompt + dynamic `tools` array in the request body | Tool list goes in the API field where it belongs; the system prompt only mentions that tools exist and how to use them. Per-request body cost is bounded (tools change rarely; small schemas). Resolves Q9. |
@@ -73,7 +86,7 @@ Three pillars per PHASE0.md §11 row 2:
 | `context.lua` | turns = {{role, content}, ...} + `pending_exec_output`; `Context:append` asserts `turn.content` and rebuilds the entry as `{role, content}` only — extra fields are dropped | Three concrete edits: (a) **loosen `:append`** so `role == "assistant"` can carry `tool_calls = [{id, name, arguments}]` with `content` allowed empty, and `role == "tool"` requires `tool_call_id` + `content` (the assert moves from "content required" to "shape per role"); (b) **preserve `tool_calls` and `tool_call_id`** in the stored turn (not just role+content); (c) `to_messages()` emits `tool_calls` on assistant turns and `tool_call_id` on tool turns. Add a debug assertion that `role == "tool"` follows an assistant turn with non-empty `tool_calls` (catches design bugs early; N4 in review). **`pending_exec_output` interaction**: the buffer **persists across the tool-call sub-loop** (the loop is internal — no user input happens — so there's no append_user to flush against). It flushes on the next genuine user turn, regardless of how many tool-call iterations preceded. |
 | `repl.lua` | meta cmds + ask_ai stream loop | After ask_ai sees `tool_calls`, enter a tool-execution sub-loop: confirm-gate each call via `safety.confirm_tool_call`, dispatch via `mcp.session:call_tool`, append tool turn to context, re-issue the broker request. Loop until assistant emits text without tool_calls. New meta: `:mcp connect <url> [alias]`, `:mcp list`, `:mcp tools`, `:mcp disconnect <alias>`. |
 | `renderer.lua` | streaming text + exec frame | Add `tool_call_begin(name, args)`, `tool_call_end(result, ok)`. Visual style: indented, dim, parallel to the exec frame. |
-| `config.lua` | example with models/shell/context/history | Schema additions: `mcp = { servers = { alias = { url = "..." } }, auto_approve = { ["alias.tool"] = true } }`. Documented in §10 below. |
+| `config.lua` | example with models/shell/context/history | Schema additions: `mcp = { servers = { alias = { url = "..." } }, auto_approve = { ["alias__tool"] = true } }`. Documented in §6 below. |
 | `ffi/curl.lua` | post + post_sse; `M.post` does not set `FAILONERROR`, so non-2xx responses return the body as a normal string. `ffi.cdef` exposes only `curl_easy_setopt` — no `curl_easy_getinfo` (cdef block at curl.lua:11-28). | **One small extension**: `M.post` returns **`(body, status_code)` on transport success** (status_code may be non-2xx — caller decides what to do; mcp.lua treats `>= 400` as transport failure). `(nil, errmsg)` on libcurl-level failure is **unchanged** — Phase 1 callers that read only the first slot stay correct. Requires adding `curl_easy_getinfo` + `CURLINFO_RESPONSE_CODE` (decimal 2097154, `CURLINFOTYPE_LONG | 2`) to the `ffi.cdef` block, plus a `long[1]` out-param shim. MCP auth failures from lmcp arrive as HTTP `401` with a non-JSON-RPC body (`{"error":"unauthorized"}`); `mcp.lua` must distinguish HTTP-level failure from JSON-RPC envelope errors. No SSE GET channel is added (analyze finding ruled it out for lmcp). |
 | `history.lua` | JSONL session log | Tool turns are logged like any other turn — `{role:"tool", tool_call_id:"...", content:"..."}`. Resume reconstructs them via `ctx:append` like user/assistant turns. |
 
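
The `context.lua` row moves the assertion from "content required" to "shape per role". A minimal sketch of such a check follows; `shape_turn` is a hypothetical stand-in for the relevant part of `Context:append`, and the exact error messages are assumptions.

```lua
-- Hypothetical per-role shape check; the real Context:append may differ.
local function shape_turn(turn)
  assert(type(turn.role) == "string", "turn.role required")
  if turn.role == "tool" then
    assert(type(turn.tool_call_id) == "string", "tool turn requires tool_call_id")
    assert(type(turn.content) == "string", "tool turn requires content")
  elseif turn.role == "assistant" and turn.tool_calls ~= nil then
    assert(#turn.tool_calls > 0, "tool_calls must be non-empty when present")
    -- content is allowed to be empty (or absent) on a tool-calling turn
    assert(turn.content == nil or type(turn.content) == "string",
      "content must be a string if set")
  else
    assert(type(turn.content) == "string", "content required")
  end
  -- Keep the tool fields instead of rebuilding the entry as {role, content} only.
  return {
    role = turn.role,
    content = turn.content or "",
    tool_calls = turn.tool_calls,
    tool_call_id = turn.tool_call_id,
  }
end
```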
@@ -185,7 +198,7 @@ This split resolves Q21 (with the C5/C7 review fix folded in).
   "temperature": 0.2,
   "tools": [
     { "type":"function",
-      "function": { "name":"<alias>.<tool>",
+      "function": { "name":"<alias>__<tool>",
                     "description":"...",
                     "parameters": <inputSchema> } },
     ...
@@ -195,7 +208,7 @@ This split resolves Q21 (with the C5/C7 review fix folded in).
 
 The `tools` array is assembled by `mcp.tools_schema()` — flattens
 `tools/list` results from every connected session, namespacing each tool
-as `<alias>.<name>`.
+as `<alias>__<name>`.
 
 ### Response handling (streaming)
 
@@ -203,7 +216,7 @@ llama.cpp / OpenAI deltas may include:
 
 ```json
 data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_…",
-       "function":{"name":"alias.tool","arguments":"{\"a\":"}}]}}]}
+       "function":{"name":"alias__tool","arguments":"{\"a\":"}}]}}]}
 data: {"choices":[{"delta":{"tool_calls":[{"index":0,
        "function":{"arguments":"1}"}}]}}]}
 data: {"choices":[{"finish_reason":"tool_calls",...}]}
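
The deltas in this hunk arrive in fragments: the first carries `id` and `name`, later ones append to `arguments`. One way to reassemble them, assuming each `data:` payload has already been JSON-decoded into Lua tables, is sketched below. `accumulate_tool_call_deltas` is a hypothetical name, not a function from the codebase.

```lua
-- calls: table keyed by the delta's "index" -> { id, name, arguments }.
-- delta_tool_calls: the decoded choices[1].delta.tool_calls array.
local function accumulate_tool_call_deltas(calls, delta_tool_calls)
  for _, d in ipairs(delta_tool_calls) do
    local slot = calls[d.index]
    if not slot then
      slot = { id = nil, name = nil, arguments = "" }
      calls[d.index] = slot
    end
    if d.id then slot.id = d.id end
    -- "function" is a reserved word in Lua, so it needs bracket access.
    local fn = d["function"]
    if fn then
      if fn.name then slot.name = fn.name end
      if fn.arguments then
        -- Argument JSON arrives as string fragments; concatenate in order.
        slot.arguments = slot.arguments .. fn.arguments
      end
    end
  end
  return calls
end
```

The accumulated `arguments` string would only be parsed as JSON once `finish_reason` is `"tool_calls"`, since any intermediate state may be an incomplete document.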
@@ -233,7 +246,7 @@ that text first, then renders the tool-call frame around dispatch.
 ctx:append({
   role = "assistant",
   content = accumulated_text, -- may be "" if model emitted no prose
-  tool_calls = { {id="call_…", name="alias.tool", arguments=<json-string>} },
+  tool_calls = { {id="call_…", name="alias__tool", arguments=<json-string>} },
 })
 ctx:append({
   role = "tool",
@@ -269,7 +282,7 @@ Hit-cap surfaces as a status: `[aish] tool-call depth limit reached`.
 function M.confirm_tool_call(name, args, cfg)
   local policy = cfg.mcp and cfg.mcp.auto_approve or {}
   if policy[name] then return true end
-  -- Per-server prefix check: "alias.*" entries
+  -- Per-server prefix check: "alias__*" entries
   local alias = name:match("^([^.]+)%.")
   if alias and policy[alias .. ".*"] then return true end
   -- Otherwise prompt
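
The context lines in this hunk still extract the alias with `"^([^.]+)%."` and look up `alias .. ".*"`, so only the comment reflects the new separator. A sketch of the prefix check rewritten for `__` follows. The `prompt_fn` parameter is a hypothetical injection point standing in for the `rl.readline` prompt so the function can run in isolation; the documented signature takes only `(name, args, cfg)`.

```lua
local M = {}

-- prompt_fn(name, args) is an assumption for testability, not part of the
-- documented signature; it should return true when the user approves.
function M.confirm_tool_call(name, args, cfg, prompt_fn)
  local policy = cfg.mcp and cfg.mcp.auto_approve or {}
  -- Exact match: "alias__tool"
  if policy[name] then return true end
  -- Per-server prefix check: "alias__*" entries.
  -- Non-greedy capture stops at the leftmost "__".
  local alias = name:match("^(.-)__")
  if alias and alias ~= "" and policy[alias .. "__*"] then return true end
  -- Otherwise prompt
  return prompt_fn(name, args) == true
end
```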
@@ -298,8 +311,8 @@ mcp = {
     },
   },
   auto_approve = {
-    ["boltzmann.read_file"] = true, -- specific tool
-    ["broglie.*"] = true, -- whole server
+    ["boltzmann__read_file"] = true, -- specific tool
+    ["broglie__*"] = true, -- whole server
   },
   max_tool_depth = 8,
 }
@@ -321,8 +334,8 @@ heuristic decides; for non-destructive tools, auto_approve. Outside scope here.
 | `:mcp connect <url> [<alias>]` | Open a session; perform initialize + tools/list; add to active set |
 | `:mcp disconnect <alias>` | Close one session |
 | `:mcp list` | Show connected sessions (alias, url, tool count, status) |
-| `:mcp tools` | List tools across all sessions (`alias.name` — short description) |
-| `:mcp tool <alias.name>` | Show one tool's full inputSchema (debug aid) |
+| `:mcp tools` | List tools across all sessions (`alias__name` — short description) |
+| `:mcp tool <alias__name>` | Show one tool's full inputSchema (debug aid) |
 
 Existing `:help` updated to list these.
 
@@ -445,7 +458,7 @@ and Phase 1 implementation cadence.
 
 2. **`safety.lua` — confirm-gate surface.** Implement just
    `M.confirm_tool_call(name, args, cfg)` per §6. Reads
-   `cfg.mcp.auto_approve` for exact-match and `alias.*` glob. Falls back
+   `cfg.mcp.auto_approve` for exact-match and `alias__*` glob. Falls back
    to `rl.readline` prompt. Norris-mode hooks stay out (Phase 3). **Test
    in isolation** with mocked rl + various policy shapes.
 
@@ -460,7 +473,7 @@ and Phase 1 implementation cadence.
    user turn (§3 row). **Tests in isolation**: (i) build a context with
    assistant+tool_calls + tool turns, round-trip through `to_messages()`,
    eyeball JSON shape; (ii) day-one fallback test (N8) — same context
-   with `use_tool_role = false` must emit the `[tool: alias.name]\n…`
+   with `use_tool_role = false` must emit the `[tool: alias__name]\n…`
    prefix shape instead of a `role:"tool"` message.
 
 4. **`renderer.lua` extensions.** Add `M.tool_call_begin(name, args)`
@@ -483,8 +496,8 @@ and Phase 1 implementation cadence.
 
 6. **`repl.lua` wiring.** New module-local `mcp_sessions = {alias=session,...}`,
    populated from `config.mcp.servers` at startup. Helpers:
-   - `tools_schema()` → flatten `tool` lists across sessions, namespace `alias.name`
-   - `dispatch_tool_call(call)` → split `alias.tool`, look up session, call, return content
+   - `tools_schema()` → flatten `tool` lists across sessions, namespace `alias__name`
+   - `dispatch_tool_call(call)` → split `alias__tool`, look up session, call, return content
    - `ask_ai` loop now: stream response → if any tool_calls completed,
      for each call: `safety.confirm_tool_call` → `dispatch_tool_call` →
      append assistant-with-tool_calls + tool turn → re-call `broker.chat_stream`
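
The two helpers named in this step can be sketched as follows. This is an illustration under assumptions, not the code in `repl.lua`: each session is assumed to expose a `tools` array (as returned by `tools/list`) and a `call_tool(self, tool, arguments)` method, and the error strings are invented for the example.

```lua
-- alias -> session; the session shape (.tools, :call_tool) is an assumption.
local mcp_sessions = {}

-- Flatten every session's tool list into OpenAI-style "tools" entries.
local function tools_schema()
  local out = {}
  for alias, session in pairs(mcp_sessions) do
    for _, tool in ipairs(session.tools) do
      out[#out + 1] = {
        type = "function",
        ["function"] = {
          name = alias .. "__" .. tool.name,
          description = tool.description,
          parameters = tool.inputSchema,
        },
      }
    end
  end
  return out
end

-- Split "alias__tool" at the leftmost "__" and route to the right session.
local function dispatch_tool_call(call)
  local alias, tool = call.name:match("^(.-)__(.+)$")
  if not alias or alias == "" then
    return nil, "malformed tool name: " .. tostring(call.name)
  end
  local session = mcp_sessions[alias]
  if not session then
    return nil, "no MCP session for alias: " .. alias
  end
  return session:call_tool(tool, call.arguments)
end
```

Because the capture is non-greedy, a tool name that itself contains `__` (for example `read__file`) stays intact on the right-hand side of the split.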
@@ -543,7 +556,7 @@ and Phase 1 implementation cadence.
 - **Q18 fallback path** (strict templates rejecting `role:"tool"`).
   Plumb a `context.use_tool_role` flag (default true). If a real-world
   rejection appears at Phase 7, flip the flag and convert tool turns to
-  `[tool: alias.name]\n<content>` prefix on the next user turn (same
+  `[tool: alias__name]\n<content>` prefix on the next user turn (same
   pattern as `pending_exec_output`). **Day-one verification** (N8 in
   review): commit #3 includes a small in-isolation test that builds a
   context with `use_tool_role = false`, appends an assistant+tool_calls
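
The fallback shape described in this hunk can be illustrated with a small helper. `fold_tool_results` is a hypothetical name, and the blank-line separator between blocks is an assumption; the source specifies only the `[tool: alias__name]\n<content>` prefix form.

```lua
-- pending: array of { name = "alias__name", content = "..." } tool results.
-- Returns the text of the next user turn with the tool results prefixed.
local function fold_tool_results(pending, user_text)
  local parts = {}
  for _, t in ipairs(pending) do
    parts[#parts + 1] = ("[tool: %s]\n%s"):format(t.name, t.content)
  end
  parts[#parts + 1] = user_text
  -- Separator choice is an assumption, not taken from the source.
  return table.concat(parts, "\n\n")
end
```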