phase2 amend: __ separator (Bedrock-safe) + post_sse error diagnostics

Phase 7 verify finding from TC #26 against :model cloud:
  HTTP 400 from openrouter→Amazon Bedrock:
  "tools.0.custom.name: String should match pattern
   '^[a-zA-Z0-9_-]{1,128}$'"

Anthropic via Bedrock validates tool names against that regex and
rejects dots. PHASE2 originally chose "." as the namespace separator
("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not.
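
A minimal sketch of a pre-flight check against the pattern quoted above, so a bad name can be caught before it reaches the wire. The helper name is_bedrock_safe is hypothetical and not part of this commit. Lua patterns have no {m,n} quantifier, so the length bound is tested separately:

```lua
-- Hypothetical helper (not in the repo): mirrors ^[a-zA-Z0-9_-]{1,128}$.
local function is_bedrock_safe(name)
  if type(name) ~= "string" then return false end
  -- Lua patterns lack {1,128}; enforce the length bound by hand.
  if #name < 1 or #name > 128 then return false end
  -- "%-" escapes the hyphen inside the character class.
  return name:find("^[a-zA-Z0-9_%-]+$") ~= nil
end

print(is_bedrock_safe("boltzmann.list_dir"))   -- dotted form, rejected
print(is_bedrock_safe("boltzmann__list_dir"))  -- "__" form, accepted
```

Note that the 128-character limit applies to the full "alias__name" string, not to the tool name alone.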

Separator switched to "__" (two underscores) everywhere — internal
API matches on-wire shape, no transformation layer:

  - repl.lua:
    - tools_schema builds "alias__name"
    - dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __)
    - :mcp tool parser uses same split
    - :mcp tools formatter prints "alias__name"
    - HELP block shows <alias__name>
  - safety.lua confirm_tool_call: alias.* glob → alias__* glob
  - config.lua example block: keys rewritten
  - docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua
    row, §5 wire-shape JSON examples, §6 auto_approve schema, §7
    meta-cmd table, §12 plan all updated. Original "." references
    preserved in commit history.

Constraint: aliases must not themselves contain "__" so the parse
stays unambiguous. Tool names from MCP servers may have underscores
freely.
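
The build/split round trip can be sketched as below. The split pattern is the one named above; the helper names (valid_alias, qualify, split_qualified) are hypothetical. The trailing-underscore check is an extra precaution that this commit does not state: an alias ending in "_" would shift the leftmost "__" one character to the left.

```lua
local SEP = "__"

-- Hypothetical validator: the commit only requires "no __ inside an alias";
-- the non-empty and trailing-"_" checks are assumptions added here.
local function valid_alias(alias)
  return type(alias) == "string" and alias ~= ""
    and not alias:find(SEP, 1, true)   -- plain find, no pattern magic
    and alias:sub(-1) ~= "_"
end

local function qualify(alias, name)
  assert(valid_alias(alias), "invalid alias: " .. tostring(alias))
  return alias .. SEP .. name
end

-- Non-greedy first capture, so the split lands on the leftmost "__".
-- Returns nil when the string contains no separator.
local function split_qualified(qualified)
  return qualified:match("^(.-)__(.+)$")
end

print(split_qualified(qualify("boltzmann", "read_file")))
print(split_qualified("fs__get__item"))  -- tool name keeps its own "__"
```

Because the split is leftmost, a tool name such as "get__item" survives intact, which is why only aliases need the restriction.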

Second fix bundled — uninformative broker error:
  Previously "broker error: transport: HTTP response code said error"
  Now      "broker error: transport: HTTP 400: {full body snippet}"

ffi/curl.lua M.post_sse changes:
  - FAILONERROR no longer set (was hiding the response body).
  - raw_body accumulator added alongside the SSE buffer; captures
    every byte regardless of SSE shape.
  - After perform, check status_code via curl_easy_getinfo. On >=400,
    return (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged.
  - End-of-stream SSE flush only runs when status < 400 (no false
    event on error bodies that aren't SSE-shaped).
  - Phase 1 callers reading just first return slot stay correct.
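
The return contract described above can be sketched without libcurl. The function finish is a hypothetical stand-in for the tail of M.post_sse, taking the curl result code, the curl error string, the HTTP status and the accumulated body as plain arguments:

```lua
-- Hypothetical stand-in for the post-perform tail of M.post_sse.
local function finish(rc, curl_err, status, raw_body)
  -- Transport failure: report the curl error string as before.
  if rc ~= 0 then return nil, curl_err end
  -- HTTP-level failure: surface the status and at most 400 bytes of body.
  if status >= 400 then
    local snippet = raw_body ~= "" and raw_body:sub(1, 400) or "(no body)"
    return nil, ("HTTP %d: %s"):format(status, snippet)
  end
  return true
end

-- Phase 1 style caller: reads only the first slot, still sees nil on error.
local ok = finish(0, nil, 400, '{"error":"bad tool name"}')
print(ok)

-- Newer caller: reads both slots and can show the upstream message.
local ok2, err = finish(0, nil, 400, '{"error":"bad tool name"}')
print(ok2, err)
```

The first slot stays nil on any failure and true on success, so callers that ignore the second slot keep their old behaviour.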

End-to-end verified:
  - :model cloud + tools=[boltzmann__read_file ...] +
    "Use boltzmann__read_file with path=/etc/hostname" →
    Claude emits tool_call with name="boltzmann__read_file",
    args='{"path": "/etc/hostname"}'. ok=true, transport clean.
  - Force-bad tool name "bad.name.with.dots" → err string carries
    the full bedrock 400 with the regex-pattern message visible.

TC #26 (sub-loop end-to-end) is now testable against cloud — the
error that blocked it is resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:04:57 +00:00
parent 3fa6279f5b
commit f26cbd9a3a
5 changed files with 89 additions and 49 deletions
+27 -8
@@ -157,7 +157,11 @@ function M.post_sse(url, body, headers, on_event, timeout_ms)
 if handle == nil then return nil, "curl_easy_init returned NULL" end
 -- SSE parse state: buffer holds incomplete tail between callback deliveries.
-local buffer = ""
+-- raw_body captures every byte we receive (regardless of SSE shape) so we
+-- can surface upstream error bodies (e.g. openrouter→bedrock 400 with a
+-- non-SSE JSON envelope). Truncated only at error-message time.
+local buffer = ""
+local raw_body = ""
 local cb_error = nil
 local write_cb = ffi.cast(
@@ -169,7 +173,9 @@ function M.post_sse(url, body, headers, on_event, timeout_ms)
 -- documents that as process-fatal. Surface via cb_error and let
 -- curl keep draining (return n) so we can report after perform.
 local ok, err = pcall(function()
-buffer = buffer .. ffi.string(ptr, n)
+local chunk = ffi.string(ptr, n)
+raw_body = raw_body .. chunk
+buffer = buffer .. chunk
 while true do
 local b = buffer:find("\n\n", 1, true)
 if not b then break end
@@ -206,21 +212,30 @@ function M.post_sse(url, body, headers, on_event, timeout_ms)
 setopt_ptr (handle, OPT.HTTPHEADER, slist)
 setopt_ptr (handle, OPT.WRITEFUNCTION, write_cb)
 setopt_long(handle, OPT.NOSIGNAL, 1)
-setopt_long(handle, OPT.FAILONERROR, 1)
+-- FAILONERROR intentionally NOT set: we want to read the response body
+-- on >=400 so the caller can surface upstream API errors (bedrock
+-- rejecting tool-name format, openrouter quota, etc.) instead of just
+-- "HTTP response code said error". Status code is checked after perform.
 setopt_str (handle, OPT.USERAGENT, "aish/0.0 (luajit-ffi)")
 if timeout_ms then
 setopt_long(handle, OPT.TIMEOUT_MS, timeout_ms)
 end
 local rc = C.curl_easy_perform(handle)
-local err
-if rc ~= 0 then err = ffi.string(C.curl_easy_strerror(rc)) end
+local err, status
+if rc == 0 then
+status = get_response_code(handle)
+else
+err = ffi.string(C.curl_easy_strerror(rc))
+end
 -- End-of-stream flush: the final event may lack a trailing \n\n if the
 -- server closed the connection right after writing the last data: line
 -- (some llama.cpp builds, and any plain HTTP/1.0 close-on-EOF feed).
 -- Parse any remaining buffer content as one last event. Same pcall shield.
-if rc == 0 and #buffer > 0 then
+-- Only flush on 2xx — on error responses the buffer is the error body,
+-- not an SSE event.
+if rc == 0 and status < 400 and #buffer > 0 then
 local ok, perr = pcall(function()
 local data_parts = {}
 for line in (buffer .. "\n"):gmatch("([^\n]*)\n") do
@@ -240,8 +255,12 @@ function M.post_sse(url, body, headers, on_event, timeout_ms)
 write_cb:free()
 if cb_error then return nil, "callback: " .. tostring(cb_error) end
-if rc == 0 then return true end
-return nil, err
+if rc ~= 0 then return nil, err end
+if status >= 400 then
+local snippet = raw_body ~= "" and raw_body:sub(1, 400) or "(no body)"
+return nil, ("HTTP %d: %s"):format(status, snippet)
+end
+return true
 end
 return M