Files
aish/safety.lua
T
marfrit f26cbd9a3a phase2 amend: __ separator (Bedrock-safe) + post_sse error diagnostics
Phase 7 verify finding from TC #26 against :model cloud:
  HTTP 400 from openrouter→Amazon Bedrock:
  "tools.0.custom.name: String should match pattern
   '^[a-zA-Z0-9_-]{1,128}$'"

Anthropic via Bedrock validates tool names against that regex and
rejects dots. PHASE2 originally chose "." as the namespace separator
("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not.

Separator switched to "__" (two underscores) everywhere — internal
API matches on-wire shape, no transformation layer:

  - repl.lua:
    - tools_schema builds "alias__name"
    - dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __)
    - :mcp tool parser uses same split
    - :mcp tools formatter prints "alias__name"
    - HELP block shows <alias__name>
  - safety.lua confirm_tool_call: alias.* glob → alias__* glob
  - config.lua example block: keys rewritten
  - docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua
    row, §5 wire-shape JSON examples, §6 auto_approve schema, §7
    meta-cmd table, §12 plan all updated. Original "." references
    preserved in commit history.

Constraint: aliases must not themselves contain "__" so the parse
stays unambiguous. Tool names from MCP servers may have underscores
freely.

Second fix bundled — uninformative broker error:
  Previously "broker error: transport: HTTP response code said error"
  Now      "broker error: transport: HTTP 400: {full body snippet}"

ffi/curl.lua M.post_sse changes:
  - FAILONERROR no longer set (was hiding the response body).
  - raw_body accumulator added alongside the SSE buffer; captures
    every byte regardless of SSE shape.
  - After perform, check status_code via curl_easy_getinfo. On >=400,
    return (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged.
  - End-of-stream SSE flush only runs on 2xx (no false event on
    error bodies that aren't SSE-shaped).
  - Phase 1 callers reading just first return slot stay correct.

End-to-end verified:
  - :model cloud + tools=[boltzmann__read_file ...] +
    "Use boltzmann__read_file with path=/etc/hostname" →
    Claude emits tool_call with name="boltzmann__read_file",
    args='{"path": "/etc/hostname"}'. ok=true, transport clean.
  - Force-bad tool name "bad.name.with.dots" → err string carries
    the full bedrock 400 with the regex-pattern message visible.

TC #26 (sub-loop end-to-end) is now testable against cloud — the
error that blocked it is resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:04:57 +00:00

56 lines
2.1 KiB
Lua

-- safety.lua — workflow safeguards for tool execution.
-- Phase 2: M.confirm_tool_call only (per-call confirm gate, with config-driven
-- auto-approve policy). See docs/PHASE2.md §6.
-- Phase 3 (deferred): destructive-op heuristic + Norris autonomous gate.
local rl = require("ffi.readline")
local json = require("dkjson")
local M = {}
-- Render the call as `name({"path":"/tmp"})` for the confirm prompt.
-- Truncate to keep one-line prompts.
local function pretty_call(name, args)
local body = ""
if args and next(args) then
local ok, encoded = pcall(json.encode, args)
if ok then
body = (#encoded <= 80) and encoded or (encoded:sub(1, 77) .. "...")
else
body = "..."
end
end
return name .. "(" .. body .. ")"
end
-- Ask the user whether tool `name` may be called with `args`, consulting
-- `cfg.mcp.auto_approve` first. Policy keys:
-- "<alias>__<tool>" → exact-match auto-approve
-- "<alias>__*" → whole-server auto-approve
-- Anything else falls back to a [y/N] prompt; empty / non-"y" answer rejects.
-- The separator switched from "." to "__" 2026-05-12 because Anthropic via
-- Bedrock rejects dots in tool names (regex ^[a-zA-Z0-9_-]{1,128}$).
function M.confirm_tool_call(name, args, cfg)
local policy = (cfg and cfg.mcp and cfg.mcp.auto_approve) or {}
if policy[name] then return true end
local alias = name:match("^(.-)__")
if alias and alias ~= "" and policy[alias .. "__*"] then return true end
local prompt = ("call '%s'? [y/N] "):format(pretty_call(name, args))
local ans = rl.readline(prompt) or ""
return ans:lower():sub(1, 1) == "y"
end
-- ---------------------------------------------------------------- Phase 3 stubs
-- Destructive-op heuristic for Norris autonomous mode. Not part of the
-- Phase 2 surface (see docs/PHASE2.md §10 / PHASE0.md §11 row 3).
function M.is_destructive(cmd)
error("safety.is_destructive: not implemented (Phase 3)")
end
function M.norris_step(plan, broker, executor)
error("safety.norris_step: not implemented (Phase 3)")
end
return M