safety + repl: wire secrets into safety.lua (closes #52)
Closes the last #13 gap: the Norris broker call and the is_destructive
LLM second-opinion probe were the two egress points NOT covered by the
scrub-at-egress design in commit d852aca.
Approach: option (b) per #52's fix sketch — callback-via-helpers/opts.
safety.lua does NOT gain a require("secrets") dependency (acceptance
criterion 3); integration is purely through the convention the rest
of the helpers table already uses.
safety.lua changes:
- llm_probe gains an opts table. When opts.scrub_msgs is set, the
{system, user(cmd)} message pair is scrubbed before broker.chat.
When opts.rehydrate is set, the YES/NO reply is rehydrated before
parsing (defensive — the verdict shouldn't carry placeholders but
rehydration is a safe no-op if it doesn't).
- llm_second_opinion threads opts through to llm_probe.
- M.is_destructive(cmd, cfg, opts) — opts optional; nil-opts is
backwards-compatible (no scrub, original behavior).
- M.norris_step:
* outbound broker.chat_stream message scrubbed via
helpers.scrub_msgs(ctx:to_messages(), model_cfg) when provided.
* on_delta wrapped with helpers.streaming_rehydrator():push /
:flush so the user sees rehydrated text AND text_parts
accumulates rehydrated chunks (parity with ask_ai in repl.lua).
* both M.is_destructive call sites (tool_call probe + CMD: probe)
now pass probe_opts = {scrub_msgs, rehydrate} when the
helpers carry them.
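The on_delta wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in safety.lua or secrets.lua: `new_streaming_rehydrator` and `wrap_on_delta` are hypothetical names, the placeholder map is a toy table, and the hold-back rule (keep any trailing `$...` run until more text arrives) is an assumption about how a placeholder split across chunks might be handled.

```lua
-- Hypothetical streaming rehydrator: `map` maps placeholder -> secret.
local function new_streaming_rehydrator(map)
  local buf = ""
  local self = {}

  local function rehydrate(text)
    -- Parentheses drop gsub's second return value (the match count).
    return (text:gsub("%$AISH_SECRET_%d+", function(p) return map[p] or p end))
  end

  -- Returns the text that is safe to show now. A trailing "$..." run is
  -- held back because it may still grow into a full placeholder.
  function self:push(chunk)
    buf = buf .. chunk
    local cut = #buf
    local i = buf:find("%$[%w_]*$")
    if i then cut = i - 1 end
    local out = rehydrate(buf:sub(1, cut))
    buf = buf:sub(cut + 1)
    return out
  end

  -- Emits whatever is still buffered at end of stream.
  function self:flush()
    local out = rehydrate(buf)
    buf = ""
    return out
  end

  return self
end

-- Wraps on_delta so the display callback and text_parts both receive
-- rehydrated text. A nil rehydrator means pass-through (no secrets).
local function wrap_on_delta(on_delta, rehydrator, text_parts)
  local function emit(text)
    if text ~= "" then
      text_parts[#text_parts + 1] = text
      on_delta(text)
    end
  end
  local function delta(chunk)
    emit(rehydrator and rehydrator:push(chunk) or chunk)
  end
  local function finish()
    if rehydrator then emit(rehydrator:flush()) end
  end
  return delta, finish
end

-- Usage: the placeholder arrives split across two chunks.
local shown, parts = {}, {}
local delta, finish = wrap_on_delta(
  function(t) shown[#shown + 1] = t end,
  new_streaming_rehydrator({ ["$AISH_SECRET_001"] = "ghp_example" }),
  parts)
delta("token is $AISH_SEC")
delta("RET_001 ok")
finish()
print(table.concat(parts))
```

Holding back the trailing fragment trades a little display latency for never showing a half placeholder; the flush call at end of stream is what guarantees nothing stays buffered.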
repl.lua changes:
- Norris helpers table gains scrub_msgs / rehydrate /
streaming_rehydrator closures, all nil-safe (return identity /
nil when secrets_session is nil).
- :safety check meta passes probe_opts to is_destructive when
secrets_session is configured. Without secrets, behavior unchanged.
Unit-test verified end-to-end:
- Stubbed broker.chat captures the messages it receives.
- Without opts: probe SEES `ghp_realsecretvalue_...` (control).
- With opts: probe sees `$AISH_SECRET_NNN` (correct scrub).
Regression: test_safety 87/87, test_router_model 31/31, repl loads.
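A control-versus-scrub check of this kind might look like the sketch below. It is an illustration under stated assumptions, not the actual test: the `broker` stub, the `llm_probe` body, the prompt text, the secret value, and the single-entry scrubber are all hypothetical stand-ins for what safety.lua and secrets.lua provide.

```lua
-- Stub broker: records the messages of the last chat call, answers "NO".
local broker = { last = nil }
function broker.chat(msgs)
  broker.last = msgs
  return "NO"
end

-- Hypothetical probe in the shape the commit message describes:
-- scrub before the broker call, rehydrate the reply before parsing.
local function llm_probe(cmd, opts)
  opts = opts or {}  -- nil opts keeps the original, unscrubbed behavior
  local msgs = {
    { role = "system", content = "Answer YES or NO: is this command destructive?" },
    { role = "user", content = cmd },
  }
  if opts.scrub_msgs then msgs = opts.scrub_msgs(msgs) end
  local reply = broker.chat(msgs)
  if opts.rehydrate then reply = opts.rehydrate(reply) end
  return reply:upper():match("^%s*YES") ~= nil
end

-- Toy scrubber: one known secret, one placeholder. The secret contains
-- no Lua pattern magic characters, so it is safe to use as a pattern.
local SECRET, PLACEHOLDER = "ghp_examplesecret", "$AISH_SECRET_001"
local probe_opts = {
  scrub_msgs = function(msgs)
    local out = {}
    for i, m in ipairs(msgs) do
      local content = m.content:gsub(SECRET, function() return PLACEHOLDER end)
      out[i] = { role = m.role, content = content }
    end
    return out
  end,
  rehydrate = function(text)
    return (text:gsub("%$AISH_SECRET_001", function() return SECRET end))
  end,
}

local cmd = "curl -H 'Authorization: token ghp_examplesecret' https://example.invalid"

llm_probe(cmd)                          -- control: no opts
local raw_seen = broker.last[2].content

llm_probe(cmd, probe_opts)              -- with scrub/rehydrate callbacks
local scrubbed_seen = broker.last[2].content

assert(raw_seen:find(SECRET, 1, true))            -- control sees the secret
assert(not scrubbed_seen:find(SECRET, 1, true))   -- scrubbed call does not
assert(scrubbed_seen:find(PLACEHOLDER, 1, true))  -- placeholder is present
```

Passing the callbacks through opts keeps the probe free of any direct dependency on the secrets module, which is the property the commit message calls out.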
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1208,6 +1208,21 @@ function M.run(config)
     render_assistant_delta = renderer.assistant_delta,
     render_assistant_flush = renderer.assistant_flush,
     log_turn = log_turn,
+    -- Issue #52: pass secrets-aware callbacks so safety.lua
+    -- can scrub outbound Norris broker messages + LLM probe
+    -- inputs + rehydrate streamed replies. All three are nil-
+    -- safe; safety.lua only wires them in when present.
+    scrub_msgs = function(msgs, mode_cfg)
+      return scrub_messages(msgs, secrets_mode_for(mode_cfg or active_cfg))
+    end,
+    rehydrate = function(text)
+      return secrets_session and secrets_session:rehydrate(text) or text
+    end,
+    streaming_rehydrator = function()
+      return secrets_session
+        and secrets.streaming_rehydrator(secrets_session)
+        or nil
+    end,
   }
 
   local step_n = 1
@@ -1641,7 +1656,19 @@ function M.run(config)
   end
   -- Pass cfg so the LLM probe runs; user can opt-out via
   -- :safety check --no-llm <cmd> if added in v2.
-  local hit, reason = safety.is_destructive(cmd, config)
+  -- Issue #52: thread secrets scrub/rehydrate so the probe
+  -- model sees placeholders for any secrets in `cmd`.
+  local probe_opts
+  if secrets_session then
+    probe_opts = {
+      scrub_msgs = function(msgs, mode_cfg)
+        return scrub_messages(msgs,
+          secrets_mode_for(mode_cfg or active_cfg))
+      end,
+      rehydrate = function(t) return secrets_session:rehydrate(t) end,
+    }
+  end
+  local hit, reason = safety.is_destructive(cmd, config, probe_opts)
   if hit then
     renderer.status(("DESTRUCTIVE — %s"):format(reason or "?"))
   else