safety: permission policy DSL — allow/confirm/deny rule lists (closes #9)

The confirm_cmd boolean was too coarse: true interrupts every harmless ls;
false ungates everything. Most workflows want trust for read-only ops while
still gating writes/network/sudo.

New config:

    permissions = {
      allow   = { "^ls%s", "^cat%s", "^git status" },
      confirm = { "^rm%s", "^git push", "^docker%s", "^sudo%s" },
      deny    = { "^ssh%s+root@", "^curl%s+http[^s]" },
    }

Verdict order: deny > confirm > allow. First match in the chosen category
wins. Unmatched defaults to "confirm". Patterns are Lua patterns (not regex)
per PHASE0.md §3: no compiled extensions.

Verdict behavior in the interactive CMD: loop:
  - allow   → run without prompt
  - deny    → status line, skip
  - confirm → [y/N] prompt (same UX as legacy confirm_cmd=true)

Backward compat:
  - permissions unset + confirm_cmd=true  → always confirm
  - permissions unset + confirm_cmd=false → always allow
  - permissions set                       → policy table is authoritative

Scope deliberately limited to the interactive AI-suggested CMD: gate. Norris
autonomous mode keeps its own safety.is_destructive machinery (combining the
two would double-gate or replace the LLM probe, both non-obvious behavioral
changes that belong in their own issues). User-typed shell-routed lines
(`router.classify → "shell"`) and :exec also bypass the policy by design;
those are direct user intent.

New introspection:
  :perms list         show the configured rule lists
  :perms check <cmd>  report verdict + matching rule (debug)

safety.classify_command is exported and unit-tested with 12 cases covering
each category, priority order (deny > allow on overlap), and both fallback
paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
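
Illustration (not part of the change): a minimal standalone sketch of how the
example config above would resolve a few commands. It copies the matching
logic from this commit's safety.classify_command into local functions so it
runs without the module; the names match_any, classify_command (as a local),
and the sample commands are assumptions made for the example.

```lua
-- Standalone copy of the matching logic, for illustration only.
local function match_any(cmd, rules)
    if not rules then return nil end
    for _, p in ipairs(rules) do
        if cmd:find(p) then return p end   -- Lua pattern match, not regex
    end
    return nil
end

local function classify_command(cmd, cfg)
    local perms = cfg and cfg.permissions
    if perms then
        -- Categories are checked in priority order: deny > confirm > allow.
        for _, verdict in ipairs({ "deny", "confirm", "allow" }) do
            local p = match_any(cmd, perms[verdict])
            if p then return verdict, p end
        end
        return "confirm", nil              -- permissions set, nothing matched
    end
    -- Legacy fallback: no permissions table configured.
    if cfg and cfg.shell and cfg.shell.confirm_cmd then
        return "confirm", nil
    end
    return "allow", nil
end

-- The example policy from the commit message.
local cfg = {
    permissions = {
        allow   = { "^ls%s", "^cat%s", "^git status" },
        confirm = { "^rm%s", "^git push", "^docker%s", "^sudo%s" },
        deny    = { "^ssh%s+root@", "^curl%s+http[^s]" },
    },
}

-- Hypothetical sample commands.
local samples = {
    "ls -la",
    "ls",
    "git push origin main",
    "ssh root@example.com",
    "curl http://example.com",
    "curl https://example.com",
}

for _, cmd in ipairs(samples) do
    local verdict, rule = classify_command(cmd, cfg)
    print(string.format("%-26s %-8s %s", cmd, verdict, rule or "(no rule matched)"))
end
```

Two things this makes visible: a bare `ls` has no trailing whitespace, so
"^ls%s" does not match and it falls through to the default "confirm"; and
`curl https://...` is not denied, because "[^s]" rejects the character after
"http" only when it is not an "s".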
2026-05-16 21:20:56 +00:00
parent 518c01a9f5
commit 17e62c0326
3 changed files with 86 additions and 4 deletions
@@ -3,6 +3,8 @@
 -- Phase 3: M.is_destructive (static pattern + LLM second-opinion gate for
 --          Norris autonomous mode) and M.norris_step (single-iteration
 --          planning loop). See docs/PHASE2.md §6 and docs/PHASE3.md §4 / §5.
+-- Issue #9: M.classify_command (allow/confirm/deny rule list — interactive
+--           CMD: gate, supersedes the confirm_cmd boolean when configured).

 local rl     = require("ffi.readline")
 local json   = require("dkjson")
@@ -10,6 +12,35 @@ local broker = require("broker")

 local M = {}

+-- ---------------------------------------------------------------- classify_command
+-- Walk config.permissions (allow / confirm / deny rule lists) against `cmd`
+-- in priority order: deny > confirm > allow. First match in the chosen
+-- category wins. Returns the verdict string and the matching pattern (for
+-- status messages); falls back to the legacy confirm_cmd boolean when no
+-- permissions table is configured. Default verdict when permissions is set
+-- but no rule matches is "confirm" — per the issue body.
+--   verdict ∈ "allow" | "confirm" | "deny"
+local function _match_any(cmd, rules)
+    if not rules then return nil end
+    for _, p in ipairs(rules) do
+        if cmd:find(p) then return p end
+    end
+    return nil
+end
+function M.classify_command(cmd, cfg)
+    local perms = cfg and cfg.permissions
+    if perms then
+        local mp = _match_any(cmd, perms.deny);    if mp then return "deny",    mp end
+              mp = _match_any(cmd, perms.confirm); if mp then return "confirm", mp end
+              mp = _match_any(cmd, perms.allow);   if mp then return "allow",   mp end
+        return "confirm", nil
+    end
+    if cfg and cfg.shell and cfg.shell.confirm_cmd then
+        return "confirm", nil
+    end
+    return "allow", nil
+end
+
 -- Render the call as `name({"path":"/tmp"})` for the confirm prompt.
 -- Truncate to keep one-line prompts.
 local function pretty_call(name, args)