91ddcb005d
Independent review surfaced 3 BLOCKERs + 6 CONCERNs + 7 NITs against
the analyze-tier draft. Resolutions applied:
BLOCKERs:
B1 Shell-wrapper bypass — static patterns leaked on bash -c, sh -c,
eval, pipe-to-shell, python -c, xargs|rm. Added 9 wrapper
patterns to §5. Norris HALTs on any wrapper invocation; the user
reads the inner command before proceeding. The patterns are the
conservative floor against the wrapper-bypass class.
B2 LLM second-opinion was self-policing — same model class
generating actions then judging them. Switched probe model
from `fast` to `deep` (qwen3-30b). Added re-roll inversion:
if first probe says NO, ask "is this SAFE?". Disagreement
between two probes → HALT. Cheap independent-class insurance.
B3 `is_destructive` would have run on interactive CMD: extraction
— a PHASE0 §6/§10 substrate amendment in disguise. Resolved
Q24: heuristic runs ONLY when norris_active == true. No
substrate change; interactive `confirm_cmd` semantics unchanged.
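
Illustration for B1 (a rough sketch, not the §5 table): the snippet
below flags the wrapper forms named above. The regexes and the
`find_wrapper` helper are hypothetical and are not the nine patterns
added to §5. `xargs|rm` is read here, as an assumption, as a pipeline
into `xargs rm`.

```python
import re

# Hypothetical, illustrative patterns only. The real nine live in §5.
WRAPPER_PATTERNS = [
    # bash -c / sh -c, optionally with other short flags before -c
    ("shell -c", re.compile(r"(?:^|[\s;&|])(?:bash|sh)\s+(?:-\w+\s+)*-c\b")),
    # eval followed by an argument
    ("eval", re.compile(r"(?:^|[\s;&|])eval\s")),
    # anything piped straight into a shell
    ("pipe-to-shell", re.compile(r"\|\s*(?:bash|sh)\b")),
    # python -c / python3 -c
    ("python -c", re.compile(r"(?:^|[\s;&|])python[\d.]*\s+(?:-\w+\s+)*-c\b")),
    # pipeline into xargs rm (assumed reading of "xargs|rm")
    ("xargs rm", re.compile(r"\|\s*xargs\s+(?:-\S+\s+)*rm\b")),
]


def find_wrapper(cmd: str):
    """Return the name of the first wrapper pattern that matches, else None."""
    for name, pattern in WRAPPER_PATTERNS:
        if pattern.search(cmd):
            return name
    return None
```

Under this sketch any non-None result would map to a HALT; the inner
command is deliberately not parsed, since the floor is meant to be
conservative rather than clever.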
CONCERNs:
C1 Skip-budget: consecutive_user_skips counter; 3+ similar skips
escalate to abort/force-proceed prompt.
C2 Algorithm-vs-Q25-resolution contradiction: §4 reordered to
dispatch ALL pending actions before checking GOAL: complete.
C3 Norris-goal eviction: goal embedded directly in the dynamic
system-prompt suffix; survives sliding-window eviction.
C4 Readline use-after-free window: M.bind no longer frees old
callbacks; they stay pinned for the process lifetime (bounded
memory cost).
C5 GOAL: complete matcher: line-level scan, exact match after
trim — substrate-aligned with CMD: rigor.
C6 §4 step 4 tightened: auto_approve does NOT bypass destructive
heuristic; tool_call without auto_approve still HALTs even
when destructive-clear (Norris conservative).
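
Illustration for C5 (a minimal sketch, assuming the marker is the
literal string `GOAL: complete`): the matcher scans line by line and
accepts only an exact match after trimming. The function name
`goal_complete` is hypothetical.

```python
def goal_complete(response: str, marker: str = "GOAL: complete") -> bool:
    """True only if some line equals the marker exactly after trimming."""
    # Line-level scan: a marker embedded in a longer sentence does not count.
    return any(line.strip() == marker for line in response.splitlines())
```

A mention such as "I will emit GOAL: complete later" does not match
under this rule, which is the point of requiring a whole-line match.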
NITs deferred or rolled into pattern table:
- chown root-path pattern tightened (NIT 2 in-line)
- Test corpus expansion noted in §12 commit #1 risk
- Other NITs are wording-level
Status: Plan (review folded). Ready for commit #1 (safety static
patterns) once another review pass clears.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>