docs/PHASE3: re-review NIT fold-in — pipe-to-sh EOL, ci= note, §12 sync

Re-review surfaced one new BLOCKER + two CONCERNs + four NITs. Folded:

N1 BLOCKER: `|%s*sh%f[%s]` missed `curl x | sh`, the canonical
   wrapper bypass with `sh` at end-of-string. Lua's `%f[%s]` requires a
   transition INTO whitespace, and the end of the subject is treated as
   `\0`, so the frontier never matches at EOL. Replaced with two
   patterns each for sh and bash: `|%s*sh%s` (followed by
   whitespace/args) and `|%s*sh%s*$` (end-of-string), plus the bash
   equivalents. Verified against 18 wrapper-bypass test cases; all
   canonical idioms now HALT.

N2 CONCERN: `ci=true` rule flag had no implementation note. Added one
   sentence to §5 explaining the matcher lowercases the input string
   when ci is set.

N3 CONCERN: §12 commit #5 description was stale — still said
   "extends interactive CMD: extraction to consult is_destructive"
   which contradicts the R-B3 resolution (Norris-only). Rewrote
   commit #5 description to match R-B3, and bundled the
   ffi/readline.lua `_bound[seq]:free()` removal into commit #5's
   scope with explicit "Phase 1 amendment" callout. Same for the
   §12 risk note that still referenced the dropped behavior change.

The remaining NITs (N4 skip threshold, N5 approved-turn mention, N6
:model swap interaction, N7 commit-attribution wording) are cosmetic
and will be folded in during implementation if they prove material.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 22:45:25 +00:00
parent 91ddcb005d
commit 125f800513
+23 -13
@@ -254,8 +254,10 @@ local DESTRUCTIVE_PATTERNS = {
  { pat = "^%s*eval%s", reason = "eval (dynamic shell)" },
  { pat = "^%s*python3?%s+%-c%s", reason = "python -c (inline script)" },
  { pat = "^%s*perl%s+%-e%s", reason = "perl -e (inline script)" },
- { pat = "|%s*sh%f[%s]", reason = "pipe-to-sh" },
- { pat = "|%s*bash%f[%s]", reason = "pipe-to-bash" },
+ { pat = "|%s*sh%s", reason = "pipe-to-sh" },
+ { pat = "|%s*sh%s*$", reason = "pipe-to-sh (eol)" },
+ { pat = "|%s*bash%s", reason = "pipe-to-bash" },
+ { pat = "|%s*bash%s*$", reason = "pipe-to-bash (eol)" },
  { pat = "xargs%s+.-rm", reason = "xargs ... rm" },
  -- ── Filesystem destructive
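The frontier behavior behind this hunk can be reproduced in a few lines. The following is a minimal sketch, not the project's matcher: `OLD`, `NEW`, and `any_match` are hypothetical names introduced here, and only the pattern strings are taken from the diff.

```lua
-- Pattern strings copied from the diff; table names are illustrative only.
local OLD = { "|%s*sh%f[%s]", "|%s*bash%f[%s]" }
local NEW = { "|%s*sh%s", "|%s*sh%s*$", "|%s*bash%s", "|%s*bash%s*$" }

-- Hypothetical helper: true if any pattern in `pats` matches `cmd`.
local function any_match(cmd, pats)
  for _, pat in ipairs(pats) do
    if cmd:find(pat) then return true end
  end
  return false
end

local cases = {
  "curl x | sh",          -- sh at end-of-string
  "curl x | sh -s -- -y", -- sh followed by arguments
  "curl x | bash",        -- bash at end-of-string
  "curl x |bash  ",       -- bash followed by trailing whitespace
  "echo a | shuf",        -- not a shell: should stay unmatched
}

for _, cmd in ipairs(cases) do
  print(string.format("%-24s old=%-5s new=%s",
    cmd, tostring(any_match(cmd, OLD)), tostring(any_match(cmd, NEW))))
end
```

At the end of the subject string the "next character" is treated as `\0`, which is not in `%s`, so the old frontier patterns only fire when whitespace actually follows `sh` or `bash`. The split rules cover both the whitespace and end-of-string cases while still leaving `shuf` alone.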
@@ -292,7 +294,10 @@ local DESTRUCTIVE_PATTERNS = {
  The 9 wrapper patterns are the conservative floor against R-B1 bypass classes. Norris emits `bash -c '...'` → wrapper hit → HALT (user can proceed if they read the inner). LLM second-opinion still runs as a backup but the static net catches the obvious cases first.
- Patterns are Lua patterns (not regex), `ci = true` enables case-insensitive match.
+ Patterns are Lua patterns (not regex). `ci = true` enables case-insensitive
+ match — the matcher loop lowercases the input string when `ci` is set on
+ the rule, so `DROP TABLE` and `drop table x` and `Drop Table` all match
+ the same rule. Without `ci`, patterns are case-sensitive (the default).
  ### LLM second-opinion (when static doesn't HALT)
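The `ci` note added in this hunk could be implemented along the following lines. This is a sketch under stated assumptions: `RULES` and `first_hit` are hypothetical names, and the `drop%s+table` pattern is invented for illustration (only the `eval` rule appears in the diff). It also assumes `ci` patterns are themselves written in lowercase, since only the input is lowercased.

```lua
-- Hypothetical rule table standing in for DESTRUCTIVE_PATTERNS.
local RULES = {
  -- Invented example rule; written in lowercase because the input is lowercased.
  { pat = "drop%s+table", ci = true, reason = "SQL DROP TABLE" },
  -- Taken from the diff; no `ci`, so it stays case-sensitive.
  { pat = "^%s*eval%s", reason = "eval (dynamic shell)" },
}

-- Returns the reason of the first matching rule, or nil if nothing matches.
local function first_hit(cmd, rules)
  local lowered = cmd:lower() -- computed once, used only by ci rules
  for _, rule in ipairs(rules) do
    local subject = rule.ci and lowered or cmd
    if subject:find(rule.pat) then return rule.reason end
  end
  return nil
end

print(first_hit("DROP TABLE users", RULES))
print(first_hit("Drop Table x", RULES))
print(first_hit("EVAL foo", RULES))
```

Lowercasing the input rather than rewriting each pattern keeps the rule table readable, at the cost of requiring every `ci` pattern to be authored in lowercase.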
@@ -522,11 +527,17 @@ Bottom-up, same cadence as Phase 0/1/2. Six commits expected:
  5. **`repl.lua` — Norris driver + `\C-n` real binding + `:norris` meta.**
  The while-loop driver consuming `safety.norris_step`, the rebound
  `\C-n` (replacing Phase 1 placeholder), the `:norris <goal>` /
- `:norris off` meta cmds, and `\C-x\C-c` abort handler. Also extends
- the interactive `CMD:` confirm path to consult `is_destructive`
- first (per Q24 resolution). **Test**: mocked-broker end-to-end —
- submit a multi-step goal, verify driver loops correctly, hits
- GOAL:complete, returns to interactive.
+ `:norris off` meta cmds, and `\C-x\C-c` abort handler. **Interactive
+ `CMD:` extraction is UNCHANGED** — `is_destructive` runs ONLY when
+ `norris_active == true` (R-B3 resolution of Q24); `confirm_cmd`
+ semantics from PHASE0 §10 are preserved exactly. Bundled with this
+ commit: `ffi/readline.lua` extension per §3 row — `rl_insert_text` +
+ `rl_redisplay` cdefs + `M.insert_text` / `M.redisplay` wrappers,
+ AND removal of the `_bound[seq]:free()` call from `M.bind` (R-C4 —
+ small Phase 1 amendment, called out here so the commit body cites
+ it). **Test**: mocked-broker end-to-end — submit a multi-step goal,
+ verify driver loops correctly, hits GOAL:complete, returns to
+ interactive.
  6. **`config.lua``safety` example block.** Commented-out example
  showing `llm_second_opinion`, `llm_model`, `destructive_patterns`,
@@ -547,11 +558,10 @@ Bottom-up, same cadence as Phase 0/1/2. Six commits expected:
  round-trip plus optionally an LLM second-opinion. A 16-step Norris
  goal could be ~32 LLM calls on the fast model. Visible as latency
  but no economic surprise on local models.
- - **Destructive check on interactive CMD: extraction (Q24)** is a
- behavior change to Phase 0/1 (`confirm_cmd` users will see the
- prompt automatically for destructive commands even with
- `confirm_cmd=false`). Documented in §9. Defensible: the worst case
- is a confirm prompt the user dismisses.
+ - **Q24 resolution (R-B3)**: `is_destructive` runs ONLY in Norris
+ mode. Interactive `CMD:` extraction continues to honor `confirm_cmd`
+ exactly as Phase 0 specified. No substrate amendment; no surprises
+ for users of `confirm_cmd=false` setups.
  - **`GOAL: complete` extraction** uses the same `^GOAL: complete$` regex
  on emitted text. Substrate-aligned with CMD: extraction.
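The R-B3 gating described in the risk note can be illustrated with a small decision function. This is only a sketch: `gate` and its return values are hypothetical, the local `is_destructive` is a one-rule stand-in for the real static check, and the sketch assumes `confirm_cmd = true` means "prompt before running" on the interactive path. It also omits the LLM second-opinion step and ignores `confirm_cmd` inside Norris mode, neither of which the diff specifies.

```lua
-- Stand-in for the real static check; uses one rule from the diff.
local function is_destructive(cmd)
  if cmd:find("^%s*eval%s") then
    return true, "eval (dynamic shell)"
  end
  return false
end

-- Hypothetical gate: returns "halt" (plus a reason), "confirm", or "run".
-- `state.norris_active` and `cfg.confirm_cmd` mirror names used in the text.
local function gate(cmd, state, cfg)
  if state.norris_active then
    -- Norris mode: the only place the static destructive check runs.
    local hit, reason = is_destructive(cmd)
    if hit then return "halt", reason end
    return "run"
  end
  -- Interactive path: is_destructive is never consulted (R-B3).
  if cfg.confirm_cmd then return "confirm" end
  return "run"
end

print(gate("eval $(x)", { norris_active = true },  { confirm_cmd = false }))
print(gate("eval $(x)", { norris_active = false }, { confirm_cmd = false }))
print(gate("ls",        { norris_active = false }, { confirm_cmd = true }))
```

The second call shows the property the risk note cares about: with `confirm_cmd = false` and Norris inactive, even a command the static rules would flag runs without a new prompt.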