docs/PHASE3: review fold-in — security-layer BLOCKERs resolved

Independent review surfaced 3 BLOCKERs + 6 CONCERNs + 7 NITs against
the analyze-tier draft. Resolutions applied:

BLOCKERs:
  B1 Shell-wrapper bypass — static patterns leaked on bash -c, sh -c,
     eval, pipe-to-shell, python -c, xargs|rm. Added 9 wrapper
     patterns to §5. Norris HALTs on any wrapper invocation; the user
     reads the inner command before proceeding. The patterns are the
     conservative floor against the wrapper bypass class.
  B2 LLM second-opinion was self-policing — same model class
     generating actions then judging them. Switched probe model
     from `fast` to `deep` (qwen3-30b). Added re-roll inversion:
     if first probe says NO, ask "is this SAFE?". Disagreement
     between two probes → HALT. Cheap independent-class insurance.
  B3 `is_destructive` would have run on interactive CMD: extraction
     — a PHASE0 §6/§10 substrate amendment in disguise. Resolved
     Q24: heuristic runs ONLY when norris_active == true. No
     substrate change; interactive `confirm_cmd` semantics unchanged.

CONCERNs:
  C1 Skip-budget: consecutive_user_skips counter; 3+ similar skips
     escalate to abort/force-proceed prompt.
  C2 Algorithm-vs-Q25-resolution contradiction: §4 reordered to
     dispatch ALL pending actions before checking GOAL: complete.
  C3 Norris-goal eviction: goal embedded directly in the dynamic
     system-prompt suffix; survives sliding-window eviction.
  C4 Readline use-after-free window: M.bind no longer frees old
     callbacks; pin for process lifetime (bounded memory cost).
  C5 GOAL: complete matcher: line-level scan, exact match after
     trim — substrate-aligned with CMD: rigor.
  C6 §4 step 4 tightened: auto_approve does NOT bypass destructive
     heuristic; tool_call without auto_approve still HALTs even
     when destructive-clear (Norris conservative).

NITs deferred or rolled into pattern table:
  - chown root-path pattern tightened (NIT 2 in-line)
  - Test corpus expansion noted in §12 commit #1 risk
  - Other NITs are wording-level

Status: Plan (review folded). Ready for commit #1 (safety static
patterns) once another review pass clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+170
-43
@@ -2,9 +2,62 @@
 
 **Project:** aish — AI-augmented conversational shell
 **Document:** Phase 3 Requirements, Architecture & Design Decisions
-**Status:** Analyze (formulate complete; live-probed against current tree at `b58a842`)
+**Status:** Plan (review fold-in 2026-05-12 — security-layer BLOCKERs resolved)
 **Date:** 2026-05-12
 
+**Review fold-in (2026-05-12, security layer):**
+
+R-B1. **Shell-wrapper bypass coverage.** Static patterns missed `bash -c`,
+`sh -c`, `eval`, `xargs | rm`, `| sh`, `python -c`. Added to the
+pattern list in §5 as a "wrapper requires manual review" class —
+in Norris mode, any wrapper invocation HALTs regardless of the
+inner command. The wrapper itself is the trigger.
+
+R-B2. **LLM second-opinion model class.** Switched from `fast` to `deep`
+for the destructive-detection probe. `fast` co-emits the action
+AND judges it (circular). `deep` is a different model class
+(qwen3-30b currently mapped to `deep` per config.lua) — adds
+~1-3s per probe but breaks the self-policing loop. Added a
+YES/inversion re-roll: if the deep model says NO, re-ask
+"Is this safe?" — disagreement → HALT. Cheap insurance for
+the edge cases. §5 reflects.
+
+R-B3. **`is_destructive` scope narrowed to Norris mode.** The
+formulate-time §9 said the heuristic would also gate interactive
+`CMD:` extraction. That's a PHASE0 §6/§10 substrate amendment
+that's bigger than Phase 3 should be making implicitly. Q24
+resolved: `is_destructive` runs ONLY when `norris_active == true`.
+Interactive `CMD:` extraction continues to honor `confirm_cmd`
+exactly as Phase 0 specified — no behavior change.
+
+**CONCERN folds (2026-05-12):**
+
+R-C1. **Skip-budget added** — `consecutive_user_skips` counter; ≥3
+triggers escalation HALT "model has proposed similar destructive
+action 3+ times — abort, force-proceed, or change goal?". §4 +
+§6 reflect.
+
+R-C2. **§4 algorithm reorder** — dispatch all pending actions FIRST,
+then check `GOAL: complete`. Q25 resolution + §4 algorithm now
+consistent (was contradictory).
+
+R-C3. **Norris goal pinned in system-prompt suffix** — `ctx.norris_goal`
+field; the dynamic system suffix from §8 carries it. Eviction
+can no longer drop the anchor.
+
+R-C4. **Readline rebind safety** — `M.bind` will NOT free old callbacks
+(pin for process lifetime). Avoids a use-after-free window between
+`:free()` and the new `rl_bind_keyseq` call. Memory cost is
+bounded (one closure per bound key, negligible).
+
+R-C5. **`GOAL: complete` matcher** — line-level scan, exact match after
+trim. Aligned with `CMD:` extraction rigor.
+
+R-C6. **§4 step 4 algorithm tightened** — auto_approve only short-circuits
+the user-prompt, NEVER the destructive-heuristic. Tool-call
+without `auto_approve` entry AND no destructive flag → still
+HALTs in Norris mode (Norris is conservative by design).
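
The R-C5 matcher described above can be sketched as follows. This is a minimal illustration, not the project's code: the helper name `has_goal_complete` is hypothetical, and "trim" is assumed to mean stripping leading and trailing whitespace from each line.

```lua
-- Hypothetical helper (name not from the source): true when some line of
-- `text`, after trimming surrounding whitespace, is exactly "GOAL: complete".
local function has_goal_complete(text)
  -- Appending "\n" lets the pattern also capture a final unterminated line.
  for line in (text .. "\n"):gmatch("(.-)\r?\n") do
    local trimmed = line:match("^%s*(.-)%s*$")
    if trimmed == "GOAL: complete" then
      return true
    end
  end
  return false
end

print(has_goal_complete("done.\n  GOAL: complete  \n"))       --> true
print(has_goal_complete("I will emit GOAL: complete later"))  --> false
```

A substring search would accept the second input, which is the looseness the line-level exact match is meant to rule out.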
+
 **Analyze findings (2026-05-12):**
 
 A1. **`\C-n` mid-readline limitation.** Phase 1's `\C-n` handler fires
@@ -101,7 +154,7 @@ Three pillars per PHASE0.md §11 row 3:
 | HALT response shape | **3-way prompt**: `proceed` / `skip` / `abort` | `proceed` runs the action and continues. `skip` reports "user skipped" to the model and lets it re-plan. `abort` ends the Norris session, drops back to interactive mode. (`abort` is also bound to `\C-x\C-c` per PHASE1.md §7 reserved keys.) |
 | Auto-approve under Norris | **Trust the Phase 2 `auto_approve` policy** | A tool already in `auto_approve` runs without HALT even in Norris mode, as long as the destructive-op heuristic doesn't flag it. The user opted in once; Norris doesn't unilaterally re-prompt. CMD: lines never auto-approve under Norris — they always pass through `is_destructive` first. |
 | Destructive-op static rules | **Patterned shell-idiom list** in `safety.lua` (hardcoded; configurable later via `config.safety.destructive_patterns`) | Phase 3 v1 ships a fixed list (~20 patterns) inline. v2 may make it user-extendable. Patterns target the command string after expansion; conservative — false positives mean a confirm prompt the user dismisses, false negatives mean unsupervised destructive action. Bias to false positives. |
-| LLM second-opinion model | **The `fast` preset** (whichever model maps to the user's small/cheap local) | Cheapest available; destructive-detection doesn't need a smart model. Prompt: "Is this shell command destructive (could delete or overwrite data)? Answer YES or NO." Single-token-ish response, no streaming. Falls back to YES (safe default) on broker failure. |
+| LLM second-opinion model | **The `deep` preset** (independent model class, not the one emitting actions) | R-B2 resolution. Same model class self-policing is circular — `deep` (qwen3-30b currently) judges actions emitted by the active model (often `fast` qwen-1.5b under Norris). Adds ~1-3s per probe; broker failure → YES (safe default). Re-roll inversion: if first probe says NO, ask the inverted "Is this safe?" — disagreement → HALT. |
 | Norris prompt suffix | **Status appended to the system prompt** when Norris is active: `[NORRIS MODE] You are operating autonomously toward a stated goal. Plan and execute step by step. Use CMD: lines or tool_calls. When done, emit "GOAL: complete" on its own line.` | The `GOAL: complete` sentinel is how the model signals task completion; Norris loop exits the planning sub-loop on seeing it. |
 | Interrupt handling | **`\C-c` during a Norris step sends abort** | Standard SIGINT semantics for the user. Mid-stream, this means: stop the broker request, stop any running shell command, drop to interactive mode. The current context is preserved (incl. partial assistant turn). |
 | Context budgeting under Norris | **Same `max_turns` and `token_budget` as interactive** | Sliding window evicts oldest non-system turns when budget exceeded — including mid-Norris-session if the loop runs long. Phase 4's `memory.jsonl` summarization is the proper fix; Phase 3 just gets the eviction status as before. |
@@ -117,7 +170,7 @@ Three pillars per PHASE0.md §11 row 3:
 | `renderer.lua` | exec frame + tool-call frame + assistant streaming | Add `M.norris_begin(goal)`, `M.norris_step(n, action_desc)`, `M.norris_halt(reason, action)`, `M.norris_end(status, reason)`. Visual: bold cyan banner on enter, indented step counter per iteration, red HALT banner on intercept, dim summary on exit. Phase 0 prompt becomes `[aish:fast ⚡]>` when Norris is active per PHASE0.md §9. |
 | `broker.lua` | `chat_stream` with opts.tools, `chat` non-streaming | Re-used as-is for planning rounds (Norris just calls chat_stream like interactive). See row below for the small `max_tokens` opts extension needed by the LLM second-opinion path. |
 | `context.lua` | system_prompt + turns + pending_exec_output + use_tool_role | When Norris is active, `to_messages()` appends the Norris suffix (§2 row "Norris prompt suffix") to the system message. The suffix is computed dynamically — when Norris exits, subsequent broker calls revert to plain system prompt. No additional storage. |
-| `ffi/readline.lua` | `bind(seq, fn)` (Phase 1) | **Small extension per A1**: add `rl_insert_text` + `rl_redisplay` to the `ffi.cdef` block and expose `M.insert_text(s)` / `M.redisplay()` wrappers. Needed so the `\C-n` handler can stuff `:norris ` into the in-progress buffer cleanly rather than just printing a status that disappears. |
+| `ffi/readline.lua` | `bind(seq, fn)` (Phase 1) — frees old callback before rebinding | **Small extension per A1 + R-C4 fix**: (a) add `rl_insert_text` + `rl_redisplay` to the `ffi.cdef` block and expose `M.insert_text(s)` / `M.redisplay()` wrappers — needed so `\C-n` can stuff `:norris ` into the buffer; (b) drop the `_bound[seq]:free()` call from `M.bind` — readline retains the function pointer in its keymap; freeing before re-bind opens a use-after-free window if the user presses the key in that gap. Pin all bound callbacks for process lifetime; memory cost is bounded (one closure per key, ~O(N) where N = number of bound keys ≤ ~10). |
 | `broker.lua` | `chat_stream(cfg, msgs, on_delta, opts)` with opts.tools | **Small extension per A2**: `opts.max_tokens` (integer) is passed through to the request body as `max_tokens`. Omitted when nil. `M.chat` accepts the same opt. Needed so `safety.is_destructive`'s YES/NO probe terminates in ~2 tokens. |
 | `config.lua` | mcp example block | New optional `safety = { llm_second_opinion = true, llm_model = "fast", destructive_patterns = {...} }` block, also commented-out example. Defaults are sane when absent. |
 
@@ -133,27 +186,52 @@ to do next based on accumulated context:
 
 ```
 norris_step(ctx, broker_cfg, executor_fn, tools_fn, halt_fn, opts):
-  # opts.step_n, opts.max_steps, opts.cfg
+  # opts.step_n, opts.max_steps, opts.cfg, opts.consecutive_skips
 
   1. Call broker.chat_stream(broker_cfg, ctx:to_messages(), on_delta, {tools=tools_fn()})
      — collect (text, tool_calls).
-  2. If text contains "GOAL: complete" line → return {status="done"}.
-  3. If no actions emitted (no tool_calls, no CMD: in text):
-     → return {status="stalled", reason="no action"} (user-visible).
-  4. For each action (tool_call OR CMD: line):
+
+  2. Extract actions from response:
+     - tool_calls (already collected by broker accumulator)
+     - cmd_lines via executor.extract_cmd_lines(text) — line-anchored
+     - goal_done line-level scan for exact "GOAL: complete" (R-C5)
+
+  3. If actions are empty AND goal_done is false:
+     → return {status="stalled", reason="no action"}.
+
+  4. Dispatch ALL pending actions BEFORE checking goal_done (R-C2):
+     tool_calls first (structured route), CMD: lines second (legacy).
+     For each action:
      a. Pass through safety.is_destructive(action).
-     b. If destructive: invoke halt_fn(action, reason) → user verdict.
-        "proceed" → run action.
-        "skip" → append a synthesized turn telling the model
-                 "[aish] action skipped by user: <reason>".
-        "abort" → return {status="aborted"}.
-     c. If non-destructive: check auto_approve (for tool_calls only)
-        or destructive_check passed (for CMD:). Run.
-     d. Append result turn to ctx (role:"tool" for tool calls,
-        exec-output buffer for CMD:).
-  5. step_n += 1. If step_n >= max_steps:
+        - tool_calls: check tool-name set + serialized args.
+        - CMD: lines: pattern match + LLM probe.
+     b. If destructive: invoke halt_fn(action, reason, opts.cfg).
+        "proceed" → run action.
+        "skip" → opts.consecutive_skips += 1.
+                 If consecutive_skips >= 3 (R-C1):
+                   escalate halt with reason "repeated similar skips"
+                   → user verdict abort / force-proceed.
+                 Append synthesized "[aish] action skipped by user: <reason>"
+                 as a role:"tool" turn (for tool_calls) or as exec-output
+                 prefix (for CMD: lines) — alternation invariant.
+        "abort" → return {status="aborted"}.
+     c. If non-destructive (cleared by static + LLM):
+        - tool_call: check auto_approve. If in policy, run silently;
+          otherwise (R-C6) halt_fn STILL fires for the consent prompt
+          (Norris is conservative; auto_approve is the *only* way to
+          skip consent in autonomous mode).
+        - CMD: line: run (destructive-check is the gate; confirm_cmd
+          is interactive-mode-only — R-B3 narrows scope).
+     d. On successful proceed: opts.consecutive_skips = 0.
+     e. Append result turn to ctx (role:"tool" for tool calls,
+        exec-output buffer for CMD: — same as Phase 0/2 paths).
+
+  5. After all actions dispatched: if goal_done → return {status="done"}.
+
+  6. step_n += 1. If step_n >= max_steps:
        return {status="budget_exhausted"}.
-  6. Continue loop (driver in repl.lua re-calls norris_step).
+
+  7. Continue loop (driver in repl.lua re-calls norris_step).
 ```
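
The gating rules in steps 4b to 4d of the revised algorithm can be sketched as two small pure functions. This is a hedged illustration under stated assumptions: the names `needs_halt` and `record_verdict`, the `action` table shape, the tool name `fs__read`, and the representation of `auto_approve` as a set keyed by tool name are all hypothetical and not taken from the source.

```lua
-- Hypothetical shapes: action = { kind = "tool_call" | "cmd", name = <tool name> },
-- auto_approve = set of tool names (assumption, not the real config format).
local function needs_halt(action, destructive, auto_approve)
  if destructive then
    return true, "destructive"   -- 4b: always ask, even for auto-approved tools
  end
  if action.kind == "tool_call" and not auto_approve[action.name] then
    return true, "consent"       -- 4c / R-C6: no policy entry, still ask
  end
  return false, nil              -- auto-approved tool, or a cleared CMD: line
end

-- R-C1 skip budget: returns the new counter and whether to escalate.
local function record_verdict(consecutive_skips, verdict)
  if verdict == "skip" then
    consecutive_skips = consecutive_skips + 1
    return consecutive_skips, consecutive_skips >= 3
  elseif verdict == "proceed" then
    return 0, false              -- 4d: a successful proceed resets the budget
  end
  return consecutive_skips, false
end

print(needs_halt({ kind = "tool_call", name = "fs__read" }, false, {}))  --> true  consent
print(needs_halt({ kind = "cmd" }, false, {}))                            --> false nil
print(record_verdict(2, "skip"))                                          --> 3     true
```

Keeping the decision separate from the I/O in `halt_fn` makes each rule testable without a broker or a terminal, in the same spirit as keeping `norris_step` to one iteration.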
 
 The driver in repl.lua is the simple while loop; norris_step is one
@@ -167,7 +245,20 @@ iteration so testing is granular.
 
 ```lua
 local DESTRUCTIVE_PATTERNS = {
-  -- Filesystem
+  -- ── Shell wrappers (R-B1) — flag the wrapper itself; can't inspect content
+  --    safely without parsing the inner shell. Norris HALTs on these
+  --    unconditionally; the user can proceed/abort with the full context.
+  { pat = "^%s*bash%s+%-l?c%s", reason = "bash -c (wrapped shell)" },
+  { pat = "^%s*sh%s+%-l?c%s", reason = "sh -c (wrapped shell)" },
+  { pat = "^%s*zsh%s+%-l?c%s", reason = "zsh -c (wrapped shell)" },
+  { pat = "^%s*eval%s", reason = "eval (dynamic shell)" },
+  { pat = "^%s*python3?%s+%-c%s", reason = "python -c (inline script)" },
+  { pat = "^%s*perl%s+%-e%s", reason = "perl -e (inline script)" },
+  { pat = "|%s*sh%f[%W]", reason = "pipe-to-sh" },  -- %f[%W] also matches at end of string; %f[%s] does not
+  { pat = "|%s*bash%f[%W]", reason = "pipe-to-bash" },
+  { pat = "xargs%s+.-rm", reason = "xargs ... rm" },
+
+  -- ── Filesystem destructive
   { pat = "rm%s+.-%-rf?", reason = "rm -rf" },
   { pat = "rm%s+.-%-fr?", reason = "rm -fr" },
   { pat = "find%s+.-%-delete", reason = "find -delete" },
@@ -179,32 +270,35 @@ local DESTRUCTIVE_PATTERNS = {
   { pat = "wipefs%s", reason = "wipefs" },
   { pat = "truncate%s+.-%-s%s*0", reason = "truncate to zero" },
 
-  -- Version control destructive
+  -- ── Version control destructive
   { pat = "git%s+push%s+.-%-%-force", reason = "git push --force" },
   { pat = "git%s+push%s+.-%-f%f[%W]", reason = "git push -f" },
   { pat = "git%s+reset%s+.-%-%-hard", reason = "git reset --hard" },
   { pat = "git%s+clean%s+.-%-fd?", reason = "git clean -fd" },
   { pat = "git%s+branch%s+.-%-D", reason = "git branch -D" },
 
-  -- Database / process
+  -- ── Database / process
   { pat = "DROP%s+TABLE", reason = "DROP TABLE", ci = true },
   { pat = "DROP%s+DATABASE", reason = "DROP DATABASE", ci = true },
   { pat = "TRUNCATE%s+TABLE", reason = "TRUNCATE TABLE", ci = true },
   { pat = "kill%s+%-9", reason = "kill -9" },
   { pat = "pkill%s+%-9", reason = "pkill -9" },
 
-  -- Network/permission
+  -- ── Network/permission (chown tightened per NIT 2)
   { pat = "chmod%s+.-777", reason = "chmod 777" },
-  { pat = "chown%s+.-/$", reason = "chown on root path" },
+  { pat = "chown%s+.-%s+/%s*$", reason = "chown on root path" },
 }
 ```
 
+The 9 wrapper patterns are the conservative floor against R-B1 bypass classes. When the model emits `bash -c '...'`, the wrapper pattern hits and Norris HALTs (the user can proceed after reading the inner command). LLM second-opinion still runs as a backup but the static net catches the obvious cases first.
+
 Patterns are Lua patterns (not regex); `ci = true` enables case-insensitive match.
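
A static matcher over such a table might look like the sketch below. It is an assumption-laden illustration: `match_static` is a hypothetical name, only three entries are copied from the table, and case-insensitivity is implemented by upper-casing the subject, which only works if `ci` patterns keep their literals in upper case (as the three `ci` entries above do).

```lua
-- Three entries copied from the table above; the matcher itself is hypothetical.
local PATTERNS = {
  { pat = "^%s*bash%s+%-l?c%s", reason = "bash -c (wrapped shell)" },
  { pat = "rm%s+.-%-rf?",       reason = "rm -rf" },
  { pat = "DROP%s+TABLE",       reason = "DROP TABLE", ci = true },
}

-- Returns (true, reason) on the first matching pattern, else (false, nil).
local function match_static(cmd)
  for _, p in ipairs(PATTERNS) do
    -- Assumed ci strategy: upper-case the subject, leave the pattern alone
    -- so that classes such as %s keep their meaning.
    local subject = p.ci and cmd:upper() or cmd
    if subject:find(p.pat) then
      return true, p.reason
    end
  end
  return false, nil
end

print(match_static("bash -c 'rm -rf /tmp/x'"))     --> true  bash -c (wrapped shell)
print(match_static("psql -c 'drop table users'"))  --> true  DROP TABLE
print(match_static("ls -la"))                      --> false nil
```

Because the table is scanned in order, the wrapper entries at the top win over the inner command, which matches the "wrapper itself is the trigger" rule.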
 
 ### LLM second-opinion (when static doesn't HALT)
 
-If `cfg.safety.llm_second_opinion ~= false`, call `broker.chat` (non-streaming,
-no tools) on the `fast` model with:
+If `cfg.safety.llm_second_opinion ~= false`, call `broker.chat`
+(non-streaming, no tools, `opts.max_tokens=4`) on the **deep** model
+preset (independent class from the action-emitting model — R-B2):
 
 ```
 System: Detect destructive shell commands. Reply YES if running this would
@@ -214,11 +308,31 @@ Otherwise reply NO. Respond with only YES or NO.
 User: <the command text or tool-call summary>
 ```
 
-Response parse: case-insensitive match on `YES` → destructive; everything
-else → not. Broker failure → YES (safe default).
+Response parse: case-insensitive match on `YES` → destructive.
+
+**Re-roll on NO** (R-B2 cheap insurance): if the first probe returns NO,
+run a second probe with inverted phrasing:
+
+```
+System: Reply YES or NO only. Is the following shell command SAFE to
+run autonomously without user review?
+
+User: <same command>
+```
+
+Re-roll says NO (not safe) → it contradicts the first probe's NO (not
+destructive) → HALT; a disagreement means neither answer is trusted.
+Re-roll says YES (safe) → both probes agree → clear.
+
+Broker failure → YES (safe default).
+
+Session-scoped cache keyed by the normalized command string mitigates
+the latency cost (~1-3s per probe on deep model — see PHASE3-baseline §1).
+A command repeated within a single session is probed only once.
 
 Returns: `(is_destructive, reason)`. Reason is the matched pattern name
-for static hits, "LLM flagged as destructive" for LLM hits.
+for static hits, "LLM flagged as destructive" / "LLM probe disagreement"
+for the two LLM failure modes.
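
The two-probe flow with its session cache can be sketched as below. Everything here is illustrative: `ask(kind, cmd)` is a hypothetical stand-in for the `broker.chat` call (returning the reply text, or `nil` on broker failure), `normalize` assumes "normalized" means trimmed with whitespace runs collapsed, and not caching broker failures is a choice made for this sketch rather than something the source states.

```lua
local cache = {}  -- session-scoped: normalized command -> { flag, reason }

-- Assumed normalization: trim, then collapse runs of whitespace.
local function normalize(cmd)
  return (cmd:gsub("^%s+", ""):gsub("%s+$", ""):gsub("%s+", " "))
end

local function says_yes(reply)
  return reply ~= nil and reply:upper():find("YES", 1, true) ~= nil
end

-- `ask` is injected so the logic can be exercised without a broker.
local function llm_probe(cmd, ask)
  local key = normalize(cmd)
  local hit = cache[key]
  if hit then return hit.flag, hit.reason end

  local first = ask("destructive", cmd)
  if first == nil then
    return true, "LLM flagged as destructive"  -- broker failure: YES, not cached
  end
  local flag, reason = false, nil
  if says_yes(first) then
    flag, reason = true, "LLM flagged as destructive"
  elseif not says_yes(ask("safe", cmd)) then
    -- "not destructive" followed by "not safe" (or a failed re-roll)
    flag, reason = true, "LLM probe disagreement"
  end
  cache[key] = { flag = flag, reason = reason }
  return flag, reason
end

-- Stub broker: denies "destructive", then refuses to call the command safe.
local calls = 0
local function stub(kind, cmd)
  calls = calls + 1
  return kind == "destructive" and "NO" or "no"
end

print(llm_probe("mv  a b", stub))  --> true  LLM probe disagreement
print(llm_probe("mv a b", stub))   --> true  LLM probe disagreement (cache hit)
print(calls)                       --> 2
```

The second call differs only in whitespace, so it normalizes to the same key and costs no additional probes.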
 
 ### Tool-call destructive check
 
@@ -278,11 +392,16 @@ pending; if off and a goal is in progress, asks for confirm-abort.
 
 ## 8. System Prompt Augmentation (active only in Norris)
 
-Appended to the default Phase 2 system prompt while `norris_active == true`:
+Appended to the default Phase 2 system prompt while `norris_active == true`.
+The current goal is embedded in the suffix so eviction can't drop the
+anchor (R-C3):
 
 ```
-[NORRIS MODE] You are operating autonomously toward a stated goal. Plan
-and execute step by step using CMD: lines (for shell) or tool_calls
+[NORRIS MODE] You are operating autonomously toward the following goal:
+
+<ctx.norris_goal>
+
+Plan and execute step by step using CMD: lines (for shell) or tool_calls
 (when MCP tools are available). After each action, you will see its
 result in the next turn. Re-plan based on what you observe.
 
@@ -301,23 +420,31 @@ verdict in the next turn as "[aish] action skipped by user" or
 ```
 
 This block is composed dynamically by `context.to_messages()` when
-`ctx.norris_active` is set. No state stored beyond the boolean.
+`ctx.norris_active` is set. State stored:
+- `ctx.norris_active = true|false`
+- `ctx.norris_goal = "<goal text>"` (cleared on exit)
+
+The user-emitted "[norris] <goal>" turn ALSO lives in the turn list as
+a regular user turn for the model's reading benefit. If the sliding
+window evicts it later, the system-prompt suffix still carries the
+goal, so the eviction policy needs no special-case pinning.
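
How the suffix might be composed from the two stored fields is sketched below. This is a hypothetical stand-in for the relevant part of `context.to_messages()`: the function name `system_message` is invented, `ctx.system_prompt` is assumed to hold the base prompt, and only the first line of the suffix is reproduced (the remaining instruction lines are omitted here).

```lua
-- First line of the suffix as quoted in the prompt block above.
local NORRIS_HEADER =
  "[NORRIS MODE] You are operating autonomously toward the following goal:"

-- Hypothetical helper: builds the system message text from ctx fields.
local function system_message(ctx)
  if not ctx.norris_active then
    return ctx.system_prompt          -- plain prompt outside Norris mode
  end
  -- Computed on every call, so the goal cannot be lost to turn eviction.
  return table.concat({
    ctx.system_prompt, "", NORRIS_HEADER, "", ctx.norris_goal or "",
  }, "\n")
end

local ctx = {
  system_prompt = "You are aish.",
  norris_active = true,
  norris_goal = "clean the build directory",
}
print(system_message(ctx))

ctx.norris_active, ctx.norris_goal = false, nil   -- Norris exit clears the goal
print(system_message(ctx))                        --> You are aish.
```

Since nothing is written into the turn list, leaving Norris mode only requires flipping the flag and clearing the goal.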
 
 ---
 
 ## 9. Migration from Phase 2
 
 User-visible:
-- `\C-n` now does something (was a Phase 1 placeholder).
+- `\C-n` now does something (was a Phase 1 placeholder) — inserts
+  `:norris ` at the cursor.
 - `:norris <goal>` is a new meta command.
-- Destructive-looking commands suddenly stop and ask for confirmation
-  even outside Norris mode (the `is_destructive` check is also applied
-  to interactive CMD: extraction, replacing the current bare
-  `confirm_cmd` for known-destructive cases). This is a behavior change
-  to interactive mode.
+- **Interactive mode is UNCHANGED** (R-B3 resolution of Q24): the
+  `is_destructive` heuristic runs ONLY when `norris_active == true`.
+  Interactive `CMD:` extraction continues to honor `confirm_cmd`
+  exactly as Phase 0 specified. No surprises for existing users.
 
 Substrate (PHASE0.md §3) invariants: unchanged. The `CMD:` extraction
-marker is still the only shell-suggestion contract.
+marker is still the only shell-suggestion contract. `confirm_cmd`
+semantics are preserved as-defined in PHASE0 §10.
 
 `config.lua`: configs without a `safety` block work unchanged — defaults
 kick in (LLM second-opinion enabled, default pattern list, default step
@@ -347,9 +474,9 @@ Specifically out of Phase 3 scope despite proximity:
 
 | # | Question | Impact | Resolve by |
 |---|---|---|---|
-| Q23 | LLM second-opinion latency budget: 3s per check on the fast model means a 16-step Norris session adds ~48s of overhead. Acceptable for autonomous mode? Or cache by command-hash within a session? | safety.lua | Phase 3 (analyze) |
-| Q24 | `is_destructive` also runs on **interactive** `CMD:` extraction (per §9)? Or only under Norris? §9 says yes; the manifest implicitly broadens the destructive gate. The alternative is to keep `confirm_cmd` as the interactive surface and Norris uses its own stricter check. Mixing both is the proposed default but worth challenging. | safety.lua + repl.lua | Phase 3 (analyze) |
-| Q25 | If the model emits BOTH text AND a `GOAL: complete` line in the same response, is the goal done immediately, or are any pending actions in that response still dispatched first? Default proposal: dispatch pending actions first; the GOAL: marker fires after the loop's next round-trip would have been called (so the model effectively pre-announces). Less surprising. | repl.lua norris driver | Phase 3 (analyze) |
+| Q23 | ~~LLM second-opinion latency budget~~ | safety.lua | **Resolved at baseline** — 425-1162ms per probe on the **fast** model (baseline §1); switched to **deep** at review (R-B2) at the cost of ~1-3s per probe, paid back by independent model class. Session cache mitigates repeated patterns. |
+| Q24 | ~~`is_destructive` also runs on interactive `CMD:` extraction?~~ | safety.lua + repl.lua | **Resolved at review (R-B3)** — NO. `is_destructive` runs ONLY when `norris_active == true`. Interactive `CMD:` extraction honors `confirm_cmd` exactly as Phase 0 specified. No substrate amendment. |
+| Q25 | ~~`GOAL: complete` AND pending actions in same response?~~ | repl.lua norris driver | **Resolved at review (R-C2)** — dispatch all pending actions FIRST (tool_calls then CMD:), THEN check for `GOAL: complete`. Algorithm in §4 reflects. |
 | Q26 | Context preservation when Norris ends with `abort` vs `done` vs `budget_exhausted`. Proposal: all three keep ctx intact (user sees the conversation in `:history`). The only difference is the renderer summary. | repl.lua + renderer.lua | Phase 3 (plan) |
 | Q27 | Resume mode after abort: should the user be able to type `:norris continue` to pick up where the model left off? v1 says no — too many edge cases with stale plans. v2 maybe. | scope | Phase 3 — defer to v2 |
 | Q28 | `tool_calls` from MCP servers that have side effects but aren't in `*__shell` / `*__write_file` patterns (e.g. a custom `hertz__wol_machine` tool that wakes a server). The static set in §5 won't catch this; the LLM second-opinion might. Reasonable default given the LLM's role here. | safety.lua | Phase 3 (verify) |
 