docs/PHASE3: analyze + baseline — \C-n mechanics, LLM latency, module pre-state

Analyze findings folded into the manifest:

A1. \C-n binding can't toggle mid-prompt without rl_insert_text /
    rl_redisplay. Solution: bind those (one cdef + 2 wrappers in
    ffi/readline.lua) so \C-n inserts ":norris " at the cursor; user
    types goal + Enter. Routes through existing meta dispatch.
A2. broker has no max_tokens passthrough. Add opts.max_tokens for
    the LLM second-opinion path (terminates at ~2 tokens; verified
    proxy honors it).
A3. Phase 2 tool-sub-loop pattern IS the planner shape. safety.norris_step
    is the per-iteration extraction; driver loop in repl.lua.

Module-changes table (§3) updated with the rl_insert_text and
max_tokens rows.

Baseline doc (PHASE3-baseline.md, 80 lines) captures:
- LLM second-opinion latency: 425-1162ms per probe, all 5 test
  cases correct. Worst-case 16-step Norris = ~20s overhead; with
  static-pattern fast-path + session cache, ~5s realistic.
- Module pre-state at commit f26cbd9 (Phase 2 tip): LOC + state
  per file before Phase 3 edits.
- Six static-pattern Lua-match sanity checks (all correct).
- Carries: aish#15 (still open), aish#14, aish#32/#33.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
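The worst-case overhead quoted in the baseline summary can be checked with a few lines of arithmetic. The sketch below uses only the figures from the commit message (425-1162 ms per probe, 16 steps); the function name and the "probes that fit in 5 s" reading are illustrative assumptions, since the commit message does not state how the ~5s figure was derived.

```lua
-- Figures quoted in the commit message above.
local PROBE_MS_MIN, PROBE_MS_MAX = 425, 1162
local MAX_STEPS = 16

-- Overhead in seconds if each of `steps` actions triggers exactly one probe.
-- (Hypothetical helper, not part of the aish tree.)
local function probe_overhead_s(steps, per_probe_ms)
  return steps * per_probe_ms / 1000
end

local worst = probe_overhead_s(MAX_STEPS, PROBE_MS_MAX) -- 16 * 1.162 s
local best  = probe_overhead_s(MAX_STEPS, PROBE_MS_MIN) -- 16 * 0.425 s

-- Assumption: how many slowest-case probes fit into a 5 s budget.
local probes_in_5s = math.floor(5000 / PROBE_MS_MAX)

print(string.format("worst %.1f s, best %.1f s, probes in 5 s: %d",
  worst, best, probes_in_5s))
```

Sixteen probes at the slowest observed latency come to about 18.6 s, which is consistent with the rounded "~20s". Under the assumption above, a ~5 s budget would leave room for roughly four slow probes, with the remaining steps resolved by the static-pattern fast-path or the session cache.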
+41
-3
@@ -2,9 +2,46 @@

 **Project:** aish — AI-augmented conversational shell
 **Document:** Phase 3 Requirements, Architecture & Design Decisions
-**Status:** Formulate (pre-analyze)
+**Status:** Analyze (formulate complete; live-probed against current tree at `b58a842`)
 **Date:** 2026-05-12

+**Analyze findings (2026-05-12):**
+
+A1. **`\C-n` mid-readline limitation.** Phase 1's `\C-n` handler fires
+synchronously from inside the readline keystroke callback (via
+`rl_bind_keyseq` → ffi-cast Lua closure). The current binding API
+only exposes `rl_bind_keyseq` — no `rl_insert_text`,
+`rl_replace_line`, or `rl_redisplay`. So a `\C-n` callback cannot
+cleanly mutate the in-progress prompt buffer or end the
+readline call early to "transition into Norris mode".
+**Resolution**: bind `rl_insert_text` + `rl_redisplay` (single cdef
++ 2 wrapper lines in `ffi/readline.lua`) so the `\C-n` handler
+inserts `:norris ` at the cursor and refreshes the display. User
+then types the goal + Enter, routing through the existing meta
+dispatch normally. `\C-n` becomes a typing shortcut, not a state
+toggle.
+
+A2. **`broker.chat` lacks `max_tokens`.** The LLM second-opinion path
+in `safety.is_destructive` needs a tight YES/NO completion (2
+tokens max). The proxy + small models honor `max_tokens`
+correctly (verified vs hossenfelder: `max_tokens=4` returned a
+clean "YES" in 2 completion tokens). Phase 2's broker doesn't
+surface this option. **Resolution**: add `opts.max_tokens` to
+`M.chat_stream`'s opts table (Phase 2 already widened opts);
+`M.chat` passes through. Defaults nil → field omitted from the
+request body — Phase 1/2 callers unaffected.
+
+A3. **Tool-sub-loop is structurally reusable.** Phase 2's `ask_ai` sub-
+loop (stream → collect text + tool_calls → dispatch → append → loop
+until pure-text response or cap) IS the planner shape Phase 3 wants.
+`safety.norris_step` per §4 is essentially this iteration extracted
+behind a function call, plus the `GOAL: complete` sentinel check.
+No structural refactor of Phase 2 needed — Norris is additive.
+
+These findings tighten §3's module-changes table and §12's commit #1
+scope (adds a small `ffi/readline.lua` extension to commit #5) — see
+inline notes below where the change matters.
+
 PHASE0.md is the locked substrate; PHASE1.md and PHASE2.md are layered
 on top. This manifest specifies what Phase 3 adds — **Chuck Norris
 autonomous mode**, the **destructive-op safety heuristic** that gates
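The A1 resolution in the hunk above can be sketched as follows. This is a minimal illustration, not the code in `ffi/readline.lua`: the helper name `make_norris_shortcut`, the `rl` table shape, and the recording stand-in are assumptions introduced here. The two C declarations match the GNU Readline prototypes for `rl_insert_text` and `rl_redisplay`; the FFI part is only attempted when LuaJIT's `ffi` module and a readline library can be loaded.

```lua
-- Build the \C-n handler from an object exposing insert_text/redisplay,
-- mirroring the M.insert_text / M.redisplay wrappers the manifest proposes.
-- (make_norris_shortcut is a hypothetical name.)
local function make_norris_shortcut(rl)
  -- Readline command functions receive (count, key) and return 0 on success.
  return function(_count, _key)
    rl.insert_text(":norris ") -- stuff the meta command at the cursor
    rl.redisplay()             -- refresh the visible prompt line
    return 0
  end
end

-- Real wiring, guarded so the sketch also loads without LuaJIT or readline.
local ok_ffi, ffi = pcall(require, "ffi")
local rl_real
if ok_ffi then
  local ok_lib, lib = pcall(function()
    ffi.cdef[[
      int  rl_insert_text(const char *text);
      void rl_redisplay(void);
    ]]
    return ffi.load("readline")
  end)
  if ok_lib then
    rl_real = {
      insert_text = function(s) return lib.rl_insert_text(s) end,
      redisplay   = function() lib.rl_redisplay() end,
    }
  end
end

-- Stand-in that records calls, so the handler can be exercised off-terminal.
local calls = {}
local fake = {
  insert_text = function(s) calls[#calls + 1] = "insert:" .. s end,
  redisplay   = function() calls[#calls + 1] = "redisplay" end,
}

local handler = make_norris_shortcut(fake)
local rc = handler(1, 14) -- 14 is the ASCII code for Ctrl-N
print(rc, calls[1], calls[2])
```

Passing the readline surface in as a table keeps the handler free of terminal state, so the same closure could be given `rl_real` when bound through the existing `bind(seq, fn)` wrapper. How aish actually registers the closure is not shown in this diff.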
@@ -78,9 +115,10 @@ Three pillars per PHASE0.md §11 row 3:
 | `safety.lua` | `confirm_tool_call` (Phase 2 surface only) + Phase 3 stubs `is_destructive` / `norris_step` raising error() | Implement the stubs: (a) `is_destructive(cmd_or_tool_call) -> (bool, reason)` with static pattern matching + optional LLM second-opinion (controlled by `cfg.safety.llm_second_opinion`, default true); (b) `norris_step(ctx, broker_cfg, executor_fn, tools_fn, halt_fn, opts) -> {status, reason}` — single iteration of the Norris loop. Pattern list is module-local; LLM second-opinion uses `broker.chat` (non-streaming, no tools, single-shot). |
 | `repl.lua` | tool-sub-loop + `:mcp` meta + Phase 1 `\C-n` no-op binding | Replace `\C-n` body with a Norris toggle. Add `:norris <goal>` meta cmd as the explicit-launch variant. New module-local `norris_active` flag. Implement the Norris driver loop: while active, call `safety.norris_step`; handle HALT decisions; exit on `GOAL: complete`, `abort`, or step budget exceeded. Auto_approve policy from `confirm_tool_call` is consulted in-line. |
 | `renderer.lua` | exec frame + tool-call frame + assistant streaming | Add `M.norris_begin(goal)`, `M.norris_step(n, action_desc)`, `M.norris_halt(reason, action)`, `M.norris_end(status, reason)`. Visual: bold cyan banner on enter, indented step counter per iteration, red HALT banner on intercept, dim summary on exit. Phase 0 prompt becomes `[aish:fast ⚡]>` when Norris is active per PHASE0.md §9. |
-| `broker.lua` | `chat_stream` with opts.tools, `chat` non-streaming | No structural change. Norris re-uses `chat_stream` for planning rounds (same as interactive). `chat` is used by `safety.is_destructive` for LLM second-opinion. |
+| `broker.lua` | `chat_stream` with opts.tools, `chat` non-streaming | Re-used as-is for planning rounds (Norris just calls chat_stream like interactive). See row below for the small `max_tokens` opts extension needed by the LLM second-opinion path. |
 | `context.lua` | system_prompt + turns + pending_exec_output + use_tool_role | When Norris is active, `to_messages()` appends the Norris suffix (§2 row "Norris prompt suffix") to the system message. The suffix is computed dynamically — when Norris exits, subsequent broker calls revert to plain system prompt. No additional storage. |
-| `ffi/readline.lua` | `bind(seq, fn)` (Phase 1) | No additions — `\C-n` binding mechanism already in place. The Phase 1 placeholder handler is just replaced with a real one in repl.lua. |
+| `ffi/readline.lua` | `bind(seq, fn)` (Phase 1) | **Small extension per A1**: add `rl_insert_text` + `rl_redisplay` to the `ffi.cdef` block and expose `M.insert_text(s)` / `M.redisplay()` wrappers. Needed so the `\C-n` handler can stuff `:norris ` into the in-progress buffer cleanly rather than just printing a status that disappears. |
+| `broker.lua` | `chat_stream(cfg, msgs, on_delta, opts)` with opts.tools | **Small extension per A2**: `opts.max_tokens` (integer) is passed through to the request body as `max_tokens`. Omitted when nil. `M.chat` accepts the same opt. Needed so `safety.is_destructive`'s YES/NO probe terminates in ~2 tokens. |
 | `config.lua` | mcp example block | New optional `safety = { llm_second_opinion = true, llm_model = "fast", destructive_patterns = {...} }` block, also commented-out example. Defaults are sane when absent. |

 No new module files beyond what already exists. The `\C-x\C-c` abort keybinding (PHASE1.md §7 reserved) gets wired here.
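The `opts.max_tokens` pass-through described in the added `broker.lua` row could look roughly like the sketch below. It is an illustration under assumptions: `build_body` is a hypothetical helper, and the `model` / `messages` / `stream` / `tools` field names follow the common OpenAI-style chat request shape, which the diff does not confirm beyond the `max_tokens` key itself. JSON encoding and the HTTP call are left out.

```lua
-- Build the request body as a plain table; a JSON encoder (not shown)
-- would serialize it before sending. (build_body is a hypothetical name.)
local function build_body(cfg, msgs, opts)
  opts = opts or {}
  local body = {
    model    = cfg.model,
    messages = msgs,
    stream   = opts.stream or false,
  }
  if opts.tools then
    body.tools = opts.tools
  end
  -- Explicit guard for readability. In Lua, assigning nil would also leave
  -- the key absent, so callers that never set max_tokens send the same body.
  if opts.max_tokens ~= nil then
    body.max_tokens = opts.max_tokens
  end
  return body
end

-- Second-opinion probe: tight completion cap, as in the A2 verification.
local probe = build_body(
  { model = "fast" },
  { { role = "user", content = "Is `rm -rf /tmp/x` destructive? Answer YES or NO." } },
  { max_tokens = 4 }
)

-- Phase 1/2 style call: no opts at all, so no max_tokens key.
local legacy = build_body({ model = "fast" }, {}, nil)

print(probe.max_tokens, legacy.max_tokens)
```

Because an absent key and a nil value are the same thing in a Lua table, the "omitted when nil" behaviour comes for free as long as the encoder iterates over the table's keys and does not emit fixed fields.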