8 Commits

Author SHA1 Message Date
marfrit 76a8f97009 repl: cloud preplanner + local executor split for Norris (closes #89)
Phase 10 C4 — the orchestration commit. Splits Norris autonomous
mode into a one-shot cloud preplan + per-step local executor flow,
with graceful fall-back to single-model Norris when preplan is
disabled or fails.

run_norris additions (in order):

  1. R4 fix: clear ctx.norris_active/_goal/_tasks at the TOP so a
     prior crashed Norris can't leak stale state into the new launch.

  2. Preplan block (gated on cfg.norris.preplanner):
     - Look up the preplanner preset in cfg.models; warn + skip if
       absent.
     - Build a system prompt asking for TASK: <imperative> lines
       (R1: %d via string.format — gsub("N", ...) would corrupt
       "No prose / commentary / numbering" to "16o prose").
     - Scrub messages per the preplan model's redact policy; run
       broker.chat (non-streaming, per Q-PP2) with category
       "norris-preplan"; R7: respect pre_cfg.timeout_ms.
     - On success: rehydrate; record usage via _record_usage;
       extract_task_lines; cap to tasks_max; populate
       ctx.norris_tasks = { current = 1, list = parsed }.
     - On ANY failure (transport err / empty list / bogus preset):
       status log + leave ctx.norris_tasks nil → single-model
       fall-back. R3 design: NOT routed via call_broker; a fallback
       retry would silently swap planning models which is worse
       than a clean hard-fail.

  3. Executor cfg resolution (independent of preplan per Q-PP1):
     cfg.norris.executor names a preset → executor_cfg = that cfg.
     Unset / missing preset → executor_cfg = active_cfg (existing
     :model-selection behavior).

  4. Loop body: pass executor_cfg (not active_cfg) to
     safety.norris_step. After each "continue" result, advance
     ctx.norris_tasks.current. When current > #list, exit with
     synthesized status "tasks_complete" + reason "all N preplanned
     tasks executed".

  5. Exit cleanup: clear ctx.norris_tasks alongside the existing
     norris_active/_goal clears so a re-launch starts fresh.
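
The R1 pitfall and the task-line parsing from step 2 can be sketched as follows. This is a minimal illustration, not the code in repl.lua: the template wording, `build_prompt`, `build_prompt_gsub`, `plan_from_reply`, and this body of `extract_task_lines` are assumptions. Only the TASK: line format, the tasks_max cap, and the { current, list } shape come from the notes above.

```lua
-- Hypothetical template; the real prompt wording is not shown in this commit.
local TEMPLATE = "Emit at most %d lines of the form TASK: <imperative>. "
              .. "No prose / commentary / numbering."

-- string.format only touches the %d placeholder.
local function build_prompt(tasks_max)
  return string.format(TEMPLATE, tasks_max)
end

-- The approach R1 avoids: gsub replaces every literal "N",
-- including the one that starts "No prose".
local function build_prompt_gsub(tasks_max)
  local t = "Emit at most N lines of the form TASK: <imperative>. "
         .. "No prose / commentary / numbering."
  return (t:gsub("N", tostring(tasks_max)))
end

-- Assumed parser: keep lines that start with "TASK:", trim them, cap the count.
local function extract_task_lines(text, tasks_max)
  local list = {}
  for line in (text .. "\n"):gmatch("([^\n]*)\n") do
    local task = line:match("^%s*TASK:%s*(.-)%s*$")
    if task and task ~= "" then
      list[#list + 1] = task
      if #list >= tasks_max then break end
    end
  end
  return list
end

-- An empty list counts as a failed preplan: return nil so the caller
-- leaves ctx.norris_tasks unset and falls back to single-model Norris.
local function plan_from_reply(text, tasks_max)
  local list = extract_task_lines(text, tasks_max)
  if #list == 0 then return nil end
  return { current = 1, list = list }
end
```

Returning nil rather than an empty plan keeps the fall-back decision in one place: the caller only has to test whether a plan exists.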

renderer.norris_end gains "tasks_complete" as a non-error status
(cyan, same as "done"). It is distinct from "done" (executor said
GOAL: complete): the executor exhausted the plan without confirming
the goal, which is a clean exit, not an error.
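
A rough sketch of how the task cursor could produce the synthesized "tasks_complete" exit. `run_steps` and `step_fn` are hypothetical stand-ins for the run_norris loop and safety.norris_step, and the "budget_exhausted" branch with its reason text is assumed rather than taken from this commit.

```lua
-- tasks is either nil (single-model Norris) or { current = n, list = {...} }.
-- step_fn(n, descr) is assumed to return a status string and optional reason.
local function run_steps(tasks, step_fn, max_steps)
  for n = 1, max_steps do
    local descr = tasks and tasks.list[tasks.current] or nil
    local result, reason = step_fn(n, descr)
    if result ~= "continue" then
      return result, reason
    end
    if tasks then
      -- Advance only after a "continue" result.
      tasks.current = tasks.current + 1
      if tasks.current > #tasks.list then
        return "tasks_complete",
               string.format("all %d preplanned tasks executed", #tasks.list)
      end
    end
  end
  return "budget_exhausted", "step budget reached"
end
```

With no preplan (tasks == nil) the loop behaves like a plain step budget, which matches the no-behavior-change claim for the no-preplan path.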

E2E verified (preplanner=fast, executor=fast on hossenfelder:8082):

  :norris print the date and the current uptime
  → preplanned 2 tasks via fast
  → ─ step 1/3 ─ Print the current date.
  → CMD: date → Sun May 17 ...
  → ─ step 2/3 ─ Print the current uptime.
  → CMD: uptime → ... up 1 day ...
  → NORRIS TASKS COMPLETE: all 2 preplanned tasks executed

  :cost detail correctly shows two rows for the same model:
    norris-preplan  1 calls,  95 /  12 tokens
    norris          1 calls, 364 /   9 tokens

Fall-back verified:
  cfg.norris.preplanner = "doesnotexist" →
    "[aish] preplanner 'doesnotexist' is not in cfg.models;
     running single-model" → Norris runs as Phase 6.

No-preplan path verified (no cfg.norris block):
  Norris runs exactly as Phase 6, no behavior change.

Regression: 87/87 safety, 31/31 router_model, repl loads.

Closes #89.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 08:21:25 +00:00
marfrit 11d0e599cd repl + renderer: tree-sitter highlighter (Phase 6 commit #5)
The largest Phase 6 commit — fence-aware stream filter in renderer.lua
+ external tree-sitter dispatch + :highlight meta in repl.lua.

renderer.lua — fence-aware filter wrapping assistant_delta:

  M.set_highlight(enabled, detected, highlight_fn)
      Called by repl.lua at startup AND on every :highlight toggle.
      Stores state in module-locals (off by default).

  State machine inside _hl_push:
    outside: pass chunks through; HOLD trailing partial-fence chars
             (per R1 — local llama.cpp splits ```python as `'``'`
             then `'`python\n'`, so naive pass-through drops the
             leading "``" and never recovers).
    inside:  buffer cumulatively until "\n```" appears; emit
             highlight_fn(body, lang) then the closing fence verbatim.
             Recursive call handles "rest" after the closing fence.

  N1: fences only open at start-of-stream OR after a newline
      (`^```` or `\n```` only). Inline backticks in prose
      ("use ``` to mark code") do not open a fence.

  R3 (PTY raw-mode toggle per highlight call): no change here — every
      executor.exec call already toggles raw-mode (existing behavior
      since Phase 1). The risk is theoretical; smoke-test interactively
      after install if multi-fence renders show flicker.

  assistant_flush handles end-of-stream gracefully: drains any held
  partial-fence tail OR an unterminated inside-fence buffer.
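
The state machine described above can be approximated by the sketch below. It is an illustration under assumptions, not the renderer.lua code: `new_fence_filter`, `emit`, and the internal field names are hypothetical, and edge cases such as four-backtick fences are ignored.

```lua
local function new_fence_filter(emit, highlight_fn)
  local st = { mode = "outside", held = "", at_sol = true, body = "", lang = "" }

  local function out(s)
    if s ~= "" then
      emit(s)
      st.at_sol = s:sub(-1) == "\n"   -- remember whether we sit at start of line
    end
  end

  local push
  push = function(chunk)
    if not highlight_fn then return out(chunk) end   -- highlight off: passthrough
    if st.mode == "outside" then
      local data = st.held .. chunk
      st.held = ""
      local pos                                      -- first backtick of an SOL fence
      if st.at_sol and data:sub(1, 3) == "```" then
        pos = 1
      else
        local nl = data:find("\n```", 1, true)
        if nl then pos = nl + 1 end
      end
      if pos then
        out(data:sub(1, pos - 1))
        local eol = data:find("\n", pos, true)
        if not eol then st.held = data:sub(pos); return end  -- fence line incomplete
        local fence_line = data:sub(pos, eol)
        out(fence_line)                              -- opening fence verbatim
        st.lang = fence_line:match("^```%s*([%w_+-]*)")
        st.mode, st.body = "inside", ""
        return push(data:sub(eol + 1))
      end
      local head, tail = data:match("^(.*\n)(``?)$")
      if head then
        out(head); st.held = tail                    -- hold partial fence after newline
      elseif st.at_sol and data:match("^``?$") then
        st.held = data                               -- hold partial fence at SOL
      else
        out(data)
      end
    else
      st.body = st.body .. chunk                     -- buffer cumulatively
      local s, closing
      if st.body:sub(1, 3) == "```" then
        s, closing = 1, "```"                        -- empty body, immediate close
      else
        s, closing = st.body:find("\n```", 1, true), "\n```"
      end
      if not s then return end
      local code, rest = st.body:sub(1, s - 1), st.body:sub(s + #closing)
      st.mode, st.body = "outside", ""
      out(highlight_fn(code, st.lang))
      out(closing)                                   -- closing fence verbatim
      return push(rest)                              -- recurse on the remainder
    end
  end

  local function flush()
    -- Drain an unterminated fence body or a held partial-fence tail.
    if st.mode == "inside" then out(st.body) else out(st.held) end
    st.mode, st.held, st.body, st.at_sol = "outside", "", "", true
  end

  return { push = push, flush = flush }
end
```

Only backticks that sit at start of line are ever held, so ordinary prose and trailing newlines are emitted as soon as they arrive.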

repl.lua — _detect_treesitter + highlighted + :highlight meta:

  _detect_treesitter()  one-shot popen probe of `tree-sitter --version`.
                        Run once at startup; cached as
                        highlight_detected.

  highlighted(body, lang_tag)   R2-placed in repl.lua (has _shq +
                                executor access). Translates the fence
                                tag (`py`, `python`, `lua`, etc.) to
                                a canonical lang via LANG_TAG, picks
                                the canonical extension via LANG_EXTENSION,
                                writes body to a tmpfile with that
                                extension, runs `tree-sitter highlight
                                <tmpfile>` via executor.exec, returns
                                the output. On ANY failure (CLI absent,
                                non-zero exit, empty output), returns
                                `body` unchanged — silent pass-through.

  R4 RESOLVED VIA REAL INSTALL: probed `tree-sitter highlight --help`
      on noether; confirmed:
        - NO `--lang` flag exists (formulate-time assumption wrong)
        - takes a PATH; language inferred from file extension
        - alternative `--scope source.X` exists but is also unreliable
          without configured grammars
      Resolution: write tmpfile with `os.tmpname() .. LANG_EXTENSION[lang]`
      and pass the path. Matches the documented upstream contract.
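
A minimal sketch of the tmpfile-with-extension approach, assuming a small subset of the LANG_TAG and LANG_EXTENSION tables and using io.popen in place of executor.exec. The `run` parameter and `popen_run` and `shq` helpers are hypothetical; `run` exists only so the fall-through behavior can be exercised without the CLI installed.

```lua
local LANG_TAG = { py = "python", python = "python", lua = "lua" }  -- assumed subset
local LANG_EXTENSION = { python = ".py", lua = ".lua" }             -- assumed subset

-- POSIX single-quote shell escaping.
local function shq(s)
  return "'" .. (s:gsub("'", "'\\''")) .. "'"
end

-- Default runner (an assumption; the commit routes through executor.exec).
local function popen_run(cmd)
  local p = io.popen(cmd .. " 2>/dev/null")
  if not p then return nil, false end
  local out = p:read("*a")
  local ok = p:close()        -- Lua 5.1 cannot report the exit status here
  return out, ok and true or false
end

local function highlighted(body, lang_tag, run)
  run = run or popen_run
  local lang = LANG_TAG[(lang_tag or ""):lower()]
  local ext = lang and LANG_EXTENSION[lang]
  if not ext then return body end            -- unknown tag: pass through
  local base = os.tmpname()
  local path = base .. ext                   -- language is inferred from the extension
  local f = io.open(path, "w")
  if not f then os.remove(base); return body end
  f:write(body)
  f:close()
  local out, ok = run("tree-sitter highlight " .. shq(path))
  os.remove(path)
  os.remove(base)                            -- os.tmpname may have created this file
  if not ok or not out or out == "" then return body end
  return out
end
```

Every failure path returns `body` unchanged, so a missing CLI or missing grammar degrades to plain output rather than an error.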

  B4-followup: even with the CLI installed, highlighting requires
      `~/.config/tree-sitter/config.json` parser-directories with
      cloned + built `tree-sitter-<lang>` grammars. Without parsers,
      every call exits non-zero and we silently pass through. The
      :highlight install hint surfaces all three install steps so the
      user knows what's actually needed.

  :highlight [on|off|status] meta:
      no arg     -> flip
      on/off     -> set explicit
      status     -> report toggle + CLI detection state
      When toggled on AND CLI absent: emit a 4-line install hint
        (CLI install, init-config, grammar clone reminder).
      When toggled on AND CLI present: emit a 1-line note that
        parser-directories must be set up for actual highlighting.

HELP gains :highlight entry.

Tested:
  10/10 unit cases on the renderer state machine, including:
    - plain prose passthrough
    - single-chunk fence
    - B2 split fence ("``" + "`python\n" + "x=42" + "\n```")
    - N1 SOL anchor (mid-line ``` does not open)
    - trailing \n properly emitted across chunks
    - SOL-only fence open
    - prose after closing fence preserved
    - two fences in one stream
    - highlight off = passthrough (callback never fires)

  E2E :highlight meta verified:
    :highlight status -> off / detected
    :highlight on     -> toggles + emits parser-dir reminder
    :highlight status -> on / detected
    :highlight off    -> off

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Pillars 1 + 2 + 3 of Phase 6 now all implemented. Commit #6 is config
example block + status -> Implement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 22:27:04 +00:00
marfrit d2a53d2fc7 renderer: Norris autonomous-mode frames (Phase 3 commit #3)
Phase 3 commit #3 per docs/PHASE3.md §12. Four new renderer functions
for Norris mode visual feedback.

M.norris_begin(goal)
  Bold cyan banner on Norris entry, with the goal text on a dim
  indented line. Frames the start of the planning loop.

M.norris_step(n, max_n, descr)
  Compact one-line step counter ("─ step 3/16 ─") with optional
  description. Renders before each iteration of the planner.

M.norris_halt(step_n, max_n, reason, action)
  Bold red banner when the destructive-op gate fires. Three
  indented lines: step counter, reason (red), action text
  (truncated at 400 chars, newlines collapsed). The interactive
  proceed/skip/abort prompt is shown after this banner by repl.lua.

M.norris_end(status, reason)
  Closing banner. status ∈ {"done", "aborted", "budget_exhausted",
  "stalled", "broker_error"}. Color cyan on "done", red otherwise.
  Optional reason text on a dim line.
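
For illustration, the action-text handling in norris_halt and the color choice in norris_end might look like the sketch below. `squash_action` and `end_color` are hypothetical helper names, the "..." suffix on truncation is an assumption, and truncation is byte-based, so it can split a multi-byte character.

```lua
local ANSI = { reset = "\27[0m", red = "\27[31m", cyan = "\27[36m" }

-- Collapse newlines (and the whitespace around them) to single spaces,
-- then truncate to `limit` bytes. The "..." marker is an assumption.
local function squash_action(action, limit)
  limit = limit or 400
  local flat = action:gsub("%s*\n%s*", " ")
  if #flat > limit then
    flat = flat:sub(1, limit) .. "..."
  end
  return flat
end

-- Cyan on "done", red for every other status listed above.
local function end_color(status)
  return status == "done" and ANSI.cyan or ANSI.red
end
```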

The interactive prompt `[aish:<model> ]>` activation lands in
commit #5 (repl.lua's prompt() function).

Smoke-tested all five frames visually — clean ANSI output, correct
truncation on long action strings, color discrimination on
done/aborted/budget_exhausted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:36:44 +00:00
marfrit c736d0e129 renderer: tool-call begin/end frames
Phase 2 commit #4 per docs/PHASE2.md §12. Adds M.tool_call_begin(name, args)
and M.tool_call_end(content, is_error) for visual parity with the existing
exec_begin/exec_end frame.

Visual cadence:
  ─── tool: <name (cyan)> ───
  <args, dim, truncated at 200 chars; omitted if empty/"{}">
  <content>
  ─── ok ───            (dim, success)
  ─── error ───         (red status word inside dim rule, on is_error=true)

Same rule glyph (━) and ANSI palette as the exec frame so the user reads
tool dispatch and shell dispatch the same way.
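
A sketch of the frame lines under the cadence above. The function names ending in `_lines` / `_line` are hypothetical (the real functions print directly), and the exact placement of ANSI resets is assumed.

```lua
local DIM, RED, CYAN, RESET = "\27[2m", "\27[31m", "\27[36m", "\27[0m"

-- Opening frame: rule with the tool name in cyan, then the args line
-- unless args is empty or the literal "{}".
local function tool_call_begin_lines(name, args)
  local lines = {
    DIM .. "─── tool: " .. RESET .. CYAN .. name .. RESET .. DIM .. " ───" .. RESET,
  }
  if args and args ~= "" and args ~= "{}" then
    if #args > 200 then args = args:sub(1, 200) end   -- byte-based truncation
    lines[#lines + 1] = DIM .. args .. RESET
  end
  return lines
end

-- Closing frame: dim "ok" rule, or a red status word inside a dim rule.
local function tool_call_end_line(is_error)
  if is_error then
    return DIM .. "─── " .. RESET .. RED .. "error" .. RESET .. DIM .. " ───" .. RESET
  end
  return DIM .. "─── ok ───" .. RESET
end
```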

Smoke-tested all five shapes: success with args / empty args / error /
long args truncated / empty content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:11:42 +00:00
marfrit 1f1065157e review BLOCKER: PTY input forwarding + raw mode toggle
Phase 1 review caught a structural gap: executor.exec only drained the
PTY master fd, never forwarded user keystrokes — vim/less/htop/nano
would render and hang on input. PHASE1.md §5 specified bidirectional
multiplex but only the read leg landed. tcgetattr/tcsetattr were also
missing, so even with input forwarding the parent's line discipline
would buffer until newline (breaking single-key UIs).

ffi/libc:
  - struct termios opaque buffer + tcgetattr/tcsetattr + cfmakeraw
  - M.set_raw(fd) saves termios + applies cfmakeraw; returns saved or
    (nil, err) when fd isn't a tty (scripted / piped-stdin runs)
  - M.restore_termios(fd, saved)
  - struct pollfd + M.poll (POLLIN constant)

executor:
  - multiplex(sess): poll(stdin, master); reads master on any revents
    (POLLHUP fires when child closes its slave end, not POLLIN — the
    revents != 0 check catches both); forwards stdin keystrokes to
    master; loop exits when master read returns 0 (EOF / child gone)
  - stdin polling is only enabled when stdin_is_tty (set_raw succeeded);
    piped-stdin runs (tests / scripted) would otherwise drain queued
    aish commands into the child of the *current* cmd, swallowing them
  - raw mode is restored before returning so the user lands back at the
    aish prompt in canonical mode
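
The control flow of the multiplex loop can be sketched without the FFI layer by injecting the I/O operations. Everything in `io_ops` is a hypothetical stand-in for the ffi/libc bindings; the POLLIN value shown is the Linux one.

```lua
local POLLIN = 0x001   -- Linux value; bit 0 of revents

-- io_ops is a hypothetical table standing in for the FFI bindings:
--   poll(want_stdin) -> stdin_revents, master_revents
--   read_master()    -> string, "" or nil on EOF / child gone
--   read_stdin()     -> string
--   write_master(s), write_stdout(s)
local function multiplex(io_ops, stdin_is_tty)
  while true do
    local stdin_rev, master_rev = io_ops.poll(stdin_is_tty)

    -- Any revents on the master triggers a read: POLLHUP arrives without
    -- POLLIN when the child closes its slave end.
    if master_rev ~= 0 then
      local data = io_ops.read_master()
      if data == nil or data == "" then return end
      io_ops.write_stdout(data)
    end

    -- Forward keystrokes only when stdin is a real tty; piped stdin
    -- must stay queued for the REPL itself.
    if stdin_is_tty and stdin_rev % 2 == POLLIN then
      local keys = io_ops.read_stdin()
      if keys and keys ~= "" then io_ops.write_master(keys) end
    end
  end
end
```

In the real executor the caller would wrap this loop in set_raw / restore_termios so the terminal returns to canonical mode on every exit path.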

renderer + repl:
  - exec_output(out, code) split into exec_begin() (top rule, before
    spawn) + exec_end(code) (closing rule with exit, after wait). PTY
    multiplex streams the body live to stdout in between; the renderer
    never re-prints the body.

PHASE1.md §3:
  - tcgetattr/tcsetattr changed from "optional" to "required for
    single-key UIs to work — done-criteria #2"; poll added to the libc
    row description.

Verified:
  - non-interactive smoke (echo / false / exit 7 / ls /nonexistent /
    printf multi-line) — all exit codes correct, output streamed live,
    a\nb\nc\n preserved byte-for-byte
  - scripted-stdin run reaches all expected lines (no stdin draining
    into a non-interactive child)
  - aish prompt + framed exec block + exit-code line all render in
    correct order

Live interactive verification (vim / less / htop in a real terminal)
still needs a user-test pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:00:53 +00:00
marfrit a722f576ac repl + renderer: streaming assistant output (Phase 1)
repl.ask_ai now drives broker.chat_stream and pumps each delta into
renderer.assistant_delta(delta) as it arrives. renderer.assistant_flush
is called when the stream ends to add a trailing newline if missing.
The full reassembled response is then handed to executor.extract_cmd_lines
for the CMD: confirm-and-execute path (unchanged from Phase 0).

renderer.assistant() is kept for non-streaming callers (none in tree
right now, but cheap to keep around). assistant_delta/flush share no
state with assistant(); they use a module-local stream_buf that tracks
the in-progress streamed block.
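
A minimal sketch of the delta/flush pair, assuming an injected `write` function and a flush that returns the reassembled text. Both are assumptions for illustration; the commit only states that stream_buf tracks the in-progress block and that flush adds a trailing newline if one is missing.

```lua
local function new_stream(write)
  local stream_buf = {}          -- deltas of the in-progress streamed block
  local M = {}

  function M.assistant_delta(delta)
    stream_buf[#stream_buf + 1] = delta
    write(delta)                 -- emit raw, no CMD: highlighting
  end

  function M.assistant_flush()
    local full = table.concat(stream_buf)
    stream_buf = {}
    -- Land the cursor on a fresh line if the stream did not end with one.
    if full ~= "" and full:sub(-1) ~= "\n" then
      write("\n")
    end
    return full
  end

  return M
end
```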

Q12 deferred: incremental CMD: highlighting (cursor-positioning re-
render on flush) is not implemented in Phase 1 — deltas emit raw. The
§6 CMD: marker is still extractable on the reassembled string post-
stream, which is what executor cares about. Renderer's bold+cyan
treatment for CMD: lines stays available via M.assistant().

Broker error / SSE-framed api-error path still pops the user turn and
restores ctx.pending_exec_output. Order: assistant_flush always runs
(even on error) so the cursor lands on a fresh line before the broker-
error status renders.

Live verification: `Count one to ten` against hossenfelder fast streams
deltas through to stdout incrementally; CMD: extraction works on the
reassembled string; confirm gate intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:17:27 +00:00
marfrit f22a3b33c8 renderer: assistant text, exec output frame, status line
Phase 0 minimal output formatting per PHASE0.md skeleton.

  M.assistant(text)              — line-by-line; `CMD: ` lines bold+cyan
  M.exec_output(output, code)    — top/bottom rules; exit code on closing
                                   rule (red on non-zero)
  M.status(line)                 — dim "[aish] ..." single-liner

ANSI table is local to the module (no external dep). Trailing-sentinel
pattern ((text.."\n"):gmatch("([^\n]*)\n")) preserves blank lines
in assistant output rather than squashing them, at the cost of one
extra trailing newline — acceptable for Phase 0. Real syntax-aware
formatting (tree-sitter) lands in Phase 6.
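
The trailing-sentinel pattern can be seen in isolation in the sketch below. `lines_of` and `style_line` are hypothetical helper names, and the ANSI sequence order is assumed.

```lua
-- Split text into lines, keeping blank lines. Appending "\n" as a sentinel
-- guarantees the last line is terminated and therefore matched.
local function lines_of(text)
  local out = {}
  for line in (text .. "\n"):gmatch("([^\n]*)\n") do
    out[#out + 1] = line
  end
  return out
end

-- Bold + cyan for lines that start with "CMD: ", plain otherwise.
local function style_line(line)
  if line:sub(1, 5) == "CMD: " then
    return "\27[1m\27[36m" .. line .. "\27[0m"
  end
  return line
end
```

The cost mentioned above shows up when the input already ends in a newline: `lines_of("a\n")` yields two entries, "a" and an empty string, so one extra blank line is printed.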

Smoke verifies escape codes are emitted (od -c shows \033[1m\033[36m
around CMD: line) and the visual layout looks right.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 14:42:56 +00:00
claude-noether 4310207738 Phase 0: scaffold tree + manifest
- README, .gitignore, CLAUDE.md (project conventions)
- docs/PHASE0.md — full Phase 0 manifest (locked substrate)
- 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented
  with module-scoped responsibilities matching the manifest
- config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b
  snappy/32k + cloud via OpenRouter through hossenfelder)

File names match docs/PHASE0.md §4 exactly. Module bodies fill in across
later phases; the tree shape is locked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:16:07 +00:00