docs/PHASE6: plan — fold B1/B3/B4 + add §12 commit roadmap
Status header: Analyze -> Plan.
Baseline findings folded into the design sections:
§1 (highlighter pillar) gains B4: tree-sitter absent on every
probed host; :highlight on emits install-hint when missing.
§4 (highlighter sketch) revised per B3: io.popen():close() doesn't
expose exit codes in LuaJIT. Route via executor.exec("cat tmp |
tree-sitter ...") which uses pty.spawn+waitpid and returns code
reliably. Tmpfile design retained (avoids ARGMAX + shell-escape).
§5 (:diff impl + @<r1>..<r2> retry) revised per B1: every git
invocation must use `--no-pager -c color.ui=never` to suppress
the color/keypad/line-clear escapes forkpty triggers. Factored
recommendation: helper `_git_clean_cmd(subcmd)` shared by :diff
and the @-mention diff retry.
New §12 Implementation Plan — 6 commits, bottom-up:
1. context.lua: ctx.project + compose_project + composition order
2. repl.lua: _scan_project_tree helper + :tree meta
3. repl.lua: :diff meta + _git_clean_cmd helper (B1)
4. repl.lua: expand_mentions tiered resolution (@<r1>..<r2> per A6)
5. renderer.lua + repl.lua: tree-sitter detect + fence filter +
:highlight meta (B3-revised tmpfile dispatch)
6. config.lua project example + status -> Implement
Per-commit risk index + smoke criteria. Highlighter (commit 5) is
the largest experimental surface — placed last so the rest of Phase 6
ships even if highlighter slips. Order is independent enough that
swapping 3<->4 or 5<->6 doesn't break anything; bottom-up keeps each
commit individually green.
Things deliberately not split: _shq reuse, lang map duplication for
v1, streaming-rehydration order (rehydrate -> highlight -> emit
inherits naturally from existing chunk pipeline).
Two items open at plan time, resolve at implement: _scan_project_tree
dir-arg vs hardcoded getcwd; :highlight status probing
tree-sitter --print-langs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 file changed, 183 insertions(+), 23 deletions(-)
@@ -2,7 +2,7 @@
 
 **Project:** aish — AI-augmented conversational shell
 **Document:** Phase 6 Requirements, Architecture & Design Decisions
-**Status:** Analyze (formulate complete; current tree at `f596743` probed)
+**Status:** Plan (formulate + analyze + baseline complete; tree at `9f50206`)
 **Date:** 2026-05-16
 
 **Analyze findings (2026-05-16):**
@@ -109,6 +109,10 @@ Three pillars per PHASE0.md §11 row 6:
    identity function (zero overhead, zero hard dependency). Toggleable
    at runtime with `:highlight on|off`. Default off until the user
    opts in (don't surprise existing users with a display change).
+   Per B4: tree-sitter is **absent on every fleet host probed**;
+   `:highlight on` when the CLI is missing emits a status that names
+   the install hint (`apt install tree-sitter` / `cargo install
+   tree-sitter-cli`) rather than silently falling back to identity.
 
 2. **Diff-aware code injection** — surface git diffs as first-class
    context. Two entry points:
@@ -233,29 +237,36 @@ Edge cases: chunk boundary lands inside the fence marker itself
 machine looks at the cumulative `buf`, so partial markers are
 recovered correctly.
 
-`highlighted(body, lang)` — resolved per A4 (tmpfile roundtrip):
+`highlighted(body, lang)` — **B3-revised** (supersedes A4):
 
 ```lua
 if not highlight_enabled or not lang_map[lang] then return body end
 local tmp = os.tmpname()
-local w = io.popen(("tree-sitter highlight --lang %s > %s")
-  :format(lang_map[lang], tmp), "w")
-w:write(body)
-local _, _, code = w:close()
-local f = io.open(tmp, "rb")
-local out = f and f:read("*a") or body
-if f then f:close() end
+local f = io.open(tmp, "wb")
+if not f then return body end
+f:write(body); f:close()
+-- B3: LuaJIT io.popen():close() returns (true, nil, nil) regardless of
+-- exit code. Route via executor.exec which uses pty.spawn+waitpid and
+-- returns (out, exit_code) reliably.
+local out, code = executor.exec(
+  ("cat %s | tree-sitter highlight --lang %s")
+    :format(_shq(tmp), lang_map[lang]))
 os.remove(tmp)
 if code ~= 0 then return body end  -- pass-through on highlighter failure
 return out
 ```
 
-The two-handle design avoids the ARGMAX risk of shelling
-`printf '%s' BODY | tree-sitter ...` (Linux ARGMAX is ~128KB but
-LuaJIT strings can be larger) and sidesteps shell-escape edge cases
-(body may contain arbitrary bytes). Cost is one syscall per code block
-for the tmp file create/remove cycle — negligible vs the highlighter
-invocation itself.
+Why this shape (and not the formulate-time A4 sketch):
+
+- **A4 assumed Lua 5.2+ popen-close-with-code.** LuaJIT (5.1 contract)
+  doesn't expose the exit status via `io.popen(...):close()`. Baseline
+  B3 caught this; the only reliable exit-code path in our substrate
+  is `executor.exec` (pty.spawn + waitpid).
+- The tmpfile stays — avoids ARGMAX on `printf '%s' BODY |` and
+  sidesteps shell-escape edge cases on arbitrary code-block bytes.
+- Cost is one syscall round (tmpfile create/remove) plus one pty
+  spawn per code block — negligible vs the highlighter latency.
+- `_shq` is the existing shell-quote helper from #3 (pre/post hooks).
 
 ### Lang map (v1)
 
@@ -292,12 +303,16 @@ Default: off (don't change existing-user UX).
 - `:diff main..feature` → `git diff main..feature`
 - `:diff <anything else>` → passed verbatim to `git diff <anything>`
 
-Implementation:
+Implementation — **B1-revised** (must disable pager + color):
 
 ```lua
 meta.diff = function(args)
   args = (args or ""):gsub("^%s+", ""):gsub("%s+$", "")
-  local cmd = "git diff " .. args
+  -- B1: forkpty makes git think it's interactive, enabling color
+  -- ANSI + DEC keypad/line-clear escapes that pollute the injected
+  -- context block. --no-pager kills the keypad sequences;
+  -- -c color.ui=never kills the color codes. Both are required.
+  local cmd = "git --no-pager -c color.ui=never diff " .. args
   local out, code = executor.exec(cmd)
   if code ~= 0 then
     renderer.status(("diff failed (exit %d)"):format(code))
@@ -315,6 +330,11 @@ end
 The `[diff ...]\n<output>` framing matches the `[bg:N exited]` /
 `[delegate X]` conventions established in Phase 5 / #6 / #8.
 
+The same `--no-pager -c color.ui=never` prefix applies to the
+`@<r1>..<r2>` resolution path in the next section, and to any
+future git verbs we add (`:log`, `:show`, etc.). Factor into a
+helper `_git_clean_cmd(subcmd)` if multiple call sites accumulate.
+
 ### @-mention: `@<ref1>..<ref2>` — tiered resolution (A6)
 
 Extends `expand_mentions` (#7) by adding a SECOND resolution attempt
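Review note: the factored helper recommended in this hunk could be as small as the sketch below. Only the `git --no-pager -c color.ui=never` prefix comes from the plan; the function body and the `shq` stand-in (the repository's own `_shq` is not shown in this diff) are assumptions for illustration.

```lua
-- Hypothetical sketch of the helper named in the plan.
-- Only the flag prefix is taken from the document.
local GIT_CLEAN_PREFIX = "git --no-pager -c color.ui=never "

-- Illustrative POSIX single-quote escaper; stands in for the existing
-- `_shq` helper, whose real implementation is not part of this diff.
local function shq(s)
  return "'" .. (s:gsub("'", "'\\''")) .. "'"
end

-- Returns the full command string. Callers still quote user-supplied refs.
local function _git_clean_cmd(subcmd_and_args)
  return GIT_CLEAN_PREFIX .. subcmd_and_args
end

print(_git_clean_cmd("diff " .. shq("HEAD~1") .. ".." .. shq("HEAD")))
```

Keeping the prefix in one constant means a grep for a bare `"git ` string in `repl.lua` is enough to spot a call site that bypasses the helper.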
@@ -327,12 +347,15 @@ when the first (path lookup) fails AND the token contains `..`:
 if not content and path:find("..", 1, true) then
   local r1, r2 = path:match("^(.-)%.%.(.+)$")
   if r1 and r2 and r1 ~= "" and r2 ~= "" then
-    local pipe = io.popen(("git diff %s..%s 2>/dev/null")
-      :format(shq(r1), shq(r2)))
-    local diff = pipe and pipe:read("*a") or ""
-    local _, _, code = pipe and pipe:close()
-    if code == 0 and diff:match("%S") then
-      content = diff
+    -- B1: --no-pager + color=never (same as the :diff meta path).
+    -- B3: io.popen close() doesn't expose exit codes — use the
+    -- file-redirect trick OR executor.exec. Here we want a quick
+    -- best-effort and the cost of an extra forkpty is acceptable.
+    local out, code = executor.exec(
+      ("git --no-pager -c color.ui=never diff %s..%s 2>/dev/null")
+        :format(shq(r1), shq(r2)))
+    if code == 0 and out:match("%S") then
+      content = out
       -- Note: language tag becomes "diff" regardless of path lang
       lang_override = "diff"
     end
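Review note: the guard at the top of this hunk can be exercised on its own. The sketch below lifts the same `find` and `match` calls into a standalone function; the name `split_ref_range` is hypothetical and does not appear in the plan.

```lua
-- Hypothetical standalone version of the `..` split used in the hunk above.
-- Returns r1, r2 for a usable range, or nil when the token should not be
-- retried as a git diff.
local function split_ref_range(token)
  if not token:find("..", 1, true) then return nil end   -- plain find, no patterns
  local r1, r2 = token:match("^(.-)%.%.(.+)$")           -- shortest r1 before ".."
  if r1 and r2 and r1 ~= "" and r2 ~= "" then
    return r1, r2
  end
  return nil
end

print(split_ref_range("HEAD~1..HEAD"))
print(split_ref_range("../sibling.txt"))
```

Under this pattern `@../sibling.txt` yields an empty `r1`, so the retry is skipped before git is invoked. A three-dot token such as `a...b` splits into `a` and `.b`, so three-dot ranges would reach git as an invalid ref and fall through the `code == 0` check.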
@@ -539,3 +562,140 @@ phases beyond 6. After Phase 6 lands, candidate next iterations
 
 Phase 6 itself is self-contained — none of its three pillars introduce
 substrate dependencies on phases not yet planned.
+
+---
+
+## 12. Implementation Plan (commit-by-commit)
+
+Bottom-up ordering: foundations first (context.lua field + composer),
+then the diff and tree surfaces that have no display-layer risk, then
+the highlighter (largest experimental surface — last so the rest of
+Phase 6 ships even if highlighter slips). Each commit leaves the tree
+green (existing tests pass + smoke ok) and adds a discrete capability.
+
+### Order
+
+1. **`context.lua` — `[project]` block plumbing.** Add `self.project`
+   (string, nil-allowed) on `Context.new`. Add `compose_project(text)`
+   helper mirroring `compose_background` / `compose_summary`. In
+   `to_messages`: insert between `compose_background` and
+   `compose_summary` so the read order is memory → project tree →
+   earlier-summary → NORRIS. Suppressed under `self.norris_active`
+   (parity with R-C1 / R-C4). No behavior change yet — nothing sets
+   `ctx.project`. Smoke: `:to_messages()` still empty when project nil.
+
+2. **`repl.lua` — `_scan_project_tree` helper + `:tree` meta.**
+   - `_scan_project_tree(dir, opts)` per §6: `git ls-files --cached
+     --others --exclude-standard` in a repo, `find . -maxdepth N
+     -type f -not -path '*/\.*'` outside. Returns `(body, info)`
+     where `info = { file_count, truncated }`.
+   - `:tree [N|refresh|off]` meta: scans cwd, sets `ctx.project`,
+     emits status with file count + truncation note.
+   - `cfg.project.auto_tree` startup hook: if true, run `_scan` once
+     and set `ctx.project` (before the main loop opens). Default
+     false (existing configs unchanged).
+   - Update HELP with `:tree` lines.
+   - Smoke: in the aish repo, `:tree` injects a ~32-file block;
+     `:to_messages()` shows the `[project]` block in the system prompt.
+
+3. **`repl.lua` — `:diff` meta + `_git_clean_cmd` helper (B1).**
+   - `_git_clean_cmd(subcmd_and_args)` returns the full command string
+     `git --no-pager -c color.ui=never <subcmd_and_args>`. Used by
+     `:diff` and the `@<r1>..<r2>` path in commit #4.
+   - `:diff [args]` meta per §5 (B1-revised): runs the clean git
+     command via `executor.exec`, appends `[diff <args>]\n<out>`
+     to context as exec_output. Empty / non-repo / bad-ref paths
+     emit status and skip.
+   - Update HELP with `:diff` line.
+   - Smoke: `:diff` from a dirty aish checkout injects the working
+     tree diff; `:diff staged` works; `:diff junkref` emits status
+     and skips.
+
+4. **`repl.lua` — `expand_mentions` tiered resolution (A6).**
+   Extend the existing path-resolution loop with the diff-retry
+   branch from §5: if `_read_truncated` returns nil AND the token
+   contains `..`, parse as `<r1>..<r2>` and try `_git_clean_cmd(
+   "diff <r1>..<r2>")`. On success, replace with a fenced `diff`
+   block. Preserves existing peel-on-trailing-punct logic. Smoke:
+   `@HEAD~1..HEAD` expands inline; `@origin/main..feature` works
+   when the ref exists; `@../sibling.txt` still resolves as file.
+
+5. **`renderer.lua` + `repl.lua` — tree-sitter highlighter.**
+   This commit is the largest single change in Phase 6. Substeps:
+
+   a. `_detect_treesitter()` in repl.lua: one-shot popen of
+      `command -v tree-sitter && tree-sitter --version`. Stash
+      result on a local.
+
+   b. `renderer.lua` — fence-aware state machine wrapping
+      `assistant_delta`. Exports `renderer.set_highlight(enabled,
+      detected)` so repl.lua wires the toggle + cli-availability
+      flags. State: `outside` (pass-through) / `inside` (buffer until
+      closing fence). On close: call `highlighted(body, lang)` and
+      emit. Algorithm per §4; bytes-of-cumulative-buf scan as B2
+      requires for fragment-across-boundary fences.
+
+   c. `highlighted(body, lang)` per §4 (B3-revised): write body to
+      `os.tmpname()`, invoke via `executor.exec("cat tmp |
+      tree-sitter highlight --lang X")`, capture out + exit code,
+      cleanup tmp, pass-through on failure.
+
+   d. `:highlight [on|off|status]` meta in repl.lua. `:highlight on`
+      when CLI absent → status with install hint (B4); `:highlight
+      status` always reports current toggle + CLI availability.
+
+   e. HELP update; PHASE6 status → Implement.
+
+6. **`config.lua` + docs/PHASE6 status bump.**
+   - Add commented-out `project = { auto_tree = false, tree_depth = 3,
+     tree_max_chars = 4096 }` block in config.lua (parity with the
+     Phase 1-5 example blocks).
+   - PHASE6.md status header → **Implement** (matches Phase 5
+     cadence — manifest tracks implementation state).
+
+### Risk index per commit
+
+| Commit | Risk | Mitigation |
+|---|---|---|
+| 1 (compose_project) | Composition-order regression breaks Phase 4/5 callers | Order test: empty memory + empty project = identical sys_content to pre-Phase-6 baseline |
+| 2 (:tree) | `find` fallback picks up node_modules / target / build / etc. | Document in status warning; users in non-repo cwds scope via `:tree <depth>` |
+| 3 (:diff) | B1 — color/keypad codes leak if a future caller forgets the helper | All call sites must go through `_git_clean_cmd`; lint by grep before commit |
+| 4 (@<r1>..<r2>) | False positive on `@../sibling.txt` when no such file exists | A6's tiered resolution: only retry as diff when file lookup fails. `@../sibling.txt` resolves as path; if the path is missing, diff retry runs and naturally fails — same outcome as before |
+| 5 (highlighter) | Fence detector misclassifies an inline triple backtick (`` ``` ``) in prose | State machine opens a fence only when `` ``` `` sits at the start of a line (start of stream or right after a newline); mid-line backticks don't open a fence. Document in §4. |
+| 5 (highlighter) | tmpfile race / leak on crash | `os.remove(tmp)` in normal exit path; OS cleans `/tmp/lua_*` files on reboot. Single-user trust per PHASE0 §12. |
+| 6 (config bump) | None (pure docs / commented config) | n/a |
+
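Review note: step 5b and the fence-detector risk row describe a two-state filter that only opens a fence at a line start. A minimal sketch of that idea follows, assuming a line-buffered design, which is simpler than the cumulative-`buf` byte scan the plan calls for. The names `new_fence_filter`, `push`, and `finish` are hypothetical, and `highlight` and `emit` are injected callbacks rather than the real `highlighted` / `assistant_delta`.

```lua
-- Sketch only: a line-buffered approximation of the 5b state machine.
local FENCE = ("`"):rep(3)  -- built at runtime so no literal fence appears here
local OPEN_PAT = "^" .. FENCE .. "([%w_+-]*)%s*\n$"
local CLOSE_PAT = "^" .. FENCE .. "%s*\n$"

local function new_fence_filter(highlight, emit)
  local buf = ""            -- unprocessed tail; always begins at a line start
  local state = "outside"   -- "outside" = pass-through, "inside" = buffering
  local lang, body = "", {}

  local function handle_line(line)  -- `line` includes its trailing newline
    if state == "outside" then
      local tag = line:match(OPEN_PAT)
      if tag then
        state, lang, body = "inside", tag, {}
      else
        emit(line)
      end
    elseif line:match(CLOSE_PAT) then
      emit(FENCE .. lang .. "\n" .. highlight(table.concat(body), lang) .. FENCE .. "\n")
      state = "outside"
    else
      body[#body + 1] = line
    end
  end

  local filter = {}

  -- Accepts an arbitrary chunk; complete lines are processed, the rest waits.
  function filter.push(chunk)
    buf = buf .. chunk
    local nl = buf:find("\n", 1, true)
    while nl do
      handle_line(buf:sub(1, nl))
      buf = buf:sub(nl + 1)
      nl = buf:find("\n", 1, true)
    end
  end

  -- End of stream: an unterminated fence is emitted raw rather than dropped.
  function filter.finish()
    if state == "inside" then
      emit(FENCE .. lang .. "\n" .. table.concat(body))
    end
    emit(buf)
    buf, state = "", "outside"
  end

  return filter
end
```

Because a marker is only tested once its whole line has arrived, a fence split across chunk boundaries is handled without extra bookkeeping. The trade-off in this sketch is that text outside a fence is held until its newline arrives, which a production filter would probably want to avoid.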
+### Tests + smoke per commit
+
+Each commit must:
+- Pass `luajit test_safety.lua` (87/87) and `luajit test_router_model.lua` (31/31)
+- Load cleanly: `luajit -e 'package.path="./?.lua;./vendor/?.lua;"..package.path; require("repl"); print("ok")'`
+- Pass a feature-specific smoke (described per row above)
+
+No new test framework dependency. Per-feature unit tests can live as
+inline `luajit -e '...'` blocks in commit messages or as a dedicated
+`test_phase6.lua` if the surface area justifies it (decide at impl-time).
+
+### Things deliberately NOT split into a separate commit
+
+- `_shq` (shell-quote helper) — already exists in repl.lua from #3.
+  Reuse in commit 5 (highlighter); no new helper.
+- Lang map — small enough to copy locally in commit 5 (~15 lines);
+  the existing `_lang_of(path)` in `expand_mentions` uses a similar
+  but smaller map. Factor only if a third caller appears.
+- Streaming-rehydration interaction with the highlighter — `secrets_session`
+  rehydrate runs BEFORE the highlight filter in the chunk pipeline.
+  Order: `chunk → rehydrator:push → highlight_filter → emit`. The
+  highlighter operates on plain text only; rehydrated placeholders
+  resolve to real values which the highlighter sees as code. No
+  special wiring needed.
+
+### Open at plan-time (resolve at implement)
+
+- Whether `_scan_project_tree` should honor a per-call `opts.dir`
+  override (so a future feature like "scan `<other-dir>`" lands cheaply)
+  vs hardcoding `libc.getcwd()`. Default to taking `dir` as arg;
+  the `:tree` meta passes `libc.getcwd()` explicitly.
+- Whether `:highlight status` should also probe `tree-sitter --print-langs`
+  to show which langs are actually available. Nice-to-have; defer
+  unless install paths produce variable lang sets in practice.
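Review note: step 5a and the B4 install hint can be prototyped without touching exit codes, because `command -v` prints the resolved path only when the command exists. The sketch below is an assumption-laden illustration: `detect_cli` and `highlight_on_status` are hypothetical names, the status wording is invented, and it checks only `command -v` rather than the full `command -v tree-sitter && tree-sitter --version` probe named in the plan.

```lua
-- Hypothetical detection helper: returns the resolved path or nil.
-- Relies on reading stdout, not on io.popen():close() exit codes (see B3).
local function detect_cli(name)
  -- Refuse anything that is not a plain command name, so the string can be
  -- interpolated into the shell command without quoting.
  if not name:match("^[%w_.+-]+$") then return nil end
  local pipe = io.popen(("command -v %s 2>/dev/null"):format(name))
  if not pipe then return nil end
  local path = pipe:read("*l")
  pipe:close()
  if path and path ~= "" then return path end
  return nil
end

-- Hypothetical status text for `:highlight on`; the two install commands
-- are the ones quoted in the plan, the surrounding wording is illustrative.
local function highlight_on_status(detected)
  if detected then return "highlight: on" end
  return "highlight unavailable: tree-sitter CLI not found "
      .. "(try `apt install tree-sitter` or `cargo install tree-sitter-cli`)"
end

print(highlight_on_status(detect_cli("tree-sitter")))
```

Stashing the `detect_cli` result once at startup, as step 5a suggests, keeps `:highlight status` free of repeated process spawns.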