Phase 6 formulate manifest. Three pillars per PHASE0 §11 row 6:
1. Tree-sitter syntax highlighting hooks
External `tree-sitter` CLI when present, no-op otherwise.
Honors PHASE0 §3 (no compiled extensions). Toggleable
at runtime; off by default so existing UX is unchanged.
2. Diff-aware code injection
:diff [args] meta + @<ref1>..<ref2> @-mention extension.
Shells out to `git diff`; output flows through the existing
exec-output context channel.
3. Project-level file-tree context
:tree meta + optional cfg.project.auto_tree startup inject.
git ls-files in a repo, find fallback otherwise. Composed
into the system prompt as a new [project] block between
[background] and [earlier summary]. Suppressed under Norris
(R-C1 / R-C4 parity).
Module changes: renderer.lua (fence-aware highlight filter), context.lua
(compose_project), repl.lua (3 new metas, 3 new helpers, expand_mentions
extension). No new module files in v1.
Doc covers: scope + done-when criteria, tech decisions table, module
changes table, per-pillar deep dive with example code, UX surface
summary, out-of-scope list, risks, and 6 open questions to resolve
in analyze (Q-H1/Q-H2 highlighter, Q-D1/Q-D2 diff, Q-T1/Q-T2 tree).
Scope confirmed via AskUserQuestion: all three subsurfaces in scope;
tree-sitter approach is external CLI w/ no-op fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aish — Phase 6 Manifest
Project: aish (AI-augmented conversational shell)
Document: Phase 6 Requirements, Architecture & Design Decisions
Status: Formulate (pre-analyze)
Date: 2026-05-16
PHASE0 is the locked substrate; PHASE1-5 are layered on top. This manifest specifies what Phase 6 adds — tree-sitter syntax highlighting hooks, diff-aware code injection, and project-level context (file-tree summary).
1. Scope of Phase 6
Three pillars per PHASE0.md §11 row 6:
- Tree-sitter syntax highlighting hooks: when an external `tree-sitter` CLI is detected at startup, assistant code-fence content is filtered through it for ANSI-colorized display. Plain prose streams unchanged. When the CLI is absent, the filter is the identity function (zero overhead, zero hard dependency). Toggleable at runtime with `:highlight on|off`. Default off until the user opts in (don't surprise existing users with a display change).
- Diff-aware code injection: surface git diffs as first-class context. Two entry points:
  - Meta verb: `:diff [args]` runs `git diff <args>` from cwd, appends output to context as exec-output. `:diff staged`, `:diff HEAD~3`, `:diff main..feature` all delegate to git's argument grammar.
  - @-mention extension: `@HEAD..feature` (a ref-range expression anywhere a `@path` would go) expands inline as a fenced `diff` block, mirroring how `@README.md` already works.
- Project-level context (file-tree summary): `git ls-files`-based tree summary of the cwd, injected as a `[project]` block in the system prompt. Two entry points:
  - Meta verb: `:tree [depth]` injects on demand; `:tree refresh` re-scans.
  - Auto-inject at startup when `cfg.project.auto_tree = true`, gated like memory injection so existing configs don't change behavior.
Phase 6 is done when:
- With the `tree-sitter` CLI installed and `:highlight on`, an assistant reply containing a fenced `py` block with `print("hi")` shows up with ANSI colors. Without the CLI, `:highlight on` is a no-op and emits a status warning.
- `:diff` from a dirty git repo shows the working-tree diff in the exec-output frame; the model sees it on the next ask_ai turn.
- `@HEAD~1..HEAD` in a prompt expands inline to a fenced diff block.
- `:tree` injects a `[project] <N files>:` block visible in `ctx:to_messages()` (via the system prompt assembly).
- With `cfg.project.auto_tree = true`, the project block appears on every broker call (subject to the `max_chars` cap).
- Existing configs without `cfg.project` and with `:highlight off` (default) behave exactly like Phase 5 (Phase 5 regression coverage).
2. Technology Decisions (delta from Phase 5)
| Decision | Choice | Rationale |
|---|---|---|
| Highlight backend | External `tree-sitter` CLI (`tree-sitter highlight --lang X`) | Honors PHASE0 §3: no compiled extensions, no luarocks. Detected once at startup; absence → identity filter. Opt-in via `:highlight on` so install-state changes don't break users. |
| Highlight buffering | Accumulate inside fenced code blocks, emit on closing fence; pass-through outside fences | Streaming UX preserved for prose. Code blocks get colorized atomically, accepting a per-block latency (~ block streaming time). Per-chunk highlighting would split a token across tree-sitter invocations and corrupt the output. |
| Lang detection | First-line fence info-string (`py`, `python`, `lua`) → normalized via small map (`py` → `python`, `js` → `javascript`, etc.) | The lang tag mirrors the one we already emit in `expand_mentions` (#7). No tag → identity (no highlight). |
| Diff backend | Shell out to `git diff <args>` via `executor.exec` | Honors substrate (no libgit2 FFI). The existing exec frame handles capture + stream. `git` is universally present where aish makes sense. |
| Diff failure | Bail with status `[aish] :diff failed (not a git repo / bad ref)`; do NOT inject empty output | Avoids polluting context with stale or empty diffs. |
| Tree backend | `git ls-files --cached --others --exclude-standard` when cwd is a git repo, else `find . -type f -not -path './.*'` | Free `.gitignore` honor in repos; sensible default outside. Both are POSIX-portable. |
| Tree summary form | Sorted relative paths, grouped by directory at depth ≤ `cfg.project.tree_depth` (default 3), truncated by char count `cfg.project.tree_max_chars` (default 4096) | One block, deterministic order, cheap to compute. Matches the `[background]` memory block convention (Phase 4) so the system prompt's compositional shape stays familiar. |
| Tree injection point | `context.lua`: new `compose_project(...)` adds a `[project] <header>\n<body>` block to the system content, between `[background]` and `[earlier summary]` | Same suppression rule as `[background]`/`[earlier summary]`: NOT injected during Norris (R-C1 / R-C4: planner stays on its anchor). |
| Tree refresh policy | One scan at startup if auto; `:tree refresh` to re-scan on demand | Scanning on every ask_ai is wasteful for slow filesystems. Manual refresh is sufficient for v1. |
| @-mention diff syntax | `@<ref>..<ref>` (two-dot `..` separator) only, recognized via the existing trailing-punct peel logic | Avoids ambiguity with literal paths. `@HEAD` alone is NOT a diff trigger (would collide with files literally named `HEAD`). |
3. Module Changes
| File | State after Phase 5 | Phase 6 changes |
|---|---|---|
| `renderer.lua` | `assistant_delta(text)` writes chunks; `assistant_flush()` finalizes | Add fence-aware filter inside the assistant stream. State machine: outside-fence (pass-through) / inside-fence (buffer, emit on close). On close, pipe buffer through `tree-sitter highlight --lang <X>` (if highlight enabled), emit result. Toggle exposed as `renderer.set_highlight(bool)`. |
| `executor.lua` | `extract_cmd_lines`, `extract_cmd_bg_lines`, `extract_delegate_lines` | No changes. Diff and tree use the existing exec path. |
| `context.lua` | system prompt = base + `[background]` + `[earlier summary]` + NORRIS suffix | Add `self.project = "..."` field + `compose_project(self.project)` helper. Injection between `[background]` and `[earlier summary]`. Suppressed under Norris. |
| `repl.lua` | meta dispatch + main loop + #13 secrets wiring | New helpers: `_detect_treesitter()` (run once at startup), `_run_git_diff(args)`, `_scan_project_tree(dir, opts)`. New meta: `:highlight`, `:diff`, `:tree`. Extend `expand_mentions` to recognize `<ref>..<ref>` token shape. |
| `config.lua` | example blocks for mcp/safety/memory/routing/secrets/etc. | Add commented-out `project = { auto_tree = false, tree_depth = 3, tree_max_chars = 4096 }` block. |
No new module files in v1. The three new helpers make repl.lua
longer, but they keep the Phase 6 surface in one place. If the highlighter
filter grows past ~80 LOC, lift it into highlight.lua as a follow-up.
4. Pillar 1 — Tree-sitter highlighting
Detection (startup, once)
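local function _detect_treesitter()
  -- Succeeds only if the CLI is on PATH and answers --version.
  local pipe = io.popen("command -v tree-sitter >/dev/null 2>&1 && tree-sitter --version 2>/dev/null")
  if not pipe then return false end
  local version = pipe:read("*l")
  pipe:close()  -- always close, even when nothing was read
  return version ~= nil and version ~= ""
end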
If not present, renderer.set_highlight(true) emits a status warning
and leaves the filter as a no-op. Don't error; the user can install
tree-sitter and re-toggle.
Stream filter
The filter wraps renderer.assistant_delta. State machine:
state = "outside" | "inside"
buf = "" -- cumulative unprocessed text (both states)
lang = nil -- captured at fence open
push(chunk):
  buf = buf .. chunk
  loop:
    if state == "outside":
      look for ```<lang>\n in buf
      if found:
        emit buf up to and including the fence-open line
        state = "inside"; lang = parsed; buf = text after fence-open
      else:
        emit buf, holding back a trailing partial fence marker; stop
    if state == "inside":
      look for \n``` in buf
      if found:
        fence_body = buf up to closing
        emit highlighted(fence_body, lang)
        emit closing fence verbatim
        state = "outside"; buf = text after closing -- the loop handles the rest
      else:
        stop -- still buffering; nothing emitted this push
Edge case: a chunk boundary can land inside the fence marker itself
(e.g., one chunk ends with two backticks and the next starts with the
third). The state machine matches against the cumulative buf rather than
the latest chunk, so a split marker is recovered once the rest of it arrives.
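The state machine above can be sketched in runnable form as follows. This is a minimal illustration, not the renderer.lua implementation: `new_fence_filter`, `emit`, and `highlight` are hypothetical names, and the highlighter is injected as a plain function so the buffering logic can be exercised without the CLI. The fence marker is built with `rep(3)` only to keep literal fences out of the listing.

```lua
-- Sketch only: a fence-aware stream filter with injected side effects.
-- emit(text) writes to the terminal; highlight(body, lang) returns colored text.
local TICK = "`"
local FENCE = TICK:rep(3)
local OPEN_PAT = FENCE .. "([%w_+%-]*)\n"   -- opening fence + info-string + newline
local PARTIAL_PAT = FENCE .. "[%w_+%-]*$"   -- opener seen so far, newline not yet arrived

local function new_fence_filter(emit, highlight)
  local state, buf, lang = "outside", "", nil

  -- Consume as much of buf as can be decided now; return when more input is needed.
  local function drain()
    while true do
      if state == "outside" then
        local s, e, info = buf:find(OPEN_PAT)
        if s then
          emit(buf:sub(1, e))                -- prose plus the opening fence line, verbatim
          state, lang, buf = "inside", info, buf:sub(e + 1)
        else
          -- Hold back a trailing partial opener (one or two ticks, or a fence
          -- without its newline) so a marker split across chunks is not lost.
          local hold = buf:find(PARTIAL_PAT) or buf:find(TICK .. TICK .. "?$") or (#buf + 1)
          if hold > 1 then emit(buf:sub(1, hold - 1)) end
          buf = buf:sub(hold)
          return
        end
      else
        local s, e = buf:find("\n" .. FENCE, 1, true)
        if buf:sub(1, 3) == FENCE then s, e = 0, 3 end  -- empty block closes immediately
        if not s then return end             -- still buffering; nothing emitted this push
        emit(highlight(buf:sub(1, s), lang)) -- block body, including its final newline
        emit(buf:sub(s + 1, e))              -- closing fence, verbatim
        state, lang, buf = "outside", nil, buf:sub(e + 1)
      end
    end
  end

  return {
    push = function(chunk) buf = buf .. chunk; drain() end,
    -- End of turn: pass through whatever is left (held-back tail or unterminated block).
    flush = function()
      if buf ~= "" then emit(buf) end
      state, buf, lang = "outside", "", nil
    end,
  }
end
```

Under these assumptions, `assistant_delta` would call `push` per chunk and `assistant_flush` would call `flush`. The sketch matches fences anywhere in the text rather than only at line start, which is a simplification a real filter may want to tighten.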
highlighted(body, lang):
  if not highlight_enabled or not lang_map[lang]:
    return body
  pipe = io.popen("tree-sitter highlight --lang " .. lang_map[lang], "w")
  pipe:write(body); pipe:close()
  -- HOWEVER: io.popen("w") doesn't read back stdout. We need both:
  -- write body to stdin AND capture stdout. Easiest: temp file or
  -- /tmp pipe trick. tbd in analyze.
Open Q-H1 (analyze): how to popen for write+read simultaneously
without forkpty. Candidates: temp file roundtrip, popen with shell
piping printf '%s' BODY | tree-sitter highlight | cat. The shell
pipe is cleanest if we shell-escape body.
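One of the Q-H1 candidates, the temp-file roundtrip, might look like the sketch below. It sidesteps shell-escaping the body by writing it to a file and redirecting that file into the command's stdin, then reading stdout through `io.popen`. The helper name `run_filter` is hypothetical, and `cmd` stands in for whatever `tree-sitter` invocation analyze settles on; the sketch assumes a POSIX shell.

```lua
-- Sketch only: run `cmd` with `input` on stdin and return its stdout.
-- io.popen is one-directional, so the input side goes through a temp file.
local function run_filter(cmd, input)
  local tmp = os.tmpname()
  local f, err = io.open(tmp, "wb")
  if not f then return nil, err end
  f:write(input)
  f:close()

  -- Single-quote the temp path for the shell, escaping embedded quotes.
  local quoted = "'" .. tmp:gsub("'", "'\\''") .. "'"
  local pipe = io.popen(cmd .. " < " .. quoted .. " 2>/dev/null", "r")
  local out = pipe and pipe:read("*a")
  if pipe then pipe:close() end
  os.remove(tmp)

  -- Treat empty output as failure so the caller can fall back to the raw body.
  if not out or out == "" then return nil, "no output" end
  return out
end
```

A caller would use `run_filter(cmd, body) or body`, which keeps the "absence or failure means identity" behavior described above.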
Lang map (v1)
local LANG_MAP = {
  py = "python", python = "python",
  lua = "lua",
  js = "javascript", javascript = "javascript", ts = "typescript",
  sh = "bash", bash = "bash",
  c = "c", h = "c", cpp = "cpp", cc = "cpp",
  rs = "rust", go = "go", java = "java", rb = "ruby",
  md = "markdown", json = "json",
  diff = "diff", -- for the fenced diff blocks produced by @<ref1>..<ref2> (see §8)
}
Reuses the same map as expand_mentions. Factor into a shared
helper once both reference it (small _lang_of_ext() in repl.lua).
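As an illustration of the "no tag → identity" rule, a normalization helper could look like this. The name `normalize_lang` is hypothetical (the manifest only names `_lang_of_ext()`), the map is abbreviated, and the handling of extra words after the language tag is an assumption rather than a stated requirement.

```lua
-- Sketch only: map a fence info-string to a highlighter language, or nil.
local LANG_MAP = {  -- abbreviated copy of the v1 map
  py = "python", python = "python", lua = "lua",
  js = "javascript", sh = "bash", bash = "bash",
}

local function normalize_lang(tag)
  if type(tag) ~= "string" then return nil end
  -- Keep only the first word, so "python title=x" still resolves.
  local word = tag:match("^%s*([%w_+%-]+)")
  return word and LANG_MAP[word:lower()] or nil
end
```

A nil result means the block is emitted unhighlighted, which also covers unknown languages.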
Toggle
:highlight (no arg) → flip. :highlight on|off → set explicit.
:highlight status → report enabled + whether tree-sitter is present.
Default: off (don't change existing-user UX).
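The toggle rules can be sketched as a small pure function. The name `apply_highlight_arg`, the state table fields, and the status strings are all hypothetical; only the `on`/`off`/`status`/no-arg behavior comes from the text above.

```lua
-- Sketch only: apply a :highlight argument to st = { enabled, cli_present }
-- and return the status line to show.
local function apply_highlight_arg(arg, st)
  arg = (arg or ""):match("^%s*(.-)%s*$")
  if arg == "on" then
    st.enabled = true
  elseif arg == "off" then
    st.enabled = false
  elseif arg == "" then
    st.enabled = not st.enabled   -- bare :highlight flips the current state
  elseif arg ~= "status" then
    return "usage: :highlight [on|off|status]"
  end
  if st.enabled and not st.cli_present then
    return "highlight on, but tree-sitter CLI not found (no-op)"
  end
  return "highlight " .. (st.enabled and "on" or "off")
end
```

Keeping the warning in the returned status line matches the "don't error" stance: the flag is set either way, and the filter stays a no-op until the CLI appears.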
5. Pillar 2 — Diff-aware code injection
Meta: :diff [args]
- `:diff` → `git diff` (working tree vs index)
- `:diff staged` → `git diff --cached`
- `:diff HEAD` → `git diff HEAD`
- `:diff main..feature` → `git diff main..feature`
- `:diff <anything else>` → passed verbatim to `git diff <anything>`
Implementation:
meta.diff = function(args)
  -- Trim surrounding whitespace.
  args = (args or ""):match("^%s*(.-)%s*$")
  -- `staged` is the one alias; everything else goes to git verbatim.
  local git_args = (args == "staged") and "--cached" or args
  local out, code = executor.exec("git diff " .. git_args)
  if code ~= 0 then
    renderer.status(("diff failed (exit %d)"):format(code))
    return
  end
  -- Nothing but whitespace means there is no diff to inject.
  if not out or not out:find("%S") then
    renderer.status("(no diff)")
    return
  end
  ctx:append_exec_output(("[diff %s]\n%s"):format(
    args == "" and "(working tree)" or args, out))
end
The [diff ...]\n<output> framing matches the [bg:N exited] /
[delegate X] conventions established in Phase 5 / #6 / #8.
@-mention: @<ref1>..<ref2>
Extends expand_mentions (#7). After the existing path-resolution
attempt fails, try interpreting the token as a git diff-range:
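local r1, r2 = path:match("^(.-)%.%.(.+)$")
-- Only slash-free tokens are candidates, so @../sibling.txt stays a path.
if r1 and r2 and r1 ~= "" and r2 ~= "" and not path:find("/", 1, true) then
  -- candidate diff range; try `git diff <r1>..<r2>`
  -- shq() shell-quotes the range; %q would yield a Lua literal, not a shell-safe word.
  local pipe = io.popen("git diff " .. shq(r1 .. ".." .. r2) .. " 2>/dev/null")
  ...
end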
Output replaces the token with:
```diff
<content>
```
Same fence-with-lang shape as the @path expansion.
Risk: false-positive on legitimate paths containing .. like
@../sibling.txt. Mitigation: only interpret as diff-range when
the token contains NO / (paths have /, ref-ranges don't). Refs
with / like origin/main..feature ARE common — for those, the
user can fall back to :diff origin/main..feature.
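The disambiguation rule reads more easily as a standalone predicate. The sketch below is one possible shape, with `parse_ref_range` as a hypothetical name; rejecting three-dot tokens is an assumption drawn from the "two `..` separator only" decision, not something the manifest spells out.

```lua
-- Sketch only: return r1, r2 when `token` looks like <ref1>..<ref2>, else nil.
local function parse_ref_range(token)
  -- Tokens containing "/" are left to path resolution (or to :diff).
  if token:find("/", 1, true) then return nil end
  local r1, r2 = token:match("^(.-)%.%.(.+)$")
  if not r1 or r1 == "" then return nil end
  -- A leading dot in r2 means the token used three dots ("a...b"); not a two-dot range.
  if r2:sub(1, 1) == "." then return nil end
  return r1, r2
end
```

With this rule `@HEAD~1..HEAD` is a range, while `@../sibling.txt`, `@origin/main..feature`, and a bare `@HEAD` all fall through to the existing path handling.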
6. Pillar 3 — Project file-tree
Meta: :tree [depth]
- `:tree` → scan + inject with default depth and char cap
- `:tree <N>` → override depth for this scan
- `:tree refresh` → re-scan with cached opts
- `:tree off` → clear `ctx.project`
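The four argument forms can be sketched as a parser that returns an action table for the meta handler to execute. `parse_tree_arg` and the `action` field are hypothetical; the defaults are the ones given for `cfg.project` in this manifest.

```lua
-- Sketch only: turn a :tree argument into an action description.
-- `cached` holds the opts of the previous scan, if any.
local DEFAULTS = { depth = 3, max_chars = 4096 }

local function parse_tree_arg(arg, cached)
  arg = (arg or ""):match("^%s*(.-)%s*$")
  if arg == "off" then return { action = "clear" } end
  if arg == "" then
    return { action = "scan", depth = DEFAULTS.depth, max_chars = DEFAULTS.max_chars }
  end
  if arg == "refresh" then
    local o = cached or DEFAULTS
    return { action = "scan", depth = o.depth, max_chars = o.max_chars }
  end
  local n = tonumber(arg)
  if n and n >= 1 and n == math.floor(n) then
    return { action = "scan", depth = n, max_chars = DEFAULTS.max_chars }
  end
  return nil, "usage: :tree [N|refresh|off]"
end
```

A `scan` action would be passed on to `_scan_project_tree(dir, opts)`, and `clear` would reset `ctx.project`.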
Scan logic
local function _scan_project_tree(dir, opts)
  opts = opts or {}
  local max_chars = opts.max_chars or 4096
  local depth = opts.depth or 3
  -- Prefer git ls-files for .gitignore honor; fall back to find.
  -- os.execute returns 0 on Lua 5.1/LuaJIT and true on Lua 5.2+.
  local rc = os.execute("cd " .. shq(dir) .. " && git rev-parse --git-dir >/dev/null 2>&1")
  local in_git = (rc == 0 or rc == true)
  local listcmd
  if in_git then
    listcmd = ("cd %s && git ls-files --cached --others --exclude-standard"):format(shq(dir))
  else
    -- Run find from inside dir so paths come back relative ("./a/b").
    listcmd = ("cd %s && find . -maxdepth %d -type f -not -path '*/\\.*' 2>/dev/null"):format(shq(dir), depth + 1)
  end
  local pipe = io.popen(listcmd)
  if not pipe then return nil, "scan failed" end
  local files = {}
  for line in pipe:lines() do
    line = line:gsub("^%./", "") -- strip find's "./" prefix
    -- Depth filter: count `/` separators
    local _, slashes = line:gsub("/", "")
    if slashes < depth then files[#files + 1] = line end
  end
  pipe:close()
  table.sort(files)
  -- Build a tree-ish summary, truncate by char count.
  local body = table.concat(files, "\n")
  local truncated = false
  if #body > max_chars then
    -- Cut at the last complete line so no path is left half-written.
    body = body:sub(1, max_chars):gsub("[^\n]*$", "") .. "... (truncated)"
    truncated = true
  end
  return body, { file_count = #files, truncated = truncated }
end
Injection
ctx.project = "..." (string), composed into the system prompt
between [background] and [earlier conversation summary]:
[project] 142 files (truncated at 4096B):
README.md
broker.lua
config.lua
context.lua
...
Suppressed under Norris (R-C1 / R-C4 — planner stays focused; the project context can be re-introduced via the Norris goal text if needed).
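The ordering and suppression rules can be sketched as below. This is an assumption-heavy illustration: the real assembly lives in context.lua and may differ, `compose_system` and the `parts` field names are hypothetical, and the `[background]` / `[earlier summary]` block shapes are guesses. Only the block order, the `[project]` header shape, and the Norris suppression come from this manifest.

```lua
-- Sketch only: build the [project] block from a scan result.
local function compose_project(body, info, max_chars)
  if not body or body == "" then return nil end
  local note = info.truncated and (" (truncated at %dB)"):format(max_chars) or ""
  return ("[project] %d files%s:\n%s"):format(info.file_count, note, body)
end

-- Sketch only: assemble the system prompt in the documented order.
local function compose_system(parts)
  local out = { parts.base }
  if parts.norris then
    -- Under Norris none of the context blocks are injected.
    out[#out + 1] = parts.norris_suffix
  else
    if parts.background then out[#out + 1] = "[background]\n" .. parts.background end
    if parts.project then out[#out + 1] = parts.project end
    if parts.summary then out[#out + 1] = "[earlier summary]\n" .. parts.summary end
  end
  return table.concat(out, "\n\n")
end
```

Passing the second return value of `_scan_project_tree` as `info` would produce a header of the shape shown in the example above.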
Auto-inject
cfg.project.auto_tree = true runs the scan once at startup and
sets ctx.project. Default false (existing configs unchanged).
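For illustration, the opt-in gate might be expressed like this. The config keys are the ones named in this manifest; `auto_tree_opts` is a hypothetical helper, and the startup wiring around it is not shown.

```lua
-- Hypothetical config.lua fragment with auto-inject switched on.
local cfg = {
  project = { auto_tree = true, tree_depth = 3, tree_max_chars = 4096 },
}

-- Sketch only: return scan options when auto-inject is enabled,
-- or nil to leave ctx.project unset (the Phase 5 behavior).
local function auto_tree_opts(c)
  local p = c and c.project
  if type(p) ~= "table" or p.auto_tree ~= true then return nil end
  return { depth = p.tree_depth or 3, max_chars = p.tree_max_chars or 4096 }
end
```

Requiring `auto_tree` to be exactly `true` means a config with no `project` table, or one that omits the flag, takes the unchanged path.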
7. UX Surface Summary
| Meta | Behavior |
|---|---|
| `:highlight [on/off/status]` | Toggle tree-sitter highlighter (no-op when CLI absent) |
| `:diff [args]` | `git diff <args>`, append output to context as `[diff ...]` |
| `:tree [N/refresh/off]` | Scan/refresh/clear project file-tree block |

| @-mention | Behavior |
|---|---|
| `@path` | Existing (#7) file expansion |
| `@<ref1>..<ref2>` | New: inline `git diff <r1>..<r2>` expansion |

| Config | Default | Effect |
|---|---|---|
| `cfg.project.auto_tree` | `false` | Inject project tree at startup |
| `cfg.project.tree_depth` | `3` | Depth filter for the scan |
| `cfg.project.tree_max_chars` | `4096` | Truncation cap for the injected block |
| (no config flag for `:highlight`) | n/a | Runtime toggle only; no persistence in v1 |
8. Out of Scope (Phase 6)
- Pure-Lua syntax highlighter: defer to a future phase if tree-sitter CLI absence becomes a practical pain point. v1 says "install tree-sitter or accept plain text".
- bat/glow/chroma integration: only `tree-sitter` is wired. Other highlighters can be added behind the same `:highlight` toggle later (config field `cfg.highlight.backend = "tree-sitter"|"bat"|...`).
- Smart diff context selection: no AI-driven "which diff to show". User explicitly says `:diff <range>` or `@<r1>..<r2>`.
- File-tree LRU / smart summarization: v1 is a flat truncated list. Hierarchical roll-up ("docs/ (8 files)") is a v2 polish.
- Watching for file changes: no fs-notify reload. Re-scan via `:tree refresh`.
- Diff history: `:diff` doesn't track its previous invocations. Each invocation is independent.
- Inline diff highlighting: the `diff` lang is in `LANG_MAP` so `tree-sitter highlight --lang diff` works, but we don't ship custom ANSI for added/removed lines; tree-sitter's own theme covers it.
9. Risks
| Risk | Mitigation |
|---|---|
| `tree-sitter` CLI not on fleet → most users get no highlighting | It's opt-in; default off; status warning on toggle when absent. |
| Highlighter latency on long code blocks (whole-block buffering) | Accepted trade-off vs corrupting output. If painful in practice, add a per-block size cap above which we pass-through unhighlighted. |
| `git diff` on huge changesets blows context budget | Diff output reuses `enforce_budget` eviction (it's just exec output). User can `:diff <subdir>` to scope. v2 could add a `--max-bytes` truncation. |
| `git ls-files` in a non-git cwd → falls back to `find`, may pick up `node_modules` / `target` / etc. | Document in config example; v2 could honor `.aishignore` or similar. |
| `@<ref1>..<ref2>` collides with paths like `@../sibling.txt` | Mitigation: require NO `/` in the token for diff interpretation. Paths with `..` segments use `:diff` explicitly. |
| Project tree injection adds tokens to every broker call | Char cap + opt-in `auto_tree = false` default. Suppressed under Norris. |
| `:highlight on` mid-stream produces inconsistent rendering for the in-flight turn | Toggle takes effect from the NEXT assistant turn. Document this. |
10. Open Questions (Phase 6)
| # | Question | Impact | Resolution target |
|---|---|---|---|
| Q-H1 | How to popen `tree-sitter highlight` with simultaneous stdin write + stdout capture (Lua/LuaJIT lacks popen3). Candidates: temp-file roundtrip, shell-pipe wrapper `printf '%s' BODY \| tree-sitter ...` with shell-escape, or use `io.popen("w")` + a second `io.open(output_file)` after the process completes. | Highlighter correctness | Analyze |
| Q-D1 | Should `:diff` honor a per-call confirm gate (it shells out and reads git history; safe but noisy)? | UX | Analyze |
| Q-D2 | Should `@<r1>..<r2>` accept refs with `/` (`origin/main..feature`)? Doing so means we can't use the no-`/` heuristic to disambiguate from paths. Alternative: require explicit prefix like `@diff:origin/main..feature`. | @-mention grammar | Analyze |
| Q-T1 | When `cfg.project.auto_tree = true`, should the project block update on `cd` (since the cwd changed)? Or stay fixed at startup-cwd? | UX expectation | Analyze |
| Q-T2 | Should `cfg.project` accept a list of include/exclude glob patterns, or just rely on git's `.gitignore`? | Configurability | Analyze |
| Q-H2 | Should highlighting also apply to user-pasted code (`expand_mentions` `@path`), not just assistant output? | Symmetry | Analyze |
11. Phase 6 → Phase 7+ Out-of-band
The §11 "Planned Phase Sequence" table in PHASE0.md does not list phases beyond 6. After Phase 6 lands, candidate next iterations (non-binding, for the formulate of Phase 7 to confirm):
- Phase 7: secret-redaction wiring into `safety.lua` (#52 follow-up filed during Phase 5/13 close); session-multiplex / tmux parity surfaces (out of scope per §12, explicitly rejected); or other backlog as it accumulates on Gitea.
Phase 6 itself is self-contained — none of its three pillars introduce substrate dependencies on phases not yet planned.