# Phase 6 Baseline: pre-implementation measurements

**Date:** 2026-05-16

**Tree probed:** `ad52fe4` (Phase 5 + #2/#3/#4/#5/#6/#7/#8/#9/#10/#11/#13/#14/#23/#32/#33/#51/#52 follow-up).

**Hosts probed:** noether (primary), higgs (Pi5).

**Broker probed:** `hossenfelder.fritz.box:8082` (local `qwen-coder-7b-snappy-8k`, cloud `anthropic/claude-haiku-4.5`).

This is the Phase 7 (verify) anchor for Phase 6. It captures the state of the world just before the tree-sitter / diff / project-tree implementation lands.

---

## B1. `git` output through `executor.exec` carries ANSI + terminal control

`executor.exec` uses `pty.spawn` (forkpty). When git's stdout is a PTY, git enables both color output AND interactive pager defaults (DEC keypad mode `\27[?1h=` ... `\27[?1l>`, line-clear `\27[K`). Observation:

```
> executor.exec("git diff --stat HEAD~1..HEAD")
exit=0 len=173
\27[?1h= docs/PHASE6.md | 207 \27[32m++...\27[m\27[31m--...\27[m\27[m
 1 file changed, 166 insertions(+), 41 deletions(-)\27[m
\27[K\27[?1l>
```

With `--no-pager`: keypad sequences gone, color stays:

```
> executor.exec("git --no-pager diff --stat HEAD~1..HEAD")
exit=0 len=148
 docs/PHASE6.md | 207 \27[32m++...\27[m\27[31m--...\27[m
 1 file changed, 166 insertions(+), 41 deletions(-)
```

With `--no-pager --color=never`: clean.

```
> executor.exec("git --no-pager diff --color=never --stat HEAD~1..HEAD")
exit=0 len=132 clean=true
 docs/PHASE6.md | 207 +++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 166 insertions(+), 41 deletions(-)
```

**Implication for §5 (`:diff` meta):** the implementation MUST use both `--no-pager` and `--color=never`. If either flag is missing, the injected context block carries escape codes that confuse the model AND inflate token counts. The same flags apply to any future `git log` / `git show` / `git blame` verbs that might land beyond Phase 6.

---

## B2. SSE chunk size envelope (relevant to fence-aware highlighter)

`renderer.assistant_delta` receives whatever chunks the broker streams.
Measured against two model classes:

### Local llama.cpp (`qwen-coder-7b-snappy-8k`)

```
prompt: "reply with a python code block that prints hello world, then a brief explanation"
max_tokens: 150
chunks: 97
total: 423 chars
sizes: min=1, max=13, median=4
fences:
  fence at char 58 -> chunk 14 ('```')
  fence at char 91 -> chunk 23 ('``')   <-- split fence
```

**The local model splits fences across chunks** (`'``'` arrives; the final backtick is in the next chunk). The fence-aware filter MUST handle fences fragmented across chunk boundaries.

### Cloud (`anthropic/claude-haiku-4.5` via OpenRouter)

```
prompt: "write a 5-line python hello world example wrapped in a code fence"
max_tokens: 150
chunks: 3
total: 60 chars
sizes: 7 / 27 / 26
fences:
  fence at char 0 -> chunk 0 ('```python\n# Hello World in')
  fence at char 57 -> chunk 2 ('\nprint("Hello, World!")\n```')
```

Cloud delivers BIG chunks (median ~26 chars); fences typically arrive intact within a single chunk.

**Implication for §4 (highlight stream filter):** the state machine must accumulate enough `buf` to detect a fence opening or closing even when only `'``'` arrives in a chunk. The §4 design already specifies "look at the cumulative `buf`, so partial markers are recovered correctly"; the local-model behavior confirms this is necessary.

---

## B3. **LuaJIT `io.popen():close()` does NOT expose exit codes**

This diverges from the Lua 5.2+ behavior that the §4 (A4) highlighter resolution assumes:

```
> luajit -e "for _, cmd in ipairs({'true','false','exit 7'}) do
    local p = io.popen(cmd); local ok, err, code = p:close()
    print(cmd, ok, err, code) end"
true    true    nil     nil
false   true    nil     nil
exit 7  true    nil     nil
```

`io.popen():close()` returns `(true, nil, nil)` regardless of child exit status. The exit code is silently discarded.

**Revised Q-H1 resolution (supersedes A4):** the highlighter must detect tree-sitter failure via a different channel.
Cleanest path: write the body to a tmpfile, then invoke the highlighter via `executor.exec("cat tmpfile | tree-sitter highlight --lang X")`. `executor.exec` uses its own forkpty + waitpid path and DOES return `(out, exit_code)` reliably. Updated sketch:

```lua
local function highlighted(body, lang)
  -- Pass the body through untouched when highlighting is off or the
  -- language has no mapping.
  if not highlight_enabled or not lang_map[lang] then return body end
  local tmp = os.tmpname()
  local f = io.open(tmp, "wb")
  if not f then
    os.remove(tmp)  -- os.tmpname may already have created the file
    return body
  end
  f:write(body); f:close()
  -- executor.exec reports the exit code that io.popen():close() drops.
  local out, code = executor.exec(
    ("cat %s | tree-sitter highlight --lang %s")
      :format(_shq(tmp), lang_map[lang]))
  os.remove(tmp)
  if code ~= 0 then return body end  -- any failure falls back to plain text
  return out
end
```

Cost: tmp file write + read + remove + one `executor.exec` roundtrip per code block. Acceptable; tree-sitter highlighter latency dominates.

**This finding will fold into PHASE6.md §4 during the analyze revision** (or as a baseline-time amendment).

---

## B4. tree-sitter CLI presence on the fleet

```
noether (local primary):  ABSENT (which tree-sitter -> not found)
higgs (Pi5 / Debian 13):  ABSENT (which tree-sitter -> not found)
```

**Implication for §1 (scope):** the design's "external CLI when present, no-op otherwise" decision is the right call: on the fleet as tested, ZERO hosts ship tree-sitter by default. Users who want highlighting will need to opt in explicitly (apt / cargo / manual install). Documentation should mention this clearly in the PHASE6 implementation notes + the config example.

`:highlight on` against a host without the CLI should emit a clear "tree-sitter CLI not found; install with e.g. `apt install tree-sitter` or `cargo install tree-sitter-cli`" status, not silently no-op.

---

## B5. Project-tree envelope (`git ls-files` performance)

```
> time git -C /home/mfritsche/src/aish ls-files --cached --others --exclude-standard >/dev/null
real 0.002s
files: 32, total: 449 chars, avg/file: 14
```

Sampling other repos on noether (`~/src/*` with `.git/`):

| Repo | Files | Time |
|---|---|---|
| aish | 32 | 2 ms |
| ampere-fourier | 15 | 5 ms |
| ampere-kernel-decoders | 23 | 1 ms |
| cfw | 25 | (similar) |

**Implication for §6 (`:tree` scan):**

- Scan latency on typical local repos is negligible (< 10 ms).
- The 4096-char default `tree_max_chars` cap accommodates ~290 paths at the observed avg of 14 chars/path, which is fine for most aish-target workflows.
- Repos with thousands of files (kernel, nix-pkgs, etc.) WILL exceed the cap; users can lower `tree_depth` or raise the cap. The §9 risk row already covers this; no design change needed.

---

## B6. `os.tmpname()` behavior

```
> luajit -e "for i = 1, 3 do print(os.tmpname()) end"
/tmp/lua_qAGTFV
/tmp/lua_RhpXLK
/tmp/lua_F9WtYx
```

LuaJIT's `os.tmpname` returns POSIX-style `/tmp/lua_XXXXXX` paths. Adequate for B3's tmpfile-roundtrip pattern. On Linux it uses `mkstemp(3)`, which creates the file atomically under a unique name, so there is no name-collision race at creation time; the caller is responsible for `io.open` and cleanup.

Note: B3's pattern reopens the path by name (`io.open`, then `f:write(body); f:close()`) before use. `io.open` does not pass `O_EXCL`, so the exclusivity guarantee covers only the `mkstemp` creation step, not the reopen. Acceptable for a local-only tmpfile holding short-lived code-block content; not a security concern (we trust the local user per PHASE0 §12).
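
The tmpfile roundtrip from B3 and B6 can be wrapped so that cleanup happens on every path, including when the consumer raises an error. The following is a minimal sketch only: `with_tmpfile` is a hypothetical helper name, not part of the PHASE6 design, and the callback here just reads the file back instead of calling `executor.exec`.

```lua
-- Hypothetical helper (not in PHASE6.md): write `body` to a fresh tmpfile,
-- hand the path to `fn`, and remove the file whether or not `fn` succeeds.
local function with_tmpfile(body, fn)
  local tmp = os.tmpname()
  local f = io.open(tmp, "wb")
  if not f then
    os.remove(tmp)  -- the name may already exist on disk
    return nil, "cannot open " .. tmp
  end
  f:write(body)
  f:close()
  -- pcall keeps an error inside fn from skipping the cleanup below.
  local ok, a, b = pcall(fn, tmp)
  os.remove(tmp)
  if not ok then return nil, a end
  return a, b
end

-- Usage: read the body back, standing in for the highlighter call.
local seen_path
local out = with_tmpfile("print('hello')\n", function(path)
  seen_path = path
  local f = assert(io.open(path, "rb"))
  local s = f:read("*a")
  f:close()
  return s
end)
```

In the real highlighter the callback would be the place for the `executor.exec` pipeline, returning `(out, exit_code)` so the caller can fall back to the unhighlighted body on a non-zero code.
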

---

## Summary

| Finding | Affects | Resolution |
|---|---|---|
| B1 git ANSI/pager leakage | §5 `:diff` impl | Add `--no-pager --color=never` to every git invocation |
| B2 SSE chunk envelope | §4 fence filter | Existing accumulator design is correct; local-model split-fence case confirmed necessary |
| B3 io.popen no exit code | §4 (A4) highlighter | Revise: route via `executor.exec("cat tmp \| tree-sitter ...")` for reliable exit code |
| B4 no tree-sitter on fleet | §1 / docs | Highlighter is opt-in; absent-CLI emits install-hint status |
| B5 tree scan envelope | §6 `:tree` | No change; defaults fit observed repo sizes |
| B6 os.tmpname semantics | §4 highlighter | Confirmed adequate for tmpfile-roundtrip |

No structural changes to the formulate/analyze design. B1, B3, and B4 surface as implementation-time amendments to PHASE6.md sections §5, §4, and §1 respectively. These will be folded into the manifest during the plan phase.
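
To make the B2 row concrete, here is a minimal sketch of a cumulative-buffer fence tracker that recovers a marker split across chunks (two backticks in one chunk, the third in the next). It is illustrative only: `new_fence_tracker`, `feed`, and `toggles` are hypothetical names, it treats every three-backtick run as a fence, and it is not the §4 state machine itself.

```lua
-- Hypothetical sketch, not the PHASE6 §4 implementation.
local FENCE = string.rep("`", 3)  -- three backticks, the fence marker

local function new_fence_tracker()
  local buf = ""          -- cumulative text received so far
  local pos = 1           -- next index in buf to scan from
  local in_fence = false  -- true while inside an open code fence
  local toggles = 0       -- complete fence markers seen so far

  local tracker = {}

  -- Append one streamed chunk; returns whether we are inside a fence.
  function tracker.feed(chunk)
    buf = buf .. chunk
    while true do
      local s, e = buf:find(FENCE, pos, true)  -- plain-text search
      if not s then break end
      in_fence = not in_fence
      toggles = toggles + 1
      pos = e + 1
    end
    -- Keep the last two characters rescannable: they may be the start of
    -- a fence whose remaining backticks arrive in the next chunk.
    local keep = #buf - 1
    if keep > pos then pos = keep end
    return in_fence
  end

  function tracker.toggles() return toggles end
  return tracker
end

-- Usage: the closing fence is split as two backticks, then one.
local t = new_fence_tracker()
local states = {}
local chunks = { "Here:\n", FENCE, "python\nprint('hi')\n", "``", "`\nDone." }
for _, chunk in ipairs(chunks) do
  states[#states + 1] = t.feed(chunk)
end
```

Scanning from a saved position rather than from the start of `buf` keeps each `feed` call cheap on the 97-chunk local stream while still looking at the cumulative buffer, which is the property the split-fence case depends on.
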