marfrit/aish - aish - marfrit's space

Author	SHA1	Message	Date
marfrit	c5116bf129	docs/PHASE2-baseline: pre-implementation measurements Phase 7 (verify) anchor. Captures: - MCP RPC round-trip timings against boltzmann lmcp v0.5.4 (all sub-100ms on LAN; LLM is the latency floor, not the transport). - 6 fixture responses saved to /tmp/aish-baseline/ covering initialize, notifications/initialized, tools/list, tools/call success, isError, and JSON-RPC unknown-tool error. - Baseline design finding: boltzmann's read_file returns isError:false even on failure (error text in content). aish should treat content as authoritative, isError as advisory; feed both to the model. PHASE2.md §4's "pass-through" stance already accommodates; no manifest amendment needed. - Streaming tool_calls delta shape verified against hossenfelder; matches PHASE2.md §5. - Pre-MCP aish behavior snapshot: loaded model emits markdown code-fence ignoring the CMD: contract — once MCP tools exist the model gets a structured path that doesn't depend on prose-formatting compliance. - Module pre-state at Phase 1 head `5878f73`: LOC + capability snapshot per module so Phase 2 diff has a reference frame. - Two boltzmann-proxy blockers (SSE buffering, model-field routing) carried explicitly into Phase 7. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 12:34:32 +00:00
marfrit	5878f7347b	docs/PHASE2: analyze — lmcp v0.5.4 probed, transport simplified Live-probed against lmcp v0.5.4 (boltzmann) + hossenfelder broker proxy: Transport simpler than spec: - lmcp only implements POST-per-RPC with Connection: close; no held-open SSE channel. Combined with capabilities.tools.listChanged=false, no client-side listener is needed in v1. Drops the planned M.get_sse addition to ffi/curl.lua — Phase 1's M.post covers MCP. Bearer auth is universal across the fleet — config schema grew auth_token (literal) and auth_env (env-var indirection) fields per server, mirroring PHASE0 §10's key_env convention. Streaming tool_calls delta shape verified — accumulator by `index`, function.arguments arrives as chunked JSON-string. Matches the formulate-phase assumption in §5. Resolutions: Q17 transport abstraction — POST-only, no SSE channel for lmcp. Q21 error mapping — result.isError (model-recoverable, feed back as tool turn) vs JSON-RPC error (unknown method/tool, transport-level). Q18 role:"tool" turn — accepted at protocol level (live-probed). Mistral-nemo template verification blocked by the hossenfelder model-field routing bug; full closure carried to Phase 7 verify. Open-end recorded in §11: the hossenfelder proxy routes every request to the loaded fast model regardless of model field, blocking Phase 2 testing against mistral-nemo specifically. Parallel to the SSE buffering issue at marfrit/aish#15; same root (boltzmann proxy code). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 09:51:03 +00:00
marfrit	ec6793c93c	docs/PHASE2: formulate — MCP client + tool-calling bridge Phase 2 formulate manifest. Three pillars per PHASE0.md §11 row 2: mcp.lua (JSON-RPC 2.0 over HTTP+SSE, target: lmcp), tool-calling bridge (OpenAI tools field <-> MCP tools/call), and the safety.lua authorization gate (per-call confirm + auto_approve policy). Resolves PHASE0.md §13 Q6–Q10: Q6 CMD: + tool-calls coexist; substrate §3 unchanged Q7 config-declared servers + runtime :mcp connect Q8 per-call confirm default, auto_approve policy in config Q9 hybrid system prompt: static frame + dynamic tools body field Q10 streaming-from-day-one on Phase 1 SSE; on_delta widens to (kind, payload) New questions tracked in §11 (Q17–Q22): transport abstraction, role:tool vs prefix injection (mistral-nemo template verification needed), large tool-result handling, parallel dispatch, error mapping, aish-as-MCP-server (parked). §4 module layout amended: mcp.lua slots between broker.lua and router.lua. The amendment is documented in this manifest; the actual §4 table edit lands when implementation starts (Phase 2 implement phase). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 09:23:53 +00:00
marfrit	7d62eb5659	review followups: pcall shield, :resume guard, shell quoting, nits CONCERNs from the Phase 1 review pass: ffi/curl.lua: - SSE write_cb body is now pcall-wrapped. A Lua error in on_event (or in the parse loop itself) is captured into cb_error and surfaced after curl_easy_perform rather than propagating across the FFI callback boundary (which LuaJIT documents as process-fatal). The EOS flush path gets the same shield. Errors return (nil, "callback: <msg>") from post_sse. history.lua: - sh_singlequote() escapes shell metacharacters; the mkdir -p and ls -1 shell-outs no longer double-quote (where $(...) and $VAR still expand) — single-quote with embedded-' escaping is the safe form. - M.load now returns (turns, meta) instead of (meta, turns). turns is ALWAYS a table on success, never nil-when-no-header; failure path is the unambiguous (nil, err). Callers can `if not turns then` without the previous ambiguity. repl.lua :resume updated to the new shape. repl.lua :resume: - Refuse to resume into a non-empty ctx — silent overwrite was the Q15 default, but the review surfaced the no-undo / no-warning failure mode. User must :reset (or :save then re-launch) to express intent. The current session's on-disk log is unaffected either way. NITs: - ffi/libc.lua READ_BUF: comment noting it's module-shared and Phase 1 has no reentrant readers; revisit when that changes. - PHASE1.md §7: \C-x\C-c reservation pinned to Phase 3 ("deferred from Phase 1 — no consumer here") rather than the previous dangling "(or here)". Regression suite verifies: - history.load new signature on success + failure paths - shell-quoted history.dir with $ doesn't trip - aish scripted run: ctx with 2 turns refuses :resume anchor with a clear status; user must :reset first Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:05:23 +00:00
marfrit	1f1065157e	review BLOCKER: PTY input forwarding + raw mode toggle Phase 1 review caught a structural gap: executor.exec only drained the PTY master fd, never forwarded user keystrokes — vim/less/htop/nano would render and hang on input. PHASE1.md §5 specified bidirectional multiplex but only the read leg landed. tcgetattr/tcsetattr were also missing, so even with input forwarding the parent's line discipline would buffer until newline (breaking single-key UIs). ffi/libc: - struct termios opaque buffer + tcgetattr/tcsetattr + cfmakeraw - M.set_raw(fd) saves termios + applies cfmakeraw; returns saved or (nil, err) when fd isn't a tty (scripted / piped-stdin runs) - M.restore_termios(fd, saved) - struct pollfd + M.poll (POLLIN constant) executor: - multiplex(sess): poll(stdin, master); reads master on any revents (POLLHUP fires when child closes its slave end, not POLLIN — the revents != 0 check catches both); forwards stdin keystrokes to master; loop exits when master read returns 0 (EOF / child gone) - stdin polling is only enabled when stdin_is_tty (set_raw succeeded); piped-stdin runs (tests / scripted) would otherwise drain queued aish commands into the child of the current cmd, swallowing them - raw mode is restored before returning so the user lands back at the aish prompt in canonical mode renderer + repl: - exec_output(out, code) split into exec_begin() (top rule, before spawn) + exec_end(code) (closing rule with exit, after wait). PTY multiplex streams the body live to stdout in between; the renderer never re-prints the body. PHASE1.md §3: - tcgetattr/tcsetattr changed from "optional" to "required for single-key UIs to work — done-criteria #2"; poll added to the libc row description. Verified: - non-interactive smoke (echo / false / exit 7 / ls /nonexistent / printf multi-line) — all exit codes correct, output streamed live, a\nb\nc\n preserved byte-for-byte - scripted-stdin run reaches all expected lines (no stdin draining into a non-interactive child) - aish prompt + framed exec block + exit-code line all render in correct order Live interactive verification (vim / less / htop in a real terminal) still needs a user-test pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:00:53 +00:00
marfrit	ee4d7f86d6	executor: swap popen+sentinel for pty.spawn (Phase 1) Replaces the Phase 0 io.popen + sentinel-echo exit-code recovery with forkpty + waitpid via ffi/pty. The §7 amendment paragraph on PHASE0.md is rewritten to point at PHASE1.md §5 — the workaround is gone, not just renamed. User-visible behavioral changes: - Interactive commands (vim, less, htop, top) now work via $cmd / :exec / known-command shell paths because the child has a real PTY for line discipline. - Exit codes are accurate: `false` -> 1, `exit 7` -> 7, signal kill -> 128+N (bash convention), shell parse error -> sh's 2. - Broken-shell-syntax cmd now shows the actual sh diagnostic (e.g. "Syntax error: end of file unexpected") instead of Phase 0's "(no output — possible shell parse error)" guess. - Output normalization: PTY emits CR LF; executor collapses \r\n -> \n to keep the Phase 0 contract ("output uses \n separators"). Code path: pty.spawn(cmd) -> drain master_fd until EOF -> wait() returns ("exit", N) \| ("signal", N) \| ... -> exit_code mapped: exit -> N, signal -> 128+N, else -1 Phase 0 invariants intact: `cd` interception unchanged (still libc.chdir per §3 + §7), `CMD: ` extraction unchanged. PHASE0.md §7: the "LuaJIT 2.1 popen-close caveat" paragraph is rewritten to "Superseded by Phase 1" — points at PHASE1.md §5 for the live model. The illustrative sketch is left in place as historical context. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:08:27 +00:00
marfrit	539408f480	phase1 formulate: scope, tech decisions, module changes, open questions Inner-loop Phase 1 (formulate) deliverable for the milestone Phase 1 of the aish project. Drafts docs/PHASE1.md to specify what lands on top of the Phase 0 substrate — no code changes, no §3 invariant amendments. Phase 1 milestone scope per PHASE0.md §11: 1. SSE streaming via libcurl FFI (existing WRITEFUNCTION hook) 2. PTY-backed exec via forkpty(3); replaces popen + retires the §7 sentinel exit-code workaround in favor of waitpid 3. Session persistence as append-only JSONL under <config.history.dir>/sessions/<utc>.jsonl 4. Readline custom bindings (rl_bind_keyseq); Phase 1 reserves \C-n as a no-op for Phase 3's Norris consumer Module growth (no new file names beyond the §4-stubs): ffi/curl -> M.post_sse(url, body, headers, on_event) ffi/pty -> M.spawn / read / write / close / wait ffi/libc -> waitpid + WEXITSTATUS + tcgetattr/tcsetattr ffi/readline -> M.bind(seq, fn) broker -> M.chat_stream; M.chat becomes a buffering wrapper executor -> PTY path; sentinel hack deleted repl -> :save, :resume <name>, :sessions; streaming render renderer -> assistant_delta + assistant_flush history -> open / append / load / list_sessions Open questions Q11–Q16 (six new) tracked in §10: - SSE shape uniformity across OpenRouter routes (Q11, Phase 7) - CMD: highlight-on-stream strategy (Q12, plan phase) - tty raw-mode recovery on Lua error (Q13, plan phase) - bind \C-n now or defer to Phase 3 (Q14, plan phase) - :resume into non-empty context (Q15, plan phase) - session-log fsync policy (Q16, default close-only; tracked) Next inner phase is "analyze": for each module change, identify dependencies + risks + per-commit ordering. Then baseline (capture Phase 0 behaviors we want to preserve), plan, review, implement, verify, memory-update. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:56:20 +00:00
marfrit	16490e6905	fix: buffer exec output for next user turn; alternation for strict templates User-test surfaced the bug: with `deep` (mistral-nemo-12b) active, running `list files` -> y on `CMD: ls` -> `Are there directory entries beginning with "lor"?` returned a Jinja exception: api: ... Error: Jinja Exception: After the optional system message, conversation roles must alternate user/assistant/user/assistant/... Cause: §6 specified "exec output injected into context uses role 'user' with a prefix tag '[exec output]'." This works for permissive templates (qwen2.5-coder-1.5b, the `fast` preset) but produces a back-to-back user/user pair on strict templates that enforce the OpenAI alternation contract — `[exec output]` user turn followed by the user's actual follow-up question. Fix: context.lua: - new field `pending_exec_output` (initially nil) - new method `:append_exec_output(out)` buffers (concat on subsequent captures so multi-shell-then-ai still merges everything) - new method `:append_user(content)` flushes buffered exec output as a `[exec output]\n...\n\n` prefix and appends a user turn - `:reset()` also clears the buffer repl.lua: - run_shell calls ctx:append_exec_output(out) instead of ctx:append({role="user", content="[exec output]\n"..out}) - ask_ai calls ctx:append_user(text) instead of raw :append; saves prev_pending so a broker error can restore the buffer for retry PHASE0.md §6: - amended the role-injection paragraph to describe the buffer-and- prepend policy; the §3 invariants list is untouched (this was a §6 design detail, not a locked invariant) Verification: - context unit tests cover: alternation after the failing sequence, multi-shell merge, reset clears buffer, broker-error retry path - live reproduction against `deep` (mistral-nemo) of the exact user-reported sequence succeeds; model responds with a sensible `CMD: ls \| grep '^lor'` instead of a Jinja exception Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:41:21 +00:00
marfrit	a76ff664b3	phase0 amendment: §3/§7/§10 close review-surfaced manifest gaps Three additions to PHASE0.md, all surfaced by the Phase 5 review of the Phase 0 implementation. No invariant changes; manifest now matches implementation reality. §3 — FFI loader fallback paragraph. ffi.load("name") needs the unversioned `libname.so` symlink that comes with the -dev package. Phase 0 loaders try unversioned first then versioned sonames so runtime-only hosts (no -dev) work as-is. Documents the actual behavior in ffi/readline.lua and ffi/curl.lua. §7 — LuaJIT 2.1 popen-close caveat paragraph. The §7 sketch had been showing Lua 5.2's three-return io.popen():close() shape; LuaJIT 2.1 follows the Lua 5.1 ABI and returns just `true`. Phase 0 recovers the exit status with a sentinel echo (`echo __AISH_EXIT_<tag>__$?`). Phase 1 PTY+waitpid replaces the hack and the sketch becomes accurate. Sketch left as-is (it's the right shape conceptually); caveat now explicit. §10 — cwd-relative package.path note. Phase 0 prepends `./?.lua; ./vendor/?.lua`, so aish must run from the repo root. Cwd-independent resolution is a later concern. Also clarifies that --config is strict (no fallback if the path is unopenable) — matches main.lua post the review-followup commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:44:20 +00:00
marfrit	2704edd57d	phase0 amendment: vendor dkjson 2.8 under vendor/ Captures the JSON-library decision noted as open in CLAUDE.md §6. dkjson is pure Lua (preserves §3's "no compiled extensions" invariant), single file, redistributable (MIT/X11). Sourced from Debian's `lua-dkjson` package (/usr/share/lua/5.1/dkjson.lua, version 2.8) — Debian's curated copy of the upstream at dkolf.de. Vendoring (rather than relying on a system lua-dkjson install) keeps aish self-contained per the §3 "no luarocks packages" invariant: any host with luajit can run the tree as-is. PHASE0.md §3 grows one row recording the choice. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 11:30:16 +00:00
claude-noether	e1d1931006	phase0 review: tighten phase 2 row + add Q9, Q10, sharpen Q6 Captures three findings from the review of `013c625` ("phase0 amendment: insert MCP phase 2"). Opening as a PR rather than direct-to-main: the non-PR-flow convention works fine for autonomous work, but feedback- required iteration needs a readable medium that isn't the Claude Code transcript. §11 phase 2 row: spell out two scope items the original row left implicit — the system-prompt rewrite to declare the tools schema (Phase 0's `CMD:` contract is hard-coded into the prompt) and `safety.lua` extension to gate tool calls (per Q8). §13 Q6: explicit note that choosing "retire `CMD:`" requires a §3 invariant amendment in the same commit — keeps the substrate-vs-phase boundary honest. Adds (§3 if retiring) to the impact column. §13 Q9 (new): MCP system-prompt augmentation locus — static block in broker.lua / per-request assembly from connected servers / hybrid. Real architectural call with token-cost tradeoff per option. §13 Q10 (new): tool-call streaming vs the Phase 1 SSE substrate — phase-ordering question. Either Phase 2 lands on the blocking Phase 0 broker and refits when SSE arrives, or Phase 1 SSE moves before MCP so tool-call deltas stream from day one.	2026-05-10 06:06:14 +00:00
marfrit	ca8ff107c7	docs: fix Phase-N references stale after MCP renumber Sweep four call-sites pointing at the wrong phase number: - README.md:19 — Norris mode "Phase 2" → Phase 3 (renumbered by `013c625`) - README.md:62 — safety.lua "Phase 2+" → Phase 3+ (same renumber) - PHASE0.md:58 — safety.lua "(Phase 1)" → (Phase 3) (was wrong pre-013c625 too — referenced Phase 1 when Norris was actually Phase 2) - PHASE0.md:214 — Norris-mode prompt example "(Phase 1)" → (Phase 3) (same pre-existing wrong reference) Caught by review of `013c625`. No semantic change; mechanical phase-number sweep only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 05:43:58 +00:00
marfrit	013c6257f2	phase0 amendment: insert MCP phase 2, renumber subsequent phases MCP/tool-calling lands as a distinct phase, before Norris mode so the autonomous planner has tools as substrate. lmcp speaks MCP standard JSON-RPC 2.0 over HTTP/SSE — fits the existing libcurl FFI plan; tool calls ride the OpenAI-compatible `tools` field on /v1/chat/completions, so the §6 broker contract is unchanged at the transport level. §8: tokenization concern bumped Phase 2 → Phase 3 (still tracks Norris). §11: Norris→3, memory→4, routing→5, tree-sitter→6. §13: Q1/Q2/Q3/Q5 phase numbers tracked the renumber; added Q6 (CMD: vs tools coexistence), Q7 (server discovery), Q8 (tool-call auth gate). No §3 invariant broken. No code touched — Phase 0 implementation per the locked manifest is still the next move. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 05:37:58 +00:00
claude-noether	4310207738	Phase 0: scaffold tree + manifest - README, .gitignore, CLAUDE.md (project conventions) - docs/PHASE0.md — full Phase 0 manifest (locked substrate) - 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented with module-scoped responsibilities matching the manifest - config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b snappy/32k + cloud via OpenRouter through hossenfelder) File names match docs/PHASE0.md §4 exactly. Module bodies fill in across later phases; the tree shape is locked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 23:16:07 +00:00

14 Commits