marfrit/aish - aish - marfrit's space

Author	SHA1	Message	Date
marfrit	1f34b6dce8	config + docs/PHASE7: example block + status -> Implement (Phase 7 commit #6) R9-resolved single-owner of the status bump (commit #5 didn't touch PHASE7.md). N5: PHASE0 §11 amendment landed in commit `3bad07b` (formulate); not re-applied here. config.lua: - Commented-out `cost = { warn_at_dollars, warn_at_tokens }` block with parity to the Phase 1-6 example blocks. - Notes warn flags are independent (R4) and per-turn usage flows to session/*.jsonl for after-the-fact analysis. docs/PHASE7.md: - Status header bumped: "Plan + review fold-in" -> "Implement" - Lists the 6 implement commits inline for traceability: `7364963` broker: usage capture + opts widening `7b4a9be` context: accumulator helpers `8adebd5` repl: _record_usage + opts.category at 5 sites `b30212a` safety + repl: opts.category for Norris + probe `0d6ff93` repl: :cost meta surface (this commit) config example + status bump Phase 7 implementation is complete. Next inner-loop step is verify (7): user-driven smoke tests, then memory-update (8). Regression: test_safety 87/87, test_router_model 31/31, repl loads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 23:02:55 +00:00
marfrit	d4c20f09df	docs/PHASE7: review fold-in - 3 BLOCKERs + 6 CONCERNs + 5 NITs Sonnet-reviewed (per the reviews-use-sonnet feedback memory). BLOCKERs (RESOLVED in-place): R1. M.chat would silently return (text, nil) for ALL non-streaming callers: 4 of 5 categories (summarize/delegate/memory_summarize/probe) flow through broker.chat, NOT chat_stream. §4 now shows the explicit M.chat update that captures kind=="usage" alongside "text" and returns (text, usage). R2. call_broker fallback retry would credit usage to the wrong model name. Fix: broker emits payload.model = model_cfg.model (which IS the fallback's name when called with fb_cfg; chat_stream's upvar). Wrapper keys by payload.model, NOT outer model_name. §4 + §13 commit 3 reflect. R3. build_request has TWO internal callers inside broker.lua itself, not just the public surface. Plan §13 commit 1 risk row now spells this out explicitly so the implementer doesn't read "every caller already passes opts" as "external-only". CONCERNs (FOLDED): R4. Single cost_warn_fired flag covers two thresholds: first-to-fire suppresses the other. Split into ctx.cost_warn_state = { dollars = false, tokens = false }; :cost reset clears both. §7 + §13. R5. Warn-check centralization: single _record_usage helper in repl.lua wraps ctx:add_usage AND does threshold check. safety.lua routes via helpers.on_usage / opts.on_usage callbacks. context.lua stays decoupled from renderer. R6. Preserve nil-vs-0 cost distinction. Accumulator slot gains `is_local = true` (sticky) when ANY recorded usage had cost==nil. `:cost detail` annotation comes from is_local flag, not a fragile cost==0 heuristic. R7. :cost detail sort needs 3-level deterministic key: (cost desc, model asc, category asc), because table.sort is unstable. R8. call_broker fallback passes opts.include_usage unchanged. Documented as known assumption (B1 confirms both backends accept; future-broken fallback can pass include_usage=false). R9. :resume does NOT restore historical usage_totals. Per-turn usage IS in session JSONL for scripting; cross-session aggregation is Q-C2 deferred. Documented in §8. R10. $%.4f loses sub-cent precision (cloud cost 0.000028 -> $0.0000). Widened to $%.6f in §6 + §7 warn message format. NITs (APPLIED): N1. §4 pseudocode comment notes `if doc.usage` branch is independent of choice branch (handles both B2 emission shapes). N2. §2 stale "B7" reference corrected to B3. N3. §13 commit 3 row gains explicit dependency note on commit 1's R1. N4. §13 commit 4 spells out llm_probe -> llm_second_opinion -> M.is_destructive signature chain widening. N5. §3 + §13 commit 6: PHASE0 §11 amendment already in tree (`3bad07b`); commit 6 must NOT re-apply. PHASE7.md now 803 lines (was 528 after plan). +275/-57. Ready for implementation phase pending user gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:55:59 +00:00
marfrit	0f14dc1727	docs/PHASE7: plan - §13 commit roadmap Status: Analyze -> Plan. Q-C4 was the last open question pending baseline; now resolved per B1 (stream_options accepted by both backends; required for local). §13 Implementation Plan added: 6 commits, bottom-up: 1. broker.lua: usage extraction from final SSE chunk; build_request signature widening to (model_cfg, msgs, stream, opts); on_delta ("usage", payload); chat returns (text, usage); opts.category passthrough. 2. context.lua: usage_totals + cost_warn_fired fields; add_usage / total_cost / total_tokens helpers; :reset preserves both. 3. repl.lua: wire opts.category at 5 non-Norris call sites (main, delegate x2, summarize, memory_summarize); on_delta("usage") branch routes to ctx:add_usage. 4. safety.lua: wire opts.category for Norris main broker + is_destructive LLM probe; helpers.on_usage callback convention (no new module dep; matches #52's scrub_msgs pattern). 5. repl.lua: :cost meta surface + warn-threshold check + HELP. 6. config.lua: commented cost example block + PHASE7.md status bump to Implement. Per-commit risk index covers signature-change blast radius, missed call-site lint, and warn-flag one-shot semantics. Lua's multi-return semantics keep broker.chat backwards-compat automatic. Two items left open at plan, resolve at implement: - is_destructive opts.on_usage vs cfg.helpers threading - per-turn verbose mode (deferred; v1 = :cost on demand only) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:50:39 +00:00
marfrit	f0bccdec48	docs/PHASE7: analyze — probe broker surface + resolve Qs in-place Status: Formulate -> Analyze (tree at `3bad07b` probed). 11 findings (A1-A11), 5/6 open Qs resolved (Q-C4 deferred to baseline): A1. broker.chat_stream surface clean — usage capture via closure-local + on_delta("usage") emission after curl.post_sse returns. A2. 7 caller sites for opts.category threading (probe / norris / summarize / main / delegate x2 / memory_summarize). A3. build_request signature widens to (model_cfg, msgs, stream, opts) to absorb tools / max_tokens / include_usage / stream_options without further positional growth. A4. Q-C3 RESOLVED: free-form categories (caller decides); matches Phase 6 helpers/skills convention. A5. Q-C5 RESOLVED: warn fires on the call that crossed (no NEXT-call delay). A6. Q-C6 RESOLVED: :reset does NOT clear cost_warn_fired; only :cost reset clears. A7. Norris call-graph rewires (commit `955bd82`) — secrets streaming rehydrator wraps only "text" kind; new "usage" kind passes through unchanged. No new entanglement. A8. ctx.usage_totals survives :reset (R8 parity with memory_items, project). A9. Session JSONL inherits the new field automatically (dkjson opaque encoding). A10. Q-C1 PARTIAL: defensive silent skip when provider omits usage. Real probe required for local model — baseline action. A11. Q-C4 deferred to baseline (real broker probe). §2 build_request row updated to mention the A3 refactor. §11 Open Qs table now shows all 6 with resolutions; only Q-C4 remains as a baseline-time probe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:49:03 +00:00
marfrit	3bad07b2da	docs/PHASE7: formulate - cost / usage observability Phase 7 formulate manifest + PHASE0 §11 amendment to add the Phase 7 row (substrate amendment per CLAUDE.md §3, lands in the same commit). Four pillars: 1. Usage capture in broker.chat_stream: extract `usage` from the final SSE chunk (OpenAI streaming spec with `stream_options: {include_usage: true}`). Surface via new on_delta("usage", payload) kind. broker.chat returns (text, usage); backward-compat: existing callers ignore the second value. 2. Per-session accumulator on ctx: ctx.usage_totals[model][category] tables (categories: main / delegate / summarize / memory_summarize / probe / norris, tagged at the call site via opts.category). :reset preserves usage_totals (R8 parity with memory_items / project). Session JSONL gains an optional `usage` field on assistant turns for after-the-fact analysis. 3. :cost meta surface: :cost (summary), :cost detail (per-model + per-category breakdown), :cost reset (zero the meter). Pure-Lua read of ctx.usage_totals; no broker calls. 4. Optional warn thresholds: cfg.cost.warn_at_dollars / warn_at_tokens emit a one-shot status when crossed. Default off; useful with cloud presets configured. Doc covers scope + done-when criteria, tech decisions table, module changes, per-pillar deep dive with code sketches, UX surface, out of scope, risks, 6 open questions to resolve in analyze. Open at formulate: Q-C1 - provider-without-usage handling (local llama.cpp probably) Q-C2 - cross-session persistence (defer to phase 8) Q-C3 - categories closed-set vs free-form Q-C4 - does hossenfelder forward stream_options to all backends? Q-C5 - warn fires on the call that crosses, or the next one? Q-C6 - :reset clears cost_warn_fired too, or only :cost reset? Scope confirmed via AskUserQuestion: cost/usage observability (chosen over project-local config overlay and session search/tag). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:47:58 +00:00
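
The review fold-in commit (`d4c20f09df`) describes an accumulator with independent warn flags (R4), a sticky `is_local` marker for usage without a cost (R6), a three-level sort key for `:cost detail` (R7), and a widened `$%.6f` format (R10). The following is a minimal Lua sketch of how those four points could fit together. It is not the code from context.lua or repl.lua: `new_meter`, `check_thresholds`, `detail_rows`, the warning strings, and the `total_tokens` / `cost` payload fields are hypothetical names chosen for illustration.

```lua
-- Hypothetical sketch only; names and payload fields are assumptions.
local Meter = {}
Meter.__index = Meter

local function new_meter(cfg)
  return setmetatable({
    cfg = cfg or {},
    usage_totals = {},                                    -- [model][category] -> slot
    cost_warn_state = { dollars = false, tokens = false }, -- R4: one flag per threshold
  }, Meter)
end

-- Sum cost and tokens over every model/category slot.
function Meter:totals()
  local cost, tokens = 0, 0
  for _, by_cat in pairs(self.usage_totals) do
    for _, slot in pairs(by_cat) do
      cost, tokens = cost + slot.cost, tokens + slot.tokens
    end
  end
  return cost, tokens
end

-- Each threshold fires at most once, independently of the other.
function Meter:check_thresholds()
  local cost, tokens = self:totals()
  local fired, cfg, state = {}, self.cfg, self.cost_warn_state
  if cfg.warn_at_dollars and not state.dollars and cost >= cfg.warn_at_dollars then
    state.dollars = true
    fired[#fired + 1] = string.format("cost warning: $%.6f", cost) -- R10: six decimals
  end
  if cfg.warn_at_tokens and not state.tokens and tokens >= cfg.warn_at_tokens then
    state.tokens = true
    fired[#fired + 1] = string.format("token warning: %d tokens", tokens)
  end
  return fired
end

-- Record one usage payload; returns the warnings fired by this very call.
function Meter:add_usage(model, category, usage)
  local by_cat = self.usage_totals[model] or {}
  self.usage_totals[model] = by_cat
  local slot = by_cat[category] or { tokens = 0, cost = 0, is_local = false }
  by_cat[category] = slot
  slot.tokens = slot.tokens + (usage.total_tokens or 0)
  if usage.cost == nil then
    slot.is_local = true            -- R6: sticky, distinguishes "no cost" from "cost 0"
  else
    slot.cost = slot.cost + usage.cost
  end
  return self:check_thresholds()
end

-- R7: table.sort is not stable, so the comparator spells out every tie-break.
function Meter:detail_rows()
  local rows = {}
  for model, by_cat in pairs(self.usage_totals) do
    for category, slot in pairs(by_cat) do
      rows[#rows + 1] = { model = model, category = category, cost = slot.cost,
                          tokens = slot.tokens, is_local = slot.is_local }
    end
  end
  table.sort(rows, function(a, b)
    if a.cost ~= b.cost then return a.cost > b.cost end      -- cost desc
    if a.model ~= b.model then return a.model < b.model end  -- model asc
    return a.category < b.category                           -- category asc
  end)
  return rows
end

print(string.format("$%.4f", 0.000028))  --> $0.0000
print(string.format("$%.6f", 0.000028))  --> $0.000028
```

In this sketch the threshold check runs inside `add_usage`, so a warning surfaces on the call that crossed the limit, which matches the Q-C5 resolution in the analyze commit. The real design keeps the check in a repl-side helper (R5) so that context.lua stays decoupled from the renderer.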

5 Commits
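
Blocker R1 in the review fold-in commit concerns the non-streaming wrapper: it has to capture the "usage" delta alongside "text" and return (text, usage), and R2 adds that the payload should carry the model that actually served the call. A minimal sketch of that callback shape follows, assuming a stubbed stream in place of the real broker. `fake_chat_stream`, the canned `events` list, and the payload fields are hypothetical; only the ("text" | "usage") delta kinds and the (text, usage) return come from the commit messages.

```lua
-- Hypothetical stand-in for a streaming call: replays canned deltas via on_delta.
local function fake_chat_stream(model_cfg, events, on_delta)
  for _, ev in ipairs(events) do
    if ev.kind == "usage" then
      -- Stamp the payload with the model that served this call (R2),
      -- so a fallback retry is credited to the fallback's name.
      local payload = { model = model_cfg.model }
      for k, v in pairs(ev.payload) do payload[k] = v end
      on_delta("usage", payload)
    else
      on_delta(ev.kind, ev.payload)
    end
  end
end

-- Non-streaming wrapper: collects text and remembers the usage payload, if any.
local function chat(model_cfg, events)
  local parts, usage = {}, nil
  fake_chat_stream(model_cfg, events, function(kind, payload)
    if kind == "text" then
      parts[#parts + 1] = payload
    elseif kind == "usage" then
      usage = payload
    end
  end)
  return table.concat(parts), usage
end

-- Callers written before the change keep working: Lua drops the extra return value.
local text_only = chat({ model = "m" }, { { kind = "text", payload = "ok" } })
print(text_only)  --> ok
```

When the provider omits usage, the second return value is simply nil, which lines up with the "defensive silent skip" noted under A10.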