94b7d869262ceff0e0e3da03b841049a1eb56ce4
5 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
1f34b6dce8 |
config + docs/PHASE7: example block + status -> Implement (Phase 7 commit #6)
R9-resolved single-owner of the status bump (commit #5 didn't touch PHASE7.md). N5: PHASE0 §11 amendment landed in commit |
||
|
|
d4c20f09df |
docs/PHASE7: review fold-in — 3 BLOCKERs + 6 CONCERNs + 5 NITs
Sonnet-reviewed (per the reviews-use-sonnet feedback memory).
BLOCKERs (RESOLVED in-place):
R1. M.chat would silently return (text, nil) for ALL non-streaming
callers — 4 of 5 categories (summarize/delegate/memory_summarize/
probe) flow through broker.chat, NOT chat_stream. §4 now shows
the explicit M.chat update that captures kind=="usage" alongside
"text" and returns (text, usage).
R2. call_broker fallback retry would credit usage to the wrong model
name. Fix: broker emits payload.model = model_cfg.model (which IS
the fallback's name when called with fb_cfg — chat_stream's
upvar). Wrapper keys by payload.model, NOT outer model_name. §4
+ §13 commit 3 reflect.
R3. build_request has TWO internal callers inside broker.lua itself,
not just the public surface. Plan §13 commit 1 risk row now
spells this out explicitly so the implementer doesn't read "every
caller already passes opts" as "external-only".
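Illustrative sketch of R1 and R2 (not the code from §4): a non-streaming wrapper that captures kind=="usage" alongside "text" and returns both. The chat_stream stub, its argument order, and the token fields in the payload are assumptions made for this example; only payload.model and the (text, usage) return shape come from the notes above.

```lua
-- Hypothetical stand-in for broker.chat_stream: emits two text deltas and,
-- when asked, one usage event. Argument order and token fields are assumed.
local function chat_stream(model_cfg, msgs, on_delta, opts)
  on_delta("text", "Hello, ")
  on_delta("text", "world")
  if opts and opts.include_usage then
    on_delta("usage", {
      model = model_cfg.model, -- R2: name of the model that actually ran
      prompt_tokens = 12,
      completion_tokens = 2,
    })
  end
end

-- R1: capture "usage" as well as "text" so non-streaming callers
-- no longer get (text, nil) back.
local function chat(model_cfg, msgs, opts)
  local parts, usage = {}, nil
  chat_stream(model_cfg, msgs, function(kind, payload)
    if kind == "text" then
      parts[#parts + 1] = payload
    elseif kind == "usage" then
      usage = payload
    end
  end, opts)
  return table.concat(parts), usage
end

-- Called with a fallback config: usage is keyed by payload.model,
-- not by whatever name the outer caller started with.
local text, usage = chat({ model = "fallback-model" }, {}, { include_usage = true })
print(text, usage.model, usage.prompt_tokens)
```

With include_usage left unset, the same wrapper returns usage == nil, which is the case callers still need to handle.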
CONCERNs (FOLDED):
R4. Single cost_warn_fired flag covers two thresholds — first-to-fire
suppresses the other. Split into ctx.cost_warn_state = { dollars
= false, tokens = false }; :cost reset clears both. §7 + §13.
R5. Warn-check centralization — single _record_usage helper in
repl.lua wraps ctx:add_usage AND does threshold check. safety.lua
routes via helpers.on_usage / opts.on_usage callbacks. context.lua
stays decoupled from renderer.
R6. Preserve nil-vs-0 cost distinction. Accumulator slot gains
`is_local = true` (sticky) when ANY recorded usage had cost==nil.
`:cost detail` annotation comes from is_local flag, not a
fragile cost==0 heuristic.
R7. :cost detail sort needs 3-level deterministic key:
(cost desc, model asc, category asc) — table.sort is unstable.
R8. call_broker fallback passes opts.include_usage unchanged.
Documented as known assumption (B1 confirms both backends
accept; future-broken fallback can pass include_usage=false).
R9. :resume does NOT restore historical usage_totals. Per-turn usage
IS in session JSONL for scripting; cross-session aggregation is
Q-C2 deferred. Documented in §8.
R10. $%.4f loses sub-cent precision (cloud cost 0.000028 -> $0.0000).
Widened to $%.6f in §6 + §7 warn message format.
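A minimal sketch of R7 and R10 together, assuming the :cost detail rows have already been flattened into a list with numeric cost values (the row field names and sample data are illustrative, not taken from repl.lua). The comparator defines a total order, so the result does not depend on table.sort being unstable.

```lua
-- Sample rows; model names, categories, and costs are made up.
local rows = {
  { model = "b-model", category = "main",      cost = 0.000028 },
  { model = "a-model", category = "summarize", cost = 0.000028 },
  { model = "a-model", category = "delegate",  cost = 0.000028 },
  { model = "c-model", category = "probe",     cost = 0.0015 },
}

-- R7: cost desc, then model asc, then category asc.
local function by_cost(a, b)
  if a.cost ~= b.cost then return a.cost > b.cost end
  if a.model ~= b.model then return a.model < b.model end
  return a.category < b.category
end
table.sort(rows, by_cost)

-- R10: four decimals lose a sub-cent cost, six decimals keep it.
local narrow = string.format("$%.4f", 0.000028)
local wide = string.format("$%.6f", 0.000028)

for _, r in ipairs(rows) do
  print(string.format("%-8s %-10s $%.6f", r.model, r.category, r.cost))
end
print(narrow, wide)
```

Rows whose cost is nil (the is_local case from R6) would need to be mapped to a number before sorting; this sketch leaves that out.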
NITs (APPLIED):
N1. §4 pseudocode comment notes `if doc.usage` branch is independent
of choice branch (handles both B2 emission shapes).
N2. §2 stale "B7" reference corrected to B3.
N3. §13 commit 3 row gains explicit dependency note on commit 1's R1.
N4. §13 commit 4 spells out llm_probe -> llm_second_opinion ->
M.is_destructive signature chain widening.
N5. §3 + §13 commit 6 — PHASE0 §11 amendment already in tree
(
|
||
|
|
0f14dc1727 |
docs/PHASE7: plan — §13 commit roadmap
Status: Analyze -> Plan.
Q-C4 was the last open question pending baseline; now resolved per
B1 (stream_options accepted by both backends; required for local).
§13 Implementation Plan added — 6 commits, bottom-up:
1. broker.lua: usage extraction from final SSE chunk; build_request
signature widening to (model_cfg, msgs, stream, opts); on_delta
("usage", payload); chat returns (text, usage); opts.category
passthrough.
2. context.lua: usage_totals + cost_warn_fired fields; add_usage /
total_cost / total_tokens helpers; :reset preserves both.
3. repl.lua: wire opts.category at 5 non-Norris call sites (main,
delegate x2, summarize, memory_summarize); on_delta("usage")
branch routes to ctx:add_usage.
4. safety.lua: wire opts.category for Norris main broker +
is_destructive LLM probe; helpers.on_usage callback convention
(no new module dep — matches #52's scrub_msgs pattern).
5. repl.lua: :cost meta surface + warn-threshold check + HELP.
6. config.lua: commented cost example block + PHASE7.md status
bump to Implement.
Per-commit risk index covers signature-change blast radius, missed
call-site lint, and warn-flag one-shot semantics. Lua's multi-return
semantics keep broker.chat backward-compatible automatically: callers
that bind only the first value are unaffected.
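A small sketch of the multi-return point, using a hypothetical stub in place of the widened broker.chat. It also shows one caveat worth checking during the call-site lint: a call in the final position of a table constructor or argument list expands to all of its return values unless it is wrapped in parentheses.

```lua
-- Hypothetical stub standing in for broker.chat after the widening.
local function chat()
  return "reply", { total_tokens = 14 }
end

local text = chat()            -- older caller: the second value is dropped
local text2, usage = chat()    -- newer caller: binds both values
local packed = { chat() }      -- caveat: final position expands to both values
local only_text = { (chat()) } -- parentheses truncate to exactly one value

print(text, usage.total_tokens, #packed, #only_text)
```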
Two items left open at plan, resolve at implement:
- is_destructive opts.on_usage vs cfg.helpers threading
- per-turn verbose mode (deferred; v1 = :cost on demand only)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f0bccdec48 |
docs/PHASE7: analyze — probe broker surface + resolve Qs in-place
Status: Formulate -> Analyze (tree at |
||
|
|
3bad07b2da |
docs/PHASE7: formulate — cost / usage observability
Phase 7 formulate manifest + PHASE0 §11 amendment to add the Phase 7
row (substrate amendment per CLAUDE.md §3, lands in the same commit).
Four pillars:
1. Usage capture in broker.chat_stream — extract `usage` from the
final SSE chunk (OpenAI streaming spec with `stream_options:
{include_usage: true}`). Surface via new on_delta("usage",
payload) kind. broker.chat returns (text, usage), which is
backward-compatible: existing callers ignore the second value.
2. Per-session accumulator on ctx — ctx.usage_totals[model][category]
tables (categories: main / delegate / summarize / memory_summarize
/ probe / norris, tagged at the call site via opts.category).
:reset preserves usage_totals (R8 parity with memory_items /
project). Session JSONL gains an optional `usage` field on
assistant turns for after-the-fact analysis.
3. :cost meta surface — :cost (summary), :cost detail (per-model +
per-category breakdown), :cost reset (zero the meter). Pure-Lua
read of ctx.usage_totals; no broker calls.
4. Optional warn thresholds — cfg.cost.warn_at_dollars /
warn_at_tokens emit a one-shot status when crossed. Default off;
useful with cloud presets configured.
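Rough sketch of pillars 2 and 4 as described at formulate, under stated assumptions: the slot fields (tokens, cost, calls), the usage.total_tokens / usage.cost payload fields, and the check_warn helper name are invented for illustration. It fires the warning on the call that crosses the threshold, which Q-C5 leaves open, and it treats a missing cost as 0, a simplification that the later review item R6 refines.

```lua
local ctx = { usage_totals = {}, cost_warn_fired = false }

-- Accumulate one usage record into ctx.usage_totals[model][category].
local function add_usage(ctx, model, category, usage)
  local by_model = ctx.usage_totals[model] or {}
  ctx.usage_totals[model] = by_model
  local slot = by_model[category] or { tokens = 0, cost = 0, calls = 0 }
  by_model[category] = slot
  slot.tokens = slot.tokens + (usage.total_tokens or 0)
  slot.cost = slot.cost + (usage.cost or 0) -- simplification: nil counted as 0
  slot.calls = slot.calls + 1
end

-- Sum cost over every model and category.
local function total_cost(ctx)
  local sum = 0
  for _, by_model in pairs(ctx.usage_totals) do
    for _, slot in pairs(by_model) do sum = sum + slot.cost end
  end
  return sum
end

-- One-shot threshold check: returns a message once, then nil.
-- Default off: no cfg.cost.warn_at_dollars means no warning.
local function check_warn(ctx, cfg)
  local limit = cfg.cost and cfg.cost.warn_at_dollars
  if not limit or ctx.cost_warn_fired then return nil end
  local spent = total_cost(ctx)
  if spent >= limit then
    ctx.cost_warn_fired = true
    return string.format("cost warning: $%.6f spent (threshold $%.6f)", spent, limit)
  end
  return nil
end

local cfg = { cost = { warn_at_dollars = 0.05 } }
add_usage(ctx, "cloud-model", "main", { total_tokens = 100, cost = 0.03 })
add_usage(ctx, "cloud-model", "main", { total_tokens = 50, cost = 0.03 })
add_usage(ctx, "local-model", "probe", { total_tokens = 20 })
local first = check_warn(ctx, cfg)
local second = check_warn(ctx, cfg)
print(first, second)
```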
Doc covers scope + done-when criteria, tech decisions table, module
changes, per-pillar deep dive with code sketches, UX surface, out of
scope, risks, 6 open questions to resolve in analyze.
Open at formulate:
Q-C1 — provider-without-usage handling (local llama.cpp probably)
Q-C2 — cross-session persistence (defer to phase 8)
Q-C3 — categories closed-set vs free-form
Q-C4 — does hossenfelder forward stream_options to all backends?
Q-C5 — warn fires on the call that crosses, or the next one?
Q-C6 — :reset clears cost_warn_fired too, or only :cost reset?
Scope confirmed via AskUserQuestion: cost/usage observability
(chosen over project-local config overlay and session search/tag).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|