Status: Formulate -> Analyze (tree at 3bad07b probed).
11 findings (A1-A11), 5/6 open Qs resolved (Q-C4 deferred to baseline):
A1. broker.chat_stream surface clean — usage capture via closure-local
+ on_delta("usage") emission after curl.post_sse returns.
A2. 7 caller sites for opts.category threading (probe / norris /
summarize / main / delegate x2 / memory_summarize).
A3. build_request signature widens to (model_cfg, msgs, stream, opts)
to absorb tools / max_tokens / include_usage / stream_options
without further positional growth.
A4. Q-C3 RESOLVED: free-form categories (caller decides); matches
Phase 6 helpers/skills convention.
A5. Q-C5 RESOLVED: warn fires on the call that crossed (no NEXT-call
delay).
A6. Q-C6 RESOLVED: :reset does NOT clear cost_warn_fired; only
:cost reset clears.
A7. Norris call-graph rewires (commit 955bd82) — secrets streaming
rehydrator wraps only "text" kind; new "usage" kind passes
through unchanged. No new entanglement.
A8. ctx.usage_totals survives :reset (R8 parity with memory_items,
project).
A9. Session JSONL inherits the new field automatically (dkjson
opaque encoding).
A10. Q-C1 PARTIAL: defensive silent skip when provider omits usage.
Real probe required for local model — baseline action.
A11. Q-C4 deferred to baseline (real broker probe).
§2 build_request row updated to mention the A3 refactor.
§11 Open Qs table now shows all 6 with resolutions; only Q-C4
remains as a baseline-time probe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 7 formulate manifest + PHASE0 §11 amendment to add the Phase 7
row (substrate amendment per CLAUDE.md §3, lands in the same commit).
Four pillars:
1. Usage capture in broker.chat_stream — extract `usage` from the
final SSE chunk (OpenAI streaming spec with `stream_options:
{include_usage: true}`). Surface via new on_delta("usage",
payload) kind. broker.chat returns (text, usage) — backward-
compat: existing callers ignore the second value.
2. Per-session accumulator on ctx — ctx.usage_totals[model][category]
tables (categories: main / delegate / summarize / memory_summarize
/ probe / norris, tagged at the call site via opts.category).
:reset preserves usage_totals (R8 parity with memory_items /
project). Session JSONL gains an optional `usage` field on
assistant turns for after-the-fact analysis.
3. :cost meta surface — :cost (summary), :cost detail (per-model +
per-category breakdown), :cost reset (zero the meter). Pure-Lua
read of ctx.usage_totals; no broker calls.
4. Optional warn thresholds — cfg.cost.warn_at_dollars /
warn_at_tokens emit a one-shot status when crossed. Default off;
useful with cloud presets configured.
Doc covers scope + done-when criteria, tech decisions table, module
changes, per-pillar deep dive with code sketches, UX surface, out of
scope, risks, 6 open questions to resolve in analyze.
Open at formulate:
Q-C1 — provider-without-usage handling (local llama.cpp probably)
Q-C2 — cross-session persistence (defer to phase 8)
Q-C3 — categories closed-set vs free-form
Q-C4 — does hossenfelder forward stream_options to all backends?
Q-C5 — warn fires on the call that crosses, or the next one?
Q-C6 — :reset clears cost_warn_fired too, or only :cost reset?
Scope confirmed via AskUserQuestion: cost/usage observability
(chosen over project-local config overlay and session search/tag).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>