docs/PHASE7: formulate — cost / usage observability
Phase 7 formulate manifest + PHASE0 §11 amendment to add the Phase 7
row (substrate amendment per CLAUDE.md §3, lands in the same commit).
Four pillars:
1. Usage capture in broker.chat_stream — extract `usage` from the
final SSE chunk (OpenAI streaming spec with `stream_options:
{include_usage: true}`). Surface via new on_delta("usage",
payload) kind. broker.chat returns (text, usage) — backward-
compat: existing callers ignore the second value.
2. Per-session accumulator on ctx — ctx.usage_totals[model][category]
tables (categories: main / delegate / summarize / memory_summarize
/ probe / norris, tagged at the call site via opts.category).
:reset preserves usage_totals (R8 parity with memory_items /
project). Session JSONL gains an optional `usage` field on
assistant turns for after-the-fact analysis.
3. :cost meta surface — :cost (summary), :cost detail (per-model +
per-category breakdown), :cost reset (zero the meter). Pure-Lua
read of ctx.usage_totals; no broker calls.
4. Optional warn thresholds — cfg.cost.warn_at_dollars /
warn_at_tokens emit a one-shot status when crossed. Default off;
useful with cloud presets configured.
Doc covers scope + done-when criteria, tech decisions table, module
changes, per-pillar deep dive with code sketches, UX surface, out of
scope, risks, 6 open questions to resolve in analyze.
Open at formulate:
Q-C1 — provider-without-usage handling (local llama.cpp probably)
Q-C2 — cross-session persistence (defer to phase 8)
Q-C3 — categories closed-set vs free-form
Q-C4 — does hossenfelder forward stream_options to all backends?
Q-C5 — warn fires on the call that crosses, or the next one?
Q-C6 — :reset clears cost_warn_fired too, or only :cost reset?
Scope confirmed via AskUserQuestion: cost/usage observability
(chosen over project-local config overlay and session search/tag).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -316,6 +316,7 @@ from somewhere else.
|
||||
| **4** | `memory.jsonl` summarization, startup context injection from memory, `:history` management, pruning |
|
||||
| **5** | Multi-model routing by task type, cloud fallback, context summarization via fast model on eviction |
|
||||
| **6** | Tree-sitter syntax highlighting hooks, diff-aware code injection, project-level context (file tree summary) |
|
||||
| **7** | Cost / usage observability: broker captures `usage` + `cost`; per-session accumulator on ctx; `:cost` reporter; optional warn thresholds |
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user