1a136d81b7
Status: Formulate -> Analyze. 12 findings (A1-A12); 5/6 open Qs
resolved in-place (Q-T5 deferred to baseline).
MAJOR FINDING:
A1. enforce_budget ONLY checks max_turns, NOT token_budget — even
with accurate tokenization, eviction decisions are unaffected.
The new estimate_tokens() would just feed the prompt template
display. Pillar 5 added: enforce_budget evicts when EITHER
max_turns OR token_budget is exceeded. This is the real
motivation for accurate tokenization.
Other findings:
A2. ffi.curl.M.post signature confirmed (body, status) / (nil, err).
A3. Single caller of estimate_tokens today; enforce_budget becomes
the second (more frequent) caller — per-turn _tokens cache
becomes important.
A4. Q-T1: cache lives on turn dict; dies with turns on :reset.
A5. Q-T2: closure captures active_cfg upval; follows :model switch
naturally.
A6. Q-T3: opt-out skips the probe entirely (no wiring).
A7. Q-T6: tools-schema tokens deferred to follow-up (fixed per
session; under-count bounded).
A8. _tokens cache invalidation: only :reset; turn content is
immutable after append.
A9. Probe latency ~50ms/call locally; per-turn cache amortizes to
O(1) after first count.
A10. estimate_tokens called OUTSIDE streaming callback; no race.
A11. role:"tool" turns tokenize identically; per-turn cache works.
A12. include_usage (Phase 7) and tokenize (Phase 8) are orthogonal —
different endpoints, different code paths.
§1 expanded to 5 pillars (pillar 5 = enforce_budget extension).
§3 context.lua row updated to reference the enforce_budget change
+ per-turn _tokens cache. §9 risk row added: accurate counts mean
the default token_budget=4096 is finally ENFORCED — sessions that
spilled silently under char/4 may now evict earlier.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>