config + docs/PHASE8: example block + status -> Implement (Phase 8 commit #5)

config.lua:
  - Commented-out `tokenize = { use_endpoint = true }` block, in the
    same style as the Phase 1-7 example blocks.
  - Documents the two consequences: (1) a per-turn network cost
    (~30ms the first time a turn is counted, cached after) and
    (2) token_budget is now actually enforced, so sessions that
    fit under the char/4 estimate may evict earlier under
    accurate counts.
  - Notes the silent char/4 fallback for cloud endpoints where
    /tokenize returns 404.

docs/PHASE8.md:
  - Status header bumped: "Plan + review fold-in" -> "Implement"
  - Lists the 5 implement commits inline for traceability:
      7ef2a6e  broker: token_count + endpoint cache
      8502517  context: tokenize_fn + _tokens cache
      db26d0c  context: enforce_budget honors token_budget (R2 guard)
      94b7d86  repl: wire tokenize_fn + :cost detail estimate row
      this     config example + status bump

Phase 8 implementation is complete. Resolves Q1 (PHASE0 §13,
originally Phase 3, deferred forward). Next inner-loop step: verify
(7) — file test cases, run autonomous, close. Then memory-update (8).

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:32:16 +00:00
parent 94b7d86926
commit 08dba69fce
2 changed files with 19 additions and 1 deletions
@@ -284,4 +284,22 @@ return {
-- written to session/*.jsonl (assistant-turn `usage` field) for
-- after-the-fact scripting; cross-session aggregation deferred
-- to a future phase (Q-C2).
-- Phase 8 (docs/PHASE8.md): accurate tokenization via the broker's
-- /tokenize endpoint, replacing the Phase 0 §8 char/4 heuristic.
-- Two consequences when use_endpoint=true:
-- (1) Context:estimate_tokens hits <endpoint>/tokenize once per
-- new turn (cached on the turn dict thereafter). Network
-- cost is one round-trip (~30ms) per fresh turn; subsequent
-- calls reuse the cache.
-- (2) Context:enforce_budget actually ENFORCES token_budget now
-- (previously only max_turns was checked). Sessions that
-- fit under char/4 may evict earlier — raise token_budget
-- to match your model's real context window if needed.
-- Cloud endpoints (OpenRouter) don't expose /tokenize; capability
-- cached as unsupported on first probe -> silent char/4 fallback.
--
-- tokenize = {
-- use_endpoint = true,
-- },
}
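
The two consequences described in the config comment can be illustrated with a minimal Lua sketch. This is not the code from the Phase 8 commits: `char4_estimate`, `make_counter`, the standalone `enforce_budget`, and `fake_tokenize` are hypothetical names, the rule that the newest turn is never evicted is an assumption, and the endpoint is replaced by a stub that counts one token per character. It only shows the shape of the behavior: a per-turn cache, a char/4 fallback once the endpoint is found unsupported, and oldest-first eviction against `token_budget`.

```lua
-- Hypothetical sketch; none of these names come from the actual commits.

-- Phase 0 heuristic: roughly one token per four characters.
local function char4_estimate(text)
  return math.ceil(#text / 4)
end

-- Builds a counter that caches its result on the turn table (`_tokens`).
-- `tokenize_fn(text)` is assumed to return a token count, or nil when the
-- endpoint does not expose /tokenize.
local function make_counter(tokenize_fn)
  local supported = tokenize_fn ~= nil -- set to false after a failed probe
  return function(turn)
    if turn._tokens then return turn._tokens end
    local n
    if supported then
      n = tokenize_fn(turn.content)
      if n == nil then supported = false end -- remember: unsupported
    end
    turn._tokens = n or char4_estimate(turn.content)
    return turn._tokens
  end
end

-- Evicts the oldest turns until the total fits under token_budget.
-- Assumption: the most recent turn is always kept.
local function enforce_budget(turns, token_budget, count)
  local total = 0
  for _, t in ipairs(turns) do total = total + count(t) end
  while total > token_budget and #turns > 1 do
    total = total - count(table.remove(turns, 1))
  end
  return total
end

-- Usage: a stub "endpoint" that reports one token per character.
local calls = 0
local function fake_tokenize(text)
  calls = calls + 1
  return #text
end

local turns = {
  { content = string.rep("a", 40) },
  { content = string.rep("b", 40) },
  { content = string.rep("c", 40) },
}
local total = enforce_budget(turns, 100, make_counter(fake_tokenize))
print(total, #turns, calls)
```

With the stub, the three turns count 120 tokens against a budget of 100, so the oldest turn is evicted. Under char/4 the same turns would count 30 and nothing would be evicted, which is the "may evict earlier" effect the comment warns about. The stub is called once per turn; the eviction pass reuses the cached `_tokens` value.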