From 08dba69fcea7e3ae6671abd2a71b1860db3da050 Mon Sep 17 00:00:00 2001 From: Markus Fritsche Date: Sat, 16 May 2026 23:32:16 +0000 Subject: [PATCH] config + docs/PHASE8: example block + status -> Implement (Phase 8 commit #5) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit config.lua: - Commented-out `tokenize = { use_endpoint = true }` block with parity to the Phase 1-7 example blocks. - Documents the two consequences: (1) per-turn network cost (~30ms first time, cached after) and (2) token_budget is now actually enforced — sessions that fit under char/4 may evict earlier under accurate counts. - Notes cloud /tokenize 404 fallback path. docs/PHASE8.md: - Status header bumped: "Plan + review fold-in" -> "Implement" - Lists the 5 implement commits inline for traceability: 7ef2a6e broker: token_count + endpoint cache 8502517 context: tokenize_fn + _tokens cache db26d0c context: enforce_budget honors token_budget (R2 guard) 94b7d86 repl: wire tokenize_fn + :cost detail estimate row this config example + status bump Phase 8 implementation is complete. Resolves Q1 (PHASE0 §13, originally Phase 3, deferred forward). Next inner-loop step: verify (7) — file test cases, run autonomous, close. Then memory-update (8). Regression: test_safety 87/87, test_router_model 31/31, repl loads. Co-Authored-By: Claude Opus 4.7 (1M context) --- config.lua | 18 ++++++++++++++++++ docs/PHASE8.md | 2 +- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/config.lua b/config.lua index cc3c08b..aea7aa9 100644 --- a/config.lua +++ b/config.lua @@ -284,4 +284,22 @@ return { -- written to session/*.jsonl (assistant-turn `usage` field) for -- after-the-fact scripting; cross-session aggregation deferred -- to a future phase (Q-C2). + + -- Phase 8 (docs/PHASE8.md): accurate tokenization via the broker's + -- /tokenize endpoint, replacing the Phase 0 §8 char/4 heuristic. + -- Two consequences when use_endpoint=true: + -- (1) Context:estimate_tokens hits /tokenize once per + -- new turn (cached on the turn dict thereafter). Network + -- cost is one round-trip (~30ms) per fresh turn; subsequent + -- calls reuse the cache. + -- (2) Context:enforce_budget actually ENFORCES token_budget now + -- (previously only max_turns was checked). Sessions that + -- fit under char/4 may evict earlier — raise token_budget + -- to match your model's real context window if needed. + -- Cloud endpoints (OpenRouter) don't expose /tokenize; capability + -- cached as unsupported on first probe -> silent char/4 fallback. + -- + -- tokenize = { + -- use_endpoint = true, + -- }, } diff --git a/docs/PHASE8.md b/docs/PHASE8.md index f8afe14..fff7daa 100644 --- a/docs/PHASE8.md +++ b/docs/PHASE8.md @@ -2,7 +2,7 @@ **Project:** aish — AI-augmented conversational shell **Document:** Phase 8 Requirements, Architecture & Design Decisions -**Status:** Plan + review fold-in (tree at `aa64ad3`) +**Status:** Implement (5 commits landed: 7ef2a6e, 8502517, db26d0c, 94b7d86, this) **Date:** 2026-05-16 **Review findings (independent Sonnet agent, 2026-05-16) — 2 BLOCKERs