From a9b39cd435899602fa99c33a918fc3e7b3df7f4d Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Wed, 13 May 2026 11:32:20 +0000
Subject: [PATCH] config: Phase 5 routing + summarize-on-evict example (commit #5)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 5 commit #5 (final) per docs/PHASE5.md §11. Documentation-only;
commented-out example showing:
  - routing.auto (per-request auto-routing toggle)
  - routing.classes (class → model mapping; reasoning = nil by default
    per R-N2 cost-safety)
  - routing.fallback (single-hop retry to cloud on transport fail)
  - routing.fallback_model (default "cloud" if uncommented)
  - context.summarize_on_evict + summarizer_model + max_summary_chars
    (shown INSIDE the context = {...} block above)

All defaults OFF — Phase 5 is opt-in across the board. Existing
configs without `routing` or `context.summarize_on_evict` behave
identically to Phase 4.

Phase 5 implementation complete:
  #1 3e57824 router.classify_model + 31-case corpus
  #2 03497b5 context summarize_fn callback + summary block in to_messages
  #3 40ea0b4 repl routing + fallback + summarize_fn wiring + :route/:fallback
  #4 -       (bundled into #3 since meta cmds are trivial additions)
  #5 (this)  config example block

Phase 5 verify-partial:
  - router.classify_model: 31/31 case corpus passes
  - context summarize-on-evict: mock callback fires correctly
    (additive + compress paths), summary suppressed under Norris,
    :reset clears it
  - repl meta cmds: :route on/off/classes/check + :fallback on/off
    all work; :route check reports class + "routing currently
    disabled" suffix when auto is off (N1)

Verify-pending: end-to-end with real broker (route a code question,
see it land on deep; kill local backend, see fallback fire to cloud).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 config.lua | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/config.lua b/config.lua
index ede80da..0a5bb05 100644
--- a/config.lua
+++ b/config.lua
@@ -146,4 +146,49 @@ return {
   --   -- (cloud may have variable cost per session).
   --   summarizer_model = "fast",
   -- },
+
+  -- Phase 5 (docs/PHASE5.md): multi-model routing + cloud fallback +
+  -- summarize-on-evict. OFF by default — auto-routing can spend money
+  -- silently on the cloud preset; require explicit opt-in.
+  --
+  -- routing = {
+  --   -- Enable auto-routing per request. When true, router.classify_model
+  --   -- inspects each prompt and may switch the model for THAT request
+  --   -- only (the :model selection is preserved across requests).
+  --   -- Default false. Toggle at runtime with :route on / :route off.
+  --   auto = true,
+  --
+  --   -- Class → model mapping. nil = "keep current" (heuristic fires
+  --   -- but no override). Ships with reasoning = nil because mapping
+  --   -- "explain ..." prompts to a paid cloud model would spend money
+  --   -- silently — opt in by uncommenting the reasoning line below.
+  --   classes = {
+  --     code = "deep",           -- code-like prompts to local deep
+  --     -- reasoning = "cloud",  -- OPT-IN: "explain"/"why"/"how does" → paid
+  --     -- default = nil,        -- keep active model
+  --   },
+  --
+  --   -- Single-hop retry on transport failure (HTTP 5xx, 408,
+  --   -- 404 model_not_found, DNS, connection refused, timeouts).
+  --   -- Retries against fallback_model once. Skipped if any text
+  --   -- has already streamed (no partial-output duplication).
+  --   -- Toggle at runtime with :fallback on / :fallback off.
+  --   fallback = false,          -- default off (cost-safety)
+  --   fallback_model = "cloud",
+  -- },
+
+  -- ── Phase 5 context summarization on sliding-window eviction.
+  -- Set INSIDE the context = { ... } block above to enable:
+  --   context = {
+  --     max_turns = 40,
+  --     token_budget = 4096,
+  --     summarize_on_evict = true,
+  --     summarizer_model = "fast",    -- model name in models{}
+  --     max_summary_chars = 2000,
+  --   },
+  -- When summarize_on_evict is true, evicted turn pairs are fed to
+  -- summarizer_model and the result lives on ctx.summary, appended to
+  -- the system prompt as [earlier conversation summary]. Suppressed
+  -- in Norris mode (R-C4 — planner stays on its goal). If broker
+  -- fails, falls back to Phase 0 silent eviction (no crash).
 }
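
For reference, a minimal sketch of what config.lua might look like once the commented example is opted into, with the summarization keys merged into the existing context block as the comment instructs. It assumes presets named "deep", "cloud", and "fast" are defined in the models table; every key and value is taken from the commented example in this patch, except fallback = true, which is flipped here purely to illustrate the opt-in. All other existing keys are omitted.

```lua
-- config.lua (sketch, not part of the patch): Phase 5 options enabled.
-- Other existing keys (models, etc.) are omitted here.
return {
  routing = {
    auto = true,                -- classify each prompt; override applies to that request only
    classes = {
      code = "deep",            -- code-like prompts go to the local deep preset
      -- reasoning = "cloud",   -- still opt-in: would send "explain"/"why" prompts to a paid model
    },
    fallback = true,            -- illustration only; the shipped default is false (cost-safety)
    fallback_model = "cloud",   -- target of the single-hop retry on transport failure
  },

  context = {
    max_turns = 40,
    token_budget = 4096,
    summarize_on_evict = true,  -- evicted turn pairs are summarized instead of silently dropped
    summarizer_model = "fast",  -- model name in models{}
    max_summary_chars = 2000,
  },
}
```

Leaving the reasoning line commented keeps "explain"-style prompts on the active model, so with this sketch the only path to the paid preset is the fallback retry.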