context: proactive periodic summarization (closes #101)
Closes #101 (FR-A from the 2026-05-17 German strategy analysis,
small-model improvement strategy 5: "History-Zusammenfassung via local",
i.e. history summarization via local).

Phase 5 summarize-on-evict only fires at budget pressure, exactly when
the local model is already suffering. Small models benefit from tight
context from turn 1, not "after eviction". This commit adds
CADENCE-triggered summarization that fires every N appends regardless of
budget, folding turns older than `summarize_keep_recent` into
ctx.summary via the existing Phase 5 summarize_fn closure.

context.lua additions:
- New ctx fields: summarize_every_n_turns, summarize_keep_recent
  (default 4), _turns_since_summarize (counter).
- Context:append bumps the counter on every store.
- Context:enforce_cadence: the new entry point. Returns the number of
  turns folded (0 on no-op). Guards:
  * disabled (cfg unset OR summarize_fn unset) -> 0
  * not yet due (_turns_since_summarize < N) -> 0
  * Norris-active (Phase 5 R-C4 parity; planner stays on goal) -> 0
  * #turns <= keep_recent (nothing to fold) -> 0
  * summarize_fn returns nil/empty -> 0 (defer to enforce_budget later)
  Orphan-tool guard: when the fold slice would end on an
  assistant-with-tool_calls, peel back the right edge until the next
  live turn isn't role=tool. Strict chat templates reject
  tool-without-assistant-anchor (#87 already encountered this).
- If ctx.summary grows past max_summary_chars after the fold, compress
  in a second pass (same shape as enforce_budget's Phase 5 logic).

repl.lua wiring:
- ctx_opts continues to copy all config.context keys; the new
  summarize_every_n_turns / summarize_keep_recent fields flow through
  automatically.
- make_summarize_fn is now wired when EITHER summarize_on_evict OR
  summarize_every_n_turns is set (same closure, different trigger:
  Phase 5's #51 #issue eviction path uses it on budget; #101 uses it on
  cadence).
- New status_cadence_fold helper:
  "[aish] proactively summarized N older turns".
- ask_ai's existing enforce_budget call site now first fires
  enforce_cadence, then enforce_budget. Cadence comes first so the token
  estimate enforce_budget sees is the tighter post-fold one, which
  avoids spurious eviction of turns we just summarized.
- Norris path NOT wired: enforce_cadence is a no-op there via the
  norris_active guard (consistent with Phase 5 R-C4).

18 inline unit cases for enforce_cadence:
- cfg disabled / no summarize_fn / below cadence -> 0
- cadence met -> exact fold count (N - keep)
- summary contains folded contents; first/last live turn IDs match
- cadence counter resets; second fold fires after another N appends
- Norris-active -> suppressed
- orphan-tool guard: peels back when last folded = asst+tool_calls
- summary compression triggers when over max_summary_chars

E2E verified on hossenfelder:8082, summarize_every_n_turns=4 /
summarize_keep_recent=2:
  5 user turns -> 2 cadence fires:
    [aish] proactively summarized 2 older turns
    [aish] proactively summarized 4 older turns
  :cost detail shows main=5 calls, summarize=2 calls (matches fires).
  Estimated ctx token count: 180 (vs ~1000 unsummarized).
Flag-off path: no status, identical to pre-#101 behavior.

Regression: 87/87 safety, 31/31 router_model, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
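The guard order described for Context:enforce_cadence can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from context.lua: the `Context.new` constructor, the turn table shape (`role`, `content`, `tool_calls`), the `summarize_fn(slice, prior_summary)` signature, and the `concat_summarize` stub are all hypothetical, and the max_summary_chars second pass is left out.

```lua
-- Simplified, hypothetical Context. Field names follow the commit message;
-- the constructor and the turn shape are assumptions for illustration.
local Context = {}
Context.__index = Context

function Context.new(opts)
  opts = opts or {}
  return setmetatable({
    turns = {},
    summary = "",
    summarize_every_n_turns = opts.summarize_every_n_turns,
    summarize_keep_recent = opts.summarize_keep_recent or 4,
    summarize_fn = opts.summarize_fn,
    norris_active = opts.norris_active or false,
    _turns_since_summarize = 0,
  }, Context)
end

function Context:append(turn)
  self.turns[#self.turns + 1] = turn
  self._turns_since_summarize = self._turns_since_summarize + 1
end

-- Returns the number of turns folded into self.summary (0 on no-op).
function Context:enforce_cadence()
  local n = self.summarize_every_n_turns
  if not n or not self.summarize_fn then return 0 end    -- disabled
  if self._turns_since_summarize < n then return 0 end   -- not yet due
  if self.norris_active then return 0 end                -- Norris-active
  local cut = #self.turns - self.summarize_keep_recent   -- fold turns[1..cut]
  if cut <= 0 then return 0 end                          -- nothing to fold
  -- Orphan-tool guard: peel back while the first live turn would be role=tool.
  while cut > 0 and self.turns[cut + 1] and self.turns[cut + 1].role == "tool" do
    cut = cut - 1
  end
  if cut <= 0 then return 0 end
  local slice = {}
  for i = 1, cut do slice[i] = self.turns[i] end
  local text = self.summarize_fn(slice, self.summary)
  if not text or text == "" then return 0 end            -- defer to enforce_budget
  self.summary = text
  local live = {}
  for i = cut + 1, #self.turns do live[#live + 1] = self.turns[i] end
  self.turns = live
  self._turns_since_summarize = 0
  return cut
end

-- Stand-in summarizer (assumption): concatenates contents instead of calling a model.
local function concat_summarize(slice, prior)
  local parts = {}
  if prior ~= "" then parts[#parts + 1] = prior end
  for _, t in ipairs(slice) do parts[#parts + 1] = t.content end
  return table.concat(parts, " ")
end

-- Five user/assistant exchanges with the E2E settings from the message above.
local demo = Context.new({
  summarize_every_n_turns = 4,
  summarize_keep_recent = 2,
  summarize_fn = concat_summarize,
})
local folds = {}
for i = 1, 5 do
  demo:append({ role = "user", content = "u" .. i })
  demo:append({ role = "assistant", content = "a" .. i })
  folds[i] = demo:enforce_cadence()
end
print(table.concat(folds, ","))  --> 0,2,0,4,0
```

With these settings the sketch folds 2 and then 4 turns, matching the two status lines reported in the E2E run. Whether the real implementation resets the counter when summarize_fn returns nil/empty is not stated in the message; the sketch leaves it untouched so the next call retries.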
@@ -361,7 +361,11 @@ function M.run(config)
   if config.context then
     for k, v in pairs(config.context) do ctx_opts[k] = v end
   end
-  if config.context and config.context.summarize_on_evict then
+  -- #101: summarize_fn is also needed for cadence-triggered
+  -- summarization (Context:enforce_cadence). Wire it whenever
+  -- EITHER feature is enabled — same closure, different trigger.
+  if config.context and (config.context.summarize_on_evict
+      or config.context.summarize_every_n_turns) then
     ctx_opts.summarize_fn = make_summarize_fn()
   end
   -- Phase 8 (docs/PHASE8.md): when cfg.tokenize.use_endpoint is true,
@@ -674,6 +678,13 @@ function M.run(config)
     end
   end

+  -- #101: status line for cadence-triggered fold.
+  local function status_cadence_fold(n)
+    if n and n > 0 then
+      renderer.status(("proactively summarized %d older turns"):format(n))
+    end
+  end
+
   -- ── Phase 5: fallback eligibility per PHASE5.md §5 ──────────────────
   -- All transport-failure patterns must match against the err string
   -- as broker.lua emits it (with "transport: " prefix). The matcher
@@ -1146,6 +1157,11 @@ function M.run(config)
       -- loop body re-runs broker.chat_stream with the now-extended ctx
     end

+    -- #101: proactively fold older turns into ctx.summary on
+    -- cadence (when cfg.context.summarize_every_n_turns is set).
+    -- BEFORE enforce_budget so it sees a tighter token estimate
+    -- and doesn't evict turns we just summarized.
+    status_cadence_fold(ctx:enforce_cadence())
     status_evictions(ctx:enforce_budget())

     -- CMD: extraction on the final pure-text response only.
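The wiring condition from the first hunk can be tried in isolation with a small sketch. The helper name `wants_summarize_fn` is hypothetical (the diff inlines the condition inside M.run), and the example config simply reuses the values from the E2E run in the commit message.

```lua
-- Hypothetical helper: mirrors the condition that decides whether
-- ctx_opts.summarize_fn gets wired. Not part of repl.lua.
local function wants_summarize_fn(config)
  local c = config and config.context
  if not c then return false end
  -- Either trigger (budget eviction or cadence) needs the same closure.
  return (c.summarize_on_evict or c.summarize_every_n_turns) and true or false
end

-- Example config: cadence-only, values taken from the E2E run.
local config = {
  context = {
    summarize_every_n_turns = 4,
    summarize_keep_recent = 2,
  },
}

print(wants_summarize_fn(config))                                       --> true
print(wants_summarize_fn({ context = { summarize_on_evict = true } }))  --> true
print(wants_summarize_fn({ context = {} }))                             --> false
print(wants_summarize_fn({}))                                           --> false
```

Because 0 is truthy in Lua, a config with summarize_every_n_turns = 0 would also satisfy this condition; how enforce_cadence treats N = 0 is not stated in the commit.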