context: proactive periodic summarization (not just on eviction) #101

Closed
opened 2026-05-17 09:15:32 +00:00 by claude-noether · 1 comment
Collaborator

Motivation

Phase 5 summarize-on-evict only fires when the token budget is exhausted. Small local models effectively use only a fraction of their advertised window (per the #87 analysis), so turns 1..N stay raw in the prompt until eviction kicks in, which is exactly when context bloat hurts most. Strategy 5 of the German-language analysis from 2026-05-17 ("History-Zusammenfassung via local", i.e. history summarization via a local model; 3h, medium ROI) flagged this gap.

Proposal

Proactively summarize older turns on a cadence, before budget pressure forces it:

context = {
  ...
  -- when set, after every N turns (summarize_every_n_turns), fold all but the
  -- most recent M turns (summarize_keep_recent) into ctx.summary
  summarize_every_n_turns = 10,
  summarize_keep_recent   = 4,     -- keep this many at the tail unsummarized
  summarizer_model        = "fast", -- reuses Phase 5 cfg key
}

This reuses Phase 5's existing make_summarize_fn + summarize_fn callback path, so no new module is needed. The only new piece is a trigger, either inside Context:enforce_budget or as a new Context:enforce_cadence, that fires periodically based on turn count rather than budget.
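A minimal sketch of what a turn-count trigger could look like, to make the N/M semantics concrete. Only the cfg keys, ctx.summary, summarize_fn and the Context:enforce_cadence name come from this issue. Everything else is an assumption for illustration: Context.new, add_turn, the turns and folded_upto fields, and the (previous_summary, batch) signature of the callback. It is not the actual context.lua.

```lua
-- Hypothetical stand-in for the real Context in context.lua.
local Context = {}
Context.__index = Context

function Context.new(cfg, summarize_fn)
  return setmetatable({
    cfg = cfg or {},
    summarize_fn = summarize_fn, -- assumed signature: (prev_summary, batch) -> string
    turns = {},                  -- full history, never truncated by the cadence path
    summary = nil,               -- folded text intended for the wire prompt
    folded_upto = 0,             -- assumed field: index of the last turn already folded
  }, Context)
end

function Context:add_turn(text)
  self.turns[#self.turns + 1] = text
end

-- Fires on turn count, not on token budget. Returns true when a fold ran.
function Context:enforce_cadence()
  local n = self.cfg.summarize_every_n_turns
  if not n or n <= 0 then return false end       -- flag unset: no-op
  local count = #self.turns
  if count == 0 or count % n ~= 0 then return false end
  local keep = self.cfg.summarize_keep_recent or 0
  local last = count - keep                       -- newest turn that gets folded
  if last <= self.folded_upto then return false end
  local batch = {}
  for i = self.folded_upto + 1, last do
    batch[#batch + 1] = self.turns[i]
  end
  self.summary = self.summarize_fn(self.summary, batch)
  self.folded_upto = last
  return true
end

-- Usage with a fake summarizer that only counts the turns it was given.
local ctx = Context.new(
  { summarize_every_n_turns = 10, summarize_keep_recent = 4 },
  function(prev, batch)
    return (prev and (prev .. " | ") or "") .. #batch .. " turns"
  end
)
for i = 1, 20 do
  ctx:add_turn("turn " .. i)
  ctx:enforce_cadence()
end
print(ctx.summary, #ctx.turns)
```

With these settings the sketch folds turns 1..6 at turn 10 and turns 7..16 at turn 20, leaving the last four turns raw each time. Keeping folded_upto as an index instead of deleting turns is one way to leave the full turn list available for history display.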

Composition

  • Composes with #87 route-aware compression: even tighter prompts per local call.
  • Composes with Phase 10's Norris suppression rule (Phase 5 R-C4): cadence summarization does NOT fire during Norris (planner stays on goal anchor).
  • No regression: cfg flag unset → existing behavior (eviction-triggered only).

Acceptance

  • 20-turn conversation with summarize_every_n_turns=10 shows ctx.summary populated after turn 10, refreshed at turn 20.
  • :history still shows full turn list (summary is for the wire prompt only).
  • Without the cfg flag, behavior is identical to today.
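The second and third criteria can be illustrated with a small sketch that separates the stored history from what goes on the wire. The helper name build_wire_turns, the "[summary] " prefix and the folded_upto index are hypothetical and chosen only for this example; the real prompt assembly may look different.

```lua
-- Hypothetical helper: builds the wire-prompt turn list from an optional
-- summary plus the turns that have not been folded yet. The history table
-- passed in is only read, never modified.
local function build_wire_turns(turns, summary, folded_upto)
  local out = {}
  if summary then
    out[#out + 1] = "[summary] " .. summary
  end
  for i = (folded_upto or 0) + 1, #turns do
    out[#out + 1] = turns[i]
  end
  return out
end

-- A 20-turn history where turns 1..16 are assumed to be folded already.
local history = {}
for i = 1, 20 do history[i] = "turn " .. i end

local wire = build_wire_turns(history, "earlier discussion", 16)
print(#history, #wire)

-- No summary and nothing folded: the wire list matches the history.
local plain = build_wire_turns(history, nil, 0)
print(#plain)
```

Here the wire list holds one summary entry plus turns 17..20, while the history table still has all 20 turns. With no summary, the output has the same entries as the history, which matches the "identical to today" requirement.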

Effort

~4h. Per-method: extend make_summarize_fn's wiring; add cadence trigger in context.lua; cfg example block.

Author
Collaborator

Implemented in commit a3c1813. Cadence-triggered fold via Context:enforce_cadence; 18 unit cases + E2E on hossenfelder:8082 (2 fires across 5 turns, ctx tokens dropped from ~1000 raw to 180 summarized).
