Commit Graph

14 Commits

Author SHA1 Message Date
marfrit c55077bc07 context + repl + config: route-aware context compression (closes #87)
Small local models effectively use a fraction of their advertised
context window. Per-request compression for routes that hit a
local-compress-flagged model preset: keeps only the last N turns
and tail-truncates oversized content. Cloud routes get the full
context unchanged.

Changes:

- context.lua _compress_turns(turns, keep, max_chars): returns a
  new list (self.turns NEVER mutated) with the last `keep` turns
  preserved + content tail-truncated to `max_chars`. Defensive:
  drops tool turns at the slice head (orphaned without their
  assistant-with-tool_calls anchor — strict chat templates would
  reject them; same gotcha PHASE0 §6 warned about for user/user).

- Context:to_messages(opts) — opts.compress = { keep_turns,
  max_turn_chars } swaps the turn iterable for the compressed
  view. Affects BOTH the use_tool_role=true path and the
  use_tool_role=false fallback (PHASE2.md Q18 strict-template
  workaround). Persistence + display via :history see the full
  uncompressed ctx.turns.

- repl.lua ask_ai: when req_cfg (the routed model's cfg) has
  `local_compress = true`, build compress_opts from
  config.context.compress (defaults keep_turns=2, max_turn_chars=800).
  Pass through ctx:to_messages alongside the existing
  system_prompt_override (#86) — orthogonal opts that compose.

- Norris unaffected: safety.norris_step builds its own messages
  array; the planner needs full history per PHASE3 design.

- config.lua gains a header comment explaining the per-model opt-in
  + the context.compress defaults block + the documented tool-turn
  truncation trade-off.

13 unit cases verified:
  - no opts -> full turn list (no regression)
  - keep_turns=2 -> exactly last 2 emitted
  - long content tail-truncated to max_chars
  - self.turns unchanged after render
  - orphan tool-turn at slice head dropped (no chat-template violation)
  - tool turn included WITH its assistant anchor when keep_turns >= 3

E2E against live local broker:
  - models.fast.local_compress = true; keep_turns=1; max=200
  - 4-turn session: each broker call sees ONLY the current turn
    (verified by short coherent CMD replies despite no cross-turn
    memory available to the model). FR-promised small-model
    friendliness in action; conversation continuity is the
    documented trade-off.

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:50:07 +00:00
marfrit 047d629a66 context + repl + config: per-class system_prompt override (closes #86)
Small local models follow precise structured instructions better than
natural language. Per-routing-class system_prompt override gives them
tighter instructions for THAT request while preserving ambient context.

Changes:

- Context:to_messages(opts) — opts.system_prompt_override REPLACES
  the base system_prompt for THIS render only (state unchanged).
  Dynamic blocks ([background], [project], [earlier summary], NORRIS
  suffix) still compose on top. opts is optional; nil-safe for old
  callers.

- repl.lua ask_ai — captures req_class from router.classify_model
  (already returned by Phase 5; previously discarded after the
  status line). Looks up config.routing.system_prompts[req_class];
  passes as opts.system_prompt_override to ctx:to_messages each
  iteration of the tool-sub-loop.

- Gating: override fires only when routing.auto is on (no class ->
  no override). If system_prompts[class] absent for a class, fall
  through to the default system_prompt (no surprise).

- Norris unaffected: safety.norris_step builds its own messages
  array; doesn't go through this path.

- config.lua gains a commented-out example showing routing.system_
  prompts with the code/default examples from the FR body.

Smoke verified:
  - 12-case context.lua unit test: opts nil/absent/present, override
    replaces base, dynamic blocks still compose, state unchanged
    after call, Norris-mode coexistence (suffix still present;
    background still suppressed).
  - E2E against cloud broker with routing.system_prompts.code set:
    triple-backtick prompt -> code class -> override fires; model
    emits terse code-only output. Non-code prompt -> default class
    -> no override -> normal verbose-ish reply.

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 05:41:15 +00:00
marfrit db26d0ccb7 context: enforce_budget honors token_budget + R2 guard (Phase 8 commit #3)
Pillar 5 (analyze finding A1) — the real value-add of Phase 8.
Until now, ctx.token_budget = 4096 was set but never enforced;
enforce_budget only looked at max_turns. With commit #2's accurate
tokenization wired in (via commit #4), eviction now finally fires
when the actual context fills the budget.

Loop condition change:

  before:
    while #self.turns > self.max_turns do

  after:
    while (#self.turns > self.max_turns
           or self:estimate_tokens() > self.token_budget)
          and #self.turns > 0 do

R2 guard: the `and #self.turns > 0` clause is essential. When
system_prompt alone exceeds token_budget (e.g. a 5000-token [project]
block with token_budget=4096), the OR-condition stays true even
when turns are empty — table.remove on a 0-length list would
no-op forever while evicted++ spins. Sonnet review caught this;
without the guard, real users could hit an infinite loop just
by setting a small token_budget + opening a large project tree.

Per-pair eviction logic (summarize callback + pair-pop) inside the
loop is unchanged. The estimate_tokens call is potentially expensive
under tokenize_fn — commit #2's per-turn cache amortizes to O(N)
per iteration after first fill; for max_turns=40 + budget=4096
sessions the worst case is microseconds per call.

Unit-verified across 5 cases (with and without tokenize_fn):
  1. max_turns eviction unchanged (no behavior regression).
  2. char/4 path: tight budget evicts to 0 when sys > budget,
     exits via R2 guard.
  3. char/4 path: practical budget evicts to a stable count.
  4. tokenize_fn stub: evicts to exactly the (budget - sys)/per-turn
     count.
  5. R2 critical: zero turns + oversize sys -> immediate exit,
     evicted=0, no spin.

Behavior change for existing users: a session that fit under
token_budget=4096 by char/4 (~16K chars) may now evict earlier
because accurate counts are HIGHER for most natural-text inputs
(per baseline B2). Users on cloud presets with very large context
windows (Claude 200K) should raise token_budget to match — see §9
risk row in PHASE8.md.

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:30:37 +00:00
marfrit 8502517021 context: tokenize_fn + per-turn _tokens cache (Phase 8 commit #2)
Foundation for accurate Context:estimate_tokens. When the optional
tokenize_fn is wired (Phase 8 commit #4 wires it from repl.lua),
estimate_tokens uses it with per-turn caching for O(1) amortized
cost. char/4 path unchanged when tokenize_fn nil.

Changes:

- Context.new accepts opts.tokenize_fn -> stored as self.tokenize_fn.

- Context:estimate_tokens:
    if tokenize_fn nil -> existing char/4 (no behavior change).
    if tokenize_fn set ->
      - tokenize self.system_prompt every call (dynamic per
        compose_background/project/summary; can't cache).
      - for each turn: if t._tokens nil -> compute + cache; else
        use cached. Turn content immutable after append (we never
        mutate stored turns) so cache never goes stale.

- :reset wipes self.turns which takes the _tokens cache with them;
  new turns start with t._tokens == nil and lazy-set on first count.

8/8 unit cases verified:
  - char/4 path unchanged when no tokenize_fn
  - tokenize_fn called 1+ N times on first estimate (sys + N turns)
  - subsequent estimates fire only 1 tokenize call (sys; turns cached)
  - new turn fires +1 tokenize call on next estimate
  - :reset + fresh turn fires fresh tokenize call (cache died with turn)

No callers wire tokenize_fn yet — Phase 8 commit #4 lands the
repl.lua wiring (after commit #3 adds the enforce_budget extension
that's the real beneficiary of accuracy).

Regression: test_safety 87/87, test_router_model 31/31.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:29:56 +00:00
marfrit 7b4a9becc2 context: cost/usage accumulator (Phase 7 commit #2)
Adds the per-conversation accumulator that broker.lua's
on_delta("usage", ...) payload feeds into. No callers yet —
commit #3 wires the broker callback to ctx:add_usage in repl.lua,
commit #4 in safety.lua.

Changes:

- Context.new: new fields `usage_totals = {}` and
  `cost_warn_state = { dollars = false, tokens = false }`. R4:
  two independent flags so warn_at_dollars firing doesn't
  suppress warn_at_tokens (or vice versa).

- Context:add_usage(model_name, category, usage):
  Increments usage_totals[model_name][category] slot. R6: when
  usage.cost is nil (local llama.cpp per B3), sets a sticky
  `is_local = true` flag on the slot AND does NOT add to cost
  (preserves the local-vs-cloud-zero distinction for :cost detail
  annotation). When usage.cost is a number (cloud), accumulates.

- Context:total_cost() / total_tokens() — pure-Lua summation
  across all slots; total_tokens returns (prompt, completion).

- Context:reset_usage() — explicit :cost reset path; zeros
  usage_totals AND clears both flags atomically.

- Context:reset() — R8 parity: does NOT clear usage_totals OR
  cost_warn_state. Matches the Phase 4 memory_items / Phase 6
  project rule ("ambient context survives a user-driven
  conversation reset").

Smoke verified (20/20 unit cases):
  - Empty zeros; cloud cost accumulation; local nil-cost preserves
    is_local=true sticky; calls counter; cost summation across
    multiple cloud calls; is_local sticky after a later nil-cost
    call on a cloud slot; separate slots per (model, category);
    :reset preserves; :reset_usage zeros both totals and flags.

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 22:57:56 +00:00
marfrit c4fc7fde01 context: [project] block plumbing (Phase 6 commit #1)
Foundation for Phase 6 — adds the field + composer + composition
order with no callers yet. Nothing sets ctx.project; the meta hookup
and startup auto-inject land in commit #2.

Changes:
  - Context.new gains `project` (string, nil) and `_project_opts`
    (cached scan opts for `:tree refresh`; R7).
  - compose_project(text) helper mirrors compose_background /
    compose_summary. Returns "" for nil/empty; otherwise emits
    "\n\n[project]\n" + text.
  - to_messages inserts compose_project BETWEEN compose_background
    and compose_summary so the model reads memory facts -> project
    tree -> earlier conversation -> NORRIS suffix.
  - Same Norris-suppression guard as the other two dynamic blocks
    (R-C1 / R-C4 parity; planner stays on goal anchor).
  - Context:reset preserves ctx.project (R8 — matches the Phase 4
    memory_items rule; startup-injected facts survive a user-driven
    context reset).

Smoke verified (14/14 inline cases):
  - project nil -> no [project] block in sys_content
  - project set -> block present with contents
  - ordering: [background] < [project] < [earlier conversation summary]
  - norris_active suppresses all three; NORRIS suffix still appears
  - :reset clears turns/pending_exec_output/summary; preserves
    memory_items AND project

Regression: test_safety 87/87, test_router_model 31/31, repl loads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 22:08:54 +00:00
marfrit 03497b5eea context: summarize-on-evict callback + summary block (Phase 5 commit #2)
Phase 5 commit #2 per docs/PHASE5.md §3 / §6.

Context.new opts additions:
  - summarize_fn(prior_summary, evicted_turns) -> string|nil callback
    per R-B1 canonical signature:
      (nil, [turns])  → first-time summarize
      (str, [turns])  → additive: extend prior summary
      (str, nil)      → compress: re-summarize the prior
    nil return → silent eviction (Phase 0 behavior preserved)
  - max_summary_chars (default 2000) — when ctx.summary grows past
    this, the callback is invoked AGAIN with the compress signal
    so the summary stays bounded across long sessions

Context.summary (string|nil) is the rolling summary state. Composed
into the SYSTEM MESSAGE (not as a turns[] entry — A3 resolution
avoids system/system back-to-back). compose_summary() emits:

  [earlier conversation summary]
  <ctx.summary>

between [background] and the NORRIS suffix. Both [background] and
[earlier summary] are SUPPRESSED when ctx.norris_active (R-C4 —
mirrors R-C1 from Phase 4; planner stays focused on its goal).

enforce_budget() rewrite:
  - Collects the evicted pair before removing.
  - Calls summarize_fn(self.summary, pair) under pcall — wraps any
    callback error so a broken summarizer can't crash the REPL.
  - Updates self.summary if callback returned non-empty string.
  - If new summary exceeds max_summary_chars, invokes compress pass
    (callback with evicted=nil).
  - Removes pair from turns (same final state as Phase 0).

Context:reset() clears the summary alongside turns + pending_exec_output.

Smoke-tested with a mock summarizer over a 10-turn context with
max_turns=4 and max_summary_chars=80:
  - 6 turns evicted to bring count down to 4
  - Callback fired 4 times (3 additive + 1 compress when summary
    crossed 80 chars)
  - to_messages includes [earlier conversation summary] block
  - Under norris_active=true, summary suppressed (block absent)
  - :reset clears ctx.summary

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:18:37 +00:00
marfrit c1a5c736ec context: [background] memory injection block (Phase 4 commit #2)
Phase 4 commit #2 per docs/PHASE4.md §5/§12.

ctx.memory_items (array of {kind, content, ...}) loaded by repl.lua
at startup from history.load_memory(). When non-empty AND ctx not in
Norris mode, to_messages() appends a [background] block to the
system prompt:

  [background] (memory.jsonl; manage via :memory)
  - (fact) User prefers terse responses
  - (context) Project: aish (LuaJIT REPL)

Suppression under Norris (R-C1): when ctx.norris_active is true the
[background] block is omitted. Norris already anchors via its
NORRIS suffix carrying the goal; a 2KB background block per planning
iteration would add ~16K tokens of redundant input over an 8-step
run.

Suffix composition order is now:
  1. DEFAULT_SYSTEM_PROMPT (Phase 0 + Phase 2 MCP, statically embedded)
  2. [background] block — when memory_items non-empty AND NOT norris_active
  3. NORRIS MODE block — when norris_active

repl.lua wiring (memory_items population at startup, :memory meta cmds,
:remember shortcut, :memory inject for live refresh) lands in commit #3.

Verified composition order with 4 cases:
  default-only         → 697 chars, no background, no norris
  memory_items only    → 824 chars, background YES, no norris
  memory + norris      → 1451 chars, background NO, norris YES (suppressed)
  norris only          → 1451 chars, background NO, norris YES

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 04:52:42 +00:00
marfrit a404b2a152 repl: Norris driver + \C-n + :norris/:safety meta (Phase 3 commit #5)
Phase 3 commit #5 per docs/PHASE3.md §12. Wires safety.norris_step
(commit #4) into the REPL with the user-facing surface.

ffi/readline.lua extensions (A1 + R-C4):
  - rl_insert_text + rl_redisplay added to ffi.cdef block; M.insert_text
    and M.redisplay wrappers exposed.
  - M.bind: removed `:free()` on previous callback. Now keeps every
    bound callback pinned for process lifetime in `_pinned` list
    (alongside `_bound[seq]` for current lookup). Avoids the
    use-after-free window between unbind and rebind that R-C4 flagged.
    Memory cost is bounded — one closure per key sequence binding.

context.lua Norris suffix (R-C3 / §8):
  - to_messages() composes a dynamic NORRIS MODE block onto the
    system prompt when ctx.norris_active is set. The block carries
    ctx.norris_goal so eviction of the user's "[norris] goal:" turn
    doesn't lose the anchor. Returns to plain system prompt when
    Norris exits.

repl.lua Norris driver:
  - prompt() now shows  marker when ctx.norris_active per PHASE0.md §9.
  - \C-n bound to a real handler — inserts ":norris " at the cursor
    (replaces Phase 1 status placeholder).
  - run_norris(goal) function: sets norris_active + norris_goal,
    appends a "[norris] <goal>" user turn, renders the banner, then
    loops calling safety.norris_step with an injected helpers table
    until a terminal status returns. Renders the closing banner.
  - norris_halt(): the [N] proceed/skip/abort prompt called by
    safety.norris_step via helpers.halt. Empty input → abort (safe).
  - dispatch_tool(): factored from the Phase 2 ask_ai code so
    safety.norris_step can call it.
  - norris_exec(): factored exec path for autonomous mode (skips
    the interactive run_shell cd-status renderer).
  - :norris <goal>  meta — launches autonomous mode
  - :norris off     meta — drops Norris flag (rare; usually 'abort')
  - :safety patterns meta — lists active is_destructive rules
  - :safety check <cmd> meta — probes a hypothetical command

End-to-end mock-driven test:
  Submitted ":norris find files in /tmp" → banner → step 1 emits
  tool_call (auto_approved per policy) → dispatched → frame rendered
  → step 2 emits "GOAL: complete" → sub-loop exits → DONE banner.
  2 broker invocations, no stalls.

config.lua safety example block lands in commit #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:42:14 +00:00
marfrit 7e9cfff04d repl: tool-call sub-loop + :mcp meta + system-prompt augmentation
Phase 2 commit #6 per docs/PHASE2.md §12. End-to-end wiring of the MCP
tool-call flow on top of broker/safety/context/renderer/mcp.

repl.lua additions:
  - mcp_sessions table populated from config.mcp.servers at startup.
    connect_mcp() helper does initialize + caches tools/list. Failures
    status-logged once; absent from mcp_sessions until manual reconnect
    (C4 — no auto-retry).
  - tools_schema() flattens connected sessions' tools into the OpenAI
    {type:"function", function:{name,description,parameters}} shape with
    "<alias>.<name>" namespacing.
  - flatten_content() concatenates content[type="text"] blocks; one-shot
    status warning when non-text blocks (image/resource) are dropped
    (§4 normative spec, v1 only handles text).
  - dispatch_tool_call(name, args_table) splits alias.tool, looks up
    session, calls. Returns (content_string, is_error). Errors of every
    flavor (missing alias, no session, rpc_error, transport_error)
    yield a synthesized "[aish] ..." string so callers always have a
    body for the role:"tool" turn — alternation preserved per C5/C7.
  - ask_ai rewritten as a sub-loop that re-issues the broker request
    until the model returns pure text or max_tool_depth (default 8) is
    hit. Each iteration: stream response → if tool_calls present,
    confirm-gate each → dispatch → append role:"tool" turn → continue.
    Argument-JSON parse failure produces a synthesized tool turn (C7).
    Decline at confirm produces "[aish] tool call declined by user"
    tool turn (alternation guarantee).
  - :mcp meta with sub-commands: list / tools / tool <a.n> / connect
    <url> [alias] / disconnect <alias>. HELP block extended.

context.lua: DEFAULT_SYSTEM_PROMPT grows by ~4 lines per PHASE2.md §8
(hybrid prompt: static frame about MCP + dynamic tools list in the
request body). Block is always present even when no MCP servers
configured — ~60 tokens for clarity that 'CMD:' remains the fallback.

CMD: extraction unchanged — runs on the FINAL pure-text response only
(not on intermediate iterations of the tool sub-loop). Substrate §3
invariant preserved.

End-to-end verified two ways:
  (1) Direct broker probe: aish's tools_schema fed through
      broker.chat_stream against hossenfelder → qwen-1.5b emits one
      tool_call payload with correct id + name="boltzmann.list_dir"
      + args='{"path":"/tmp"}'. Accumulator stitched the JSON-string
      across fragmented deltas.
  (2) Mocked-broker sub-loop test: ask_ai feeds 'list /tmp', mock
      emits text + tool_call, sub-loop dispatches against LIVE
      boltzmann lmcp (auto_approve via policy), 80+ files rendered
      inside the tool_call frame, broker re-invoked with the
      extended context, mock returns pure text, sub-loop terminates.
      Total broker invocations: 2.

Known: the loaded fast model (qwen-1.5b) tends to emit "CMD: ..."
suggestions even when an MCP tool is the better path; the small
model's system-prompt compliance is weak. Larger models and the
analyze-time direct probe confirm the tools_schema and tool_calls
flow is wire-correct — Phase 7 verify will exercise this against
qwen3-30b or cloud models when available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 15:20:42 +00:00
marfrit 7c221a8aae context: tool turns + tool_calls on assistant; use_tool_role fallback
Phase 2 commit #3 per docs/PHASE2.md §12. Three concrete edits per §3
context.lua row (the BLOCKER-fold-in from review):

  (a) Loosen Context:append shape-per-role: assistant may carry empty
      content if tool_calls is non-empty; role:"tool" requires
      tool_call_id + content.

  (b) Preserve tool_calls / tool_call_id on store (Phase 1 :append
      built {role, content} only and silently dropped extras).

  (c) Extend to_messages() with two emission modes selected by
      use_tool_role:
        true  (default) — OpenAI-standard role:"tool" + assistant
          turns with tool_calls (wrapped as {id, type:"function",
          function:{name, arguments}}).
        false (fallback) — collapse assistant-with-tool_calls + its
          following role:"tool" turns into a single assistant text
          turn with synthesized "[tool: name]\n<args>\n[result]\n
          <content>" body; merge consecutive assistant turns so the
          trailing post-tool-result text doesn't yield asst/asst
          back-to-back (same strict-template gotcha PHASE0.md §6
          warned about for user/user).

Alternation assert added (N4): role:"tool" turns must trace back
through zero-or-more prior tool turns to an assistant-with-tool_calls.
Catches sub-loop bugs at append time. Orphan tool turns rejected.

pending_exec_output behavior unchanged per §3 row: buffer persists
across tool-call sub-loops, flushes on next genuine user turn (B4).

Smoke-tested §12 verify-row #3:
  (i)   default mode round-trip — 5 OpenAI-shape messages, tool_calls
        + tool_call_id preserved.
  (ii)  fallback mode round-trip — collapsed into 3 messages
        (system/user/assistant), tool_calls + role:"tool" not emitted.
  (iii) multi-call: 2 tool_calls in one assistant turn followed by 2
        tool replies, both modes render correctly.
  (iv)  orphan tool turn after user — assertion fires.
  (v)   B4: pending_exec_output survives a tool sub-loop, flushes on
        next :append_user.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 13:10:47 +00:00
marfrit 16490e6905 fix: buffer exec output for next user turn; alternation for strict templates
User-test surfaced the bug: with `deep` (mistral-nemo-12b) active,
running `list files` -> y on `CMD: ls` -> `Are there directory entries
beginning with "lor"?` returned a Jinja exception:

    api: ... Error: Jinja Exception: After the optional system message,
    conversation roles must alternate user/assistant/user/assistant/...

Cause: §6 specified "exec output injected into context uses role 'user'
with a prefix tag '[exec output]'." This works for permissive templates
(qwen2.5-coder-1.5b, the `fast` preset) but produces a back-to-back
user/user pair on strict templates that enforce the OpenAI alternation
contract — `[exec output]` user turn followed by the user's actual
follow-up question.

Fix:

context.lua:
  - new field `pending_exec_output` (initially nil)
  - new method `:append_exec_output(out)` buffers (concat on subsequent
    captures so multi-shell-then-ai still merges everything)
  - new method `:append_user(content)` flushes buffered exec output as
    a `[exec output]\n...\n\n` prefix and appends a user turn
  - `:reset()` also clears the buffer

repl.lua:
  - run_shell calls ctx:append_exec_output(out) instead of
    ctx:append({role="user", content="[exec output]\n"..out})
  - ask_ai calls ctx:append_user(text) instead of raw :append; saves
    prev_pending so a broker error can restore the buffer for retry

PHASE0.md §6:
  - amended the role-injection paragraph to describe the buffer-and-
    prepend policy; the §3 invariants list is untouched (this was a §6
    design detail, not a locked invariant)

Verification:
  - context unit tests cover: alternation after the failing sequence,
    multi-shell merge, reset clears buffer, broker-error retry path
  - live reproduction against `deep` (mistral-nemo) of the exact
    user-reported sequence succeeds; model responds with a sensible
    `CMD: ls | grep '^lor'` instead of a Jinja exception

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:41:21 +00:00
marfrit 10848645af context: in-memory turn list + max_turns sliding-window eviction
Phase 0 implementation per PHASE0.md §6, §8.

Context.new(opts) constructs with the §6 default system prompt (the
`CMD: ` extraction contract is hard-coded in there per §3 — locked
substrate, do not edit). opts overrides: system_prompt, max_turns
(default 40), token_budget (default 4096; visibility only in Phase 0
per Q1, deferred to Phase 3 for accurate tokenization).

API:
  ctx:append({role, content})    record a turn
  ctx:to_messages()              [{system,...}, ...turns] for broker.chat
  ctx:enforce_budget()           evict pairs (user+assistant) until
                                 #turns <= max_turns; returns count
  ctx:estimate_tokens()          char/4 heuristic
  ctx:reset()                    drop all turns (system_prompt kept)

System prompt is the §6 phrasing verbatim including the `CMD: ` clause
— stored on the context, NOT in self.turns, so it is prepended freshly
on every to_messages() call.

Smoke covers basic ops, no-evict-at-max, evict-on-overflow, bulk
eviction (14 turns -> 4), reset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:59:25 +00:00
claude-noether 4310207738 Phase 0: scaffold tree + manifest
- README, .gitignore, CLAUDE.md (project conventions)
- docs/PHASE0.md — full Phase 0 manifest (locked substrate)
- 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented
  with module-scoped responsibilities matching the manifest
- config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b
  snappy/32k + cloud via OpenRouter through hossenfelder)

File names match docs/PHASE0.md §4 exactly. Module bodies fill in across
later phases; the tree shape is locked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:16:07 +00:00