docs/PHASE5: analyze — surface clean, summary lives on ctx.summary not turns
A1. router.lua surface clean; classify_model is a natural sibling of
classify. No structural refactor.
A2. broker error message shapes confirmed: all transport errors carry
"transport: " prefix; "api: " for SSE-framed semantic errors;
"broker: " for config bugs. Fallback matcher must strip the prefix
before testing — list of eligible patterns tightened in §5.
A3. Q38 RESOLVED — summary doesn't go in ctx.turns (would create
system/system back-to-back, same gotcha as PHASE0 §6 user/user).
Instead lives on ctx.summary (string) and composes into the
system message between [background] and NORRIS suffix. No new
role:"system" turn; no alternation risk. §3 + §6 reflect.
Module-changes table updated to specify ctx.summary string field +
the to_messages composition order. Storage shape diagram in §6
rewritten.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
+74
-18
@@ -2,9 +2,43 @@
|
||||
|
||||
**Project:** aish — AI-augmented conversational shell
|
||||
**Document:** Phase 5 Requirements, Architecture & Design Decisions
|
||||
**Status:** Formulate (pre-analyze)
|
||||
**Status:** Analyze (formulate complete; current tree at `4453b93` probed)
|
||||
**Date:** 2026-05-13
|
||||
|
||||
**Analyze findings (2026-05-13):**
|
||||
|
||||
A1. **router.lua surface clean** — already a pure-Lua module with
|
||||
`M.classify(line, config) -> (kind, payload)`. Adding
|
||||
`M.classify_model(text, cfg) -> name | nil` is a natural sibling.
|
||||
No structural refactor.
|
||||
|
||||
A2. **broker error message shapes** all carry transport-stage prefixes
|
||||
that the fallback matcher must account for. The actual shapes
|
||||
callers see:
|
||||
"transport: HTTP %d%d%d: <body-snippet>" (from post_sse status>=400)
|
||||
"transport: Timeout was reached"
|
||||
"transport: Couldn't resolve host"
|
||||
"transport: Connection refused"
|
||||
"transport: HTTP response code said error" (rare; from FAILONERROR)
|
||||
"api: <error.message>" (SSE-framed error envelope)
|
||||
"broker: model_cfg.endpoint and .model required" (config bug)
|
||||
Fallback patterns in §5 should match against the "transport: "
|
||||
prefix explicitly. "api: ..." errors don't fall back (they're
|
||||
semantic — bad request shape, not server failure). "broker: ..."
|
||||
errors don't fall back either (config bug).
|
||||
|
||||
A3. **Q38 resolved at analyze** — placing the rolling summary as
|
||||
`turns[1]` with `role:"system"` would produce system/system
|
||||
back-to-back in to_messages output (msg[1] is the composed
|
||||
system prompt; msg[2] would be the summary as another system
|
||||
message). Strict templates may reject this same way they reject
|
||||
user/user (PHASE0 §6). Resolution: render the summary INSIDE the
|
||||
composed system message (same pattern as the [background] and
|
||||
NORRIS blocks). Storage stays simple — keep `_summary` text on
|
||||
`ctx.summary` (NOT in `ctx.turns`), append to the system prompt
|
||||
in `to_messages` alongside the [background] and NORRIS blocks.
|
||||
§6 + §3 reflect.
|
||||
|
||||
PHASE0 is the locked substrate; PHASE1-4 are layered on top. This manifest
|
||||
specifies what Phase 5 adds — **multi-model routing**, **cloud fallback**,
|
||||
and **context summarization on eviction**.
|
||||
@@ -77,7 +111,7 @@ Three pillars per PHASE0.md §11 row 5:
|
||||
| File | State after Phase 4 | Phase 5 changes |
|
||||
|---|---|---|
|
||||
| `router.lua` | `classify(line, config)` → `(kind, payload)` for shell/AI/meta dispatch | Add `M.classify_model(text, cfg) -> name | nil`. Heuristics: line length > N, presence of code-fence backticks, keywords like "traceback", "stacktrace", "explain", "why does", etc. Returns the model NAME (string) or nil = keep current. |
|
||||
| `context.lua` | turns + memory_items + Norris suffix | Extend `enforce_budget()` to invoke a callback (passed in via `Context.new(opts.summarize_fn)`) when about to evict. If callback returns text, prepend a `{role="system", _summary=true, content=...}` turn (replacing prior `_summary` turn if present). If callback returns nil, fall back to silent eviction. The summarize callback itself lives in repl.lua (because it needs broker.chat). |
|
||||
| `context.lua` | turns + memory_items + Norris suffix | Extend `enforce_budget()` to invoke a callback (passed via `Context.new(opts.summarize_fn)`) when about to evict. Store the returned summary as `ctx.summary` (string) — NOT a turn (A3 — avoids system/system alternation). `to_messages` composes it into the system message alongside `[background]` and NORRIS, between them: `system → [background] → [earlier summary] → NORRIS`. New evictions append to `ctx.summary`; when its length exceeds `max_summary_chars` (default 2000), the callback is invoked AGAIN with `(prior_summary, new_evicted_turns)` to re-summarize. Silent eviction is the fallback when the callback returns nil. |
|
||||
| `repl.lua` | tool-sub-loop + meta + memory injection | (a) Pre-broker hook: if `cfg.routing.auto`, call `router.classify_model(text, cfg)` and switch `active_cfg` for THIS request only (revert after). (b) Post-broker error hook: if err matches a fallback pattern AND `cfg.routing.cloud_fallback`, retry against the fallback model once. (c) Wire `Context.new` with a `summarize_fn = function(turns) ... end` closure that calls `broker.chat(cfg.models[cfg.context.summarizer_model], ..., {max_tokens=300})`. |
|
||||
| `broker.lua` | streaming + opts.tools/max_tokens/timeout_ms | Unchanged — Phase 5 composes on top of the existing surface. |
|
||||
| `config.lua` | example with mcp/safety/memory blocks | Add commented-out `routing = {...}` and `context.summarize_on_evict = true` example. |
|
||||
@@ -150,17 +184,23 @@ ONLY when `cfg.routing.cloud_fallback == true`. Otherwise returns false.
|
||||
|
||||
### Fallback-eligible error patterns
|
||||
|
||||
| Pattern | Meaning |
|
||||
All patterns match against the err string AS IT ARRIVES from broker.lua,
|
||||
which is prefixed `"transport: "` for libcurl/HTTP issues (A2 confirmed).
|
||||
The matcher strips the prefix before testing.
|
||||
|
||||
| Pattern (after prefix strip) | Meaning |
|
||||
|---|---|
|
||||
| `HTTP 5%d%d` | server-side error (502 Bad Gateway, 503 Unavailable, 504 Timeout) |
|
||||
| `HTTP 404.*model_not_found` | the routed model isn't loaded on the local backend |
|
||||
| `^HTTP 5%d%d` | server-side error (502 Bad Gateway, 503 Unavailable, 504 Timeout) |
|
||||
| `^HTTP 404.*model_not_found` | the routed model isn't loaded on the local backend |
|
||||
| `Couldn'?t resolve host` | DNS / unreachable local broker |
|
||||
| `Connection refused` | broker not listening |
|
||||
| `Timeout was reached` | broker too slow |
|
||||
|
||||
Errors NOT matched (and therefore NOT retried):
|
||||
Errors NOT matched (NOT retried):
|
||||
- HTTP 401 / 403 (auth failure — won't get better on cloud)
|
||||
- HTTP 400 (bad request — schema issue)
|
||||
- `^api:` errors (semantic — bad request shape)
|
||||
- `^broker:` errors (config bug — endpoint/model missing)
|
||||
- Lua-level errors (broker pipeline bug, not transport)
|
||||
|
||||
---
|
||||
@@ -172,28 +212,44 @@ nil` closure. When set AND `enforce_budget` would evict, the callback
|
||||
is invoked with the evicted slice; the returned summary (if non-nil)
|
||||
replaces the rolling summary turn.
|
||||
|
||||
### Storage shape
|
||||
### Storage shape (post-A3 resolution)
|
||||
|
||||
Index 1 of `ctx.turns` is reserved for the summary turn when present:
|
||||
The rolling summary lives on `ctx.summary` (a string), NOT in `ctx.turns`:
|
||||
|
||||
```lua
|
||||
{ role = "system", _summary = true,
|
||||
content = "[earlier conversation summary]\n<summary text>" }
|
||||
ctx.summary = "Earlier conversation: user discussed X, asked about Y, "
|
||||
.. "agreed to Z. Later asked..."
|
||||
```
|
||||
|
||||
`to_messages()` renders this normally (system role; model sees it).
|
||||
`to_messages()` composes it into the system message between `[background]`
|
||||
and the NORRIS suffix:
|
||||
|
||||
```
|
||||
DEFAULT_SYSTEM_PROMPT
|
||||
|
||||
[background] (memory items)
|
||||
- (fact) ...
|
||||
|
||||
[earlier conversation summary]
|
||||
<ctx.summary>
|
||||
|
||||
[NORRIS MODE] (if active)
|
||||
...
|
||||
```
|
||||
|
||||
No new role:"system" message at turns[1] — avoids system/system alternation.
|
||||
|
||||
### Summary update flow
|
||||
|
||||
1. enforce_budget identifies the oldest 2 turns to evict (user + assistant).
|
||||
2. If `summarize_fn` is set, call it with those 2 turns.
|
||||
2. If `summarize_fn` is set, call it with `(prior_summary, evicted_turns)`.
|
||||
3. If summary text returned:
|
||||
- If `ctx.turns[1]._summary` exists, append to its content
|
||||
(truncate at `max_summary_chars` default 2000 — re-summarize via
|
||||
same callback if exceeded).
|
||||
- Else insert a new `_summary` turn at index 1.
|
||||
- Replace `ctx.summary` with the new text.
|
||||
- If `#ctx.summary > max_summary_chars` (default 2000), invoke the
|
||||
callback once more with `(ctx.summary, {})` to re-summarize for
|
||||
compactness. Lossy by design — Q40 documents this trade-off.
|
||||
4. Remove the evicted turns from `ctx.turns`.
|
||||
5. If callback returned nil → silent eviction (Phase 0 behavior).
|
||||
5. If callback returned nil → silent eviction; `ctx.summary` unchanged.
|
||||
|
||||
### Failure handling
|
||||
|
||||
@@ -261,7 +317,7 @@ Specifically out of Phase 5:
|
||||
| # | Question | Impact | Resolve by |
|
||||
|---|---|---|---|
|
||||
| Q37 | Should routing apply to `:ask <text>` (explicit AI route) the same way it does to bare prompts? Yes seems obvious but worth documenting. | repl.lua | Phase 5 (plan) |
|
||||
| Q38 | Summary turn placement: index 1 (right after a putative system prompt — but to_messages prepends the system prompt fresh each call) vs index 0 in self.turns. Index 1 in self.turns means the system message comes from `to_messages()` and the `_summary` turn lives next as a `role:"system"` message. Strict templates may reject system/system back-to-back. | context.lua | Phase 5 (analyze) |
|
||||
| Q38 | ~~Summary turn placement: index 1 vs index 0~~ | context.lua | **Resolved at analyze (A3)**: NEITHER — summary lives on `ctx.summary` (string) and composes into the SYSTEM MESSAGE alongside [background] and NORRIS suffix. No new role:"system" message; no alternation risk. |
|
||||
| Q39 | When fallback fires AND the user is in Norris mode, does the fallback model also drive the planning loop? Or single-request-only fallback? v1 says single-request only (Norris stays on its configured model). | repl.lua + safety.lua | Phase 5 (plan) |
|
||||
| Q40 | Summarizer recursion: the summary itself might be summarized later when it grows past max_summary_chars. Does the re-summarize lose fidelity? Probably yes; acceptable trade-off. Note the lossy-by-design contract in §6. | context.lua | Phase 5 (verify) |
|
||||
| Q41 | Eligibility patterns for fallback: should HTTP 408 (Request Timeout, distinct from Timeout was reached) be matched? Some servers emit it. Default: yes — pattern `HTTP 408`. Phase 7 verify will adjust. | repl.lua | Phase 7 (verify) |
|
||||
|
||||
Reference in New Issue
Block a user