A1. history.lua surface lines up cleanly for the memory additions —
no structural refactor; pure additive functions mirroring the
session pattern.
A2. Counter persistence: scan at open, cache next_id in handle.
O(n) load (n bounded by curation, ~hundreds), no sidecar file.
Persisted ids let forget-tombstones target items even across
restarts.
A3. System-prompt suffix order locked: DEFAULT (carrying Phase 2 MCP
block baked in) → Phase 4 [background] → Phase 3 NORRIS. Token
cost measured: default ~174 toks, +NORRIS ~364 toks, +NORRIS+2KB
background ~865 toks. Well within typical context budgets.
No manifest amendments needed — §3/§5 already match. Findings recorded
inline as Phase 7 anchors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aish — Phase 4 Manifest
Project: aish — AI-augmented conversational shell
Document: Phase 4 Requirements, Architecture & Design Decisions
Status: Analyze (formulate complete; current tree at bea7175 probed)
Date: 2026-05-13
Analyze findings (2026-05-13):
A1. history.lua surface is clean — M.open/Session:append/
Session:close/M.load/M.list_sessions. The memory functions
can mirror this exactly: M.open_memory/memory:add/
memory:forget/memory:close/M.load_memory. No structural
refactor needed; pure additions.
A2. Counter persistence — scan at open, cache in handle. Phase 1's
session log writes a {"meta":{...}} header on first creation but
doesn't track entry-id (turns aren't numbered). For memory, the
monotonic id is needed for forget-targeting. Cheapest correct
approach: on M.open_memory, read all lines once, find the max
id field present (skipping the meta header if any), cache as
handle.next_id. Subsequent add calls increment in-memory and
persist on the next append. O(n) at open is acceptable since n is
bounded by user curation (~hundreds, not millions). No sidecar.
A3. System-prompt suffix order, post-analyze: actual current
composition is DEFAULT_SYSTEM_PROMPT (which has Phase 2 MCP
guidance already baked-in as a static block) → optional NORRIS
dynamic suffix. The Phase 2 MCP block is NOT computed dynamically
— it's part of DEFAULT_SYSTEM_PROMPT. So Phase 4's [background]
block lives between DEFAULT and NORRIS. Token cost measured:
- DEFAULT: 697 chars (~174 tokens)
- DEFAULT + NORRIS: 1458 chars (~364 tokens)
- DEFAULT + 2KB background + NORRIS: ~3460 chars (~865 tokens)
Within typical 4-8K context budgets.
These findings don't require manifest changes — the §3 module-changes table and §5 injection mechanism already match. Recording the measurements here so verify (Phase 7) has anchors.
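The scan-at-open approach from finding A2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `next_id_from_lines` is a hypothetical helper, and it extracts `id` with a Lua pattern rather than decoding each line with the vendored dkjson, which a real `M.open_memory` would more likely do.

```lua
-- Hypothetical helper (not part of history.lua): derive the next memory id
-- from the raw JSONL lines read once at open time.
-- Assumption: every item is a flat JSON object with a top-level integer "id".
local function next_id_from_lines(lines)
  local max_id = 0
  for _, line in ipairs(lines) do
    -- Skip an optional {"meta":{...}} header line.
    if not line:match('^%s*{%s*"meta"') then
      local id = line:match('"id"%s*:%s*(%d+)')
      if id then
        max_id = math.max(max_id, tonumber(id))
      end
    end
  end
  -- An empty or missing file yields 1 as the first id.
  return max_id + 1
end
```

Tombstone lines carry their own `id`, so they advance the counter as well; the result would be cached as `handle.next_id` and incremented in memory on each add.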
PHASE0 is the locked substrate; PHASE1, PHASE2, PHASE3 are layered on top. This manifest specifies what Phase 4 adds — cross-session memory — and the user-facing surface for managing it.
1. Scope of Phase 4
Three pillars per PHASE0.md §11 row 4:
- `memory.jsonl` persistent store: a single append-only file (`<config.history.dir>/memory.jsonl`) carrying user-curated facts, preferences, and project context that survive aish restarts. It uses the same storage convention as session logs but lives in a separate file, because its read pattern (load at startup) and write pattern (curated only) differ from those of session logs (append every turn).
- Startup context injection: at REPL boot, recent memory items are loaded into the live `Context` so the model sees them on the very first turn. Injection is bounded (token budget) and visible to the user via `:memory list`.
- `:memory` management surface + automatic candidate extraction: meta commands for `add`, `list`, `forget`, `clear`, plus an opt-in summarizer that runs on demand in v1 (running at session end is Q-list material; see §2) and extracts candidate facts from the session log for the user to triage into memory.
Phase 4 is done when:
- `:remember <text>` (alias for `:memory add fact <text>`) writes a line to `memory.jsonl` and the next REPL boot sees it in context.
- `:memory list` shows current memory items with their IDs and ages.
- `:memory forget <id>` removes one item; `:memory clear` removes all (with confirm).
- At startup, the top-N most recent memory items are injected into the Context as a single `[background]` block (configurable cap).
- `:memory summarize` runs the active model over the current session log and proposes candidate memory items; the user accepts/rejects per-candidate via prompt.
- Existing configs without a `memory` section behave exactly like Phase 3 (no startup injection, no auto-summarize).
2. Technology Decisions (delta from Phase 3)
| Decision | Choice | Rationale |
|---|---|---|
| Storage format | Append-only JSONL, one item per line | Same convention as Phase 1's session logs. Greppable, robust to truncation, no parser dependency beyond vendored dkjson. |
| Storage location | `<config.history.dir>/memory.jsonl` (sibling to `sessions/`) | Co-located with session logs; users can back up one directory. Defaults to `~/.local/share/aish/memory.jsonl`. |
| Memory-item shape | `{id, ts, kind, content, tags?, source?}` | `id` is a monotonic int (next id recovered by scanning for the max id at open, per analyze finding A2; no sidecar file); `kind` ∈ {"fact","pref","context"}, lightly typed for future routing; `content` is the body text; optional `tags` array; optional `source` carrying session-id provenance when auto-extracted. |
| Forget semantics | Append a tombstone, don't rewrite the file (`{id, ts, kind:"forget", target:<other_id>}`) | Append-only preserves history. `M.load_memory` resolves tombstones during read, silently dropping any item whose id appears as a forget-target. `:memory clear` writes one tombstone per active item; could also support a wildcard forget. |
| Auto-summarize cadence | Manual only in v1 (`:memory summarize`). Auto-trigger on `:quit` or by token count is Q-list material. | Conservative; users opt in. Avoids burning tokens on every session end. Manual surface lets the user QA candidates before they land. |
| Summarizer model | The `fast` preset in the example config (cheap; quality good enough for extraction); configurable via `cfg.memory.summarizer_model`, falling back to the active model when unset | Summarization favors recall over precision: the fast model's tendency to err on the side of inclusion is fine because the user filters per-candidate. |
| Startup injection mechanism | A new dynamic block on the system prompt, appended by `context.to_messages()` when `ctx.memory_items` is non-empty | Same hybrid-prompt pattern as Phase 2's MCP block and Phase 3's NORRIS suffix. No new context structure beyond a list on the Context. |
| Injection budget | `cfg.memory.inject_max_chars` (default 2000 chars total, roughly 500 tokens) | Cap so memory doesn't eat the whole context. Most-recent-first selection by `ts` if items exceed budget. |
| Pruning policy | Manual `:memory forget` + optional `cfg.memory.prune_older_than_days` (default unset: no auto-pruning) | Conservative defaults; user owns the lifecycle. |
| Interaction with sessions | `memory.jsonl` is independent of `sessions/*.jsonl`. Session JSONL stays the per-conversation log; memory is the curated cross-session knowledge | Distinct concerns. Session log answers "what did we talk about last Tuesday?"; memory answers "what does aish know about me/this-project?". |
| Concurrency | Single-writer assumed (one aish process per memory dir). Reader is the same process | Same assumption as session logs. Multi-process memory sharing is out of scope. |
3. Module Changes
| File | State after Phase 3 | Phase 4 changes |
|---|---|---|
| `history.lua` | `M.open(path, meta)`, `session:append(turn)`, `M.load(path)`, `M.list_sessions(dir)` | Add memory functions alongside session functions: `M.open_memory(path) -> memory_handle`; `memory:add(kind, content, tags?, source?) -> id`; `memory:forget(id)`; `M.load_memory(path) -> items_table` (resolves tombstones). `memory_handle` has a similar shape to `session_handle`: internal fd + monotonic counter. |
| `context.lua` | system prompt + MCP block + NORRIS suffix toggle | Add a `memory_items` field on Context. `to_messages()` composes a dynamic "[background]" block on the system prompt when `memory_items` is non-empty AND not already in Norris mode (don't double-pile). Cap respected via the `inject_max_chars` budget. |
| `repl.lua` | meta cmds + tool sub-loop + Norris driver | New meta: `:remember <text>` (shortcut for `:memory add fact <text>`); `:memory add <kind> <text>`; `:memory list`; `:memory forget <id>`; `:memory clear`; `:memory summarize`. At startup, after loading config + opening session, also open memory handle and inject the top-N items into `ctx.memory_items`. |
| `broker.lua` | streaming chat + `opts.tools`/`max_tokens`/`timeout_ms` | No structural changes. Used by the summarizer (calls `broker.chat` with the session log as a single user turn). |
| `config.lua` | example with mcp + safety blocks | Add commented-out `memory = { ... }` example. Default behavior is "no memory injection, no auto-summarize". |
| `executor.lua` | unchanged | unchanged |
| `safety.lua` | `is_destructive` + `norris_step` | unchanged |
No new module files. All Phase 4 functionality grows existing files —
mostly history.lua and repl.lua.
4. memory.jsonl Format
```json
{"id":1,"ts":"2026-05-13T19:01:01Z","kind":"fact","content":"User prefers terse responses; no end-of-turn summaries."}
{"id":2,"ts":"2026-05-13T19:01:35Z","kind":"pref","content":"Default to :model deep for code reasoning tasks."}
{"id":3,"ts":"2026-05-13T19:02:00Z","kind":"context","content":"Current project: aish (LuaJIT REPL with MCP tools).","tags":["aish","luajit"]}
{"id":4,"ts":"2026-05-13T20:00:00Z","kind":"forget","target":2}
```
After load_memory, item id=2 is dropped because of the tombstone.
Active items: 1, 3.
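The set-based resolution described above can be sketched on already-decoded entries. This is an illustrative sketch only: `resolve_tombstones` is a hypothetical name, and it assumes each JSONL line has already been decoded into a Lua table (for example with dkjson).

```lua
-- Hypothetical helper: drop tombstones and the items they target.
-- Two passes make the result independent of line order, so a tombstone
-- may appear before its target (e.g. in a hand-edited file).
local function resolve_tombstones(entries)
  local forgotten = {}
  for _, e in ipairs(entries) do
    if e.kind == "forget" and e.target ~= nil then
      forgotten[e.target] = true
    end
  end

  local active = {}
  for _, e in ipairs(entries) do
    if e.kind ~= "forget" and not forgotten[e.id] then
      active[#active + 1] = e
    end
  end
  return active
end
```

Applied to the four example lines above, this keeps the items with ids 1 and 3.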
kind values
- `fact`: factual statement about the user, their environment, or project state.
- `pref`: user preference for aish behavior (response style, default model, etc.).
- `context`: project / domain context that helps the model orient on common tasks.
- `forget`: tombstone; refers to another id via `target`.
v1 is lightly typed — the model sees all kinds identically as a flat
list in the [background] block. Future phases may route them
differently (e.g. pref into a system-prompt section, context into
a user-style preamble). Today they're prose.
5. Startup Injection
When aish boots and cfg.memory is present (or memory.jsonl exists):
- `history.load_memory(path)` reads all items, applies tombstone resolution, returns active items sorted by `ts` descending (most recent first).
- Take items until `cfg.memory.inject_max_chars` (default 2000) is consumed. Older items are dropped from injection (still in the file).
- Store on `ctx.memory_items` as an array of `{kind, content}` (id and ts not needed at render-time).
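A minimal sketch of the budgeted selection follows. `select_for_injection` is a hypothetical name, and two details are assumptions rather than decisions in this manifest: the budget counts only the `content` length, and selection stops at the first item that does not fit instead of skipping it.

```lua
-- Hypothetical helper: pick the most recent active items that fit the
-- character budget, returning the {kind, content} shape used at render time.
local function select_for_injection(items, max_chars)
  -- Sort a copy so the caller's table is left untouched.
  local sorted = {}
  for i, item in ipairs(items) do sorted[i] = item end
  table.sort(sorted, function(a, b)
    -- ISO-8601 UTC timestamps in one format compare correctly as strings.
    if a.ts ~= b.ts then return a.ts > b.ts end
    return a.id > b.id -- tie-break on id: newer ids first
  end)

  local picked, used = {}, 0
  for _, item in ipairs(sorted) do
    local cost = #item.content
    if used + cost > max_chars then break end
    used = used + cost
    picked[#picked + 1] = { kind = item.kind, content = item.content }
  end
  return picked
end
```

With the default budget of 2000 characters, a store of a few short items would be injected in full; only larger stores lose their oldest entries from the prompt.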
context.to_messages() composition:

```
<DEFAULT_SYSTEM_PROMPT>
<Phase 2 MCP block>
[background] (memory loaded at startup; managed via :memory)
- (fact) User prefers terse responses; no end-of-turn summaries.
- (context) Current project: aish (LuaJIT REPL with MCP tools).
```
Order of suffixes on the system prompt:
- Default Phase 0 prompt
- Phase 2 MCP guidance block (always present)
- Phase 4 [background] block (when memory_items non-empty)
- Phase 3 NORRIS MODE block (when norris_active)
Norris is last so its instructions take precedence when active.
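The suffix ordering above could be composed roughly as below. This is a sketch, not the actual `context.to_messages()`: `compose_system_prompt` and its parameters are hypothetical, the MCP guidance is assumed to already be part of the default prompt string, and blocks are assumed to be separated by a blank line.

```lua
-- Hypothetical helper: stack the system-prompt blocks in the documented
-- order: default prompt, then [background], then the Norris suffix.
local function compose_system_prompt(default_prompt, memory_items, norris_suffix)
  local parts = { default_prompt }

  if memory_items and #memory_items > 0 then
    local lines = { "[background] (memory loaded at startup; managed via :memory)" }
    for _, item in ipairs(memory_items) do
      lines[#lines + 1] = string.format("- (%s) %s", item.kind, item.content)
    end
    parts[#parts + 1] = table.concat(lines, "\n")
  end

  -- Norris goes last so its instructions take precedence when active.
  if norris_suffix then
    parts[#parts + 1] = norris_suffix
  end

  return table.concat(parts, "\n\n")
end
```

Passing an empty `memory_items` list and no Norris suffix returns the default prompt unchanged, which matches the Phase 3 behavior for configs without memory.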
6. :memory summarize (Manual Auto-Extraction)
:memory summarize triggers the active model (or
cfg.memory.summarizer_model if set) to read the current session's
turns and propose candidate memory items.
Flow
- Build a prompt: "Read the following conversation transcript. Extract facts, preferences, or context worth remembering across future sessions. Output ONE candidate per line, prefixed with the kind: `fact: …`, `pref: …`, or `context: …`. Maximum 10 candidates."
- Send `ctx:to_messages()` minus the [background] suffix (avoid feedback) + the user prompt above.
- Parse the response line-by-line for `(fact|pref|context): <content>` shapes.
- For each candidate, prompt the user: `[memory] candidate (fact): User prefers terse responses; no end-of-turn summaries. keep? [y/N/edit]`
  - `y` → write to memory.jsonl.
  - `N` (or empty) → drop.
  - `edit` → readline-edit the content before write.
- Status when done: `[aish] memory: added N candidates.`
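The parse step can be sketched with plain Lua patterns. Lua patterns have no alternation, so this sketch captures any leading word and then checks it against a set of kinds. `parse_candidates` is a hypothetical name; accepting leading whitespace and upper-case kinds is an added leniency, not something the manifest specifies.

```lua
-- Hypothetical helper: extract "kind: content" candidates from the
-- summarizer's reply, tolerating an optional "-" or "*" bullet.
local KINDS = { fact = true, pref = true, context = true }

local function parse_candidates(response, max_candidates)
  max_candidates = max_candidates or 10
  local out = {}
  for line in response:gmatch("[^\r\n]+") do
    -- optional indent, optional bullet, word, colon, trimmed content
    local kind, content = line:match("^%s*[-*]?%s*(%a+)%s*:%s*(.-)%s*$")
    if kind then kind = kind:lower() end
    if kind and KINDS[kind] and content ~= "" then
      out[#out + 1] = { kind = kind, content = content }
      if #out >= max_candidates then break end
    end
  end
  return out
end
```

Lines that do not start with a known kind, such as a preamble sentence from the model, are ignored rather than treated as errors.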
Why manual not automatic in v1
An auto-summarize that runs at every `:quit` has three drawbacks:
- it is expensive (tokens on every exit)
- it drifts over time if the model picks up noise
- it competes with the user's intentional `:remember <text>` curation
Manual gives the user the trigger. Q-list tracks auto-cadence options.
7. Meta Commands (Phase 4 additions)
| Command | Action |
|---|---|
| `:remember <text>` | Shortcut for `:memory add fact <text>` |
| `:memory add <kind> <text>` | Append a memory item (kind ∈ fact, pref, context) |
| `:memory list` | Show all active memory items (id + ts + kind + content) |
| `:memory forget <id>` | Append a tombstone for `<id>` |
| `:memory clear` | Forget all active items (with `[y/N]` confirm) |
| `:memory summarize` | Extract candidate items from current session via LLM |
| `:memory inject` | Re-inject current memory.jsonl items into Context (after edits) |
:help updated.
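One way the meta dispatch might parse these commands is sketched below. `parse_memory_command` and the returned table shape are hypothetical; the real `repl.lua` dispatch may be structured quite differently.

```lua
-- Hypothetical helper: turn a raw REPL line into a memory action table,
-- or return nil plus a message when the line is not a valid memory command.
local VALID_KINDS = { fact = true, pref = true, context = true }
local SIMPLE = { list = true, clear = true, summarize = true, inject = true }

local function parse_memory_command(line)
  -- :remember <text> is a shortcut for :memory add fact <text>
  local text = line:match("^:remember%s+(.+)$")
  if text then
    return { action = "add", kind = "fact", content = text }
  end

  local sub, rest = line:match("^:memory%s+(%a+)%s*(.-)%s*$")
  if not sub then return nil, "not a memory command" end

  if sub == "add" then
    local kind, content = rest:match("^(%a+)%s+(.+)$")
    if not (kind and VALID_KINDS[kind]) then
      return nil, "usage: :memory add <fact|pref|context> <text>"
    end
    return { action = "add", kind = kind, content = content }
  elseif sub == "forget" then
    local id = rest:match("^(%d+)$")
    if not id then return nil, "usage: :memory forget <id>" end
    return { action = "forget", id = tonumber(id) }
  elseif SIMPLE[sub] then
    return { action = sub }
  end
  return nil, "unknown :memory subcommand: " .. sub
end
```

Returning a plain action table keeps parsing separate from side effects, so the `[y/N]` confirm for `:memory clear` can live in the caller.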
8. Configuration Schema (Phase 4 example block)
```lua
memory = {
  -- Path defaults to <history.dir>/memory.jsonl. Override per fleet
  -- if you want shared memory (read-only is safer than write-shared).
  -- path = (history.dir or "~/.local/share/aish") .. "/memory.jsonl",

  -- Cap on how much memory content is injected into the system prompt
  -- at startup. Roughly 2000 chars ≈ 500 tokens. Older items are
  -- dropped from injection if exceeded; they remain in the file.
  inject_max_chars = 2000,

  -- Which model to use for :memory summarize. Defaults to the active
  -- model when nil. Use "fast" for speed; "deep" for better quality.
  summarizer_model = "fast",

  -- Auto-prune items older than N days at startup. nil = never auto-prune.
  -- Manual :memory forget always works regardless.
  -- prune_older_than_days = 90,
}
```
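How these defaults might be resolved at startup is sketched below. `resolve_memory_config` is a hypothetical helper, and treating a missing `memory` section as "memory disabled" is an assumption drawn from the Phase 3 compatibility requirement rather than a settled rule (§5 also mentions an existing memory.jsonl as a trigger).

```lua
-- Hypothetical helper: fill in memory defaults from a loaded config table.
-- Returns nil when there is no memory section, i.e. Phase 3 behavior.
local function resolve_memory_config(cfg)
  if type(cfg.memory) ~= "table" then return nil end

  local history_dir = (cfg.history and cfg.history.dir) or "~/.local/share/aish"
  local m = cfg.memory
  return {
    path = m.path or (history_dir .. "/memory.jsonl"),
    inject_max_chars = m.inject_max_chars or 2000,
    -- nil means "use the active model" for :memory summarize
    summarizer_model = m.summarizer_model,
    -- nil means "never auto-prune"
    prune_older_than_days = m.prune_older_than_days,
  }
end
```

Expanding `~` in the resulting path is left to whatever path handling the session log code already uses.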
9. Migration from Phase 3
User-visible:
- `:remember`, `:memory list / forget / clear / summarize` are new meta commands.
- A `[background]` block in the system prompt appears when memory items exist.
- Existing configs without `memory = {...}` continue to work: no injection, no auto-summarize. Phase 3 behavior intact.
Substrate (PHASE0.md §3) invariants: unchanged.
The [background] system-prompt suffix is composed dynamically by
context.to_messages() (same pattern as Phase 2 MCP block and Phase 3
NORRIS suffix). No new substrate contract.
10. Out of Scope (Phase 4)
Per PHASE0.md §11 these belong to later phases:
- Multi-model routing / cloud fallback (Phase 5).
- Tree-sitter syntax highlighting (Phase 6).
Specifically out of Phase 4 scope despite proximity:
- Multi-process memory sharing (single-writer assumed v1).
- Retrieval-augmented injection (RAG over memory.jsonl) — v1 just LRU.
- Auto-trigger of `:memory summarize` at `:quit` (Q-list).
- Memory categories beyond fact/pref/context (minimal typing in v1).
- Cross-aish-instance memory sync (memory.jsonl in a synced dir works coincidentally; not designed for it).
- Encryption at rest — same posture as session logs (none in v1).
11. Open Questions
| # | Question | Impact | Resolve by |
|---|---|---|---|
| Q31 | Auto-summarize trigger: manual only (current), automatic at `:quit`, automatic on token-budget eviction, or config-flagged threshold? | history.lua + repl.lua | Phase 4 (analyze) |
| Q32 | Editing memory items in place: `:memory edit <id>` to rewrite content? Append-only means edit = new id + forget old. Worth the extra meta? | history.lua + UX | Phase 4 (analyze) |
| Q33 | Memory injection while in Norris mode: does the [background] block stay, get suppressed, or merge with the Norris goal? Proposal: keep both; Norris is the last block and dominates. | context.lua | Phase 4 (plan) |
| Q34 | Memory kinds: stick with fact/pref/context or split prefs into a dedicated section of the system prompt (where they're more impactful)? v1 says no: flat list. | context.lua + UX | Phase 5 if it bites |
| Q35 | Privacy / redaction: `:memory summarize` could capture sensitive tokens from a chat (passwords, paths). Should it auto-redact? Strip command-history-style? | safety.lua + memory.lua | Phase 4 (verify); review user-emergent risk |
| Q36 | Memory deduplication: user adds the same fact twice. Detect and warn, dedupe silently, or allow? v1: allow (cheap; user can `:memory list` to spot). | history.lua | Phase 4 (verify) |
12. Implementation Plan (commit-by-commit)
Bottom-up, same cadence as Phase 0/1/2/3. Five commits expected:
- `history.lua`: memory store. Add `M.open_memory`, `memory:add(kind, content, tags?, source?)`, `memory:forget(id)`, `M.load_memory(path)` with tombstone resolution. Persistent monotonic counter by scanning the JSONL for the max id at open time (chosen at analyze, finding A2; no sidecar `memory.id` file). Test in isolation: round-trip add/forget/load against a temp file.
- `context.lua`: memory injection. Add `ctx.memory_items` and the `[background]` block composer in `to_messages()`. Cap by `inject_max_chars`. Test in isolation: assert composition order (MCP → background → Norris); cap honored.
- `repl.lua`: `:remember` + `:memory list / add / forget / clear / inject`. At startup, after MCP setup, open the memory handle and load the most recent items within budget. Hook the meta dispatch. No summarize yet. End-to-end: run aish, `:remember X`, `:quit`, restart, `:memory list` shows X, `:history` shows X in [background].
- `:memory summarize`: manual extraction. Bundle a system prompt for the summarizer model; parse response; per-candidate confirm prompt; append accepted items. End-to-end: short conversation, summarize, accept one of two candidates, restart, verify accepted one persists.
- `config.lua`: example memory block. Documentation-only; commented-out example. Final commit.
Risk / non-obvious
- Counter persistence: `memory:add` needs a monotonic id. Options: (a) sidecar `memory.id` file with a single integer, (b) scan the JSONL on open for max id, (c) use timestamp as id (no monotonic guarantee across rapid adds). Plan: (b), scan once at open and cache in the handle. Ids stay exact up to 2^53 (LuaJIT numbers are doubles), far beyond any realistic item count.
- Tombstone resolution at load: build a set of forget-target ids from `kind == "forget"` entries; filter active items to exclude them. Order doesn't matter (tombstones can appear before their targets if the file is hand-edited; the resolution is set-based).
- Empty file at open vs nonexistent file: both should yield an empty memory handle. Phase 1's `history.open` already handles file creation; extend the pattern.
- System prompt growth: the suffix-stacking pattern is up to 4 blocks now (default + MCP + background + Norris). Estimated size ~200 + ~80 + 2000 + ~250 = ~2530 chars baseline before any user/asst turns. Worth measuring at baseline phase (analyze finding A3 measured ~3460 chars, ~865 tokens, with a 2KB background).
- `:memory summarize` parse robustness: small models may emit "fact: ..." sometimes with markdown bullets, sometimes without. Parser should tolerate `^[-*]?\s*(fact|pref|context):\s*(.+)`.
- `:memory clear` with confirm: same UX as Phase 3 destructive prompts. `[y/N]` default-no.
Open at plan; resolve at review
- Whether `:remember` should append to the LIVE `ctx.memory_items` immediately (so the model sees it on the next turn without restart) or only on next session boot. v1 appends immediately: both to the file AND to the live ctx, for immediate visibility.
- Whether the summarizer should be fed the FULL session log or just recent turns (token budget). v1 says full minus the [background] suffix; cap at session-log size <= 64KB or last N turns.
End of Phase 4 Manifest — aish