Documents the new cadence-summarization keys (summarize_every_n_turns,
summarize_keep_recent) inline with the existing Phase 5 eviction
summarize block. Notes the composition with summarize_on_evict and
the Norris suppression parity.
Config-only commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the new opt-in keys (auto_summarize_on_quit,
min_turns_for_summary, summary_model) inline with the existing
Phase 4 memory block. Notes that the older summarizer_model key
is still honored for back-compat.
Config-only commit; no behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the new Phase 10 / #89 surface for users: when, why, and
how to set cfg.norris.preplanner + .executor + .tasks_max. Notes the
graceful single-model fall-back when preplanner is unset OR fails,
and the design choice that preplan does NOT route via call_broker
(retry would silently swap planning models).
C5: config-only commit; no behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Small local models effectively use a fraction of their advertised
context window. Per-request compression for routes that hit a
local-compress-flagged model preset: keeps only the last N turns
and tail-truncates oversized content. Cloud routes get the full
context unchanged.
Changes:
- context.lua _compress_turns(turns, keep, max_chars): returns a
new list (self.turns NEVER mutated) with the last `keep` turns
preserved + content tail-truncated to `max_chars`. Defensive:
drops tool turns at the slice head (orphaned without their
assistant-with-tool_calls anchor — strict chat templates would
reject them; same gotcha PHASE0 §6 warned about for user/user).
- Context:to_messages(opts) — opts.compress = { keep_turns,
max_turn_chars } swaps the turn iterable for the compressed
view. Affects BOTH the use_tool_role=true path and the
use_tool_role=false fallback (PHASE2.md Q18 strict-template
workaround). Persistence + display via :history see the full
uncompressed ctx.turns.
- repl.lua ask_ai: when req_cfg (the routed model's cfg) has
`local_compress = true`, build compress_opts from
config.context.compress (defaults keep_turns=2, max_turn_chars=800).
Pass through ctx:to_messages alongside the existing
system_prompt_override (#86) — orthogonal opts that compose.
- Norris unaffected: safety.norris_step builds its own messages
array; the planner needs full history per PHASE3 design.
- config.lua gains a header comment explaining the per-model opt-in
+ the context.compress defaults block + the documented tool-turn
truncation trade-off.
13 unit cases verified:
- no opts -> full turn list (no regression)
- keep_turns=2 -> exactly last 2 emitted
- long content tail-truncated to max_chars
- self.turns unchanged after render
- orphan tool-turn at slice head dropped (no chat-template violation)
- tool turn included WITH its assistant anchor when keep_turns >= 3
E2E against live local broker:
- models.fast.local_compress = true; keep_turns=1; max=200
- 4-turn session: each broker call sees ONLY the current turn
(verified by short coherent CMD replies despite no cross-turn
memory available to the model). FR-promised small-model
friendliness in action; conversation continuity is the
documented trade-off.
Regression: test_safety 87/87, test_router_model 31/31, repl loads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
llama.cpp constrains the sampler to ONLY emit tokens matching a
GBNF grammar. For small models this kills format drift at the
token level — `CMD: <cmd>` is enforced by the sampler rather than
hoped for via prompt discipline.
Probe finding (this commit's pre-implementation): cloud (Anthropic
via Bedrock) silently IGNORES the `grammar` field — returns normally
via standard sampling. Default passthrough is safe for all routes;
no per-model opt-in/opt-out needed in v1.
Changes:
- broker.lua build_request: `if opts.grammar then req.grammar =
opts.grammar end`. Misformed grammar surfaces at request time
via the existing transport-error path.
- repl.lua ask_ai: `grammar_override = config.routing.grammars
[req_class]` (same gating shape as #86's system_prompts override).
Passed via opts.grammar in the call_broker invocation.
- safety.lua is_destructive threads cfg.safety.probe_grammar through
opts.grammar so llm_probe constrains the YES/NO output. Skips
the regex-match dance entirely when the model can't drift.
Caller-provided opts.grammar takes precedence over cfg.
- config.lua gains two commented examples:
* routing.grammars per class
* safety.probe_grammar for the destructive probe
6 unit cases verified (stubbed curl.post_sse / broker.chat):
- default: no grammar in body
- opts.grammar -> body contains grammar JSON-encoded
- safety probe_grammar reaches llm_probe via opts
- no probe_grammar configured -> opts.grammar nil
- caller opts.grammar takes precedence over cfg.safety.probe_grammar
E2E against live local broker:
- `routing.grammars.default = "root ::= \\"ACK\\""` configured;
prompted "tell me a long story about a fox" -> model output
EXACTLY "ACK" (sampler forced; would normally produce paragraphs).
Grammar passthrough end-to-end confirmed.
Regression: test_safety 87/87, test_router_model 31/31, repl loads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Small local models follow precise structured instructions better than
natural language. Per-routing-class system_prompt override gives them
tighter instructions for THAT request while preserving ambient context.
Changes:
- Context:to_messages(opts) — opts.system_prompt_override REPLACES
the base system_prompt for THIS render only (state unchanged).
Dynamic blocks ([background], [project], [earlier summary], NORRIS
suffix) still compose on top. opts is optional; nil-safe for old
callers.
- repl.lua ask_ai — captures req_class from router.classify_model
(already returned by Phase 5; previously discarded after the
status line). Looks up config.routing.system_prompts[req_class];
passes as opts.system_prompt_override to ctx:to_messages each
iteration of the tool-sub-loop.
- Gating: override fires only when routing.auto is on (no class ->
no override). If system_prompts[class] absent for a class, fall
through to the default system_prompt (no surprise).
- Norris unaffected: safety.norris_step builds its own messages
array; doesn't go through this path.
- config.lua gains a commented-out example showing routing.system_
prompts with the code/default examples from the FR body.
Smoke verified:
- 12-case context.lua unit test: opts nil/absent/present, override
replaces base, dynamic blocks still compose, state unchanged
after call, Norris-mode coexistence (suffix still present;
background still suppressed).
- E2E against cloud broker with routing.system_prompts.code set:
triple-backtick prompt -> code class -> override fires; model
emits terse code-only output. Non-code prompt -> default class
-> no override -> normal verbose-ish reply.
Regression: test_safety 87/87, test_router_model 31/31, repl loads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
config.lua header gains a Phase 9 paragraph documenting the
project-overlay feature + the R7 shallow-merge warning ("if your
.aish.lua sets a top-level block, it REPLACES the user's entire
block — list every entry OR omit the block"). Inspect at runtime
via `:config show`.
docs/PHASE9.md status header bumped: "Plan + review fold-in" ->
"Implement". Lists the 4 implement commits inline:
e525063 history: trust file helpers
34b465d main: project-overlay loader
5b6ee55 repl: :config show meta + HELP
this config template comment + status bump
Phase 9 implementation complete. Next inner-loop step: verify
(file TCs, run autonomous, close) + memory-update.
Regression: test_safety 87/87, test_router_model 31/31, repl loads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R9-resolved single-owner of the status bump (commit #5 didn't touch
PHASE6.md per the review fold-in).
config.lua:
- Commented-out `project = { auto_tree, tree_depth, tree_max_chars }`
block with the same shape as the Phase 1-5 example blocks.
- Note that :diff / :tree / :highlight all work without config; the
`project` block ONLY controls the startup auto-inject.
- Note about :highlight v1 having no config flag (runtime-only),
cross-references the in-REPL install hint.
docs/PHASE6.md:
- Status header bumped: "Plan + review fold-in" -> "Implement"
- Lists the 6 implement commits in the header for traceability:
c4fc7fd context: compose_project plumbing
d1dce83 _scan_project_tree + :tree + auto_tree hook
4d5f93a :diff + _git_clean_cmd (B1 helper)
0d63f01 expand_mentions @<r1>..<r2> tiered resolution
11d0e59 tree-sitter highlighter (renderer fence filter +
highlighted dispatch + :highlight meta)
this config example + status bump
Phase 6 implementation is complete. Next inner-loop step is verify
(7) — user-driven smoke tests against the live broker on each pillar
plus filing of issues for any defects, then memory-update (8).
Regression: test_safety 87/87, test_router_model 31/31, repl loads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plumbs the secrets.lua module (commit e4b818b) into the conversation
pipeline. Hook points:
ask_ai — scrub_messages(ctx:to_messages(), mode) before
call_broker; rehydrate streamed deltas via
streaming_rehydrator so the user sees real values
while text_parts accumulates rehydrated chunks
(final_resp is plain — CMD: / DELEGATE: extractors
see plain values)
MCP dispatch — dispatch_tool_call rehydrates the args table before
sess:call_tool so the trusted MCP server receives
real values (the model emitted placeholders because
it saw a scrubbed context)
DELEGATE: & :delegate
— scrub sub_msgs before broker.chat; rehydrate sub_text
before appending to context, so future turns see
real values restored
Phase 5 summarize-on-evict
— scrub sum_msgs before broker.chat; rehydrate the
reply that becomes ctx.summary
:memory summarize
— same scrub + rehydrate pair
Mode resolution per call: model_cfg.redact → config.secrets.default →
"vault+autodetect" if vault loaded, else "off".
ctx storage convention: PLAIN values throughout. The scrub happens at
the egress (broker call) per the active redact mode; ctx.turns never
holds placeholders for content the user typed or executor produced.
The model's own emissions (assistant tool_call arguments) may carry
placeholders because the model saw the scrubbed context — rehydrated
at MCP dispatch and otherwise harmless on re-serialization (idempotent
re-scrubbing).
New meta:
:secrets [status] vault entries, placeholders allocated this
session, active broker mode. Never prints
actual values (vault file is itself a
secret per gotcha 7).
:secrets check <text> dry-run scrub against the active broker's
mode — shows the output transformation.
Documented in config.lua with a commented-out block + per-broker
redact field example.
Deferred to a follow-up issue (clearly scoped):
- safety.lua broker call sites (Norris main loop, is_destructive
LLM second-opinion probe) — same wiring pattern, but they don't
currently see secrets_session; needs threading through helpers.
- @-mention file content is appended PLAIN to ctx and scrubbed at
egress alongside the rest of the user turn (covered by the
ask_ai scrub).
- exec output streamed live to terminal is pre-scrub (user sees
real values in their own shell — by design); the captured-for-
context copy is scrubbed at egress alongside the rest.
This is the "full scope" implementation chosen via AskUserQuestion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Builds, long-running network calls, and file watches no longer block
the turn. A new "CMD&: <cmd>" marker (analogue of CMD:) tells the REPL
to spawn the command in the background, return immediately, and poll
for completion between user inputs.
Process model: shell-wrapped to avoid needing fork()/execv() FFI.
nohup sh -c '(<cmd>) > <log> 2>&1; echo $? > <status>' </dev/null
>/dev/null 2>&1 & echo $!
The child is reparented to init; we hold only the PID and the path to
the .status sidecar. Completion is detected by the .status file
existing (the wrapper writes it as its last act). No waitpid needed —
the child isn't ours after the popen subshell exits.
Storage: <history.dir>/bg/<id>.log + <id>.status. The directory is
created lazily at startup (mkdir -p). Requires history.dir to be
configured; without it CMD&: emits an error status and the model
sees an "[bg failed to start]" exec-output note.
check_bg_done() runs at the top of each main-loop iteration alongside
check_every_due(). When a job is detected as exited, the REPL:
- emits a status line "[bg:<id> exited <code>, <bytes>, <secs>s wall] <cmd>"
- appends the same string to ctx as exec output, so the model sees
the completion on its next turn (natural follow-up: "ok the build
finished; let me check the log")
Meta surface:
:bg-spawn <cmd> start a bg job directly (no AI needed; also
useful for testing without depending on the
model emitting CMD&:)
:bg-list show running/done jobs (id, pid, state, runtime, cmd)
:bg-output <id> dump the log file to stdout
:bg-kill <id> SIGTERM (note: only delivers if the PID is
still the actual command — long-lived shells
may need pkill by name)
Scope (deliberately limited for v1):
- No callback-mode readline: bg completion detection is pre-prompt,
not mid-readline. If a build finishes while the user is typing,
notification comes when they hit Enter.
- Permission policy DSL (#9) does NOT apply to CMD&: — the
asynchronous gating model wasn't designed for the y/N flow.
Filed as follow-up if needed.
- Norris not extended: helpers.exec_cmd is still synchronous; the
planner doesn't dispatch bg jobs.
- Plan mode interaction: CMD&: in plan mode emits "PLAN: & <cmd>"
and a "[plan] would bg-run: <cmd>" exec-output note, no spawn.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The confirm_cmd boolean was too coarse: true interrupts every harmless
ls; false ungates everything. Most workflows want trust for read-only
ops while still gating writes/network/sudo.
New config:
permissions = {
allow = { "^ls%s", "^cat%s", "^git status" },
confirm = { "^rm%s", "^git push", "^docker%s", "^sudo%s" },
deny = { "^ssh%s+root@", "^curl%s+http[^s]" },
}
Verdict order: deny > confirm > allow. First match in the chosen
category wins. Unmatched defaults to "confirm". Patterns are Lua
patterns (not regex) per PHASE0.md §3 — no compiled extensions.
Verdict behavior in the interactive CMD: loop:
- allow → run without prompt
- deny → status line, skip
- confirm → [y/N] prompt (same UX as legacy confirm_cmd=true)
Backward compat:
- permissions unset + confirm_cmd=true → always confirm
- permissions unset + confirm_cmd=false → always allow
- permissions set → policy table is authoritative
Scope deliberately limited to the interactive AI-suggested CMD: gate.
Norris autonomous mode keeps its own safety.is_destructive machinery
(combining the two would double-gate or replace the LLM probe — both
non-obvious behavioral changes that belong in their own issues).
User-typed shell-routed lines (`router.classify → "shell"`) and
:exec also bypass the policy by design — those are direct user intent.
New introspection:
:perms list — show the configured rule lists
:perms check <cmd> — report verdict + matching rule (debug)
safety.classify_command is exported and unit-tested with 12 cases
covering each category, priority order (deny > allow on overlap),
and both fallback paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Optional shell scripts trigger around every CMD: execution. Use cases:
audit logging, auto-format-after-edit, custom safety gates beyond the
existing confirm_cmd boolean.
Config shape:
hooks = {
pre_cmd = "/path/to/pre-script",
post_cmd = "/path/to/post-script",
}
Contract per hook invocation:
- The command line is piped to the hook on stdin.
- Env vars: AISH_CMD (the command), AISH_TURN (#ctx.turns at the
moment of dispatch), AISH_CWD (libc.getcwd() result).
- Hook stdout is streamed live to the terminal via executor.exec
(so the user sees its output regardless of exit status).
Pre-hook: non-zero exit aborts the command and emits a status line
including the exit code. last_exec_code is set to the hook's exit
so the {last_status} prompt template variable reflects the abort.
Post-hook: exit code is ignored (the spec says so); only the visible
stdout matters. Runs after the command's exec_end frame.
Tested with success, abort, and stdin-matches-env paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
At-a-glance situational awareness: see the active model, context fill,
mode flags, and cwd in the prompt itself — prevents "wait, am I still
in plan mode?" surprises.
Example config:
shell = {
prompt = "[{model} {ctx_used}/{ctx_max}t T{turn} {mode}] {cwd_short} > ",
}
Variables (substituted via {name}):
{model} active preset name
{ctx_used} char/4 token heuristic (Phase 0 §8; accurate is Q1)
{ctx_max} config.context.token_budget
{turn} #ctx.turns
{cwd} libc.getcwd() (chdir-aware; PWD env may drift)
{cwd_short} cwd with $HOME -> ~
{last_status} last exec exit code, "" if none yet
{mode} "norris" | "plan" | "normal"
Default behavior unchanged when shell.prompt is unset — keeps the
"[aish:<model>]>" form with norris ⚡ and plan markers.
Side wiring:
- ffi/libc.lua gains getcwd() (chdir() doesn't update PWD).
- run_shell records exit code into last_exec_code for {last_status}.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
qwen3-30b-a3b-instruct isn't loaded on hossenfelder right now (per
/v1/models). deepseek-coder-v2-lite IS loaded — 16B MoE with ~2.4B
active params; fast enough that the 30-min timeout from the
qwen3-30b config was wildly over-budget.
Switched to deepseek-coder-v2-lite for the time being. Restore
qwen3-30b when the slot is back up.
Live-probed: YES/NO destructive probe via the deep model preset
returns "YES." in ~4.8s — well within the new 5-min timeout, and
fast enough that the Phase 3 LLM second-opinion path is now
functional again without falling back to "fail-safe YES" on every
ambiguous command.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 5 commit #5 (final) per docs/PHASE5.md §11. Documentation-only;
commented-out example showing:
- routing.auto (per-request auto-routing toggle)
- routing.classes (class → model mapping; reasoning = nil
by default per R-N2 cost-safety)
- routing.fallback (single-hop retry to cloud on transport fail)
- routing.fallback_model (default "cloud" if uncommented)
- context.summarize_on_evict + summarizer_model + max_summary_chars
(shown INSIDE the context = {...} block above)
All defaults OFF — Phase 5 is opt-in across the board. Existing configs
without `routing` or `context.summarize_on_evict` behave identically to
Phase 4.
Phase 5 implementation complete:
#13e57824 router.classify_model + 31-case corpus
#203497b5 context summarize_fn callback + summary block in to_messages
#340ea0b4 repl routing + fallback + summarize_fn wiring + :route/:fallback
#4 - (bundled into #3 since meta cmds are trivial additions)
#5 (this) config example block
Phase 5 verify-partial:
- router.classify_model: 31/31 case corpus passes
- context summarize-on-evict: mock callback fires correctly (additive
+ compress paths), summary suppressed under Norris, :reset clears it
- repl meta cmds: :route on/off/classes/check + :fallback on/off all
work; :route check reports class + "routing currently disabled"
suffix when auto is off (N1)
Verify-pending: end-to-end with real broker (route a code question, see
it land on deep; kill local backend, see fallback fire to cloud).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 4 commit #5 (final) per docs/PHASE4.md §12. Documentation-only;
commented-out example showing:
- inject_max_chars (cap on startup injection; default 2000)
- summarizer_model (which configured model :memory summarize uses)
The block is OFF by default. The :memory meta surface (:remember,
:memory list/forget/clear/inject/summarize) works without the block —
items persist to <history.dir>/memory.jsonl regardless. The block only
configures the injection-into-system-prompt behavior + summarizer model
choice.
Phase 4 implementation complete:
#1199dd87 history.lua memory store + ffi/libc.lua flock
#2c1a5c73 context.lua [background] block (suppressed in Norris)
#33b074af repl.lua memory handle + :remember + :memory meta
#4f22d21d :memory summarize — LLM candidate extraction
#5 (this) config.lua memory example block
Phase 4 verify-partial:
- history memory round-trip tests: add/forget/load all green
- flock single-writer enforcement verified
- context composition order (DEFAULT → [background] → NORRIS) +
Norris suppression all green
- End-to-end persistence across boots: :remember on boot 1 visible
on boot 2 as injected memory items
- :memory forget id-not-active surfaces clean status (N1)
- :memory clear with [y/N] confirm gate works
- :memory summarize wire-correct against fast model (candidate
parsing tolerates bullets; per-candidate y/N/edit prompts fire)
Verify-pending: real-model summarizer quality test (deep/cloud);
multi-process flock contention test; long-running :memory inject
race with running broker stream.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3 commit #6 (final) per docs/PHASE3.md §12. Documentation-only;
commented-out example showing the safety schema:
- llm_second_opinion (bool, default true)
- llm_model (string, default deep→default_model fallback)
- max_norris_steps (int, default 8)
The block notes the model-selection trade-off (R-B2): cloud is the
independent-class fast option (costs money), deep is the local-but-slow
option, fast is self-policing and NOT recommended.
No behavior change to existing configs — safety defaults kick in when
the block is absent.
Phase 3 implementation complete:
#1bd59ce7 safety static patterns (34 rules) + 87-case test corpus
#22abd5da LLM second-opinion + session cache + opts.max_tokens
#3d2a53d2 renderer Norris frames
#411b1f56 safety.norris_step planner (single iteration)
#5a404b2a repl driver + \C-n real binding + :norris/:safety meta
+ readline rl_insert_text/rl_redisplay
#6 (this) config.lua safety example block
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 7 verify finding from TC #26 against :model cloud:
HTTP 400 from openrouter→Amazon Bedrock:
"tools.0.custom.name: String should match pattern
'^[a-zA-Z0-9_-]{1,128}$'"
Anthropic via Bedrock validates tool names against that regex and
rejects dots. PHASE2 originally chose "." as the namespace separator
("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not.
Separator switched to "__" (two underscores) everywhere — internal
API matches on-wire shape, no transformation layer:
- repl.lua:
- tools_schema builds "alias__name"
- dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __)
- :mcp tool parser uses same split
- :mcp tools formatter prints "alias__name"
- HELP block shows <alias__name>
- safety.lua confirm_tool_call: alias.* glob → alias__* glob
- config.lua example block: keys rewritten
- docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua
row, §5 wire-shape JSON examples, §6 auto_approve schema, §7
meta-cmd table, §12 plan all updated. Original "." references
preserved in commit history.
Constraint: aliases must not themselves contain "__" so the parse
stays unambiguous. Tool names from MCP servers may have underscores
freely.
Second fix bundled — uninformative broker error:
Previously "broker error: transport: HTTP response code said error"
Now "broker error: transport: HTTP 400: {full body snippet}"
ffi/curl.lua M.post_sse changes:
- FAILONERROR no longer set (was hiding the response body).
- raw_body accumulator added alongside the SSE buffer; captures
every byte regardless of SSE shape.
- After perform, check status_code via curl_easy_getinfo. On >=400,
return (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged.
- End-of-stream SSE flush only runs on 2xx (no false event on
error bodies that aren't SSE-shaped).
- Phase 1 callers reading just first return slot stay correct.
End-to-end verified:
- :model cloud + tools=[boltzmann__read_file ...] +
"Use boltzmann__read_file with path=/etc/hostname" →
Claude emits tool_call with name="boltzmann__read_file",
args='{"path": "/etc/hostname"}'. ok=true, transport clean.
- Force-bad tool name "bad.name.with.dots" → err string carries
the full bedrock 400 with the regex-pattern message visible.
TC #26 (sub-loop end-to-end) is now testable against cloud — the
error that blocked it is resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 commit #7 (final) per docs/PHASE2.md §12. Two changes bundled:
(1) commented-out mcp = {...} example block (~40 lines) at the end of
config.lua showing the Phase 2 schema:
- mcp.servers — alias → {url, auth_token | auth_env}
- mcp.auto_approve — "<alias>.<tool>" or "<alias>.*" globs
- mcp.max_tool_depth — sub-loop budget per ask_ai turn
The block is OFF by default; uncomment + adjust per fleet to
activate. Documentation-only; no behavior change to existing
configs (mcp_sessions stays empty, tools_schema() returns [],
broker omits the field — full Phase 1 compatibility).
(2) User-authored: deep model preset switched from
mistral-nemo-12b-instruct to qwen3-30b-a3b-instruct, with a 10-min
timeout_ms accommodating the larger model's RK3588 inference time.
Reason: nemo backend is dormant per the proxy /v1/models discovery
(aish#23 now returns 404 cleanly for unknown models instead of
silent fallback); qwen3-30b is the practical "deep" alternative.
Phase 2 implementation is now complete — 7 of 7 commits landed:
#16c194de mcp.lua + ffi/curl status_code + PHASE0 §4 amendment
#20fde77f safety.lua confirm_tool_call
#37c221a8 context.lua tool turns + use_tool_role fallback
#4c736d0e renderer.lua tool-call frames
#5efdc728 broker.lua opts.tools + tool_call accumulator
#67e9cfff repl.lua sub-loop + :mcp meta + system-prompt block
#7 (this) config.lua example + deep model switch
Next phase-loop step: verify (Phase 7). Files written are wired and
isolated-tested; end-to-end model-driven verification waits on either
a more compliant model or explicit forcing of tool_calls from the
prompt — known to be marginal with the loaded qwen-1.5b but proven
correct against direct probes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves issue #12 by partial-accept of the recommendation.
What landed:
- Single broker URL: http://hossenfelder.fritz.box:8082 for all three
presets (fast / deep / cloud). Server-side model-aware routing; no
client-side cloud auth (proxy holds the OpenRouter bearer).
- Models from hossenfelder's /v1/models inventory:
fast -> qwen2.5-coder-1.5b-q4_k_m.gguf (boltzmann local)
deep -> mistral-nemo-12b-instruct (boltzmann local)
cloud -> anthropic/claude-haiku-4.5 (OpenRouter route)
- `cloud` was already pointing at hossenfelder but with https://; flipped
to http:// so it matches the proxy's actual scheme.
What deferred:
- Schema rename `models` -> `brokers` (and the 5-cloud-preset shape
suggested in #12) — would touch repl.lua + broker.lua. Not blocking
Phase 7. If multi-preset becomes useful in practice, file a separate
issue for the rename then.
Phase 7 verification (live broker test):
- broker.chat(fast, [user="say pong"]) -> "CMD: echo pong" in ~3s
- multi-turn arithmetic (7*8=56, *2=112) preserved across turns
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README, .gitignore, CLAUDE.md (project conventions)
- docs/PHASE0.md — full Phase 0 manifest (locked substrate)
- 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented
with module-scoped responsibilities matching the manifest
- config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b
snappy/32k + cloud via OpenRouter through hossenfelder)
File names match docs/PHASE0.md §4 exactly. Module bodies fill in across
later phases; the tree shape is locked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>