f0bccdec48636248b437e0abfcf3f2ea875caaaf
14 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ac58b19da2 |
config + docs/PHASE6: example block + status -> Implement (Phase 6 commit #6)
R9-resolved single-owner of the status bump (commit #5 didn't touch PHASE6.md per the review fold-in). config.lua: - Commented-out `project = { auto_tree, tree_depth, tree_max_chars }` block with the same shape as the Phase 1-5 example blocks. - Note that :diff / :tree / :highlight all work without config; the `project` block ONLY controls the startup auto-inject. - Note about :highlight v1 having no config flag (runtime-only), cross-references the in-REPL install hint. docs/PHASE6.md: - Status header bumped: "Plan + review fold-in" -> "Implement" - Lists the 6 implement commits in the header for traceability: |
||
|
|
d852acadc2 |
repl: wire #13 secrets — scrub outbound, rehydrate stream + tool args
Plumbs the secrets.lua module (commit
|
||
|
|
f94d16fc89 |
repl: background CMD&: with handle/poll (closes #8)
Builds, long-running network calls, and file watches no longer block
the turn. A new "CMD&: <cmd>" marker (analogue of CMD:) tells the REPL
to spawn the command in the background, return immediately, and poll
for completion between user inputs.
Process model: shell-wrapped to avoid needing fork()/execv() FFI.
nohup sh -c '(<cmd>) > <log> 2>&1; echo $? > <status>' </dev/null
>/dev/null 2>&1 & echo $!
The child is reparented to init; we hold only the PID and the path to
the .status sidecar. Completion is detected by the .status file
existing (the wrapper writes it as its last act). No waitpid needed —
the child isn't ours after the popen subshell exits.
Storage: <history.dir>/bg/<id>.log + <id>.status. The directory is
created lazily at startup (mkdir -p). Requires history.dir to be
configured; without it CMD&: emits an error status and the model
sees an "[bg failed to start]" exec-output note.
check_bg_done() runs at the top of each main-loop iteration alongside
check_every_due(). When a job is detected as exited, the REPL:
- emits a status line "[bg:<id> exited <code>, <bytes>, <secs>s wall] <cmd>"
- appends the same string to ctx as exec output, so the model sees
the completion on its next turn (natural follow-up: "ok the build
finished; let me check the log")
Meta surface:
:bg-spawn <cmd> start a bg job directly (no AI needed; also
useful for testing without depending on the
model emitting CMD&:)
:bg-list show running/done jobs (id, pid, state, runtime, cmd)
:bg-output <id> dump the log file to stdout
:bg-kill <id> SIGTERM (note: only delivers if the PID is
still the actual command — long-lived shells
may need pkill by name)
Scope (deliberately limited for v1):
- No callback-mode readline: bg completion detection is pre-prompt,
not mid-readline. If a build finishes while the user is typing,
notification comes when they hit Enter.
- Permission policy DSL (#9) does NOT apply to CMD&: — the
asynchronous gating model wasn't designed for the y/N flow.
Filed as follow-up if needed.
- Norris not extended: helpers.exec_cmd is still synchronous; the
planner doesn't dispatch bg jobs.
- Plan mode interaction: CMD&: in plan mode emits "PLAN: & <cmd>"
and a "[plan] would bg-run: <cmd>" exec-output note, no spawn.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
17e62c0326 |
safety: permission policy DSL — allow/confirm/deny rule lists (closes #9)
The confirm_cmd boolean was too coarse: true interrupts every harmless
ls; false ungates everything. Most workflows want trust for read-only
ops while still gating writes/network/sudo.
New config:
permissions = {
allow = { "^ls%s", "^cat%s", "^git status" },
confirm = { "^rm%s", "^git push", "^docker%s", "^sudo%s" },
deny = { "^ssh%s+root@", "^curl%s+http[^s]" },
}
Verdict order: deny > confirm > allow. First match in the chosen
category wins. Unmatched defaults to "confirm". Patterns are Lua
patterns (not regex) per PHASE0.md §3 — no compiled extensions.
Verdict behavior in the interactive CMD: loop:
- allow → run without prompt
- deny → status line, skip
- confirm → [y/N] prompt (same UX as legacy confirm_cmd=true)
Backward compat:
- permissions unset + confirm_cmd=true → always confirm
- permissions unset + confirm_cmd=false → always allow
- permissions set → policy table is authoritative
Scope deliberately limited to the interactive AI-suggested CMD: gate.
Norris autonomous mode keeps its own safety.is_destructive machinery
(combining the two would double-gate or replace the LLM probe — both
non-obvious behavioral changes that belong in their own issues).
User-typed shell-routed lines (`router.classify → "shell"`) and
:exec also bypass the policy by design — those are direct user intent.
New introspection:
:perms list — show the configured rule lists
:perms check <cmd> — report verdict + matching rule (debug)
safety.classify_command is exported and unit-tested with 12 cases
covering each category, priority order (deny > allow on overlap),
and both fallback paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
fb15f7a690 |
repl: pre/post CMD hooks via config.hooks (closes #3)
Optional shell scripts trigger around every CMD: execution. Use cases:
audit logging, auto-format-after-edit, custom safety gates beyond the
existing confirm_cmd boolean.
Config shape:
hooks = {
pre_cmd = "/path/to/pre-script",
post_cmd = "/path/to/post-script",
}
Contract per hook invocation:
- The command line is piped to the hook on stdin.
- Env vars: AISH_CMD (the command), AISH_TURN (#ctx.turns at the
moment of dispatch), AISH_CWD (libc.getcwd() result).
- Hook stdout is streamed live to the terminal via executor.exec
(so the user sees its output regardless of exit status).
Pre-hook: non-zero exit aborts the command and emits a status line
including the exit code. last_exec_code is set to the hook's exit
so the {last_status} prompt template variable reflects the abort.
Post-hook: exit code is ignored (the spec says so); only the visible
stdout matters. Runs after the command's exec_end frame.
Tested with success, abort, and stdin-matches-env paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d738f339cb |
repl: configurable prompt template via config.shell.prompt (closes #10)
At-a-glance situational awareness: see the active model, context fill,
mode flags, and cwd in the prompt itself — prevents "wait, am I still
in plan mode?" surprises.
Example config:
shell = {
prompt = "[{model} {ctx_used}/{ctx_max}t T{turn} {mode}] {cwd_short} > ",
}
Variables (substituted via {name}):
{model} active preset name
{ctx_used} char/4 token heuristic (Phase 0 §8; accurate is Q1)
{ctx_max} config.context.token_budget
{turn} #ctx.turns
{cwd} libc.getcwd() (chdir-aware; PWD env may drift)
{cwd_short} cwd with $HOME -> ~
{last_status} last exec exit code, "" if none yet
{mode} "norris" | "plan" | "normal"
Default behavior unchanged when shell.prompt is unset — keeps the
"[aish:<model>]>" form with norris ⚡ and plan markers.
Side wiring:
- ffi/libc.lua gains getcwd() (chdir() doesn't update PWD).
- run_shell records exit code into last_exec_code for {last_status}.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d72689f709 |
config: deep model → deepseek-coder-v2-lite (temporary)
qwen3-30b-a3b-instruct isn't loaded on hossenfelder right now (per /v1/models). deepseek-coder-v2-lite IS loaded — 16B MoE with ~2.4B active params; fast enough that the 30-min timeout from the qwen3-30b config was wildly over-budget. Switched to deepseek-coder-v2-lite for the time being. Restore qwen3-30b when the slot is back up. Live-probed: YES/NO destructive probe via the deep model preset returns "YES." in ~4.8s — well within the new 5-min timeout, and fast enough that the Phase 3 LLM second-opinion path is now functional again without falling back to "fail-safe YES" on every ambiguous command. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a9b39cd435 |
config: Phase 5 routing + summarize-on-evict example (commit #5)
Phase 5 commit #5 (final) per docs/PHASE5.md §11. Documentation-only; commented-out example showing: - routing.auto (per-request auto-routing toggle) - routing.classes (class → model mapping; reasoning = nil by default per R-N2 cost-safety) - routing.fallback (single-hop retry to cloud on transport fail) - routing.fallback_model (default "cloud" if uncommented) - context.summarize_on_evict + summarizer_model + max_summary_chars (shown INSIDE the context = {...} block above) All defaults OFF — Phase 5 is opt-in across the board. Existing configs without `routing` or `context.summarize_on_evict` behave identically to Phase 4. Phase 5 implementation complete: #1 |
||
|
|
27784f9b68 |
config: Phase 4 memory example block (commit #5)
Phase 4 commit #5 (final) per docs/PHASE4.md §12. Documentation-only; commented-out example showing: - inject_max_chars (cap on startup injection; default 2000) - summarizer_model (which configured model :memory summarize uses) The block is OFF by default. The :memory meta surface (:remember, :memory list/forget/clear/inject/summarize) works without the block — items persist to <history.dir>/memory.jsonl regardless. The block only configures the injection-into-system-prompt behavior + summarizer model choice. Phase 4 implementation complete: #1 |
||
|
|
50666d092f |
config: Phase 3 safety example block (commit #6)
Phase 3 commit #6 (final) per docs/PHASE3.md §12. Documentation-only; commented-out example showing the safety schema: - llm_second_opinion (bool, default true) - llm_model (string, default deep→default_model fallback) - max_norris_steps (int, default 8) The block notes the model-selection trade-off (R-B2): cloud is the independent-class fast option (costs money), deep is the local-but-slow option, fast is self-policing and NOT recommended. No behavior change to existing configs — safety defaults kick in when the block is absent. Phase 3 implementation complete: #1 |
||
|
|
f26cbd9a3a |
phase2 amend: __ separator (Bedrock-safe) + post_sse error diagnostics
Phase 7 verify finding from TC #26 against :model cloud: HTTP 400 from openrouter→Amazon Bedrock: "tools.0.custom.name: String should match pattern '^[a-zA-Z0-9_-]{1,128}$'" Anthropic via Bedrock validates tool names against that regex and rejects dots. PHASE2 originally chose "." as the namespace separator ("boltzmann.list_dir"); OpenAI tolerated it, Bedrock does not. Separator switched to "__" (two underscores) everywhere — internal API matches on-wire shape, no transformation layer: - repl.lua: - tools_schema builds "alias__name" - dispatch_tool_call splits via "^(.-)__(.+)$" (non-greedy → leftmost __) - :mcp tool parser uses same split - :mcp tools formatter prints "alias__name" - HELP block shows <alias__name> - safety.lua confirm_tool_call: alias.* glob → alias__* glob - config.lua example block: keys rewritten - docs/PHASE2.md: amendment header added; §1, §2 row, §3 config.lua row, §5 wire-shape JSON examples, §6 auto_approve schema, §7 meta-cmd table, §12 plan all updated. Original "." references preserved in commit history. Constraint: aliases must not themselves contain "__" so the parse stays unambiguous. Tool names from MCP servers may have underscores freely. Second fix bundled — uninformative broker error: Previously "broker error: transport: HTTP response code said error" Now "broker error: transport: HTTP 400: {full body snippet}" ffi/curl.lua M.post_sse changes: - FAILONERROR no longer set (was hiding the response body). - raw_body accumulator added alongside the SSE buffer; captures every byte regardless of SSE shape. - After perform, check status_code via curl_easy_getinfo. On >=400, return (nil, "HTTP <code>: <body[:400]>"). 2xx unchanged. - End-of-stream SSE flush only runs on 2xx (no false event on error bodies that aren't SSE-shaped). - Phase 1 callers reading just first return slot stay correct. End-to-end verified: - :model cloud + tools=[boltzmann__read_file ...] + "Use boltzmann__read_file with path=/etc/hostname" → Claude emits tool_call with name="boltzmann__read_file", args='{"path": "/etc/hostname"}'. ok=true, transport clean. - Force-bad tool name "bad.name.with.dots" → err string carries the full bedrock 400 with the regex-pattern message visible. TC #26 (sub-loop end-to-end) is now testable against cloud — the error that blocked it is resolved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
09800d192a |
config: Phase 2 mcp example block + deep model switch
Phase 2 commit #7 (final) per docs/PHASE2.md §12. Two changes bundled: (1) commented-out mcp = {...} example block (~40 lines) at the end of config.lua showing the Phase 2 schema: - mcp.servers — alias → {url, auth_token | auth_env} - mcp.auto_approve — "<alias>.<tool>" or "<alias>.*" globs - mcp.max_tool_depth — sub-loop budget per ask_ai turn The block is OFF by default; uncomment + adjust per fleet to activate. Documentation-only; no behavior change to existing configs (mcp_sessions stays empty, tools_schema() returns [], broker omits the field — full Phase 1 compatibility). (2) User-authored: deep model preset switched from mistral-nemo-12b-instruct to qwen3-30b-a3b-instruct, with a 10-min timeout_ms accommodating the larger model's RK3588 inference time. Reason: nemo backend is dormant per the proxy /v1/models discovery (aish#23 now returns 404 cleanly for unknown models instead of silent fallback); qwen3-30b is the practical "deep" alternative. Phase 2 implementation is now complete — 7 of 7 commits landed: #1 |
||
|
|
8870eb0451 |
config: route all presets through hossenfelder per issue #12
Resolves issue #12 by partial-accept of the recommendation. What landed: - Single broker URL: http://hossenfelder.fritz.box:8082 for all three presets (fast / deep / cloud). Server-side model-aware routing; no client-side cloud auth (proxy holds the OpenRouter bearer). - Models from hossenfelder's /v1/models inventory: fast -> qwen2.5-coder-1.5b-q4_k_m.gguf (boltzmann local) deep -> mistral-nemo-12b-instruct (boltzmann local) cloud -> anthropic/claude-haiku-4.5 (OpenRouter route) - `cloud` was already pointing at hossenfelder but with https://; flipped to http:// so it matches the proxy's actual scheme. What deferred: - Schema rename `models` -> `brokers` (and the 5-cloud-preset shape suggested in #12) — would touch repl.lua + broker.lua. Not blocking Phase 7. If multi-preset becomes useful in practice, file a separate issue for the rename then. Phase 7 verification (live broker test): - broker.chat(fast, [user="say pong"]) -> "CMD: echo pong" in ~3s - multi-turn arithmetic (7*8=56, *2=112) preserved across turns Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4310207738 |
Phase 0: scaffold tree + manifest
- README, .gitignore, CLAUDE.md (project conventions) - docs/PHASE0.md — full Phase 0 manifest (locked substrate) - 10 root .lua modules + 4 ffi/ bindings, all stubs raising NotImplemented with module-scoped responsibilities matching the manifest - config.lua wired to current dirac/hossenfelder endpoints (qwen-coder-7b snappy/32k + cloud via OpenRouter through hossenfelder) File names match docs/PHASE0.md §4 exactly. Module bodies fill in across later phases; the tree shape is locked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |