Use hossenfelder as the canonical broker endpoint #12

New Issue

2026-05-10T11:23:05Z

claude-noether commented

2026-05-10 11:23:05 +00:00

Suggestion

Wire config.lua to use hossenfelder (the LXD-hosted LLM router on the boltzmann host) as the single broker endpoint, instead of the current direct-to-llamafile addressing.

Endpoint

http://hossenfelder.fritz.box:8082/v1/chat/completions

Hossenfelder is an OpenAI-compatible proxy that:

Routes by model field on POST.
Aggregates /v1/models from all reachable local backends + a curated cloud catalog.
Auto-routes cloud-prefixed model ids (anthropic/, openai/, mistralai/, qwen/, …) to OpenRouter via server-side Bearer auth — aish never sees the API key.
Tags responses with X-LLM-Backend so the actual route is observable.
Uses ThreadingHTTPServer (so multiple aish sessions / concurrent broker calls don't serialize).

Recommended `config.lua` brokers table

brokers = {
    cloud_opus = {
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "anthropic/claude-opus-4.7",
    },
    cloud = {  -- default cloud preset
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "anthropic/claude-sonnet-4.6",
    },
    cloud_haiku = {
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "anthropic/claude-haiku-4.5",
    },
    chat = {  -- local 8B chat
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "llama-3.1-8b-instruct",
    },
    chat_big = {  -- local 12B chat (slower, sharper)
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "mistral-nemo-12b-instruct",
    },
    fast = {  -- always-up, ~10-15 t/s for snappy meta-questions
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "qwen2.5-coder-1.5b-q4_k_m.gguf",
    },
    deep = {  -- only when dirac (data) is awake
        url   = "http://hossenfelder.fritz.box:8082/v1/chat/completions",
        model = "qwen-coder-7b-32k",
    },
}

Why over direct llamafile addressing

One URL in config, regardless of which physical box hosts which model.
Failover is hossenfelder's job, not aish's: when dirac is asleep, model-aware routing falls back to boltzmann automatically.
Cloud + local symmetric — the same /v1/chat/completions POST shape works for anthropic/claude-* and llama-3.1-8b-instruct. aish doesn't need a separate cloud branch.
Discovery is free — GET /v1/models returns the live merged catalog (8 cloud + 3 local at present). aish could surface this via a future :models meta-command.

Caveats / open questions

stream: true behavior — hossenfelder forwards SSE properly per its _stream_response() path. Phase 1 SSE work in aish should be unaffected, but worth a sanity test once the FFI streaming lands.
Tool-calling forwarding (Phase 2) — hossenfelder currently passes through OpenAI-style tools payloads transparently for cloud routes (OpenRouter handles it) and for local routes (llamafile parses tool schemas if the model supports it, e.g. Hermes). When MCP support lands in aish, this needs a verification pass.
Network dependency — aish becomes unusable if hossenfelder is down. Mitigation: keep direct fallbacks (e.g. boltzmann.fritz.box:8083 for Llama 3.1) commented as failover entries the user can swap in. Or wait for Phase 5 multi-model fallback to handle this.
Auth — none currently (LAN-only deploy). If aish ever talks to hossenfelder over a non-trusted network, the proxy needs Bearer auth wired in (it already supports forwarding Authorization headers).

Source

Endpoint validated 2026-05-10 via curl http://hossenfelder.fritz.box:8082/v1/models. Hossenfelder source: boltzmann LXC hossenfelder, /opt/llm-proxy.py. Model-aware routing path tagged (routed) in proxy logs.

## Suggestion Wire `config.lua` to use **hossenfelder** (the LXD-hosted LLM router on the boltzmann host) as the single broker endpoint, instead of the current direct-to-llamafile addressing. ### Endpoint ``` http://hossenfelder.fritz.box:8082/v1/chat/completions ``` Hossenfelder is an OpenAI-compatible proxy that: - Routes by `model` field on POST. - Aggregates `/v1/models` from all reachable local backends + a curated cloud catalog. - Auto-routes cloud-prefixed model ids (`anthropic/`, `openai/`, `mistralai/`, `qwen/`, …) to OpenRouter via server-side Bearer auth — aish never sees the API key. - Tags responses with `X-LLM-Backend` so the actual route is observable. - Uses `ThreadingHTTPServer` (so multiple aish sessions / concurrent broker calls don't serialize). ### Recommended `config.lua` brokers table ```lua brokers = { cloud_opus = { url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "anthropic/claude-opus-4.7", }, cloud = { -- default cloud preset url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "anthropic/claude-sonnet-4.6", }, cloud_haiku = { url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "anthropic/claude-haiku-4.5", }, chat = { -- local 8B chat url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "llama-3.1-8b-instruct", }, chat_big = { -- local 12B chat (slower, sharper) url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "mistral-nemo-12b-instruct", }, fast = { -- always-up, ~10-15 t/s for snappy meta-questions url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "qwen2.5-coder-1.5b-q4_k_m.gguf", }, deep = { -- only when dirac (data) is awake url = "http://hossenfelder.fritz.box:8082/v1/chat/completions", model = "qwen-coder-7b-32k", }, } ``` ### Why over direct llamafile addressing - **One URL** in config, regardless of which physical box hosts which model. - **Failover** is hossenfelder's job, not aish's: when dirac is asleep, model-aware routing falls back to boltzmann automatically. - **Cloud + local symmetric** — the same `/v1/chat/completions` POST shape works for `anthropic/claude-*` and `llama-3.1-8b-instruct`. aish doesn't need a separate cloud branch. - **Discovery is free** — `GET /v1/models` returns the live merged catalog (8 cloud + 3 local at present). aish could surface this via a future `:models` meta-command. ### Caveats / open questions 1. **`stream: true` behavior** — hossenfelder forwards SSE properly per its `_stream_response()` path. Phase 1 SSE work in aish should be unaffected, but worth a sanity test once the FFI streaming lands. 2. **Tool-calling forwarding (Phase 2)** — hossenfelder currently passes through OpenAI-style `tools` payloads transparently for cloud routes (OpenRouter handles it) and for local routes (llamafile parses tool schemas if the model supports it, e.g. Hermes). When MCP support lands in aish, this needs a verification pass. 3. **Network dependency** — aish becomes unusable if hossenfelder is down. Mitigation: keep direct fallbacks (e.g. `boltzmann.fritz.box:8083` for Llama 3.1) commented as failover entries the user can swap in. Or wait for Phase 5 multi-model fallback to handle this. 4. **Auth** — none currently (LAN-only deploy). If aish ever talks to hossenfelder over a non-trusted network, the proxy needs Bearer auth wired in (it already supports forwarding `Authorization` headers). ### Source Endpoint validated 2026-05-10 via `curl http://hossenfelder.fritz.box:8082/v1/models`. Hossenfelder source: boltzmann LXC `hossenfelder`, `/opt/llm-proxy.py`. Model-aware routing path tagged `(routed)` in proxy logs.

claude-noether added the recommendation label 2026-05-10 11:23:05 +00:00

marfrit referenced this issue from a commit

2026-05-10 14:10:02 +00:00

broker: blocking POST /v1/chat/completions via ffi/curl + dkjson

marfrit referenced this issue from a commit

2026-05-10 15:17:42 +00:00

repl: readline loop, dispatch, all Phase 0 meta commands

marfrit referenced this issue from a commit

2026-05-10 17:12:28 +00:00

main: --config/--help arg parsing, vendor on package.path, REPL start

claude-noether commented

2026-05-10 18:06:08 +00:00

Resolved by partial-accept in commit 8870eb0 (config: route all presets through hossenfelder per issue #12).

What landed

Single broker URL http://hossenfelder.fritz.box:8082 for all three presets (fast / deep / cloud)
Models picked from GET /v1/models: qwen2.5-coder-1.5b (fast), mistral-nemo-12b (deep), anthropic/claude-haiku-4.5 (cloud)
The pre-existing https:// on cloud was flipped to http:// to match the proxy

What deferred

Schema rename models → brokers (with multi-preset siblings like cloud_opus / cloud_haiku / cloud_sonnet) — would touch repl.lua + broker.lua. Not blocking. If the multi-preset shape becomes useful in practice, open a separate issue for the rename then.

Phase 7 live verification

Single-turn broker.chat(fast, ...) round-trip: ~3s, valid response
Multi-turn arithmetic (7×8=56, ×2=112) preserved across turns
Both local (boltzmann) and cloud (OpenRouter) routes reachable end-to-end

Closing.

Resolved by partial-accept in commit `8870eb0` (`config: route all presets through hossenfelder per issue #12`). **What landed** - Single broker URL `http://hossenfelder.fritz.box:8082` for all three presets (fast / deep / cloud) - Models picked from `GET /v1/models`: qwen2.5-coder-1.5b (fast), mistral-nemo-12b (deep), anthropic/claude-haiku-4.5 (cloud) - The pre-existing `https://` on cloud was flipped to `http://` to match the proxy **What deferred** - Schema rename `models` → `brokers` (with multi-preset siblings like cloud_opus / cloud_haiku / cloud_sonnet) — would touch repl.lua + broker.lua. Not blocking. If the multi-preset shape becomes useful in practice, open a separate issue for the rename then. **Phase 7 live verification** - Single-turn `broker.chat(fast, ...)` round-trip: ~3s, valid response - Multi-turn arithmetic (7×8=56, ×2=112) preserved across turns - Both local (boltzmann) and cloud (OpenRouter) routes reachable end-to-end Closing.

claude-noether closed this issue

2026-05-10 18:06:08 +00:00

marfrit referenced this issue from a commit

2026-05-10 18:06:10 +00:00

config: route all presets through hossenfelder per issue #12

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marfrit/aish#12

Use hossenfelder as the canonical broker endpoint #12

Suggestion

Endpoint

Recommended config.lua brokers table

Why over direct llamafile addressing

Caveats / open questions

Source

Recommended `config.lua` brokers table