config: deep model → deepseek-coder-v2-lite (temporary)

qwen3-30b-a3b-instruct isn't loaded on hossenfelder right now (per /v1/models). deepseek-coder-v2-lite IS loaded: a 16B MoE with ~2.4B active params, fast enough that the 30-min timeout carried over from the qwen3-30b config was far more than it needs. Switched to deepseek-coder-v2-lite for the time being. Restore qwen3-30b when the slot is back up.

Live-probed: a YES/NO destructive probe via the deep model preset returns "YES." in ~4.8s, well within the new 5-min timeout and fast enough that the Phase 3 LLM second-opinion path is functional again without falling back to "fail-safe YES" on every ambiguous command.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
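
The /v1/models check behind this switch can be sketched as below. This is an illustration, not code from the repository: it assumes the endpoint returns an OpenAI-compatible listing of the form `{"data": [{"id": ...}]}`, and `pick_loaded_model`, `SAMPLE`, and `PREFERRED` are hypothetical names. The sample payload is made up to mirror the situation described above; it is not captured output from hossenfelder.

```python
import json
from typing import List, Optional


def pick_loaded_model(models_payload: str, preferred: List[str]) -> Optional[str]:
    """Return the first preferred model id found in a /v1/models response body.

    Assumes an OpenAI-compatible shape: {"data": [{"id": "<model>"}, ...]}.
    Returns None when no preferred model is listed.
    """
    entries = json.loads(models_payload).get("data", [])
    loaded = {e.get("id") for e in entries if isinstance(e, dict)}
    for name in preferred:
        if name in loaded:
            return name
    return None


# Hypothetical response body: only the fallback model is listed.
SAMPLE = json.dumps(
    {"object": "list", "data": [{"id": "deepseek-coder-v2-lite", "object": "model"}]}
)

# Preference order: the usual deep model first, then the temporary fallback.
PREFERRED = ["qwen3-30b-a3b-instruct", "deepseek-coder-v2-lite"]

if __name__ == "__main__":
    print(pick_loaded_model(SAMPLE, PREFERRED))
```

Putting qwen3-30b-a3b-instruct first in the preference list means a check like this would select it again as soon as the slot is back, which matches the "restore when the slot is back up" note.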
2026-05-13 11:42:23 +00:00
parent a9b39cd435
commit d72689f709
1 changed file with 5 additions and 2 deletions
@@ -20,8 +20,11 @@ return {
        },
        deep = {
            endpoint    = HOSSENFELDER,
-            model       = "qwen3-30b-a3b-instruct",
-            timeout_ms  = 1800000,   -- 10 min; Nemo on RK3588 is patient work
+            -- 2026-05-13: qwen3-30b not loaded on hossenfelder right now;
+            -- using deepseek-coder-v2-lite (16B MoE, ~2.4B active) for the
+            -- time being. Restore qwen3-30b when the slot is back up.
+            model       = "deepseek-coder-v2-lite",
+            timeout_ms  = 300000,   -- 5 min; MoE inference is faster than dense 30B
            temperature = 0.1,
        },
        cloud = {