4 Commits

Author SHA1 Message Date
test0r b81b021b5b v0.5.4: ship examples/lmcp.service template
Companion to lmcp-hub.service. Gives a copy-and-edit starting point for
per-host lmcp instances (foo-tools style). Handles the Arch-vs-Debian
/usr/bin/lua vs /usr/bin/lua5.4 split via a comment pointing users to
override ExecStart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 20:12:43 +00:00
test0r 17af91a99b v0.5.3: hub hardening — hard ssh timeout, parallel probes, sticky DOWN cache
Five changes addressing the recurring "hub wedges on offline backends" class
of failures (2 incidents in 24h, root-caused to single-threaded Lua + an
uninterruptible os.execute ssh call):

1. Hard wall-clock cap on ssh fallback via GNU `timeout --kill-after=2 30`.
   ConnectTimeout alone only bounds TCP connect; a half-dead sshd (auth
   stall, remote bash-s hang) used to freeze the whole event loop
   indefinitely. Configurable via LMCP_HUB_SSH_HARD_TIMEOUT. Also adds
   ServerAliveInterval=5 and ServerAliveCountMax=2 so an
   established-but-dead tunnel dies.

2. Parallel lmcp probes for remote_list_hosts. Shells out to a single bash
   fan-out of curl -m 3 calls, bounded by PROBE_BUDGET. Wall clock for a
   full 12-backend probe went from ~28 s (sum of per-host ssh connect
   timeouts) to ~3 s.

3. Probe is lmcp-only: ssh is no longer used as a health check. The hub
   exists to absorb lots of offline hosts, so an expensive ssh per probe
   was the exact wrong tradeoff. Actual remote_* tool calls still fall
   through to ssh fallback when lmcp is down.

4. Sticky DOWN cache with exponential backoff: 60 → 120 → 240 → 480 →
   900 s. Prevents a sleeping fleet from burning probe budget on every
   health check. UP hosts still use a 30 s TTL. Tunable via
   LMCP_HUB_PROBE_TTL_{UP,DOWN_MIN,DOWN_MAX}.

5. Per-request logging to stderr (tool, host, via, elapsed). This was
   invisible before and is now captured in the journal for the next
   hang's RCA.
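
Items 1 and 4 above can be sketched in shell roughly as follows. This is an
illustration only, not the hub.lua code: the function names are hypothetical,
the ConnectTimeout value and the 60/900 defaults are assumptions read off
the description, and the real hub does this from Lua.

```shell
#!/usr/bin/env bash
# Illustrative sketch; run_capped, ssh_fallback and next_down_ttl are
# hypothetical names, not functions from hub.lua.

# Item 1: run any command under a hard wall-clock cap. GNU timeout sends
# TERM after $cap seconds and KILL 2 s later if the process is still alive.
# Exit status is 124 when the cap fired (137 if the KILL was needed).
run_capped() {
    local cap="${LMCP_HUB_SSH_HARD_TIMEOUT:-30}"
    timeout --kill-after=2 "$cap" "$@"
}

# ssh fallback with both layers: ConnectTimeout bounds the TCP connect
# (5 s here is an assumed value), ServerAlive* kills an established but
# dead session, and run_capped bounds everything else.
ssh_fallback() {
    local host="$1" script="$2"
    run_capped ssh -o ConnectTimeout=5 \
        -o ServerAliveInterval=5 -o ServerAliveCountMax=2 \
        "$host" 'bash -s' < "$script"
}

# Item 4: next TTL for a host that is still DOWN. Starts at DOWN_MIN,
# doubles on each consecutive failure, and is clamped at DOWN_MAX.
next_down_ttl() {
    local current="${1:-0}"
    local min="${LMCP_HUB_PROBE_TTL_DOWN_MIN:-60}"
    local max="${LMCP_HUB_PROBE_TTL_DOWN_MAX:-900}"
    local next
    if [ "$current" -lt "$min" ]; then
        next="$min"
    else
        next=$(( current * 2 ))
    fi
    if [ "$next" -gt "$max" ]; then
        next="$max"
    fi
    echo "$next"
}
```

Feeding each result of next_down_ttl back in as the next input, starting
from 0, walks the 60, 120, 240, 480, 900 schedule and then stays at 900
until the host comes back UP.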

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:58:44 +00:00
test0r b29a2716d1 v0.5.2: shell_bg / remote_shell_bg — background launch tools
server.lua gains a shell_bg tool that launches a detached command via
setsid + nohup + stdio-redirect + &, returns immediately with PID and
log path. Linux-only for MVP (Windows Start-Process equivalent TBD).
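
As a rough shell sketch of the launch sequence described above (setsid,
nohup, stdio redirect, &): the function name launch_bg and the mktemp log
location are assumptions for illustration, not what server.lua does.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a background launcher; not lmcp's implementation.
launch_bg() {
    local log
    log="$(mktemp "${TMPDIR:-/tmp}/shell_bg.XXXXXX")" || return 1
    # setsid: start the command in a new session, away from the caller's
    #         controlling terminal.
    # nohup:  ignore SIGHUP.
    # All three stdio streams are redirected so the caller's pipes are not
    # held open and the call can return immediately.
    setsid nohup bash -c "$1" >"$log" 2>&1 </dev/null &
    # Report the PID and the log path on one line.
    echo "$! $log"
}
```

In a non-interactive shell the reported PID is that of the launched
command. Under interactive job control setsid may fork first, so the PID
can differ there; treat this as a sketch for scripted use only.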

hub.lua gains remote_shell_bg, forwarding to backend shell_bg. lmcp-only,
no ssh fallback — fallback for fire-and-forget is semantically murky.

Addresses the 'how do I launch a daemon over lmcp without the sentinel-
file wrapper blocking forever' question. Existing remote_shell keeps
its current synchronous-with-timeout behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 09:34:24 +00:00
test0r 490e688cc1 v0.5.0: add hub — fleet-wide MCP broker
One lmcp server on a central host (typically hertz) that proxies
remote_* tools to every backend in a registry, with a clean SSH
fallback for hosts whose lmcp is temporarily down or not installed.

Tools: remote_list_hosts, remote_{shell,read_file,write_file,edit_file,
list_dir,search_files}. Each takes a `host` argument naming the target
in /opt/herding/etc/hub-backends.conf (or $LMCP_HUB_BACKENDS).

Lazy 30s health cache; `remote_list_hosts force=true` bypasses it.
Bearer auth on inbound (standard lmcp opts.conf / LMCP_TOKEN machinery);
backend Bearer tokens kept in the registry and forwarded per-call.

SSH fallback uses `ssh host 'bash -s' < local_script` — stdin-piped
script body is the canonical shell-escape-free technique. Covers
shell/read_file/write_file/list_dir/search_files. edit_file is lmcp-only
because the literal-match + uniqueness check is nontrivial to replicate
safely in shell.
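
A minimal sketch of the stdin-piped technique, assuming a read_file-style
call. The function names are hypothetical, and quoting the path with
printf %q inside the script body is this sketch's choice, not necessarily
how hub.lua builds its scripts. A pipe is used in place of the
`< local_script` redirect; both deliver the script on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not taken from hub.lua.

# Emit a one-line script that prints a file. The path is quoted once, for
# the remote bash, and never passes through the ssh command line.
make_read_script() {
    printf 'cat -- %q\n' "$1"
}

# Pipe the script body to a transport command given as the remaining
# arguments. With no transport given, a local `bash -s` stands in for
# `ssh host 'bash -s'`.
remote_read_file() {
    local path="$1"
    shift
    if [ "$#" -eq 0 ]; then
        set -- bash -s
    fi
    make_read_script "$path" | "$@"
}
```

Against a real backend the call would look like
`remote_read_file /etc/hostname ssh somehost 'bash -s'`, where somehost is
a placeholder.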

Ships an example systemd unit and a commented backends.conf template
in examples/. No migration required for existing lmcp deployments —
hub.lua is additive alongside the existing server.lua.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 12:29:13 +00:00