kernel-agent
Owns the kernel side of the home fleet: source/branch/patch curation, per-host build orchestration, promote-to-fleet pipeline. Peer to His (home infra). Uses His for ops it doesn't own (waking data, host provisioning); files Gitea issues for coordination it can't decide alone.
Targets: dev/work hosts only. Infra hosts (noether, hertz, dcw2/3, turing,
nuccies as compile-only) are NOT in the promote list — explicit opt-in via
fleet/<host>.yaml manifest.
Customized today: ampere · boltzmann · fresnel · ohm
Anticipated Debian targets: higgs · clevo · pi-fleet (when they ask for it)
Lifecycle
┌────────────────────────────────────────────────────────────────┐
│ INPUT — campaign session │
│ patches in marfrit/<campaign>/ or marfrit/misc-kernel-patches│
│ triggers: ka-promote, ka-close, ka-abandon │
└─────────────────────────┬──────────────────────────────────────┘
│
┌─────────────────────────▼──────────────────────────────────────┐
│ ORCHESTRATION — kernel-agent │
│ resolve manifest by scope tag │
│ pre-flight target build host (minimal; thorough nightly) │
│ on miss → [ka:host-changed] block to His │
└─────────────────────────┬──────────────────────────────────────┘
│
┌─────────────────────────▼──────────────────────────────────────┐
│ BUILD │
│ aarch64: kbuild-aarch64 on boltzmann (primary) │
│ fermi on hertz (fallback) │
│ distcc pool: tesla + dcc1 + dcc2 (zeroconf) │
│ x86_64: kbuild-x86 on data (wakes via wake-host lmcp) │
│ ccache + 5-min watcher (hertz cron) for stalls/errors │
│ wall-clock cap (absolute), warn on degraded distcc pool │
└─────────────────────────┬──────────────────────────────────────┘
│
┌─────────────────────────▼──────────────────────────────────────┐
│ SIGN │
│ build host submits unsigned .pkg.tar.zst / .deb to hertz │
│ hertz signs with existing marfrit-packages key (one key, │
│ pkg + repo db) │
│ hertz pushes to packages.reauktion.de │
└─────────────────────────┬──────────────────────────────────────┘
│
┌─────────────────────────▼──────────────────────────────────────┐
│ INSTALL — consent-via-action │
│ kernel-agent files [ka:installable] │
│ session-hook reminders (escalating: now, +1h, +6h, daily) │
│ YOU run ka-install <host> │
│ → backup current → pacman/apt -U → reboot │
└─────────────────────────┬──────────────────────────────────────┘
│
┌─────────────────────────▼──────────────────────────────────────┐
│ VERIFY — post-install (auto, by hertz cron) │
│ Bar 1: SSH heartbeat (10 min) │
│ Bar 2: package version installed │
│ Bar 3: DTB/sysfs matches manifest (custom-DTB hosts) │
│ Bar 4: per-patch probe (manifest opt-in, simple lang) │
│ Bar 5: burn-in N hours (host opt-in) │
│ failure → [ka:regression] block, host marked drifted │
└────────────────────────────────────────────────────────────────┘
Loopback (7→4): yank patches from manifest; host drifted; next
install converges. No automatic rollback;
backup at /sparfuxdata/kernel-agent-backups/
on hertz, 7-day retention, you fetch + reinstall.
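The VERIFY stage could be sequenced roughly as follows. This is a minimal sketch, assuming bars run in order and stop at the first failure (the diagram does not say whether later bars still run after one fails). `Bar`, `verify`, and the opt-in keys are hypothetical names for illustration, not part of kernel-agent.

```python
from typing import Callable, NamedTuple, Optional


class Bar(NamedTuple):
    name: str
    opt_in_key: Optional[str]    # None means the bar always runs
    check: Callable[[], bool]    # returns True when the bar passes


def verify(bars, opt_ins):
    """Run bars in order and stop at the first failure.

    Returns (passed_names, failed_name), where failed_name is None
    when every applicable bar passed.
    """
    passed = []
    for bar in bars:
        # Opt-in bars are skipped unless the manifest or host enabled them.
        if bar.opt_in_key is not None and bar.opt_in_key not in opt_ins:
            continue
        if not bar.check():
            return passed, bar.name
        passed.append(bar.name)
    return passed, None
```

A caller would translate a non-None `failed_name` into a [ka:regression] issue and mark the host drifted; that part is left out here.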
Agent boundaries
peer agents
┌───────────────────────────────────────┐
│ │
His ←──── lmcp tools (ops) ────→ kernel-agent
│ wake-host, host-status, │
│ prepare-build-host, ... │
│ │
└─── Gitea issues (coordination) ───────┘
▲ ▲
│ │
campaign sessions
(Bin · MegabitChip · RockHard ·
Neutron · fresnel-fourier ·
ohm_gl_fix · besser · ...)
▲
│
subagents inside session
(Janet · avr-specialist · Plan)
no independent identity, contribute
to whatever the calling session ships
Routine ops between peer agents go through lmcp tools (sync, idempotent, no per-call audit trail). Coordination goes through Gitea issues (async, persistent, audit trail per item).
Verbs (explicit, parameterized, audit-issue auto-filed)
ka-promote <host> # resolve fleet/<host>.yaml → cumulative.patch + manifest.lock [bin/ka-promote — implemented Phase 6, issue #22]
ka-import <campaign> <patch-or-glob> --to <scope> # patches from campaign → scope-tagged tree (today: manual git workflow)
ka-close <campaign> --status success
ka-abandon <campaign> --keep-as-archive | --purge-from-fleet
ka-build <host> # render PKGBUILD template with cumulative b2sum, run makepkg [next verb, issue TBD]
ka-install <host> # scp + pacman -U + extlinux/mkinitcpio + heartbeat [last verb, issue TBD]
ka-keep <job-id> [--for <duration>]
ka-pause-prune / ka-resume-prune
ka-restore-archive <job-id>
ka-snooze <issue-id> [--for <duration>]
ka-debug <job-id> # shells into the same container that ran the build
ka-status # per-host one-liner with drift/pending state [bin/ka-status — implemented Phase 1]
ka-migrate-tree --from <p> --to <p>
ka-wake-data # wraps wake-host data through His
Note: the original spec had ka-promote <campaign> <patch-or-glob> --to <scope>
("promote patches from a campaign into the canonical tree"). That semantic
moved to ka-import to free ka-promote for the manifest-resolution role
its issue (#22) and the implemented bin/ka-promote actually fulfil. ka-import
remains unimplemented — patches still land in patches/ via the regular
git + PR workflow.
Conversational invocation triggers a y/n confirmation enumerating what will happen. Direct CLI invocation executes immediately.
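The two invocation paths can be sketched as one wrapper. This is an illustrative sketch only: `run_verb`, its parameters, and the prompt wording are assumptions, not the actual kernel-agent entry point.

```python
def run_verb(steps, execute, conversational, ask=input):
    """Run a verb, asking for y/n first when invoked conversationally.

    steps:    human-readable list of what the verb will do
    execute:  zero-argument callable that performs the verb
    ask:      prompt function (defaults to input; injectable for tests)
    Returns True if the verb ran, False if the user declined.
    """
    if conversational:
        listing = "\n".join(f"  - {step}" for step in steps)
        answer = ask(f"This will:\n{listing}\nProceed? [y/n] ")
        if answer.strip().lower() != "y":
            return False
    # Direct CLI invocation falls through and executes immediately.
    execute()
    return True
```

Anything other than an explicit `y` is treated as a refusal in this sketch, which errs on the side of not acting.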
Block-severity issues — what halts what
[ka:patch-fail] only that patch's promotes
[ka:campaign-conflict] those patches across the involved campaigns
[ka:host-drifted] installs to that host (builds OK)
[ka:build-fail] builds routing to that build host
[ka:bootstrap-missing] builds for that build host
[ka:host-changed] builds to that host until pre-flight re-passes
[ka:signing-fail] global (all builds need signing)
[ka:regression] installs to that host until triaged
Scoped per issue. No implicit cross-domain propagation. Dependency cascades detected at promote-time, not propagated globally.
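One way to picture the scoping rule is a lookup from issue tag to the action it halts and the kind of subject it is scoped to. The sketch below is an interpretation of the list above, not the agent's code: `BLOCK_RULES`, `OpenIssue`, and `blocking_issues` are hypothetical names, and reading [ka:campaign-conflict] as a per-patch promote block is an assumption.

```python
from dataclasses import dataclass

# tag -> (halted action, scope kind). "global" ignores the subject.
BLOCK_RULES = {
    "ka:patch-fail": ("promote", "patch"),
    "ka:campaign-conflict": ("promote", "patch"),   # assumed reading
    "ka:host-drifted": ("install", "host"),
    "ka:build-fail": ("build", "build_host"),
    "ka:bootstrap-missing": ("build", "build_host"),
    "ka:host-changed": ("build", "build_host"),
    "ka:signing-fail": ("build", "global"),
    "ka:regression": ("install", "host"),
}


@dataclass(frozen=True)
class OpenIssue:
    tag: str       # e.g. "ka:host-drifted"
    subject: str   # patch, host, or build host named by the issue; "" if global


def blocking_issues(action, context, open_issues):
    """Return the open issues that halt `action` in `context`.

    context maps a scope kind to the set of names involved, for example
    {"host": {"fresnel"}, "build_host": {"boltzmann"}}.
    """
    hits = []
    for issue in open_issues:
        rule = BLOCK_RULES.get(issue.tag)
        if rule is None:
            continue
        halted_action, scope = rule
        if halted_action != action:
            continue   # no cross-domain propagation
        if scope == "global" or issue.subject in context.get(scope, set()):
            hits.append(issue)
    return hits
```

With a [ka:host-drifted] issue open on fresnel, an install to fresnel is blocked while a build for it is not, which matches the "(builds OK)" note.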
Patch tree (in marfrit/kernel-agent)
patches/
├── arch/{arm64,x86_64}/
├── soc/{rockchip/{rk3399,rk3566,rk3588},...}/
├── module/<som-name>/
├── board/<board-name>/
├── driver/<driver-name>/
└── subsystem/<subsystem-name>/
Each patch lives at the narrowest scope that's correct (a board patch goes
under board/, an SoC-wide fix under soc/). Per-host manifest resolves
tags + explicit includes. Reorgs via ka-migrate-tree (atomic tree +
manifest rewrite); paths stable otherwise.
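Resolution of tags plus explicit includes might look like the sketch below. It assumes each scope tag and each include is a directory path relative to patches/ and that only the `*.patch` files directly inside it are selected; the real manifest schema and the behaviour of bin/ka-promote may differ. `resolve_patches` is a hypothetical helper.

```python
from pathlib import Path


def resolve_patches(tree_root, scope_tags, includes=()):
    """Collect patch files for the given scopes and explicit include dirs.

    Raises FileNotFoundError when a referenced directory is missing, so a
    renamed directory is caught at resolve time instead of being skipped.
    """
    root = Path(tree_root)
    selected = []
    for rel in list(scope_tags) + list(includes):
        scope_dir = root / rel
        if not scope_dir.is_dir():
            raise FileNotFoundError(f"scope or include not found: {rel}")
        # Sorted so numbered series (0001-, 0002-, ...) apply in order.
        for patch in sorted(scope_dir.glob("*.patch")):
            if patch not in selected:
                selected.append(patch)
    return selected
```

Failing loudly on a missing directory is the design choice here: an include that silently resolves to nothing would produce a kernel without the patch the manifest asked for.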
Build hosts
Host        Where            Role               Wake?      Notes
──────────────────────────────────────────────────────────────────────────
boltzmann   Rock 5 ITX+      aarch64 primary    always     container kbuild-aarch64
ampere      CoolPi GenBook   aarch64 secondary  on-demand  RK3588 32GB; same uarch as boltzmann,
                                                           wakes via His; idle 30 min → release
fermi       hertz LXD        aarch64 fallback   always     matches kbuild-aarch64 profile
kbuild-x86  data CT          x86_64             on-demand  wakes via His; idle 30 min → release
Native make on the assigned build host. No distcc for kernel-agent
builds (feedback_kernel_agent_no_distcc.md, locked 2026-05-09). ccache
stays per-host. distcc remains in scope for userspace package builds.
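Build-host assignment could be sketched as a preference list per architecture. The ordering primary, secondary, fallback is inferred from the Role column and is an assumption, as are the names `BUILD_HOSTS`, `pick_build_host`, and the `is_usable` callback (which would stand in for pre-flight, wake-via-His, and open block issues).

```python
# Preference order per architecture, taken from the Role column above.
BUILD_HOSTS = {
    "aarch64": ["boltzmann", "ampere", "fermi"],
    "x86_64": ["kbuild-x86"],
}


def pick_build_host(arch, is_usable):
    """Return the first usable build host for arch, or None.

    is_usable(host) is a caller-supplied predicate; it is expected to
    cover pre-flight checks and waking on-demand hosts.
    """
    for host in BUILD_HOSTS.get(arch, []):
        if is_usable(host):
            return host
    return None
```

Returning None rather than raising leaves it to the caller to decide whether an unavailable pool is a block-severity issue.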
Files / paths
/srv/kernel-agent/source/<job-id>/ live build dir (kbuild UID owns)
/srv/kernel-agent/ccache/ persistent across builds
/srv/kernel-agent/output/<job-id>/ built packages, pre-sign
/srv/kernel-agent/manifest/ per-host manifests (yaml)
/srv/kernel-agent/keep/ failed builds tagged ka-keep
hertz:/sparfuxdata/kernel-agent-backups/<host>/<version>/ 7-day
hertz:/sparfuxdata/kernel-agent-archive/<job-id>/ 1-year (cron)
https://logs.reauktion.de/<host>/<job-id>/ 1-year (cron on lagrange)
Repos:
- marfrit/kernel-agent — agent source, manifests, scope-tagged patch tree
- marfrit/<campaign> — each campaign owns its repo
- marfrit/misc-kernel-patches — landing pad for one-off non-campaign fixes
- marfrit-packages — kernel package PKGBUILDs / .debs
Identity
Issues filed as the host the agent runs on (claude-noether by default, per
reference_claude_noether_gitea.md). Title prefix [ka:*] carries the role.
No new Gitea identity; per-host bootstrap one-liner already covers this.
Reminder channel
Active Claude session top-of-conversation hook only — no email, no HA, no
DokuWiki. Cadence: escalating ladder (initial → +1h → +6h → daily). Snooze
via ka-snooze <issue-id> [--for <duration>].
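The escalating ladder can be sketched as a generator of reminder timestamps. The sketch assumes that +1h and +6h are measured from the time the issue was filed, that "daily" means every 24 hours from filing, and that a snooze simply suppresses reminders before its expiry; none of these details are fixed by the text above. `reminder_times` is a hypothetical name.

```python
from datetime import datetime, timedelta
from itertools import islice


def reminder_times(filed_at, snoozed_until=None):
    """Yield reminder timestamps: at filing, +1h, +6h, then every 24h.

    Reminders that fall before snoozed_until are skipped.
    """
    def due(t):
        return snoozed_until is None or t >= snoozed_until

    for offset in (timedelta(0), timedelta(hours=1), timedelta(hours=6)):
        t = filed_at + offset
        if due(t):
            yield t
    day = 1
    while True:   # daily reminders continue until the issue is closed
        t = filed_at + timedelta(days=day)
        if due(t):
            yield t
        day += 1
```

Because the generator is infinite, a caller takes only what it needs, for example `list(islice(reminder_times(filed_at), 5))`.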
Hard rules — won't change without re-litigation
- Never auto-promote. Closure is your explicit verb.
- Never auto-install. Reboot only happens inside ka-install.
- Never reach into $HOME on any host.
- Never target infra hosts (noether, hertz, dcw*, turing) without explicit fleet/manifest opt-in.
- Never sudo-mutate host setup. His provisions; agent consumes.
- Refuse abandon without --keep-as-archive|--purge-from-fleet flag.
- Refuse promote of patches lacking scope tag.
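The abandon rule maps naturally onto a required, mutually exclusive flag group. The following is a sketch using Python's argparse, assuming a Python CLI; the actual ka-abandon implementation is not shown in this document and `build_abandon_parser` is a hypothetical name.

```python
import argparse


def build_abandon_parser():
    """Parser that refuses ka-abandon unless exactly one mode flag is given."""
    parser = argparse.ArgumentParser(prog="ka-abandon")
    parser.add_argument("campaign")
    # required=True makes argparse exit with status 2 when neither flag
    # is present; the group itself rejects passing both.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--keep-as-archive", action="store_true")
    mode.add_argument("--purge-from-fleet", action="store_true")
    return parser
```

Calling `build_abandon_parser().parse_args(["besser", "--keep-as-archive"])` yields a namespace with `keep_as_archive` set; omitting both flags makes argparse print a usage error and exit.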
Bootstrap reference build (2026-05-09 — fresnel)
First end-to-end run, before ka-promote / ka-build / ka-install existed.
Documented here as the canonical worked example; the substrate that the ka-*
verbs are/will-be implemented against. Issue #3 (fresnel DTS persistence) closed by this
build. ka-promote (issue #22) replaced the manual step #1 below as of 2026-05-18.
Inputs
- Baseline: torvalds/linux @ v7.0 (verified during ka-promote Phase 3, issue #22 — mmind/linux-rockchip does not ship a plain v7.0 tag despite earlier docs; mmind kept in fresnel.yaml as informational patch_authoring_context).
- Patches (scope board/pinebook-pro):
  - 0001-arm64-dts-rk3399-pinebook-pro-add-OC-OPP-tables-1704-2184.patch
  - 0002-arm64-dts-rk3399-pinebook-pro-enable-hdmi-sound.patch
  - 0003-arm64-dts-rk3399-pinebook-pro-spi1-max-freq-10MHz.patch
- Manifest: fleet/fresnel.yaml (tree=mmind v7.0, 3 patches above, alongside-install vs linux-eos-arm).
- .config source: snapshot from fresnel /usr/lib/modules/6.19.10-1-eos-arm/build/.config, recovered from the data backintime backup (May 7 snapshot) since the laptop was off when the build started; make olddefconfig to fold in v7.0 new symbols (one harmless BOOTPARAM_SOFTLOCKUP_PANIC warning, ignored).
Manual substitute for each ka-* verb
| Designed verb | What we did manually | Status |
|---|---|---|
| ka-import fresnel-fourier <patches> --to board/pinebook-pro (originally named ka-promote in this row) | Authored 3 patches with proper headers/scope tags, pushed to marfrit/kernel-agent/patches/board/pinebook-pro/ via Gitea contents API as claude-noether. | still manual — ka-import unimplemented |
| ka-promote fresnel (new — manifest → cumulative.patch + manifest.lock) | n/a (didn't exist) | automated 2026-05-18, issue #22 |
| ka-build fresnel | On boltzmann: cloned linux v7.0 from kernel.org, ran makepkg -s --skipchecksums --skippgpcheck against marfrit-packages/arch/linux-fresnel-fourier/PKGBUILD. Native aarch64 (boltzmann is RK3588). One headers-pkg bug discovered (ln -sr on missing parent dir) and fixed mid-flight. Repackaged. | still manual — next verb to implement |
| ka-sign + push | scp pkgs hertz → sudo /opt/herding/bin/marfrit-publish-arch aarch64 <pkg> per pkg. Script signs with key 92D5E96D8F63C75E4116AA1FF5C8C4603D0D250C, runs repo-add, rsyncs to nc. | still manual — folded into ka-build |
| ka-install fresnel (consent-via-action) | sudo pacman -U /tmp/<pkg> over LAN scp (HTTPS to nc was throttled by fresnel's wifi). pacman post-transaction hook updated extlinux. mkinitcpio run manually because the standard hook trigger watches vmlinuz not Image. | still manual — last verb to implement |
| Bar 1..3 verification | SSH heartbeat OK, pacman -Q linux-fresnel-fourier = 7.0-1, post-reboot cluster0 1.704 GHz / cluster1 2.184 GHz confirmed. | folded into ka-install |
Files / locations involved
- git.reauktion.de/marfrit/kernel-agent/patches/board/pinebook-pro/ — patches
- git.reauktion.de/marfrit/kernel-agent/fleet/fresnel.yaml — manifest
- git.reauktion.de/marfrit/marfrit-packages/arch/linux-fresnel-fourier/ — PKGBUILD + 3 patches + config + extlinux hook+script + mkinitcpio preset
- boltzmann:~/src/kernel-agent-bootstrap/ — local build root (baseline clone, patches, build dir, artifacts)
- hertz:/tmp/ka-publish/ — staging for sign+push (transient)
- hertz:/sparfuxdata/kernel-agent-backups/fresnel/6.19.9-99-eos-arm/fresnel-boot-pre-install.tgz — pre-install /boot snapshot (71MB, 7-day retention per design)
- https://packages.reauktion.de/arch/aarch64/linux-fresnel-fourier-7.0-1-aarch64.pkg.tar.zst — published artifact
- fresnel:/boot/{Image,initramfs,dtbs}-fresnel-fourier{,/...} — installed artifacts
- fresnel:/boot/extlinux/extlinux.conf — managed block tagged >>> linux-fresnel-fourier (managed) >>> … <<<
What was learned that ka-* should bake in
- mkinitcpio's stock hook watches vmlinuz, not Image. ARM kernel installs must explicitly run mkinitcpio -p <preset> from the install hook, OR ship a custom alpm hook with Target = boot/Image-<suffix>.
- Headers PKGBUILD: ln -sr "${_builddir}" "${pkgdir}/usr/src/${pkgbase}" needs a preceding install -d "${pkgdir}/usr/src". Cargo-cult from arch's linux package without checking that pacman pre-creates /usr/src for kernels.
- HTTPS download from nc.reauktion.de can stall on slow wifi (fresnel @ 181 ms ping). Same-LAN scp from hertz (which already has the published pkgs in /tmp/ka-publish/) is the workaround. ka-install should detect and prefer LAN-fanout.
- Manifest must carry the kernel suffix (-fresnel-fourier) explicitly so alongside-install paths (/boot/Image-<suffix>, /boot/dtbs-<suffix>/, /boot/initramfs-<suffix>.img) don't collide with the EOS-stock paths.
- Backup target needs install -d -o $USER -g $USER first time per host — /sparfuxdata/kernel-agent-backups/<host>/<version>/ is created lazily.
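The suffix lesson can be made concrete with a small helper that derives the alongside-install paths from the manifest's kernel suffix. This is a sketch: `alongside_paths` is a hypothetical function, and stripping the leading hyphen follows the installed paths listed earlier (for example /boot/Image-fresnel-fourier) rather than any documented manifest rule.

```python
def alongside_paths(kernel_suffix):
    """Derive /boot paths for an alongside-installed kernel.

    kernel_suffix is the manifest value, e.g. "-fresnel-fourier".
    An empty suffix is rejected because the resulting paths could
    collide with the stock kernel's files.
    """
    name = kernel_suffix.lstrip("-")
    if not name:
        raise ValueError("manifest must carry an explicit kernel suffix")
    return {
        "image": f"/boot/Image-{name}",
        "dtbs": f"/boot/dtbs-{name}/",
        "initramfs": f"/boot/initramfs-{name}.img",
    }
```

Deriving all three paths from one manifest field keeps the package, the extlinux block, and the verification step pointing at the same files.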
Out of scope this round (explicit defer)
- vb2 dma_resv RFC v2 — resolved 2026-05-15. Markus iterated v2 locally on boltzmann reaching pkgrel=14; the v2 series attaches the fence at device_run (slept-OK context per Dufresne's v1 review). Now carried in patches/subsystem/media/videobuf2/dma-resv-release-fence/ and included in fleet/fresnel.yaml. Still in scope for upstream targeting; default remains "build-tree only, no PR until explicitly asked" (feedback_no_upstream.md).
- panfrost IOMMU_CACHE for RK3399 — sibling kernel work that targets the readback transitive-proof gap that vb2_dma_resv alone doesn't close. Still deferred until that lands; ship together when ready.
- Replace linux-eos-arm rather than coexist alongside it. Deferred: coexisting preserves easy rollback at u-boot. Can flip to provides=(linux-eos-arm) conflicts=(...) later once burn-in proves the OC kernel reliable.
Open follow-ups (post-rollout)
- Migrate github.com/marfrit/misc_patches/genbook/kernel/ (9 patches against linux-6.19.9) into proper Coulomb/RockHard campaign repo with scope tags applied. Some patches will need splitting (e.g., 0010 suspend/resume is multi-scope and should split into soc:rk3588 + board:coolpi-cm5-genbook pieces). — Issue #1.
- Migrate besser/patches/ (~30 BES2600 staging series) into the scope-tagged tree at driver/bes2600/ with promote eligibility per series. — Issue #2.
- Decide whether boltzmann (BredOS-stock today) becomes a Neutron-managed custom kernel target or stays stock. Decision deferred per memory project_neutron.md. — Issue #4.
- fresnel DTS persistence — closed by the bootstrap reference build above. Issue #3 closed.