claude-noether 4d98a8169d fleet/ohm + patches/driver/bes2600/queue-pending-record-lock-bh-danctnix: bundle besser#18 fix into the migration
Pulls the besser#18 lockdep fix (originally on
noether/bes2600-pending-record-lock-bh / PR #30) into this PR so the
ohm migration ships a single self-consistent pkgrel that contains all
three goal components: kernel-agent flow + Patch I + besser#18 fix
(plus the GCC 15 SCS Makefile workaround, no-op while SCS=n).

ohm.yaml includes now resolve to 4 patches:
  1. driver/bes2600/cumulative-c5x-danctnix/             (148 149 B)
  2. driver/bes2600/scan-filter-5ghz-danctnix/           (  7 735 B)
  3. arch/arm64/xor-neon-ffixed-x18-scs-build-fix-danctnix/ (1 562 B)
  4. driver/bes2600/queue-pending-record-lock-bh-danctnix/  (5 258 B)
  ----
  cumulative.patch                                       (162 704 B)
  b2sum 0eb091ddaba4a8f1c3c2a78eb8c621cdc6e6dfed6c43f7dac03e508a05b...

Trailer-strip applied to the besser#18 patch source for the same
reason as the SCS patch — it's now the last in the concatenated
cumulative, and patch(1) errors on the orphan '-- \n2.54.0\n' EOF
sentinel. Same gotcha documented in 84734ba.

PR #30 (the standalone besser#18 mirror PR) becomes superfluous
once this lands; close it as 'bundled into #28'.
2026-05-18 18:01:41 +02:00

kernel-agent

Owns the kernel side of the home fleet: source/branch/patch curation, per-host build orchestration, promote-to-fleet pipeline. Peer to His (home infra). Uses His for ops it doesn't own (waking data, host provisioning); files Gitea issues for coordination it can't decide alone.

Targets: dev/work hosts only. Infra hosts (noether, hertz, dcw2/3, turing, nuccies as compile-only) are NOT in the promote list — explicit opt-in via fleet/<host>.yaml manifest.

Customized today: ampere · boltzmann · fresnel · ohm

Anticipated Debian targets: higgs · clevo · pi-fleet (when they ask for it)

Lifecycle

       ┌────────────────────────────────────────────────────────────────┐
       │ INPUT — campaign session                                       │
       │   patches in marfrit/<campaign>/ or marfrit/misc-kernel-patches│
       │   triggers: ka-promote, ka-close, ka-abandon                   │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ ORCHESTRATION — kernel-agent                                   │
       │   resolve manifest by scope tag                                │
       │   pre-flight target build host (minimal; thorough nightly)     │
       │   on miss → [ka:host-changed] block to His                     │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ BUILD                                                          │
       │   aarch64: kbuild-aarch64 on boltzmann (primary)               │
       │            fermi on hertz (fallback)                           │
       │            distcc pool: tesla + dcc1 + dcc2 (zeroconf)         │
       │   x86_64:  kbuild-x86 on data (wakes via wake-host lmcp)       │
       │   ccache + 5-min watcher (hertz cron) for stalls/errors        │
       │   wall-clock cap (absolute), warn on degraded distcc pool      │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ SIGN                                                           │
       │   build host submits unsigned .pkg.tar.zst / .deb to hertz     │
       │   hertz signs with existing marfrit-packages key (one key,     │
       │     pkg + repo db)                                             │
       │   hertz pushes to packages.reauktion.de                        │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ INSTALL — consent-via-action                                   │
       │   kernel-agent files [ka:installable]                          │
       │   session-hook reminders (escalating: now, +1h, +6h, daily)    │
       │   YOU run ka-install <host>                                    │
       │     → backup current → pacman/apt -U → reboot                  │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ VERIFY — post-install (auto, by hertz cron)                    │
       │   Bar 1: SSH heartbeat (10 min)                                │
       │   Bar 2: package version installed                             │
       │   Bar 3: DTB/sysfs matches manifest (custom-DTB hosts)         │
       │   Bar 4: per-patch probe (manifest opt-in, simple lang)        │
       │   Bar 5: burn-in N hours (host opt-in)                         │
       │   failure → [ka:regression] block, host marked drifted         │
       └────────────────────────────────────────────────────────────────┘

       Loopback (7→4):  yank patches from manifest; host drifted; next
                        install converges. No automatic rollback;
                        backup at /sparfuxdata/kernel-agent-backups/
                        on hertz, 7-day retention, you fetch + reinstall.
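
The VERIFY stage reads as an ordered list of checks whose failure produces a [ka:regression] block. The sketch below is one possible shape for that, not the agent's actual code: `Bar`, `run_bars`, and `verdict` are hypothetical names, and stopping at the first failing bar is an assumption the diagram does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Bar:
    """One verification bar; `check` returns True on pass (hypothetical shape)."""
    name: str
    check: Callable[[], bool]
    enabled: bool = True  # Bars 3-5 are opt-in per manifest or host


def run_bars(bars: List[Bar]) -> Optional[str]:
    """Run enabled bars in order; return the first failing bar's name, else None."""
    for bar in bars:
        if not bar.enabled:
            continue
        if not bar.check():
            return bar.name  # assumption: stop at the first failure
    return None


def verdict(host: str, bars: List[Bar]) -> str:
    """Summarise a verification run as a one-line status string."""
    failed = run_bars(bars)
    if failed is None:
        return f"{host}: verified"
    # Diagram: failure -> [ka:regression] block, host marked drifted
    return f"[ka:regression] {host}: {failed} failed; host marked drifted"
```

Each real check (SSH heartbeat, package version, DTB/sysfs comparison) would be passed in as the `check` callable, which keeps the ordering logic independent of how a given bar is probed.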

Agent boundaries

                          peer agents
            ┌───────────────────────────────────────┐
            │                                       │
           His  ←──── lmcp tools (ops) ────→  kernel-agent
            │     wake-host, host-status,           │
            │     prepare-build-host, ...           │
            │                                       │
            └─── Gitea issues (coordination) ───────┘

                          ▲   ▲
                          │   │
                  campaign sessions
              (Bin · MegabitChip · RockHard ·
               Neutron · fresnel-fourier ·
               ohm_gl_fix · besser · ...)
                          ▲
                          │
                  subagents inside session
              (Janet · avr-specialist · Plan)
              no independent identity, contribute
              to whatever the calling session ships

Routine ops between peer agents go through lmcp tools (sync, idempotent, no per-call audit trail). Coordination goes through Gitea issues (async, persistent, audit trail per item).

Verbs (explicit, parameterized, audit-issue auto-filed)

ka-promote <host>          # resolve fleet/<host>.yaml → cumulative.patch + manifest.lock   [bin/ka-promote — implemented Phase 6, issue #22]
ka-import <campaign> <patch-or-glob> --to <scope>   # patches from campaign → scope-tagged tree (today: manual git workflow)
ka-close <campaign> --status success
ka-abandon <campaign> --keep-as-archive | --purge-from-fleet
ka-build <host>            # render PKGBUILD template with cumulative b2sum, run makepkg     [next verb, issue TBD]
ka-install <host>          # scp + pacman -U + extlinux/mkinitcpio + heartbeat               [last verb, issue TBD]
ka-keep <job-id> [--for <duration>]
ka-pause-prune  / ka-resume-prune
ka-restore-archive <job-id>
ka-snooze <issue-id> [--for <duration>]
ka-debug <job-id>          # shells into the same container that ran the build
ka-status                  # per-host one-liner with drift/pending state   [bin/ka-status — implemented Phase 1]
ka-migrate-tree --from <p> --to <p>
ka-wake-data               # wraps wake-host data through His

Note: the original spec had ka-promote <campaign> <patch-or-glob> --to <scope> ("promote patches from a campaign into the canonical tree"). That semantic moved to ka-import to free ka-promote for the manifest-resolution role its issue (#22) and the implemented bin/ka-promote actually fulfil. ka-import remains unimplemented; patches still land in patches/ via the regular git + PR workflow.

Conversational invocation triggers a y/n confirmation enumerating what will happen. Direct CLI invocation executes immediately.

Block-severity issues — what halts what

[ka:patch-fail]        only that patch's promotes
[ka:campaign-conflict] those patches across the involved campaigns
[ka:host-drifted]      installs to that host (builds OK)
[ka:build-fail]        builds routing to that build host
[ka:bootstrap-missing] builds for that build host
[ka:host-changed]      builds to that host until pre-flight re-passes
[ka:signing-fail]      global (all builds need signing)
[ka:regression]        installs to that host until triaged

Scoped per issue. No implicit cross-domain propagation. Dependency cascades detected at promote-time, not propagated globally.
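
The scoping rule above can be expressed as a lookup from block tag to the action it halts and the key it is scoped on. This is a minimal sketch under assumptions: the tuple encoding, the `is_blocked` helper, and the reading of [ka:campaign-conflict] as halting promotes and [ka:host-changed] as keyed on the build host are interpretations of the table, not documented behaviour.

```python
# tag -> (halted action, key the block is scoped on); None means global.
# Transcribed from the table above; the encoding itself is an assumption.
BLOCKS = {
    "ka:patch-fail":        ("promote", "patch"),
    "ka:campaign-conflict": ("promote", "patch"),
    "ka:host-drifted":      ("install", "host"),
    "ka:build-fail":        ("build", "build_host"),
    "ka:bootstrap-missing": ("build", "build_host"),
    "ka:host-changed":      ("build", "build_host"),
    "ka:signing-fail":      ("build", None),
    "ka:regression":        ("install", "host"),
}


def is_blocked(action, subject, open_issues):
    """Return True if any open block issue halts `action` for `subject`.

    subject:     dict with any of the keys "patch", "host", "build_host"
    open_issues: iterable of (tag, scoped_value) pairs
    """
    for tag, value in open_issues:
        halted_action, key = BLOCKS[tag]
        if halted_action != action:
            continue  # no cross-domain propagation
        if key is None or subject.get(key) == value:
            return True
    return False
```

For example, an open ("ka:host-drifted", "fresnel") issue blocks `is_blocked("install", {"host": "fresnel"}, ...)` but leaves builds and installs to other hosts untouched.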

Patch tree (in marfrit/kernel-agent)

patches/
├── arch/{arm64,x86_64}/
├── soc/{rockchip/{rk3399,rk3566,rk3588},...}/
├── module/<som-name>/
├── board/<board-name>/
├── driver/<driver-name>/
└── subsystem/<subsystem-name>/

Each patch lives at the narrowest scope that's correct (a board patch goes under board/, an SoC-wide fix under soc/). Per-host manifest resolves tags + explicit includes. Reorgs via ka-migrate-tree (atomic tree + manifest rewrite); paths stable otherwise.
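
To illustrate the scope layout and the cumulative.patch that ka-promote produces, here is a small sketch. It is not the code in bin/ka-promote: `scope_of` and `build_cumulative` are hypothetical helpers, and stripping the git format-patch signature from every patch (rather than only the last one) is a simplifying assumption. The digest uses BLAKE2b-512, which is what b2sum computes by default.

```python
import hashlib
import re
from pathlib import PurePosixPath

# Top-level scopes from the patch tree above.
SCOPES = {"arch", "soc", "module", "board", "driver", "subsystem"}

# Trailing git format-patch signature: "-- \n<git version>\n" at end of file.
_SIGNATURE = re.compile(r"\n-- \n\d+(\.\d+)+\n*\Z")


def scope_of(patch_path: str) -> str:
    """Return the top-level scope of a path under patches/, or raise ValueError."""
    parts = PurePosixPath(patch_path).parts
    if len(parts) < 3 or parts[0] != "patches" or parts[1] not in SCOPES:
        raise ValueError(f"no recognised scope in {patch_path!r}")
    return parts[1]


def build_cumulative(patch_texts):
    """Concatenate patches in manifest order; return (text, BLAKE2b hex digest)."""
    # Assumption: strip the signature trailer from every patch, not just the last.
    cleaned = [_SIGNATURE.sub("\n", text) for text in patch_texts]
    cumulative = "".join(cleaned)
    digest = hashlib.blake2b(cumulative.encode()).hexdigest()
    return cumulative, digest
```

Calling `scope_of("patches/driver/bes2600/x.patch")` returns "driver", while a path outside the six scopes raises, which is one way the "refuse promote of patches lacking scope tag" rule could be enforced.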

Build hosts

Host             Where           Role              Wake?     Notes
──────────────────────────────────────────────────────────────────────────
boltzmann        Rock 5 ITX+     aarch64 primary   always    container kbuild-aarch64
ampere           CoolPi GenBook  aarch64 secondary on-demand RK3588 32GB; same uarch as boltzmann,
                                                            wakes via His; idle 30 min → release
fermi            hertz LXD       aarch64 fallback  always    matches kbuild-aarch64 profile
kbuild-x86       data CT         x86_64            on-demand wakes via His; idle 30 min → release

Native make on the assigned build host. No distcc for kernel-agent builds (feedback_kernel_agent_no_distcc.md, locked 2026-05-09). ccache stays per-host. distcc remains in scope for userspace package builds.

Files / paths

/srv/kernel-agent/source/<job-id>/    live build dir (kbuild UID owns)
/srv/kernel-agent/ccache/             persistent across builds
/srv/kernel-agent/output/<job-id>/    built packages, pre-sign
/srv/kernel-agent/manifest/           per-host manifests (yaml)
/srv/kernel-agent/keep/               failed builds tagged ka-keep

hertz:/sparfuxdata/kernel-agent-backups/<host>/<version>/   7-day
hertz:/sparfuxdata/kernel-agent-archive/<job-id>/           1-year (cron)

https://logs.reauktion.de/<host>/<job-id>/   1-year (cron on lagrange)

Repos:

  • marfrit/kernel-agent — agent source, manifests, scope-tagged patch tree
  • marfrit/<campaign> — each campaign owns its repo
  • marfrit/misc-kernel-patches — landing pad for one-off non-campaign fixes
  • marfrit-packages — kernel package PKGBUILDs / .debs

Identity

Issues filed as the host the agent runs on (claude-noether by default, per reference_claude_noether_gitea.md). Title prefix [ka:*] carries the role. No new Gitea identity; per-host bootstrap one-liner already covers this.

Reminder channel

Active Claude session top-of-conversation hook only — no email, no HA, no DokuWiki. Cadence: escalating ladder (initial → +1h → +6h → daily). Snooze via ka-snooze <issue-id> [--for <duration>].
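
The escalating ladder can be sketched as a generator of reminder times. The names `reminder_times` and `next_reminder` are hypothetical, and two details are assumptions: the daily cadence is counted from the filing time (first daily reminder at +24h), and a snooze simply suppresses reminders until it expires.

```python
from datetime import datetime, timedelta
from itertools import count


def reminder_times(filed_at: datetime):
    """Yield reminder times: at filing, +1h, +6h, then once a day."""
    yield filed_at
    yield filed_at + timedelta(hours=1)
    yield filed_at + timedelta(hours=6)
    for day in count(1):  # assumption: daily steps counted from filing time
        yield filed_at + timedelta(days=day)


def next_reminder(filed_at, now, snoozed_until=None):
    """Return the first ladder time at or after `now` (and after any snooze)."""
    earliest = max(now, snoozed_until) if snoozed_until else now
    for t in reminder_times(filed_at):
        if t >= earliest:
            return t
```

Because the hook only fires at the top of an active session, a real implementation would presumably compare `next_reminder` against the session start time rather than run a timer.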

Hard rules — won't change without re-litigation

  • Never auto-promote. Closure is your explicit verb.
  • Never auto-install. Reboot only happens inside ka-install.
  • Never reach into $HOME on any host.
  • Never target infra hosts (noether, hertz, dcw*, turing) without explicit fleet/ manifest opt-in.
  • Never sudo-mutates host setup. His provisions; agent consumes.
  • Refuse abandon without --keep-as-archive | --purge-from-fleet flag.
  • Refuse promote of patches lacking scope tag.
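
The "refuse abandon without a flag" rule maps naturally onto a required, mutually exclusive argument group. The following is a sketch only: whether ka-abandon is written in Python or shell is not stated here, and `abandon_parser` is a hypothetical name.

```python
import argparse


def abandon_parser() -> argparse.ArgumentParser:
    """Build a parser that rejects ka-abandon without an explicit fate flag."""
    parser = argparse.ArgumentParser(prog="ka-abandon")
    parser.add_argument("campaign")
    # required=True makes argparse reject a bare `ka-abandon <campaign>`;
    # the group itself rejects passing both flags together.
    fate = parser.add_mutually_exclusive_group(required=True)
    fate.add_argument("--keep-as-archive", action="store_true")
    fate.add_argument("--purge-from-fleet", action="store_true")
    return parser
```

With this shape, `abandon_parser().parse_args(["besser"])` exits with a usage error instead of silently picking a default, which matches the intent that abandoning a campaign is always an explicit choice.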

Bootstrap reference build (2026-05-09 — fresnel)

First end-to-end run, before ka-promote / ka-build / ka-install existed. Documented here as the canonical worked example; the substrate that the ka-* verbs are/will-be implemented against. Issue #3 (fresnel DTS persistence) closed by this build. ka-promote (issue #22) replaced the manual step #1 below as of 2026-05-18.

Inputs

  • Baseline: torvalds/linux @ v7.0 (verified during ka-promote Phase 3, issue #22 — mmind/linux-rockchip does not ship a plain v7.0 tag despite earlier docs; mmind kept in fresnel.yaml as informational patch_authoring_context).
  • Patches (scope board/pinebook-pro):
    • 0001-arm64-dts-rk3399-pinebook-pro-add-OC-OPP-tables-1704-2184.patch
    • 0002-arm64-dts-rk3399-pinebook-pro-enable-hdmi-sound.patch
    • 0003-arm64-dts-rk3399-pinebook-pro-spi1-max-freq-10MHz.patch
  • Manifest: fleet/fresnel.yaml (tree=mmind v7.0, 3 patches above, alongside-install vs linux-eos-arm).
  • .config source: snapshot from fresnel /usr/lib/modules/6.19.10-1-eos-arm/build/.config, recovered from the data backintime backup (May 7 snapshot) since the laptop was off when the build started; make olddefconfig to fold in v7.0 new symbols (one harmless BOOTPARAM_SOFTLOCKUP_PANIC warning, ignored).

Manual substitute for each ka-* verb

Designed verb:        ka-import fresnel-fourier <patches> --to board/pinebook-pro (originally named ka-promote in this row)
What we did manually: Authored 3 patches with proper headers/scope tags, pushed to marfrit/kernel-agent/patches/board/pinebook-pro/ via Gitea contents API as claude-noether.
Status:               still manual; ka-import unimplemented

Designed verb:        ka-promote fresnel (new: manifest → cumulative.patch + manifest.lock)
What we did manually: n/a (didn't exist)
Status:               automated 2026-05-18, issue #22

Designed verb:        ka-build fresnel
What we did manually: On boltzmann: cloned linux v7.0 from kernel.org, ran makepkg -s --skipchecksums --skippgpcheck against marfrit-packages/arch/linux-fresnel-fourier/PKGBUILD. Native aarch64 (boltzmann is RK3588). One headers-pkg bug discovered (ln -sr on missing parent dir) and fixed mid-flight. Repackaged.
Status:               still manual; next verb to implement

Designed verb:        ka-sign + push
What we did manually: scp pkgs hertz → sudo /opt/herding/bin/marfrit-publish-arch aarch64 <pkg> per pkg. Script signs with key 92D5E96D8F63C75E4116AA1FF5C8C4603D0D250C, runs repo-add, rsyncs to nc.
Status:               still manual; folded into ka-build

Designed verb:        ka-install fresnel (consent-via-action)
What we did manually: sudo pacman -U /tmp/<pkg> over LAN scp (HTTPS to nc was throttled by fresnel's wifi). pacman post-transaction hook updated extlinux. mkinitcpio run manually because the standard hook trigger watches vmlinuz not Image.
Status:               still manual; last verb to implement

Designed verb:        Bar 1..3 verification
What we did manually: SSH heartbeat OK, pacman -Q linux-fresnel-fourier = 7.0-1, post-reboot cluster0 1.704 GHz / cluster1 2.184 GHz confirmed.
Status:               folded into ka-install

Files / locations involved

  • git.reauktion.de/marfrit/kernel-agent/patches/board/pinebook-pro/ — patches
  • git.reauktion.de/marfrit/kernel-agent/fleet/fresnel.yaml — manifest
  • git.reauktion.de/marfrit/marfrit-packages/arch/linux-fresnel-fourier/ — PKGBUILD + 3 patches + config + extlinux hook+script + mkinitcpio preset
  • boltzmann:~/src/kernel-agent-bootstrap/ — local build root (baseline clone, patches, build dir, artifacts)
  • hertz:/tmp/ka-publish/ — staging for sign+push (transient)
  • hertz:/sparfuxdata/kernel-agent-backups/fresnel/6.19.9-99-eos-arm/fresnel-boot-pre-install.tgz — pre-install /boot snapshot (71MB, 7-day retention per design)
  • https://packages.reauktion.de/arch/aarch64/linux-fresnel-fourier-7.0-1-aarch64.pkg.tar.zst — published artifact
  • fresnel:/boot/{Image,initramfs,dtbs}-fresnel-fourier{,/...} — installed artifacts
  • fresnel:/boot/extlinux/extlinux.conf — managed block tagged >>> linux-fresnel-fourier (managed) >>><<<

What was learned that ka-* should bake in

  • mkinitcpio's stock hook watches vmlinuz, not Image. ARM kernel installs must explicitly run mkinitcpio -p <preset> from the install hook, OR ship a custom alpm hook with Target = boot/Image-<suffix>.
  • Headers PKGBUILD: ln -sr "${_builddir}" "${pkgdir}/usr/src/${pkgbase}" needs a preceding install -d "${pkgdir}/usr/src". The line was cargo-culted from Arch's linux package without checking whether pacman pre-creates /usr/src for kernels.
  • HTTPS download from nc.reauktion.de can stall on slow wifi (fresnel @ 181 ms ping). Same-LAN scp from hertz (which already has the published pkgs in /tmp/ka-publish/) is the workaround. ka-install should detect and prefer LAN-fanout.
  • Manifest must carry the kernel suffix (-fresnel-fourier) explicitly so alongside-install paths (/boot/Image-<suffix>, /boot/dtbs-<suffix>/, /boot/initramfs-<suffix>.img) don't collide with the EOS-stock paths.
  • Backup target needs install -d -o $USER -g $USER first time per host — /sparfuxdata/kernel-agent-backups/<host>/<version>/ is created lazily.
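
The alongside-install paths named in the list above can be derived from the manifest's kernel suffix. This is a minimal sketch: `alongside_paths` is a hypothetical helper, and accepting the suffix with or without its leading dash is an assumption made for convenience.

```python
def alongside_paths(suffix: str) -> dict:
    """Boot artifact paths for a kernel installed next to the stock one."""
    name = suffix.lstrip("-")  # accept "-fresnel-fourier" or "fresnel-fourier"
    if not name:
        raise ValueError("kernel suffix must not be empty")
    return {
        "image": f"/boot/Image-{name}",
        "dtbs": f"/boot/dtbs-{name}/",
        "initramfs": f"/boot/initramfs-{name}.img",
    }
```

Deriving all three paths from one manifest field keeps them from drifting apart, and an empty suffix is rejected because it would resolve to names that could collide with the stock kernel's files.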

Out of scope this round (explicit defer)

  • vb2 dma_resv RFC v2: resolved 2026-05-15. Markus iterated v2 locally on boltzmann reaching pkgrel=14; the v2 series attaches the fence at device_run (slept-OK context per Dufresne's v1 review). Now carried in patches/subsystem/media/videobuf2/dma-resv-release-fence/ and included in fleet/fresnel.yaml. Still in scope for upstream targeting; default remains "build-tree only, no PR until explicitly asked" (feedback_no_upstream.md).
  • panfrost IOMMU_CACHE for RK3399 — sibling kernel work that targets the readback transitive-proof gap that vb2_dma_resv alone doesn't close. Still deferred until that lands; ship together when ready.
  • Replacing linux-eos-arm rather than coexisting alongside it. Coexistence preserves easy rollback at u-boot; can flip to provides=(linux-eos-arm) conflicts=(...) later once burn-in proves the OC kernel reliable.

Open follow-ups (post-rollout)

  • Migrate github.com/marfrit/misc_patches/genbook/kernel/ (9 patches against linux-6.19.9) into proper Coulomb/RockHard campaign repo with scope tags applied. Some patches will need splitting (e.g., 0010 suspend/resume is multi-scope and should split into soc:rk3588 + board:coolpi-cm5-genbook pieces). — Issue #1.
  • Migrate besser/patches/ (~30 BES2600 staging series) into the scope-tagged tree at driver/bes2600/ with promote eligibility per series. — Issue #2.
  • Decide whether boltzmann (BredOS-stock today) becomes a Neutron-managed custom kernel target or stays stock. Decision deferred per memory project_neutron.md. — Issue #4.
  • fresnel DTS persistence: closed by the bootstrap reference build above. Issue #3 closed.