claude-noether a840f76907 patches/arch/arm64/xor-neon-...: fix malformed @@ hunk counts
The hunk header @@ -9,6 +9,10 @@ understated both old (actual 7) and
new (actual 12) line counts by 1. patch(1) standalone tolerates this
via fuzz, but in the concatenated cumulative the wrong counts cause
patch to mis-judge the hunk boundary and read the trailing context
line ('lib-...uaccess_flushcache.o') as the start of a new patch
header — 'malformed patch at line 4526'.

Cumulative b2sum: bd42cd39106298879eeb... -> ad9e2cb533957f218058...
(size unchanged at 157 458; only the @@ counts in the SCS patch
differ)
2026-05-18 16:58:19 +02:00

kernel-agent

Owns the kernel side of the home fleet: source/branch/patch curation, per-host build orchestration, promote-to-fleet pipeline. Peer to His (home infra). Uses His for ops it doesn't own (waking data, host provisioning); files Gitea issues for coordination it can't decide alone.

Targets: dev/work hosts only. Infra hosts (noether, hertz, dcw2/3, turing, nuccies as compile-only) are NOT in the promote list — explicit opt-in via fleet/<host>.yaml manifest.

Customized today: ampere · boltzmann · fresnel · ohm
Anticipated Debian targets: higgs · clevo · pi-fleet (when they ask for it)

Lifecycle

       ┌────────────────────────────────────────────────────────────────┐
       │ INPUT — campaign session                                       │
       │   patches in marfrit/<campaign>/ or marfrit/misc-kernel-patches│
       │   triggers: ka-promote, ka-close, ka-abandon                   │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ ORCHESTRATION — kernel-agent                                   │
       │   resolve manifest by scope tag                                │
       │   pre-flight target build host (minimal; thorough nightly)     │
       │   on miss → [ka:host-changed] block to His                     │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ BUILD                                                          │
       │   aarch64: kbuild-aarch64 on boltzmann (primary)               │
       │            fermi on hertz (fallback)                           │
       │            distcc pool: tesla + dcc1 + dcc2 (zeroconf)         │
       │   x86_64:  kbuild-x86 on data (wakes via wake-host lmcp)       │
       │   ccache + 5-min watcher (hertz cron) for stalls/errors        │
       │   wall-clock cap (absolute), warn on degraded distcc pool      │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ SIGN                                                           │
       │   build host submits unsigned .pkg.tar.zst / .deb to hertz     │
       │   hertz signs with existing marfrit-packages key (one key,     │
       │     pkg + repo db)                                             │
       │   hertz pushes to packages.reauktion.de                        │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ INSTALL — consent-via-action                                   │
       │   kernel-agent files [ka:installable]                          │
       │   session-hook reminders (escalating: now, +1h, +6h, daily)    │
       │   YOU run ka-install <host>                                    │
       │     → backup current → pacman/apt -U → reboot                  │
       └─────────────────────────┬──────────────────────────────────────┘
                                 │
       ┌─────────────────────────▼──────────────────────────────────────┐
       │ VERIFY — post-install (auto, by hertz cron)                    │
       │   Bar 1: SSH heartbeat (10 min)                                │
       │   Bar 2: package version installed                             │
       │   Bar 3: DTB/sysfs matches manifest (custom-DTB hosts)         │
       │   Bar 4: per-patch probe (manifest opt-in, simple lang)        │
       │   Bar 5: burn-in N hours (host opt-in)                         │
       │   failure → [ka:regression] block, host marked drifted         │
       └────────────────────────────────────────────────────────────────┘

       Loopback (7→4):  yank patches from manifest; host drifted; next
                        install converges. No automatic rollback;
                        backup at /sparfuxdata/kernel-agent-backups/
                        on hertz, 7-day retention, you fetch + reinstall.
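
The VERIFY stage in the diagram above can be pictured as an ordered list of probes, where opt-in bars are skipped unless the host's manifest enables them. A minimal sketch: the function name, the shape of the check tuples and the stop-at-first-failure behaviour are assumptions for illustration, not the hertz cron job itself.

```python
from typing import Callable, Iterable, Optional

# (bar number, label, probe returning True on success); hypothetical shape
Check = tuple[int, str, Callable[[], bool]]


def first_failed_bar(checks: Iterable[Check], enabled: set[int]) -> Optional[str]:
    """Run enabled bars in ascending order; return a regression title or None."""
    for number, label, probe in sorted(checks, key=lambda c: c[0]):
        if number not in enabled:
            continue  # opt-in bar not requested for this host
        if not probe():
            return f"[ka:regression] Bar {number} failed: {label}"
    return None


# Stand-in probes; real ones would use SSH, the package manager and sysfs.
checks = [
    (1, "SSH heartbeat", lambda: True),
    (2, "package version installed", lambda: True),
    (3, "DTB/sysfs matches manifest", lambda: False),
]
print(first_failed_bar(checks, enabled={1, 2, 3}))
```

A host that does not opt into Bar 3 would pass with the same probes, because the failing bar is never run.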

Agent boundaries

                          peer agents
            ┌───────────────────────────────────────┐
            │                                       │
           His  ←──── lmcp tools (ops) ────→  kernel-agent
            │     wake-host, host-status,           │
            │     prepare-build-host, ...           │
            │                                       │
            └─── Gitea issues (coordination) ───────┘

                          ▲   ▲
                          │   │
                  campaign sessions
              (Bin · MegabitChip · RockHard ·
               Neutron · fresnel-fourier ·
               ohm_gl_fix · besser · ...)
                          ▲
                          │
                  subagents inside session
              (Janet · avr-specialist · Plan)
              no independent identity, contribute
              to whatever the calling session ships

Routine ops between peer agents go through lmcp tools (sync, idempotent, no per-call audit trail). Coordination goes through Gitea issues (async, persistent, audit trail per item).

Verbs (explicit, parameterized, audit-issue auto-filed)

ka-promote <host>          # resolve fleet/<host>.yaml → cumulative.patch + manifest.lock   [bin/ka-promote — implemented Phase 6, issue #22]
ka-import <campaign> <patch-or-glob> --to <scope>   # patches from campaign → scope-tagged tree (today: manual git workflow)
ka-close <campaign> --status success
ka-abandon <campaign> --keep-as-archive | --purge-from-fleet
ka-build <host>            # render PKGBUILD template with cumulative b2sum, run makepkg     [next verb, issue TBD]
ka-install <host>          # scp + pacman -U + extlinux/mkinitcpio + heartbeat               [last verb, issue TBD]
ka-keep <job-id> [--for <duration>]
ka-pause-prune  / ka-resume-prune
ka-restore-archive <job-id>
ka-snooze <issue-id> [--for <duration>]
ka-debug <job-id>          # shells into the same container that ran the build
ka-status                  # per-host one-liner with drift/pending state   [bin/ka-status — implemented Phase 1]
ka-migrate-tree --from <p> --to <p>
ka-wake-data               # wraps wake-host data through His

Note: the original spec had ka-promote <campaign> <patch-or-glob> --to <scope> ("promote patches from a campaign into the canonical tree"). That semantic moved to ka-import to free ka-promote for the manifest-resolution role its issue (#22) and the implemented bin/ka-promote actually fulfil. ka-import remains unimplemented; patches still land in patches/ via the regular git + PR workflow.

Conversational invocation triggers a y/n confirmation enumerating what will happen. Direct CLI invocation executes immediately.
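
The difference between the two invocation paths can be sketched as a thin wrapper around the verb. This is an illustration under assumptions: run_verb, its parameters and the prompt wording are hypothetical, not the code in bin/.

```python
from typing import Callable, Sequence


def run_verb(plan: Sequence[str], action: Callable[[], None],
             conversational: bool,
             ask: Callable[[str], str] = input) -> bool:
    """Execute `action`; on the conversational path, enumerate the plan and ask first."""
    if conversational:
        print("This will:")
        for step in plan:
            print(f"  - {step}")
        if ask("Proceed? [y/n] ").strip().lower() != "y":
            return False  # declined: nothing is executed
    action()
    return True
```

Passing `ask` as a parameter keeps the prompt replaceable, so the same wrapper can be driven from a session hook or a test without touching stdin.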

Block-severity issues — what halts what

[ka:patch-fail]        only that patch's promotes
[ka:campaign-conflict] those patches across the involved campaigns
[ka:host-drifted]      installs to that host (builds OK)
[ka:build-fail]        builds routing to that build host
[ka:bootstrap-missing] builds for that build host
[ka:host-changed]      builds to that host until pre-flight re-passes
[ka:signing-fail]      global (all builds need signing)
[ka:regression]        installs to that host until triaged

Scoped per issue. No implicit cross-domain propagation. Dependency cascades detected at promote-time, not propagated globally.
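
The table above reads naturally as a lookup from issue tag to the operation it halts, plus a flag for the one global tag. A minimal sketch, assuming a hypothetical in-memory model (BLOCK_SCOPE, OpenBlock and is_halted are illustrative names, and "subject" stands for whichever patch, host or build host the issue names):

```python
from dataclasses import dataclass

# tag -> (operation halted, is_global); transcribed from the table above
BLOCK_SCOPE = {
    "ka:patch-fail": ("promote", False),
    "ka:campaign-conflict": ("promote", False),
    "ka:host-drifted": ("install", False),
    "ka:build-fail": ("build", False),
    "ka:bootstrap-missing": ("build", False),
    "ka:host-changed": ("build", False),
    "ka:signing-fail": ("build", True),
    "ka:regression": ("install", False),
}


@dataclass(frozen=True)
class OpenBlock:
    tag: str
    subject: str  # patch path, host or build host; unused for global tags


def is_halted(operation: str, subjects: set[str], open_blocks: list[OpenBlock]) -> bool:
    """True if any open block halts `operation` for one of `subjects`."""
    for block in open_blocks:
        halted_op, is_global = BLOCK_SCOPE[block.tag]
        if halted_op != operation:
            continue  # no cross-domain propagation
        if is_global or block.subject in subjects:
            return True
    return False


blocks = [OpenBlock("ka:host-drifted", "fresnel")]
print(is_halted("install", {"fresnel"}, blocks))  # installs to fresnel are halted
print(is_halted("build", {"fresnel"}, blocks))    # builds stay OK
```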

Patch tree (in marfrit/kernel-agent)

patches/
├── arch/{arm64,x86_64}/
├── soc/{rockchip/{rk3399,rk3566,rk3588},...}/
├── module/<som-name>/
├── board/<board-name>/
├── driver/<driver-name>/
└── subsystem/<subsystem-name>/

Each patch lives at the narrowest scope that's correct (a board patch goes under board/, an SoC-wide fix under soc/). Per-host manifest resolves tags + explicit includes. Reorgs via ka-migrate-tree (atomic tree + manifest rewrite); paths stable otherwise.
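
Resolution of "tags + explicit includes" can be sketched as a filter over patch paths relative to patches/. The manifest field names are not shown in this README, so the scopes and includes parameters below are assumptions, as is the path-sorted apply order; the file names marked hypothetical are placeholders.

```python
from pathlib import PurePosixPath


def resolve_patches(tree, scopes, includes=()):
    """Select patches whose directory is one of `scopes`, plus explicit includes."""
    scope_dirs = {PurePosixPath(s) for s in scopes}
    selected = {p for p in tree if PurePosixPath(p).parent in scope_dirs}
    for extra in includes:
        if extra not in tree:
            raise FileNotFoundError(extra)  # an include must exist in the tree
        selected.add(extra)
    return sorted(selected)  # assumption: apply order follows path order


tree = [
    "arch/arm64/0001-example.patch",  # hypothetical file name
    "board/pinebook-pro/0002-arm64-dts-rk3399-pinebook-pro-enable-hdmi-sound.patch",
    "driver/bes2600/0001-example.patch",  # hypothetical file name
]
print(resolve_patches(tree, scopes=["board/pinebook-pro"], includes=[tree[2]]))
```

Matching on the parent directory rather than a path prefix keeps a board patch from being pulled in by a broader scope by accident.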

Build hosts

Host             Where           Role              Wake?     Notes
──────────────────────────────────────────────────────────────────────────
boltzmann        Rock 5 ITX+     aarch64 primary   always    container kbuild-aarch64
ampere           CoolPi GenBook  aarch64 secondary on-demand RK3588 32GB; same uarch as boltzmann,
                                                            wakes via His; idle 30 min → release
fermi            hertz LXD       aarch64 fallback  always    matches kbuild-aarch64 profile
kbuild-x86       data CT         x86_64            on-demand wakes via His; idle 30 min → release

Native make on the assigned build host. No distcc for kernel-agent builds (feedback_kernel_agent_no_distcc.md, locked 2026-05-09). ccache stays per-host. distcc remains in scope for userspace package builds.
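
Host assignment from the table above can be sketched as a first-match walk in priority order. Assumptions: list order encodes primary, secondary, fallback; "blocked" stands for build hosts with an open block such as [ka:build-fail]; pick_build_host is a hypothetical helper, not an existing verb.

```python
from typing import Optional

# (name, arch, wake policy), in priority order per architecture
BUILD_HOSTS = [
    ("boltzmann", "aarch64", "always"),
    ("ampere", "aarch64", "on-demand"),
    ("fermi", "aarch64", "always"),
    ("kbuild-x86", "x86_64", "on-demand"),
]


def pick_build_host(arch: str, blocked: frozenset = frozenset()) -> Optional[tuple]:
    """Return (host, needs_wake) for the first unblocked host of `arch`, else None."""
    for name, host_arch, wake in BUILD_HOSTS:
        if host_arch == arch and name not in blocked:
            return name, wake == "on-demand"
    return None


print(pick_build_host("aarch64"))
print(pick_build_host("aarch64", blocked=frozenset({"boltzmann"})))
```

The needs_wake flag marks the cases where a wake request through His would come before the build starts.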

Files / paths

/srv/kernel-agent/source/<job-id>/    live build dir (kbuild UID owns)
/srv/kernel-agent/ccache/             persistent across builds
/srv/kernel-agent/output/<job-id>/    built packages, pre-sign
/srv/kernel-agent/manifest/           per-host manifests (yaml)
/srv/kernel-agent/keep/               failed builds tagged ka-keep

hertz:/sparfuxdata/kernel-agent-backups/<host>/<version>/   7-day
hertz:/sparfuxdata/kernel-agent-archive/<job-id>/           1-year (cron)

https://logs.reauktion.de/<host>/<job-id>/   1-year (cron on lagrange)
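
The retention windows listed above reduce to an age comparison per storage class. A minimal sketch, assuming "1-year" means 365 days and ignoring ka-keep and ka-pause-prune; RETENTION and is_expired are illustrative names, not the prune cron itself.

```python
from datetime import datetime, timedelta

RETENTION = {
    "backups": timedelta(days=7),    # hertz:/sparfuxdata/kernel-agent-backups/
    "archive": timedelta(days=365),  # hertz:/sparfuxdata/kernel-agent-archive/
    "logs": timedelta(days=365),     # https://logs.reauktion.de/
}


def is_expired(kind: str, created_at: datetime, now: datetime) -> bool:
    """True once an item of storage class `kind` has outlived its window."""
    return now - created_at > RETENTION[kind]


created = datetime(2026, 5, 9, 12, 0)
print(is_expired("backups", created, datetime(2026, 5, 18, 12, 0)))
print(is_expired("archive", created, datetime(2026, 5, 18, 12, 0)))
```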

Repos:

  • marfrit/kernel-agent — agent source, manifests, scope-tagged patch tree
  • marfrit/<campaign> — each campaign owns its repo
  • marfrit/misc-kernel-patches — landing pad for one-off non-campaign fixes
  • marfrit-packages — kernel package PKGBUILDs / .debs

Identity

Issues filed as the host the agent runs on (claude-noether by default, per reference_claude_noether_gitea.md). Title prefix [ka:*] carries the role. No new Gitea identity; per-host bootstrap one-liner already covers this.

Reminder channel

Active Claude session top-of-conversation hook only — no email, no HA, no DokuWiki. Cadence: escalating ladder (initial → +1h → +6h → daily). Snooze via ka-snooze <issue-id> [--for <duration>].
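
The escalating ladder can be written out as a list of timestamps. The sketch below assumes that +1h and +6h are offsets from the filing time (not from the previous reminder) and that "daily" means every 24 hours after filing; snoozing is not modelled, and reminder_times is a hypothetical name.

```python
from datetime import datetime, timedelta


def reminder_times(filed_at: datetime, limit: int) -> list:
    """First `limit` reminder timestamps: initial, +1h, +6h, then daily."""
    ladder = [timedelta(0), timedelta(hours=1), timedelta(hours=6)]
    times = [filed_at + step for step in ladder]
    day = 1
    while len(times) < limit:
        times.append(filed_at + timedelta(days=day))
        day += 1
    return times[:limit]


for t in reminder_times(datetime(2026, 5, 18, 12, 0), 5):
    print(t.isoformat())
```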

Hard rules — won't change without re-litigation

  • Never auto-promote. Closure is your explicit verb.
  • Never auto-install. Reboot only happens inside ka-install.
  • Never reach into $HOME on any host.
  • Never target infra hosts (noether, hertz, dcw*, turing) without explicit fleet/ manifest opt-in.
  • Never sudo-mutates host setup. His provisions; agent consumes.
  • Refuse abandon without --keep-as-archive | --purge-from-fleet flag.
  • Refuse promote of patches lacking scope tag.

Bootstrap reference build (2026-05-09 — fresnel)

First end-to-end run, before ka-promote / ka-build / ka-install existed. Documented here as the canonical worked example and as the substrate that the ka-* verbs are, or will be, implemented against. Issue #3 (fresnel DTS persistence) was closed by this build. ka-promote (issue #22) replaced the manual step #1 below as of 2026-05-18.

Inputs

  • Baseline: torvalds/linux @ v7.0 (verified during ka-promote Phase 3, issue #22 — mmind/linux-rockchip does not ship a plain v7.0 tag despite earlier docs; mmind kept in fresnel.yaml as informational patch_authoring_context).
  • Patches (scope board/pinebook-pro):
    • 0001-arm64-dts-rk3399-pinebook-pro-add-OC-OPP-tables-1704-2184.patch
    • 0002-arm64-dts-rk3399-pinebook-pro-enable-hdmi-sound.patch
    • 0003-arm64-dts-rk3399-pinebook-pro-spi1-max-freq-10MHz.patch
  • Manifest: fleet/fresnel.yaml (tree=mmind v7.0, 3 patches above, alongside-install vs linux-eos-arm).
  • .config source: snapshot from fresnel /usr/lib/modules/6.19.10-1-eos-arm/build/.config, recovered from the data backintime backup (May 7 snapshot) since the laptop was off when the build started; make olddefconfig to fold in v7.0 new symbols (one harmless BOOTPARAM_SOFTLOCKUP_PANIC warning, ignored).

Manual substitute for each ka-* verb

Designed verb: ka-import fresnel-fourier <patches> --to board/pinebook-pro (originally named ka-promote in this row)
  What we did manually: Authored 3 patches with proper headers/scope tags, pushed to marfrit/kernel-agent/patches/board/pinebook-pro/ via Gitea contents API as claude-noether.
  Status: still manual; ka-import unimplemented

Designed verb: ka-promote fresnel (new: manifest → cumulative.patch + manifest.lock)
  What we did manually: n/a (didn't exist)
  Status: automated 2026-05-18, issue #22

Designed verb: ka-build fresnel
  What we did manually: On boltzmann: cloned linux v7.0 from kernel.org, ran makepkg -s --skipchecksums --skippgpcheck against marfrit-packages/arch/linux-fresnel-fourier/PKGBUILD. Native aarch64 (boltzmann is RK3588). One headers-pkg bug discovered (ln -sr on missing parent dir) and fixed mid-flight. Repackaged.
  Status: still manual; next verb to implement

Designed verb: ka-sign + push
  What we did manually: scp pkgs hertz → sudo /opt/herding/bin/marfrit-publish-arch aarch64 <pkg> per pkg. Script signs with key 92D5E96D8F63C75E4116AA1FF5C8C4603D0D250C, runs repo-add, rsyncs to nc.
  Status: still manual; folded into ka-build

Designed verb: ka-install fresnel (consent-via-action)
  What we did manually: sudo pacman -U /tmp/<pkg> over LAN scp (HTTPS to nc was throttled by fresnel's wifi). pacman post-transaction hook updated extlinux. mkinitcpio run manually because the standard hook trigger watches vmlinuz not Image.
  Status: still manual; last verb to implement

Designed verb: Bar 1..3 verification
  What we did manually: SSH heartbeat OK, pacman -Q linux-fresnel-fourier = 7.0-1, post-reboot cluster0 1.704 GHz / cluster1 2.184 GHz confirmed.
  Status: folded into ka-install

Files / locations involved

  • git.reauktion.de/marfrit/kernel-agent/patches/board/pinebook-pro/ — patches
  • git.reauktion.de/marfrit/kernel-agent/fleet/fresnel.yaml — manifest
  • git.reauktion.de/marfrit/marfrit-packages/arch/linux-fresnel-fourier/ — PKGBUILD + 3 patches + config + extlinux hook+script + mkinitcpio preset
  • boltzmann:~/src/kernel-agent-bootstrap/ — local build root (baseline clone, patches, build dir, artifacts)
  • hertz:/tmp/ka-publish/ — staging for sign+push (transient)
  • hertz:/sparfuxdata/kernel-agent-backups/fresnel/6.19.9-99-eos-arm/fresnel-boot-pre-install.tgz — pre-install /boot snapshot (71MB, 7-day retention per design)
  • https://packages.reauktion.de/arch/aarch64/linux-fresnel-fourier-7.0-1-aarch64.pkg.tar.zst — published artifact
  • fresnel:/boot/{Image,initramfs,dtbs}-fresnel-fourier{,/...} — installed artifacts
  • fresnel:/boot/extlinux/extlinux.conf — managed block tagged >>> linux-fresnel-fourier (managed) >>><<<

What was learned that ka-* should bake in

  • mkinitcpio's stock hook watches vmlinuz, not Image. ARM kernel installs must explicitly run mkinitcpio -p <preset> from the install hook, OR ship a custom alpm hook with Target = boot/Image-<suffix>.
  • Headers PKGBUILD: ln -sr "${_builddir}" "${pkgdir}/usr/src/${pkgbase}" needs a preceding install -d "${pkgdir}/usr/src". The line was cargo-culted from Arch's linux package without checking whether pacman pre-creates /usr/src for kernels.
  • HTTPS download from nc.reauktion.de can stall on slow wifi (fresnel @ 181 ms ping). Same-LAN scp from hertz (which already has the published pkgs in /tmp/ka-publish/) is the workaround. ka-install should detect and prefer LAN-fanout.
  • Manifest must carry the kernel suffix (-fresnel-fourier) explicitly so alongside-install paths (/boot/Image-<suffix>, /boot/dtbs-<suffix>/, /boot/initramfs-<suffix>.img) don't collide with the EOS-stock paths.
  • Backup target needs install -d -o $USER -g $USER first time per host — /sparfuxdata/kernel-agent-backups/<host>/<version>/ is created lazily.
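
The suffix lesson above can be made concrete by deriving every alongside-install path from the single manifest value. A minimal sketch: alongside_paths is a hypothetical helper, and accepting the suffix with or without its leading dash is an assumption about how the manifest spells it.

```python
def alongside_paths(kernel_suffix: str) -> dict:
    """Derive the /boot paths for an alongside-installed kernel from its suffix."""
    name = kernel_suffix.lstrip("-")  # "-fresnel-fourier" and "fresnel-fourier" both work
    if not name:
        raise ValueError("kernel suffix must not be empty")
    return {
        "image": f"/boot/Image-{name}",
        "dtbs": f"/boot/dtbs-{name}/",
        "initramfs": f"/boot/initramfs-{name}.img",
    }


for role, path in alongside_paths("-fresnel-fourier").items():
    print(f"{role:10s} {path}")
```

Rejecting an empty suffix is the point of the check: without it the derived names would no longer be distinct per kernel.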

Out of scope this round (explicit defer)

  • vb2 dma_resv RFC v2: resolved 2026-05-15. Markus iterated v2 locally on boltzmann reaching pkgrel=14; the v2 series attaches the fence at device_run (slept-OK context per Dufresne's v1 review). Now carried in patches/subsystem/media/videobuf2/dma-resv-release-fence/ and included in fleet/fresnel.yaml. Still in scope for upstream targeting; default remains "build-tree only, no PR until explicitly asked" (feedback_no_upstream.md).
  • panfrost IOMMU_CACHE for RK3399 — sibling kernel work that targets the readback transitive-proof gap that vb2_dma_resv alone doesn't close. Still deferred until that lands; ship together when ready.
  • Replace linux-eos-arm rather than coexist alongside: deferred, because coexisting preserves easy rollback at u-boot. Can flip to provides=(linux-eos-arm) conflicts=(...) later once burn-in proves the OC kernel reliable.

Open follow-ups (post-rollout)

  • Migrate github.com/marfrit/misc_patches/genbook/kernel/ (9 patches against linux-6.19.9) into proper Coulomb/RockHard campaign repo with scope tags applied. Some patches will need splitting (e.g., 0010 suspend/resume is multi-scope and should split into soc:rk3588 + board:coolpi-cm5-genbook pieces). — Issue #1.
  • Migrate besser/patches/ (~30 BES2600 staging series) into the scope-tagged tree at driver/bes2600/ with promote eligibility per series. — Issue #2.
  • Decide whether boltzmann (BredOS-stock today) becomes a Neutron-managed custom kernel target or stays stock. Decision deferred per memory project_neutron.md. — Issue #4.
  • fresnel DTS persistence: closed by the bootstrap reference build above. Issue #3 closed.