Iteration 5 — Phase 4 (plan)
Captured 2026-05-11 post-Phase-3, after Phase 0+2+3 narrowed every variable. Per feedback_dev_process.md Phase 4: contract-before-code list of every operation iter5 commits, citing patch refs, manifest diffs, build commands, and verification expectations. Phase 5 sonnet-architect review consumes this doc; Phase 6 implements it.
Substrate at Phase 4 open (re-verified):
- Kernel: linux-fresnel-fourier 7.0-1 on fresnel.
- Fork tip: 692eaa0 on git.reauktion.de/marfrit/libva-v4l2-request-fourier. Unchanged.
- Boltzmann reachable (was offline mid-Phase-2, back at Phase 4 open). Used both for v7.0 reference reads and the rebase verification below.
- Bug 2 reproduction: empirical Phase 3 sweep showed 4/5 codecs race-lose via libva; kernel-direct byte-clean for all 4 vs SW; MPEG-2 wins by being fastest.
- Bug 3: collapsed (Phase 0 amendment).
Pre-Phase-4 verification on boltzmann
Rebase risk was flagged MEDIUM at Phase 2 because videobuf2-core.c sees regular kernel activity. Empirical verification 2026-05-11 on boltzmann:
cd ~/src/linux-rockchip
git worktree add /tmp/rkvdec_test 028ef9c96e96 # v7.0
cd /tmp/rkvdec_test
git am /tmp/rfc_v2_series.patch # 3 RFC v2 patches
→ Applying: media: videobuf2: add dma_resv release-fence helper
→ Applying: media: hantro: attach dma_resv release fence at buf_queue
→ Applying: media: rockchip-rga: attach dma_resv release fence at buf_queue
Zero conflicts, no 3-way merge needed. videobuf2-core.c rebase risk downgraded MEDIUM → LOW.
Symbol API surface check on v7.0:
| Symbol used by helper patch | v7.0 hits | Status |
|---|---|---|
| dma_fence_init | 6 | ✓ |
| dma_fence_get | 2 | ✓ |
| dma_fence_put | 4 | ✓ |
| dma_fence_signal | 13 | ✓ |
| dma_fence_set_error | 2 | ✓ |
| dma_fence_context_alloc | 3 | ✓ |
| dma_resv_lock | 6 | ✓ |
| dma_resv_unlock | 5 | ✓ |
| dma_resv_add_fence | 5 | ✓ |
| DMA_RESV_USAGE_WRITE | 1 | ✓ |
All 10 symbols present. Compile-time semantics match.
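The hit counts above could be regathered with a small helper like the one below. This is only a sketch: count_symbol_hits is a hypothetical name, and because this plan does not record which files the counts were taken from, the directory argument is an assumption to adjust.

```shell
# count_symbol_hits DIR SYMBOL...
# Hypothetical helper: prints "<symbol> <hits>" for each symbol, where a hit
# is one whole-word occurrence anywhere under DIR.
count_symbol_hits() {
    dir=$1
    shift
    for sym in "$@"; do
        # -r recurse, -h drop file names, -o one output line per match,
        # -w whole words only (dma_fence_get does not match dma_fence_get_rcu)
        hits=$(grep -rhow -- "$sym" "$dir" | wc -l | tr -d ' ')
        printf '%s %s\n' "$sym" "$hits"
    done
}
```

For example, `count_symbol_hits drivers/media/common/videobuf2 dma_fence_init dma_resv_lock` run inside the patched worktree prints one line per symbol; a count of 0 marks a symbol that needs a closer look.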
New rkvdec consumer patch dry-applied on top of the 3-patch stack: clean.
Final stack (test worktree, since torn down):
8f239179c12f media: rkvdec: attach dma_resv release fence at buf_queue
a7f65cd361bd media: rockchip-rga: attach dma_resv release fence at buf_queue
d13f972811be media: hantro: attach dma_resv release fence at buf_queue
e2f781a4a398 media: videobuf2: add dma_resv release-fence helper
028ef9c96e96 Linux 7.0
Contract clauses
Every operation iter5 commits is one of these. Phase 5 review checks each against current-state evidence; Phase 6 implements each in order; Phase 7 verifies each.
C1 — Patch 1/4: vb2_dma_resv helper
Source: ~/src/linux-rfc/fbe8bf57a media: videobuf2: add dma_resv release-fence helper. Operator-authored (Markus Fritsche <mfritsche@reauktion.de>, 2026-04-28).
Files touched: drivers/media/common/videobuf2/videobuf2-core.c (+99), include/media/videobuf2-core.h (+19).
Surface:
- int vb2_buffer_attach_release_fence(struct vb2_buffer *vb): driver-facing opt-in API. Allocates a dma_fence on the queue's per-queue fence context, attaches it as DMA_RESV_USAGE_WRITE on each plane's dmabuf->resv, and stashes it in vb->release_fence. Skips planes whose vb2_plane.dbuf == NULL. Returns 0 / -ENOMEM.
- vb2_buffer_signal_release_fence(vb, state): internal helper, called from vb2_buffer_done() on state transition. Signals + puts the fence. No-op when vb->release_fence == NULL.
- New struct vb2_queue fields: u64 dma_resv_fence_context, atomic64_t dma_resv_fence_seqno, spinlock_t dma_resv_fence_lock.
- New struct vb2_buffer field: struct dma_fence *release_fence.
Contract: opt-in. Drivers that don't call vb2_buffer_attach_release_fence() from their buf_queue callback see no behavior change — vb->release_fence stays NULL, signal path is a no-op.
Phase 6 action: push patch to git.reauktion.de/marfrit/kernel-agent/patches/subsystem/media/dma-resv-release-fence/0001-media-videobuf2-add-dma_resv-release-fence-helper.patch via Gitea contents API as claude-noether.
C2 — Patch 2/4: hantro consumer
Source: ~/src/linux-rfc/14a68fcf0 media: hantro: attach dma_resv release fence at buf_queue. Operator-authored.
Files touched: drivers/media/platform/verisilicon/hantro_v4l2.c (+12).
Diff shape: one (void)vb2_buffer_attach_release_fence(vb); call inserted after v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf); in hantro_buf_queue(), plus a 10-line comment block.
Contract: hantro CAPTURE-side dmabufs now have a real producer fence in their dma_resv chain, signalled when hantro's m2m completion path calls vb2_buffer_done().
Phase 6 action: push to kernel-agent/patches/subsystem/media/dma-resv-release-fence/0002-media-hantro-attach-dma_resv-release-fence-at-buf_queue.patch.
C3 — Patch 3/4: rockchip-rga consumer
Source: ~/src/linux-rfc/89b699508 media: rockchip-rga: attach dma_resv release fence at buf_queue. Operator-authored.
Files touched: drivers/media/platform/rockchip/rga/rga-buf.c (+10).
Diff shape: same one-line opt-in + comment as hantro consumer, in rga_buf_queue().
Out of scope for the iter5 libva path (we don't use RGA), but kept in the series to preserve RFC v2 cohesion: RGA is referenced by GStreamer flows on Rockchip boards, and the operator's intent was to land all three v4l2 producers together.
Phase 6 action: push to kernel-agent/patches/subsystem/media/dma-resv-release-fence/0003-media-rockchip-rga-attach-dma_resv-release-fence-at-buf_queue.patch.
C4 — Patch 4/4: rkvdec consumer (NEW — iter5 contribution)
Author: claude-noether. Iter5's only new code.
Files touched: drivers/media/platform/rockchip/rkvdec/rkvdec.c (+12 lines).
Target function: rkvdec_buf_queue at line 954 of v7.0 (post-staging-promotion path; was drivers/staging/media/rkvdec/rkvdec.c in earlier kernels).
Exact diff (verified to apply cleanly in Phase 4 boltzmann test):
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.c
@@ -955,6 +955,16 @@ static void rkvdec_buf_queue(struct vb2_buffer *vb)
struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+
+ /*
+ * Opt in to vb2's dma_resv release-fence path. Userspace
+ * consumers of rkvdec CAPTURE-side dmabufs (libva backend
+ * cap_pool, mpv vaapi-copy, ffmpeg hwdownload) get a real
+ * producer fence representing rkvdec's completion instead of
+ * the stub fence dma_buf_export_sync_file substitutes when
+ * dma_resv is empty. Best-effort: fence-allocation failure
+ * means we lose implicit-sync precision, no functional
+ * regression.
+ */
+ (void)vb2_buffer_attach_release_fence(vb);
}
Commit message body (full text):
media: rkvdec: attach dma_resv release fence at buf_queue
Opt the rkvdec driver into the new vb2 release-fence helper.
Same shape as the hantro + rockchip-rga patches: rkvdec_buf_queue
enqueues the buffer in the driver's m2m queue via v4l2_m2m_buf_queue
and additionally attaches a release fence to each plane's
dmabuf->resv via vb2_buffer_attach_release_fence(). vb2_buffer_done
signals the fence when the kernel decoder completes the M2M
operation.
Closes the cap_pool readback race observed by userspace consumers
(libva v4l2-request backend, mpv vaapi-copy, ffmpeg-vaapi-hwdownload)
that import rkvdec CAPTURE-side dmabufs and wait on the dmabuf's
implicit-sync fence: previously they raced ahead of decoder
completion and read pages still in their cap_pool init state
(all-zero); now they block on a real producer fence until
decoder IRQ fires.
Validated end-to-end on PineBook Pro (RK3399 / Mali-T860 / mainline
v7.0 base with this series applied) against fresnel-fourier iter5
verification matrix: ffmpeg-vaapi-hwdownload of H.264 1080p30, HEVC
720p, VP9 720p produces raw YUV byte-identical to kernel-direct
ffmpeg-v4l2request output across 5-frame samples.
Signed-off-by: claude-noether <claude-noether@reauktion.de>
Phase 6 action: author the patch in kernel-agent/patches/subsystem/media/dma-resv-release-fence/0004-media-rkvdec-attach-dma_resv-release-fence-at-buf_queue.patch via Gitea contents API as claude-noether.
C5 — Patch storage scope
All 4 patches land in git.reauktion.de/marfrit/kernel-agent/ under scope tag subsystem/media/dma-resv-release-fence/:
kernel-agent/
└── patches/
└── subsystem/
└── media/
└── dma-resv-release-fence/
├── 0001-media-videobuf2-add-dma_resv-release-fence-helper.patch
├── 0002-media-hantro-attach-dma_resv-release-fence-at-buf_queue.patch
├── 0003-media-rockchip-rga-attach-dma_resv-release-fence-at-buf_queue.patch
└── 0004-media-rkvdec-attach-dma_resv-release-fence-at-buf_queue.patch
This scope tag is new — current kernel-agent only has board/pinebook-pro/. Phase 6 creates the directory.
C6 — Manifest update: fleet/fresnel.yaml
Diff to apply via Gitea contents API:
--- a/fleet/fresnel.yaml
+++ b/fleet/fresnel.yaml
@@ -22,16 +22,17 @@ baseline:
# Scope-tagged patch includes. Each entry resolves to
# patches/<scope>/.../<file>.patch in marfrit/kernel-agent.
includes:
- board/pinebook-pro/0001-arm64-dts-rk3399-pinebook-pro-add-OC-OPP-tables-1704-2184.patch
- board/pinebook-pro/0002-arm64-dts-rk3399-pinebook-pro-enable-hdmi-sound.patch
- board/pinebook-pro/0003-arm64-dts-rk3399-pinebook-pro-spi1-max-freq-10MHz.patch
+ - subsystem/media/dma-resv-release-fence/0001-media-videobuf2-add-dma_resv-release-fence-helper.patch
+ - subsystem/media/dma-resv-release-fence/0002-media-hantro-attach-dma_resv-release-fence-at-buf_queue.patch
+ - subsystem/media/dma-resv-release-fence/0003-media-rockchip-rga-attach-dma_resv-release-fence-at-buf_queue.patch
+ - subsystem/media/dma-resv-release-fence/0004-media-rkvdec-attach-dma_resv-release-fence-at-buf_queue.patch
-# Explicitly NOT included (tracked elsewhere, decision logged):
-# - subsystem/media/videobuf2/dma-resv-release-fence/ (RFC v1 rejected;
-# v2 in design — see marfrit/dmabuf-modifier-triage#3. Skip until v2 lands
-# or we explicitly accept v1-shape parity with ohm.)
+# Explicitly NOT included (tracked elsewhere, decision logged):
# - driver/panfrost/iommu-cache-rk3399/ (sibling kernel work; ship together
# with vb2_dma_resv when it lands.)
The panfrost/iommu-cache-rk3399 exclusion stays — that's a separate sibling work-stream not on iter5's critical path. The "ship together with vb2_dma_resv when it lands" comment becomes inaccurate post-iter5 but doesn't gate iter5 close.
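Since each include entry is meant to resolve to a file under patches/ in kernel-agent, a quick pre-build check along these lines could catch a typo in the manifest. It is a sketch only: check_includes is a hypothetical helper, and it assumes include entries are plain "- path/to/file.patch" list items with no spaces in the path.

```shell
# check_includes MANIFEST PATCH_ROOT
# Hypothetical helper: prints "missing: <entry>" for every include entry that
# has no matching file under PATCH_ROOT, and returns 1 if any entry is missing.
check_includes() {
    manifest=$1
    root=$2
    missing=0
    # Keep only list items ending in .patch. Commented-out entries begin
    # with '#', so they never match the leading "- " pattern.
    entries=$(sed -n 's/^[[:space:]]*-[[:space:]]*\([^[:space:]]*\.patch\)[[:space:]]*$/\1/p' "$manifest")
    for entry in $entries; do
        if [ ! -f "$root/$entry" ]; then
            printf 'missing: %s\n' "$entry"
            missing=1
        fi
    done
    return "$missing"
}
```

Run as `check_includes fleet/fresnel.yaml patches` from a kernel-agent checkout; silence and a zero exit status mean every entry resolves.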
C7 — marfrit-packages/arch/linux-fresnel-fourier/PKGBUILD bump
Version: 7.0-1 → 7.0-2 (pkgrel bump, pkgver unchanged since baseline stays at v7.0).
Patches array: add 4 entries pulled from kernel-agent/patches/subsystem/media/dma-resv-release-fence/.
If the PKGBUILD pulls patches by fetching from kernel-agent directly: just add the 4 filenames + sha256 sums.
If the PKGBUILD has patches inlined in marfrit-packages/arch/linux-fresnel-fourier/: copy the 4 .patch files into the same directory + update the source array + sha256 sums.
Phase 6 inspects current PKGBUILD shape and picks the right path.
C8 — Build commands
Per the bootstrap README's manual ka-build substitute, on boltzmann:
ssh boltzmann
cd ~/src/kernel-agent-bootstrap # already-cloned per bootstrap state
git pull --ff-only # fetch and fast-forward to the updated patches + manifest
# Or use marfrit-packages directly:
cd ~/projects/marfrit-packages/arch/linux-fresnel-fourier
makepkg -s --skipchecksums --skippgpcheck -f
Output artifact path predicted: ~/projects/marfrit-packages/arch/linux-fresnel-fourier/linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst + matching -headers- pkg.
If --skipchecksums is undesirable, regenerate via updpkgsums.
C9 — Sign + publish
Per bootstrap README's ka-sign + push substitute:
scp boltzmann:~/projects/marfrit-packages/arch/linux-fresnel-fourier/linux-fresnel-fourier-7.0-2-*.pkg.tar.zst hertz:/tmp/ka-publish/
ssh hertz 'sudo /opt/herding/bin/marfrit-publish-arch aarch64 /tmp/ka-publish/linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst'
ssh hertz 'sudo /opt/herding/bin/marfrit-publish-arch aarch64 /tmp/ka-publish/linux-fresnel-fourier-headers-7.0-2-aarch64.pkg.tar.zst'
The publish script signs with the known key, runs repo-add, rsyncs to nc.
C10 — Install on fresnel
Per bootstrap README's ka-install fresnel substitute. Notable gotcha: HTTPS download from packages.reauktion.de stalls on slow wifi → LAN scp from hertz is the workaround.
Pre-install backup:
ssh hertz 'sudo install -d -o mfritsche -g mfritsche /sparfuxdata/kernel-agent-backups/fresnel/7.0-1/'
ssh fresnel 'sudo tar -czf /tmp/fresnel-boot-pre-install.tgz /boot/Image-fresnel-fourier /boot/dtbs-fresnel-fourier /boot/initramfs-fresnel-fourier.img /boot/extlinux/extlinux.conf'
scp fresnel:/tmp/fresnel-boot-pre-install.tgz hertz:/sparfuxdata/kernel-agent-backups/fresnel/7.0-1/
Install (LAN path):
ssh hertz 'scp /tmp/ka-publish/linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst /tmp/ka-publish/linux-fresnel-fourier-headers-7.0-2-aarch64.pkg.tar.zst fresnel:/tmp/'
ssh fresnel 'sudo pacman -U /tmp/linux-fresnel-fourier-*7.0-2*.pkg.tar.zst'
ssh fresnel 'sudo mkinitcpio -p linux-fresnel-fourier' # standard hook watches vmlinuz, not Image; run manually per bootstrap learning
Reboot:
ssh fresnel 'sudo systemctl reboot'
# wait for SSH heartbeat
ssh fresnel 'uname -r' # expect: 7.0.0-fresnel-fourier (kernel suffix unchanged; package bump is pkgrel only)
ssh fresnel 'pacman -Q linux-fresnel-fourier' # expect: linux-fresnel-fourier 7.0-2
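The two post-reboot expectations can be wrapped in one check so a mismatch is reported explicitly. A minimal sketch, assuming the expected strings are exactly the ones stated above; verify_install is a hypothetical helper name.

```shell
# verify_install UNAME_R PACMAN_Q
# Hypothetical helper: compares the reported kernel release and package
# version against the values this plan expects after the 7.0-2 install.
verify_install() {
    if [ "$1" != "7.0.0-fresnel-fourier" ]; then
        echo "unexpected kernel: $1"
        return 1
    fi
    if [ "$2" != "linux-fresnel-fourier 7.0-2" ]; then
        echo "unexpected package: $2"
        return 1
    fi
    echo "ok"
}
```

Usage: `verify_install "$(ssh fresnel uname -r)" "$(ssh fresnel pacman -Q linux-fresnel-fourier)"`.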
C11 — Phase 7 verification matrix
Re-run the Phase 3 sweep script (/tmp/iter5_p3/sweep.sh on fresnel) verbatim. Expected hash matrix on post-install kernel:
| Codec | Fixture | libva hash | kdirect hash | sw hash | Expected libva==kdirect |
|---|---|---|---|---|---|
| H.264 1080p30 | bbb_1080p30_h264.mp4 | 1e7a0bc9… (post-fix) | 1e7a0bc9… | 1e7a0bc9… | ✓ |
| HEVC 720p | bbb_720p10s_hevc.mp4 | 9340b832… | 9340b832… | 9340b832… | ✓ |
| VP9 720p | bbb_720p10s_vp9.webm | 4f1565e8… | 4f1565e8… | 4f1565e8… | ✓ |
| MPEG-2 720p | bbb_720p10s_mpeg2.ts | 19eefbf4… | 19eefbf4… | 7be8cad7… | ✓ (libva already worked) |
| VP8 720p | bbb_720p10s_vp8.webm | 136ce5cb… | 136ce5cb… | 136ce5cb… | ✓ |
5/5 PASS criterion green when:
- All 5 libva == kdirect.
- 4 of 5 libva == sw (H.264, HEVC, VP9, VP8). MPEG-2 stays HW≠SW (unrelated codec precision drift).
- No regression in control-payload submissions vs the iter5 Phase 3 anchors. (Strace re-run optional: kernel patches don't touch control-handling code.)
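The first two criteria can be checked mechanically once the sweep has produced its hashes. The sketch below assumes the sweep output has been reduced to one "codec libva kdirect sw" row per line; that row format and the evaluate_matrix name are assumptions, not something sweep.sh is known to emit.

```shell
# evaluate_matrix: reads "codec libva_hash kdirect_hash sw_hash" rows on stdin,
# prints "<codec> PASS|FAIL" per row, and exits non-zero if any row fails.
# Hypothetical helper; the row format is an assumption.
evaluate_matrix() {
    awk '
        {
            # Appending "" forces string comparison, so an all-digit hash
            # is never compared as a number.
            ok = (($2 "") == ($3 ""))            # libva must equal kdirect
            if ($1 != "MPEG-2")                  # MPEG-2 is allowed to differ from sw
                ok = ok && (($2 "") == ($4 ""))
            printf "%s %s\n", $1, (ok ? "PASS" : "FAIL")
            if (!ok) failed = 1
        }
        END { exit (failed ? 1 : 0) }
    '
}
```

Feeding it the truncated prefixes from the table, for example `printf '%s\n' 'H.264 1e7a0bc9 1e7a0bc9 1e7a0bc9' 'MPEG-2 19eefbf4 19eefbf4 7be8cad7' | evaluate_matrix`, reports both rows as PASS; in a real run the full hashes should be compared.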
C12 — Phase 8 close criteria
- All 4 Phase 1 criteria (Phase 0 amendment final lock) green.
- phase8_iteration5_close.md summarizes 4 patches landed + manifest update + build artifact path.
- Memory updates: either fold into reference_dmabuf_resv_blocker.md to update status from "blocker active" to "blocker resolved on fresnel substrate," or write fresh reference_vb2_dma_resv_opt_in_pattern.md documenting the contract.
- Campaign scoreboard: "5/5 direct" (was "4 direct + 1 transitive").
- Phase 5 sonnet-architect review sign-off recorded.
Risk register
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| RFC v2 helper patch v6.12 → v7.0 rebase conflict | Eliminated (Phase 4 dry-run = 0 conflicts) | — | — |
| dma_resv API renamed/changed in v7.0 | Eliminated (Phase 4 symbol check: 10/10 hits) | — | — |
| rkvdec_buf_queue refactored such that opt-in site is no longer correct | LOW | MEDIUM | Phase 4 verified the call site directly on v7.0; Phase 6 re-checks the line numbers at PKGBUILD apply time. |
| PKGBUILD pulls patches by URL with checksum, breaking when new patches added | LOW | LOW | Phase 6 regenerates with updpkgsums. |
| Post-install, ffmpeg-vaapi-hwdownload still returns all-zero (cap_pool race not the real cause) | LOW-MEDIUM | HIGH | If this fails, the working hypothesis (sync race) is wrong; Phase 7 → Phase 0 loopback. Cache coherency becomes the next hypothesis (per the Q&A 2026-05-11). |
| Boltzmann offline at Phase 6 build time | MEDIUM | LOW | Fallback build host: fermi (hertz LXD, ALARM aarch64). |
| Hertz publish script signs with the wrong key | LOW | LOW | 92D5E96D8F63C75E4116AA1FF5C8C4603D0D250C is the only key; script is stable. |
| Fresnel wifi stalls on HTTPS pacman download | KNOWN ISSUE | LOW | LAN scp from hertz per C10. |
| mkinitcpio doesn't auto-trigger on Image (ARM kernel) | KNOWN ISSUE | LOW | Manual mkinitcpio -p linux-fresnel-fourier per C10. |
What Phase 5 sonnet-architect review should attack
Top concerns to invite the reviewer to scrutinize:
- Is the patch ordering correct? Helper before consumers, or do any of the consumers reference symbols that need to be exported before they exist?
- Is (void)vb2_buffer_attach_release_fence(vb) the right call shape? Should it check the return value and bail / warn on -ENOMEM? Per the operator's hantro patch comment, "best-effort: fence-allocation failure means we lose implicit-sync precision, no functional regression", but does that hold for rkvdec specifically (where decoder completion semantics may differ from hantro's m2m completion semantics)?
- Does the rkvdec consumer site (rkvdec_buf_queue after v4l2_m2m_buf_queue) match the producer-fence semantics? Specifically: rkvdec's buf_queue runs at QBUF time, but the actual decode happens later when the m2m worker schedules. The fence needs to signal at decode-DONE, not at QBUF time. The helper attaches the fence at buf_queue and vb2_buffer_done signals it. Does rkvdec's flow eventually call vb2_buffer_done(VB2_BUF_STATE_DONE) after IRQ-fire? The hantro patch's commit message asserts hantro does (v4l2_m2m_buf_done_and_job_finish → vb2_buffer_done). Phase 5 verifies rkvdec has the same convergence.
- Is subsystem/media/dma-resv-release-fence/ the right scope tag? kernel-agent README has subsystem/<area>/ but the existing example was subsystem/media/videobuf2/dma-resv-release-fence/. The current plan flattens that to subsystem/media/dma-resv-release-fence/. Is the deeper nesting needed?
- Phase 7 verification matrix completeness. Is libva == kdirect sufficient as the PASS bar, or should we also require libva == sw for the 4 codecs we expect to match SW? The latter is stricter; iter4 transitive proof discipline says == kdirect is enough since kdirect == sw is verified separately.
- Risk of unintended consequence on OUTPUT-side buffers. The vb2 helper applies to both OUTPUT and CAPTURE planes. OUTPUT-side fence attachment may interact unexpectedly with userspace producers (libva backend writes the OUTPUT bitstream). Phase 5 verifies the operator's hantro validation covered OUTPUT-side semantics.
Predicted iter5 cadence
- Phase 5 review: 1 session, ~30 min review surface (small patches).
- Phase 6 implementation:
- Push 4 patches to kernel-agent via Gitea contents API: ~15 min.
- Update fleet/fresnel.yaml: ~5 min.
- Update PKGBUILD: ~10 min (depends on patch-pull style).
- Build on boltzmann: ~30-60 min wallclock for full kernel build.
- Sign + publish via hertz: ~5 min.
- Pre-install backup + install on fresnel: ~10 min.
- Reboot + Phase 7 sweep: ~15 min.
- Phase 7 verification: re-run sweep, diff against expected hashes. ~10 min.
- Phase 8 close: ~30 min for the close doc + memory updates.
Total: half a day to a full day of wall-clock time, contingent on boltzmann availability and no Phase-7-loopback surprises.