fresnel-fourier/phase4_iter12_plan.md
marfrit de889898b8 iter12 Phase 4 + 6: integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2
User signaled RFC v2 is prepared at boltzmann:~/v2-patch-work/v2-out/.
Three patches:
  0001 media: videobuf2: add opt-in dma_resv producer fence helper
  0002 media: hantro: attach dma_resv release fence at device_run
  0003 media: rockchip-rga: attach dma_resv release fence at ...

v2 key change vs v1: attach moves from buf_queue to m2m device_run
(Dufresne's finite-time-contract concern). Build the kernel package
on boltzmann (~/src/kernel-agent-bootstrap/.../linux-fresnel-fourier/),
deploy to fresnel, reboot, retest.

sddm auto-login as mfritsche staged in /etc/sddm.conf.d/20-autologin.conf
on fresnel before reboot per user authorization.

Phase 0's α-16 OUTPUT-byte dump candidate parked; kernel substrate
upgrade takes precedence given RFC v2 is the long-stalled
reference_dmabuf_resv_blocker.md unblock.

Iter12 outcomes:
  PASS  = Bug 4/5 hashes shift toward kdirect after reboot.
  PARTIAL = kernel upgraded cleanly, no regression, hashes unchanged.

Either outcome is valuable — substrate moves forward regardless.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:56:15 +00:00


Iteration 12 — Phase 4 plan + Phase 6 implementation

Major pivot at iter12 Phase 0 close: user signaled "RFC v2 is already prepared and should be included in fresnel's running kernel." This changes iter12's vehicle from "OUTPUT byte dump (α-16)" to "integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2 and reboot fresnel."

Why this matters for Bug 4 + Bug 5

Per reference_dmabuf_resv_blocker.md:

RK3399 hantro/rkvdec CAPTURE → libva readback returns all-zero pages until vb2_dma_resv kernel patches land.

The bugs we've been chasing (Bug 4 H.264 partial-fill, Bug 5 HEVC all-zero) are not in libva's wire payload. iter9 + iter11 exhausted the wire-byte search space with 10 confirmed-inert field changes (constraint_set_flags, POC sentinel, reference_ts magnitude, sps_max_num_reorder_pics, IRAP/IDR flags, num_entry_point_offsets, …). The kernel decodes correctly; it is the readback through libva's cached mmap that returns stale, zeroed, or partially written data.

RFC v2 attaches a dma_resv exclusive WRITE fence to V4L2-produced dmabufs at device_run time. Userspace consumers that rely on implicit sync via poll(POLLIN) or DMA_BUF_IOCTL_EXPORT_SYNC_FILE will:

  • See a real signalled fence after decode completes (not the stub).
  • Get cache-coherent buffer access guaranteed by the fence semantics.

Whether this fixes our libva → ffmpeg-vaapi-hwdownload pixel readback path depends on whether ffmpeg-vaapi's read actually waits on the fence. If it does (or if the patch incidentally fixes the cache-coherency mechanism it touches), Bugs 4 + 5 may close.

Patches

Located at boltzmann:~/v2-patch-work/v2-out/:

| Patch | Subject | Size |
|-------|---------|------|
| 0000 | cover letter | 7,063 bytes |
| 0001 | media: videobuf2: add opt-in dma_resv producer fence helper | 13,089 bytes |
| 0002 | media: hantro: attach dma_resv release fence at device_run | 4,136 bytes |
| 0003 | media: rockchip-rga: attach dma_resv release fence at … | 4,469 bytes |

Cover-letter excerpt:

v2 of the vb2 dma_resv producer-fence series. Two substantive changes addressing v1 review:

  1. Attach point moves from buf_queue to m2m device_run. v1 attached at buf_queue, which violates the dma_fence finite-time contract (Nicolas Dufresne, [linux-rockchip][2]): a fence published when userspace QBUFs is observable to consumers immediately, but the OUTPUT-side driver may be starved indefinitely if userspace never queues bitstream chunks. By moving the attach to device_run, we publish only after the m2m core has committed to running the job.

Integration plan

  1. Copy the 3 patches into ~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/ on boltzmann as 0004-*.patch, 0005-*.patch, 0006-*.patch. ✓ done.
  2. Update PKGBUILD source=() array to include them. ✓ done.
  3. Bump pkgrel from 1 to 2. ✓ done.
  4. Run makepkg -fs --noconfirm on boltzmann (native, 8 cores, no distcc per kernel-agent policy). in progress.
  5. Stage sddm auto-login as mfritsche (/etc/sddm.conf.d/20-autologin.conf) on fresnel before reboot. ✓ done.
  6. scp linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst from boltzmann to fresnel.
  7. sudo pacman -U on fresnel.
  8. Reboot fresnel.
  9. Verify new kernel running (uname -a reports 7.0.0-fresnel-fourier with newer build timestamp).
  10. Run 5-codec sweep + γ instrumentation to check whether Bug 4 + Bug 5 hashes change.
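
Steps 6 to 9 can be sketched as a small bash helper run from boltzmann. This is only an illustration: it assumes fresnel is reachable over ssh under that hostname, that the built package sits in the current directory, and that sudo on fresnel can prompt over an ssh tty. The `run`, `deploy_kernel` and `verify_kernel` names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of plan steps 6-9. Assumptions: ssh access to "fresnel",
# package file in the current directory, interactive sudo over ssh -t.

PKG="linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst"

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Steps 6-8: copy the package, install it, reboot.
deploy_kernel() {
  run scp "$PKG" "fresnel:"                   # step 6: lands in the remote home dir
  run ssh -t fresnel sudo pacman -U "$PKG"    # step 7
  run ssh fresnel sudo reboot                 # step 8: ssh may exit non-zero as the link drops
}

# Step 9: call once fresnel is back up; compare the build timestamp by eye.
verify_kernel() {
  run ssh fresnel uname -a
}
```

Running `DRY_RUN=1 deploy_kernel` first prints the three commands without touching fresnel, which is a cheap way to check the package name before the real run.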

Risks

  • R-1: RFC v2 patch doesn't apply cleanly to mmind v7.0 base. Probability: low — same base as v1 which built clean (existing 7.0-1 package). Mitigation: if patch fails, fix conflicts manually on boltzmann; rebuild.
  • R-2: Kernel boots but new module signatures break. Probability: very low — patches are media-driver-only; no symbol exports the rest of the kernel depends on. Mitigation: keep 7.0-1 entry in extlinux menu (PKGBUILD's hook already creates a parallel boot option per the PKGBUILD comment "User picks at u-boot menu. Reverting = boot the linux-eos-arm entry.").
  • R-3: Kernel boots but libva-vaapi still produces broken HEVC/H.264 output. This is the most likely outcome: RFC v2's fence machinery may not engage with libva's specific cached-mmap readback (ffmpeg-vaapi-hwdownload). Bug 4 + Bug 5 stay open; iter12 closes as PARTIAL with the kernel substrate upgraded.
  • R-4: Kernel boots and fixes Bug 4 + Bug 5. This would be a major campaign milestone. Possible if libva's path implicitly benefits from the fence publishing.
  • R-5: Some other regression (panfrost, sound, USB) post-reboot. Probability: very low (patches are media-only). Mitigation: rollback via extlinux 7.0-1 option.

Phase 5 review concerns

Reviewer should confirm:

  • The 3 patches modify only drivers/media/common/videobuf2/, drivers/media/platform/verisilicon/ (the hantro_* sources), drivers/media/platform/rockchip/rga/. No accidental wide changes.
  • PKGBUILD source array correctness (all 3 patches listed; SKIP'd checksums for all entries).
  • extlinux-add hook still creates a parallel boot entry so 7.0-1 remains available as fallback.

This is a quick review since the user has already prepared the patches and authorized the kernel workflow.
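
The first review item (patch scope) can be checked mechanically. The sketch below is one possible way to do it, not part of the prepared series: `patch_paths` and `out_of_scope_paths` are hypothetical helper names, and the check relies only on the `diff --git` headers that git-format-patch output carries.

```shell
#!/usr/bin/env bash
# Sketch: list paths touched by patch files and flag any outside an
# allow-list of directory prefixes. Helper names are illustrative only.

# patch_paths FILE...: print the unique b-side paths from "diff --git" headers.
patch_paths() {
  grep -h '^diff --git ' "$@" | sed 's#^diff --git a/.* b/##' | sort -u
}

# out_of_scope_paths PREFIX...: read paths on stdin and print those that
# start with none of the given prefixes.
out_of_scope_paths() {
  local path prefix ok
  while IFS= read -r path; do
    ok=0
    for prefix in "$@"; do
      case "$path" in
        "$prefix"*) ok=1; break ;;
      esac
    done
    [ "$ok" -eq 1 ] || printf '%s\n' "$path"
  done
}
```

A reviewer would pipe `patch_paths` over the three patch files into `out_of_scope_paths`, passing the three directories from the checklist as prefixes; empty output means no patch reaches outside them.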

Phase 7 verification

After reboot:

  1. uname -a shows new build timestamp.
  2. dmesg | grep -i "vb2\|dma_resv\|hantro\|rkvdec": expect new RFC v2 trace messages if the patches log any, and in any case no errors.
  3. 5-codec sweep:
    • H.264 → was 71ac099b…. Compare.
    • HEVC → was 06b2c5a0…. Compare.
    • VP9 → was 4f1565e8…. Must hold.
    • MPEG-2 → was 19eefbf4…. Must hold.
    • VP8 → was bcc57ed5…. Compare (may change).
  4. γ dump verifies what's in CAPTURE buffers post-DQBUF.

iter12 PASS = libva hashes for Bug 4 + Bug 5 codecs change toward kdirect values + 4 working anchors hold.

iter12 PARTIAL = kernel upgraded cleanly, no regression, but Bug 4 + 5 hashes unchanged.
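
The verdict rule can be written down as a small classifier over the sweep results. This is a sketch under stated assumptions: "toward kdirect" is read as an exact match with the kdirect hash, a changed must-hold anchor is reported as REGRESSION (a label this plan does not define), and the row format and the `classify_sweep` name are invented for illustration. The kdirect reference hashes are not listed in this plan and have to be supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch: classify a sweep from rows of "codec role old new kdirect".
#   role=bug    -> PASS only if new equals the kdirect hash (assumption)
#   role=anchor -> must hold: new must equal old, else REGRESSION (assumed label)
#   other roles -> informational, ignored for the verdict
classify_sweep() {
  local codec role old new kdirect
  local verdict=PASS
  while read -r codec role old new kdirect; do
    case "$codec" in
      ''|'#'*) continue ;;          # skip blank lines and comments
    esac
    case "$role" in
      anchor)
        [ "$new" = "$old" ] || { echo REGRESSION; return 0; }
        ;;
      bug)
        [ "$new" = "$kdirect" ] || verdict=PARTIAL
        ;;
    esac
  done
  echo "$verdict"
}
```

Under this reading, a Bug 4 or Bug 5 hash that changes but still differs from kdirect counts as PARTIAL; such a shift would deserve a manual look at the γ dump before closing the iteration.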

Iter12 close shape

Both PASS and PARTIAL are valuable: whatever the libva pixel result, the kernel substrate moves to vb2_dma_resv-enabled, which the campaign has wanted since iter1 closed reference_dmabuf_resv_blocker.md.