From de889898b89a2971ba67fe6aa99ea1d70457ecf2 Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Thu, 14 May 2026 06:56:15 +0000
Subject: [PATCH] iter12 Phase 4 + 6: integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

User signaled RFC v2 is prepared at boltzmann:~/v2-patch-work/v2-out/.
Three patches:
  0001 media: videobuf2: add opt-in dma_resv producer fence helper
  0002 media: hantro: attach dma_resv release fence at device_run
  0003 media: rockchip-rga: attach dma_resv release fence at ...

v2 key change vs v1: attach moves from buf_queue to m2m device_run
(Dufresne's finite-time-contract concern).

Build the kernel package on boltzmann
(~/src/kernel-agent-bootstrap/.../linux-fresnel-fourier/), deploy to
fresnel, reboot, retest. sddm auto-login as mfritsche staged in
/etc/sddm.conf.d/20-autologin.conf on fresnel before reboot per user
authorization.

Phase 0's α-16 OUTPUT-byte dump candidate parked; kernel substrate
upgrade takes precedence given RFC v2 is the long-stalled
reference_dmabuf_resv_blocker.md unblock.

Iter12 outcomes:
  PASS = Bug 4/5 hashes shift toward kdirect after reboot.
  PARTIAL = kernel upgraded cleanly, no regression, hashes unchanged.
  Either outcome is valuable: substrate moves forward regardless.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 phase4_iter12_plan.md | 81 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 phase4_iter12_plan.md

diff --git a/phase4_iter12_plan.md b/phase4_iter12_plan.md
new file mode 100644
index 0000000..096dffd
--- /dev/null
+++ b/phase4_iter12_plan.md
@@ -0,0 +1,81 @@
+# Iteration 12: Phase 4 plan + Phase 6 implementation
+
+Major pivot at iter12 Phase 0 close: user signaled "RFC v2 is already prepared and should be included in fresnel's running kernel." This changes iter12's vehicle from "OUTPUT byte dump (α-16)" to **"integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2 and reboot fresnel."**
+
+## Why this matters for Bug 4 + Bug 5
+
+Per `reference_dmabuf_resv_blocker.md`:
+> RK3399 hantro/rkvdec CAPTURE → libva readback returns all-zero pages until vb2_dma_resv kernel patches land.
+
+The bugs we've been chasing (Bug 4 H.264 partial-fill, Bug 5 HEVC all-zero) are not in libva's wire-payload. iter9 + iter11 exhausted the wire-byte search space with 10 confirmed-inert field changes (constraint_set_flags, POC sentinel, reference_ts magnitude, sps_max_num_reorder_pics, IRAP/IDR flags, num_entry_point_offsets, …). The kernel decodes correctly; the readback via libva's cached-mmap returns stale / zero / partial-write data.
+
+RFC v2 attaches a `dma_resv` exclusive WRITE fence to V4L2-produced dmabufs at `device_run` time. Userspace consumers that implicit-sync via `poll(POLLIN)` or `DMA_BUF_IOCTL_EXPORT_SYNC_FILE` will:
+- See a real signalled fence after decode completes (not the stub).
+- Get cache-coherent buffer access guaranteed by the fence semantics.
+
+Whether this fixes our libva → ffmpeg-vaapi-hwdownload pixel readback path specifically depends on whether ffmpeg-vaapi's read uses the fence. If it does (or if installing the patch incidentally fixes the cache-coherency mechanism it touches), Bugs 4 + 5 may close.
+
+## Patches
+
+Located at `boltzmann:~/v2-patch-work/v2-out/`:
+
+| Patch | Subject | Size |
+|---|---|---|
+| 0000 | cover letter | 7,063 bytes |
+| 0001 | media: videobuf2: add opt-in dma_resv producer fence helper | 13,089 bytes |
+| 0002 | media: hantro: attach dma_resv release fence at device_run | 4,136 bytes |
+| 0003 | media: rockchip-rga: attach dma_resv release fence at … | 4,469 bytes |
+
+Cover-letter excerpt:
+> v2 of the vb2 dma_resv producer-fence series. Two substantive changes addressing v1 review:
+> 1. Attach point moves from buf_queue to m2m device_run. v1 attached at buf_queue, which violates the dma_fence finite-time contract (Nicolas Dufresne, [linux-rockchip][2]): a fence published when userspace QBUFs is observable to consumers immediately, but the OUTPUT-side driver may be starved indefinitely if userspace never queues bitstream chunks. By moving the attach to device_run, we publish only after the m2m core has committed to running the job.
+
+## Integration plan
+
+1. Copy the 3 patches into `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` on boltzmann as `0004-*.patch`, `0005-*.patch`, `0006-*.patch`. ✓ done.
+2. Update PKGBUILD source=() array to include them. ✓ done.
+3. Bump `pkgrel` from 1 to 2. ✓ done.
+4. Run `makepkg -fs --noconfirm` on boltzmann (native, 8 cores, no distcc per kernel-agent policy). ⏳ in progress.
+5. Stage sddm auto-login as `mfritsche` (`/etc/sddm.conf.d/20-autologin.conf`) on fresnel before reboot. ✓ done.
+6. scp `linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst` from boltzmann to fresnel.
+7. `sudo pacman -U` on fresnel.
+8. Reboot fresnel.
+9. Verify new kernel running (`uname -a` reports `7.0.0-fresnel-fourier` with newer build timestamp).
+10. Run 5-codec sweep + γ instrumentation to check whether Bug 4 + Bug 5 hashes change.
+
+## Risks
+
+- **R-1**: RFC v2 patch doesn't apply cleanly to mmind v7.0 base. Probability: low; same base as v1 which built clean (existing 7.0-1 package). Mitigation: if patch fails, fix conflicts manually on boltzmann; rebuild.
+- **R-2**: Kernel boots but new module signatures break. Probability: very low; patches are media-driver-only; no symbol exports the rest of the kernel depends on. Mitigation: keep 7.0-1 entry in extlinux menu (PKGBUILD's hook already creates a parallel boot option per the PKGBUILD comment "User picks at u-boot menu. Reverting = boot the linux-eos-arm entry.").
+- **R-3**: Kernel boots but libva-vaapi still produces broken HEVC/H.264 output. **The most likely outcome**: RFC v2's fence machinery may not engage with libva's specific cached-mmap readback (ffmpeg-vaapi-hwdownload). Bug 4 + Bug 5 stay open; iter12 close as PARTIAL with kernel-substrate upgraded.
+- **R-4**: Kernel boots and FIXES Bug 4 + Bug 5. Bigger campaign milestone. Possible if libva's path implicitly benefits from the fence-publishing.
+- **R-5**: Some other regression (panfrost, sound, USB) post-reboot. Probability: very low (patches are media-only). Mitigation: rollback via extlinux 7.0-1 option.
+
+## Phase 5 review concerns
+
+Reviewer should confirm:
+- The 3 patches modify only `drivers/media/common/videobuf2/`, `drivers/media/platform/verisilicon/hantro/`, `drivers/media/platform/rockchip/rga/`. No accidental wide changes.
+- PKGBUILD source array correctness (all 3 patches listed; SKIP'd checksums for all entries).
+- `extlinux-add` hook still files a parallel boot entry so 7.0-1 remains as fallback.
+
+This is a quick review since the user has already prepared the patches and authorized the kernel work flow.
+
+## Phase 7 verification
+
+After reboot:
+1. `uname -a` shows new build timestamp.
+2. `dmesg | grep -i "vb2\|dma_resv\|hantro\|rkvdec"`: any new RFC v2 trace messages OR no errors.
+3. 5-codec sweep:
+   - H.264 → was `71ac099b…`. Compare.
+   - HEVC → was `06b2c5a0…`. Compare.
+   - VP9 → was `4f1565e8…`. Must hold.
+   - MPEG-2 → was `19eefbf4…`. Must hold.
+   - VP8 → was `bcc57ed5…`. Compare (may change).
+4. γ dump verifies what's in CAPTURE buffers post-DQBUF.
+
+iter12 PASS = libva hashes for Bug 4 + Bug 5 codecs change toward kdirect values + 4 working anchors hold.
+iter12 PARTIAL = kernel upgraded cleanly, no regression, but Bug 4 + 5 hashes unchanged.
+
+## Iter12 close shape
+
+Both PASS and PARTIAL are valuable: the kernel substrate moves to vb2_dma_resv-enabled, which the campaign has wanted since iter1 closed `reference_dmabuf_resv_blocker.md`. Whatever the libva pixel result, the kernel substrate upgrade is achieved.
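
The Phase 7 PASS/PARTIAL decision could be mechanized roughly as in the sketch below. It is illustrative only and not part of the patch. The baseline prefixes come from the sweep list; the plan elides the full hashes, so only prefixes are compared. The `kdirect` reference map, the function name `classify_sweep`, and the `REGRESSION` label are hypothetical. "Change toward kdirect" is tightened here to an exact match, and only the two codecs marked "Must hold" are treated as anchors; both are assumptions.

```python
# Sketch only: classify a post-reboot 5-codec sweep against the plan's baselines.
# The plan elides the full hashes ("71ac099b…"), so only these prefixes are known.
BASELINE_PREFIX = {
    "H.264": "71ac099b",
    "HEVC": "06b2c5a0",
    "VP9": "4f1565e8",
    "MPEG-2": "19eefbf4",
    "VP8": "bcc57ed5",
}
BUG_CODECS = ("H.264", "HEVC")   # Bug 4 and Bug 5
MUST_HOLD = ("VP9", "MPEG-2")    # marked "Must hold" in the sweep list


def classify_sweep(after, kdirect):
    """Return "PASS", "PARTIAL", or "REGRESSION" for one sweep.

    after   -- codec -> libva hash measured after the reboot
    kdirect -- codec -> reference hash from the kdirect path (hypothetical input)
    """
    # A "Must hold" codec whose hash left its baseline prefix counts as a regression.
    for codec in MUST_HOLD:
        if not after[codec].startswith(BASELINE_PREFIX[codec]):
            return "REGRESSION"  # hypothetical label; the plan defines only PASS/PARTIAL
    # Assumption: "toward kdirect" is tightened to an exact match for both bug codecs.
    if all(after[codec] == kdirect[codec] for codec in BUG_CODECS):
        return "PASS"
    return "PARTIAL"


if __name__ == "__main__":
    # Placeholder data: baseline prefixes padded with zeros, not real measurements.
    unchanged = {codec: prefix + "0" * 8 for codec, prefix in BASELINE_PREFIX.items()}
    kdirect = {"H.264": "a" * 16, "HEVC": "b" * 16}
    print(classify_sweep(unchanged, kdirect))
```

VP8 is left out of both tuples because the plan marks it "Compare (may change)"; a stricter variant could add it to `MUST_HOLD` once its expected behavior is known.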