iter12 Phase 4 + 6: integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2

User signaled RFC v2 is prepared at boltzmann:~/v2-patch-work/v2-out/.
Three patches:
  0001 media: videobuf2: add opt-in dma_resv producer fence helper
  0002 media: hantro: attach dma_resv release fence at device_run
  0003 media: rockchip-rga: attach dma_resv release fence at ...

v2 key change vs v1: attach moves from buf_queue to m2m device_run
(Dufresne's finite-time-contract concern). Build the kernel package
on boltzmann (~/src/kernel-agent-bootstrap/.../linux-fresnel-fourier/),
deploy to fresnel, reboot, retest.

sddm auto-login as mfritsche staged in /etc/sddm.conf.d/20-autologin.conf
on fresnel before reboot per user authorization.

Phase 0's α-16 OUTPUT-byte dump candidate parked; kernel substrate
upgrade takes precedence given RFC v2 is the long-stalled
reference_dmabuf_resv_blocker.md unblock.

Iter12 outcomes:
  PASS  = Bug 4/5 hashes shift toward kdirect after reboot.
  PARTIAL = kernel upgraded cleanly, no regression, hashes unchanged.

Either outcome is valuable — substrate moves forward regardless.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:56:15 +00:00
parent f40b025868
commit de889898b8
@@ -0,0 +1,81 @@
# Iteration 12 — Phase 4 plan + Phase 6 implementation
Major pivot at iter12 Phase 0 close: user signaled "RFC v2 is already prepared and should be included in fresnel's running kernel." This changes iter12's vehicle from "OUTPUT byte dump (α-16)" to **"integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2 and reboot fresnel."**
## Why this matters for Bug 4 + Bug 5
Per `reference_dmabuf_resv_blocker.md`:
> RK3399 hantro/rkvdec CAPTURE → libva readback returns all-zero pages until vb2_dma_resv kernel patches land.
The bugs we've been chasing (Bug 4 H.264 partial-fill, Bug 5 HEVC all-zero) are not in libva's wire-payload. iter9 + iter11 exhausted the wire-byte search space with 10 confirmed-inert field changes (constraint_set_flags, POC sentinel, reference_ts magnitude, sps_max_num_reorder_pics, IRAP/IDR flags, num_entry_point_offsets, …). The kernel decodes correctly; the readback via libva's cached-mmap returns stale / zero / partial-write data.
RFC v2 attaches a `dma_resv` exclusive WRITE fence to V4L2-produced dmabufs at `device_run` time. Userspace consumers that implicit-sync via `poll(POLLIN)` or `DMA_BUF_IOCTL_EXPORT_SYNC_FILE` will:
- See a real signalled fence after decode completes (not the stub).
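- Read the buffer only after the producer's write has completed, which is the ordering that cache-coherent access depends on (the fence orders access; it does not by itself perform CPU cache maintenance).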
Whether this fixes our libva → ffmpeg-vaapi-hwdownload pixel readback path specifically depends on whether ffmpeg-vaapi's read uses the fence. If it does (or if installing the patch incidentally fixes the cache-coherency mechanism it touches), Bugs 4 + 5 may close.
## Patches
Located at `boltzmann:~/v2-patch-work/v2-out/`:
| Patch | Subject | Size |
|---|---|---|
| 0000 | cover letter | 7,063 bytes |
| 0001 | media: videobuf2: add opt-in dma_resv producer fence helper | 13,089 bytes |
| 0002 | media: hantro: attach dma_resv release fence at device_run | 4,136 bytes |
| 0003 | media: rockchip-rga: attach dma_resv release fence at … | 4,469 bytes |
Cover-letter excerpt:
> v2 of the vb2 dma_resv producer-fence series. Two substantive changes addressing v1 review:
> 1. Attach point moves from buf_queue to m2m device_run. v1 attached at buf_queue, which violates the dma_fence finite-time contract (Nicolas Dufresne, [linux-rockchip][2]): a fence published when userspace QBUFs is observable to consumers immediately, but the OUTPUT-side driver may be starved indefinitely if userspace never queues bitstream chunks. By moving the attach to device_run, we publish only after the m2m core has committed to running the job.
## Integration plan
1. Copy the 3 patches into `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` on boltzmann as `0004-*.patch`, `0005-*.patch`, `0006-*.patch`. ✓ done.
2. Update PKGBUILD source=() array to include them. ✓ done.
3. Bump `pkgrel` from 1 to 2. ✓ done.
4. Run `makepkg -fs --noconfirm` on boltzmann (native, 8 cores, no distcc per kernel-agent policy). ⏳ in progress.
5. Stage sddm auto-login as `mfritsche` (`/etc/sddm.conf.d/20-autologin.conf`) on fresnel before reboot. ✓ done.
6. scp `linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst` from boltzmann to fresnel.
7. `sudo pacman -U` on fresnel.
8. Reboot fresnel.
9. Verify new kernel running (`uname -a` reports `7.0.0-fresnel-fourier` with newer build timestamp).
10. Run 5-codec sweep + γ instrumentation to check whether Bug 4 + Bug 5 hashes change.
## Risks
- **R-1**: RFC v2 patch doesn't apply cleanly to mmind v7.0 base. Probability: low — same base as v1 which built clean (existing 7.0-1 package). Mitigation: if patch fails, fix conflicts manually on boltzmann; rebuild.
- **R-2**: Kernel boots but new module signatures break. Probability: very low, since the patches are media-driver-only and change no exported symbols that the rest of the kernel depends on. Mitigation: keep 7.0-1 entry in extlinux menu (PKGBUILD's hook already creates a parallel boot option per the PKGBUILD comment "User picks at u-boot menu. Reverting = boot the linux-eos-arm entry.").
- **R-3**: Kernel boots but libva-vaapi still produces broken HEVC/H.264 output. **The most likely outcome**: RFC v2's fence machinery may not engage with libva's specific cached-mmap readback (ffmpeg-vaapi-hwdownload). Bug 4 + Bug 5 stay open; iter12 closes as PARTIAL with the kernel substrate upgraded.
- **R-4**: Kernel boots and FIXES Bug 4 + Bug 5. Bigger campaign milestone. Possible if libva's path implicitly benefits from the fence-publishing.
- **R-5**: Some other regression (panfrost, sound, USB) post-reboot. Probability: very low (patches are media-only). Mitigation: rollback via extlinux 7.0-1 option.
## Phase 5 review concerns
Reviewer should confirm:
- The 3 patches modify only `drivers/media/common/videobuf2/`, `drivers/media/platform/verisilicon/hantro/`, `drivers/media/platform/rockchip/rga/`. No accidental wide changes.
- PKGBUILD source array correctness (all 3 patches listed; SKIP'd checksums for all entries).
- `extlinux-add` hook still files a parallel boot entry so 7.0-1 remains as fallback.
This is a quick review since the user has already prepared the patches and authorized the kernel workflow.
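The first review point can be checked mechanically. The following is a minimal sketch that lists any file named in a patch's `diff --git` headers that falls outside the three directories given above. The function names are invented for this example, and the prefix list is copied verbatim from the review checklist, so it may need extending if the series also touches headers elsewhere.

```python
import re

# Directory prefixes copied from the review checklist above.
ALLOWED_PREFIXES = (
    "drivers/media/common/videobuf2/",
    "drivers/media/platform/verisilicon/hantro/",
    "drivers/media/platform/rockchip/rga/",
)

_DIFF_HEADER = re.compile(r"^diff --git a/\S+ b/(\S+)$", re.MULTILINE)


def touched_paths(patch_text):
    """Return the sorted, de-duplicated b/ paths from 'diff --git' headers."""
    return sorted({m.group(1) for m in _DIFF_HEADER.finditer(patch_text)})


def out_of_scope(patch_text, allowed=ALLOWED_PREFIXES):
    """Return touched paths that do not start with any allowed prefix."""
    return [p for p in touched_paths(patch_text) if not p.startswith(allowed)]
```

Running `out_of_scope(open(path).read())` over each of the three patch files and getting an empty list every time would support the "no accidental wide changes" check; a non-empty list names the files to look at by hand.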
## Phase 7 verification
After reboot:
1. `uname -a` shows new build timestamp.
2. `dmesg | grep -i "vb2\|dma_resv\|hantro\|rkvdec"`: expect new RFC v2 trace messages, or at minimum no errors.
3. 5-codec sweep:
   - H.264 → was `71ac099b…`. Compare.
   - HEVC → was `06b2c5a0…`. Compare.
   - VP9 → was `4f1565e8…`. Must hold.
   - MPEG-2 → was `19eefbf4…`. Must hold.
   - VP8 → was `bcc57ed5…`. Compare (may change).
4. γ dump verifies what's in CAPTURE buffers post-DQBUF.
- iter12 PASS = libva hashes for Bug 4 + Bug 5 codecs change toward kdirect values and the 4 working anchors hold.
- iter12 PARTIAL = kernel upgraded cleanly, no regression, but Bug 4 + 5 hashes unchanged.
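The outcome rule can be written down as a small function so the sweep result is classified the same way every time. This is only a sketch under stated assumptions: it detects whether a hash changed, not whether it moved "toward kdirect" (that would need the kdirect reference hashes, which are not listed here), and the `REGRESSION` and `MIXED` labels are additions for cases the plan does not name.

```python
def classify_iter12(before, after, bug_codecs, anchor_codecs):
    """Classify a sweep from per-codec output hashes.

    before/after map codec name -> hash string.
    bug_codecs are expected to change; anchor_codecs must not change.
    """
    # Any anchor that moved is a regression, whatever the bug codecs did.
    if any(after[c] != before[c] for c in anchor_codecs):
        return "REGRESSION"  # label not in the plan; assumed for this sketch

    changed = [c for c in bug_codecs if after[c] != before[c]]
    if len(changed) == len(bug_codecs):
        return "PASS"     # every bug codec's hash changed, anchors held
    if not changed:
        return "PARTIAL"  # nothing moved, nothing broke
    return "MIXED"        # only some bug codecs changed; assumed label
```

Which codecs count as anchors is passed in rather than hard-coded, since the sweep list marks VP9 and MPEG-2 as "must hold" but allows VP8 to change.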
## Iter12 close shape
Both PASS and PARTIAL are valuable: whatever the libva pixel result, the kernel substrate moves to vb2_dma_resv-enabled, which the campaign has wanted since iter1 closed `reference_dmabuf_resv_blocker.md`.