fresnel-fourier/phase4_iter12_plan.md
marfrit de889898b8 iter12 Phase 4 + 6: integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2
User signaled RFC v2 is prepared at boltzmann:~/v2-patch-work/v2-out/.
Three patches:
  0001 media: videobuf2: add opt-in dma_resv producer fence helper
  0002 media: hantro: attach dma_resv release fence at device_run
  0003 media: rockchip-rga: attach dma_resv release fence at ...

v2 key change vs v1: attach moves from buf_queue to m2m device_run
(Dufresne's finite-time-contract concern). Build the kernel package
on boltzmann (~/src/kernel-agent-bootstrap/.../linux-fresnel-fourier/),
deploy to fresnel, reboot, retest.

sddm auto-login as mfritsche staged in /etc/sddm.conf.d/20-autologin.conf
on fresnel before reboot per user authorization.

Phase 0's α-16 OUTPUT-byte dump candidate parked; kernel substrate
upgrade takes precedence given RFC v2 is the long-stalled
reference_dmabuf_resv_blocker.md unblock.

Iter12 outcomes:
  PASS  = Bug 4/5 hashes shift toward kdirect after reboot.
  PARTIAL = kernel upgraded cleanly, no regression, hashes unchanged.

Either outcome is valuable — substrate moves forward regardless.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:56:15 +00:00

# Iteration 12 — Phase 4 plan + Phase 6 implementation
Major pivot at iter12 Phase 0 close: user signaled "RFC v2 is already prepared and should be included in fresnel's running kernel." This changes iter12's vehicle from "OUTPUT byte dump (α-16)" to **"integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2 and reboot fresnel."**
## Why this matters for Bug 4 + Bug 5
Per `reference_dmabuf_resv_blocker.md`:
> RK3399 hantro/rkvdec CAPTURE → libva readback returns all-zero pages until vb2_dma_resv kernel patches land.
The bugs we've been chasing (Bug 4 H.264 partial-fill, Bug 5 HEVC all-zero) are not in libva's wire-payload. iter9 + iter11 exhausted the wire-byte search space with 10 confirmed-inert field changes (constraint_set_flags, POC sentinel, reference_ts magnitude, sps_max_num_reorder_pics, IRAP/IDR flags, num_entry_point_offsets, …). The kernel decodes correctly; the readback via libva's cached-mmap returns stale / zero / partial-write data.
RFC v2 attaches a `dma_resv` exclusive WRITE fence to V4L2-produced dmabufs at `device_run` time. Userspace consumers that implicit-sync via `poll(POLLIN)` or `DMA_BUF_IOCTL_EXPORT_SYNC_FILE` will:
- See a real signalled fence after decode completes (not the stub).
- Get cache-coherent buffer access guaranteed by the fence semantics.
Whether this fixes our libva → ffmpeg-vaapi-hwdownload pixel readback path specifically depends on whether ffmpeg-vaapi's read uses the fence. If it does (or if installing the patch incidentally fixes the cache-coherency mechanism it touches), Bugs 4 + 5 may close.
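As a rough illustration of the implicit-sync path described above, a consumer holding a dmabuf file descriptor can block on the producer's write fence with an ordinary `poll(POLLIN)`. The sketch below assumes the fd was exported from the V4L2 CAPTURE queue (for example via `VIDIOC_EXPBUF`); the function name `wait_for_write_fence` is hypothetical and not part of the RFC or of libva.

```python
import select


def wait_for_write_fence(dmabuf_fd: int, timeout_ms: int = 1000) -> bool:
    """Return True once the fd reports POLLIN, False on timeout.

    On a dma-buf fd, POLLIN becomes ready when the pending write
    (exclusive) fences in the buffer's dma_resv have signalled.
    Assumption: dmabuf_fd refers to a buffer whose producer actually
    attaches such a fence; without one, poll returns immediately.
    """
    poller = select.poll()
    poller.register(dmabuf_fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(revents & select.POLLIN for _fd, revents in events)
```

Whether ffmpeg-vaapi-hwdownload performs an equivalent wait before reading is exactly the open question; this only shows what a fence-aware reader would do.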
## Patches
Located at `boltzmann:~/v2-patch-work/v2-out/`:
| Patch | Subject | Size |
|---|---|---|
| 0000 | cover letter | 7,063 bytes |
| 0001 | media: videobuf2: add opt-in dma_resv producer fence helper | 13,089 bytes |
| 0002 | media: hantro: attach dma_resv release fence at device_run | 4,136 bytes |
| 0003 | media: rockchip-rga: attach dma_resv release fence at … | 4,469 bytes |
Cover-letter excerpt:
> v2 of the vb2 dma_resv producer-fence series. Two substantive changes addressing v1 review:
> 1. Attach point moves from buf_queue to m2m device_run. v1 attached at buf_queue, which violates the dma_fence finite-time contract (Nicolas Dufresne, [linux-rockchip][2]): a fence published when userspace QBUFs is observable to consumers immediately, but the OUTPUT-side driver may be starved indefinitely if userspace never queues bitstream chunks. By moving the attach to device_run, we publish only after the m2m core has committed to running the job.
## Integration plan
1. Copy the 3 patches into `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` on boltzmann as `0004-*.patch`, `0005-*.patch`, `0006-*.patch`. ✓ done.
2. Update PKGBUILD source=() array to include them. ✓ done.
3. Bump `pkgrel` from 1 to 2. ✓ done.
4. Run `makepkg -fs --noconfirm` on boltzmann (native, 8 cores, no distcc per kernel-agent policy). ⏳ in progress.
5. Stage sddm auto-login as `mfritsche` (`/etc/sddm.conf.d/20-autologin.conf`) on fresnel before reboot. ✓ done.
6. scp `linux-fresnel-fourier-7.0-2-aarch64.pkg.tar.zst` from boltzmann to fresnel.
7. `sudo pacman -U` on fresnel.
8. Reboot fresnel.
9. Verify new kernel running (`uname -a` reports `7.0.0-fresnel-fourier` with newer build timestamp).
10. Run 5-codec sweep + γ instrumentation to check whether Bug 4 + Bug 5 hashes change.
## Risks
- **R-1**: RFC v2 patch doesn't apply cleanly to mmind v7.0 base. Probability: low — same base as v1 which built clean (existing 7.0-1 package). Mitigation: if patch fails, fix conflicts manually on boltzmann; rebuild.
- **R-2**: Kernel boots but new module signatures break. Probability: very low — patches are media-driver-only; no symbol exports the rest of the kernel depends on. Mitigation: keep 7.0-1 entry in extlinux menu (PKGBUILD's hook already creates a parallel boot option per the PKGBUILD comment "User picks at u-boot menu. Reverting = boot the linux-eos-arm entry.").
- **R-3**: Kernel boots but libva-vaapi still produces broken HEVC/H.264 output. **The most likely outcome**: RFC v2's fence machinery may not engage with libva's specific cached-mmap readback (ffmpeg-vaapi-hwdownload). Bug 4 + Bug 5 stay open; iter12 closes as PARTIAL with the kernel substrate upgraded.
- **R-4**: Kernel boots and FIXES Bug 4 + Bug 5. Bigger campaign milestone. Possible if libva's path implicitly benefits from the fence-publishing.
- **R-5**: Some other regression (panfrost, sound, USB) post-reboot. Probability: very low (patches are media-only). Mitigation: rollback via extlinux 7.0-1 option.
## Phase 5 review concerns
Reviewer should confirm:
- The 3 patches modify only `drivers/media/common/videobuf2/`, `drivers/media/platform/verisilicon/hantro/`, `drivers/media/platform/rockchip/rga/`. No accidental wide changes.
- PKGBUILD source array correctness (all 3 patches listed; SKIP'd checksums for all entries).
- `extlinux-add` hook still creates a parallel boot entry so 7.0-1 remains as fallback.
This is a quick review since the user has already prepared the patches and authorized the kernel work flow.
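The first review item (patches touch only the three expected directories) can be checked mechanically. The following is a minimal sketch that scans `diff --git` headers in a patch file's text; the helper name `out_of_scope_paths` is hypothetical, and the allowed prefixes are simply the three directories listed above.

```python
import re

# Directories the review expects the series to be confined to.
ALLOWED_PREFIXES = (
    "drivers/media/common/videobuf2/",
    "drivers/media/platform/verisilicon/hantro/",
    "drivers/media/platform/rockchip/rga/",
)

# Matches the per-file header that git format-patch emits.
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def out_of_scope_paths(patch_text, allowed_prefixes=ALLOWED_PREFIXES):
    """Return the sorted list of touched paths outside the allowed prefixes."""
    touched = set()
    for old_path, new_path in DIFF_HEADER.findall(patch_text):
        touched.update((old_path, new_path))
    prefixes = tuple(allowed_prefixes)
    return sorted(p for p in touched if not p.startswith(prefixes))
```

An empty result for each of the three patch files would satisfy the first bullet; any path it returns (a header under `include/`, for instance) is something for the reviewer to look at rather than an automatic failure.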
## Phase 7 verification
After reboot:
1. `uname -a` shows new build timestamp.
2. `dmesg | grep -i "vb2\|dma_resv\|hantro\|rkvdec"` shows either new RFC v2 trace messages or, failing that, no errors.
3. 5-codec sweep:
   - H.264 → was `71ac099b…`. Compare.
   - HEVC → was `06b2c5a0…`. Compare.
   - VP9 → was `4f1565e8…`. Must hold.
   - MPEG-2 → was `19eefbf4…`. Must hold.
   - VP8 → was `bcc57ed5…`. Compare (may change).
4. γ dump verifies what's in CAPTURE buffers post-DQBUF.
iter12 PASS = libva hashes for Bug 4 + Bug 5 codecs change toward kdirect values + 4 working anchors hold.
iter12 PARTIAL = kernel upgraded cleanly, no regression, but Bug 4 + 5 hashes unchanged.
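The PASS / PARTIAL rule can be written down as a small classifier. This is a sketch under stated assumptions: "change toward kdirect" is read here as "equals the kdirect hash", since hash values have no meaningful distance; the `REGRESSION` and `INCONCLUSIVE` labels are additions for cases the plan does not name; and the kdirect hashes must be supplied by the caller because they are not recorded in this file. `BASELINE` holds only the truncated prefixes from the sweep list.

```python
# Pre-reboot libva hash prefixes, as listed in the 5-codec sweep.
BASELINE = {
    "H.264": "71ac099b",
    "HEVC": "06b2c5a0",
    "VP9": "4f1565e8",
    "MPEG-2": "19eefbf4",
    "VP8": "bcc57ed5",
}

BUG_CODECS = ("H.264", "HEVC")   # Bug 4 and Bug 5
MUST_HOLD = ("VP9", "MPEG-2")    # marked "Must hold" in the sweep


def classify_iter12(before, after, kdirect,
                    bug_codecs=BUG_CODECS, must_hold=MUST_HOLD):
    """Classify a post-reboot sweep against the pre-reboot baseline."""
    # Any change in a must-hold codec is treated as a regression.
    if any(after[c] != before[c] for c in must_hold):
        return "REGRESSION"
    # Assumption: PASS means both bug codecs now match kdirect exactly.
    if all(after[c] == kdirect[c] for c in bug_codecs):
        return "PASS"
    # Bug hashes unchanged and nothing regressed.
    if all(after[c] == before[c] for c in bug_codecs):
        return "PARTIAL"
    # Bug hashes moved but do not match kdirect: needs a manual look.
    return "INCONCLUSIVE"
```

Compare hashes at the same truncation length on both sides (here, the first 8 hex digits), otherwise every entry will appear to have changed.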
## Iter12 close shape
Both PASS and PARTIAL are valuable: whatever the libva pixel result, the kernel substrate moves to vb2_dma_resv-enabled, which the campaign has wanted since iter1 closed `reference_dmabuf_resv_blocker.md`.