fresnel-fourier/phase8_iteration12_close.md
marfrit 33f74b07c8 iter12 Phase 8 close: kernel 7.0-2 with RFC v2 deployed; Bug 4/5 unchanged
Boltzmann built linux-fresnel-fourier 7.0-2 in ~50 min (8-core native,
no distcc). Package sha 843fd4462a09b3d9... Deployed to fresnel:
sudo pacman -U clean. extlinux hook updated entry. sddm autologin as
mfritsche persisted. Reboot succeeded; fresnel up on new kernel
within 30s.

5-codec sweep post-reboot: all 5 hashes BYTE-IDENTICAL to pre-iter12
anchors. RFC v2's dma_resv fence machinery does NOT engage libva's
cached-mmap pixel readback path. Consistent with what
reference_dmabuf_resv_blocker.md memo always said: vaDeriveImage /
cached-mmap is the broken path; RFC v2 helps DRM_PRIME / compositor
paths.

Substrate state moved forward (kernel 7.0-1 -> 7.0-2 with RFC v2).
Memory entries updated:
  reference_fresnel_kernel_substrate.md (pkg version + patch list)
  feedback_rfc_v2_vb2_dma_resv_scope.md (NEW — scope clarification)

iter13 candidates ranked:
  α-17: DMA_BUF_IOCTL_SYNC(START|END) in libva backend around image
        read sites (~30 LOC).
  α-18: switch libva image export to DRM_PRIME (larger refactor).
  α-16: OUTPUT byte dump (deferred again).

α-17 is the natural follow-on — Figa's 2024 "userspace responsibility
for explicit sync" line directly addresses the libva-cached-mmap path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 07:47:51 +00:00


Iteration 12 — Phase 8 (close)

Closes 2026-05-14. iter12 = integrate RFC v2 (vb2_dma_resv producer fences) into linux-fresnel-fourier kernel + verify. PARTIAL close.

Outcome

| Metric | Value |
|---|---|
| Iteration target | Kernel substrate upgrade (RFC v2) + Bug 4/5 retest |
| Kernel version | linux-fresnel-fourier 7.0-1 → 7.0-2 |
| Build host | boltzmann (~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/) |
| Build duration | ~50 min on 8-core boltzmann, native (no distcc per kernel-agent policy) |
| Build hash (kernel pkg) | 843fd4462a09b3d9… |
| Reboot | Successful; sddm autologin for mfritsche kicked in cleanly |
| Bug 4 / Bug 5 | NO MOVEMENT: 5-codec hashes byte-identical to anchors |

5-codec sweep post-reboot

| Codec | Anchor | iter12 hash | Verdict |
|---|---|---|---|
| H.264 | 71ac099b… | 71ac099b… | unchanged (Bug 4 still open) |
| HEVC | 06b2c5a0… | 06b2c5a0… | unchanged (Bug 5 still open) |
| VP9 | 4f1565e8… | 4f1565e8… | PASS unchanged |
| MPEG-2 | 19eefbf4… | 19eefbf4… | PASS unchanged |
| VP8 | bcc57ed5… | bcc57ed5… | unchanged (Bug 6 still open) |
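The verdict column is a prefix comparison against the stored anchors. A minimal sketch of that check follows; `compare_sweep` and the dict layout are hypothetical names for illustration, not the campaign's actual sweep tooling, and only the truncated hash prefixes shown in the table are used.

```python
# Hash prefixes copied from the sweep table; full digests are elided in the source.
ANCHORS = {
    "H.264": "71ac099b",
    "HEVC": "06b2c5a0",
    "VP9": "4f1565e8",
    "MPEG-2": "19eefbf4",
    "VP8": "bcc57ed5",
}
ITER12 = {
    "H.264": "71ac099b",
    "HEVC": "06b2c5a0",
    "VP9": "4f1565e8",
    "MPEG-2": "19eefbf4",
    "VP8": "bcc57ed5",
}


def compare_sweep(anchors, current):
    """Return {codec: 'unchanged' | 'CHANGED' | 'missing'} by anchor-prefix match."""
    verdicts = {}
    for codec, anchor in anchors.items():
        digest = current.get(codec)
        if digest is None:
            verdicts[codec] = "missing"      # codec was not run in this sweep
        elif digest.startswith(anchor):
            verdicts[codec] = "unchanged"    # output identical to the anchor
        else:
            verdicts[codec] = "CHANGED"      # output moved; needs triage
    return verdicts


if __name__ == "__main__":
    for codec, verdict in compare_sweep(ANCHORS, ITER12).items():
        print(f"{codec:7s} {verdict}")
```

"unchanged" here only means the output did not move; whether that output is correct (PASS) or a known-bad anchor (Bug 4/5/6) is tracked separately, as in the table.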

What RFC v2 fixes (and doesn't)

Confirmed by re-reading reference_dmabuf_resv_blocker.md in light of this result:

vaDeriveImage / cached-mmap returns all-zero on RK3399; mpv --vo=image or ffmpeg-v4l2request DRM_PRIME are the cache-safe verifiers.

RFC v2 publishes a real dma_resv exclusive WRITE fence on V4L2-produced dmabufs at device_run time. Consumers that observe via:

  • poll(POLLIN) on the dmabuf fd
  • DMA_BUF_IOCTL_EXPORT_SYNC_FILE + sync_file wait
  • EGL/Vulkan implicit-sync extensions

…now see a real signalled fence after decode completes (pre-RFC it was a stub fence). This addresses the compositor / GL consumer problem: Wayland compositors importing CAPTURE dmabufs no longer pull pixels before decode completes (the "solid green frames" symptom on RK3566 + Mali-G52 described in mfritsche's RFC v1 cover letter).
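For reference, the first two consumer paths can be sketched from userspace as below. This is an illustration under assumptions, not code from the campaign: the function names are hypothetical, the ioctl number is the `_IOWR('b', 2, struct dma_buf_export_sync_file)` encoding from `<linux/dma-buf.h>` on the common Linux ioctl layout (arm64, x86), and `dmabuf_fd` is assumed to be a CAPTURE buffer exported via VIDIOC_EXPBUF.

```python
import fcntl
import select
import struct

# _IOWR('b', 2, struct dma_buf_export_sync_file) from <linux/dma-buf.h>
DMA_BUF_IOCTL_EXPORT_SYNC_FILE = 0xC0086202
DMA_BUF_SYNC_READ = 1 << 0


def wait_readable(fd, timeout_ms):
    """poll(POLLIN) on fd; on a dmabuf this waits for its write fences to signal."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))


def export_read_sync_file(dmabuf_fd):
    """Export a sync_file fd covering the fences a reader has to wait on."""
    # struct dma_buf_export_sync_file { __u32 flags; __s32 fd; }
    arg = bytearray(struct.pack("Ii", DMA_BUF_SYNC_READ, -1))
    fcntl.ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, arg)  # kernel fills in fd
    _flags, sync_fd = struct.unpack("Ii", arg)
    return sync_fd  # caller polls this fd for POLLIN, then closes it
```

Neither call touches CPU caches; both only order the reader after the producer's fence, which is why they help fence-aware consumers and not the cached-mmap readback.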

RFC v2 does NOT touch the cached-mmap readback path that libva-vaapi + ffmpeg-hwdownload exercise. That path does:

  1. mmap CAPTURE buffer at init.
  2. After DQBUF, call vaDeriveImage; pointer is the mmap region.
  3. ffmpeg-hwdownload memcpy's from that pointer.
  4. Without DMA_BUF_IOCTL_SYNC(START) to invalidate CPU caches, the CPU sees stale data relative to kernel DMA writes.

This path needs dma_buf_begin_cpu_access / dma_buf_end_cpu_access on the kernel side, or explicit DMA_BUF_IOCTL_SYNC from userspace around every read. The 2024 Ming Qian (NXP) videobuf2 cache-sync patch and Hans Verkuil's 9-year-old vb2-cpu-access branch are the alternative tracks for this fix. RFC v2 is orthogonal.
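The explicit-sync bracket can be sketched as follows, for a harness-side reader; the libva backend itself would issue the same ioctl pair from C. Assumptions: `cpu_read_access` and `sync_arg` are hypothetical helper names, the reader holds a dmabuf fd for the CAPTURE buffer (for example from VIDIOC_EXPBUF), and the constants follow `<linux/dma-buf.h>` on the common Linux ioctl layout. The `ioctl` parameter exists only so the call order can be exercised without real hardware.

```python
import fcntl
import struct
from contextlib import contextmanager

# _IOW('b', 0, struct dma_buf_sync) from <linux/dma-buf.h>
DMA_BUF_IOCTL_SYNC = 0x40086200
DMA_BUF_SYNC_READ = 1 << 0
DMA_BUF_SYNC_START = 0 << 2
DMA_BUF_SYNC_END = 1 << 2


def sync_arg(flags):
    """Pack struct dma_buf_sync { __u64 flags; }."""
    return struct.pack("Q", flags)


@contextmanager
def cpu_read_access(dmabuf_fd, ioctl=fcntl.ioctl):
    """Bracket a CPU read of a cached dmabuf mapping with SYNC START / END."""
    # START|READ: make device writes visible to the CPU before reading.
    ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC,
          sync_arg(DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ))
    try:
        yield
    finally:
        # END|READ: tell the kernel the CPU access window is over.
        ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC,
              sync_arg(DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ))
```

Usage would look like `with cpu_read_access(fd): frame = bytes(mapping[:size])`, where `mapping` is the mmap of the same buffer. Whether this alone closes Bug 4 / Bug 5 is untested here.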

Why VP9 works while H.264 / HEVC don't (via libva)

Empirical observation iter11/iter12: VP9 hash matches kdirect byte-identical via libva on the SAME readback mechanism that fails for H.264/HEVC. The cache-coherency theory predicts ALL codecs should fail — they don't.

Plausible explanations (none confirmed; empirical testing is deferred to a future iteration):

  • VP9 decode timing is short enough that the mmap region's CPU cache is naturally clean by the time hwdownload reads (test fixture is 720p10s with sparse keyframes).
  • VP9's NV12 output may land in a different memory region (different cap_pool allocation pattern) that has different cache characteristics.
  • VP9's smaller frame size (1280×720) fits in L2/L3 cache windows where H.264 1080p doesn't.

Either way: the readback path is fragile, not uniformly broken.

Lessons

  1. Kernel substrate upgrade does not necessarily move codec-correctness needles. RFC v2 is a real architectural improvement but addresses a path orthogonal to libva's mmap readback. Memorialize.
  2. The PROMISE of reference_dmabuf_resv_blocker.md was always conditional: vb2_dma_resv patches help DRM_PRIME consumers. The libva cached-mmap readback was never specifically RFC v1/v2's target.
  3. The campaign's transitive-proof discipline (backend payload == kdirect, kdirect == SW) remains the valid verification path for libva codec correctness. RFC v2 is upstream-direction work that benefits Wayland compositors and explicit-sync consumers — orthogonal to the campaign's libva→ffmpeg-vaapi→hwdownload chain.

What did move

Substrate state update:

  • The "vb2_dma_resv … excluded pending RFC v2" note in reference_fresnel_kernel_substrate.md no longer holds: RFC v2 is INCLUDED in 7.0-2. Memory entry needs updating.
  • Fresnel kernel pkg: 7.0-1 (prior) → 7.0-2 (sha256 843fd4462a09b3d9…).
  • Backend SHA unchanged (521c1474… from iter11 close still installed; no libva rebuild this iter).

iter12 → iter13 handoff

Substrate at close:

  • Kernel 7.0-2 with RFC v2 patches.
  • Backend 521c1474… (iter11 close, α-13 + α-14 wire hygiene shipped).
  • sddm autologin as mfritsche persisted in /etc/sddm.conf.d/20-autologin.conf.

Iter13 candidates (Bug 4 / Bug 5 still open):

  • α-17 DMA_BUF_IOCTL_SYNC: explicit cache invalidation before CAPTURE buffer read in libva backend. At copy_surface_to_image entry, issue ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) with sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ; at exit, issue the matching call with DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ. ~30 LOC backend change.
  • α-18 DRM_PRIME path: switch libva backend's image export to use VIDIOC_EXPBUF + DRM_PRIME mmap. Bigger refactor but uses the cache-coherent path.
  • α-16 OUTPUT byte dump (deferred from iter12 Phase 0): instrument source_data pre-QBUF to verify bitstream content matches kdirect's. Rules out OUTPUT-side bug class.
  • Bug 6 VP8 partial output (sister direction): re-investigate now that kernel substrate moved.
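For α-16, the comparison step can be sketched as below, assuming the backend's pre-QBUF source_data and kdirect's OUTPUT payload have each been dumped to bytes. `first_mismatch` and `describe` are hypothetical helpers for illustration; the dump mechanism itself is not shown.

```python
import hashlib


def first_mismatch(a, b):
    """Return the first offset where a and b differ, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One payload is a strict prefix of the other.
        return min(len(a), len(b))
    return None


def describe(label, payload):
    """One-line summary: length plus a short sha256 prefix for log output."""
    digest = hashlib.sha256(payload).hexdigest()[:8]
    return f"{label}: {len(payload)} bytes sha256={digest}"
```

A `None` result for every frame would rule out the OUTPUT-side bug class; a non-None offset points at the first byte of the bitstream worth inspecting.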

α-17 is the natural follow-on given the iter12 finding. The cache-sync ioctl is exactly the path Tomasz Figa pointed at in the 2024 linaro-mm-sig discussion (userspace responsibility for explicit sync on cached-mmap V4L2 buffers).

Memory rule

To save in this iteration:

  • RFC v2 (vb2_dma_resv producer fences) addresses compositor/GL/EGL implicit-sync consumers, NOT libva-vaapi's cached-mmap pixel readback. Future Bug 4/Bug 5 work must NOT assume the upstream fence series is the fix. The libva path needs explicit DMA_BUF_IOCTL_SYNC or DRM_PRIME-mediated reads.

Substrate memory needs updating

reference_fresnel_kernel_substrate.md currently says:

running linux-fresnel-fourier 7.0-1 (kernel-agent product, mmind v7.0 + 3 PBP DTS patches). NOT besser-direct. vb2_dma_resv + panfrost iommu-cache excluded pending RFC v2.

Should update to:

running linux-fresnel-fourier 7.0-2 (mmind v7.0 + 3 PBP DTS + RFC v2 vb2_dma_resv producer fence patches). panfrost iommu-cache still excluded.