dmabuf-wayland green frame: hantro CAPTURE → panfrost zero-copy on RK3566 (iter1) #2
Hardware / software stack
- /dev/video1)
- pfdev->coherent = false (no dma-coherent in DT for the GPU node)
- linux-pinetab2-danctnix-besser 7.0 (Linux 6.12.x base)
- --vo=dmabuf-wayland --hwdec=v4l2request
Symptom
mpv --vo=dmabuf-wayland --hwdec=v4l2request paints a uniform green frame instead of the decoded video. The CPU-mmap path (--vo=gpu) renders correctly, so the decode is fine; only the zero-copy dma_buf import path is broken.
Reproduction screenshot md5 (the green frame, baseline): c8c8e9b88521a0069f709d483451c3d4.
Math sanity: BT.709 limited-range YUV→RGB on YUV(0,0,0) ⇒ RGB(0, 77, 0) ≈ pure green. The symptom is exactly "GPU sampled zero pages."
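The RGB(0, 77, 0) figure can be checked with a few lines of Python. This is a minimal sketch, assuming 8-bit limited-range samples (luma 16-235, chroma 16-240) and the standard BT.709 luma coefficients Kr = 0.2126 and Kb = 0.0722. The function name is illustrative and does not come from mpv or the kernel.

```python
def bt709_limited_to_rgb(y, cb, cr):
    """Convert one 8-bit limited-range BT.709 YCbCr sample to 8-bit full-range RGB."""
    kr, kb = 0.2126, 0.0722
    kg = 1.0 - kr - kb

    # Normalise: luma to [0, 1], chroma to [-0.5, 0.5].
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0

    # Invert the BT.709 luma/chroma definition.
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    g = (yn - kr * r - kb * b) / kg

    def to_8bit(v):
        # Clamp to [0, 1] and scale to 0..255.
        return int(round(min(max(v, 0.0), 1.0) * 255))

    return to_8bit(r), to_8bit(g), to_8bit(b)


# An all-zero NV12 page, as sampled by the GPU:
print(bt709_limited_to_rgb(0, 0, 0))  # (0, 77, 0)
```

Red and blue go strongly negative and clamp to 0, while the two negative chroma terms push green up to about 0.30 of full scale. That is why zeroed pages show as green rather than black.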
Root cause (confirmed via kernel source-read)
v4l2 videobuf2 does not attach a dma_resv release fence to CAPTURE buffers on DQBUF. When userspace exports the CAPTURE buffer via VIDIOC_EXPBUF and hands the fd to Wayland (zwp_linux_dmabuf_v1 → EGL_LINUX_DMA_BUF_EXT), the GPU importer (panfrost) has no per-frame fence to wait on.
Compounding: panfrost's IOMMU mapping for imported dma_bufs is IOMMU_READ | IOMMU_WRITE with no IOMMU_CACHE, and pfdev->coherent is false on RK3566 (no dma-coherent property on the GPU node in the device tree). The Mali samples uncached / non-snooping; hantro's CPU-side writes sit in dirty cache lines and the GPU reads stale zero pages.
Hypothesis space
- wl_dmabuf import path bug
- DMA_BUF_IOCTL_SYNC begin/end_cpu_access fixes it
- IOMMU_CACHE + missing dma_resv fence
Counter-validation: --hwdec=v4l2request --vo=gpu works (CPU mmap triggers cache sync). Only the zero-copy dma_buf path fails.
Reproduction
- besser-7.0 pkgrel=1 (no vb2_dma_resv patches)
- mpv --hwdec=v4l2request --vo=dmabuf-wayland --fullscreen --pause --start=00:00:00.42 --quiet ~/fourier-test/bbb_1080p30_h264.mp4
- spectacle -b -n -o /tmp/shot.png && md5sum /tmp/shot.png
- c8c8e9b88521a0069f709d483451c3d4
Candidate fix (iter1, currently building)
linux-pinetab2-danctnix-besser pkgrel=2 with three RFC patches against vb2:
- 0001-media-videobuf2-add-dma_resv-release-fence-helper.patch
- 0002-media-hantro-attach-dma_resv-release-fence-at-buf_qu.patch
- 0003-media-rockchip-rga-attach-dma_resv-release-fence-at-.patch
These mirror the upstream vb2_dma_resv RFC v1 series (linux-media, 2026-04) authored as part of this campaign. Reviews from Nicolas Dufresne and Christian König collected; v2 in flight.
Pass criterion: post-reboot mpv screenshot md5 differs from c8c8e9b88521a0069f709d483451c3d4.
Pointers
- project_vb2_dma_resv_v2_state.md in fourier auto-memory
- ~/src/linux-rfc (branch vb2-dma-resv-rfc)
- ~/src/dmabuf-modifier-triage/phase2_iter1_findings.md
- probes/expbuf_probe.c (proved kernel exports full 3,657,728-byte dma_buf)
iter1 confirmed: green frame fixed
Kernel linux-pinetab2-danctnix-besser pkgrel=2 with the three vb2_dma_resv RFC v2 patches built clean (after besser#17 cumulative-patch fix) and installed on ohm.
Test result:
- kernel: 7.0.0-danctnix1-2-pinetab2-danctnix-besser
- screenshot md5: f6c6e78291e6cdc020d78f83178caef6
- baseline md5: c8c8e9b88521a0069f709d483451c3d4
Sanity-check on the screenshot (against "could be a different uniform"): 1280×800, 62,750 colors, entropy 0.68, top-frequency color #99BDEAFF (BBB sky-blue, 21,735 px). Visual inspection: actual decoded Big Buck Bunny frame at t=0.42s, with green conifer rows, blue sky, and the Coffee Bunny logo top-left. Real video content, not a uniform fallback.
Reproduction: identical to the bug report: mpv --hwdec=v4l2request --vo=dmabuf-wayland --fullscreen --pause --start=00:00:00.42 ~/fourier-test/bbb_1080p30_h264.mp4, then spectacle -b -n -o /tmp/shot.png.
Mechanism confirmed: vb2's missing dma_resv release fence was the dominant defect. Hantro buf_queue now attaches a release fence at queue-time; panfrost's GPU import waits on it before sampling, so the L2-dirty cache lines are flushed/synced before Mali touches the page. The non-IOMMU_CACHE mapping on pfdev->coherent=false is no longer fatal because the synchronization happens at the dma_resv layer instead.
Closing.
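The screenshot sanity check above (distinct-color count and dominant color) is easy to automate. The following is a rough sketch, assuming the screenshot has already been decoded into a flat sequence of RGBA tuples by some image library. The names summarize_pixels and looks_uniform and the 0.99 threshold are illustrative assumptions, not part of the actual test runner.

```python
from collections import Counter


def summarize_pixels(pixels):
    """Return the distinct-color count and the dominant color of a frame."""
    counts = Counter(pixels)
    color, n = counts.most_common(1)[0]
    return {
        "distinct": len(counts),
        "top_color": color,
        "top_fraction": n / len(pixels),
    }


def looks_uniform(pixels, threshold=0.99):
    """True if a single color covers at least `threshold` of the frame."""
    return summarize_pixels(pixels)["top_fraction"] >= threshold


green = (0, 77, 0, 255)
sky = (0x99, 0xBD, 0xEA, 0xFF)
print(looks_uniform([green] * 1000))               # True
print(looks_uniform([green] * 500 + [sky] * 500))  # False
```

A non-uniform result only shows that something other than a flat color was on screen. It does not identify which window produced those pixels.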
Reopening — iter1 "success" was a false positive
Close was premature. The screenshot md5 difference (f6c6e78 ≠ baseline c8c8e9b8) was treated as proof the patches fixed the green frame, but re-running with mpv -v shows the video chain never initializes:
v4l2request hwdec fails with "Could not create device", decoder falls back to software, hwupload can't convert yuv420p → drm_prime for the dmabuf-wayland VO, video chain init fails, mpv runs audio-only.
The f6c6e78 screenshot showing real BBB content was captured from screen state that was not this mpv invocation, likely a session-restored or stale mpv window from before the test runner pkilled it. The test runner's md5-inequality logic doesn't distinguish "different decoded frame" from "different desktop / different stale window."
What is still confirmed
- 7.0.0-danctnix1-2-pinetab2-danctnix-besser running with the 3 RFC patches in drivers/media/common/videobuf2/.
- videobuf2_dma_contig module loaded.
- ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime -i ~/fourier-test/bbb_1080p30_h264.mp4 -frames:v 5 -f null - engages hantro-vpu and outputs drm_prime (tv, bt709, progressive), 1920x1080.
- mpv --hwdec=v4l2request --vo=null -v reaches "Looking at hwdec h264-v4l2request" but fails at "Could not create device".
What is NOT validated
The vb2_dma_resv RFC v2 patches' effect on the dmabuf-wayland green frame. mpv-fourier 0.41.0-9 cannot drive the path the patches change. Need either:
Next: chase the mpv "Could not create device" failure
Diagnosis ongoing. Likely candidates:
- /dev/dri/renderD128 (perms, libdrm version, panfrost)
Will update with findings.
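Given the md5-inequality flaw described above, a stricter pass criterion could also require that the mpv -v log contains no known failure marker. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, the single failure marker is the "Could not create device" message quoted in this issue, and the caller is assumed to have captured the log text and the screenshot bytes already.

```python
import hashlib

# Baseline md5 of the green-frame screenshot, taken from this issue.
BASELINE_MD5 = "c8c8e9b88521a0069f709d483451c3d4"

# Log substrings indicating the zero-copy path never started (from this issue).
FAILURE_MARKERS = ("Could not create device",)


def zero_copy_frame_changed(mpv_log: str, screenshot: bytes) -> bool:
    """Return True only if the log shows no known failure AND the md5 moved."""
    if any(marker in mpv_log for marker in FAILURE_MARKERS):
        # hwdec never engaged, so the screenshot cannot be a zero-copy frame.
        return False
    return hashlib.md5(screenshot).hexdigest() != BASELINE_MD5
```

Even with this check, a differing md5 does not prove the captured pixels came from the mpv window. Confirming that the screenshot shows the expected frame remains a separate step.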
iter1 actually confirmed — vb2_dma_resv RFC v2 fixes the green frame
The earlier close was a false positive (test-rig captured stale on-screen content); reopen analysis at #issuecomment-281 was correct. After fixing the actual mpv blocker, the bug is empirically resolved.
What was wrong with the test rig
mpv-fourier 0.41.0-9 could not engage --hwdec=v4l2request zero-copy because mpv 0.41 upstream doesn't wire AV_HWDEVICE_TYPE_V4L2REQUEST through the drmprime VO hwdec. mpv's matcher requires equality on lavc_device; the v4l2request decoder needed AV_HWDEVICE_TYPE_V4L2REQUEST but only AV_HWDEVICE_TYPE_DRM was registered → "Could not create device" → fallback to software decode → silent video-chain failure. The test runner's md5-inequality logic mistook "different desktop content captured by spectacle" for "different decoded frame."
What landed
mpv-fourier 1:0.41.0-10 with two added patches:
- 0001-meson-add-detection-logic-for-v4l2request-support.patch
- 0002-vo-hwdec-drmprime-add-separate-hwdecs-for-v4l2reques.patch
Both from Philip Langdale + Jonas Karlman (Aug 2024), originally for DanctNIX's mpv-v4l2request. They add init_v4l2request() to hwdec_drmprime.c that calls av_hwdevice_ctx_create(..., AV_HWDEVICE_TYPE_V4L2REQUEST, ...) and registers a new ra_hwdec_v4l2request driver, plus the meson detection. Built and installed on ohm.
Verification
mpv -v now shows the full zero-copy path:
No "Could not create device", no "hwupload failed", no fallback. The chain is hantro → V4L2 CAPTURE buffer with our dma_resv release fence → VIDIOC_EXPBUF → wl_buffer NV12 → KWin → panfrost EGL_LINUX_DMA_BUF_EXT → Mali sample.
Screenshot result:
- kernel: 7.0.0-danctnix1-2-pinetab2-danctnix-besser (vb2_dma_resv RFC v2 patches applied)
- mpv: mpv-fourier 1:0.41.0-10 (Kwiboo+Langdale v4l2request hwdec wiring)
- command: mpv --hwdec=v4l2request --vo=dmabuf-wayland --fullscreen --pause --start=00:01:00.0 ~/fourier-test/bbb_1080p30_h264.mp4
- baseline md5 (green frame): c8c8e9b88521a0069f709d483451c3d4
- new md5: 3f16fd2471ec0783dfa91040c47c2ad9
What this empirically validates
- buf_queue attach point IS sufficient on RK3566 panfrost zero-copy import: the implicit sync via the dma_resv release fence does land before Mali samples.
- The non-IOMMU_CACHE mapping on pfdev->coherent=false is no longer fatal once the fence wait is wired (kernel-level synchronization handles it).
What v2 still needs to verify
- device_run attach-point move (Nicolas Dufresne's finite-time-fence concern) still owes empirical validation. Separate issue at #3.
- dma_fence_begin/end_signalling() annotations.
Closing.