dmabuf-wayland↔KWin presentation handoff produces solid green on ohm — independent of decoder backend #1

Closed
opened 2026-05-08 13:40:20 +00:00 by marfrit · 4 comments
Owner

Summary

On ohm (PineTab2 / RK3568 / hantro G1 / Mali-G52 / KWin 6 Wayland) as of 2026-05-08, mpv --vo=dmabuf-wayland produces a solid green frame regardless of the decoder backend in use. The bug reproduces with both libva (--hwdec=vaapi against libva-v4l2-request-fourier) and ffmpeg's native V4L2 request hwaccel (--hwdec=v4l2request, libva entirely bypassed). --vo=gpu with the same hwdec produces the correct picture (slow due to GPU shader path on Mali-G52, but content is real bbb), so the decoder is not at fault.

The iter5 close (2026-05-05) of the libva-multiplanar campaign claims "mpv --hwdec=vaapi smooth" on ohm. That claim does not hold today against fresh-install + fresh-Plasma-session interactive testing. Either iter5/6/7/8 changed something about the dmabuf modifier reporting, or KWin / Mesa-panfrost / mpv has had a regression since 2026-05-05 that broke the dmabuf-wayland handoff.

Reproducer

# All three commands run on ohm with libva-v4l2-request-fourier-1.0.0.r280.65969da-1
# installed from [marfrit] and the env vars from /etc/profile.d/libva-v4l2-request.sh.

# 1. Via libva — green (also hits the libva cap_pool/REQBUFS bug, see
#    libva-v4l2-request-fourier#1, but the picture would be green even without it)
mpv --hwdec=vaapi --vo=dmabuf-wayland --target-colorspace-hint=no \
    fourier-test/bbb_1080p30_h264.mp4

# 2. Via ffmpeg V4L2 request hwaccel — also green (no libva at all)
mpv --hwdec=v4l2request --vo=dmabuf-wayland \
    fourier-test/bbb_1080p30_h264.mp4

# 3. Via ffmpeg V4L2 request hwaccel + GPU shader VO — correct picture (slow)
mpv --hwdec=v4l2request --vo=gpu \
    fourier-test/bbb_1080p30_h264.mp4

Result #3 demonstrates that the hantro decoder + ffmpeg + mpv + the dmabuf path are functional. The green only appears when --vo=dmabuf-wayland is selected.

Suspected handshake

From mpv -v startup log (libva path; ffmpeg-v4l2request path produces an analogous list):

[vo/dmabuf-wayland/wayland] Compositor supports parametric image description creator.
[vo/dmabuf-wayland/wayland] Compositor supports setting mastering display primaries.
[vo/dmabuf-wayland/vaapi] Supported Wayland display format nv12: 'NV12(0000000000000000)'

→ KWin advertises only one dmabuf format/modifier combo: NV12 LINEAR (modifier 0x0). Any buffer with a non-LINEAR modifier or non-matching pitch alignment is the suspect handoff path.

The libva-multiplanar campaign already has a relevant finding: iter2 Fix 2 (conditional DRM_FORMAT_MOD_INVALID for non-64-aligned pitch). Hantro G1 produces NV12 with 1920-byte stride for 1920×1080 content, which IS 64-aligned, so iter2's MOD_INVALID path shouldn't fire — the buffer should report MOD_LINEAR. But that does need verifying.

For ffmpeg's path, the modifier comes from the AVDRMFrameDescriptor filled by av_hwframe_map(..., AV_HWFRAME_TRANSFER_DIRECTION_FROM). ffmpeg-v4l2-request-fourier's V4L2 hwaccel may report a different modifier than libva does.

Color hypothesis already ruled out

Earlier in the diagnostic flow, wp_color_manager_v1 HDR PQ misinterpretation was suspected (KWin advertises HDR-capable, mpv tagged SDR content as PQ → SDR-as-HDR rendering = green). Disabled with --target-colorspace-hint=no and the explicit --target-trc=bt.1886 --target-prim=bt.709 --target-peak=100 triple. Neither cleared the green. So it is not a colorspace handshake issue.

Triage suggestions

  1. Capture the actual modifier reported by vaExportSurfaceHandle on libva path and AVDRMFrameDescriptor on ffmpeg path. Compare to KWin's advertised list (NV12(0x0)). If non-zero → that's the mismatch.
  2. Strace KWin's compositor process during a green-frame run, look for linux-dmabuf-v1 request rejections in the protocol log.
  3. A/B against an older KWin: bisect the kwin-fourier package against versions before iter5 close. If green pre-dates KWin upgrades, the bug is decoder-side modifier reporting, not KWin.
  4. Test on a non-RK3568 host with hantro-style decoder if any exists in the fleet — eliminate ohm-specific kernel/driver state.
  5. Check whether KWin's modifier list actually reflects panfrost's capabilities or some other limit. Mali-G52 (Bifrost) panfrost should be able to import NV12 LINEAR without issue, but maybe the import path adds modifier constraints.

Workaround

Use --vo=gpu instead of --vo=dmabuf-wayland. Slower (GPU shader path on Mali-G52), but correct picture. This is the production HW-decode workflow on ohm right now.

Cross-reference

  • Campaign README "Known issues at iter8 production tip" section (commit 20b3dd4).
  • Sibling issue: marfrit/libva-v4l2-request-fourier#1 — separate libva lifecycle bug found in the same test session.

Environment

  • Host: ohm (PineTab2, RK3568, hantro G1, Mali-G52)
  • libva: 2.23.0-1 (VA-API 1.23)
  • libva-v4l2-request-fourier: 1.0.0.r280.65969da-1 (iter8 close)
  • ffmpeg-v4l2-request: 2:8.1-3
  • mpv: v0.41.0
  • Compositor: KWin 6.x (Plasma 6 Wayland)
  • Kernel: EndeavourOS-ARM custom
## Summary On ohm (PineTab2 / RK3568 / hantro G1 / Mali-G52 / KWin 6 Wayland) as of 2026-05-08, `mpv --vo=dmabuf-wayland` produces a solid green frame regardless of the decoder backend in use. The bug reproduces with both libva (`--hwdec=vaapi` against `libva-v4l2-request-fourier`) and ffmpeg's native V4L2 request hwaccel (`--hwdec=v4l2request`, libva entirely bypassed). `--vo=gpu` with the same hwdec produces the **correct picture** (slow due to GPU shader path on Mali-G52, but content is real bbb), so the decoder is not at fault. The iter5 close (2026-05-05) of the libva-multiplanar campaign claims "mpv `--hwdec=vaapi` smooth" on ohm. That claim does **not** hold today against fresh-install + fresh-Plasma-session interactive testing. Either iter5/6/7/8 changed something about the dmabuf modifier reporting, or KWin / Mesa-panfrost / mpv has had a regression since 2026-05-05 that broke the dmabuf-wayland handoff. ## Reproducer ```bash # All three commands run on ohm with libva-v4l2-request-fourier-1.0.0.r280.65969da-1 # installed from [marfrit] and the env vars from /etc/profile.d/libva-v4l2-request.sh. # 1. Via libva — green (also hits the libva cap_pool/REQBUFS bug, see # libva-v4l2-request-fourier#1, but the picture would be green even without it) mpv --hwdec=vaapi --vo=dmabuf-wayland --target-colorspace-hint=no \ fourier-test/bbb_1080p30_h264.mp4 # 2. Via ffmpeg V4L2 request hwaccel — also green (no libva at all) mpv --hwdec=v4l2request --vo=dmabuf-wayland \ fourier-test/bbb_1080p30_h264.mp4 # 3. Via ffmpeg V4L2 request hwaccel + GPU shader VO — correct picture (slow) mpv --hwdec=v4l2request --vo=gpu \ fourier-test/bbb_1080p30_h264.mp4 ``` Result #3 demonstrates that the hantro decoder + ffmpeg + mpv + the dmabuf path are functional. The green only appears when `--vo=dmabuf-wayland` is selected. ## Suspected handshake From mpv `-v` startup log (libva path; ffmpeg-v4l2request path produces an analogous list): ``` [vo/dmabuf-wayland/wayland] Compositor supports parametric image description creator. [vo/dmabuf-wayland/wayland] Compositor supports setting mastering display primaries. [vo/dmabuf-wayland/vaapi] Supported Wayland display format nv12: 'NV12(0000000000000000)' ``` → KWin advertises **only one** dmabuf format/modifier combo: `NV12 LINEAR` (modifier `0x0`). Any buffer with a non-LINEAR modifier or non-matching pitch alignment is the suspect handoff path. The libva-multiplanar campaign already has a relevant finding: iter2 Fix 2 (`conditional DRM_FORMAT_MOD_INVALID for non-64-aligned pitch`). Hantro G1 produces NV12 with 1920-byte stride for 1920×1080 content, which IS 64-aligned, so iter2's `MOD_INVALID` path shouldn't fire — the buffer should report `MOD_LINEAR`. But that does need verifying. For ffmpeg's path, the modifier comes from the `AVDRMFrameDescriptor` filled by `av_hwframe_map(..., AV_HWFRAME_TRANSFER_DIRECTION_FROM)`. ffmpeg-v4l2-request-fourier's V4L2 hwaccel may report a different modifier than libva does. ## Color hypothesis already ruled out Earlier in the diagnostic flow, `wp_color_manager_v1` HDR PQ misinterpretation was suspected (KWin advertises HDR-capable, mpv tagged SDR content as PQ → SDR-as-HDR rendering = green). Disabled with `--target-colorspace-hint=no` and the explicit `--target-trc=bt.1886 --target-prim=bt.709 --target-peak=100` triple. Neither cleared the green. So it is not a colorspace handshake issue. ## Triage suggestions 1. Capture the actual modifier reported by `vaExportSurfaceHandle` on libva path and `AVDRMFrameDescriptor` on ffmpeg path. Compare to KWin's advertised list (`NV12(0x0)`). If non-zero → that's the mismatch. 2. Strace KWin's compositor process during a green-frame run, look for `linux-dmabuf-v1` request rejections in the protocol log. 3. A/B against an older KWin: bisect the kwin-fourier package against versions before iter5 close. If green pre-dates KWin upgrades, the bug is decoder-side modifier reporting, not KWin. 4. Test on a non-RK3568 host with hantro-style decoder if any exists in the fleet — eliminate ohm-specific kernel/driver state. 5. Check whether KWin's modifier list actually reflects panfrost's capabilities or some other limit. Mali-G52 (Bifrost) panfrost should be able to import `NV12 LINEAR` without issue, but maybe the import path adds modifier constraints. ## Workaround Use `--vo=gpu` instead of `--vo=dmabuf-wayland`. Slower (GPU shader path on Mali-G52), but correct picture. This is the production HW-decode workflow on ohm right now. ## Cross-reference - Campaign README "Known issues at iter8 production tip" section (commit `20b3dd4`). - Sibling issue: [marfrit/libva-v4l2-request-fourier#1](https://git.reauktion.de/marfrit/libva-v4l2-request-fourier/issues/1) — separate libva lifecycle bug found in the same test session. ## Environment - Host: ohm (PineTab2, RK3568, hantro G1, Mali-G52) - libva: 2.23.0-1 (VA-API 1.23) - libva-v4l2-request-fourier: 1.0.0.r280.65969da-1 (iter8 close) - ffmpeg-v4l2-request: 2:8.1-3 - mpv: v0.41.0 - Compositor: KWin 6.x (Plasma 6 Wayland) - Kernel: EndeavourOS-ARM custom
Author
Owner

Triage update — root cause isolated 2026-05-08

Layer-by-layer A/B testing on ohm ruled out every layer that doesn't matter, then WAYLAND_DEBUG=1 capture surfaced the actual bug at the protocol-message level.

Ruled out (each via a directed A/B):

  • kwin-fourier 0001 patch (stock arch kwin still greens)
  • Mesa 26.0.5 vs 26.0.6 (downgrade still greens)
  • libva (ffmpeg --hwdec=v4l2request also greens, libva entirely bypassed)
  • Decoder content (vo=gpu --hwdec=v4l2request shows correct picture)
  • Color tagging (--target-colorspace-hint=no + explicit BT.709 SDR triple did nothing)
  • Wayland generally (--vo=wlshm shows correct picture — bypasses dmabuf entirely)
  • Kernel 6.19.10 vs 7.0 besser (boot.scr swap + reboot, both green)
  • KWin Vulkan vs OpenGL backend (qdbus6 confirms KWin uses OpenGL+Mesa-panfrost, exactly the same path mpv's vo=gpu uses successfully)

Root cause isolated: mpv's vo_dmabuf_wayland.c constructs the wl_dmabuf protocol message with internally inconsistent plane semantics. WAYLAND_DEBUG trace:

create_params #54
  add(fd 41, plane=0, offset=0,        stride=1920, mod=0,0)   ← Y plane
  add(fd 42, plane=1, offset=2088960,  stride=1920, mod=0,0)   ← UV plane
create_immed(wl_buffer#56, 1920, 1080, NV12, flags=0)
  • 1920×1088 = 2088960 (Y plane size, height-aligned). Correct offset for single-allocation semantics.
  • Two different fds for plane 0 and plane 1. Correct fd structure for multi-fd-per-plane semantics.
  • Combining the two: KWin reads plane 1 from fd 42 at offset 2088960, but fd 42 only maps the UV plane (~1MB) → offset 2088960 is past EOF → all-zero UV plane → renders dark green (Y=0, U=0, V=0 in BT.601/709 → ~RGB(0,70,0) = the exact dark green observed).

Filed the root-cause issue separately at marfrit/dmabuf-modifier-triage#1 with full diagnosis + suggested fix path. This issue stays open as the user-visible-symptom tracker; closes when an mpv patch lands and the green clears on ohm.

Next deferred verifier: re-run the WAYLAND_DEBUG capture with --hwdec=vaapi (libva path) once libva-v4l2-request-fourier#1 is fixed and the cap_pool/REQBUFS cascade stops blocking libva playback. If both libva and ffmpeg paths produce the same wrong .add() pattern → bug is in mpv VO. If only one does → that path's producer (libva or ffmpeg-v4l2-request) is the bug.

Working ohm HW-decode workflow remains: mpv --hwdec=v4l2request --vo=gpu fourier-test/... (correct picture, slow due to GPU shader path on Mali-G52).

## Triage update — root cause isolated 2026-05-08 Layer-by-layer A/B testing on ohm ruled out every layer that doesn't matter, then `WAYLAND_DEBUG=1` capture surfaced the actual bug at the protocol-message level. **Ruled out** (each via a directed A/B): - kwin-fourier 0001 patch (stock arch kwin still greens) - Mesa 26.0.5 vs 26.0.6 (downgrade still greens) - libva (ffmpeg `--hwdec=v4l2request` also greens, libva entirely bypassed) - Decoder content (`vo=gpu --hwdec=v4l2request` shows correct picture) - Color tagging (`--target-colorspace-hint=no` + explicit BT.709 SDR triple did nothing) - Wayland generally (`--vo=wlshm` shows correct picture — bypasses dmabuf entirely) - Kernel 6.19.10 vs 7.0 besser (boot.scr swap + reboot, both green) - KWin Vulkan vs OpenGL backend (qdbus6 confirms KWin uses OpenGL+Mesa-panfrost, exactly the same path mpv's vo=gpu uses successfully) **Root cause isolated**: mpv's `vo_dmabuf_wayland.c` constructs the wl_dmabuf protocol message with internally inconsistent plane semantics. WAYLAND_DEBUG trace: ``` create_params #54 add(fd 41, plane=0, offset=0, stride=1920, mod=0,0) ← Y plane add(fd 42, plane=1, offset=2088960, stride=1920, mod=0,0) ← UV plane create_immed(wl_buffer#56, 1920, 1080, NV12, flags=0) ``` - 1920×1088 = 2088960 (Y plane size, height-aligned). Correct offset for **single-allocation** semantics. - Two **different fds** for plane 0 and plane 1. Correct fd structure for **multi-fd-per-plane** semantics. - Combining the two: KWin reads plane 1 from fd 42 at offset 2088960, but fd 42 only maps the UV plane (~1MB) → offset 2088960 is past EOF → all-zero UV plane → renders dark green (Y=0, U=0, V=0 in BT.601/709 → ~RGB(0,70,0) = the exact dark green observed). Filed the root-cause issue separately at [marfrit/dmabuf-modifier-triage#1](https://git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/1) with full diagnosis + suggested fix path. This issue stays open as the user-visible-symptom tracker; closes when an mpv patch lands and the green clears on ohm. Next deferred verifier: re-run the WAYLAND_DEBUG capture with `--hwdec=vaapi` (libva path) once libva-v4l2-request-fourier#1 is fixed and the cap_pool/REQBUFS cascade stops blocking libva playback. If both libva and ffmpeg paths produce the same wrong .add() pattern → bug is in mpv VO. If only one does → that path's producer (libva or ffmpeg-v4l2-request) is the bug. Working ohm HW-decode workflow remains: `mpv --hwdec=v4l2request --vo=gpu fourier-test/...` (correct picture, slow due to GPU shader path on Mali-G52).
Author
Owner

Update — earlier root-cause comment overturned by Phase 2 source-read (2026-05-08)

Comment #243 above pointed at marfrit/dmabuf-modifier-triage#1 with a claimed root cause in mpv's vo_dmabuf_wayland.c plane-semantics translation. That diagnosis was wrong. Phase 2 source-read of mpv 0.41.0 + Kwiboo's ffmpeg fork at b57fbbe shows both pieces of code do the right thing; the WAYLAND_DEBUG "different fds per plane" trace anomaly is most likely libwayland's wl_closure_marshal dup_cloexec'ing the fd at protocol-marshal time and printing the post-dup value. Full analysis in marfrit/dmabuf-modifier-triage#1 comment 252.

Real layer is back open. Four candidates filed in the linked comment; cheapest decisive probe is lseek(EXPBUF_fd, SEEK_END) on ohm to check whether hantro's kernel-side dma_buf is sized for the full allocation (3,655,712) or just the Y plane (2,088,960).

This comment exists to break the trail to the wrong analysis. The user-visible green-frame symptom and the workaround (mpv --hwdec=v4l2request --vo=gpu) are unchanged.

## Update — earlier root-cause comment overturned by Phase 2 source-read (2026-05-08) [Comment #243](#issuecomment-243) above pointed at [marfrit/dmabuf-modifier-triage#1](https://git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/1) with a claimed root cause in mpv's `vo_dmabuf_wayland.c` plane-semantics translation. **That diagnosis was wrong.** Phase 2 source-read of mpv 0.41.0 + Kwiboo's ffmpeg fork at `b57fbbe` shows both pieces of code do the right thing; the WAYLAND_DEBUG "different fds per plane" trace anomaly is most likely libwayland's `wl_closure_marshal` `dup_cloexec`'ing the fd at protocol-marshal time and printing the post-dup value. Full analysis in [marfrit/dmabuf-modifier-triage#1 comment 252](https://git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/1#issuecomment-252). Real layer is back open. Four candidates filed in the linked comment; cheapest decisive probe is `lseek(EXPBUF_fd, SEEK_END)` on ohm to check whether hantro's kernel-side `dma_buf` is sized for the full allocation (3,655,712) or just the Y plane (2,088,960). This comment exists to break the trail to the wrong analysis. The user-visible green-frame symptom and the workaround (`mpv --hwdec=v4l2request --vo=gpu`) are unchanged.
Collaborator

Still reproducible on ohm, 2026-05-18. Recommending close as duplicate of marfrit/dmabuf-modifier-triage#1 — that issue carries the actual root-cause analysis, this issue is the user-facing symptom report. Both can't be fixed independently; the actual fix has to land in mpv's vo_dmabuf_wayland.c.

Re-verification

Stack today: mpv-fourier 1:0.41.0-10, same libva-v4l2-request-fourier 1:1.0.0.r361.cf8cd9d-1 we just upgraded ohm to for the libva-v4l2-request-fourier#1 triage. WAYLAND_DEBUG capture:

zwp_linux_buffer_params_v1#54.add(fd 40, 0, 0,       1920, 0, 0)   ← Y plane
zwp_linux_buffer_params_v1#54.add(fd 47, 1, 2088960, 1920, 0, 0)   ← UV plane: different fd + non-zero offset

Byte-identical to the broken pattern called out in dmabuf-modifier-triage#1: per-plane fds combined with a single-allocation offset for plane 1. mpv upstream hasn't fixed vo_dmabuf_wayland.c between 2026-05-08 and today.

Closing rationale

Two issues for the same upstream bug is overhead. Keeping dmabuf-modifier-triage#1 as canonical (root-cause + suggested fix); closing this one.

Workaround for ohm users remains --vo=gpu (slower but correct).

**Still reproducible on ohm, 2026-05-18.** Recommending close as duplicate of [marfrit/dmabuf-modifier-triage#1](https://git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/1) — that issue carries the actual root-cause analysis, this issue is the user-facing symptom report. Both can't be fixed independently; the actual fix has to land in mpv's `vo_dmabuf_wayland.c`. ## Re-verification Stack today: `mpv-fourier 1:0.41.0-10`, same `libva-v4l2-request-fourier 1:1.0.0.r361.cf8cd9d-1` we just upgraded ohm to for the [libva-v4l2-request-fourier#1 triage](https://git.reauktion.de/marfrit/libva-v4l2-request-fourier/issues/1). WAYLAND_DEBUG capture: ``` zwp_linux_buffer_params_v1#54.add(fd 40, 0, 0, 1920, 0, 0) ← Y plane zwp_linux_buffer_params_v1#54.add(fd 47, 1, 2088960, 1920, 0, 0) ← UV plane: different fd + non-zero offset ``` Byte-identical to the broken pattern called out in dmabuf-modifier-triage#1: per-plane fds combined with a single-allocation offset for plane 1. mpv upstream hasn't fixed `vo_dmabuf_wayland.c` between 2026-05-08 and today. ## Closing rationale Two issues for the same upstream bug is overhead. Keeping dmabuf-modifier-triage#1 as canonical (root-cause + suggested fix); closing this one. Workaround for ohm users remains `--vo=gpu` (slower but correct).
Collaborator

Correction to my earlier close comment on this issue (a few hours ago, before the visual re-verification on ohm 2026-05-18 ~09:50 UTC).

I closed this as duplicate-of-dmabuf-modifier-triage#1 with a "still reproducible" claim based on the WAYLAND_DEBUG trace pattern matching. That claim was wrong — I'd matched the trace pattern but hadn't verified the visual output. The trace pattern (different fds + non-zero offset for plane 1) is actually NORMAL for this case: V4L2 hantro NV12 uses single-allocation (one fd per buffer); the wayland fds are SCM_RIGHTS dups, not per-plane fds; and FFmpeg + mpv handle it correctly.

The bug IS fixed — by the cache-sync workaround in mpv-fourier 0.41.0-10 (your own earlier patch 0001-vo_dmabuf_wayland-explicit-cache-sync-on-import-fd.patch). Visual re-verification 2026-05-18: real BBB content displayed.

Full root-cause attribution + corrected analysis in dmabuf-modifier-triage#1's closing comment (or whichever ID the close-comment gets).

Issue stays closed; just correcting the record.

**Correction to my earlier close comment on this issue (a few hours ago, before the visual re-verification on ohm 2026-05-18 ~09:50 UTC).** I closed this as duplicate-of-[dmabuf-modifier-triage#1](https://git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/1) with a "still reproducible" claim based on the WAYLAND_DEBUG trace pattern matching. That claim was **wrong** — I'd matched the trace pattern but hadn't verified the visual output. The trace pattern (different fds + non-zero offset for plane 1) is actually NORMAL for this case: V4L2 hantro NV12 uses single-allocation (one fd per buffer); the wayland fds are SCM_RIGHTS dups, not per-plane fds; and FFmpeg + mpv handle it correctly. The bug IS fixed — by the cache-sync workaround in `mpv-fourier 0.41.0-10` (your own earlier patch `0001-vo_dmabuf_wayland-explicit-cache-sync-on-import-fd.patch`). Visual re-verification 2026-05-18: real BBB content displayed. Full root-cause attribution + corrected analysis in [dmabuf-modifier-triage#1's closing comment](https://git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/1#issuecomment-1213) (or whichever ID the close-comment gets). Issue stays closed; just correcting the record.
Sign in to join this conversation.
No Label
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marfrit/libva-multiplanar#1