735f7f7ae3
Shallow-cloned KWin 6.6.4 (references/kwin-6.6.4/, gitignored) and
read the wl_dmabuf protocol handler + EGL import paths.
Findings:
(a) LinuxDmaBufParamsV1::zwp_linux_buffer_params_v1_add stores
fd[i]/offset[i]/pitch[i] per plane index, no fd dedup, no
cross-plane comparison.
(b) LinuxDmaBufParamsV1::test() does lseek(SEEK_END) per plane and
range-checks against the result. Our exported size 3,657,728
satisfies all checks for offset[1]=2,088,960 + pitch[1]=1920.
(c) EglDisplay::importDmaBufAsImage (egldisplay.cpp:166 combined
and :218 per-plane) passes user-supplied fd/offset/pitch
straight to eglCreateImage(EGL_LINUX_DMA_BUF_EXT) verbatim. No
transformation.
(d) EglBackend::testImportBuffer (eglbackend.cpp:338) chooses
between combined (line 342) and per-plane (line 353-356)
based on whether NV12+LINEAR is in nonExternalOnlySupportedDrmFormats.
Either path passes offset=2088960 to the driver.
KWin is innocent. Live hypothesis narrows to H1 (panfrost's
EGL_DMA_BUF_PLANE*_OFFSET_EXT handling for LINEAR NV12, most likely
via the per-plane RG88-from-offset-2088960 import path).
Posted to dmabuf-modifier-triage#1 comment 256.
Three of five hypotheses ruled out (H2, H3, H5). Remaining: H1
(panfrost EGL offset), H4 (kwin-fourier residual, low conf).
Next probe: minimal EGL importer harness on ohm.
105 lines
8.4 KiB
Markdown
105 lines
8.4 KiB
Markdown
# Phase 2 — iter1 source-read findings (REOPEN of root-cause analysis)
|
||
|
||
**Opened 2026-05-08** during the iter1 phase 2 source-read of mpv 0.41.0 + Kwiboo's ffmpeg fork at commit `b57fbbe`. Phase 0's earlier conclusion ("mpv mixes per-plane fds with single-allocation offset") needs revision — the source code reads + runtime probe show the situation is more nuanced than the WAYLAND_DEBUG wire trace alone suggested.
|
||
|
||
## What the source actually says
|
||
|
||
**mpv `video/out/vo_dmabuf_wayland.c` `drmprime_dmabuf_importer` (lines 250-277)** straightforwardly relays the producer's `AVDRMFrameDescriptor`:
|
||
|
||
```c
|
||
for (plane_no = 0; plane_no < layer.nb_planes; ++plane_no) {
|
||
AVDRMPlaneDescriptor plane = layer.planes[plane_no];
|
||
int object_index = plane.object_index;
|
||
AVDRMObjectDescriptor object = desc->objects[object_index];
|
||
uint64_t modifier = object.format_modifier;
|
||
zwp_linux_buffer_params_v1_add(params, object.fd, plane_no, plane.offset,
|
||
plane.pitch, modifier >> 32, modifier & 0xffffffff);
|
||
}
|
||
```
|
||
|
||
No `dup()`, no rewriting, no transformation. mpv passes through what `AVDRMFrameDescriptor` says.
|
||
|
||
**Kwiboo's `libavutil/hwcontext_v4l2request.c` `v4l2request_set_drm_descriptor` (lines 138-198)** for hantro's NV12 single-planar (V4L2_PIX_FMT_NV12, the format `v4l2-ctl --get-fmt-video-mplane-cap` reports for `/dev/video1` on ohm):
|
||
|
||
```c
|
||
desc->base.nb_objects = num_planes; // = 1 for single-planar NV12 on hantro
|
||
desc->base.objects[0].fd = exportbuffer.fd; // VIDIOC_EXPBUF returns ONE fd
|
||
// in v4l2request_set_drm_descriptor:
|
||
desc->nb_layers = 1;
|
||
layer->nb_planes = 1;
|
||
layer->planes[0].object_index = 0;
|
||
layer->planes[0].offset = 0;
|
||
layer->planes[0].pitch = bytesperline; // 1920
|
||
if (modifier != ARM_VENDOR) { // hantro outputs LINEAR (0x0), so this is true
|
||
layer->nb_planes = 2;
|
||
layer->planes[1].object_index = 0; // ← BOTH PLANES point at object 0
|
||
layer->planes[1].offset = pitch * height; // 1920 * 1088 = 2088960
|
||
layer->planes[1].pitch = layer->planes[0].pitch;
|
||
}
|
||
```
|
||
|
||
Per the source, mpv should produce **identical** fd values in the two `.add()` calls — both pulling from `desc->objects[0].fd`.
|
||
|
||
## What the runtime probe says
|
||
|
||
`v4l2-ctl --get-fmt-video-mplane-cap` on ohm `/dev/video1`:
|
||
|
||
```
|
||
Pixel Format : 'NV12' (Y/UV 4:2:0)
|
||
Number of planes : 1
|
||
sizeimage=3655712, bytesperline=1920
|
||
```
|
||
|
||
`strace -e trace=ioctl mpv ...` confirms ffmpeg only does **one** `VIDIOC_EXPBUF` per CAPTURE buffer (`index=N, plane=0` → one fd), exactly matching `nb_objects = 1`.
|
||
|
||
But `WAYLAND_DEBUG=1 mpv ...` shows two `.add()` calls **with different fd numbers** per buffer:
|
||
|
||
```
|
||
add(fd 41, 0, 0, 1920, 0, 0)
|
||
add(fd 42, 1, 2088960, 1920, 0, 0)
|
||
```
|
||
|
||
These fd numbers are **consecutive**, suggesting libwayland's `wl_closure_marshal` is `dup_cloexec`'ing the fd at protocol-marshal time and the trace prints the post-dup fd. Both fd 41 and fd 42 are dups of the same underlying `dma_buf` object (originally fd 17 or similar in mpv's table).
|
||
|
||
## Implications for iter1
|
||
|
||
The earlier phase 0 conclusion that mpv constructs an "internally inconsistent" wl_dmabuf message was **wrong**. There is no inconsistency at the producer ↔ mpv layer:
|
||
|
||
- nb_objects = 1, both planes use object 0 → mpv passes the same fd value into both `.add()` calls
|
||
- libwayland dups it before sending → wire trace shows different fd numbers, but they refer to the same backing memory
|
||
- Plane 1's offset = 2088960 is correct relative to the (single) underlying allocation
|
||
|
||
So the green frame is **not caused by mpv or ffmpeg's descriptor construction**. Something else.
|
||
|
||
## New hypothesis space (one of these is the real bug)
|
||
|
||
1. **Mali-G52 panfrost EGL_EXT_image_dma_buf_import_modifiers regression for NV12 with non-zero plane offset.** The driver may sample plane 1 from offset 0 of the imported fd instead of offset 2088960, giving zero-fill UV. Testable: a minimal EGL importer C program that imports a known NV12 dmabuf with offsets and reads back via `glReadPixels`.
|
||
|
||
2. ~~**KWin's wl_dmabuf import logic deduplicates the dup'd fds incorrectly.**~~ **RULED OUT 2026-05-08** by source-read of KWin 6.6.4 at `references/kwin-6.6.4/src/wayland/linuxdmabufv1clientbuffer.cpp` + `src/opengl/{eglbackend,egldisplay}.cpp`. (a) `LinuxDmaBufParamsV1::zwp_linux_buffer_params_v1_add` simply stores per-plane fd/offset/pitch in separate slots, no dedup. (b) `LinuxDmaBufParamsV1::test()` does `lseek(SEEK_END)` per plane + range checks against the resulting size; our 3,657,728 satisfies all of them. (c) `EglDisplay::importDmaBufAsImage` (both the combined and per-plane forms) passes `dmabuf.fd[i]`, `dmabuf.offset[i]`, `dmabuf.pitch[i]` straight to `eglCreateImage(EGL_LINUX_DMA_BUF_EXT, ...)` with no transformation. (d) `EglBackend::testImportBuffer` chooses between combined import and per-plane (Y as R8 / UV as RG88 from offset 2,088,960) based on whether NV12+LINEAR is in `nonExternalOnlySupportedDrmFormats()`. **Either path** forwards `offset = 2,088,960` to the driver. KWin is innocent.
|
||
|
||
3. ~~**hantro kernel driver exports a `dma_buf` with `size` < full allocation.**~~ **RULED OUT 2026-05-08** by `/tmp/expbuf_probe.c` on ohm. Driver `hantro-vpu` on `rk3568-vpu-dec` reports `CAPTURE: NV12 1920x1088 num_planes=1 sizeimage=3655712`; `VIDIOC_EXPBUF` yields fd whose `lseek(fd, 0, SEEK_END) = 3,657,728` (page-rounded up from 3,655,712). Offset 2,088,960 (plane 1 base) is firmly inside the exported size. Kernel is innocent.
|
||
|
||
Side observation worth recording: `sizeimage = 3,655,712` is bigger than naïve NV12's 1920×1088×1.5 = 3,133,440. The 522,272-byte excess sits **past** the UV plane (Y at [0, 2,088,960), UV at [2,088,960, 3,133,440), trailing padding at [3,133,440, 3,655,712)). On Rockchip codecs that tail commonly holds per-frame motion-vector / decoder-context data. Confirms ffmpeg's hardcoded `planes[1].offset = pitch*height = 2,088,960` is correct.
|
||
|
||
4. **kwin-fourier 0001 still has effect we missed.** Even though we ruled out kwin-fourier as a compositor-replacement A/B, that test was on an earlier kernel/Mesa combo. Worth verifying the test environment is fully reset.
|
||
|
||
## Recommended next moves for iter1
|
||
|
||
a. **Write a small C harness that does VIDIOC_EXPBUF on a hantro CAPTURE buffer and reports fd size + backing dma_buf info.** Decides hypothesis 3 in 30 minutes. Run on ohm directly.
|
||
|
||
b. **Patch mpv with `MP_VERBOSE` logging of the AVDRMFrameDescriptor fields at .add()-call time** (nb_objects, planes[].object_index, planes[].offset, objects[].size). Confirms the source-read is correct at runtime. Drop into mpv-fourier's `prepare()` slot, bump pkgrel, rebuild on fermi (~10 min CI).
|
||
|
||
c. **Read KWin's wl_dmabuf import logic** (KDE Plasma 6 / KWin 6.6.4 source) for how it handles multiple-fd-same-buffer cases. ~30 min source-read.
|
||
|
||
d. **Update `marfrit/dmabuf-modifier-triage#1`** with this revised analysis. The current issue body claims the bug is in mpv's plane-semantics translation — that conclusion is now overturned.
|
||
|
||
## Status
|
||
|
||
- iter1 phase 2 closed 2026-05-08. Three of five hypotheses ruled out:
|
||
- H2 (KWin wl_dmabuf import) — KWin 6.6.4 source-read clean; both wl-protocol params validation and EGL import paths forward user-supplied offset verbatim to `eglCreateImage`. Innocent.
|
||
- H3 (hantro size cap) — `/tmp/expbuf_probe.c` runtime probe on ohm: `lseek(SEEK_END)=3,657,728`, well past offset 2,088,960. Innocent.
|
||
- H5 (offset mismatch with hantro NV12 stride) — arithmetic disproves: Y at [0, 2,088,960), UV at [2,088,960, 3,133,440), trailing Rockchip metadata at [3,133,440, 3,655,712). ffmpeg's plane[1].offset is correct.
|
||
- **Live hypothesis: H1 — panfrost's `EGL_DMA_BUF_PLANE*_OFFSET_EXT` handling for LINEAR NV12 (or per-plane RG88 with non-zero offset).** Acceptance criterion (`screenshots/frame10_expected.png`) is unchanged.
|
||
- Delivery vehicle (`mpv-fourier-1:0.41.0-8`) is still the right shipping path **if** the fix turns out to be a defensive workaround in mpv. With kernel + ffmpeg + mpv + KWin all exonerated, the patch most likely lands in Mesa-panfrost (`vulkan-panfrost` package — already in marfrit). Less-likely fallback: kernel hantro driver-side adjustment.
|
||
- Next probe: a minimal EGL importer harness that recreates the per-plane RG88-from-offset-2088960 import. Run with a known-good NV12 buffer (synthesized in CPU memory, written to a CMA dmabuf via udmabuf) and `glReadPixels` the resulting EGL_image. If the read shows zeros where UV data should be, panfrost is confirmed culprit. ~1-2h C code, runs on ohm.
|