Commit Graph

8 Commits

Author SHA1 Message Date
marfrit 89a4b81654 iter1 phase 2: hypothesis 3 ruled out by EXPBUF lseek probe
Probe `/tmp/expbuf_probe.c` (snapshot at probes/expbuf_probe.c) opens
/dev/video1, sets OUTPUT format H264_SLICE 1920x1088, REQBUFS 4 capture
buffers, EXPBUF on plane 0 of buffer 0, lseek(fd, 0, SEEK_END).

On ohm (kernel besser-7.0, hantro-vpu / rk3568-vpu-dec):
  CAPTURE: NV12 1920x1088 num_planes=1 sizeimage=3655712
  EXPBUF fd lseek(SEEK_END) = 3657728  (page-rounded from 3655712)

Kernel exports the dma_buf at full sizeimage; offset 2,088,960
(plane 1 base in ffmpeg's drm-frame-descriptor) is well inside.
Hantro is innocent.

Side observation: sizeimage = 3,655,712 > naive NV12's 3,133,440.
The 522,272-byte excess is trailing padding (likely Rockchip
per-frame MV / context metadata) past the UV plane. Y and UV layout
fit cleanly within [0, 3,133,440), exactly where mpv/ffmpeg expect.
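
The size arithmetic above can be re-derived with a short sketch. The
4096-byte page size and the helper names are assumptions made for
illustration; only the geometry and sizeimage come from the probe
output.

```python
PAGE_SIZE = 4096  # assumption: page size on ohm is not stated in the commit


def nv12_plane_sizes(width, height):
    """Return (y_size, uv_size) for tightly packed 8-bit NV12."""
    y_size = width * height          # one luma byte per pixel
    uv_size = width * height // 2    # interleaved CbCr at quarter resolution
    return y_size, uv_size


def page_round(nbytes, page=PAGE_SIZE):
    """Round nbytes up to the next multiple of the page size."""
    return (nbytes + page - 1) // page * page


width, height = 1920, 1088
sizeimage = 3655712                  # reported by the driver for CAPTURE

y_size, uv_size = nv12_plane_sizes(width, height)
naive_total = y_size + uv_size       # Y plane followed directly by UV plane
excess = sizeimage - naive_total     # trailing bytes past the UV plane
exported = page_round(sizeimage)     # what lseek(SEEK_END) should report
plane1_offset = y_size               # plane 1 starts right after the Y plane

print("Y size        ", y_size)
print("naive NV12    ", naive_total)
print("excess        ", excess)
print("page-rounded  ", exported)
print("plane 1 inside", plane1_offset < exported)
```

Under these assumptions the computed values agree with the figures
quoted in this message.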

Remaining hypothesis space: H1 (panfrost EGL non-zero plane offset),
H2 (KWin wl_dmabuf import), H4 (kwin-fourier residual, low conf).

Next probe queued: H2 source-read of KWin 6.6.4 wl_dmabuf import
path. ~30 min, no hardware needed. If that turns up nothing,
write the EGL importer harness for H1.

Posted to dmabuf-modifier-triage#1 comment 255.
2026-05-08 21:11:09 +00:00
marfrit eddd9ef88f phase2 iter1: source-read overturns earlier mpv-bug conclusion
mpv 0.41.0's drmprime_dmabuf_importer reads correctly. Kwiboo's
ffmpeg V4L2 hwaccel at b57fbbe sets planes[1].object_index=0 for
hantro single-planar NV12 (LINEAR modifier branch, line 157), so
mpv should produce identical fd values for both .add() calls.

Runtime confirms via strace: ffmpeg does one VIDIOC_EXPBUF per
CAPTURE buffer, returning ONE fd. nb_objects=1.

The "different fds per plane" observed in WAYLAND_DEBUG is most
likely an artifact of libwayland's wl_closure_marshal duplicating the
fd (dup with CLOEXEC) at protocol-marshal time. Both .add()s use the
same source fd; the trace shows post-dup values, which are
consecutive but refer to the same dma_buf.
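
The dup effect described above can be illustrated with a minimal
sketch. It uses a temporary file as a stand-in for the exported
dma_buf fd, and the helper names are hypothetical; it shows only that
two CLOEXEC duplicates of one fd carry different numbers while
referring to the same underlying object. It does not reproduce
libwayland's code path.

```python
import fcntl
import os
import tempfile


def dup_cloexec(fd):
    """Duplicate fd with close-on-exec set, via F_DUPFD_CLOEXEC."""
    return fcntl.fcntl(fd, fcntl.F_DUPFD_CLOEXEC, 0)


def same_underlying_object(fd_a, fd_b):
    """Compare device and inode numbers reported by fstat for both fds."""
    a, b = os.fstat(fd_a), os.fstat(fd_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryFile() as src:      # stand-in for the EXPBUF fd
    source_fd = src.fileno()
    plane0_fd = dup_cloexec(source_fd)     # what a plane 0 add() might log
    plane1_fd = dup_cloexec(source_fd)     # what a plane 1 add() might log
    distinct_numbers = plane0_fd != plane1_fd
    same_object = same_underlying_object(plane0_fd, plane1_fd)
    os.close(plane0_fd)
    os.close(plane1_fd)

print("distinct fd numbers:", distinct_numbers)
print("same underlying object:", same_object)
```

Comparing fstat results on the fds received by the compositor would be
one way to check this reading against the real trace.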

This means the earlier phase 0 conclusion ("mpv mixes per-plane
fds with single-allocation offset") was wrong. The wl_dmabuf
message is internally consistent. Bug is somewhere else.

New hypothesis space (in phase2_iter1_findings.md):
  - Mali-G52 panfrost EGL_dma_buf_import with non-zero plane offset
  - KWin wl_dmabuf import deduplication bug
  - hantro kernel exports dma_buf with size < full allocation
  - environment-reset incompleteness from earlier kwin-fourier A/B

Recommended next moves: probe fd size on ohm, mpv debug-logging
patch, KWin source-read, update issue #1 with revised analysis.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:12:47 +00:00
marfrit 9f406c0c42 phase0: revise framing — iter1 is unblocked, libva comparison is follow-up validation
Operator pushback: phase 0 should unblock iter1, not gate it. The
locked question is "fix what's locally in scope" — kill the green,
not just identify a layer. The captured wl_dmabuf message is
internally inconsistent on its own (per-plane fds + single-allocation
offset for plane 1 is a contradiction no valid producer can claim
simultaneously). mpv's translation layer produces this regardless of
which producer feeds it, so iter1 can write the fix from the ffmpeg-
side data alone. The libva-path WAYLAND_DEBUG comparison after iter9
is a follow-up validation that confirms the fix handles both producer
shapes, not a prerequisite for writing it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 19:01:12 +00:00
marfrit 4bbf255ea6 lock iter1 acceptance criterion: before/after screenshot anchor
Operator decision 2026-05-08: the iter1 fix is shipped only when frame
10 of the bbb fixture under --vo=dmabuf-wayland matches the expected
reference (captured via --vo=gpu on same hardware/session).

- screenshots/frame10_dmabuf_green.png — current broken state
- screenshots/frame10_expected.png — target post-fix state
- screenshots/README.md — verification protocol (SSIM > 0.95 + valid
  WAYLAND_DEBUG .add() semantics + no regression on vo=gpu / vo=wlshm)
- phase0_findings.md — references the criterion in the conclusion section

This locks the campaign's deliverable. Patches that change wire-protocol
.add() shape but don't restore the picture do not satisfy iter1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:40:10 +00:00
marfrit e293078d46 screenshots: frame 10 dmabuf-wayland green capture (reference for #1)
Reference image of the bug as visible on ohm. Captured 2026-05-08 via
mpv --hwdec=v4l2request --vo=dmabuf-wayland --pause --start=00:00:00.42
--fullscreen fourier-test/bbb_1080p30_h264.mp4 then spectacle -b -f -n.

Uniform dark green at approximately RGB(0, 75, 0), consistent with
what the all-zero NV12 hypothesis (Y=U=V=0 through BT.601/709
conversion) predicts and with what marfrit/dmabuf-modifier-triage#1
diagnoses (KWin reads past EOF on the UV plane fd and gets zeros for
chroma).
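
The all-zero prediction can be checked numerically. The sketch below
assumes 8-bit limited-range Y'CbCr and the standard BT.601 and BT.709
luma coefficients; which range and matrix KWin actually applied is not
recorded in this commit, and the function name is hypothetical.

```python
# Luma coefficients (Kr, Kb) for the two matrices named in the commit.
COEFFS = {"bt601": (0.299, 0.114), "bt709": (0.2126, 0.0722)}


def yuv_to_rgb(y, cb, cr, matrix):
    """Convert one 8-bit limited-range Y'CbCr sample to clamped 8-bit RGB."""
    kr, kb = COEFFS[matrix]
    kg = 1.0 - kr - kb
    # Normalize: luma spans 16..235, chroma spans 16..240 centred on 128.
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    # Green follows from the luma definition Y = Kr*R + Kg*G + Kb*B.
    g = (yn - kr * r - kb * b) / kg

    def clamp(v):
        return max(0, min(255, round(v * 255)))

    return clamp(r), clamp(g), clamp(b)


for name in COEFFS:
    print(name, yuv_to_rgb(0, 0, 0, name))
```

With these assumptions, red and blue clamp to zero for both matrices.
The BT.709 green channel lands close to the captured value, while the
BT.601 coefficients give a noticeably brighter green.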

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:36:12 +00:00
marfrit d13ad847a4 phase0 closed — root cause isolated to mpv vo_dmabuf_wayland plane-semantics
Eight directed A/B tests on ohm ruled out every layer that doesn't
matter (libva, decoder content, color tags, kwin-fourier 0001 patch,
Mesa 26.0.6 vs 26.0.5, Wayland generally, kernel 6.19.10 vs 7.0,
KWin Vulkan vs OpenGL backend). WAYLAND_DEBUG=1 capture surfaced the
real bug at the protocol-message layer:

  add(fd 41, plane=0, offset=0,        stride=1920, mod=0,0)
  add(fd 42, plane=1, offset=2088960,  stride=1920, mod=0,0)

mpv mixes per-plane fds (V4L2 MPLANE export semantics) with
single-allocation offset for plane 1 (single-fd export semantics).
KWin reads UV from past-EOF on the UV-plane fd → all-zero NV12 →
dark green render (Y=U=V=0 in BT.601/709 ≈ RGB(0,70,0)).
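
The past-EOF reading can be expressed as a bounds check on the two
add() calls. This is a sketch only: the buffer sizes are derived from
the 1920x1088 geometry rather than measured, and plane_fits is a
hypothetical helper, not a KWin or mpv function.

```python
def plane_fits(buf_size, offset, stride, rows):
    """True if a plane of rows * stride bytes at offset lies inside the buffer."""
    return offset + stride * rows <= buf_size


STRIDE = 1920
HEIGHT = 1088
UV_ROWS = HEIGHT // 2            # NV12 chroma plane has half the rows
UV_OFFSET = 2088960              # offset sent with plane 1 in the trace

# Assumption: a UV-only export would hold just the chroma plane.
per_plane_uv_size = STRIDE * UV_ROWS
# Assumption: a single allocation holds Y followed by UV.
single_alloc_size = STRIDE * HEIGHT + STRIDE * UV_ROWS

uv_in_own_fd = plane_fits(per_plane_uv_size, UV_OFFSET, STRIDE, UV_ROWS)
uv_in_shared_fd = plane_fits(single_alloc_size, UV_OFFSET, STRIDE, UV_ROWS)

print("offset valid against a UV-only fd:  ", uv_in_own_fd)
print("offset valid against a shared fd:   ", uv_in_shared_fd)
```

The offset is only meaningful if both planes share one allocation,
which is why the combination of separate fds and a single-allocation
offset reads as contradictory.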

Filed at marfrit/dmabuf-modifier-triage#1 with full diagnosis +
suggested fix path. Iter1 fix work is blocked on libva-multiplanar
iter9 (need clean libva path to verify which producer is at fault
or whether mpv VO is the sole bug). Working interim path is
mpv --hwdec=v4l2request --vo=gpu (correct picture, slow shader path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:30:00 +00:00
marfrit dfabeddf93 phase0: reorder priority — kwin-fourier A/B first, smoking gun identified
The user runs kwin-fourier with the active patch
0001-transaction-bypass-watchDmaBuf-fence-wait.patch, which bypasses
KWin's implicit-sync fence wait on dmabufs. If KWin samples a hantro
CAPTURE buffer before the decoder fence signals, it gets all-zero
NV12, which renders as solid green after YUV→RGB conversion. This is
the most likely cause and is decisively testable via a stock-kwin A/B
(item 1).

Hypothesis: before iter5 the bug was masked on the libva path because
vaSyncSurface provided ordering. iter6/7 may have changed that. Both
the libva and ffmpeg-v4l2request paths now expose the lack of
fence-wait.

Run-history breadcrumb (#51 introduce, #58 switch to 0002, #59 revert
to 0001) recorded in the findings doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:56:42 +00:00
marfrit 7d68d17232 dmabuf-modifier-triage campaign scaffold
Focused triage of marfrit/libva-multiplanar#1 — dmabuf-wayland green
on ohm independent of decoder backend. Locked question: identify the
layer responsible (libva / ffmpeg / KWin / Mesa-panfrost / kernel)
and file upstream where appropriate. Performance is explicitly out of
scope — user has working slow path via vo=gpu hwdec=v4l2request.

Phase 0 deliverables: vaExportSurfaceHandle + AVDRMFrameDescriptor
modifier captures, Wayland linux-dmabuf-v1 advertise snapshot, pacman
upgrade timeline review for the iter5→iter8 regression window, and
stock-kwin A/B isolating kwin-fourier as a candidate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:49:51 +00:00