5 Commits

Author SHA1 Message Date
marfrit 365764fffb Phase 0 amendment: hantro writes zeros, sentinel test cache-buggy
Re-baselined libva-v4l2-request decode path with kernel-side
observability (ftrace v4l2/vb2/dma_fence + dmesg + dynamic_debug)
and visual disambiguator (mpv --vo=gpu in operator's live Plasma
session).

Findings:

1. Kernel reports successful CAPTURE buffer write every frame:
   ftrace vb2_buf_done shows bytesused=3655712 (full NV12 1920x1088
   = 3133440 bytes, + 522272 bytes of hantro extra data, consistent
   with 64 bytes per macroblock + 32 for motion vectors). dmesg
   completely silent: no hantro/vpu/decode/error/warn messages.

2. Visual disambiguator: mpv --hwdec=vaapi-copy --vo=gpu shows a
   solid GREEN frame; --hwdec=vaapi --vo=gpu shows solid BLUE.
   Neither shows the sentinel color (NV12 Y=0xab,UV=0xab would
   render pinkish magenta, as Cb and Cr both sit above the neutral
   128). Both colors are consistent with the kernel
   writing all-zero NV12 (Y=0,UV=0 → green via BT.709 limited; same
   buffer GL-imported as DMA-BUF with different colorspace → blue).

3. Patch 0011 sentinel test has a cache-coherency bug: writes
   0xab via cached surface_object->destination_map[0] mmap, never
   invalidates cache before readback. So the readback always
   shows the stale sentinel even when kernel DMA-overwrote it
   with zeros. vaapi-copy and Mesa DMA-BUF GL import correctly
   invalidate cache and see the real (zero) contents.
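
The arithmetic behind findings 1 and 2 can be checked with a short sketch. The helper names are hypothetical, and the "64 bytes per macroblock + 32" term is an assumption about how hantro sizes its H.264 CAPTURE buffer, used here only because it reproduces the traced bytesused exactly. The color conversion uses the standard BT.709 limited-range matrix; it says nothing about which matrix mpv or Mesa actually applied.

```python
# Hypothetical helpers for sanity-checking the trace; not part of the driver.

def nv12_size(width, height):
    """Bytes in an 8-bit NV12 frame: Y plane plus half-size interleaved UV."""
    return width * height * 3 // 2


def assumed_h264_extra(width, height):
    """ASSUMPTION: 64 bytes per 16x16 macroblock, plus a 32-byte tail."""
    macroblocks = ((width + 15) // 16) * ((height + 15) // 16)
    return 64 * macroblocks + 32


def bt709_limited_to_rgb(y, cb, cr):
    """8-bit limited-range BT.709 Y'CbCr to clamped 8-bit R'G'B'."""
    yy = (y - 16) * 255.0 / 219.0
    pb = (cb - 128) * 255.0 / 224.0
    pr = (cr - 128) * 255.0 / 224.0
    r = yy + 1.5748 * pr
    g = yy - 0.1873 * pb - 0.4681 * pr
    b = yy + 1.8556 * pb
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))


total = nv12_size(1920, 1088) + assumed_h264_extra(1920, 1088)
print(total)                           # 3655712, the traced bytesused
print(bt709_limited_to_rgb(0, 0, 0))   # (0, 77, 0): red and blue clamp to 0
```

An all-zero buffer therefore lands on a dark green under this matrix, which agrees with the vaapi-copy observation.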

This is the second correction to the Phase 0 verdict in one day:
- Original commit f15ba8b ("the 2026-04-26 picture holds") was
  wrong: clean contract trace, never checked pixel content.
- Revised commit e892cea ("kernel produces no decoded pixel
  output, sentinel survives") was half right: kernel does write,
  writes zeros, and the sentinel test was reading stale cache.
- Now: kernel writes ALL ZEROS to the CAPTURE buffer. Hantro is
  silently failing the bitstream parse or some control validation.

This is consistent with patch 0011's own commit message hypothesis:
"All zeros → kernel did write 0x00s (overwriting our sentinel),
and the apparent 'no picture' output is the kernel-side decode
actually producing zeros (e.g. parser rejected the bitstream)."
That hypothesis was right; we just couldn't confirm it via the
sentinel test (cache bug) and went down the wrong rabbit hole.
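
The three outcomes that the quoted hypothesis distinguishes can be written down as a small classifier. This is a sketch only: `classify_capture` and its labels are hypothetical names, not the code in patch 0011, and it assumes the readback is already available as a bytes object.

```python
def classify_capture(buf, sentinel=0xAB):
    """Classify a readback of a CAPTURE buffer pre-filled with `sentinel`."""
    if not buf:
        raise ValueError("empty buffer")
    if buf.count(0) == len(buf):
        return "all-zero"         # device wrote, but wrote 0x00 everywhere
    if buf.count(sentinel) == len(buf):
        return "sentinel-intact"  # never written, or a stale cached view
    return "mixed"                # other content, possibly real pixels


print(classify_capture(bytes(16)))            # all-zero
print(classify_capture(bytes([0xAB]) * 16))   # sentinel-intact
print(classify_capture(bytes([0, 0xAB, 7])))  # mixed
```

A "sentinel-intact" result on its own cannot separate "never written" from "read through a stale cache", which is the ambiguity finding 3 describes.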

Phase 6 direction sharpens substantially. Bug isn't "we can't
engage hantro" — it's "hantro engages but its parser produces
zeros." Bisect the control submission: VIDIOC_G_EXT_CTRLS
readback to verify writes stick, diff against FFmpeg's
v4l2_request_h264.c (proven working on hantro), verify SPS
completeness, resolve patch 0008's slice_header bit_size open
question, dyndbg the hantro module, etc. Phase 1 boolean-
correctness criterion needs a working pixel-content check before
lock; fix patch 0011's cache sync first.
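
One possible shape for the cache-sync fix, sketched in Python for brevity even though the driver itself is C. It assumes the CAPTURE mapping is backed by a DMA-BUF fd (for example one obtained through VIDIOC_EXPBUF); whether patch 0011's destination_map is set up that way is not established here. The ioctl number is built with the asm-generic encoding used on arm64 and x86, and `cpu_read_window` is a hypothetical name.

```python
import fcntl
import struct
from contextlib import contextmanager

# Flag values from linux/dma-buf.h.
DMA_BUF_SYNC_READ = 1 << 0
DMA_BUF_SYNC_START = 0 << 2
DMA_BUF_SYNC_END = 1 << 2


def _iow(type_char, nr, size):
    """asm-generic _IOW(): direction=write, then size, type and number."""
    return (1 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# struct dma_buf_sync holds a single __u64 flags field (8 bytes).
DMA_BUF_IOCTL_SYNC = _iow("b", 0, 8)


@contextmanager
def cpu_read_window(dmabuf_fd, ioctl=fcntl.ioctl):
    """Bracket CPU reads of a DMA-BUF mapping with SYNC_START / SYNC_END."""
    start = struct.pack("=Q", DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ)
    end = struct.pack("=Q", DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ)
    ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, start)
    try:
        yield
    finally:
        ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, end)
```

The readback of the sentinel region would then sit inside `with cpu_read_window(fd): ...`, so that the CPU view is refreshed before the bytes are inspected.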

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 11:39:42 +00:00
marfrit e892cea858 Phase 0 deliverable #3 (Firefox live session): inverted verdict
Re-tested Firefox 150.0.1 inside operator's active Plasma 6 Wayland
session (not Xvfb). Two-layer finding:

1. Firefox engages libva in real Plasma session: full V4L2-stateless
   contract lifecycle completes, no EINVAL on the request-API path,
   v4l2_request_drv_video.so successfully loaded, /dev/video1 +
   /dev/media0 opened by RDD utility process 146420.

2. Kernel produces no decoded pixel output: CAPTURE buffer returns
   from DQBUF with the patch-0011 sentinel pattern 0xab unchanged.
   Hantro never wrote the buffer despite the contract trace looking
   clean. Firefox detected the failed first frame and silently fell
   back to SW decode in RDD's FFmpeg-OS-library PDM. User-visible
   playback continues normally for 5+ minutes (operator confirmed
   t=337s playback time in live inspection).

Cross-checked against the prior 2026-05-04 mpv vaapi-copy run: 68 of
68 mpv CAPTURE buffers show the same sentinel-survives pattern.
mpv's --vo=null consumed all 68 sentinel buffers as if they were
valid NV12 frames; the failure was invisible. OUTPUT bytes are
byte-for-byte identical between mpv and Firefox (same IDR slice via
libavcodec, both consumers feed hantro the same data, hantro
silently drops both).

Implication: the prior Phase 0 in-session re-verification verdict
(commit f15ba8b: "the 2026-04-26 picture holds at boolean-correctness
level") was wrong at the kernel-decode layer. The patch-0011 sentinel
test in the deployed Step 1 build was authored specifically to detect
this failure mode; the predecessor close-out didn't grep for it, and
contract-trace cleanliness was mistaken for end-to-end success.

Phase 1 lock should be deferred until: (a) boolean-correctness
criterion is sharpened to require pixel-content verification,
(b) Phase 0 acquires kernel-side observability (ftrace, dmesg) to
characterize WHY hantro is silent. Step 1 engages libva but doesn't
make hantro decode -- Phase 6 has substantive work beyond the
18-patch series.

Likely failure-mode candidates, flagged in findings_live.md in
priority order: reference_ts not propagated; DECODE_PARAMS
slice_header bit_size zero; POC sentinel may still leak past the
patch-0015 strip; level_idc over-allocation; SOURCE_CHANGE event
handling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:38:57 +00:00
marfrit f115fa6cbc Phase 0 deliverable #3 (Firefox): headless-rig finding
Firefox 150.0.1 + media.ffmpeg.vaapi.enabled=true + LIBVA_DRIVER_NAME=
v4l2_request, executed under Xvfb on ohm.

Result: inconclusive at the boolean-correctness level. RDD process
dlopens libva.so.2 + libva-drm.so.2 + libva-x11.so.2 for capability
probe then immediately closes them; never reaches vaInitialize, never
opens /dev/dri/renderD128, never reaches v4l2_request_drv_video.so.
Falls back to software H.264 in RDD via FFmpeg-OS-library PDM
(Broadcast support from 'RDD', support=H264 SWDEC).

Root cause: Xvfb provides software framebuffer with no DRI/DRM
render-node integration. Firefox's gfx-environment platform-fitness
check rejects VAAPI before adding it to the RDD PDM order list.
Not a libva-side or driver-side fault — mpv --hwdec=vaapi-copy in
the same headless rig DID engage end-to-end (per
phase0_evidence/2026-05-04/findings.md).

Definitive Firefox verdict requires retesting inside a live Plasma
session — deferred to live-session run (next commit).

Also: Phase 0 deliverable #2 (Step 1 reconciliation into fork
master) was completed and pushed to marfrit/libva-v4l2-request-fourier
between this and the prior Phase 0 commit; status table updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 10:19:14 +00:00
marfrit f15ba8b147 Phase 0 in-session re-verification: 2026-04-26 picture holds
Re-executed deliverables #1 (verify failure-mode finding) and #4 (capture
contract trace) on ohm against the substrate that's actually deployed —
not the libva-v4l2-request-fourier git fork master, but the
libva-v4l2-request-ohm-gl-fix package built on boltzmann from the Step 1
18-patch series.

Result: vainfo enumerates 7 H.264 + 2 MPEG-2 profiles cleanly; mpv
--hwdec=vaapi-copy decodes 68 H.264 frames end-to-end through the full
V4L2-stateless contract on hantro /dev/video1 + /dev/media0. Zero
EINVAL/EAGAIN/EBUSY on the request-API path. No rig drift requiring
Phase 2 loopback.

Inventory finding documented: the git fork at e8c3937 is a pre-Step-1
substrate; rebuilding from it as-is would be a regression. Step 1
reconciliation (deliverable #2) must land before any future
build-from-fork action.

Rig caveat captured: --hwdec=vaapi requires a real VO; --hwdec=vaapi-copy
is the headless-safe alternative for SSH-driven test rigs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 09:16:38 +00:00
marfrit affc1752e0 Initial campaign substrate: README + phase0_findings
Single-question campaign — make multi-planar libva accepted by VA-API
consumers on Rockchip hantro RK3568 (PineTab2/ohm first iteration).
Backend only, success criterion is boolean correctness, performance
deferred. Substrate carried over from libva-v4l2-request-fourier
STUDY.md (commit e0acc33 in the fork) plus locked decisions from the
2026-05-04 setup exchange.

Fork lives as a subdirectory: libva-v4l2-request-fourier/ (separate
git repo, origin marfrit/libva-v4l2-request-fourier, upstream
bootlin/libva-v4l2-request).

Empty Gitea repo created at git.reauktion.de/marfrit/libva-multiplanar;
local origin remote set, no push yet (per operator instruction —
wait for publish-worthy state).
2026-05-04 08:10:03 +00:00