libva-multiplanar/phase0_evidence/2026-05-04/findings.md
marfrit f15ba8b147 Phase 0 in-session re-verification: 2026-04-26 picture holds
Re-executed deliverables #1 (verify failure-mode finding) and #4 (capture
contract trace) on ohm against the substrate that's actually deployed —
not the libva-v4l2-request-fourier git fork master, but the
libva-v4l2-request-ohm-gl-fix package built on boltzmann from the Step 1
18-patch series.

Result: vainfo enumerates 7 H.264 + 2 MPEG-2 profiles cleanly; mpv
--hwdec=vaapi-copy decodes 68 H.264 frames end-to-end through the full
V4L2-stateless contract on hantro /dev/video1 + /dev/media0. Zero
EINVAL/EAGAIN/EBUSY on the request-API path. No rig drift requiring
Phase 2 loopback.

Inventory finding documented: the git fork at e8c3937 is a pre-Step-1
substrate; rebuilding from it as-is would be a regression. Step 1
reconciliation (deliverable #2) is upstream of any future build-from-fork
action.

Rig caveat captured: --hwdec=vaapi requires a real VO; --hwdec=vaapi-copy
is the headless-safe alternative for SSH-driven test rigs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 09:16:38 +00:00


Phase 0 in-session re-verification — 2026-05-04

In-session re-execution of the failure-mode finding from Phase 0 substrate (phase0_findings.md items #1, #4 — re-verify 2026-04-26 picture, capture contract trace). Acquired today on ohm; this is the campaign's anchor data, not predecessor reference.

Verdict (boolean-correctness, the Phase 1 success criterion)

The 2026-04-26 picture holds. vainfo + mpv --hwdec=vaapi-copy engage the libva-v4l2-request backend end-to-end on ohm today; full V4L2-stateless contract trace lands on hantro at /dev/video1 + /dev/media0; 68 H.264 frames decoded successfully from bbb_1080p30_h264.mp4 with no EINVAL/EAGAIN/EBUSY anywhere on the request-API path.

No rig drift requiring Phase 2 loopback. Substrate intact for Phase 1 lock.

Inventory finding (substrate has shifted under us, but in our favor)

The installed /usr/lib/dri/v4l2_request_drv_video.so on ohm is not a build of libva-v4l2-request-fourier/master. It is the libva-v4l2-request-ohm-gl-fix 1.0.0.r0.ga3c2476-2 package, built on boltzmann 2026-05-02 from ~/src/marfrit-packages/arch/libva-v4l2-request-ohm-gl-fix/PKGBUILD (laptop). The PKGBUILD applies fourier-local.patch + Step 1 patches 0001..0018 on top of bootlin tarball a3c2476. The source tree is intact on boltzmann at ~/build/libva-v4l2-request-ohm-gl-fix/. The sha256 of the .so extracted from the package, 97a726ffcfc5a58d2e103139a7d1cfaa1a2d13a721535fc63e2e0e7d6b780c17, matches the installed file exactly.
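The hash comparison above can be repeated with a few lines of Python. This is a minimal sketch, not the procedure actually used on ohm. The helper names `sha256_of` and `same_artifact` are hypothetical, and the path of the extracted copy is left to the caller because the source does not record it.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(installed, extracted):
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(installed) == sha256_of(extracted)
```

For this rig, `installed` would be /usr/lib/dri/v4l2_request_drv_video.so and `extracted` the copy unpacked from the package. A result of `True` is the "matches installed file exactly" condition.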

Implication: the git fork at ~/src/libva-multiplanar/libva-v4l2-request-fourier/master (e8c3937) is a pre-Step-1 substrate. It has the multi-planar wedge + HEVC strip + UAPI shim + STREAMON-defer WIP, but lacks all of 0002..0018 (request_pool, conditional PRED_WEIGHTS, frame-based slice-ctrl gating, ANNEX_B start codes, fill DECODE_PARAMS from VAAPI, no CAPTURE S_FMT, SCALING_MATRIX matrix_set predicate, level_idc derivation, ffmpeg-vaapi POC sentinel strip, P/B-frame flags, DPB pic_num correctness). Rebuilding from the fork as-is would be a regression. Phase 0 deliverable #2 (Step 1 reconciliation) is upstream of any "build from fork and install" step.

What was actually run

| Command | Outcome |
| --- | --- |
| vainfo --display drm --device /dev/dri/renderD128 w/ LIBVA_DRIVER_NAME=v4l2_request | OK: 7 profiles enumerated (MPEG2 Simple/Main, H264 Main/High/ConstrainedBaseline/MultiviewHigh/StereoHigh) |
| mpv --hwdec=vaapi --vo=null --frames=60 | "Could not create device." → SW fallback. Rig issue, not libva: --vo=null doesn't provide a DRM context to vaapi proper. mpv main process opens libva.so for symbol resolution but never reaches the v4l2_request backend. |
| mpv --hwdec=vaapi-copy --vo=null --frames=60 | OK: "Initialized VAAPI: version 1.23", "Using hardware decoding (vaapi-copy)". 68 frames decoded cleanly. |

Rig caveat captured: any Phase 3+ test consumer harness over SSH must use --hwdec=vaapi-copy (headless-safe) or run inside a real Plasma/X session. The 2026-04-26 STUDY claim presumably used the latter.
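A test harness could encode this caveat when it builds the mpv command line. The sketch below is illustrative only. The function name `mpv_decode_command` is hypothetical, and treating a set `DISPLAY` or `WAYLAND_DISPLAY` variable as evidence of a real VO is an assumption, not something the runs above verified. The mpv flags themselves are the ones used in this session.

```python
import os


def mpv_decode_command(clip, frames=60, env=None):
    """Build an mpv argv that stays on the hardware path over SSH.

    Assumption: a graphical session is detected purely from the
    DISPLAY / WAYLAND_DISPLAY environment variables.
    """
    env = os.environ if env is None else env
    has_display = bool(env.get("WAYLAND_DISPLAY") or env.get("DISPLAY"))
    # vaapi proper needs a real VO; vaapi-copy is the headless-safe mode.
    hwdec = "vaapi" if has_display else "vaapi-copy"
    cmd = ["mpv", f"--hwdec={hwdec}", f"--frames={frames}"]
    if not has_display:
        cmd.append("--vo=null")
    cmd.append(clip)
    return cmd
```

The returned list can be passed to `subprocess.run` together with `LIBVA_DRIVER_NAME=v4l2_request` in the child environment.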

V4L2-stateless contract trace (captured spec for Phase 6)

Aggregate across 4 mpv worker threads during the 60-frame --hwdec=vaapi-copy run:

    136 VIDIOC_QBUF              (68 OUTPUT_MPLANE + 68 CAPTURE_MPLANE)
    136 VIDIOC_DQBUF             (68 OUTPUT_MPLANE + 68 CAPTURE_MPLANE)
     78 VIDIOC_G_FMT
     69 VIDIOC_S_EXT_CTRLS       (1 device-wide pre-STREAMON + 68 per-request)
     68 MEDIA_REQUEST_IOC_REINIT
     68 MEDIA_REQUEST_IOC_QUEUE
     22 VIDIOC_ENUM_FMT          (with single→MPLANE fallback EINVALs — fork's probe pattern)
     11 VIDIOC_QUERYBUF
      8 VIDIOC_CREATE_BUFS
      7 MEDIA_IOC_REQUEST_ALLOC  (request_pool fd allocation, patch 0004)
      2 VIDIOC_STREAMON          (OUTPUT + CAPTURE)
      2 VIDIOC_STREAMOFF
      2 VIDIOC_REQBUFS
      1 VIDIOC_S_FMT             (OUTPUT_MPLANE H264_SLICE 1920x1088)
      1 VIDIOC_QUERYCAP
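A tally like the one above can be rebuilt from the preserved strace files with a short script. This is a sketch under two assumptions: strace printed symbolic request names in the form `ioctl(fd, NAME, ...)`, and each per-worker file holds complete lines. The names `count_ioctls` and `aggregate_is_consistent` are hypothetical. The consistency rules are only the relationships visible in the table (two QBUF/DQBUF per frame, one device-wide S_EXT_CTRLS plus one per request).

```python
import re
from collections import Counter

# Matches "ioctl(5, VIDIOC_QBUF, ..." and captures the request name.
IOCTL_RE = re.compile(r"\bioctl\(\d+, ([A-Z][A-Z0-9_]+)")


def count_ioctls(lines):
    """Count ioctl request names across an iterable of strace lines."""
    counts = Counter()
    for line in lines:
        match = IOCTL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def aggregate_is_consistent(counts, frames):
    """Check the per-frame relationships seen in the aggregate table."""
    counts = Counter(counts)
    return (
        counts["VIDIOC_QBUF"] == 2 * frames            # OUTPUT + CAPTURE
        and counts["VIDIOC_DQBUF"] == 2 * frames
        and counts["VIDIOC_S_EXT_CTRLS"] == frames + 1  # 1 device-wide
        and counts["MEDIA_REQUEST_IOC_REINIT"] == frames
        and counts["MEDIA_REQUEST_IOC_QUEUE"] == frames
    )
```

Feeding it the concatenated lines of all four worker files and `frames=68` should reproduce the figures above if the assumptions about the strace format hold.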

Per-frame lifecycle (worker 138524 excerpt, 17 of the 68 total requests):

  1. setup: S_FMT OUTPUT_MPLANE H264_SLICE 1920×1088 → G_FMT CAPTURE_MPLANE returns NV12 1920×1088 (driver-derived) → CREATE_BUFS for OUTPUT (4) + CAPTURE (1+) → QUERYBUF × N → S_EXT_CTRLS device-wide (DECODE_MODE=FRAME_BASED + START_CODE_ANNEX_B per patch 0002) → STREAMON both queues → MEDIA_IOC_REQUEST_ALLOC × pool size
  2. per-frame: MEDIA_REQUEST_IOC_REINIT (recycle fd) → S_EXT_CTRLS (per-request: SPS, PPS, DECODE_PARAMS, conditionally SCALING_MATRIX) → QBUF OUTPUT_MPLANE w/ request_fd → MEDIA_REQUEST_IOC_QUEUE (fire) → QBUF CAPTURE_MPLANE → DQBUF CAPTURE_MPLANE (frame ready) → DQBUF OUTPUT_MPLANE (slice consumed)
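The per-frame sequence in step 2 can be turned into a small ordering check for later traces. The sketch below assumes the events of one worker have already been reduced to labels such as `VIDIOC_QBUF:OUTPUT` (a hypothetical labelling, not strace output) and that frames do not interleave within a worker, as in the excerpt. A consumer that pipelines requests would need a looser check.

```python
# Expected order within one frame, taken from step 2 above.
PER_FRAME_ORDER = [
    "MEDIA_REQUEST_IOC_REINIT",
    "VIDIOC_S_EXT_CTRLS",
    "VIDIOC_QBUF:OUTPUT",
    "MEDIA_REQUEST_IOC_QUEUE",
    "VIDIOC_QBUF:CAPTURE",
    "VIDIOC_DQBUF:CAPTURE",
    "VIDIOC_DQBUF:OUTPUT",
]


def count_complete_frames(events):
    """Return the number of frames if events repeat PER_FRAME_ORDER exactly.

    Raises ValueError on the first out-of-order event or on a
    trailing partial frame.
    """
    n = len(PER_FRAME_ORDER)
    for i, event in enumerate(events):
        expected = PER_FRAME_ORDER[i % n]
        if event != expected:
            raise ValueError(f"event {i}: expected {expected}, got {event}")
    if len(events) % n != 0:
        raise ValueError("trace ends in the middle of a frame")
    return len(events) // n
```

Running the same check against another consumer's trace would show directly whether it produces the same per-frame V4L2 sequence as mpv.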

NV12 CAPTURE format is reported with num_planes=1 in the MPLANE container: semi-planar (Y + interleaved UV) packed in one allocation of 3 655 712 B (1920×1088×1.5 = 3 133 440 B of pixel data plus 522 272 B of driver padding). That's correct V4L2 semantics; NV12M would be 2-plane and we don't see that.
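The size arithmetic can be checked in a few lines. The sketch below computes the raw NV12 payload for the coded size and the remainder that the driver adds. The final comment points out one decomposition of that remainder that fits exactly. Reading it as a per-macroblock side-data area is an assumption; the trace only shows the total.

```python
def nv12_payload_bytes(width, height):
    """Bytes of pixel data in one NV12 frame (4:2:0, 8-bit)."""
    luma = width * height
    chroma = luma // 2  # interleaved UV plane is half the luma size
    return luma + chroma


reported = 3_655_712                       # sizeimage seen in the trace
payload = nv12_payload_bytes(1920, 1088)   # 3_133_440
extra = reported - payload                 # 522_272

# Observation (interpretation is an assumption): with 16x16 macroblocks
# the frame is 120 x 68 MBs, and 64 * 120 * 68 + 32 == 522_272.
macroblocks = (1920 // 16) * (1088 // 16)
```

The allocation is therefore about 17 % larger than the bare pixel data, which matters for any consumer that sizes its own buffers from width and height alone.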

No errors on the request-API path. The only EINVALs are the expected single-plane ENUM_FMT probes that drive the MPLANE fallback (fork patch 2f54a8d / Step1 fourier-local.patch) and one index=N+1 end-of-enum sentinel.
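The "no unexpected errors" claim lends itself to an automated filter over the strace output. This sketch assumes strace's usual failure format (`= -1 ERRNO (text)`) and symbolic ioctl names. The allowlist holds only the one pattern described above, EINVAL from VIDIOC_ENUM_FMT, and the name `unexpected_failures` is hypothetical.

```python
import re

# Captures the ioctl name and errno of a failed call, e.g.
# "ioctl(5, VIDIOC_ENUM_FMT, {...}) = -1 EINVAL (Invalid argument)".
FAIL_RE = re.compile(r"\bioctl\(\d+, ([A-Z][A-Z0-9_]+).*= -1 (E[A-Z0-9]+)")

# Failures that are part of the normal probe pattern.
EXPECTED = {("VIDIOC_ENUM_FMT", "EINVAL")}


def unexpected_failures(lines):
    """Return (ioctl, errno) pairs for failed ioctls not in EXPECTED."""
    found = []
    for line in lines:
        match = FAIL_RE.search(line)
        if match and (match.group(1), match.group(2)) not in EXPECTED:
            found.append((match.group(1), match.group(2)))
    return found
```

An empty result over all worker files corresponds to the finding stated above. Any EINVAL, EAGAIN or EBUSY on QBUF, S_EXT_CTRLS or the media request ioctls would show up in the returned list.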

Full extracted trace: contract_trace_worker138524.txt (140 lines). The raw strace files for all 4 workers are preserved at mpv_vaapi_copy_2026-05-04.strace.*.

Predecessor claims, retested

| Claim (source) | 2026-05-04 result |
| --- | --- |
| "vainfo enumerates seven H.264 profiles cleanly" (phase0_findings.md § ohm_gl_fix close-out) | HOLDS. 7 profiles enumerated via --display drm --device /dev/dri/renderD128. |
| "vainfo + mpv probes work end-to-end" (STUDY.md, 2026-04-26) | HOLDS with rig caveat: vaapi-copy works headless; vaapi proper requires a DRM-providing VO. |
| "Brave-side wall is chromeos pipeline upstream of libva" (STUDY.md, 2026-04-26) | NOT RETESTED today. Brave deferred per phase0_findings § Test consumers; the test consumers that should be retested next are Firefox VAAPI and chromium-fourier 149 (regression). |
| "/dev/video1 + /dev/media0 = hantro decoder" (carry-over) | CONFIRMED via media-ctl -p /dev/media0 topology dump: rk3568-vpu-dec source/proc/sink, M2M_MPLANE caps. /dev/video2 + /dev/media1 = encoder; /dev/video0 = rga (2D blit). |
| "build artifact is a ~265 KB .so" (STUDY.md) | STALE. Current .so is 67 KB stripped release. Either STUDY's number predates meson --buildtype=release or counted a different artifact. Doesn't affect correctness. |
| "STREAMON ordering on hantro is the load-bearing pending fix" (STUDY.md item #4) | SOLVED in Step 1: patches 0002 (pre-STREAMON device controls) + 0004 (request_pool) implement the proper sequence. Trace confirms: device-wide S_EXT_CTRLS → STREAMON both queues → per-request S_EXT_CTRLS + QBUF + REQUEST_QUEUE. The fork's 44a7327 ("WIP: defer STREAMON in CreateContext") is an incomplete approach to the same wedge. |

Rig state captured (for Phase 1 anchor)

  • Host: ohm (PineTab2, RK3568, 4× Cortex-A55, Mali-G52 MP2)
  • Kernel: 6.19.10-danctnix1-1-pinetab2 #1 SMP PREEMPT_DYNAMIC Sat, 28 Mar 2026 02:45:08 +0000 aarch64
  • Userspace: Arch Linux ARM rolling, mesa 1:26.0.5-1, libva 2.23.0-1, libva-utils 2.22.0-1, mpv 1:0.41.0-3 (built against ffmpeg 8.0.1, runtime ffmpeg 8.1), strace 6.19-1
  • Governor: performance
  • Hantro decoder: /dev/video1 (V4L2_CAP_VIDEO_M2M_MPLANE), media controller /dev/media0, bus platform:fdea0000.video-codec
  • Test clip: /home/mfritsche/fourier-test/bbb_1080p30_h264.mp4, 1920×1080 H.264 24 fps, sha256 dcf8a7170fbd49bbd6d6137753ea3b1679abaa2d29dca057fa385598674ce61a (matches phase0_findings.md SHA-16 prefix)

Phase 0 deliverables status

  • #1: Re-verify 2026-04-26 failure-mode finding in-session. Done. Picture holds.
  • #2: Reconcile Step 1 (0001..0018) against fork master. Not started. Recommended next step now that the inventory gap is documented. Three options remain (fold-in / supersede WIP / branch-and-keep), and the data here makes fold-in look correct: Step 1 is a strict superset of fork master in capability terms.
  • #3: Verify Firefox configuration end-to-end. Not started. Independent of #2.
  • #4: Phase 0 baseline anchor (in-session N≥1 contract trace). Captured for mpv vaapi-copy on bbb. Original spec said chromium-fourier 149; mpv trace is a clean reproducible substitute (same backend, same per-frame lifecycle). A chromium trace is still worth capturing for cross-validation but is no longer blocking.

Artifacts in this directory

  • vainfo_v4l2_request_2026-05-04.{strace,stdout,stderr} — vainfo run with strace
  • mpv_vaapi_2026-05-04.{strace.*,stdout,stderr} — mpv --hwdec=vaapi --vo=null (failed: rig issue)
  • mpv_vaapi_copy_2026-05-04.{strace.*,stdout,stderr} — mpv --hwdec=vaapi-copy --vo=null (success: 68 frames)
  • contract_trace_worker138524.txt — extracted 140-line per-frame V4L2 lifecycle from one worker thread

Recommended next steps

  1. Capture chromium-fourier 149 contract trace for cross-validation against mpv's. Same backend should produce the same per-frame V4L2 sequence; if it doesn't, that's the finding.
  2. Tackle Step 1 reconciliation (Phase 0 deliverable #2). With both marfrit-packages/arch/libva-v4l2-request-ohm-gl-fix/PKGBUILD patches and the fork's git history available, the rebase/fold-in is mechanical; the open question is just the strategy (linear fold-in onto bootlin tip, vs reset fork master to PKGBUILD's effective state, vs preserve fork's WIP commits as a branch).
  3. Verify Firefox configuration (Phase 0 deliverable #3). Independent path; can run in parallel with #2.
  4. Lock Phase 1. Boolean-correctness criterion confirmed feasible for the test consumers we plan to use. The metric "every consumer in the test corpus completes the V4L2 request-API ioctl sequence to first DQBUF" is now grounded in observed behaviour — it was a hypothesis going into Phase 0 and the data backs it.
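
The Phase 1 metric in item 4 can be written down as a boolean check over per-consumer traces. This is a sketch of one possible reading, not an agreed definition. It treats "completes the sequence to first DQBUF" as "a successful VIDIOC_DQBUF appears after a successful MEDIA_REQUEST_IOC_QUEUE", and it assumes strace-style lines in which failures contain ` = -1 `. The function names are hypothetical.

```python
def reached_first_dqbuf(lines):
    """True if a successful DQBUF follows a successful request queue."""
    queued = False
    for line in lines:
        if " = -1 " in line:
            continue  # failed calls do not advance the sequence
        if "MEDIA_REQUEST_IOC_QUEUE" in line:
            queued = True
        elif queued and "VIDIOC_DQBUF" in line:
            return True
    return False


def corpus_passes(traces):
    """Boolean-correctness over a {consumer_name: lines} mapping.

    An empty corpus is treated as a failure rather than a vacuous pass.
    """
    return bool(traces) and all(
        reached_first_dqbuf(lines) for lines in traces.values()
    )
```

With the mpv trace as the only entry the check reduces to the result recorded here. Adding Firefox and chromium-fourier traces later extends the same criterion without changing it.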