Files
marfrit 0d9914062c iter1 phase1: goal formulation
Locks iter1 scope to the 3 currently-kernel-supported codecs (H.264,
VP8, MPEG-2). The other 3 (HEVC, VP9, AV1) are deferred to dedicated
iter2/3/4 loops — bundling them would conflate three different
external dependencies into one un-closable iteration.

Resolves Phase 0 open questions Q1 (scope = 3 codecs), Q2 (bit-exact
anchor = libva-HW vs SW byte-compare; kdirect-on-RK3588 is its own
piece of work), Q4 (firefox-fourier in scope as consumer test).
Defers Q3 (HEVC OOPS sandboxing) and Q5 (AV1 source clip) to the
relevant iterations.

Success criteria are six per-codec instruments (C1-C6: frame count,
HW engagement via lsof+strace, byte-identical at frame 0, SSIM>=0.99
at frame 30s, N=3 FPS, clean dmesg) plus one campaign-level
firefox-fourier vendor-default test (C7).

Hypothesis is falsifiable in three named ways (codec crash/hang,
first-frame byte deviation indicating kernel ABI drift, MPEG-2 IDCT
tolerance refinement) — each maps to a specific phase loopback.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 07:20:23 +00:00

6.2 KiB
Raw Permalink Blame History

Phase 1 — Goal formulation (iter1)

Locked 2026-05-16. Phase 0 finished 30 min earlier; substrate confirms 3 codecs decode-capable on the clean kernel and 3 are blocked behind external work. This iteration locks the scope to the kernel-supported subset — chasing the 3 blockers would conflate three different debugging problems with three different teams (kernel oops fix, kernel feature add, backend iter39) into one iteration loop, which is exactly the failure mode the 8-phase process is designed to avoid.

Iter1 goal (one sentence)

Establish a rigorous, instrumented baseline that the 3 currently-kernel-supported codecs (H.264 via rkvdec, VP8 + MPEG-2 via hantro) decode end-to-end through the libva-v4l2-request-fourier (iter38b, hand-built) backend on ampere, with measurable evidence of byte-correctness and HW path engagement — so subsequent iterations chasing the 3 blocker codecs have a known-stable floor to regress against.

Resolved Phase 0 open questions

  • Q1 scope: 3 codecs (H.264 + VP8 + MPEG-2). The other 3 are tracked as separate iterations (each becomes its own 8-phase loop once the kernel-agent experiment / backend iter39 produces something to test). Locking iter1 to "all 6 work" makes the iteration unclosable.
  • Q2 bit-exact anchor: libva-HW vs ffmpeg -i clip … -f rawvideo nv12 SW decode of the SAME input clip. kdirect (the fresnel-side V4L2 stateless test rig) is RK3399-specific and not yet ported to RK3588's mainline rkvdec; porting it is a separate piece of work, not iter1. The SW-compare anchor is weaker than libva==kdirect (it conflates "libva backend bug" with "kernel V4L2 stateless ABI deviation from SW reference"), but it is measurable against a stable reference, which is what Phase 1 needs.
  • Q3 HEVC OOPS sandboxing: deferred to the HEVC-fix iteration. Iter1 avoids HEVC entirely.
  • Q4 firefox-fourier vendor defaults: in scope for iter1 as the consumer validation step — install firefox-fourier 150.0.1-5 on ampere, sweep H.264 / VP8 / MPEG-2 (skip HEVC, skip VP9, MPEG-2 expected n/a in Gecko per fresnel iter1 finding) with the empty-profile-vendor-defaults test method validated on fresnel.
  • Q5 AV1 source clip: deferred — iter1 doesn't touch AV1.

Success criteria (measurable, with the Phase 3 instrument named for each)

For each of {H.264, VP8, MPEG-2}:

# Criterion Phase 3 instrument
C1 libva HW decode runs to completion, exit 0, frame count = clip frame count (1440 frames for 60 s @ 24 fps) ffmpeg -hide_banner -benchmark -hwaccel vaapi -hwaccel_output_format vaapi -i $clip -vf hwdownload,format=nv12 -f rawvideo /tmp/out.nv12 ; frame= count from stderr, ls -la /tmp/out.nv12 size check (= frame_count × 1280×720 × 1.5)
C2 HW path is actually engaged, not a silent SW fallback lsof -p $PID during decode shows /dev/video* and /dev/media* opened by the ffmpeg PID; AND strace -e trace=ioctl -p $PID shows VIDIOC_QUERYCAP / VIDIOC_S_EXT_CTRLS / MEDIA_REQUEST_IOC_QUEUE calls
C3 Decoded byte output matches SW reference at frame 0 cmp of first 1 382 400 bytes (one yuv420p 720p frame) between libva-HW /tmp/hw.nv12 and SW ffmpeg -i $clip -f rawvideo -pix_fmt nv12 -frames:v 1 /tmp/sw.nv12
C4 At frame 30 s, SSIM Y ≥ 0.99 vs SW reference (acknowledges H.264 cumulative drift per fresnel iter1) ffmpeg -lavfi "[0:v][1:v]ssim" -f null - between the two clips at frame 720
C5 Per-codec decode FPS reported at N=3 with mean and σ. No "should be fast" — actual numbers, no recall from training or fresnel cross-host. wall-time measurement around timeout … ffmpeg …; mean and standard deviation of 3 runs
C6 dmesg clean across the full sweep — no new oops, no new warning, no rkvdec/hantro error lines dmesg --time-format=ctime before/after the sweep, diff'd

Plus one campaign-level criterion:

| C7 | firefox-fourier 150.0.1-5 engages HW for {H.264, VP8} via empty-profile vendor defaults (MPEG-2 expected n/a — Gecko doesn't play video/mp2t) | Empty ~/.mozilla/firefox-test/profile, autoplay page, count Got VA-API DMABufSurface and Requesting pixel format VAAPI_VLD MOZ_LOG entries |

Hypothesis

With backend 7ac934e (hand-built 0c9a7efaab…) over kernel 7.0.0-rc3-devices+, all 3 codecs satisfy C1-C6 and {H.264, VP8} satisfy C7. The basis for this prediction is fresnel iter38 close which certified the same backend on RK3399 with the same V4L2 stateless ABI surface; RK3588's mainline rkvdec + generic hantro present a similar enough surface that the libva-side code path should function identically.

The hypothesis is wrong if:

  • (a) Any of the 3 codecs fails C1 (decode crash / hang / wrong frame count). Loop back to Phase 0: substrate was wrong, the codec isn't actually viable on this kernel.
  • (b) Decode completes but C3 fails at frame 0 (libva-HW first-frame bytes differ from SW first-frame bytes). Loop back to Phase 2: surface a kernel-V4L2-ABI deviation on RK3588 that didn't show on RK3399 / RK3568, and characterize it before chasing per-codec tuning.
  • (c) MPEG-2 C3 fails per IDCT precision tolerance (per memory feedback_mpeg2_hw_sw_idct_precision, hantro MPEG-2 is intrinsically ≤3 LSB off SW per IEEE 1180). Phase 4 must relax C3 for MPEG-2 to "SSIM Y ≥ 0.9997" instead of "byte-identical." This is a plan refinement, not a substrate failure.

Out of iter1, tracked as separate iterations

  • iter2 (planned): HEVC kernel OOPS fix. Substrate-mode work; needs marfrit/kernel-agent experiment with a candidate patch for rkvdec_hevc_prepare_hw_st_rps. Open a follow-up issue against kernel-agent before iter1 closes.
  • iter3 (planned): VP9 kernel enablement on RK3588 rkvdec. Substrate-mode; kernel-agent experiment with the upstream patch chain (VDPU381/383 family, per memory feedback_rkvdec_patch_reachability).
  • iter4 (planned): AV1 via libva backend iter39 — third-fd probe in request.c. Pure backend work, not kernel.

Each will be Phase 0 → Phase 8 in its own right, anchored to this iter1 baseline as its reference floor.

Phase 1 close

Goal is one sentence, success is six criteria each named to a specific instrument, hypothesis is falsifiable in three named ways. Ready for Phase 2 situation analysis.