Locks iter1 scope to the 3 currently-kernel-supported codecs (H.264, VP8, MPEG-2). The other 3 (HEVC, VP9, AV1) are deferred to dedicated iter2/3/4 loops — bundling them would conflate three different external dependencies into one un-closable iteration. Resolves Phase 0 open questions Q1 (scope = 3 codecs), Q2 (bit-exact anchor = libva-HW vs SW byte-compare; kdirect-on-RK3588 is its own piece of work), Q4 (firefox-fourier in scope as consumer test). Defers Q3 (HEVC OOPS sandboxing) and Q5 (AV1 source clip) to the relevant iterations. Success criteria are six per-codec instruments (C1-C6: frame count, HW engagement via lsof+strace, byte-identical at frame 0, SSIM>=0.99 at frame 30s, N=3 FPS, clean dmesg) plus one campaign-level firefox-fourier vendor-default test (C7). Hypothesis is falsifiable in three named ways (codec crash/hang, first-frame byte deviation indicating kernel ABI drift, MPEG-2 IDCT tolerance refinement) — each maps to a specific phase loopback. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
6.2 KiB
Phase 1 — Goal formulation (iter1)
Locked 2026-05-16. Phase 0 finished 30 min earlier; substrate confirms 3 codecs decode-capable on the clean kernel and 3 are blocked behind external work. This iteration locks the scope to the kernel-supported subset — chasing the 3 blockers would conflate three different debugging problems with three different teams (kernel oops fix, kernel feature add, backend iter39) into one iteration loop, which is exactly the failure mode the 8-phase process is designed to avoid.
Iter1 goal (one sentence)
Establish a rigorous, instrumented baseline that the 3 currently-kernel-supported codecs (H.264 via rkvdec, VP8 + MPEG-2 via hantro) decode end-to-end through the libva-v4l2-request-fourier (iter38b, hand-built) backend on ampere, with measurable evidence of byte-correctness and HW path engagement — so subsequent iterations chasing the 3 blocker codecs have a known-stable floor to regress against.
Resolved Phase 0 open questions
- Q1 scope: 3 codecs (H.264 + VP8 + MPEG-2). The other 3 are tracked as separate iterations (each becomes its own 8-phase loop once the kernel-agent experiment / backend iter39 produces something to test). Locking iter1 to "all 6 work" makes the iteration unclosable.
- Q2 bit-exact anchor: libva-HW vs
ffmpeg -i clip … -f rawvideo nv12SW decode of the SAME input clip.kdirect(the fresnel-side V4L2 stateless test rig) is RK3399-specific and not yet ported to RK3588's mainline rkvdec; porting it is a separate piece of work, not iter1. The SW-compare anchor is weaker than libva==kdirect (it conflates "libva backend bug" with "kernel V4L2 stateless ABI deviation from SW reference"), but it is measurable against a stable reference, which is what Phase 1 needs. - Q3 HEVC OOPS sandboxing: deferred to the HEVC-fix iteration. Iter1 avoids HEVC entirely.
- Q4 firefox-fourier vendor defaults: in scope for iter1 as the consumer validation step — install
firefox-fourier 150.0.1-5on ampere, sweep H.264 / VP8 / MPEG-2 (skip HEVC, skip VP9, MPEG-2 expected n/a in Gecko per fresnel iter1 finding) with the empty-profile-vendor-defaults test method validated on fresnel. - Q5 AV1 source clip: deferred — iter1 doesn't touch AV1.
Success criteria (measurable, with the Phase 3 instrument named for each)
For each of {H.264, VP8, MPEG-2}:
| # | Criterion | Phase 3 instrument |
|---|---|---|
| C1 | libva HW decode runs to completion, exit 0, frame count = clip frame count (1440 frames for 60 s @ 24 fps) | ffmpeg -hide_banner -benchmark -hwaccel vaapi -hwaccel_output_format vaapi -i $clip -vf hwdownload,format=nv12 -f rawvideo /tmp/out.nv12 ; frame= count from stderr, ls -la /tmp/out.nv12 size check (= frame_count × 1280×720 × 1.5) |
| C2 | HW path is actually engaged, not a silent SW fallback | lsof -p $PID during decode shows /dev/video* and /dev/media* opened by the ffmpeg PID; AND strace -e trace=ioctl -p $PID shows VIDIOC_QUERYCAP / VIDIOC_S_EXT_CTRLS / MEDIA_REQUEST_IOC_QUEUE calls |
| C3 | Decoded byte output matches SW reference at frame 0 | cmp of first 1 382 400 bytes (one yuv420p 720p frame) between libva-HW /tmp/hw.nv12 and SW ffmpeg -i $clip -f rawvideo -pix_fmt nv12 -frames:v 1 /tmp/sw.nv12 |
| C4 | At frame 30 s, SSIM Y ≥ 0.99 vs SW reference (acknowledges H.264 cumulative drift per fresnel iter1) | ffmpeg -lavfi "[0:v][1:v]ssim" -f null - between the two clips at frame 720 |
| C5 | Per-codec decode FPS reported at N=3 with mean and σ. No "should be fast" — actual numbers, no recall from training or fresnel cross-host. | wall-time measurement around timeout … ffmpeg …; mean and standard deviation of 3 runs |
| C6 | dmesg clean across the full sweep — no new oops, no new warning, no rkvdec/hantro error lines | dmesg --time-format=ctime before/after the sweep, diff'd |
Plus one campaign-level criterion:
| C7 | firefox-fourier 150.0.1-5 engages HW for {H.264, VP8} via empty-profile vendor defaults (MPEG-2 expected n/a — Gecko doesn't play video/mp2t) | Empty ~/.mozilla/firefox-test/profile, autoplay page, count Got VA-API DMABufSurface and Requesting pixel format VAAPI_VLD MOZ_LOG entries |
Hypothesis
With backend 7ac934e (hand-built 0c9a7efaab…) over kernel 7.0.0-rc3-devices+, all 3 codecs satisfy C1-C6 and {H.264, VP8} satisfy C7. The basis for this prediction is fresnel iter38 close which certified the same backend on RK3399 with the same V4L2 stateless ABI surface; RK3588's mainline rkvdec + generic hantro present a similar enough surface that the libva-side code path should function identically.
The hypothesis is wrong if:
- (a) Any of the 3 codecs fails C1 (decode crash / hang / wrong frame count). Loop back to Phase 0: substrate was wrong, the codec isn't actually viable on this kernel.
- (b) Decode completes but C3 fails at frame 0 (libva-HW first-frame bytes differ from SW first-frame bytes). Loop back to Phase 2: surface a kernel-V4L2-ABI deviation on RK3588 that didn't show on RK3399 / RK3568, and characterize it before chasing per-codec tuning.
- (c) MPEG-2 C3 fails per IDCT precision tolerance (per memory
feedback_mpeg2_hw_sw_idct_precision, hantro MPEG-2 is intrinsically ≤3 LSB off SW per IEEE 1180). Phase 4 must relax C3 for MPEG-2 to "SSIM Y ≥ 0.9997" instead of "byte-identical." This is a plan refinement, not a substrate failure.
Out of iter1, tracked as separate iterations
- iter2 (planned): HEVC kernel OOPS fix. Substrate-mode work; needs
marfrit/kernel-agentexperiment with a candidate patch forrkvdec_hevc_prepare_hw_st_rps. Open a follow-up issue against kernel-agent before iter1 closes. - iter3 (planned): VP9 kernel enablement on RK3588 rkvdec. Substrate-mode; kernel-agent experiment with the upstream patch chain (VDPU381/383 family, per memory
feedback_rkvdec_patch_reachability). - iter4 (planned): AV1 via libva backend iter39 — third-fd probe in
request.c. Pure backend work, not kernel.
Each will be Phase 0 → Phase 8 in its own right, anchored to this iter1 baseline as its reference floor.
Phase 1 close
Goal is one sentence, success is six criteria each named to a specific instrument, hypothesis is falsifiable in three named ways. Ready for Phase 2 situation analysis.