diff --git a/phase1_goal_iter2.md b/phase1_goal_iter2.md new file mode 100644 index 0000000..2ad7708 --- /dev/null +++ b/phase1_goal_iter2.md @@ -0,0 +1,62 @@ +# Phase 1 — iter2 goal formulation (HEVC backend EXT_SPS_*_RPS extension) + +Locked 2026-05-16, post-Phase-0-update with upstream-survey resolution. + +## Iter2 goal (one sentence) + +**Extend `libva-v4l2-request-fourier` to populate `V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS` and `_LT_RPS` for VDPU381/VDPU383 HEVC, by vendoring GStreamer's `gst-plugins-bad/codecparsers/gsth265parser.c` (dropping GLib deps), and using a runtime-optional control probe so the same backend binary works against both 7.0-and-later kernels (HEVC engages) and earlier kernels (HEVC falls back to whatever the pre-7.0 path did) — until ampere HEVC HW decode passes the ampere-fourier iter1 C1-C6 success criteria per the per-codec floors, with HEVC SSIM-Y-at-f720 expected in H.264-drift territory (~0.65 ± 0.05).** + +## Resolved Phase 0 open questions + +- **Q1 (architecture)**: B — implement H.265 SPS parser in backend, mirroring GStreamer's `gst_v4l2_codec_h265_dec_fill_ext_sps_rps`. +- **Q2 (UAPI shim vs bump)**: header strategy is runtime-optional control probe (no `#ifndef` shim), per GStreamer's pattern. Compile-time CID + struct definitions: ship a minimal internal header at `src/hevc-ctrls/v4l2-hevc-ext-controls.h` (the new file matches the precedent of fresnel iter25's image_fmt shim approach — minimal, scope-tagged to what the new UAPI added in 7.0). The headers package on ampere stays at 6.19-1; no `pacman -Syu` needed. +- **Parser source sub-question**: B1 — vendor GStreamer's `gsth265parser.c` directly. Operator-confirmed 2026-05-16. + +## Vendoring spec for the GStreamer parser + +- Source: `gitlab.freedesktop.org/gstreamer/gstreamer` → `subprojects/gst-plugins-bad/codecparsers/gsth265parser.c` + companion header. Take the version that landed alongside or after MR !10820 (GStreamer 1.28). +- Target location in backend: `src/h265_parser.{c,h}` OR `src/gsth265parser/{gsth265parser.c, gsth265parser.h}` — TBD in Phase 4 (file-naming detail; the vendored content itself is licence-pinned by upstream's LGPL header which we preserve). +- License: file keeps GStreamer's LGPL header (authorship + source URL preserved). Backend's existing COPYING.LGPL covers redistribution. MIT remains for the rest of the backend. Add a top-level note in `README.md` listing the vendored file as an LGPL-licensed third-party component. +- GLib dependency removal: replace `GArray` with plain dynamically-sized C arrays (size + pointer); replace `g_slice_*` / `g_malloc` with libc `malloc`/`free`; replace `GstBitReader` with the backend's own `bit_reader.{c,h}` style (or a vendored copy if backend lacks one — check during Phase 2). +- Function-name namespace: keep `gst_h265_parser_*` for the vendored functions (don't rename — that breaks the upstream-bug-fix-sync we're paying for by vendoring). The backend's call sites use the GStreamer names directly. +- Tracking upstream: README note pinning the vendored revision (sha + tag); when GStreamer ships a parser fix, the backend pulls it manually (no submodule machinery in this iteration). + +## Success criteria + +For HEVC, mirroring ampere-fourier iter1's C1-C6 with per-codec-floor refinement: + +| # | Criterion | Phase 3 instrument | Phase 7 anchor | +|---|-----------|--------------------|----------------| +| **C1** | HEVC libva HW decode runs to completion on `bbb_60s_720p.hevc.mp4`, exit 0, frame count = 1440 frames | `ffmpeg -benchmark -hwaccel vaapi -hwaccel_output_format vaapi -i $clip -vf hwdownload,format=nv12 -f null -`; frame= count | post-iter2 patch installed | +| **C2** | HW path engaged via ioctl trace; new `MEDIA_REQUEST_IOC_QUEUE` count includes the EXT_SPS_*_RPS controls (i.e. higher per-frame ioctl count vs ampere-fourier iter1's H.264 baseline) | `strace -ff -e trace=ioctl -p $PID`; count `VIDIOC_S_EXT_CTRLS` invocations involving the new CID values | iter1 H.264 ioctl counts as reference history | +| **C3** | HEVC frame 0 byte-identical vs ffmpeg SW reference (HEVC I-frame, no inter-prediction) | `cmp` of 1 382 400 bytes (one 720p yuv420p frame) | iter1 all-codec sha `3214803d8be74416` | +| **C4** | HEVC frame 720 (t=30 s) SSIM Y in H.264-drift territory (~0.65 ± 0.05); no fixed PASS threshold | `ffmpeg -lavfi "[0:v][1:v]ssim"` at frame 720 | fresnel iter1 HEVC SSIM Y was 1.000 (byte-identical) on RK3399 — does NOT carry as anchor; ampere RK3588 may behave differently. Track and accept whatever drift territory shows up; if it's surprising (way better OR worse than ~0.65 ± 0.05), Phase 7 surfaces it for Phase 4 plan refinement | +| **C5** | HEVC FPS reported at N=3 with σ; mean within an order of magnitude of iter1's H.264 461 FPS / VP8 217 FPS / MPEG-2 200 FPS (all 8x-20x realtime). No fixed threshold | wall-time around `ffmpeg`; mean + σ | iter1 numbers as reference history only | +| **C6** | dmesg clean across the full sweep — specifically no OOPS in `rkvdec_hevc_prepare_hw_st_rps` or related | diff pre vs post `dmesg --time-format=ctime` | iter1 empty diff | +| **C7** | (carried over from iter1, was deferred) firefox-fourier vendor-default HEVC engagement — now testable since SDDM auto-login was set up post-iter1 | empty-profile sweep, count `Got VA-API DMABufSurface` for HEVC autoplay | optional — capture if rig allows but not iter2-blocking | +| **C8** | **regression check** — ampere-fourier iter1's 3-codec baseline (H.264, VP8, MPEG-2) still passes C1-C6 per iter1 per-codec floors | re-run `~/measurements/p3_*.sh` post-patch | iter1 numbers verbatim | + +## Hypothesis + +With the vendored GStreamer parser + EXT_SPS_*_RPS control population: +- C1-C3, C5-C6, C8 pass +- C4 falls in H.264-drift territory (~0.65 ± 0.05) per fresnel + ampere iter1 convergent observations — RK3588 rkvdec is conformant within H.265 spec tolerance but not bit-identical to libavcodec SW +- C7 passes if rig (auto-login Wayland session) holds at test time + +## Hypothesis falsifiers (loopback edges) + +- **F1**: HEVC decode still OOPSes despite the new controls being set with valid GStreamer-parser data → loopback Phase 0 with re-opened `kernel-agent#11`. Mechanism is something other than UAPI-gap; needs deeper kernel-side debugging. +- **F2**: HEVC decode succeeds (no OOPS) but produces garbage output (C3 fails — frame 0 not byte-identical) → loopback Phase 4. The parser is producing wrong field values; bisect against GStreamer's reference output or FFmpeg's WIP equivalent on the same clip. +- **F3**: HEVC decode succeeds and produces clean output but ampere-fourier iter1 regression (C8 fails) → loopback Phase 4. The patch is touching shared HEVC code in a way Phase 5 review didn't catch (e.g. unconditional control submission breaks RK3399 path). Per-driver gate via `driver_data->driver_kind` per `feedback_per_driver_kludge_gating`. +- **F4**: Parser vendoring hits LGPL/MIT/license compatibility issues at distribution time → not a code falsifier; revisit at Phase 6 with the operator if it surfaces. + +## Out of iter2 + +- VP9 — that's iter3. +- AV1 — that's `libva-v4l2-request-fourier#2` outside this meta-campaign. +- Kernel-side defense-in-depth NULL/uninit guard for `rkvdec_hevc_prepare_hw_st_rps` — upstream-defense work, optional follow-up issue against kernel-agent or directly upstream. +- IOMMU restore patch verification (Phase 0 open Q4) — passive substrate check during iter2 Phase 2. + +## Phase 1 close + +Goal, eight success criteria (C1-C8), four falsifier paths. Parser-source decision locked. Ready for iter2 Phase 2 (situation analysis + reset-context).