diff --git a/PRE_COMPACT_HANDOFF.md b/PRE_COMPACT_HANDOFF.md
index 46786a8..03be653 100644
--- a/PRE_COMPACT_HANDOFF.md
+++ b/PRE_COMPACT_HANDOFF.md
@@ -1,6 +1,6 @@
-# Pre-Compact Handoff — Session 2026-05-14 (FINAL post iter38)
+# Pre-Compact Handoff — Session 2026-05-17 (iter39 sub-profile work landed, pending fresnel test)
 
-Use this doc to resume the fresnel-fourier campaign after Claude context compaction. **Campaign at definitive close: 5/5 codecs PASS in a single libva session, no env-override required.**
+Use this doc to resume the fresnel-fourier campaign after Claude context compaction. **Iter38 close still holds (5/5 PASS, single libva session). Iter39 sub-profile work (H264 Hi10P + HEVC Main10) committed at backend `662f887` and awaiting Phase 7 validation on fresnel.**
 
 ## TL;DR
 
@@ -14,6 +14,7 @@ Use this doc to resume the fresnel-fourier campaign after Claude context compact
 | Env-gated DIAG probes (iter29/30/33/35) | **CLEANED** | iter36 (-131 / +7 LOC) |
 | α-26 mis-routed cosmetic | **REVERTED** | iter37 (1-line; rkvdec never read that field) |
 | Libva multi-device probe | **DONE** | iter38 (single session serves all 5 codecs; no env override needed) |
+| H264 Hi10P + HEVC Main10 sub-profile | **CODE LANDED — Phase 7 PENDING** | iter39 α-31 (backend 662f887): NV15 CAPTURE pix_fmt, synthetic-SPS bit_depth_luma_minus8=2, NV15→P010 userspace unpack in copy_surface_to_image, P010 reporting in DeriveImage/QueryImageFormats. Self-tested (test_nv15_unpack passes on noether). Awaiting fresnel power-on for vainfo enumeration + libva.P010==kdirect.P010 bit-exact verification. |
 
 | Codec | libva 10F sha | kdirect 10F sha | SW 10F sha | L==K | L==SW |
 |---|---|---|---|---|---|
@@ -48,7 +49,7 @@ vainfo: Supported profile and entrypoints
 | Component | Location | Tip |
 |---|---|---|
 | Campaign repo (this) | `/home/mfritsche/src/fresnel-fourier/` | `ba4b6fd` on gitea master |
-| Libva backend fork (noether) | `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/` | `7ac934e` on gitea master |
+| Libva backend fork (noether) | `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/` | `662f887` on gitea master (iter39 α-31; iter38b is `7ac934e`) |
 | Libva backend (fresnel deploy) | `/home/mfritsche/src/libva-v4l2-request-fourier/` | sync to gitea master, `ninja -C build` |
 | Kernel source (boltzmann) | `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` | pkgrel=14 clean |
 | Kernel running on fresnel | `linux-fresnel-fourier 7.0-14` | clean shipping kernel, no diagnostic printks |
@@ -153,7 +154,42 @@ Expect: 5× PASS.
 
 1. **Multi-context simultaneously** — current design supports only one decode context at a time across devices (device switch tears down pools). Could be expanded to per-context pools to support simultaneous mixed-codec decode. Not requested.
 
-2. **Sub-profile support** — H264 Hi10P, HEVC Main10, VP9 Profile 2 are HW-supported on RK3399 but the libva backend has no entries in `pixelformat_for_profile` and elsewhere. Out of scope for this campaign.
+2. ~~**Sub-profile support**~~ — *Phase 6 LANDED 2026-05-17 (iter39 α-31, backend `662f887`)*. H264 Hi10P + HEVC Main10 wired through the backend with NV15→P010 userspace unpack. VP9 Profile 2 explicitly excluded (RK3399 rkvdec kernel ctrl caps at PROFILE_0). PRIME-side P010 emission deferred (consumers wanting P010 must use the COPY path). Phase 7 test rig at `phase7_iter39_test_rig.sh`; awaiting fresnel.
+
+## Resumption sequence — iter39 Phase 7 (when fresnel is up)
+
+```bash
+# 1. Sync + build backend on fresnel
+ssh fresnel 'cd ~/src/libva-v4l2-request-fourier && \
+  git fetch && git reset --hard origin/master && \
+  ninja -C build && \
+  sudo install -m644 build/src/v4l2_request_drv_video.so /usr/lib/dri/'
+
+# 2. Push test rig + run
+scp ~/src/fresnel-fourier/phase7_iter39_test_rig.sh fresnel:/tmp/
+ssh fresnel 'bash /tmp/phase7_iter39_test_rig.sh'
+
+# Expected pass criteria:
+#   1. vainfo lists VAProfileH264High10 + VAProfileHEVCMain10
+#   2. libva.P010 SHA == kdirect.P010 SHA for Hi10P and Main10 fixtures
+#      (both paths use -vf hwdownload,format=p010le to normalize NV15)
+#   3. SSIM_Y vs libavcodec SW (yuv420p10le) >= 0.999
+#   4. iter38 5/5 PASS baseline still holds on H264/HEVC/VP9/VP8/MPEG-2
+```
+
+## Iter39 internals — pre-Phase 7 verification done
+
+- **Self-test** of `nv15_unpack_plane_to_p010` (`tests/test_nv15_unpack.c` in backend): zero / all-max / 8 known vectors / remainder widths {1,2,3,7} / multi-row stride-padding / chroma-shape — ALL PASS on noether x86_64.
+- **Compile-test**: aarch64 native build on boltzmann clean (gcc 15.2.1 / libva 1.23.0 / libdrm 2.4.133), .so produced, 0 new warnings.
+- **Self-review of commit 662f887** vs Phase 5 amendments: APPROVED. All 3 mandatory amendments + MAX_PROFILES bump + guard updates + NV15-stride source confirmed present.
+
+## Iter39 design notes (load-bearing)
+
+- `driver_data->is_10bit` is the per-session flag (request.h). Set in `RequestCreateContext` from `config_object->profile`, cleared in `RequestDestroyContext`. Drives image.c P010 reporting/unpack and context.c CAPTURE pix_fmt.
+- `video_format` cache invalidated on bit-depth transition (sibling to iter38's device-switch invalidation in `request_switch_device_for_profile`). Same session can now alternate Main → Main10 contexts.
+- Synthetic SPS pre-seed (α-25 lineage) extended for 10-bit: `bit_depth_luma_minus8 = 2`. Image_fmt resolution in `rkvdec-h264-common.c:196` + `rkvdec-hevc-common.c:467` dispatches on bit_depth_luma_minus8 only — profile_idc ignored, `v4l2_ctrl_hevc_sps` has no profile_idc field at all.
+- NV15 stride = V4L2-reported `destination_bytesperlines[i]` (kernel may pad above `ceil(width/4)*5`). NEVER assume `width*2`.
+- VP9 Profile 2 NOT in any path. Added comment in config.c near VAProfileVP9Profile0 case to deter future "completeness" PRs.
 
 ## Memory entries (full campaign set)
 
diff --git a/phase4_iter39_subprofile_plan.md b/phase4_iter39_subprofile_plan.md
new file mode 100644
index 0000000..54b2468
--- /dev/null
+++ b/phase4_iter39_subprofile_plan.md
@@ -0,0 +1,125 @@
+# Phase 4 — iter39 sub-profile support plan
+
+**Status:** Phase 6 LANDED at backend `662f887` on gitea master (pushed 2026-05-17).
+Phase 5 review (sonnet-architect, 3 mandatory amendments + 1 corrected claim) folded in below.
+Phase 7 test rig at `phase7_iter39_test_rig.sh`; blocked on fresnel power-on.
+
+
+
+## FR
+
+PRE_COMPACT_HANDOFF.md "Open items" #2 — H264 Hi10P, HEVC Main10, VP9 Profile 2 are advertised as HW-capable on RK3399 but the libva backend has no entries. Drop them in.
+
+## Phase 0/2 findings (locked from linux-mmind-v7.0 rkvdec source on boltzmann)
+
+`drivers/media/platform/rockchip/rkvdec/rkvdec.c` ctrl tables, with `rk3399_rkvdec_variant` binding `rkvdec_coded_fmts`:
+
+| VAProfile         | rkvdec HW | V4L2 OUTPUT pix_fmt    | rkvdec CAPTURE pix_fmt | notes |
+|-------------------|-----------|------------------------|------------------------|-------|
+| `H264High10`      | ✅ yes    | `H264_SLICE`           | `NV15` (4:2:0 10-bit)  | ctrl `cfg.max=HIGH_422_INTRA`, `bit_depth_luma_minus8==2` path live in `rkvdec-h264-common.c:196` |
+| `HEVCMain10`      | ✅ yes    | `HEVC_SLICE`           | `NV15`                 | ctrl `cfg.max=MAIN_10`, rkvdec-hevc.c:514 "only 8-bit and 10-bit are supported" |
+| `VP9Profile2`     | ❌ no     | n/a                    | n/a                    | rkvdec-vp9.c:670 "We only support profile 0"; ctrl `cfg.max=PROFILE_0`. **EXCLUDED FROM SCOPE.** |
+
+Bonus / aside: H264 4:2:2 profiles also supported by HW (NV16/NV20) but VAAPI's `VAProfileH264High422` is non-standard and most consumers won't use it. **OUT OF SCOPE.**
+
+## Architectural ripple — NV15 ↔ VA-standard pixel format
+
+The hard part. RK3399 rkvdec emits 10-bit frames as **NV15**: four 10-bit samples packed into each 5-byte group, no padding. VAAPI's standard 10-bit fourcc is **P010**: 2 bytes per sample, 10 high bits used, 6 low bits zero. Mapping requires a bit unpack pass.
+
+`/usr/include/va/va.h`:
+- `VA_FOURCC_P010 = 0x30313050` (defined, standard)
+- `VA_FOURCC_NV15` (not defined — would need inline `VA_FOURCC('N','V','1','5')`)
+- `VA_RT_FORMAT_YUV420_10 = 0x100` (defined)
+
+ffmpeg-v4l2-request kdirect path already handles NV15 → P010 internally inside `libavcodec/v4l2_request_hevc.c` family for the kdirect-test-rig codepath — so kdirect output for a Main10 fixture lands in P010 buffers already. Our libva-vs-kdirect bit-exact contract can still hold if we surface P010 too.
+
+### Four scope options
+
+**Option A — Enumerate-only.** Wire profile lists, pixelformat_for_profile, picture.c, synthetic SPS bit_depth. Skip the unpack. vainfo will list Hi10P / Main10 but actual decode emits NV15 in raw buffers that no standard VAAPI consumer understands. Misleading — not recommended.
+
+**Option B — NV15-as-FOURCC expose.** Surface decoded NV15 with a non-standard `VA_FOURCC('N','V','1','5')`. Mesa/ffmpeg-vaapi will reject in their `vaCreateImage` path. Only useful for `vaExportSurfaceHandle` (DRM-PRIME) consumers that understand the `DRM_FORMAT_NV15` fourcc — Mesa panfrost-Midgard support unclear.
+
+**Option C — Userspace unpack to P010.** Add `nv15_to_p010()` in surface.c / image.c, run in `copy_surface_to_image`. ~150 LOC bit-twiddling. Adds 1× decoded-frame-size memcpy + bit unpack per `vaDeriveImage` / `vaGetImage` call. Standard VAAPI consumers work as expected.
+
+**Option D — Skeleton only / documented partial.** Profiles enumerated, decode goes through, but `vaDeriveImage` returns `VA_STATUS_ERROR_UNIMPLEMENTED` for 10-bit surfaces with a clear log. Real downstream usage broken; flagged as known limitation in README + memory.
+
+## Recommended plan: Option C (full P010 unpack) — LOCKED
+
+Reasoning: PRE_COMPACT_HANDOFF.md marks this as "open polish item" — partial-work options A/B/D would resurface as a TODO. Sub-profile support is only useful end-to-end. Option C is the only one that lets a Main10 fixture round-trip through `ffmpeg -hwaccel vaapi -i x.hevc.10b.mp4 ...` with our backend.
+
+NV15 packing per `Documentation/userspace-api/media/v4l/pixfmt-nv15.rst` (linux-mmind-v7.0): four 10-bit samples A..D packed little-endian in 5 consecutive bytes, each byte written MSB-first as `A[7:0]`, `B[5:0] A[9:8]`, `C[3:0] B[9:6]`, `D[1:0] C[9:4]`, `D[9:2]`. Unpack one 5-byte group → 4 P010 words (2B each, value in bits [15:6], zeros in [5:0]).
+
+**Stride gotcha (per Phase 5 review)**: NV15 row stride is at least `ceil(width/4)*5` bytes (the kernel may pad it further), NOT `width*2`. The stride reported by the kernel's `G_FMT` is what lands in `destination_bytesperlines[0]`. The unpack must use that NV15 stride for source iteration and compute P010 stride independently as `width * 2`.
+
+**kdirect does NOT unpack (Phase 5 correction)**: ffmpeg-v4l2-request's hwcontext_v4l2request.c maps NV15 as `AV_PIX_FMT_YUV420P10` with DRM format `DRM_FORMAT_NV15` and exports raw DRM_PRIME. The downstream `av_frame_copy` uses libswscale's unpack — kdirect itself emits raw NV15 in DRM-PRIME buffers. **The libva backend cannot use that trick** — `vaGetImage` consumers receive a `VAImage` buffer (no AVFrame, no libswscale call). The Option C userspace unpack is the only path.
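
Reviewer sketch, not part of the patch: a minimal illustration of the group unpack and the two stride formulas discussed above. It assumes little-endian NV15 packing (first sample in the low bits of byte 0); the shifts would need mirroring for any other bit order. The helper names `nv15_min_stride`, `p010_stride` and `nv15_group_to_p010` are hypothetical and are not the backend's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum NV15 row size in bytes: every 4 samples occupy 5 bytes.
 * The kernel may report a larger (padded) stride. */
static inline size_t nv15_min_stride(unsigned int width)
{
	return (size_t)((width + 3) / 4) * 5;
}

/* P010 row size in bytes: one 16-bit word per sample. */
static inline size_t p010_stride(unsigned int width)
{
	return (size_t)width * 2;
}

/*
 * Unpack one 5-byte NV15 group into four P010 words.
 * ASSUMPTION: little-endian packing, sample 0 in the low bits of byte 0.
 */
static inline void nv15_group_to_p010(const uint8_t src[5], uint16_t dst[4])
{
	assert(src != NULL && dst != NULL);

	/* Reassemble the four 10-bit samples from the 40 packed bits. */
	uint16_t a = (uint16_t)(src[0] | ((src[1] & 0x03) << 8));
	uint16_t b = (uint16_t)((src[1] >> 2) | ((src[2] & 0x0f) << 6));
	uint16_t c = (uint16_t)((src[2] >> 4) | ((src[3] & 0x3f) << 4));
	uint16_t d = (uint16_t)((src[3] >> 6) | (src[4] << 2));

	/* P010 keeps the 10-bit value in bits [15:6]; bits [5:0] stay zero. */
	dst[0] = (uint16_t)(a << 6);
	dst[1] = (uint16_t)(b << 6);
	dst[2] = (uint16_t)(c << 6);
	dst[3] = (uint16_t)(d << 6);
}
```

For a 1280-wide luma row this gives a minimum source stride of 1600 bytes against a P010 destination stride of 2560 bytes, which is why the two strides must be tracked separately.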
+
+## Code changes (Option C, amended after Phase 5 review)
+
+### libva-v4l2-request-fourier (gitea master, claude-noether identity)
+
+**src/codec.c** — `pixelformat_for_profile`:
+- Add `case VAProfileH264High10:` → `V4L2_PIX_FMT_H264_SLICE` (same OUTPUT slice as 8-bit H.264 — bit depth is signaled via SPS contents not OUTPUT pix_fmt)
+- Add `case VAProfileHEVCMain10:` → `V4L2_PIX_FMT_HEVC_SLICE`
+
+**src/config.c**:
+- `RequestCreateConfig` switch: add the 2 profile cases (no-op validation, same shape as siblings)
+- `RequestQueryConfigProfiles`: append `VAProfileH264High10` after existing H264 block (guard `-5` → `-6`), `VAProfileHEVCMain10` after `HEVCMain` (guard `-1` → `-2`). Bump `V4L2_REQUEST_MAX_PROFILES` from 11 → 13 in `request.h`.
+- `RequestQueryConfigEntrypoints`: add the 2 profile cases
+- `RequestGetConfigAttributes` + the inline assignment in `RequestCreateConfig` (line 122): branch on profile to return `VA_RT_FORMAT_YUV420_10` instead of `VA_RT_FORMAT_YUV420` for 10-bit profiles.
+
+**src/context.c** — `RequestCreateContext`:
+- Line 111-131 CAPTURE-probe block: extend to try NV15 first for 10-bit profiles (else NULL `video_format` → NULL-deref at line 135). Profile-gated branch.
+- Line 178 `capture_pixelformat = V4L2_PIX_FMT_NV12` → branch on profile to set `V4L2_PIX_FMT_NV15` for 10-bit profiles.
+- Synthetic SPS (line 235+): add Hi10P / Main10 cases with `bit_depth_luma_minus8 = 2`, `bit_depth_chroma_minus8 = 2`. H264 `profile_idc=110` is benign-but-unnecessary per Phase 5 review (kernel ignores in `get_image_fmt`); HEVC SPS has no profile_idc field at all. Image_fmt resolution is purely on `bit_depth_luma_minus8` (==2) + `chroma_format_idc` (==1).
+
+**src/video.c — CRITICAL (Phase 5 amendment 1)**:
+- `formats[]` table (line 37) currently has only NV12 entry. **Add NV15 entry.** Without this, `video_format_find(V4L2_PIX_FMT_NV15)` at context.c:117 returns NULL → `v4l2_type_video_capture()` at context.c:135 NULL-derefs.
+
+**src/picture.c** — 5 switch blocks (lines 102, 123, 165, 210, 277). Add the new cases routing to the same per-codec function as the 8-bit profile siblings.
+
+**src/h264.c / src/h265.c** — verify `bit_depth_luma_minus8 != 0` paths exist (they do, mapped to V4L2_CTRL field). No change expected.
+
+**src/surface.c — CRITICAL (Phase 5 amendment 2)**:
+- Line 185: `if (format != VA_RT_FORMAT_YUV420) return UNSUPPORTED;` — extend to `(format != YUV420 && format != YUV420_10)`. Without this fix the 10-bit `vaCreateSurfaces` aborts before context creation.
+- `copy_surface_to_image`: branch on surface fourcc — if NV15 source, call unpack into P010 destination
+- `RequestExportSurfaceHandle` (line ~685): emit `DRM_FORMAT_P010` (copy path) or pass-through `DRM_FORMAT_NV15` for PRIME path. Ship copy path only for v1; PRIME path is follow-up.
+
+**src/image.c — CRITICAL (Phase 5 amendment 3)**:
+- `RequestDeriveImage` line ~272: hardcoded `format.fourcc = VA_FOURCC_NV12` + `bits_per_pixel = 12`. **Branch on the underlying surface's bit depth** — emit `VA_FOURCC_P010` + `bits_per_pixel = 24` for 10-bit surfaces.
+- `RequestQueryImageFormats` line ~326: currently advertises NV12 only. Extend to also advertise P010 when the active session is 10-bit. Requires per-session `is_10bit` flag on `driver_data` or a config lookup.
+
+**src/request.h or struct request_data**:
+- Add `bool is_10bit` (or equivalent — could derive from active config), set in `RequestCreateContext` based on `config_object->profile`, used by image.c branches.
+
+**src/nv15.c + src/nv15.h (new files)**:
+- `nv15_to_p010(const uint8_t *src, uint16_t *dst, unsigned int width, unsigned int height, unsigned int src_stride)` — pure C bit unpack. ~30-40 LOC for the function itself. `dst_stride = width * 2` (computed inline). `src_stride` is the value from kernel G_FMT (at least `ceil(width/4)*5`, possibly padded).
+- Two calls: luma plane (full size), chroma plane (UV interleaved, half-height).
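
Reviewer sketch, not part of the patch: one way a plane-level unpack with this parameter shape could look. It reads each sample by bit offset, so widths that are not a multiple of 4 need no special case, and it walks the source with the caller-supplied stride while writing tightly packed P010 rows. It assumes little-endian NV15 packing; the name `nv15_plane_to_p010_sketch` is hypothetical and this is not the code in commit `662f887`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: unpack one NV15 plane into P010 words.
 * ASSUMPTION: little-endian packing, sample x starts at bit x*10 of its row.
 * src_stride is the byte stride reported for the NV15 plane; destination
 * rows are written back to back (width words per row).
 * Returns 0 on success, -1 on invalid arguments.
 */
static int nv15_plane_to_p010_sketch(const uint8_t *src, uint16_t *dst,
				     unsigned int width, unsigned int height,
				     size_t src_stride)
{
	size_t min_stride = (size_t)((width + 3) / 4) * 5;

	if (src == NULL || dst == NULL || src_stride < min_stride)
		return -1;

	for (unsigned int y = 0; y < height; y++) {
		const uint8_t *row = src + (size_t)y * src_stride;
		uint16_t *out = dst + (size_t)y * width;

		for (unsigned int x = 0; x < width; x++) {
			size_t bit = (size_t)x * 10;
			size_t byte = bit / 8;
			unsigned int shift = (unsigned int)(bit % 8);
			/* A 10-bit sample never spans more than two bytes. */
			unsigned int pair = row[byte] |
					    ((unsigned int)row[byte + 1] << 8);

			/* Value goes to bits [15:6]; bits [5:0] stay zero. */
			out[x] = (uint16_t)(((pair >> shift) & 0x3ffu) << 6);
		}
	}

	return 0;
}
```

Under these assumptions the luma plane would be one call with `(width, height)` and the interleaved UV plane a second call with `(width, height / 2)`, since the chroma row carries `width / 2` U/V pairs, i.e. `width` samples.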
+
+### Diff size estimate (revised)
+- codec.c, config.c, picture.c, context.c, video.c: ~80 LOC
+- surface.c + image.c branches: ~50 LOC
+- NV15→P010 unpack (nv15.c new): ~50 LOC
+- Total: ~180 LOC (Phase 5 corrected pessimistic 150→100 for unpack itself; offset by extra video.c + image.c + surface.c amendments)
+
+## Test plan (Phase 7)
+
+**Fixtures (acquire on fresnel):**
+```bash
+# Re-encode BBB into Main10 + Hi10P
+ffmpeg -i ~/measurements/source/bbb_720p.mov -t 10 -c:v libx265 -preset fast -crf 28 \
+  -pix_fmt yuv420p10le -profile:v main10 ~/measurements/encoded/bbb_main10.mp4
+ffmpeg -i ~/measurements/source/bbb_720p.mov -t 10 -c:v libx264 -preset medium -crf 23 \
+  -pix_fmt yuv420p10le -profile:v high10 ~/measurements/encoded/bbb_hi10p.mp4
+```
+
+**Criteria:**
+1. `vainfo` lists `VAProfileHEVCMain10` + `VAProfileH264High10`.
+2. `ffmpeg -hwaccel vaapi -i bbb_main10.mp4 -vf hwdownload,format=p010le -frames:v 10 -f rawvideo /tmp/L_main10.yuv` succeeds without error.
+3. SHA matches between libva and kdirect for 10 frames each codec — **using the same `-vf hwdownload,format=p010le` on BOTH paths** (kdirect emits NV15 via DRM-PRIME and libswscale unpacks via the format filter; libva emits P010 directly via our new unpack). The format filter normalizes both into P010 byte stream.
+4. SSIM_Y ≥ 0.999 against a libavcodec SW decode (`-pix_fmt yuv420p10le`) of the fixtures encoded above; convert P010 to YUV420P10 in the compare step before computing SSIM.
+5. No regression — 5/5 PASS still holds on the existing 5-codec smoke (run after changes).
+
+## Open / pending decisions
+
+- **PRIME path vs copy path for 10-bit**: ship copy-only first (P010 derived/created images), defer PRIME path for a follow-up. Many consumers read frames back with `vaGetImage` / `vaDeriveImage` rather than `vaExportSurfaceHandle`, so this covers most cases.
+- **VP9 Profile 2**: confirmed HW-unsupported on RK3399; do NOT add to enumeration. Add an explicit `/* not on RK3399 rkvdec */` comment in config.c near the VP9 line to prevent future "completeness" PRs adding it.
diff --git a/phase7_iter39_test_rig.sh b/phase7_iter39_test_rig.sh
new file mode 100755
index 0000000..317973a
--- /dev/null
+++ b/phase7_iter39_test_rig.sh
@@ -0,0 +1,163 @@
+#!/bin/bash
+#
+# Phase 7 test rig for iter39 sub-profile support (Hi10P + Main10).
+# Run on fresnel after kernel 7.0-14 boot + iter39 backend deploy.
+#
+# Prerequisites on fresnel:
+#   1. linux-fresnel-fourier 7.0-14 booted
+#   2. backend sync + build:
+#        cd ~/src/libva-v4l2-request-fourier
+#        git fetch && git reset --hard origin/master   # → 662f887
+#        ninja -C build
+#        sudo install -m644 build/src/v4l2_request_drv_video.so /usr/lib/dri/
+#   3. Test fixtures (will be encoded by this script if missing):
+#        ~/measurements/encoded/bbb_main10.mp4 (HEVC Main10, 10-bit)
+#        ~/measurements/encoded/bbb_hi10p.mp4 (H.264 Hi10P, 10-bit)
+#
+# Exit code 0 = all criteria PASS.
+
+set -eu
+
+ENCODED_DIR="${ENCODED_DIR:-$HOME/measurements/encoded}"
+SOURCE="${SOURCE:-$HOME/measurements/source/bbb_720p.mov}"
+OUT="${OUT:-/tmp/iter39_test}"
+
+mkdir -p "$ENCODED_DIR" "$OUT"
+
+# Step 1 — encode 10-bit fixtures if missing.
+if [ ! -f "$ENCODED_DIR/bbb_main10.mp4" ]; then
+  echo "==> Encoding HEVC Main10 fixture"
+  ffmpeg -hide_banner -loglevel error -i "$SOURCE" -t 10 \
+    -c:v libx265 -preset fast -crf 28 \
+    -pix_fmt yuv420p10le -profile:v main10 \
+    -tag:v hvc1 "$ENCODED_DIR/bbb_main10.mp4"
+fi
+if [ ! -f "$ENCODED_DIR/bbb_hi10p.mp4" ]; then
+  echo "==> Encoding H.264 Hi10P fixture"
+  ffmpeg -hide_banner -loglevel error -i "$SOURCE" -t 10 \
+    -c:v libx264 -preset medium -crf 23 \
+    -pix_fmt yuv420p10le -profile:v high10 \
+    "$ENCODED_DIR/bbb_hi10p.mp4"
+fi
+
+# Step 2 — vainfo enumeration check.
+echo ""
+echo "==> Criterion 1: vainfo lists VAProfileHEVCMain10 + VAProfileH264High10"
+VAINFO_OUT="$(env LIBVA_DRIVER_NAME=v4l2_request \
+  LIBVA_DRIVERS_PATH="$HOME/src/libva-v4l2-request-fourier/build/src" \
+  vainfo 2>&1)"
+echo "$VAINFO_OUT" | grep -E "VAProfile(H264High10|HEVCMain10)" || {
+  echo "FAIL: missing 10-bit profile in vainfo output"
+  echo "$VAINFO_OUT" | grep VAProfile | sed 's/^/ /'
+  exit 1
+}
+echo "    -> PASS"
+
+# Step 3 — per-codec libva-vs-kdirect bit-exactness (Sub-criterion: P010 layout).
+echo ""
+echo "==> Criteria 2-3: libva.P010 == kdirect.P010 for HEVC Main10 + H264 Hi10P"
+
+run_codec() {
+  local name="$1" fixture="$2"
+  echo "  -- $name ($fixture) --"
+
+  # libva path (our iter39 backend, NV15→P010 unpack)
+  env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_DRIVERS_PATH="$HOME/src/libva-v4l2-request-fourier/build/src" \
+    ffmpeg -hide_banner -loglevel error -y \
+    -hwaccel vaapi -hwaccel_output_format vaapi \
+    -i "$ENCODED_DIR/$fixture" \
+    -vf "hwdownload,format=p010le" -frames:v 10 \
+    -f rawvideo -pix_fmt p010le "$OUT/L_${name}.p010" || {
+    echo "    libva path FAIL"; return 1
+  }
+
+  # kdirect path (ffmpeg-v4l2-request hwaccel, libswscale unpacks NV15→YUV420P10)
+  ffmpeg -hide_banner -loglevel error -y \
+    -hwaccel v4l2request -hwaccel_output_format drm_prime \
+    -i "$ENCODED_DIR/$fixture" \
+    -vf "hwdownload,format=p010le" -frames:v 10 \
+    -f rawvideo -pix_fmt p010le "$OUT/K_${name}.p010" || {
+    echo "    kdirect path FAIL"; return 1
+  }
+
+  L_SHA=$(sha256sum "$OUT/L_${name}.p010" | cut -c1-16)
+  K_SHA=$(sha256sum "$OUT/K_${name}.p010" | cut -c1-16)
+  if [ "$L_SHA" = "$K_SHA" ]; then
+    echo "    PASS libva.P010 == kdirect.P010 sha=$L_SHA"
+    return 0
+  else
+    echo "    FAIL libva.P010 != kdirect.P010"
+    echo "      libva sha=$L_SHA"
+    echo "      kdirect sha=$K_SHA"
+    return 1
+  fi
+}
+
+FAIL_COUNT=0
+run_codec hevc_main10 bbb_main10.mp4 || FAIL_COUNT=$((FAIL_COUNT + 1))
+run_codec h264_hi10p bbb_hi10p.mp4 || FAIL_COUNT=$((FAIL_COUNT + 1))
+
+# Step 4 — SSIM vs SW reference (criterion 4)
+echo ""
+echo "==> Criterion 4: SSIM_Y >= 0.999 vs libavcodec SW (yuv420p10le)"
+
+run_ssim() {
+  local name="$1" fixture="$2" SSIM
+  ffmpeg -hide_banner -loglevel error -y \
+    -i "$ENCODED_DIR/$fixture" \
+    -vf "format=yuv420p10le" -frames:v 10 \
+    -f rawvideo -pix_fmt yuv420p10le "$OUT/SW_${name}.yuv" || return 1
+
+  # Compare P010 (libva) to YUV420P10 (SW) using ffmpeg's ssim filter
+  # — feed both as raw with known shape, force same conversion.
+  SSIM=$(ffmpeg -hide_banner -loglevel info \
+    -f rawvideo -pix_fmt p010le -s 1280x720 -i "$OUT/L_${name}.p010" \
+    -f rawvideo -pix_fmt yuv420p10le -s 1280x720 -i "$OUT/SW_${name}.yuv" \
+    -filter_complex "[0:v]format=yuv420p10le[a];[a][1:v]ssim" \
+    -f null - 2>&1 | grep -oE "SSIM Y:[0-9.]+" | head -1 | cut -d: -f2)
+  echo "    $name: SSIM_Y=$SSIM"
+  awk -v s="$SSIM" 'BEGIN { exit !(s + 0 >= 0.999) }'
+}
+
+run_ssim hevc_main10 bbb_main10.mp4 || FAIL_COUNT=$((FAIL_COUNT + 1))
+run_ssim h264_hi10p bbb_hi10p.mp4 || FAIL_COUNT=$((FAIL_COUNT + 1))
+
+# Step 5 — 5-codec smoke regression check (criterion 5)
+echo ""
+echo "==> Criterion 5: 5/5 PASS still holds for the iter38 baseline codecs"
+for codec in h264:bbb_1080p30_h264.mp4 hevc:bbb_720p10s_hevc.mp4 \
+  vp9:bbb_720p10s_vp9.webm vp8:bbb_720p10s_vp8.webm \
+  mpeg2:bbb_720p10s_mpeg2.ts; do
+  name="${codec%%:*}"; fixture="${codec#*:}"
+  [ -f "$HOME/fourier-test/$fixture" ] || { echo "    skip $name: fixture missing"; continue; }
+
+  env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_DRIVERS_PATH="$HOME/src/libva-v4l2-request-fourier/build/src" \
+    ffmpeg -hide_banner -loglevel error -y \
+    -hwaccel vaapi -hwaccel_output_format vaapi \
+    -i "$HOME/fourier-test/$fixture" \
+    -vf "hwdownload,format=nv12" -frames:v 10 \
+    -f rawvideo -pix_fmt nv12 "$OUT/L_${name}.nv12"
+  ffmpeg -hide_banner -loglevel error -y \
+    -hwaccel v4l2request -hwaccel_output_format drm_prime \
+    -i "$HOME/fourier-test/$fixture" -vf "hwdownload,format=nv12" \
+    -frames:v 10 -f rawvideo -pix_fmt nv12 "$OUT/K_${name}.nv12"
+  L=$(sha256sum "$OUT/L_${name}.nv12" | cut -c1-16)
+  K=$(sha256sum "$OUT/K_${name}.nv12" | cut -c1-16)
+  if [ "$L" = "$K" ]; then
+    echo "    $name: PASS"
+  else
+    echo "    $name: FAIL (libva=$L kdirect=$K)"
+    FAIL_COUNT=$((FAIL_COUNT + 1))
+  fi
+done
+
+echo ""
+if [ "$FAIL_COUNT" -eq 0 ]; then
+  echo "iter39 close: ALL PASS — Hi10P + Main10 wired + iter38 5/5 baseline holds"
+  exit 0
+else
+  echo "iter39 FAIL: $FAIL_COUNT criteria failed; see $OUT for artefacts"
+  exit 1
+fi