From c839b9456eee567717d6cc4351a9223ef19d0160 Mon Sep 17 00:00:00 2001
From: claude-noether
Date: Sun, 17 May 2026 12:12:23 +0000
Subject: [PATCH] ampere-av1 Phase 3 finding: iter2 Fix 3 release is NOT the divergence cause
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Investigated whether picture.c::BeginPicture's iter2 Fix 3
release-on-rebind was causing AV1 inter-frame divergence on
av1_larger.ivf (film_grain stress vector). Added env-gated
LIBVA_SKIP_REBIND=1 experiment (leak old slot instead of release); A/B
run showed identical 3/10 PASS count with and without the release.
Hypothesis disproved.

Where the divergence actually lives:
- patched ffmpeg-v4l2-request-fourier libavcodec.so with a fwrite diag
  in ff_v4l2_request_append_output → 7 dump files for the -frames:v 5
  kdirect run, sizes [15133, 3670, 1970, 1323, 812, 886, 1310]
  BYTE-IDENTICAL to our LIBVA_V4L2_DUMP_OUTPUT first 7 submissions for
  the same input
- our backend has 2 EXTRA EndPicture calls (t8 size 824, t9 size 487)
  on RE-USED surfaces (0x4000008 and 0x4000006)
- the extras happen because ffmpeg-vaapi's AV1 hwaccel issues redecode
  requests onto surfaces that already hold frames the consumer hasn't
  downloaded yet
- SKIP_REBIND should let those redecodes' slots stay around but doesn't
  help, because surface_object->current_slot can only point at ONE slot
  at a time and bind_slot overwrites it

True root cause: ffmpeg-vaapi AV1 hwaccel's surface accounting is
incompatible with the iter2 Fix 3 1:1 surface↔slot invariant when the
stream has show_existing_frame frames. Fix would need either (a)
cap_pool tracking N surfaces per slot, or (b) backend reading
ffmpeg-vaapi's display-order mapping and remapping slots accordingly.
Both are non-trivial Phase 4 work — outside this iteration's scope.

Reverted the LIBVA_SKIP_REBIND env-gate to clean shape. Comment updated
with the investigation outcome so the next session has the context
without rediscovering.

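Why skipping the release cannot help may be easier to see in a toy
model. The sketch below is not the driver code: struct slot, struct
surface and old_slot_reachable_after_rebind are hypothetical stand-ins,
and only the single current_slot pointer and the overwrite-on-bind
behaviour are taken from the notes above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's slot and surface objects. */
struct slot {
	int v4l2_index;
	int in_use;
};

struct surface {
	struct slot *current_slot; /* the only link from a surface to a slot */
};

/* Bind overwrites the surface's single pointer, as described for bind_slot. */
static void bind_slot(struct surface *s, struct slot *sl)
{
	assert(s != NULL && sl != NULL);
	sl->in_use = 1;
	s->current_slot = sl;
}

/* Release-on-rebind: hand the old slot back before the new bind. */
static void unbind_slot(struct surface *s)
{
	if (s->current_slot != NULL) {
		s->current_slot->in_use = 0;
		s->current_slot = NULL;
	}
}

/*
 * Decode twice onto the same surface. skip_release != 0 models the
 * LIBVA_SKIP_REBIND=1 experiment. Returns 1 if the surface can still
 * reach the first slot afterwards; *old_in_use reports whether that
 * slot is still marked in use (leaked) or was released.
 */
static int old_slot_reachable_after_rebind(int skip_release, int *old_in_use)
{
	struct slot first = { 0, 0 };
	struct slot second = { 1, 0 };
	struct surface surf = { NULL };

	bind_slot(&surf, &first);
	if (!skip_release)
		unbind_slot(&surf);
	bind_slot(&surf, &second);

	*old_in_use = first.in_use;
	return surf.current_slot == &first;
}
```

In this model the function returns 0 for both settings of skip_release;
the only difference is whether the first slot ends up released or
leaked. Either way the surface has lost its link to the earlier frame,
which is consistent with the unchanged PASS count.
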
State: 3/10 av1_larger frames bit-exact (frames 0/2/4, the apply_grain=1
IDR-derived ones). test_av1.ivf 208x208 still bit-exact PASS (no
regression). diagnostic logs in BeginPicture + surface_unbind_slot +
v4l2_ioctl_controls retained for future investigation.

Co-Authored-By: Claude Opus 4.7
---
 src/picture.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/picture.c b/src/picture.c
index 013371a..785b4a4 100644
--- a/src/picture.c
+++ b/src/picture.c
@@ -378,6 +378,15 @@ VAStatus RequestBeginPicture(VADriverContextP context, VAContextID context_id,
 	 * first. The new slot is bound and its V4L2 index + mmap pointers
 	 * are mirrored into surface_object->destination_* so the existing
 	 * QBUF/DQBUF/EXPBUF code paths see no behavioral change.
+	 *
+	 * AV1 Phase 3 finding: LIBVA_SKIP_REBIND=1 experiment (do NOT
+	 * unbind on rebind) did not improve PASS count for the av1_larger
+	 * film_grain stress vector — proving the iter2 Fix 3 release is
+	 * NOT the source of the inter-frame divergence. The issue is
+	 * deeper in ffmpeg-vaapi's AV1 hwaccel: per byte-equal OUTPUT
+	 * comparison with the patched-ffmpeg-v4l2request reference run
+	 * (LD_LIBRARY_PATH override on a debug libavcodec.so), 7/7 first
+	 * EndPicture submissions are byte-identical, libva has 2 EXTRA.
 	 */
 	if (surface_object->current_slot != NULL)
 		surface_unbind_slot(driver_data, surface_object);