marfrit c6d55bce29 Phase 3 close: AV1 PASS on all-intra (10/10) and grain-IDR; film_grain+show_existing edge case localized
Major iteration result on the av1-iter1 backend branch (tip c839b94 on
both ampere and noether). AV1 hardware decode is FUNCTIONALLY WORKING
for the common cases:

  Fixture                                      Result
  test_av1.ivf (2 frames, no grain)            bit-exact PASS 2/2
  av1-1-b8-02-allintra.ivf (39 all-intra)      bit-exact PASS 10/10
  av1_larger.ivf (film_grain + show_existing)  3/10 PASS (apply_grain=1 IDR-derived)
  av1-1-b10-23-film_grain-50.ivf (10-bit)      both libva + kdirect 0 bytes (vpu981 may not support)

The 10/10 all-intra PASS is the load-bearing validation: it proves
our backend's V4L2 control submission, OUTPUT byte assembly, surface
management, reference timestamp plumbing, and per-codec dispatch
are all correct for the common AV1 case.

The remaining 7/10 divergence on the film_grain+show_existing
fixture is localized via patched-libavcodec dump (LD_LIBRARY_PATH
override on debug fwrite-instrumented libavcodec.so) to:
  - First 7 EndPicture submissions byte-IDENTICAL to kdirect for
    SEQUENCE + FRAME + TILE_GROUP_ENTRY + FILM_GRAIN ctrls AND for
    OUTPUT byte payload.
  - libva has 2 EXTRA EndPicture calls on REUSED surfaces (the
    ffmpeg-vaapi AV1 hwaccel's show_existing_frame handling).
  - iter2 Fix 3 release-on-rebind FALSIFIED as the cause
    (LIBVA_SKIP_REBIND=1 A/B identical to default).

Fix space (Phase 4): cap_pool refactor to track N surfaces per
slot, OR ffmpeg-vaapi AV1 hwaccel surface-allocation change.

All diagnostic infrastructure retained for next iteration:
  /tmp/diff_av1_ctrls.py on ampere (per-CID strace byte diff)
  /tmp/ivf_split.py on ampere (per-frame IVF extraction)
  LIBVA_V4L2_DUMP_OUTPUT env on backend (libva-side OUTPUT bytes)
  patched libavcodec build instructions in close doc
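
For orientation, per-frame IVF extraction can be sketched as below. This
is NOT the contents of /tmp/ivf_split.py, which is not reproduced here;
it is a minimal illustration that assumes the standard IVF layout (a
32-byte "DKIF" file header followed by 12-byte frame headers carrying
size and pts). The function names are hypothetical.

```python
import struct
from typing import Iterator, List, Tuple

# Standard IVF layout (assumed): 32-byte file header, 12-byte frame headers.
IVF_FILE_HEADER = struct.Struct("<4sHH4sHHIII4x")  # sig, ver, hdr_len, fourcc, w, h, den, num, nframes
IVF_FRAME_HEADER = struct.Struct("<IQ")            # payload size, presentation timestamp


def iter_ivf_frames(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (pts, payload) for every frame in an IVF byte string."""
    sig, _ver, hdr_len, *_rest = IVF_FILE_HEADER.unpack_from(data, 0)
    if sig != b"DKIF":
        raise ValueError("not an IVF file")
    off = hdr_len
    while off + IVF_FRAME_HEADER.size <= len(data):
        size, pts = IVF_FRAME_HEADER.unpack_from(data, off)
        off += IVF_FRAME_HEADER.size
        payload = data[off:off + size]
        if len(payload) != size:
            raise ValueError("truncated frame payload")
        yield pts, payload
        off += size


def split_ivf(data: bytes) -> List[bytes]:
    """Return one single-frame IVF blob per input frame (frame count patched to 1)."""
    hdr_len = IVF_FILE_HEADER.unpack_from(data, 0)[2]
    header = data[:24] + struct.pack("<I", 1) + data[28:hdr_len]
    return [header + IVF_FRAME_HEADER.pack(len(p), pts) + p
            for pts, p in iter_ivf_frames(data)]
```

Note that a single-frame file produced this way is only a container-level
split; whether it decodes standalone depends on the OBUs it carries.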

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 12:14:03 +00:00

ampere-av1-enablement

AV1 hardware decode verification on Rockchip RK3588 ampere (CoolPi CM5 GenBook).

Status (2026-05-17 09:00)

VERIFIED WORKING, bit-perfect on the first try, using mainline 7.0.0-rc3 and the ffmpeg-v4l2request kdirect path on the hantro vpu981 AV1 driver. Zero new code required.

The sibling campaign ampere-vp9-enablement closed at structural impossibility on rkvdec/vdpu381 VP9. The Janet PIVOT verdict pointed to AV1, and verification confirmed that AV1 works out of the box.

Verification

$ ssh ampere
$ ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime \
    -i /tmp/av1_larger.ivf -vf 'hwdownload,format=nv12' \
    -f rawvideo -pix_fmt nv12 /tmp/hw-av1-all.nv12
[AVHWFramesContext] Using V4L2 media driver hantro-vpu (7.0.0) for AV1F

Byte-compare against ffmpeg's libdav1d SW reference, all 10 frames of the av1-1-b8-23-film_grain-50.ivf test vector (352×288, includes film-grain feature):

frame 0: exact=100.0000%
frame 1: exact=100.0000%
... 
frame 9: exact=100.0000%

Smaller test vector (av1-1-b8-01-size-208x208.ivf, 2 frames): also 100% match.
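
The per-frame exact-match figures above can be reproduced with a comparison along the following lines. This is a minimal sketch, not the script used for the campaign: it assumes both files are tightly packed raw NV12 (width × height × 3/2 bytes per frame, even dimensions), and the function names and the software-reference path in the usage note are hypothetical.

```python
import sys
from typing import List


def nv12_frame_size(width: int, height: int) -> int:
    """Bytes per tightly packed NV12 frame: full-size Y plane plus half-size interleaved UV."""
    return width * height * 3 // 2


def per_frame_exact(hw: bytes, sw: bytes, width: int, height: int) -> List[float]:
    """Percentage of byte positions that match, for each frame present in both buffers."""
    size = nv12_frame_size(width, height)
    frames = min(len(hw), len(sw)) // size
    result = []
    for i in range(frames):
        a = hw[i * size:(i + 1) * size]
        b = sw[i * size:(i + 1) * size]
        same = sum(x == y for x, y in zip(a, b))
        result.append(100.0 * same / size)
    return result


if __name__ == "__main__":
    # Usage: compare.py HW.nv12 SW.nv12 WIDTH HEIGHT
    hw_path, sw_path, w, h = sys.argv[1], sys.argv[2], int(sys.argv[3]), int(sys.argv[4])
    with open(hw_path, "rb") as f_hw, open(sw_path, "rb") as f_sw:
        for n, pct in enumerate(per_frame_exact(f_hw.read(), f_sw.read(), w, h)):
            print(f"frame {n}: exact={pct:.4f}%")
```

For the 352×288 vector this would be invoked with /tmp/hw-av1-all.nv12, a software-decoded reference file (for example a hypothetical /tmp/sw-av1-all.nv12), and the dimensions 352 288. A frame-count mismatch between the two files is not reported by this sketch and should be checked separately.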

Driver stack

  • IP: vpu981 (Rockchip's dedicated AV1 hardware on RK3588, MMIO at fdc70000)
  • Kernel driver: drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c (in-tree)
  • DT compatible: presumably rockchip,rk3588-av1-vpu, the string mainline uses for this block (not checked on the device; driver load verified via lsmod | grep hantro_vpu)
  • V4L2 node: /dev/video4 (enumerates AV1F format)
  • Userspace path: ffmpeg -hwaccel v4l2request (kdirect, not libva)

What's NOT done

  • libva-v4l2-request-fourier backend AV1 dispatch — backend has no AV1 codec module. ffmpeg-v4l2request kdirect works without libva. Adding libva AV1 support would make AV1 available to other VAAPI consumers (VLC, mpv with VA-API, GStreamer-VAAPI, browsers via VAAPI/VDPAU). Estimated effort: 1-2 days mirroring the existing HEVC/H.264/VP9 dispatch patterns in ~/src/libva-v4l2-request-fourier/src/ (sibling repo).
  • Fluster AV1-TEST-VECTORS comprehensive validation (only ran 2 of the AOM test vectors).
  • Stress test (long bitstream, 1080p+, complex features beyond film_grain).

Out of scope

  • Adding AV1 to rkvdec/vdpu381 (would duplicate the working hantro-vpu981 path)
  • Reviving the failed VP9 work from sibling campaign

Process

This is a verification campaign more than an enablement campaign. The work was done upstream by Verisilicon + Collabora; this repo documents that it works on ampere out-of-the-box.
