commit f9562972bdda0b885a317d29c003f3f493c9f115
Author: Claude (noether)
Date:   Sun May 17 07:00:02 2026 +0000

    Phase 0: AV1 hw decode on ampere VERIFIED bit-perfect first-try

    Pivoted from ampere-vp9-enablement (closed at structural impossibility
    on rkvdec/vdpu381). Janet PIVOT verdict pointed at AV1; verification
    shows it works out-of-the-box on mainline 7.0.0-rc3:

    - Kernel driver:
      drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
      (in-tree, loaded as hantro-vpu)
    - Hardware: vpu981 dedicated AV1 IP at fdc70000 (separate from rkvdec)
    - V4L2 node: /dev/video4 enumerates AV1F format
    - Userspace: ffmpeg -hwaccel v4l2request kdirect path works

    Verification: byte-compare HW (hantro-vpu) vs SW (libdav1d) on two AOM
    test vectors:

    - av1-1-b8-01-size-208x208.ivf (2 frames): 100.0000% exact match
    - av1-1-b8-23-film_grain-50.ivf (10 frames): 100.0000% exact match
      per frame, including AV1 film_grain post-processing

    Pivot outcome:

    - VP9 campaign: 10 iterations + 2 architect reviews → structural
      impossibility (kernel-side gap that needs upstream/Collabora coord)
    - AV1 verification: 0 iterations → bit-perfect first try

    The "enablement campaign" framing is mostly inappropriate for AV1:
    this is a verification campaign. Real upstream work was done by
    Verisilicon + Collabora; we just confirm it works on ampere.

    Optional follow-ups (out of Phase 0 scope):

    A. libva backend AV1 dispatch (enables VAAPI consumers)
    B. Fluster AV1-TEST-VECTORS comprehensive validation
    C. 1080p/4K real-world AV1 stress test

    Co-Authored-By: Claude Opus 4.7

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5e17ec1
--- /dev/null
+++ b/README.md
@@ -0,0 +1,53 @@
+# ampere-av1-enablement
+
+AV1 hardware decode verification on Rockchip RK3588 ampere (CoolPi CM5 GenBook).
+
+## Status (2026-05-17 09:00)
+
+**VERIFIED WORKING bit-perfect first-try** using mainline 7.0.0-rc3 + ffmpeg-v4l2request kdirect path on the hantro `vpu981` AV1 driver. Zero new code required.
+
+Sibling campaign [ampere-vp9-enablement](https://git.reauktion.de/claude-noether/ampere-vp9-enablement) closed at structural-impossibility on rkvdec/vdpu381 VP9; Janet PIVOT verdict pointed to AV1; verification confirmed AV1 works out-of-the-box.
+
+## Verification
+
+```
+$ ssh ampere
+$ ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime \
+    -i /tmp/av1_larger.ivf -vf 'hwdownload,format=nv12' \
+    -f rawvideo -pix_fmt nv12 /tmp/hw-av1-all.nv12
+[AVHWFramesContext] Using V4L2 media driver hantro-vpu (7.0.0) for AV1F
+```
+
+Byte-compare against ffmpeg's libdav1d SW reference, all 10 frames of the av1-1-b8-23-film_grain-50.ivf test vector (352×288, includes film-grain feature):
+
+```
+frame 0: exact=100.0000%
+frame 1: exact=100.0000%
+...
+frame 9: exact=100.0000%
+```
+
+Smaller test vector (av1-1-b8-01-size-208x208.ivf, 2 frames): also 100% match.
+
+## Driver stack
+
+- IP: `vpu981` (Rockchip's dedicated AV1 hardware on RK3588, MMIO at fdc70000)
+- Kernel driver: `drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c` (in-tree)
+- DT compatible: presumably `rockchip,rk3588-av1-vpu`, the mainline binding for this IP (driver module verified loaded via `lsmod | grep hantro_vpu`)
+- V4L2 node: `/dev/video4` (enumerates AV1F format)
+- Userspace path: ffmpeg `-hwaccel v4l2request` (kdirect, not libva)
+
+## What's NOT done
+
+- **libva-v4l2-request-fourier backend AV1 dispatch**: the backend has no AV1 codec module. ffmpeg-v4l2request kdirect works without libva. Adding libva AV1 support would make AV1 available to other VAAPI consumers (VLC, mpv with VA-API, GStreamer-VAAPI, browsers via VAAPI/VDPAU). Estimated effort: 1-2 days mirroring the existing HEVC/H.264/VP9 dispatch patterns in `~/src/libva-v4l2-request-fourier/src/` (sibling repo).
+- Fluster `AV1-TEST-VECTORS` comprehensive validation (only ran 2 of the AOM test vectors).
+- Stress test (long bitstream, 1080p+, complex features beyond film_grain).
+
+## Out of scope
+
+- Adding AV1 to rkvdec/vdpu381 (would duplicate the working hantro-vpu981 path)
+- Reviving the failed VP9 work from sibling campaign
+
+## Process
+
+This is a verification campaign more than an enablement campaign. The work was done upstream by Verisilicon + Collabora; this repo documents that it works on ampere out-of-the-box.
diff --git a/phase0_findings.md b/phase0_findings.md
new file mode 100644
index 0000000..f270a52
--- /dev/null
+++ b/phase0_findings.md
@@ -0,0 +1,78 @@
+# Phase 0 findings: AV1 on RK3588 ampere is already a working mainline path
+
+Date: 2026-05-17 09:00. Campaign opened immediately after the sibling ampere-vp9-enablement closed at structural-impossibility (10 failed iterations of VP9-on-vdpu381 register tuning; Janet PIVOT verdict recommended AV1).
+
+## Substrate
+
+| Component | Where | Status |
+|---|---|---|
+| Kernel AV1 driver | `drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c` (mainline 7.0) | Loaded as `hantro-vpu` module |
+| V4L2 device node | `/dev/video4` on ampere | Enumerates `AV1F (AV1 Frame, compressed)` |
+| Hardware IP | vpu981 @ fdc70000 (RK3588 dedicated AV1 decoder) | Active, IOMMU mapped |
+| Userspace path 1 | ffmpeg `-hwaccel v4l2request` (kdirect) | **Bit-perfect verified** |
+| Userspace path 2 | libva `-hwaccel vaapi` via libva-v4l2-request-fourier | NOT implemented (no `av1.c` in backend) |
+
+## Verification methodology (per [feedback_compare_hw_against_sw_reference](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_compare_hw_against_sw_reference.md))
+
+Two AOM test vectors decoded twice: once via HW (`v4l2request → hantro-vpu`) and once via SW (libdav1d). Byte-compare the raw NV12 outputs.
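
Reviewer sketch (not part of the commit): the per-frame byte-compare described above could look roughly like the following. The commit does not include the comparison script, so the function names and the error handling here are assumptions; the only things taken from the text are the raw NV12 dumps and the per-frame `exact=…%` figure. Even frame dimensions and 8-bit samples are assumed.

```python
def nv12_frame_size(width: int, height: int) -> int:
    """Bytes in one 8-bit NV12 frame (even dimensions assumed)."""
    # Full-resolution Y plane plus an interleaved UV plane of half that size.
    return width * height * 3 // 2


def per_frame_exact_match(hw: bytes, sw: bytes, width: int, height: int) -> list[float]:
    """Return, for each frame, the percentage of bytes identical in both dumps."""
    frame_size = nv12_frame_size(width, height)
    if len(hw) != len(sw):
        raise ValueError(f"size mismatch: HW {len(hw)} bytes, SW {len(sw)} bytes")
    if len(hw) == 0 or len(hw) % frame_size != 0:
        raise ValueError("dump is empty or not a whole number of frames")

    percentages = []
    for offset in range(0, len(hw), frame_size):
        hw_frame = hw[offset:offset + frame_size]
        sw_frame = sw[offset:offset + frame_size]
        # Count byte positions where both decoders produced the same value.
        identical = sum(a == b for a, b in zip(hw_frame, sw_frame))
        percentages.append(100.0 * identical / frame_size)
    return percentages


def format_report(percentages: list[float]) -> str:
    """Render one line per frame, e.g. 'frame 0: exact=100.0000%'."""
    return "\n".join(
        f"frame {index}: exact={pct:.4f}%" for index, pct in enumerate(percentages)
    )
```

To use it, read both dumps with `open(path, "rb").read()`, pass them with the stream's width and height, and print `format_report(...)`. Reporting a percentage per frame rather than a single pass/fail makes it easier to see whether a mismatch is confined to one frame or spreads through later frames via reference-frame dependencies.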
+
+### Test 1: av1-1-b8-01-size-208x208.ivf (small smoke test)
+
+- 2 frames, 208×208 yuv420p
+- Both HW and SW produced 129792 bytes (= 2 × 208 × 208 × 1.5)
+- **100.0000% exact match**
+
+### Test 2: av1-1-b8-23-film_grain-50.ivf (complex AV1 feature: film grain)
+
+- 10 frames, 352×288 yuv420p, includes AV1 film_grain synthesis
+- Both produced 1520640 bytes (= 10 × 352 × 288 × 1.5)
+- **All 10 frames: 100.0000% exact match**
+
+Film grain is one of AV1's more complex post-processing features (decoder-side noise synthesis with provided model parameters); bit-perfect match here is a strong signal that the HW decoder's spec compliance is solid.
+
+## What this verifies
+
+1. The mainline `rockchip_vpu981_hw_av1_dec.c` driver is functional on ampere's 7.0.0-rc3 kernel
+2. The hantro-vpu framework correctly routes AV1 to vpu981 (not to vdpu381 rkvdec)
+3. ffmpeg's v4l2request hwaccel correctly invokes the kernel AV1 path
+4. HW output matches libdav1d SW reference at byte level
+
+## What this does NOT verify
+
+- libva backend AV1 support (still completely absent from `libva-v4l2-request-fourier`)
+- Higher resolutions (1080p/4K AV1; current tests are 208×208 + 352×288)
+- Longer sequences (current tests are 2 + 10 frames)
+- 10-bit content (still Main profile 0) and Profiles 1/2 (4:4:4, 4:2:2, 12-bit); current tests are 8-bit Profile 0
+- Real-time-ish workload (current tests are tiny vectors)
+
+## Why this campaign closed at Phase 0
+
+Unlike VP9 (where the gap was kernel-side register-layout incompleteness requiring multi-week porting), AV1 was already enabled by upstream Verisilicon + Collabora work and just needed verification. The "enablement campaign" framing is mostly inappropriate here; the right word is "verification campaign".
+
+## Open scope (optional follow-ups)
+
+### A: Libva backend AV1 support (multi-day work)
+
+Add `~/src/libva-v4l2-request-fourier/src/av1.{c,h}` to dispatch VAAPI AV1 controls (VADecPictureParameterBufferAV1, VASliceParameterBufferAV1, etc.) onto the V4L2_CID_STATELESS_AV1_* control set. Pattern mirrors existing VP9 + HEVC dispatch. Would enable VAAPI consumers (VLC, mpv `--hwdec=vaapi`, GStreamer-VAAPI, browsers).
+
+Useful if the ampere desktop has consumers that prefer VAAPI over direct v4l2request.
+
+### B: Fluster AV1-TEST-VECTORS comprehensive run
+
+Set up `gst-launch-1.0` + Fluster GStreamer-AV1-V4L2SL-Gst1.0 test runner. Get a quantitative pass-rate against the canonical AV1 conformance suite.
+
+### C: Real-world bitstream verification
+
+Encode + decode a real 1080p/4K AV1 sample (e.g., a 30s clip from a Netflix-encoded AV1 sample, if available). Verify bit-perfect at scale.
+
+## Persistent state
+
+- **Repo**: `git.reauktion.de/claude-noether/ampere-av1-enablement` (this README + Phase 0 close)
+- **Local working dir**: `fresnel:~/src/ampere-av1-enablement/`
+- **Ampere**: restored to sibling-campaign-close kernel module (HEVC bit-perfect retained). AV1 path is hantro-vpu (separate from rkvdec which the VP9 campaign attempted to modify).
+- **Test vectors**: `/tmp/test_av1.ivf` + `/tmp/av1_larger.ivf` on both fresnel and ampere
+- **Outputs**: `/tmp/hw-av1*.nv12` + `/tmp/sw-av1*.nv12` on ampere (~1.5MB total)
+
+## Verdict
+
+AV1 hardware decode on ampere is **WORKING** via mainline + ffmpeg-v4l2request, bit-perfect against SW reference. The pivot from VP9 succeeded.
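
Reviewer sketch (not part of the commit): the byte counts quoted for Test 1 and Test 2 follow from the NV12 layout, and checking the dump size before comparing content is a cheap guard against a truncated decode passing unnoticed. The helper names below are hypothetical; only the frame counts, resolutions and expected totals come from the findings above. 8-bit samples are assumed.

```python
import os


def expected_nv12_bytes(frames: int, width: int, height: int) -> int:
    """Total size of `frames` 8-bit NV12 frames at the given resolution."""
    luma = width * height
    # One interleaved UV plane: two samples per 2x2 luma block,
    # with odd dimensions rounded up.
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return frames * (luma + chroma)


def output_is_complete(path: str, frames: int, width: int, height: int) -> bool:
    """True if the dump at `path` holds exactly the expected number of frames."""
    return os.path.getsize(path) == expected_nv12_bytes(frames, width, height)


# The two totals reported in the findings.
SMOKE_TEST_BYTES = expected_nv12_bytes(2, 208, 208)    # Test 1
FILM_GRAIN_BYTES = expected_nv12_bytes(10, 352, 288)   # Test 2
```

For the even resolutions used here the chroma term reduces to half the luma size, which is the `× 1.5` factor in the findings.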