Open ampere-vp9-enablement to enable VP9 hardware decode on RK3588 ampere (rkvdec / vdpu381 register layout). Sibling to ampere-kernel-decoders (closed at HEVC bit-perfect 2026-05-17 ~00:42). Phase 0 substrate locked: upstream status (Collabora roadmap, no series posted), legacy code reference (rkvdec-vp9.c 1042 lines, vdpu341), vdpu381 pattern reference (rkvdec-vdpu381-hevc.c, struct-based regs + common-file split), work-plan outline, open questions (chiefly: where is the vdpu381 VP9 register layout documented), risk register. Phase 1 (architectural plan + Sonnet review) next session. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 0 findings — ampere VP9 enablement substrate
Date: 2026-05-17, opening session of the campaign (sibling: ampere-kernel-decoders closed at HEVC bit-perfect ~30 min earlier).
Goal
Bring VP9 hardware decode up on RK3588 ampere via rkvdec (vdpu381 register layout), upstream-aligned, suitable for a clean linux-media RFC. End-state criterion: bit-perfect against ffmpeg -c:v vp9 SW reference per feedback_compare_hw_against_sw_reference.
Upstream status (search round 1)
| Source | Result |
|---|---|
| Collabora blog 2026-05 (Panthor → RK3588) | "Going forward, Collabora will work on ... VP9 code support on RK3588" — roadmap item, no series posted |
| Collabora RK3588/RK3576 decoders merged | Linux 7.0 landed H.264 + HEVC for vdpu381/vdpu383 only |
| WebSearch `"rk3588" OR "vdpu381" rkvdec vp9 patch site:lore.kernel.org` | AV1 series + other unrelated; no VP9 vdpu381 series |
| WebSearch `rkvdec2 vp9 rk3588 collabora linux-media kernel patch 2026` | Same conclusion; RKVDEC2 driver supports H.264 only at posting time |
| `lore.kernel.org/linux-media` WebFetch | Anubis access-denied (anti-bot block) |
| `lore.kernel.org/linaro-mm-sig` WebFetch | Anubis access-denied |
| `git remote -v` on `boltzmann:~/src/linux-rockchip` → collabora remote | `collabora/add-rkvdec2-driver*` branches exist (vdpu383-hevc variant); no `*-vp9*` branch |
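Conclusion: no VP9 series for RK3588 vdpu381 was found in flight upstream. Because the lore.kernel.org fetches were blocked, this rests on the web searches and the Collabora remote; on that evidence we would be the first to implement it.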
Existing code substrate (boltzmann:~/src/linux-rockchip @ linux-rk3588-marfrit)
Legacy reference (RK3399 / vdpu341)
- `drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c`: 1042 lines (Brezillon 2019 + Pietrasiewicz 2021 + Alpha Lin 2016). Uses `writel`-style register access via `rkvdec-regs.h`.
- `rkvdec.c:419` defines `rkvdec_vp9_ctrl_descs[]` (V4L2_CID_STATELESS_VP9_FRAME + V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, a small ctrl set)
- `rkvdec.c:478..492` registers VP9 in `rk3399_coded_fmts[]` (4096×2304 max, 64×64 alignment step)
- Ops: `rkvdec_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }` (no `try_ctrl`)
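The limits quoted for the legacy table (4096×2304 maximum, 64×64 alignment step) imply a round-up-then-clamp adjustment of requested frame sizes. The following is a minimal userspace sketch of that arithmetic only. The macro and function names are hypothetical, and this is not the driver's actual `adjust_fmt` code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants mirroring the limits quoted for rk3399_coded_fmts[]. */
#define DEMO_VP9_MAX_WIDTH   4096u
#define DEMO_VP9_MAX_HEIGHT  2304u
#define DEMO_VP9_STEP        64u

/* Round v up to the next multiple of step; step must be a power of two. */
static uint32_t demo_align_up(uint32_t v, uint32_t step)
{
	assert(step != 0 && (step & (step - 1)) == 0);
	return (v + step - 1) & ~(step - 1);
}

/* Align one dimension to the 64-pixel step, then clamp it to max. */
static uint32_t demo_vp9_adjust_dim(uint32_t v, uint32_t max)
{
	uint32_t aligned = demo_align_up(v, DEMO_VP9_STEP);

	return aligned > max ? max : aligned;
}
```

Under these assumptions a 1920×1080 request becomes 1920×1088, since 1080 is not a multiple of 64.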
vdpu381 reference (RK3588) — pattern to follow
- `rkvdec-vdpu381-hevc.c`: 639 lines, 2025 Casanova. Struct-based register layout (`rkvdec-vdpu381-regs.h`), shared preamble via `rkvdec-hevc-common.c/h`.
- `rkvdec-vdpu381-h264.c`: same pattern, h264-common shared file.
- `rkvdec.c:513..549` defines `vdpu381_coded_fmts[]` with HEVC + H.264 only; VP9 entry must be added here.
- `rkvdec.c:1701` `vdpu381_variant_ops` exposes single-IRQ handler + coded_fmts table; no per-codec dispatch needed.
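To illustrate the difference from per-register `writel` calls, here is a small userspace sketch of the struct-based idea: fill a register image in memory, then copy it out word by word. Every field, bit position, and function name here is made up for illustration. The real vdpu381 VP9 layout is still unknown (open question O1), and the sketch uses explicit shifts rather than bitfields to stay portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register image; NOT the real vdpu381 VP9 layout. */
struct demo_vp9_regs {
	uint32_t pic_size;	/* [15:0] width, [31:16] height (made up) */
	uint32_t ctrl;		/* bit 0: decode enable (made up) */
};

/* Fill the in-memory image from frame parameters. */
static void demo_fill_regs(struct demo_vp9_regs *r, uint32_t w, uint32_t h)
{
	assert(w <= 0xffff && h <= 0xffff);
	memset(r, 0, sizeof(*r));
	r->pic_size = w | (h << 16);
	r->ctrl = 1u;
}

/* Copy the whole image into a (mock) register window, one word at a time. */
static void demo_flush_regs(volatile uint32_t *mmio,
			    const struct demo_vp9_regs *r)
{
	const uint32_t *src = (const uint32_t *)r;
	size_t i;

	for (i = 0; i < sizeof(*r) / sizeof(uint32_t); i++)
		mmio[i] = src[i];
}
```

The design point is that the packing code never touches hardware; only the flush step does, which keeps the per-frame logic testable against a plain array.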
Common helpers already in place
- `rkvdec-cabac.c/h`: CABAC tables, codec-agnostic
- `rkvdec-rcb.c/h`: Row Cache Buffer / SRAM management (vdpu38x has internal SRAM for line caches)
- `rkvdec-h264-common.c/h`, `rkvdec-hevc-common.c/h`: codec spec parsing, RPS prep, control-batch helpers
VP9 has no rkvdec-vp9-common.* yet. Today the legacy rkvdec-vp9.c holds both the spec/probability logic AND the vdpu341 register code in one file.
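The intended separation can be sketched as one hardware-agnostic preparation step feeding two backends through an ops table. This is a userspace mock under stated assumptions: all `demo_` names, the single `run` hook, and both packings are hypothetical and do not reflect the real driver structures or register layouts.

```c
#include <assert.h>
#include <stdint.h>

/* Hardware-agnostic output of spec-level preparation (hypothetical fields). */
struct demo_vp9_state {
	uint32_t width;
	uint32_t height;
};

/* Mock per-backend ops table; the real ops carry more hooks. */
struct demo_fmt_ops {
	uint32_t (*run)(const struct demo_vp9_state *st);
};

/* Shared step: would live in the common file, with no register access. */
static void demo_common_prepare(struct demo_vp9_state *st,
				uint32_t w, uint32_t h)
{
	assert(st && w <= 0xffff && h <= 0xffff);
	st->width = w;
	st->height = h;
}

/* Backend A: one made-up packing of the shared state. */
static uint32_t demo_legacy_run(const struct demo_vp9_state *st)
{
	return st->width | (st->height << 16);
}

/* Backend B: a different made-up packing of the same state. */
static uint32_t demo_new_run(const struct demo_vp9_state *st)
{
	return st->height | (st->width << 16);
}

static const struct demo_fmt_ops demo_legacy_ops = { .run = demo_legacy_run };
static const struct demo_fmt_ops demo_new_ops = { .run = demo_new_run };
```

Both backends consume the same prepared state, so a regression in the shared step would show up on either hardware path.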
Work plan outline (to be refined in Phase 1)
| Step | Output | Notes |
|---|---|---|
| 1 | `rkvdec-vp9-common.{c,h}`, extracted from legacy `rkvdec-vp9.c` | Probability tables, frame_ctx state, segmap mgmt, V4L2 VP9 library helpers (`v4l2-vp9.h`). Stays codec-spec-only, no register access. Legacy `rkvdec-vp9.c` then includes/links to it. |
| 2 | `rkvdec-vdpu381-vp9.c`, new backend | `rkvdec_vdpu381_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }`. Re-implements register packing against `struct vp9_regs` in vdpu381 layout. |
| 3 | `rkvdec-vdpu381-regs.h` additions | VP9 register struct definitions (need Rockchip TRM or BSP reference; see open question O1) |
| 4 | `vdpu38x_vp9_ctrl_descs[]` in `rkvdec.c` | Likely identical to legacy `rkvdec_vp9_ctrl_descs[]` (V4L2 controls are HW-agnostic); just renamed and possibly with vdpu38x-specific dims. |
| 5 | `vdpu381_coded_fmts[]` third entry | `V4L2_PIX_FMT_VP9_FRAME` pointing to the new ops + ctrls. Sizes likely 65472×65472 to match HEVC entry. |
| 6 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per feedback_compare_hw_against_sw_reference.md. |
| 7 | Series-prep: split into individual reviewable patches | Eventually for linux-media submission via b4. |
Step 1 is the biggest single chunk (refactor + maintain bit-perfect behaviour on legacy path); steps 2-3 are where the unknown register layout dominates time.
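The byte-compare in step 6 can be sketched as a search for the first differing byte between the hardware output and the software reference, mapped back to a frame index. This is a minimal sketch with hypothetical function names. It assumes both dumps are raw frames of identical pixel format and size, which the test harness would have to guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the offset of the first byte where hw and sw differ,
 * or len if the two buffers are identical (bit-perfect).
 */
static size_t demo_first_mismatch(const uint8_t *hw, const uint8_t *sw,
				  size_t len)
{
	size_t i;

	assert(hw && sw);
	for (i = 0; i < len; i++)
		if (hw[i] != sw[i])
			return i;
	return len;
}

/* Map a byte offset to the index of the raw frame containing it. */
static size_t demo_frame_of(size_t offset, size_t frame_bytes)
{
	assert(frame_bytes > 0);
	return offset / frame_bytes;
}
```

Reporting the frame index rather than only pass/fail helps tell a failure on the first keyframe apart from drift that starts on a later inter frame.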
Open questions
| # | Question | Resolution path |
|---|---|---|
| O1 | Where is the vdpu381 VP9 register layout documented? Public Rockchip TRMs (RK3588 TRM v0.7 / v1.0) cover vdpu341 only. We need either: (a) Rockchip BSP kernel (linux-5.10-rkr or 6.1-rkr) inspection — they have a working VP9 path, (b) Casanova's WIP if it exists privately, (c) blind RE from hardware behaviour | Step 1: pull Rockchip BSP kernel-5.10 or kernel-6.1 rkvdec source (mpp_vp9d_vdpu* in mpp/userspace; rkvdec kernel side typically minimal) |
| O2 | Does vdpu381 share enough VP9 hardware with vdpu341 that legacy register sequencing is largely portable, or is this a clean-sheet IP? | Inspect a Rockchip BSP rkvdec node from RK3588 DTS — register-map size + interrupt + clock topology says a lot. Compare to RK3399's |
| O3 | Probability table format/layout — same between IPs? | VP9 spec is spec; HW prob-table layout is HW-specific. Need register doc. |
| O4 | Is RCB / SRAM usage required for VP9 on vdpu381 same as for HEVC? | Reuse rkvdec-rcb helper if so; new sizing constants if not |
| O5 | Multicore disabled (commit e570307ac987): does that affect VP9? | Likely not; VP9 was never multicore-aware, so the single decoder core path will work |
| O6 | Validate via Fluster (200/239 AV1 example) or VP9-TEST-VECTORS suite | Set up fluster GStreamer-VP9-V4L2SL-Gst1.0 test post-Phase-3 |
| O7 | Stretch: can we cross-port the RKVDEC2 (Casanova WIP) approach if upstream `add-rkvdec2-driver-vp9` appears mid-campaign? | Watch lore + Collabora gitlab |
Risk register
| # | Risk | Mitigation |
|---|---|---|
| R1 | Register layout unknown — could spend weeks reverse-engineering with no public docs | Lean hard on Rockchip BSP source; if blocked, file Collabora inquiry to short-circuit |
| R2 | Legacy `rkvdec-vp9.c` refactor (extract common) breaks RK3399 path | Cross-test the legacy build on dirac (RK3399 ROCK Pi 4) before merging; sibling: dirac.fritz.box should still have the old kernel for regression testing |
| R3 | VP9 spec features (compressed header, segmentation, frame parallel decode) not supported by vdpu381 HW | Determine empirically; document limitations upstream |
| R4 | Backend (libva-v4l2-request-fourier) already has VP9 path for hantro (per feedback_vaapi_strips_vp8_uncompressed_header.md) but rkvdec-vp9 VAAPI integration may need adaptation | Trace ffmpeg-vaapi VP9 OUTPUT layout vs the iter38b backend's VP9 dispatch; sibling: fresnel-fourier iter33 VP8 work |
| R5 | Casanova posts an upstream VP9 series mid-effort, causing fork divergence | Monitor collabora/add-rkvdec2-driver-vp9 branch + lore weekly; pivot to coordination if so |
Substrate locked
Phase 0 closes here. Phase 1 (architectural plan + Sonnet review) starts next session:
- Pull Rockchip BSP rkvdec source for VP9 register-layout reference (O1)
- Draft `rkvdec-vp9-common.c` split outline
- Draft `vdpu381-vp9.c` register-packing skeleton
- Identify any V4L2 uAPI additions needed (likely none; `V4L2_CID_STATELESS_VP9_*` already exist)
Persistence
- Repo: `/home/mfritsche/src/ampere-vp9-enablement/` on fresnel
- Gitea remote: TBD (file as `claude-noether/ampere-vp9-enablement` per feedback_gitea_as_claude_noether)
- Kernel work: `boltzmann:~/src/linux-rockchip` branch `linux-rk3588-marfrit` (same tree as ampere-kernel-decoders campaign; separate iteration branches under `vp9-*` namespace recommended)
- ampere current state: vanilla `7.0.0-rc3-devices+` kernel + iter3/iter4-fixed modules from sibling campaign; bit-perfect HEVC verified; backend `v4l2_request_drv_video.so` is iter38b. VP9 has not been exercised on this system since the kernel-agent rollout.