Rockchip BSP inspection (mpp_rkvdec2.c, soc.cmake, vp9d/CMakeLists.txt) overturns the initial premise. The same physical rkvdec IP on RK3588 accepts two register-protocol dialects within its 0x400 MMIO window:

- vdpu381 dialect (Casanova mainline naming): H.264 + HEVC
- vdpu34x dialect (Rockchip legacy naming): VP9 + AVS2

BSP rkvdec_rk3588_data uses rkvdec_v2_hw_info + rkvdec_v2_trans, the same dispatch tables as RK356X. MPP userspace builds the vdpu34x VP9 backend for RK3588 because no vdpu381 VP9 backend exists; it isn't needed, since the existing vdpu34x register layout drives this hardware.

Implication: mainline rkvdec_vp9_fmt_ops (vdpu34x layout, written for RK3399) can drive RK3588 rkvdec hardware as-is. VP9 enablement is a < 100-line wiring patch (third entry in vdpu381_coded_fmts[] + maybe a codec-aware IRQ split), not a 1000+ line backend port.

Open questions revised; risk register tightened. Phase 1 starts by reading BSP rkvdec_rk3588_hw_ops IRQ + power-on routines to resolve O2/O3 (codec-aware dispatch needed? mode-switch register?) and BSP vp9d_vdpu34x.c for max-resolution + RCB usage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 0 findings — ampere VP9 enablement substrate
Date: 2026-05-17, opening session of the campaign (sibling: ampere-kernel-decoders closed at HEVC bit-perfect ~30 min earlier).
Goal
Bring VP9 hardware decode up on RK3588 ampere via rkvdec (vdpu381 register layout), upstream-aligned, suitable for a clean linux-media RFC. End-state criterion: bit-perfect against ffmpeg -c:v vp9 SW reference per feedback_compare_hw_against_sw_reference.
Upstream status (search round 1)
| Source | Result |
|---|---|
| Collabora blog 2026-05 (Panthor → RK3588) | "Going forward, Collabora will work on ... VP9 code support on RK3588" — roadmap item, no series posted |
| Collabora RK3588/RK3576 decoders merged | Linux 7.0 landed H.264 + HEVC for vdpu381/vdpu383 only |
| WebSearch `"rk3588" OR "vdpu381" rkvdec vp9 patch site:lore.kernel.org` | AV1 series + other unrelated; no VP9 vdpu381 series |
| WebSearch `rkvdec2 vp9 rk3588 collabora linux-media kernel patch 2026` | Same conclusion; RKVDEC2 driver supports H.264 only at posting time |
| `lore.kernel.org/linux-media` WebFetch | Anubis access-denied (anti-bot block) |
| `lore.kernel.org/linaro-mm-sig` WebFetch | Anubis access-denied |
| `git remote -v` on `boltzmann:~/src/linux-rockchip` → collabora remote | `collabora/add-rkvdec2-driver*` branches exist (vdpu383-hevc variant); no `*-vp9*` branch |
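Conclusion: none of these searches turned up a VP9 series for RK3588 vdpu381, so nothing appears to be in flight upstream. Both lore.kernel.org fetches were blocked, so this rests on the web searches and the Collabora remote alone. As far as we can tell, we are first to implement.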
Existing code substrate (boltzmann:~/src/linux-rockchip @ linux-rk3588-marfrit)
Legacy reference (RK3399 / vdpu341)
- `drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c`: 1042 lines (Brezillon 2019 + Pietrasiewicz 2021 + Alpha Lin 2016). Uses `writel`-style register access via `rkvdec-regs.h`.
- `rkvdec.c:419` defines `rkvdec_vp9_ctrl_descs[]` (V4L2_CID_STATELESS_VP9_FRAME + V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, a small ctrl set)
- `rkvdec.c:478..492` registers VP9 in `rk3399_coded_fmts[]` (4096×2304 max, 64×64 alignment step)
- Ops: `rkvdec_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }` (no `try_ctrl`)
vdpu381 reference (RK3588) — pattern to follow
- `rkvdec-vdpu381-hevc.c`: 639 lines, 2025 Casanova. Struct-based register layout (`rkvdec-vdpu381-regs.h`), shared preamble via `rkvdec-hevc-common.c/h`.
- `rkvdec-vdpu381-h264.c`: same pattern, h264-common shared file.
- `rkvdec.c:513..549` defines `vdpu381_coded_fmts[]` with HEVC + H.264 only; the VP9 entry must be added here.
- `rkvdec.c:1701` `vdpu381_variant_ops` exposes single-IRQ handler + coded_fmts table; no per-codec dispatch needed.
Common helpers already in place
- `rkvdec-cabac.c/h`: CABAC tables, codec-agnostic
- `rkvdec-rcb.c/h`: Row Cache Buffer / SRAM management (vdpu38x has internal SRAM for line caches)
- `rkvdec-h264-common.c/h`, `rkvdec-hevc-common.c/h`: codec spec parsing, RPS prep, control-batch helpers
VP9 has no rkvdec-vp9-common.* yet. Today the legacy rkvdec-vp9.c holds both the spec/probability logic AND the vdpu341 register code in one file.
Work plan outline (to be refined in Phase 1)
| Step | Output | Notes |
|---|---|---|
| 1 | `rkvdec-vp9-common.{c,h}`, extracted from legacy `rkvdec-vp9.c` | Probability tables, frame_ctx state, segmap mgmt, V4L2 VP9 library helpers (`v4l2-vp9.h`). Stays codec-spec-only, no register access. Legacy `rkvdec-vp9.c` then includes/links to it. |
| 2 | `rkvdec-vdpu381-vp9.c`, new backend | `rkvdec_vdpu381_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }`. Re-implements register packing against `struct vp9_regs` in vdpu381 layout. |
| 3 | `rkvdec-vdpu381-regs.h` additions | VP9 register struct definitions (need Rockchip TRM or BSP reference; see open-question O1) |
| 4 | `vdpu38x_vp9_ctrl_descs[]` in `rkvdec.c` | Likely identical to legacy `rkvdec_vp9_ctrl_descs[]` (V4L2 controls are HW-agnostic), just renamed and possibly with vdpu38x-specific dims. |
| 5 | `vdpu381_coded_fmts[]` third entry | `V4L2_PIX_FMT_VP9_FRAME` pointing to the new ops + ctrls. Sizes likely 65472×65472 to match HEVC entry. |
| 6 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per feedback_compare_hw_against_sw_reference.md. |
| 7 | Series-prep: split into individual reviewable patches | Eventually for linux-media submission via b4. |
Step 1 is the biggest single chunk (refactor + maintain bit-perfect behaviour on legacy path); steps 2-3 are where the unknown register layout dominates time.
Architectural correction (mid-Phase-0)
Original premise overturned by Rockchip BSP inspection. The work-plan-outline above assumed vdpu381 (mainline RK3588) is a different IP from vdpu341 (RK3399), requiring a new register backend. The BSP investigation says no:
| Source | Finding |
|---|---|
| BSP DTS `rk3588s.dtsi` (lines 5059, 5113) | `rkvdec0@fdc38000`, `rkvdec1@fdc48000` carry `compatible = "rockchip,rkv-decoder-v2"` and a 0x400-byte MMIO window. No separate vdpu34x physical IP exists on RK3588. |
| BSP driver `drivers/video/rockchip/mpp/mpp_rkvdec2.c:1659` | `rkvdec_rk3588_data` ties RK3588 to `rkvdec_v2_hw_info` and `rkvdec_v2_trans` (the same dispatch tables used for RK356X). RK3576 alone routes to `rkvdec_vdpu383_*`. RK3588 stays on the v2/vdpu34x family in BSP naming. |
| BSP MPP userspace `mpp/soc.cmake` | `add_soc_config("RK3588" "VDPU381,VDPU34X,...")`: RK3588 enables both register-protocol backends. Same physical rkvdec IP accepts two register-layout dialects within its MMIO window: vdpu381 dialect (Casanova mainline naming, used for H.264/HEVC) and vdpu34x dialect (Rockchip legacy naming, used for VP9 + AVS2). |
| BSP MPP `mpp/hal/rkdec/vp9d/CMakeLists.txt` | VP9 backends gated only on `HAVE_VDPU34X` / `VDPU382` / `VDPU383` / `VDPU384B`. No `HAVE_VDPU381` VP9 backend exists; vdpu381-class hardware uses the vdpu34x VP9 backend. |
Implication: RK3588 VP9 enablement does NOT require porting rkvdec-vp9.c to a new register layout. The existing mainline rkvdec_vp9_fmt_ops (vdpu34x layout, written for RK3399) can drive RK3588's rkvdec hardware as-is. The missing work is a wiring patch, not a backend port.
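The gating argument behind this implication can be written down as a small model. This is a minimal sketch in Python, not MPP code: the names `SOC_BACKENDS`, `VP9_CAPABLE` and `vp9_backends` are hypothetical, the data is limited to what the table above quotes, and the elided `...` in the `soc.cmake` line is deliberately left unfilled.

```python
# Backend families enabled per SoC, as quoted from mpp/soc.cmake above.
# Only the two families named in this document are listed; the "..." in the
# quoted line is not filled in.
SOC_BACKENDS = {
    "RK3588": {"VDPU381", "VDPU34X"},
}

# Families for which vp9d/CMakeLists.txt builds a VP9 backend, per the table.
VP9_CAPABLE = ("VDPU34X", "VDPU382", "VDPU383", "VDPU384B")


def vp9_backends(soc: str) -> list[str]:
    """Return the VP9-capable families enabled for a SoC, in gate order."""
    enabled = SOC_BACKENDS[soc]
    return [name for name in VP9_CAPABLE if name in enabled]


if __name__ == "__main__":
    # VDPU381 is enabled for RK3588 but has no VP9 backend, so it drops out.
    print(vp9_backends("RK3588"))
```

Under these inputs the only family left for RK3588 is VDPU34X, which is the observation the implication rests on.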
Revised work plan
| Step | Output | Notes |
|---|---|---|
| 1 | Add VP9 entry to `vdpu381_coded_fmts[]` in `rkvdec.c:513`, pointing to existing `rkvdec_vp9_fmt_ops` and reusing `rkvdec_vp9_ctrls` | Frame-size limits from RK3399's entry (4096×2304, step 64); RK3588 VP9 hard limits may differ, so cross-check `vp9d_vdpu34x.c` for max-resolution constants |
| 2 | Wire vdpu381 IRQ handler to recognise VP9 codec context | Legacy IRQ (rkvdec_irq) and vdpu381 IRQ (vdpu381_irq_handler) read different status registers — VP9 path may need legacy IRQ semantics. Verify against BSP mpp_rkvdec2.c IRQ + trans_tbl_vp9d register layout |
| 3 | Verify RK3588 rkvdec clock topology matches what legacy VP9 path needs | RK3588 DT has clk_core, clk_cabac, clk_hevc_cabac — legacy VP9 path uses clk_core/clk_cabac (subset) |
| 4 | Verify legacy `rkvdec-regs.h` register offsets are valid on RK3588 rkvdec MMIO (0xfdc38100 + 0x400) | Same physical IP, same register window. Smoke-test with `devmem2` against ampere |
| 5 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per feedback_compare_hw_against_sw_reference. |
| 6 | Single RFC patch to linux-media | If wiring is the only delta — easy upstream sell |
Expected size: < 100 lines if IRQ-handler doesn't need codec-aware split, ~200 lines if VP9 requires per-codec IRQ dispatch. A small fraction of the original "port 1042 lines" estimate.
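The byte-compare in step 5 could be sketched as follows. This is an illustration under stated assumptions, not the campaign's actual test harness: both decoders are assumed to have dumped raw 8-bit 4:2:0 frames in the same plane layout (a hardware NV12 dump would have to be converted first), the dimensions are assumed even, and the function names are hypothetical.

```python
def frame_size_yuv420(width: int, height: int) -> int:
    """Bytes per 8-bit 4:2:0 frame: one luma plane plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def first_mismatch(hw_path: str, sw_path: str, width: int, height: int):
    """Return the index of the first differing frame, or None if the dumps match.

    A missing or truncated frame in either dump counts as a mismatch at the
    frame index where the two reads first differ.
    """
    size = frame_size_yuv420(width, height)
    with open(hw_path, "rb") as hw, open(sw_path, "rb") as sw:
        index = 0
        while True:
            hw_frame = hw.read(size)
            sw_frame = sw.read(size)
            if not hw_frame and not sw_frame:
                return None  # both dumps ended together without a difference
            if hw_frame != sw_frame:
                return index
            index += 1
```

Reporting the first differing frame index rather than a single whole-file hash makes it easier to tell a decode error that starts mid-stream from one that is wrong from the first frame.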
Open questions (revised)
| # | Question | Resolution path |
|---|---|---|
| O1 | Does `rkvdec_v2_trans[RKVDEC_FMT_VP9D]` register-translate table (offsets 128..232 × 4 = 0x200..0x3A0) fit inside the rkvdec MMIO window (0x400 bytes)? | Yes by inspection (`mpp_rkvdec2.c` BSP). Confirmed. |
| O2 | Does the vdpu381 IRQ handler need codec-aware dispatch, or can it handle VP9 termination identically to HEVC/H.264? | Read rkvdec_rk3588_hw_ops IRQ in mpp_rkvdec2.c + compare to legacy rkvdec_irq in mainline. If different status-bit semantics, need if (ctx->coded_fmt == V4L2_PIX_FMT_VP9_FRAME) split |
| O3 | RK3588-specific clock/reset requirements for VP9 beyond HEVC? | Compare BSP mpp_rkvdec2.c IRQ + power-on routines for FMT_VP9D vs FMT_H265D |
| O4 | Does legacy `rkvdec_vp9_start` / `rkvdec_vp9_stop` (probe + segmap buffer alloc) work against RK3588's IOMMU configuration? | Most likely yes (vb2_dma_contig handles IOMMU transparently). Verify at first decode attempt |
| O5 | RCB / SRAM — legacy VP9 path doesn't use RCB; vdpu381 HEVC does. Is RCB needed for VP9 on RK3588? | Compare vp9d_vdpu34x.c MPP backend's RCB usage to vp9d_vdpu382.c. If vdpu34x backend works without RCB on RK3588 in BSP, mainline doesn't need it either for VP9 |
| O6 | Validate via Fluster VP9-TEST-VECTORS post-Phase-3 | Set up GStreamer-VP9-V4L2SL-Gst1.0 test rig |
| O7 | If a Collabora linux-rkvdec-vp9-on-rk3588 series appears, pivot to coordination | Monitor lore + Collabora gitlab weekly |
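The arithmetic behind O1 can be checked mechanically. The sketch below assumes 32-bit registers and takes the index range 128..232 and the 0x400-byte window from the table above; the constant and function names are hypothetical.

```python
MMIO_WINDOW = 0x400    # bytes, window size quoted from the BSP DTS
REG_WIDTH = 4          # bytes per register (assumes 32-bit registers)
VP9_TRANS_FIRST = 128  # first register index in the VP9 translate range
VP9_TRANS_LAST = 232   # last register index in the VP9 translate range


def byte_offset(reg_index: int) -> int:
    """Byte offset of a register index from the start of the window."""
    return reg_index * REG_WIDTH


def fits_in_window(first: int, last: int, window: int = MMIO_WINDOW) -> bool:
    """True if every register in [first, last] lies wholly inside the window."""
    return 0 <= first <= last and byte_offset(last) + REG_WIDTH <= window


if __name__ == "__main__":
    print(hex(byte_offset(VP9_TRANS_FIRST)), hex(byte_offset(VP9_TRANS_LAST)))
    print(fits_in_window(VP9_TRANS_FIRST, VP9_TRANS_LAST))
```

The check includes the width of the last register, so the highest index that still fits a 0x400-byte window is 255.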
Risk register (revised)
| # | Risk | Mitigation |
|---|---|---|
| R1 | Same physical IP accepting two register dialects is unusual — there may be a hidden mode-switch register that must be set before VP9 work | Inspect BSP rkvdec_rk3588_hw_ops for any pre-decode setup distinguishing VP9 from HEVC |
| R2 | Backend (libva-v4l2-request-fourier) doesn't yet have rkvdec VP9 dispatch path; only hantro VP8 exists | Mirror the iter33 VP8 pattern: profile-gated codec dispatch in `RequestCreateConfig`. Sibling: feedback_unconditional_codec_state.md (must per-codec gate) |
| R3 | RK3588 VP9 max-resolution may differ from RK3399's 4096×2304 | Read MPP vp9d_vdpu34x.c max-resolution constants for confirmation |
| R4 | dirac (RK3399) cross-test fixture status unknown; needed if we modify legacy `rkvdec-vp9.c` for any reason | If modification needed, verify dirac is reachable before commit. Mostly we should NOT need to touch legacy file |
| R5 | Casanova posts upstream VP9 series mid-effort → fork divergence | Monitor weekly; coordinate if posted |
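For R3, a quick way to see which streams the inherited RK3399 limits would admit is to apply them to candidate resolutions. This is a minimal sketch assuming coded dimensions are rounded up to the 64-pixel step before being compared with the 4096×2304 maximum; the driver's real constraint handling may differ, and the helper names are hypothetical.

```python
def align_up(value: int, step: int) -> int:
    """Round value up to the next multiple of step."""
    return (value + step - 1) // step * step


def fits_limits(width: int, height: int,
                max_w: int = 4096, max_h: int = 2304, step: int = 64) -> bool:
    """True if the step-aligned frame size stays within the maximum."""
    return align_up(width, step) <= max_w and align_up(height, step) <= max_h


if __name__ == "__main__":
    for w, h in [(1920, 1080), (3840, 2160), (4096, 2304), (7680, 4320)]:
        print(f"{w}x{h}: aligned {align_up(w, 64)}x{align_up(h, 64)}, "
              f"fits={fits_limits(w, h)}")
```

If `vp9d_vdpu34x.c` turns out to carry larger constants for RK3588, only the `max_w` and `max_h` defaults would need to change.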
Substrate locked
Phase 0 closes here. Phase 1 (architectural plan + Sonnet review) starts next session:
- Read BSP `mpp_rkvdec2.c` `rkvdec_rk3588_hw_ops` and IRQ routine to resolve O2/O3 (mode-switch register? VP9-specific IRQ status?)
- Read BSP MPP `vp9d_vdpu34x.c` to confirm O5 (RCB usage) and R3 (max-resolution)
- Draft the wiring patch outline (3rd entry in `vdpu381_coded_fmts[]` + any IRQ split)
- Decide ampere-side test fixture (which VP9 bitstreams; bbb-vp9 + a streaming-capable test vector)
Persistence
- Repo: `/home/mfritsche/src/ampere-vp9-enablement/` on fresnel
- Gitea remote: TBD (file as `claude-noether/ampere-vp9-enablement` per feedback_gitea_as_claude_noether)
- Kernel work: `boltzmann:~/src/linux-rockchip` branch `linux-rk3588-marfrit` (same tree as ampere-kernel-decoders campaign; separate iteration branches under `vp9-*` namespace recommended)
- ampere current state: vanilla `7.0.0-rc3-devices+` kernel + iter3/iter4-fixed modules from sibling campaign; bit-perfect HEVC verified; backend `v4l2_request_drv_video.so` is iter38b. VP9 has not been exercised on this system since the kernel-agent rollout.