ampere-vp9-enablement/phase0_findings.md
claude-noether eb60ecd224 Phase 0: open campaign + substrate findings
Open ampere-vp9-enablement to enable VP9 hardware decode on RK3588 ampere
(rkvdec / vdpu381 register layout). Sibling to ampere-kernel-decoders
(closed at HEVC bit-perfect 2026-05-17 ~00:42).

Phase 0 substrate locked: upstream status (Collabora roadmap, no series
posted), legacy code reference (rkvdec-vp9.c 1042 lines, vdpu341),
vdpu381 pattern reference (rkvdec-vdpu381-hevc.c, struct-based regs +
common-file split), work-plan outline, open questions (chiefly: where
is the vdpu381 VP9 register layout documented), risk register.

Phase 1 (architectural plan + Sonnet review) next session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:48:26 +00:00


Phase 0 findings — ampere VP9 enablement substrate

Date: 2026-05-17, opening session of the campaign (sibling: ampere-kernel-decoders closed at HEVC bit-perfect ~30 min earlier).

Goal

Bring VP9 hardware decode up on RK3588 ampere via rkvdec (vdpu381 register layout), upstream-aligned, suitable for a clean linux-media RFC. End-state criterion: bit-perfect against ffmpeg -c:v vp9 SW reference per feedback_compare_hw_against_sw_reference.
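The bit-perfect criterion can be checked mechanically once both decoders have dumped raw frames. The following is a minimal sketch, assuming both dumps use the same pixel format and dimensions; the helper names and the NV12 (8-bit 4:2:0) frame-size formula are illustrative assumptions, not part of the campaign tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of one 8-bit 4:2:0 frame in bytes (assumes even width and height). */
static size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/*
 * Compare a hardware dump against the software reference frame by frame.
 * Returns -1 when every byte matches, otherwise the index of the first
 * frame that differs, so a failure points at a frame, not just "differs".
 */
static long first_mismatching_frame(const unsigned char *hw,
                                    const unsigned char *sw,
                                    size_t total_bytes, size_t frame_size)
{
    size_t nframes, i;

    assert(hw != NULL && sw != NULL && frame_size != 0);

    nframes = total_bytes / frame_size;
    for (i = 0; i < nframes; i++) {
        if (memcmp(hw + i * frame_size, sw + i * frame_size, frame_size) != 0)
            return (long)i;
    }

    /* A trailing partial frame still has to match byte for byte. */
    if (memcmp(hw + nframes * frame_size, sw + nframes * frame_size,
               total_bytes % frame_size) != 0)
        return (long)nframes;

    return -1;
}
```

Reporting the first differing frame index helps separate a failure on the very first keyframe (likely register setup) from one that appears only after several inter frames (likely reference or probability state).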

Upstream status (search round 1)

| Source | Result |
| --- | --- |
| Collabora blog 2026-05 (Panthor → RK3588) | "Going forward, Collabora will work on ... VP9 code support on RK3588" (roadmap item, no series posted) |
| Collabora RK3588/RK3576 decoders merged | Linux 7.0 landed H.264 + HEVC for vdpu381/vdpu383 only |
| WebSearch "rk3588" OR "vdpu381" rkvdec vp9 patch site:lore.kernel.org | AV1 series + other unrelated; no VP9 vdpu381 series |
| WebSearch rkvdec2 vp9 rk3588 collabora linux-media kernel patch 2026 | Same conclusion; RKVDEC2 driver supports H.264 only at posting time |
| lore.kernel.org/linux-media WebFetch | Anubis access-denied (anti-bot block) |
| lore.kernel.org/linaro-mm-sig WebFetch | Anubis access-denied |
| git remote -v on boltzmann:~/src/linux-rockchip → collabora remote | collabora/add-rkvdec2-driver* branches exist (vdpu383-hevc variant); no *-vp9* branch |

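Conclusion: no VP9 series for RK3588 vdpu381 was found in flight upstream. This rests on search results and the Collabora remote, since lore.kernel.org could not be fetched directly. As far as we can tell, this campaign would be the first public implementation.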

Existing code substrate (boltzmann:~/src/linux-rockchip @ linux-rk3588-marfrit)

Legacy reference (RK3399 / vdpu341)

  • drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c — 1042 lines (Brezillon 2019 + Pietrasiewicz 2021 + Alpha Lin 2016). Uses writel-style register access via rkvdec-regs.h.
  • rkvdec.c:419 defines rkvdec_vp9_ctrl_descs[] (V4L2_CID_STATELESS_VP9_FRAME + V4L2_CID_STATELESS_VP9_COMPRESSED_HDR — small ctrl set)
  • rkvdec.c:478..492 registers VP9 in rk3399_coded_fmts[] (4096×2304 max, 64×64 alignment step)
  • Ops: rkvdec_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run } (no try_ctrl)
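The frame-size limits quoted for the legacy VP9 entry (4096×2304 maximum, 64×64 step) can be illustrated with a small alignment check. This is a sketch only: `struct frmsize_limits` and `fit_frame_size()` are hypothetical stand-ins, the 64×64 minimum is an assumption, and the real driver's adjust path may clamp an oversized request instead of rejecting it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the stepwise frame-size limits of a coded format. */
struct frmsize_limits {
    uint32_t min_width, max_width, step_width;
    uint32_t min_height, max_height, step_height;
};

/* Maximum and step come from the findings; the 64x64 minimum is assumed. */
static const struct frmsize_limits rk3399_vp9_limits = {
    .min_width = 64,  .max_width = 4096,  .step_width = 64,
    .min_height = 64, .max_height = 2304, .step_height = 64,
};

/* Round v up to the next multiple of step. */
static uint32_t round_up_step(uint32_t v, uint32_t step)
{
    assert(step != 0);
    return (v + step - 1) / step * step;
}

/*
 * Align the requested size to the step and report whether it fits.
 * On success *width and *height hold the aligned size; on failure
 * they are left untouched.
 */
static bool fit_frame_size(const struct frmsize_limits *lim,
                           uint32_t *width, uint32_t *height)
{
    uint32_t w = round_up_step(*width, lim->step_width);
    uint32_t h = round_up_step(*height, lim->step_height);

    if (w < lim->min_width || w > lim->max_width ||
        h < lim->min_height || h > lim->max_height)
        return false;

    *width = w;
    *height = h;
    return true;
}
```

With these limits a 1920×1080 request becomes 1920×1088, because 1080 is not a multiple of 64.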

vdpu381 reference (RK3588) — pattern to follow

  • rkvdec-vdpu381-hevc.c — 639 lines, 2025 Casanova. Struct-based register layout (rkvdec-vdpu381-regs.h), shared preamble via rkvdec-hevc-common.c/h.
  • rkvdec-vdpu381-h264.c — same pattern, h264-common shared file.
  • rkvdec.c:513..549 defines vdpu381_coded_fmts[] with HEVC + H.264 only — VP9 entry must be added here.
  • rkvdec.c:1701 vdpu381_variant_ops exposes single-IRQ handler + coded_fmts table — no per-codec dispatch needed.
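Because the variant only exposes a coded-format table, adding VP9 amounts to appending one entry and letting the existing lookup find it. The sketch below models that idea with stand-in types; `struct coded_fmt_desc`, `struct fmt_ops`, and `find_coded_fmt()` are simplified placeholders, not the driver's actual definitions. The fourcc values use the same packing as `v4l2_fourcc()` in videodev2.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same byte packing as the v4l2_fourcc() macro in videodev2.h. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define PIX_FMT_H264_SLICE FOURCC('S', '2', '6', '4')
#define PIX_FMT_HEVC_SLICE FOURCC('S', '2', '6', '5')
#define PIX_FMT_VP9_FRAME  FOURCC('V', 'P', '9', 'F')

/* Placeholder for the per-codec ops (.adjust_fmt, .start, .stop, .run). */
struct fmt_ops {
    const char *name;
};

/* Placeholder descriptor: one row of the coded-format table. */
struct coded_fmt_desc {
    uint32_t fourcc;
    const struct fmt_ops *ops;
};

static const struct fmt_ops hevc_ops = { "vdpu381-hevc" };
static const struct fmt_ops h264_ops = { "vdpu381-h264" };
static const struct fmt_ops vp9_ops  = { "vdpu381-vp9" }; /* new backend */

static const struct coded_fmt_desc vdpu381_fmts[] = {
    { PIX_FMT_HEVC_SLICE, &hevc_ops },
    { PIX_FMT_H264_SLICE, &h264_ops },
    { PIX_FMT_VP9_FRAME,  &vp9_ops },  /* the proposed third entry */
};

/* Linear lookup by fourcc; returns NULL when the codec is not listed. */
static const struct coded_fmt_desc *
find_coded_fmt(const struct coded_fmt_desc *table, size_t n, uint32_t fourcc)
{
    size_t i;

    assert(table != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (table[i].fourcc == fourcc)
            return &table[i];
    }
    return NULL;
}
```

No dispatch code changes in this model: the lookup is codec-agnostic, so the new entry is the only thing the table consumer has to learn about.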

Common helpers already in place

  • rkvdec-cabac.c/h — CABAC tables, codec-agnostic
  • rkvdec-rcb.c/h — Row Cache Buffer / SRAM management (vdpu38x has internal SRAM for line caches)
  • rkvdec-h264-common.c/h, rkvdec-hevc-common.c/h — codec spec parsing, RPS prep, control-batch helpers

VP9 has no rkvdec-vp9-common.* yet. Today the legacy rkvdec-vp9.c holds both the spec/probability logic AND the vdpu341 register code in one file.

Work plan outline (to be refined in Phase 1)

| Step | Output | Notes |
| --- | --- | --- |
| 1 | rkvdec-vp9-common.{c,h}, extracted from legacy rkvdec-vp9.c | Probability tables, frame_ctx state, segmap mgmt, use of the V4L2 VP9 helper library (v4l2-vp9.h). Stays codec-spec-only, no register access. Legacy rkvdec-vp9.c then includes/links to it. |
| 2 | rkvdec-vdpu381-vp9.c (new backend) | rkvdec_vdpu381_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }. Re-implements register packing against struct vp9_regs in vdpu381 layout. |
| 3 | rkvdec-vdpu381-regs.h additions | VP9 register struct definitions (need Rockchip TRM or BSP reference; see open question O1) |
| 4 | vdpu38x_vp9_ctrl_descs[] in rkvdec.c | Likely identical to legacy rkvdec_vp9_ctrl_descs[] (V4L2 controls are HW-agnostic), just renamed and possibly with vdpu38x-specific dims. |
| 5 | vdpu381_coded_fmts[] third entry | V4L2_PIX_FMT_VP9_FRAME pointing to the new ops + ctrls. Sizes likely 65472×65472 to match HEVC entry. |
| 6 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per feedback_compare_hw_against_sw_reference.md. |
| 7 | Series-prep: split into individual reviewable patches | Eventually for linux-media submission via b4. |

Step 1 is the biggest single chunk (refactor + maintain bit-perfect behaviour on legacy path); steps 2-3 are where the unknown register layout dominates time.
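The difference between the legacy writel-style access and the struct-based pattern that steps 2-3 would follow can be sketched side by side. Everything below is invented for illustration: the `hyp_` names, the offset `0x08`, and the field widths are placeholders, since the actual vdpu381 VP9 layout is exactly what remains unknown. Bit-field ordering within a word is also compiler and ABI dependent, so the sketch checks word offsets only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* writel-style: a byte offset plus shift/mask helpers per register. */
#define HYP_REG_PIC_SIZE   0x08u
#define HYP_PIC_WIDTH(x)   ((uint32_t)(x) & 0xffffu)
#define HYP_PIC_HEIGHT(x)  (((uint32_t)(x) & 0xffffu) << 16)

/* Value that a writel-style driver would write to HYP_REG_PIC_SIZE. */
static uint32_t hyp_pic_size_word(uint32_t width, uint32_t height)
{
    return HYP_PIC_WIDTH(width) | HYP_PIC_HEIGHT(height);
}

/* struct-style: the register file is filled in memory, then copied out. */
struct hyp_vp9_regs {
    uint32_t ctrl;      /* word 0 */
    uint32_t reserved;  /* word 1 */
    struct {
        uint32_t width  : 16;
        uint32_t height : 16;
    } pic_size;         /* word 2 */
};

/* Catch layout drift at compile time instead of on the hardware. */
_Static_assert(sizeof(struct hyp_vp9_regs) == 3 * sizeof(uint32_t),
               "register struct must not contain padding");
_Static_assert(offsetof(struct hyp_vp9_regs, pic_size) == HYP_REG_PIC_SIZE,
               "pic_size must sit at its register offset");

/* Fill the in-memory register image for one frame. */
static void hyp_fill_regs(struct hyp_vp9_regs *regs,
                          uint32_t width, uint32_t height)
{
    assert(regs != NULL);
    regs->ctrl = 0;
    regs->reserved = 0;
    regs->pic_size.width = width;
    regs->pic_size.height = height;
}
```

The compile-time assertions are the main attraction of the struct form during bring-up: a field inserted at the wrong place fails the build instead of silently shifting every later register.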

Open questions

| # | Question | Resolution path |
| --- | --- | --- |
| O1 | Where is the vdpu381 VP9 register layout documented? | Public Rockchip TRMs (RK3588 TRM v0.7 / v1.0) cover vdpu341 only. We need either: (a) Rockchip BSP kernel (linux-5.10-rkr or 6.1-rkr) inspection, since they have a working VP9 path, (b) Casanova's WIP if it exists privately, (c) blind RE from hardware behaviour. Step 1: pull Rockchip BSP kernel-5.10 or kernel-6.1 rkvdec source (mpp_vp9d_vdpu* in mpp/userspace; rkvdec kernel side typically minimal) |
| O2 | Does vdpu381 share enough VP9 hardware with vdpu341 that legacy register sequencing is largely portable, or is this a clean-sheet IP? | Inspect a Rockchip BSP rkvdec node from RK3588 DTS: register-map size + interrupt + clock topology says a lot. Compare to RK3399's |
| O3 | Probability table format/layout: same between IPs? | VP9 spec is spec; HW prob-table layout is HW-specific. Need register doc. |
| O4 | Is RCB / SRAM usage required for VP9 on vdpu381 same as for HEVC? | Reuse rkvdec-rcb helper if so; new sizing constants if not |
| O5 | Multicore disabled (commit e570307ac987): does that affect VP9? | Likely not. VP9 was never multicore-aware; single decoder core path will work |
| O6 | Validate via Fluster (200/239 AV1 example) or VP9-TEST-VECTORS suite | Set up fluster GStreamer-VP9-V4L2SL-Gst1.0 test post-Phase-3 |
| O7 | Stretch: can we cross-port the RKVDEC2 (Casanova WIP) approach if upstream add-rkvdec2-driver-vp9 appears mid-campaign? | Watch lore + Collabora gitlab |

Risk register

| # | Risk | Mitigation |
| --- | --- | --- |
| R1 | Register layout unknown: could spend weeks reverse-engineering with no public docs | Lean hard on Rockchip BSP source; if blocked, file Collabora inquiry to short-circuit |
| R2 | Legacy rkvdec-vp9.c refactor (extract common) breaks RK3399 path | Cross-test the legacy build on dirac (RK3399 ROCK Pi 4) before merging. Sibling: dirac.fritz.box should still have the old kernel for regression testing |
| R3 | VP9 spec features (compressed header, segmentation, frame parallel decode) not supported by vdpu381 HW | Determine empirically; document limitations upstream |
| R4 | Backend (libva-v4l2-request-fourier) already has VP9 path for hantro (per feedback_vaapi_strips_vp8_uncompressed_header.md) but rkvdec-vp9 VAAPI integration may need adaptation | Trace ffmpeg-vaapi VP9 OUTPUT layout vs the iter38b backend's VP9 dispatch; sibling: fresnel-fourier iter33 VP8 work |
| R5 | Casanova posts an upstream VP9 series mid-effort, causing fork divergence | Monitor collabora/add-rkvdec2-driver-vp9 branch + lore weekly; pivot to coordination if so |

Substrate locked

Phase 0 closes here. Phase 1 (architectural plan + Sonnet review) starts next session:

  • Pull Rockchip BSP rkvdec source for VP9 register-layout reference (O1)
  • Draft rkvdec-vp9-common.c split outline
  • Draft vdpu381-vp9.c register-packing skeleton
  • Identify any V4L2 uAPI additions needed (likely none — V4L2_CID_STATELESS_VP9_* already exist)

Persistence

  • Repo: /home/mfritsche/src/ampere-vp9-enablement/ on fresnel
  • Gitea remote: TBD (file as claude-noether/ampere-vp9-enablement per feedback_gitea_as_claude_noether)
  • Kernel work: boltzmann:~/src/linux-rockchip branch linux-rk3588-marfrit (same tree as ampere-kernel-decoders campaign — separate iteration branches under vp9-* namespace recommended)
  • ampere current state: vanilla 7.0.0-rc3-devices+ kernel + iter3/iter4-fixed modules from sibling campaign; bit-perfect HEVC verified; backend v4l2_request_drv_video.so is iter38b. VP9 has not been exercised on this system since the kernel-agent rollout.