# Phase 0 findings — ampere VP9 enablement substrate

Date: 2026-05-17, opening session of the campaign (sibling: ampere-kernel-decoders closed at HEVC bit-perfect ~30 min earlier).

## Goal

Bring VP9 hardware decode up on RK3588 ampere via rkvdec (vdpu381 register layout), upstream-aligned, suitable for a clean linux-media RFC. End-state criterion: bit-perfect against `ffmpeg -c:v vp9` SW reference per [feedback_compare_hw_against_sw_reference](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_compare_hw_against_sw_reference.md).

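
The end-state criterion boils down to a byte-for-byte comparison of two raw frame dumps, one from the hardware path and one from the software decoder. A minimal sketch in C, assuming both dumps use the same pixel format and layout; the helper name `first_mismatch` is hypothetical and not part of any existing tool:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Compare two raw frame dumps byte for byte (hypothetical helper).
 * Returns -1 if the files are identical, -2 if either file cannot be
 * opened, otherwise the offset of the first differing byte. A length
 * difference counts as a mismatch at the end of the shorter file.
 */
long first_mismatch(const char *hw_path, const char *sw_path)
{
    assert(hw_path && sw_path);

    FILE *hw = fopen(hw_path, "rb");
    FILE *sw = fopen(sw_path, "rb");
    long offset = 0;
    long result = -1;

    if (!hw || !sw) {
        if (hw)
            fclose(hw);
        if (sw)
            fclose(sw);
        return -2;
    }

    for (;;) {
        int a = fgetc(hw);
        int b = fgetc(sw);

        if (a != b) {       /* also covers EOF on only one side */
            result = offset;
            break;
        }
        if (a == EOF)       /* both ended together: identical */
            break;
        offset++;
    }

    fclose(hw);
    fclose(sw);
    return result;
}
```

A call such as `first_mismatch("hw.yuv", "sw.yuv")` (both file names are placeholders) returns -1 when the two dumps are bit-identical; any other non-negative value is the byte offset where the outputs first diverge.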
## Upstream status (search round 1)

| Source | Result |
|---|---|
| Collabora blog 2026-05 ([Panthor → RK3588](https://www.collabora.com/news-and-blog/news-and-events/from-panthor-to-rk3588-advancing-graphics-video-soc-support-linux-kernel-7.html)) | "Going forward, Collabora will work on ... VP9 code support on RK3588" — roadmap item, no series posted |
| [Collabora RK3588/RK3576 decoders merged](https://www.collabora.com/news-and-blog/news-and-events/rk3588-and-rk3576-video-decoders-support-merged-in-the-upstream-linux-kernel.html) | Linux 7.0 landed H.264 + HEVC for vdpu381/vdpu383 only |
| WebSearch `"rk3588" OR "vdpu381" rkvdec vp9 patch site:lore.kernel.org` | AV1 series and other unrelated hits; no VP9 vdpu381 series |
| WebSearch `rkvdec2 vp9 rk3588 collabora linux-media kernel patch 2026` | Same conclusion; RKVDEC2 driver supports H.264 only at posting time |
| `lore.kernel.org/linux-media` WebFetch | Anubis access-denied (anti-bot block) |
| `lore.kernel.org/linaro-mm-sig` WebFetch | Anubis access-denied |
| `git remote -v` on boltzmann:~/src/linux-rockchip → collabora remote | `collabora/add-rkvdec2-driver*` branches exist (vdpu383-hevc variant); **no `*-vp9*` branch** |

Conclusion: as far as these searches can tell, VP9 on RK3588 vdpu381 is **not yet in flight upstream**. Note that lore.kernel.org itself was unreachable, so the result rests on web search and the Collabora branches. We appear to be first to implement.

## Existing code substrate (boltzmann:~/src/linux-rockchip @ `linux-rk3588-marfrit`)

### Legacy reference (RK3399 / vdpu341)

- `drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c` — 1042 lines (Brezillon 2019 + Pietrasiewicz 2021 + Alpha Lin 2016). Uses `writel`-style register access via `rkvdec-regs.h`.
- `rkvdec.c:419` defines `rkvdec_vp9_ctrl_descs[]` (V4L2_CID_STATELESS_VP9_FRAME + V4L2_CID_STATELESS_VP9_COMPRESSED_HDR — small ctrl set)
- `rkvdec.c:478..492` registers VP9 in `rk3399_coded_fmts[]` (4096×2304 max, 64×64 alignment step)
- Ops: `rkvdec_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }` (no `try_ctrl`)

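
The frame-size limits of the RK3399 VP9 entry (4096×2304 maximum, 64-pixel step) act as a stepwise constraint on the requested resolution. The sketch below shows one plausible way such a constraint is applied. It is a simplified stand-in, not the kernel's own helper, and the 64-pixel minimum as well as all struct and function names are assumptions:

```c
#include <assert.h>

/* Simplified stand-in for a V4L2 stepwise frame-size constraint. */
struct frmsize_stepwise {
    unsigned int min_width, max_width, step_width;
    unsigned int min_height, max_height, step_height;
};

/*
 * Maximum and step as quoted for the RK3399 VP9 entry.
 * The 64-pixel minimum is an assumption made for this sketch.
 */
const struct frmsize_stepwise vp9_frmsize = {
    .min_width = 64, .max_width = 4096, .step_width = 64,
    .min_height = 64, .max_height = 2304, .step_height = 64,
};

/* Clamp x into [min, max], then round up to the next multiple of step. */
static unsigned int clamp_round_up(unsigned int x, unsigned int min,
                                   unsigned int max, unsigned int step)
{
    /* max must be a multiple of step so rounding cannot exceed it. */
    assert(step != 0 && max % step == 0);

    if (x < min)
        x = min;
    if (x > max)
        x = max;
    return (x + step - 1) / step * step;
}

/* Adjust a requested resolution in place so it satisfies the constraint. */
void apply_frmsize(const struct frmsize_stepwise *fs,
                   unsigned int *width, unsigned int *height)
{
    *width = clamp_round_up(*width, fs->min_width, fs->max_width,
                            fs->step_width);
    *height = clamp_round_up(*height, fs->min_height, fs->max_height,
                             fs->step_height);
}
```

With these limits a 1920×1080 request becomes 1920×1088, and anything larger than the maximum is clamped to 4096×2304.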
### vdpu381 reference (RK3588) — pattern to follow

- `rkvdec-vdpu381-hevc.c` — 639 lines, 2025 Casanova. **Struct-based register layout** (`rkvdec-vdpu381-regs.h`), shared preamble via `rkvdec-hevc-common.c/h`.
- `rkvdec-vdpu381-h264.c` — same pattern, h264-common shared file.
- `rkvdec.c:513..549` defines `vdpu381_coded_fmts[]` with HEVC + H.264 only — **VP9 entry must be added here**.
- `rkvdec.c:1701` `vdpu381_variant_ops` exposes single-IRQ handler + coded_fmts table — no per-codec dispatch needed.

### Common helpers already in place

- `rkvdec-cabac.c/h` — CABAC tables, codec-agnostic
- `rkvdec-rcb.c/h` — Row Cache Buffer / SRAM management (vdpu38x has internal SRAM for line caches)
- `rkvdec-h264-common.c/h`, `rkvdec-hevc-common.c/h` — codec spec parsing, RPS prep, control-batch helpers

VP9 has no `rkvdec-vp9-common.*` yet. Today the legacy `rkvdec-vp9.c` holds both the spec/probability logic AND the vdpu341 register code in one file.

## Work plan outline (to be refined in Phase 1)

| Step | Output | Notes |
|---|---|---|
| 1 | `rkvdec-vp9-common.{c,h}` — extracted from legacy `rkvdec-vp9.c` | Probability tables, frame_ctx state, segmap mgmt, the kernel's V4L2 VP9 helper library (`v4l2-vp9.h`). Stays codec-spec-only, no register access. Legacy `rkvdec-vp9.c` then includes/links to it. |
| 2 | `rkvdec-vdpu381-vp9.c` — new backend | `rkvdec_vdpu381_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }`. Re-implements register packing against `struct vp9_regs` in vdpu381 layout. |
| 3 | `rkvdec-vdpu381-regs.h` additions | VP9 register struct definitions (need Rockchip TRM or BSP reference — see open-question O1) |
| 4 | `vdpu38x_vp9_ctrl_descs[]` in `rkvdec.c` | Likely identical to legacy `rkvdec_vp9_ctrl_descs[]` (V4L2 controls are HW-agnostic) — just renamed and possibly with vdpu38x-specific dims. |
| 5 | `vdpu381_coded_fmts[]` third entry | V4L2_PIX_FMT_VP9_FRAME pointing to the new ops + ctrls. Sizes likely 65472×65472 to match HEVC entry. |
| 6 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per `feedback_compare_hw_against_sw_reference.md`. |
| 7 | Series-prep: split into individual reviewable patches | Eventually for linux-media submission via `b4`. |

Step 1 is the biggest single chunk (refactor + maintain bit-perfect behaviour on legacy path); steps 2-3 are where the unknown register layout dominates time.

## Architectural correction (mid-Phase-0)

**Original premise overturned by Rockchip BSP inspection.** The work plan outline above assumed vdpu381 (mainline RK3588) is a *different IP* from vdpu341 (RK3399), requiring a new register backend. The BSP investigation says no:

| Source | Finding |
|---|---|
| BSP DTS `rk3588s.dtsi` (lines 5059, 5113) | `rkvdec0@fdc38000`, `rkvdec1@fdc48000` carry `compatible = "rockchip,rkv-decoder-v2"` and a 0x400-byte MMIO window. **No separate vdpu34x physical IP exists on RK3588.** |
| BSP driver `drivers/video/rockchip/mpp/mpp_rkvdec2.c:1659` | `rkvdec_rk3588_data` ties RK3588 to `rkvdec_v2_hw_info` and `rkvdec_v2_trans` (the *same* dispatch tables used for RK356X). RK3576 alone routes to `rkvdec_vdpu383_*`. **RK3588 stays on the v2/vdpu34x family in BSP naming.** |
| BSP MPP userspace `mpp/soc.cmake` | `add_soc_config("RK3588" "VDPU381,VDPU34X,...")` — RK3588 *enables both* register-protocol backends. Same physical rkvdec IP accepts two register-layout dialects within its MMIO window: vdpu381 dialect (Casanova mainline naming, used for H.264/HEVC) and vdpu34x dialect (Rockchip legacy naming, used for VP9 + AVS2). |
| BSP MPP `mpp/hal/rkdec/vp9d/CMakeLists.txt` | VP9 backends gated only on `HAVE_VDPU34X / VDPU382 / VDPU383 / VDPU384B`. **No `HAVE_VDPU381` VP9 backend exists** — vdpu381-class hardware uses the vdpu34x VP9 backend. |

**Implication**: RK3588 VP9 enablement does NOT require porting `rkvdec-vp9.c` to a new register layout. The existing mainline `rkvdec_vp9_fmt_ops` (vdpu34x layout, written for RK3399) can drive RK3588's rkvdec hardware *as-is*. The missing work is a **wiring patch**, not a backend port.

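
The wiring idea can be modelled outside the kernel: two per-variant format tables that both point at the same VP9 ops. This is a reduced sketch with stand-in structs and names (`coded_fmt_desc`, `fmt_ops`, `find_coded_fmt` and the table contents are illustrative, not the driver's actual definitions); only the three fourcc values mirror the V4L2 pixel formats:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the V4L2 fourcc packing so the sketch builds without kernel headers. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

#define PIX_FMT_H264_SLICE FOURCC('S', '2', '6', '4')
#define PIX_FMT_HEVC_SLICE FOURCC('S', '2', '6', '5')
#define PIX_FMT_VP9_FRAME  FOURCC('V', 'P', '9', 'F')

/* Reduced stand-ins for the driver's ops and format-descriptor structs. */
struct fmt_ops {
    const char *name;
};

struct coded_fmt_desc {
    uint32_t fourcc;
    const struct fmt_ops *ops;
};

const struct fmt_ops h264_ops = { "vdpu381-h264" };
const struct fmt_ops hevc_ops = { "vdpu381-hevc" };
const struct fmt_ops vp9_ops = { "legacy-vp9" }; /* stands in for rkvdec_vp9_fmt_ops */

/* Legacy table, reduced to its VP9 entry (other entries omitted). */
const struct coded_fmt_desc legacy_fmts[] = {
    { PIX_FMT_VP9_FRAME, &vp9_ops },
};

/* RK3588 table: the third entry reuses the legacy VP9 ops unchanged. */
const struct coded_fmt_desc vdpu381_fmts[] = {
    { PIX_FMT_HEVC_SLICE, &hevc_ops },
    { PIX_FMT_H264_SLICE, &h264_ops },
    { PIX_FMT_VP9_FRAME, &vp9_ops },
};

/* Linear lookup by fourcc; returns NULL when the format is not offered. */
const struct coded_fmt_desc *find_coded_fmt(const struct coded_fmt_desc *table,
                                            size_t count, uint32_t fourcc)
{
    assert(table != NULL);

    for (size_t i = 0; i < count; i++)
        if (table[i].fourcc == fourcc)
            return &table[i];
    return NULL;
}
```

In this model, `find_coded_fmt(vdpu381_fmts, ARRAY_SIZE(vdpu381_fmts), PIX_FMT_VP9_FRAME)` yields the same ops pointer as the legacy table. That is the whole point of the wiring approach: no second VP9 backend is needed.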
## Revised work plan

| Step | Output | Notes |
|---|---|---|
| 1 | Add VP9 entry to `vdpu381_coded_fmts[]` in `rkvdec.c:513`, pointing to existing `rkvdec_vp9_fmt_ops` and reusing `rkvdec_vp9_ctrls` | Frame-size limits from RK3399's entry (4096×2304, step 64) — RK3588 VP9 hard limits may differ; cross-check `vp9d_vdpu34x.c` for max-resolution constants |
| 2 | Wire vdpu381 IRQ handler to recognise VP9 codec context | Legacy IRQ (`rkvdec_irq`) and vdpu381 IRQ (`vdpu381_irq_handler`) read different status registers — VP9 path may need legacy IRQ semantics. Verify against BSP `mpp_rkvdec2.c` IRQ + `trans_tbl_vp9d` register layout |
| 3 | Verify RK3588 rkvdec clock topology matches what legacy VP9 path needs | RK3588 DT has `clk_core`, `clk_cabac`, `clk_hevc_cabac` — legacy VP9 path uses `clk_core`/`clk_cabac` (subset) |
| 4 | Verify legacy `rkvdec-regs.h` register offsets are valid on RK3588 rkvdec MMIO (0xfdc38100 + 0x400) | Same physical IP, same register window. Smoke-test with devmem2 against ampere |
| 5 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per [feedback_compare_hw_against_sw_reference](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_compare_hw_against_sw_reference.md). |
| 6 | Single RFC patch to linux-media | If wiring is the only delta — easy upstream sell |

**Expected size**: < 100 lines if IRQ-handler doesn't need codec-aware split, ~200 lines if VP9 requires per-codec IRQ dispatch. A small fraction of the original "port 1042 lines" estimate.

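
Whether the larger estimate applies depends on the IRQ path. If VP9 does turn out to need legacy status semantics, the split could be as small as the following sketch. The status struct and both "done" bits are placeholders invented for illustration; the real offsets and bits must come from `rkvdec-regs.h`, `rkvdec-vdpu381-regs.h` and the BSP:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum codec { CODEC_H264, CODEC_HEVC, CODEC_VP9 };

/* Placeholder bits, NOT taken from any register header. */
#define LEGACY_STA_DONE  (1u << 0) /* hypothetical "frame done", legacy layout */
#define VDPU381_STA_DONE (1u << 1) /* hypothetical "frame done", vdpu381 layout */

/* Snapshot of the two status words an IRQ handler might read. */
struct irq_status {
    uint32_t legacy;
    uint32_t vdpu381;
};

/*
 * Codec-aware completion check: VP9 consults the legacy-layout status
 * word, every other codec the vdpu381-layout one.
 */
bool decode_done(enum codec codec, const struct irq_status *sta)
{
    assert(sta != NULL);

    if (codec == CODEC_VP9)
        return (sta->legacy & LEGACY_STA_DONE) != 0;
    return (sta->vdpu381 & VDPU381_STA_DONE) != 0;
}
```

If the BSP shows that both dialects report completion through the same status bit, this branch disappears and the patch stays at the smaller estimate.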
## Open questions (revised)

| # | Question | Resolution path |
|---|---|---|
| O1 | Does `rkvdec_v2_trans[RKVDEC_FMT_VP9D]` register-translate table (offsets 128..232 × 4 = 0x200..0x3A0) fit inside the rkvdec MMIO window (0x400 bytes)? | Yes: the highest offset, 0x3A0, plus one 4-byte register ends at 0x3A4, inside the 0x400-byte window. Confirmed by inspection of BSP `mpp_rkvdec2.c`. |
| O2 | Does the vdpu381 IRQ handler need codec-aware dispatch, or can it handle VP9 termination identically to HEVC/H.264? | Read `rkvdec_rk3588_hw_ops` IRQ in `mpp_rkvdec2.c` + compare to legacy `rkvdec_irq` in mainline. If different status-bit semantics, need `if (ctx->coded_fmt == V4L2_PIX_FMT_VP9_FRAME)` split |
| O3 | RK3588-specific clock/reset requirements for VP9 beyond HEVC? | Compare BSP `mpp_rkvdec2.c` IRQ + power-on routines for `FMT_VP9D` vs `FMT_H265D` |
| O4 | Does legacy `rkvdec_vp9_start` / `rkvdec_vp9_stop` (probe + segmap buffer alloc) work against RK3588's IOMMU configuration? | Most likely yes (vb2_dma_contig handles IOMMU transparently). Verify at first decode attempt |
| O5 | RCB / SRAM — legacy VP9 path doesn't use RCB; vdpu381 HEVC does. Is RCB needed for VP9 on RK3588? | Compare `vp9d_vdpu34x.c` MPP backend's RCB usage to `vp9d_vdpu382.c`. If vdpu34x backend works without RCB on RK3588 in BSP, mainline doesn't need it either for VP9 |
| O6 | Validate via Fluster `VP9-TEST-VECTORS` post-Phase-3 | Set up `GStreamer-VP9-V4L2SL-Gst1.0` test rig |
| O7 | If a Collabora `linux-rkvdec-vp9-on-rk3588` series appears, pivot to coordination | Monitor lore + Collabora gitlab weekly |

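
The O1 arithmetic can be checked mechanically. A small sketch, assuming 32-bit registers addressed by index; the macro and function names are made up for this example, and only the numbers (indices 128..232, window 0x400) come from the notes above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RKVDEC_MMIO_SIZE 0x400u /* window size from the BSP DTS */
#define VP9_FIRST_REG    128u   /* first VP9 register index */
#define VP9_LAST_REG     232u   /* last VP9 register index */

/* Byte offset of a 32-bit register given its index. */
static inline uint32_t reg_offset(uint32_t index)
{
    return index * 4u;
}

/* True if every register in [first, last] lies fully inside the window. */
bool regs_fit_window(uint32_t first, uint32_t last, uint32_t window_size)
{
    return first <= last && reg_offset(last) + 4u <= window_size;
}

/* Compile-time version of the same check for the VP9 range. */
static_assert(VP9_LAST_REG * 4u + 4u <= RKVDEC_MMIO_SIZE,
              "VP9 register range exceeds the rkvdec MMIO window");
```

Index 128 maps to offset 0x200 and index 232 to 0x3A0, so the last register ends at 0x3A4 and the range fits with 0x5C bytes to spare.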
## Risk register (revised)

| # | Risk | Mitigation |
|---|---|---|
| R1 | Same physical IP accepting two register dialects is unusual — there may be a hidden mode-switch register that must be set before VP9 work | Inspect BSP `rkvdec_rk3588_hw_ops` for any pre-decode setup distinguishing VP9 from HEVC |
| R2 | Backend (`libva-v4l2-request-fourier`) doesn't yet have rkvdec VP9 dispatch path; only hantro VP8 exists | Mirror the iter33 VP8 pattern: profile-gated codec dispatch in `RequestCreateConfig`. Sibling: `feedback_unconditional_codec_state.md` (codec state must be gated per codec) |
| R3 | RK3588 VP9 max-resolution may differ from RK3399's 4096×2304 | Read MPP `vp9d_vdpu34x.c` max-resolution constants for confirmation |
| R4 | `dirac` (RK3399) cross-test fixture status unknown — needed if we modify legacy `rkvdec-vp9.c` for any reason | If modification is needed, verify dirac is reachable before commit. We should mostly NOT need to touch the legacy file |
| R5 | Casanova posts upstream VP9 series mid-effort → fork divergence | Monitor weekly; coordinate if posted |

## Substrate locked

Phase 0 closes here. Phase 1 (architectural plan + Sonnet review) starts next session:

- Read BSP `mpp_rkvdec2.c` rkvdec_rk3588_hw_ops and IRQ routine to resolve O2/O3 (mode-switch register? VP9-specific IRQ status?)
- Read BSP MPP `vp9d_vdpu34x.c` to confirm O5 (RCB usage) and R3 (max-resolution)
- Draft the wiring patch outline (3rd entry in `vdpu381_coded_fmts[]` + any IRQ split)
- Decide ampere-side test fixture (which VP9 bitstreams; bbb-vp9 + a streaming-capable test vector)

## Persistence

- Repo: `/home/mfritsche/src/ampere-vp9-enablement/` on fresnel
- Gitea remote: TBD (file as `claude-noether/ampere-vp9-enablement` per [feedback_gitea_as_claude_noether](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_gitea_as_claude_noether.md))
- Kernel work: `boltzmann:~/src/linux-rockchip` branch `linux-rk3588-marfrit` (same tree as ampere-kernel-decoders campaign — separate iteration branches under `vp9-*` namespace recommended)
- ampere current state: vanilla `7.0.0-rc3-devices+` kernel + iter3/iter4-fixed modules from sibling campaign; bit-perfect HEVC verified; backend `v4l2_request_drv_video.so` is iter38b. VP9 has not been exercised on this system since the kernel-agent rollout.