commit eb60ecd224ed75cd7932a343d8c26d403d75ec2e
Author: Claude (noether)
Date:   Sat May 16 22:48:26 2026 +0000

    Phase 0: open campaign + substrate findings

    Open ampere-vp9-enablement to enable VP9 hardware decode on RK3588
    ampere (rkvdec / vdpu381 register layout). Sibling to
    ampere-kernel-decoders (closed at HEVC bit-perfect 2026-05-17 ~00:42).

    Phase 0 substrate locked: upstream status (Collabora roadmap, no series
    posted), legacy code reference (rkvdec-vp9.c 1042 lines, vdpu341),
    vdpu381 pattern reference (rkvdec-vdpu381-hevc.c, struct-based regs +
    common-file split), work-plan outline, open questions (chiefly: where
    is the vdpu381 VP9 register layout documented), risk register.

    Phase 1 (architectural plan + Sonnet review) next session.

    Co-Authored-By: Claude Opus 4.7

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..6f3e787
--- /dev/null
+++ b/README.md
@@ -0,0 +1,28 @@
+# ampere-vp9-enablement
+
+Stand-alone port + upstream-targeting work to enable VP9 hardware decode on Rockchip RK3588's rkvdec (vdpu381 register layout).
+
+## Status (2026-05-17 ~01:00)
+
+Upstream RK3588 mainline rkvdec ([Casanova v7.0 series](https://www.collabora.com/news-and-blog/news-and-events/rk3588-and-rk3576-video-decoders-support-merged-in-the-upstream-linux-kernel.html), landed in Linux 7.0) supports **H.264 + HEVC only**. VP9 is on Collabora's stated roadmap but no WIP series has been posted to linux-media as of this campaign's opening. The legacy `rkvdec-vp9.c` (RK3399 / vdpu341 hardware) is feature-complete at 1042 lines but its register-config logic does not translate directly to vdpu381.
+
+This campaign:
+1. Ports VP9 enablement to vdpu381 register layout (new file `rkvdec-vdpu381-vp9.c`)
+2. Registers VP9 V4L2 controls in `vdpu38x_vp9_ctrl_descs[]`
+3. Adds VP9 fmt to `vdpu381_coded_fmts[]` with the new ops
+4. Verifies bit-perfect HW vs SW decode (per [feedback_compare_hw_against_sw_reference](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_compare_hw_against_sw_reference.md))
+5. Proposes upstream via linux-media
+
+Sibling campaign: [ampere-kernel-decoders](https://git.reauktion.de/claude-noether/ampere-kernel-decoders) closed at HEVC bit-perfect (kernel-agent#14 + #15 are the prerequisite kernel fixes).
+
+## Out of scope
+
+- VP9 on RK3399 (works via legacy `rkvdec-vp9.c` already in mainline)
+- VP9 on hantro (hantro decoder on RK3588 doesn't expose VP9; this campaign targets rkvdec)
+- AV1 on RK3588 (separate work; AV1 is on hantro fdc70000 already + per Collabora)
+- VP8 (already works via hantro)
+- HEVC (closed in ampere-kernel-decoders)
+
+## Process
+
+8-phase loop (per ~/.claude/CLAUDE.md). All commits via `claude-noether` identity. Patches will be RFC-quality and routed via kernel-agent once ready.
diff --git a/phase0_findings.md b/phase0_findings.md
new file mode 100644
index 0000000..10bb165
--- /dev/null
+++ b/phase0_findings.md
@@ -0,0 +1,96 @@
+# Phase 0 findings — ampere VP9 enablement substrate
+
+Date: 2026-05-17, opening session of the campaign (sibling: ampere-kernel-decoders closed at HEVC bit-perfect ~30 min earlier).
+
+## Goal
+
+Bring VP9 hardware decode up on RK3588 ampere via rkvdec (vdpu381 register layout), upstream-aligned, suitable for a clean linux-media RFC. End-state criterion: bit-perfect against `ffmpeg -c:v vp9` SW reference per [feedback_compare_hw_against_sw_reference](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_compare_hw_against_sw_reference.md).
+
+## Upstream status (search round 1)
+
+| Source | Result |
+|---|---|
+| Collabora blog 2026-05 ([Panthor → RK3588](https://www.collabora.com/news-and-blog/news-and-events/from-panthor-to-rk3588-advancing-graphics-video-soc-support-linux-kernel-7.html)) | "Going forward, Collabora will work on ... VP9 code support on RK3588" — roadmap item, no series posted |
+| [Collabora RK3588/RK3576 decoders merged](https://www.collabora.com/news-and-blog/news-and-events/rk3588-and-rk3576-video-decoders-support-merged-in-the-upstream-linux-kernel.html) | Linux 7.0 landed H.264 + HEVC for vdpu381/vdpu383 only |
+| WebSearch `"rk3588" OR "vdpu381" rkvdec vp9 patch site:lore.kernel.org` | AV1 series and other unrelated hits; no VP9 vdpu381 series |
+| WebSearch `rkvdec2 vp9 rk3588 collabora linux-media kernel patch 2026` | Same conclusion; RKVDEC2 driver supports H.264 only at posting time |
+| `lore.kernel.org/linux-media` WebFetch | Anubis access-denied (anti-bot block) |
+| `lore.kernel.org/linaro-mm-sig` WebFetch | Anubis access-denied |
+| `git remote -v` on boltzmann:~/src/linux-rockchip → collabora remote | `collabora/add-rkvdec2-driver*` branches exist (vdpu383-hevc variant); **no `*-vp9*` branch** |
+
+Conclusion: no public VP9 series for RK3588 vdpu381 turned up, so VP9 is **not yet in flight upstream** as far as these searches show (lore itself could not be fetched directly). We appear to be first to implement.
+
+## Existing code substrate (boltzmann:~/src/linux-rockchip @ `linux-rk3588-marfrit`)
+
+### Legacy reference (RK3399 / vdpu341)
+
+- `drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c` — 1042 lines (Brezillon 2019 + Pietrasiewicz 2021 + Alpha Lin 2016). Uses `writel`-style register access via `rkvdec-regs.h`.
+- `rkvdec.c:419` defines `rkvdec_vp9_ctrl_descs[]` (V4L2_CID_STATELESS_VP9_FRAME + V4L2_CID_STATELESS_VP9_COMPRESSED_HDR — small ctrl set)
+- `rkvdec.c:478..492` registers VP9 in `rk3399_coded_fmts[]` (4096×2304 max, 64×64 alignment step)
+- Ops: `rkvdec_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }` (no `try_ctrl`)
+
+### vdpu381 reference (RK3588) — pattern to follow
+
+- `rkvdec-vdpu381-hevc.c` — 639 lines, 2025 Casanova. **Struct-based register layout** (`rkvdec-vdpu381-regs.h`), shared preamble via `rkvdec-hevc-common.c/h`.
+- `rkvdec-vdpu381-h264.c` — same pattern, h264-common shared file.
+- `rkvdec.c:513..549` defines `vdpu381_coded_fmts[]` with HEVC + H.264 only — **VP9 entry must be added here**.
+- `rkvdec.c:1701` `vdpu381_variant_ops` exposes single-IRQ handler + coded_fmts table — no per-codec dispatch needed.
+
+### Common helpers already in place
+
+- `rkvdec-cabac.c/h` — CABAC tables, codec-agnostic
+- `rkvdec-rcb.c/h` — Row Cache Buffer / SRAM management (vdpu38x has internal SRAM for line caches)
+- `rkvdec-h264-common.c/h`, `rkvdec-hevc-common.c/h` — codec spec parsing, RPS prep, control-batch helpers
+
+VP9 has no `rkvdec-vp9-common.*` yet. Today the legacy `rkvdec-vp9.c` holds both the spec/probability logic AND the vdpu341 register code in one file.
+
+## Work plan outline (to be refined in Phase 1)
+
+| Step | Output | Notes |
+|---|---|---|
+| 1 | `rkvdec-vp9-common.{c,h}` — extracted from legacy `rkvdec-vp9.c` | Probability tables, frame_ctx state, segmap mgmt, use of the kernel V4L2 VP9 helpers (`v4l2-vp9.h`). Stays codec-spec-only, no register access. Legacy `rkvdec-vp9.c` then includes/links to it. |
+| 2 | `rkvdec-vdpu381-vp9.c` — new backend | `rkvdec_vdpu381_vp9_fmt_ops = { .adjust_fmt, .start, .stop, .run }`. Re-implements register packing against `struct vp9_regs` in vdpu381 layout. |
+| 3 | `rkvdec-vdpu381-regs.h` additions | VP9 register struct definitions (need Rockchip TRM or BSP reference — see open-question O1) |
+| 4 | `vdpu38x_vp9_ctrl_descs[]` in `rkvdec.c` | Likely identical to legacy `rkvdec_vp9_ctrl_descs[]` (V4L2 controls are HW-agnostic) — just renamed and possibly with vdpu38x-specific dims. |
+| 5 | `vdpu381_coded_fmts[]` third entry | V4L2_PIX_FMT_VP9_FRAME pointing to the new ops + ctrls. Sizes likely 65472×65472 to match HEVC entry. |
+| 6 | Test: ffmpeg-vaapi VP9 decode + byte-compare against SW reference | Per `feedback_compare_hw_against_sw_reference.md`. |
+| 7 | Series-prep: split into individual reviewable patches | Eventually for linux-media submission via `b4`. |
+
+Step 1 is the biggest single chunk (refactor + maintain bit-perfect behaviour on legacy path); steps 2-3 are where the unknown register layout dominates time.
+
+## Open questions
+
+| # | Question | Resolution path |
+|---|---|---|
+| O1 | **Where is the vdpu381 VP9 register layout documented?** Public Rockchip TRMs (RK3588 TRM v0.7 / v1.0) cover vdpu341 only. We need either: (a) Rockchip BSP kernel (linux-5.10-rkr or 6.1-rkr) inspection — they have a working VP9 path, (b) Casanova's WIP if it exists privately, (c) blind RE from hardware behaviour | First step: pull Rockchip BSP `kernel-5.10` or `kernel-6.1` rkvdec source (`mpp_vp9d_vdpu*` in mpp/userspace; rkvdec kernel side typically minimal) |
+| O2 | **Does vdpu381 share enough VP9 hardware with vdpu341 that legacy register sequencing is largely portable, or is this a clean-sheet IP?** | Inspect a Rockchip BSP `rkvdec` node from RK3588 DTS — register-map size + interrupt + clock topology says a lot. Compare to RK3399's |
+| O3 | **Probability table format/layout — same between IPs?** | The VP9 spec fixes the probability model, but the HW prob-table layout is HW-specific. Need register doc. |
+| O4 | **Is RCB / SRAM usage required for VP9 on vdpu381 same as for HEVC?** | Reuse `rkvdec-rcb` helper if so; new sizing constants if not |
+| O5 | **Multicore disabled** (commit `e570307ac987`) — does that affect VP9? | Likely not — VP9 was never multicore-aware; single decoder core path will work |
+| O6 | **Validate via Fluster (200/239 AV1 example) or the VP9-TEST-VECTORS suite?** | Set up fluster GStreamer-VP9-V4L2SL-Gst1.0 test post-Phase-3 |
+| O7 | **Stretch: can we cross-port the RKVDEC2 (Casanova WIP) approach** if upstream `add-rkvdec2-driver-vp9` appears mid-campaign? | Watch lore + Collabora gitlab |
+
+## Risk register
+
+| # | Risk | Mitigation |
+|---|---|---|
+| R1 | Register layout unknown — could spend weeks reverse-engineering with no public docs | Lean hard on Rockchip BSP source; if blocked, file Collabora inquiry to short-circuit |
+| R2 | Legacy `rkvdec-vp9.c` refactor (extract common) breaks RK3399 path | Cross-test the legacy build on dirac (RK3399 ROCK Pi 4) before merging — sibling: `dirac.fritz.box` should still have the old kernel for regression testing |
+| R3 | VP9 spec features (compressed header, segmentation, frame parallel decode) not supported by vdpu381 HW | Determine empirically; document limitations upstream |
+| R4 | Backend (`libva-v4l2-request-fourier`) already has VP9 path for hantro (per `feedback_vaapi_strips_vp8_uncompressed_header.md`) but rkvdec-vp9 VAAPI integration may need adaptation | Trace ffmpeg-vaapi VP9 OUTPUT layout vs the iter38b backend's VP9 dispatch; sibling: fresnel-fourier iter33 VP8 work |
+| R5 | Casanova posts an upstream VP9 series mid-effort, causing fork divergence | Monitor `collabora/add-rkvdec2-driver-vp9` branch + lore weekly; pivot to coordination if so |
+
+## Substrate locked
+
+Phase 0 closes here. Phase 1 (architectural plan + Sonnet review) starts next session:
+- Pull Rockchip BSP rkvdec source for VP9 register-layout reference (O1)
+- Draft `rkvdec-vp9-common.c` split outline
+- Draft `vdpu381-vp9.c` register-packing skeleton
+- Identify any V4L2 uAPI additions needed (likely none — `V4L2_CID_STATELESS_VP9_*` already exist)
+
+## Persistence
+
+- Repo: `/home/mfritsche/src/ampere-vp9-enablement/` on fresnel
+- Gitea remote: TBD (file as `claude-noether/ampere-vp9-enablement` per [feedback_gitea_as_claude_noether](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_gitea_as_claude_noether.md))
+- Kernel work: `boltzmann:~/src/linux-rockchip` branch `linux-rk3588-marfrit` (same tree as ampere-kernel-decoders campaign — separate iteration branches under `vp9-*` namespace recommended)
+- ampere current state: vanilla `7.0.0-rc3-devices+` kernel + iter3/iter4-fixed modules from sibling campaign; bit-perfect HEVC verified; backend `v4l2_request_drv_video.so` is iter38b. VP9 has not been exercised on this system since the kernel-agent rollout.
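
Reviewer note on work-plan step 6 (not part of the commit): the bit-perfect criterion could be checked with a sketch along the following lines. It assumes that both the hardware decode and the software reference have already been dumped as raw NV12 at the same even-sized resolution. The NV12 assumption and the names `frame_size_nv12`, `split_frames` and `first_mismatch` are hypothetical and are not part of the campaign tooling.

```python
from typing import List, Optional


def frame_size_nv12(width: int, height: int) -> int:
    """Bytes per NV12 frame: one full luma plane plus half-size interleaved chroma.

    Assumes even width and height (an assumption of this sketch).
    """
    return width * height * 3 // 2


def split_frames(data: bytes, frame_bytes: int) -> List[bytes]:
    """Cut a raw dump into equally sized frames; reject truncated dumps."""
    if frame_bytes <= 0 or len(data) % frame_bytes != 0:
        raise ValueError("dump length is not a whole number of frames")
    return [data[off:off + frame_bytes] for off in range(0, len(data), frame_bytes)]


def first_mismatch(hw: bytes, sw: bytes, width: int, height: int) -> Optional[int]:
    """Return the index of the first differing frame, or None if bit-identical.

    A differing frame count is reported as a mismatch at the first missing frame.
    """
    frame_bytes = frame_size_nv12(width, height)
    hw_frames = split_frames(hw, frame_bytes)
    sw_frames = split_frames(sw, frame_bytes)
    for index, (hw_frame, sw_frame) in enumerate(zip(hw_frames, sw_frames)):
        if hw_frame != sw_frame:
            return index
    if len(hw_frames) != len(sw_frames):
        return min(len(hw_frames), len(sw_frames))
    return None
```

In use, the two byte strings would be read from the hardware and software dump files and passed in with the stream's dimensions. Reporting the first differing frame index, rather than a single pass/fail hash, narrows a regression to a specific frame, which should help when the unknown register layout is still being worked out.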