[ka:experiment] RK3588 rkvdec: VP9 not registered as OUTPUT pixfmt — enable via VDPU381/383 variant_ops #12

Closed
opened 2026-05-16 07:27:04 +00:00 by claude-noether · 6 comments
Collaborator

Symptom

On ampere (CoolPi CM5 GenBook, RK3588, linux-ampere-fourier 7.0rc3.kafr1-1):

$ v4l2-ctl -d /dev/video1 --list-formats-out
ioctl: VIDIOC_ENUM_FMT
	Type: Video Output Multiplanar
	[0]: 'S265' (HEVC Parsed Slice Data, compressed)
	[1]: 'S264' (H.264 Parsed Slice Data, compressed)

/dev/video1 is the rkvdec node (media-ctl -d /dev/media0 -p → driver rkvdec driver version 7.0.0). Only HEVC + H.264 registered as OUTPUT pixfmts. VP9 absent, despite:

  • The kernel module v4l2_vp9 is loaded (lsmod | grep v4l2_vp9 → "used by 2: rockchip_vdec, hantro_vpu"). So the VP9 control parsing infrastructure IS available.
  • Fresnel (RK3399 rkvdec via linux-fresnel-fourier 7.0-14) exposes VP9 cleanly — V4L2_PIX_FMT_VP9_FRAME registered in its rkvdec binding.

So the issue is: RK3588's rkvdec binding does not register VP9 in its variant_ops, even though the chip's documented capabilities include VP9 decode and the kernel module is present.
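
The enumeration check above can be scripted so that later comments compare like with like. A minimal sketch, assuming the `v4l2-ctl --list-formats-out` output keeps the `[N]: 'FOURCC' (description)` layout shown in this issue; `list_fourccs` and `has_fourcc` are hypothetical helper names, not part of v4l-utils:

```shell
# Read `v4l2-ctl --list-formats-out` text on stdin and print one FOURCC per line.
# Assumes each format line looks like:  [0]: 'S265' (HEVC Parsed Slice Data, compressed)
list_fourccs() {
    sed -n "s/^[[:space:]]*\[[0-9][0-9]*]: '\([^']*\)'.*/\1/p"
}

# Succeed if the FOURCC given as $1 appears in the listing on stdin.
has_fourcc() {
    list_fourccs | grep -qx -- "$1"
}
```

Usage on the device would be along the lines of `v4l2-ctl -d /dev/video1 --list-formats-out | has_fourcc VP9F && echo "VP9 registered"`. Parsing stdin rather than calling `v4l2-ctl` inside the helper keeps it testable on a machine without the hardware.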

Boundary

Per memory feedback_rkvdec_patch_reachability:

before applying upstream rkvdec patches to fresnel kernel, verify the patched function is reachable from rk3399_variant_ops; mainline has diverging RK3399 legacy vs VDPU381/383 paths.

The RK3588 rkvdec uses the VDPU381 / VDPU383 mainline path, not the RK3399 legacy rkvdec path. So a VP9-enablement patch must target VDPU381/383 specifically — applying RK3399's VP9 patch would land in the wrong driver region.

Suggested investigation paths

  1. Identify the upstream VP9 enablement patch for VDPU381/383: search linux-media + rockchip-linux + Kwiboo's rkvdec2 trees for VP9 on VDPU381/383. There's likely a patch out there (Collabora / Bootlin / kwiboo have worked on this).
  2. Check whether a DTS node is needed: RK3588 may need a compatible = "rockchip,rk3588-vdpu381" or similar binding entry that's not in the ampere DTB yet. Check arch/arm64/boot/dts/rockchip/rk3588-coolpi-cm5-genbook.dts (or its includes) for the rkvdec node — what compatible does it advertise?
  3. Confirm v4l2_vp9 helper module exposes the right controls for the RK3588 binding to use — it's loaded but unused-on-rkvdec for RK3588 currently.
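
For path 2, the `compatible` strings can also be read from the live device tree rather than from the DTS source. A devicetree `compatible` property is a list of NUL-terminated strings, so it needs splitting before it is readable. A minimal sketch; `print_compatibles` is a hypothetical helper name, and which node belongs to rkvdec on ampere is not known here and has to be found by inspection:

```shell
# Print each NUL-terminated string of a devicetree "compatible" property
# (file path given as $1) on its own line.
print_compatibles() {
    tr '\0' '\n' < "$1"
}
```

One way to use it is to walk `/proc/device-tree/` with `find /proc/device-tree/ -name compatible`, run `print_compatibles` on each hit, and grep the output for a decoder-looking string. The string to grep for is an assumption until the DTS has been checked.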

Out-of-scope per operator policy

Same as the HEVC issue: ampere stays on a clean mainline + board-DTS kernel. VP9 enablement is a kernel-agent experiment branch / target, NOT a patch on the baseline linux-ampere-fourier package.

Acceptance

v4l2-ctl -d /dev/video1 --list-formats-out lists VP9F (or whatever the registered pixfmt FOURCC is) in addition to S265 + S264. Then vainfo on the libva backend lists VAProfileVP9Profile0. Then a follow-up ampere-fourier iteration validates the VP9 decode end-to-end (Phase 3 C1-C6 against an encoded VP9 clip).
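
The libva half of the acceptance criterion can be checked the same way as the kernel half. A minimal sketch, assuming `vainfo` prints one `<profile> : <entrypoint>` pair per line; `has_va_profile` is a hypothetical helper name:

```shell
# Succeed if the VA profile named in $1 is listed in `vainfo` output on stdin.
# Matches on the first whitespace-separated field, so leading indentation
# and the trailing ": VAEntrypointVLD" part are ignored.
has_va_profile() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}
```

On ampere this would be fed from `env LIBVA_DRIVER_NAME=v4l2_request vainfo 2>/dev/null | has_va_profile VAProfileVP9Profile0`. An exact field match is used instead of a plain `grep VP9` so that a future `VAProfileVP9Profile2` entry does not satisfy a Profile0 check.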

Refs

  • ampere-fourier iter1 Phase 0: ~/src/ampere-fourier/phase0_findings.md
  • Sibling: HEVC oops issue filed separately
  • Memory feedback_rkvdec_patch_reachability (VDPU381/383 vs RK3399 legacy)
Author
Collaborator

Triage refresh 2026-05-18. Still applicable, no progress in the 2 days since filing. Confirming the state:

  • fleet/ampere.yaml manifest preamble explicitly enumerates VP9 enablement as out-of-scope for the current linux-ampere-fourier 7.0rc3.kafr1-1 baseline ("RFC-stage work, scope unclear until research lands").
  • No VP9 patches exist anywhere under ~/src/kernel-agent/patches/.
  • Userspace side is ready: libva-v4l2-request-fourier/src/codec.c already maps VAProfileVP9Profile0 → V4L2_PIX_FMT_VP9_FRAME (uses this branch for hantro on fresnel). Once the kernel-side VDPU381/383 binding registers V4L2_PIX_FMT_VP9_FRAME as an OUTPUT pixfmt, vainfo on ampere will pick it up with zero backend work.

Operator decision paths

a) Upstream-patch hunt: search Kwiboo's rkvdec / rkvdec2 trees, Collabora's linux-media branches, Bootlin's recent rockchip patchsets for a VP9-on-VDPU381/383 enablement. If a patch series exists in any state, evaluate for cherry-pick into the kernel-agent stack as a scope-tagged driver/media/ patch (kernel-agent experiment branch, NOT baseline).

b) In-house enablement: register V4L2_PIX_FMT_VP9_FRAME in vdpu381_variant_ops (and vdpu383_variant_ops if hardware supports) — needs validating the chip actually has VP9 decode silicon in its VDPU38x block (RK3588 TRM check); needs a coded-format → variant lookup entry; needs the vp9 v4l2 stateless helper already loaded (✓ per issue body's lsmod evidence).

c) Defer indefinitely: VP9 is mostly browser/YouTube content; if Pi 5 + RK3399 (fresnel) cover the fleet's VP9 needs adequately, RK3588-VP9 may not be worth the kernel-side investment. ampere's H.264 + HEVC + (eventually) AV1 may be the priority.

The "vainfo 9 profiles, target 10" headline metric in the issue body is unchanged today — no point re-running v4l2-ctl --list-formats-out on ampere until one of the above paths produces a patch to test against.

Recommend (a) — Kwiboo's tree in particular has had VP9 work historically; worth ~30min of grep. No empirical reproduction needed; symptom is well-characterized + invariant.

Keeping open as waiting-on-operator-pick (same shape as #14, https://git.reauktion.de/marfrit/kernel-agent/issues/14).

Author
Collaborator

Patch hunt result + integration attempt 2026-05-18.

Patch found: D.V.A.B. Sarma's add-rkvdec-vdpu381-vp9-v8 series

Located via Collabora's blog post on RK3588/RK3576 video decoder upstream merge (their attribution: "VP9 on RK3588, for which D.V.A.B. Sarma added support"). The branch lives at:

  • https://github.com/dvab-sarma/android_kernel_rk_opi/tree/add-rkvdec-vdpu381-vp9-v8

Three commits, ~1500 LOC total, sit on top of Detlev Casanova's Collabora VDPU381/VDPU383 H.264+HEVC base (which is in mainline 7.0):

  1. f60174f07: rename helper get_ref_buf → get_ref_buf_vp9 (10 LOC).
  2. e87662ca3: move VP9 functions to rkvdec-vp9-common.{c,h} for cross-variant reuse (172 LOC, 2 new files).
  3. aa00b89b6: the actual VP9 backend for VDPU381 (1303 LOC, adds rkvdec-vdpu381-vp9.c + register defs + glue).

Author commit msg: "The VDPU381 supports VP9 decoding up to 7680x4320@30fps. It supports YUV420 (8 and 10 bits) i.e Profile 0 and Profile 2. Testing shows promising results. Testing done on Orange pi 5 pro board with aosp 16 and with FFMPEG."

Per Collabora, this is "v8" of an internal iteration — being coached on kernel etiquette before a v1 hits linux-media. So it's "fairly mature out-of-tree" with no upstream timeline yet.

Integration attempt

Cherry-picked all 3 commits onto vp9-vdpu381-sarma-test (a throwaway branch off ampere-minimal-devices) on ampere's /home/mfritsche/src/linux-rockchip. All 3 applied cleanly (one pre-existing local-mod conflict required stashing the issue14 vb2-resv work before the 3rd pick; later restored). Module-only build succeeded:

[14/19] CC  rkvdec-vdpu381-vp9.o
LD [M]  rockchip-vdec.ko        # 241 KB, up from 122 KB

Install attempt blocked by a different problem entirely

modprobe of the new .ko failed:

module rockchip_vdec: .gnu.linkonce.this_module section size must match the kernel's built struct module size at run time

Root cause: /home/mfritsche/src/linux-rockchip/.config was modified on May 17 (DEBUG_OBJECTS, DEBUG_LOCK_ALLOC, DEBUG_MUTEXES, DEBUG_ATOMIC_SLEEP added; compiler bumped to GCC 16.1.1) — but the running kernel was built on May 16 with the older .config + GCC 15.2.1. The new debug options expand struct module so the module-vs-kernel size check rejects the rebuild.

84 .config lines differ between the running kernel and the current source tree's .config.
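
A mismatch like this can be caught before building a module by diffing the running kernel's configuration against the tree's `.config`. A minimal sketch, assuming the running kernel exposes `/proc/config.gz` (it does on ampere, per the later comments); `config_diff` is a hypothetical helper name, and its line count is not guaranteed to match the figure of 84 quoted above, which may have been counted differently:

```shell
# Print the CONFIG_ lines that differ between two kernel config files.
# Lines are sorted first so that ordering differences are not reported.
# Both "CONFIG_X=y" and "# CONFIG_X is not set" forms are considered.
config_diff() {
    a=$(mktemp)
    b=$(mktemp)
    grep -E '^(# )?CONFIG_' "$1" | sort > "$a"
    grep -E '^(# )?CONFIG_' "$2" | sort > "$b"
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}
```

Typical use would be `zcat /proc/config.gz > /tmp/running.config` followed by `config_diff /tmp/running.config .config | wc -l`. A non-zero count involving debug options is a strong hint that a module-only build will be rejected at load time.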

Restored the backup .ko, rkvdec is back to its working state on ampere; no permanent change. Pre-existing .pre-vp9-test-bak retained alongside the existing .pre-issue14-bak as audit trail.

What's actually needed to ship VP9 on ampere

Module-only build is structurally impossible until kernel + module are built from the same .config + same compiler. The realistic paths:

a) Full kernel rebuild with the current .config + Sarma's 3 commits + the existing issue14 local mods + reboot ampere into the new kernel. This is normal kernel-agent flow. Estimated effort: 30–45 min build + reboot risk on a host with the known black-screen-bisect concern from fleet/ampere.yaml.

b) Mirror the May-16 build environment (extract .config from /proc/config.gz, install GCC 15.2.1 toolchain, build module against that) — produces a compatible module without touching the running kernel. Estimated effort: 1 hour for toolchain setup + module rebuild + test.

c) Wait for Sarma's v1 → linux-media → mainline merge + adopt at next kernel-agent baseline bump. Zero local risk, indefinite timeline (Collabora hasn't named a target release).

Status / next move

Recommend (b) as the quickest empirical win — keeps the running kernel untouched, lets us verify Sarma's patches actually decode VP9 end-to-end on ampere before committing to a full kernel rebuild. If verification passes, then (a) becomes the ship path.

Keeping issue open. No autonomous next step until you pick a path — both (a) and (b) involve enough environmental setup that they should be scoped at the kernel-agent level rather than this issue's level.

Artifacts of this attempt

  • ampere /home/mfritsche/src/linux-rockchip — branch ampere-minimal-devices restored, local mods re-applied. Sarma remote retained but no branches checked out.
  • /lib/modules/.../rockchip-vdec.ko — original (May 17 09:00 timestamp, 122 KB)
  • /lib/modules/.../rockchip-vdec.ko.pre-vp9-test-bak — kept as audit trail
Author
Collaborator

Path (a) — full kernel rebuild — built + installed 2026-05-18.

Build + install summary

| Step | Result |
|---|---|
| Cherry-pick Sarma's 3 commits onto ampere-minimal-devices (throwaway branch vp9-build) | clean, no conflicts |
| Stage build dir with running-kernel .config from /proc/config.gz + CONFIG_LOCALVERSION="-vp9-test" | KERNELRELEASE = 7.0.0-rc3-vp9-test+ |
| Full kernel build (Image + modules + dtbs, GCC 16.1.1) | RC=0, real 60m49s |
| modules_install → /lib/modules/7.0.0-rc3-vp9-test+/ | installed (rockchip-vdec.ko et al.) |
| Image-7.0.0-rc3-vp9-test+ + rk3588-coolpi-cm5-genbook.dtb-7.0.0-rc3-vp9-test+ + initramfs-7.0.0-rc3-vp9-test+ → /boot/firmware/ | installed |
| New extlinux label arch_vp9_test added (default unchanged at arch_devices) | installed |
| Patches archived as scope-tagged kernel-agent artifacts | PR #24 |

Important: REBOOT NEEDED to test

The new kernel is not loaded yet. Current uname is still 7.0.0-rc3-devices+. To verify the VP9 enablement worked end-to-end, the operator needs to:

  1. Run sudo reboot on ampere
  2. At the 30s extlinux prompt (serial console or local display), type arch_vp9_test + Enter
  3. After boot, uname -r should return 7.0.0-rc3-vp9-test+
  4. v4l2-ctl -d /dev/video1 --list-formats-out should show VP9F alongside S265/S264
  5. End-to-end VP9 decode test:
    env LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hide_banner -loglevel error \
      -hwaccel vaapi -hwaccel_output_format vaapi \
      -i ~/measurements/encoded/bbb_60s_720p.vp9.webm \
      -vf "hwdownload,format=nv12" -frames:v 30 -f rawvideo /tmp/k_vp9.nv12
    
    Should produce 30 frames of real NV12 content (compare against a SW reference for bit-exactness).

If the new kernel doesn't boot for any reason, default extlinux label is still arch_devices — pick that at the menu (or just let the 30s timeout elapse) and ampere recovers to the May-16 running kernel.

Patches now archived in kernel-agent

PR #24 (https://git.reauktion.de/marfrit/kernel-agent/pulls/24) imports Sarma's 3 patches as scope-tagged artifacts under patches/driver/media/0001..0003-*.patch + a README.md documenting provenance, apply order, removal criteria, and the empirical results so far. Does NOT modify fleet/ampere.yaml; that bump is the operator's call once the post-reboot VP9 decode test passes.

Audit-trail files on ampere

  • /home/mfritsche/src/linux-rockchip — branch vp9-build with the 3 cherry-picks; build-vp9/ build directory retained
  • /home/mfritsche/src/linux-rockchip/.config.lockdep-kasan.bkp — the prior .config (lockdep-kasan kernel config) preserved for re-use
  • /home/mfritsche/vp9-test-install.sh — the install script (idempotent, can re-run if rollback + re-test needed)
  • /tmp/vp9-kernel-build-20260518-134556.log — full build log (60min, 415m CPU)
  • /tmp/vp9-install.log — install log
  • /boot/firmware/extlinux/extlinux.conf.pre-vp9-test.bkp — extlinux backup before adding the new label
  • /lib/modules/.../rockchip-vdec.ko.pre-vp9-test-bak — earlier module-only-install backup (orthogonal to this work, retained as audit trail)

Status: waiting on operator reboot. Issue stays open until VP9 decode is empirically verified on ampere.

Author
Collaborator

🎉 Empirical VP9 decode on ampere CONFIRMED working, 2026-05-18.

Boot

$ uname -r
7.0.0-rc3-vp9-test+

Booted via the arch_vp9_test extlinux label (after one round-trip to fix a missing append line bug in the install script — now fixed in /home/mfritsche/vp9-test-install.sh). Operator made arch_vp9_test the default for the test boot. Recovery via USB stick was on standby; not needed.
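
A missing `append` line of the kind mentioned above is easy to detect before rebooting. A minimal sketch, assuming the usual extlinux.conf layout of a `label NAME` line followed by that entry's directives; `labels_missing_append` is a hypothetical helper name and is not taken from vp9-test-install.sh:

```shell
# Print the name of every extlinux label (config file path in $1)
# that has no "append" line before the next label or end of file.
labels_missing_append() {
    awk '
        tolower($1) == "label" {
            if (name != "" && !seen) print name
            name = $2
            seen = 0
        }
        tolower($1) == "append" { seen = 1 }
        END { if (name != "" && !seen) print name }
    ' "$1"
}
```

Running `labels_missing_append /boot/firmware/extlinux/extlinux.conf` and getting empty output would be the pass condition here. An entry without `append` is not invalid in general, so this is a board-specific sanity check rather than a syntax validator.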

Kernel-side: VP9F enumerated, decode bit-exact

$ v4l2-ctl -d /dev/video2 --list-formats-out
	[0]: 'S265' (HEVC Parsed Slice Data, compressed)
	[1]: 'S264' (H.264 Parsed Slice Data, compressed)
	[2]: 'VP9F' (VP9 Frame, compressed)       ← NEW

(/dev/video2 is rkvdec on RK3588's first VDPU381 core at fdc38100; the second core at fdc40100 is rejected with "missing multi-core support" — expected, Sarma's series is single-core only.)

$ ffmpeg -hwaccel v4l2request -ss 5 -i bbb_60s_720p.vp9.webm \
    -vf hwdownload,format=nv12 -frames:v 10 -f rawvideo /tmp/k_vp9.nv12
$ ffmpeg -ss 5 -i bbb_60s_720p.vp9.webm \
    -vf format=nv12 -frames:v 10 -f rawvideo /tmp/sw_vp9.nv12
$ sha256sum /tmp/{k,sw}_vp9.nv12
614a3a5cf89702aa5c9ada5903bb19482e078555ebcd6c03f936a8e5d219ca30  /tmp/k_vp9.nv12
614a3a5cf89702aa5c9ada5903bb19482e078555ebcd6c03f936a8e5d219ca30  /tmp/sw_vp9.nv12

Bit-exact HW==SW on 10-frame mid-fixture. Sarma's patches work.

libva surface: VP9 profile auto-enumerated

$ env LIBVA_DRIVER_NAME=v4l2_request vainfo | grep VP9
      VAProfileVP9Profile0            :	VAEntrypointVLD

The iter38 multi-device probe in libva-v4l2-request-fourier picked up VP9 automatically; no backend changes were needed. VAAPI consumers (firefox-fourier, chromium-fourier, mpv vaapi, ...) can use it immediately.

libva path: works but bit-divergence vs kdirect (known limitation)

$ env LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hwaccel vaapi ... -i bbb_60s_720p.vp9.webm ...
exit=0; 13.8 MB written; 249 unique bytes (real content)
libva sha:   51672c373689cb02d26565a5cfa1433816f0d84117757e7915802565710633fc
kdirect sha: 614a3a5cf89702aa5c9ada5903bb19482e078555ebcd6c03f936a8e5d219ca30

Libva produces correct file size + real content, but the bytes differ from kdirect / SW. First-bytes inspection shows libva at 0d 0d ... (very dark) vs kdirect/SW at c5 c5 ... (bright). Likely either:

  • libva picks a different frame range (different seek semantics)
  • libva's VP9 frame-ctrl submission to the kernel differs subtly from ffmpeg's V4L2 hwaccel — same class as the Hi10P libva ctrl-submission bug

This is a separate issue from the kernel-side work this ticket covers. The kernel + kdirect path is solidly bit-exact. The libva-side fine-tuning is its own debug session if/when we want browser-side VP9 acceleration.
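
The kind of byte-level inspection quoted above (count of distinct byte values, first bytes of the dump) can be reproduced with `od`. A minimal sketch; `unique_bytes` and `first_bytes` are hypothetical helper names, and this is not necessarily how the figures in this comment were obtained:

```shell
# Print how many distinct byte values occur in the file given as $1.
# A very low count suggests blank or uniform frames rather than real content.
unique_bytes() {
    od -An -v -tx1 "$1" | tr -s ' ' '\n' | grep . | sort -u | wc -l | tr -d ' '
}

# Print the first $2 bytes of file $1 as space-separated hex pairs.
first_bytes() {
    od -An -v -tx1 -N "$2" "$1" | tr -s ' \n' '  ' | sed 's/^ //; s/ $//'
}
```

For example, `first_bytes /tmp/k_vp9.nv12 16` shows whether a raw NV12 dump starts in a dark or bright luma range, which helps tell "different frames were decoded" apart from "the same frames were decoded wrongly".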

State on ampere

  • Booted kernel: 7.0.0-rc3-vp9-test+ (default in extlinux as of this test)
  • Both rkvdec cores attempt to register; only the primary at fdc38100 succeeds (multi-core unsupported in Sarma's v8 series)
  • Other decoders (hantro mpeg2/vp8, av1-vpu-dec) unaffected by the kernel rebuild
  • Audit-trail backups all intact (extlinux.conf.pre-*-bkp, .pre-vp9-test-bak .ko files, .pre-iter6postmortem etc.)

Path forward (operator pick)

The kernel-side issue is functionally resolved. Next moves:

a) Merge PR #24 (https://git.reauktion.de/marfrit/kernel-agent/pulls/24), which archives the 3 patches as scope-tagged artifacts.
b) Bump fleet/ampere.yaml to include the 3 patches in the official baseline. Then any future linux-ampere-fourier PKGBUILD rebuild picks them up.
c) Optionally: chase the libva-vs-kdirect bit-divergence (separate ticket). For browser-side VP9 acceleration via firefox-fourier/chromium-fourier, this matters.
d) Or: keep ampere on arch_vp9_test as the new working default; the existing arch_devices label stays as a fallback (no patches, no VP9).

Recommend (a) + (b) immediately, defer (c) until a browser consumer surfaces a concrete VP9 bug. Closing this issue is justified by the kernel-side bit-exact proof; the libva fine-tuning goes to a new ticket if needed.

Author
Collaborator

Correction to my prior comment — libva path is ALSO bit-exact.

Per @marfrit's discipline reminder ("do not use the black frames of bbb for verification"), I re-tested at -ss 30 (deep into the action, well past the Blender title sequence):

$ for path in k sw libva; do
    ffmpeg [path-specific args] -ss 30 -i bbb_60s_720p.vp9.webm \
      -vf [hwdownload-or-format],format=nv12 -frames:v 20 \
      -f rawvideo -y /tmp/${path}_vp9_ss30.nv12
  done
$ sha256sum /tmp/{k,sw,libva}_vp9_ss30.nv12
c8624d7c42db66525f53a02a515bc38d0a17ef39f692660cc7bebb1e2d2e1b48  /tmp/k_vp9_ss30.nv12
c8624d7c42db66525f53a02a515bc38d0a17ef39f692660cc7bebb1e2d2e1b48  /tmp/sw_vp9_ss30.nv12
c8624d7c42db66525f53a02a515bc38d0a17ef39f692660cc7bebb1e2d2e1b48  /tmp/libva_vp9_ss30.nv12

All three paths byte-identical (sha c8624d7c42db66525f53a02a515bc38d0a17ef39f692660cc7bebb1e2d2e1b48) on the 20-frame mid-fixture range at -ss 30.

The earlier "libva differs from kdirect" at -ss 5 was the BBB title-sequence trap — -ss 5 lands the libva path on slightly different title frames than kdirect because of seek-table interpretation differences. Both decoded those frames correctly; they were just different frames. Mid-fixture (post-fadein, post-title) all three paths agree.
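
The three-way comparison above can be scripted so that a test rig fails loudly instead of relying on someone eyeballing digests. The following is only a minimal sketch: `all_identical` is a hypothetical helper name, not part of the existing rig, and it assumes GNU `sha256sum` and `awk` are available on the test host.

```shell
# Hypothetical helper (not part of the existing test rig): succeed only
# if every file named on the command line has the same SHA-256 digest.
all_identical() {
    # Refuse to compare if any input is missing or unreadable, so a
    # failed decode cannot silently drop out of the comparison.
    for f in "$@"; do
        [ -r "$f" ] || return 2
    done
    # sha256sum prints "<digest>  <path>"; count distinct digests in
    # column 1 and exit 0 only when there is exactly one.
    sha256sum -- "$@" | awk '!seen[$1]++ { n++ } END { exit (n != 1) }'
}
```

For the run above this would be called as `all_identical /tmp/k_vp9_ss30.nv12 /tmp/sw_vp9_ss30.nv12 /tmp/libva_vp9_ss30.nv12 && echo bit-exact`. Note that matching digests only show the outputs agree with each other; picking a seek point past the title sequence is still up to the caller.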

Updated verdict — VP9 on RK3588 is FULLY WORKING

  • Kernel side: VP9F enumerates on rkvdec /dev/video2, decode bit-exact HW==SW ✓
  • kdirect userspace (ffmpeg -hwaccel v4l2request): bit-exact ✓
  • libva userspace (ffmpeg -hwaccel vaapi, what firefox-fourier / chromium-fourier use): bit-exact ✓
  • vainfo: VAProfileVP9Profile0 enumerated automatically (iter38 multi-device probe) ✓
  • Other decoders: unaffected ✓

No separate libva-side debug session needed for VP9 — it's done across the board.
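
For future kernel rebuilds, the enumeration part of this checklist can be turned into a quick regression check. This is a sketch under assumptions: `has_fourcc` is a hypothetical helper name, and it only pattern-matches the text that `v4l2-ctl --list-formats-out` prints, in the format shown earlier in this thread.

```shell
# Hypothetical helper: read `v4l2-ctl --list-formats-out` output on
# stdin and succeed if the given fourcc (e.g. VP9F) is listed.
has_fourcc() {
    # Formats are printed as [n]: 'FOURCC' (description), so match the
    # quoted fourcc as a fixed string rather than a regex.
    grep -q -F -- "'$1'"
}
```

On ampere this would be used as `v4l2-ctl -d /dev/video2 --list-formats-out | has_fourcc VP9F`. The device node is the one from this session and may differ after a reboot or on another board.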

Recommendation

This issue is fully resolved at the kernel + userspace level. Next steps:

  • Merge PR #24 (archive Sarma patches as scope-tagged artifacts)
  • Bump fleet/ampere.yaml to include them in the official baseline
  • Close this issue

Memory entry feedback_no_bbb_intro_frames added to prevent re-litigating the seek-window trap in future test rigs. Same family as feedback_compare_hw_against_sw_reference and feedback_visual_check_before_concluding_reproduced — sharpened with a specific "-ss 30 minimum" rule for BBB-based test rigs.

Author
Collaborator

Closing — VP9 on RK3588 fully resolved.

PR #24 merged at 96af34d7 carrying:

  1. Sarma's 3 VP9-VDPU381 patches imported as scope-tagged artifacts under patches/driver/media/0001..0003-*.patch + README.md with provenance / apply order / removal criteria.
  2. fleet/ampere.yaml bumped to reference the 3 patches in the baseline (apply order strict). The previous "VP9 explicitly NOT included" comment block removed.

Empirical proof (full triple bit-exact at -ss 30, post-fadein, post-title) re-recorded in the prior comment.

Reopen criteria

  • Sarma's series lands in mainline → drop the local patches, adopt the upstream version at the next kernel-agent baseline bump.
  • Multi-core VDPU381 support arrives (Sarma's v8 series only supports the primary core; the second core at fdc40100 is rejected with "missing multi-core support, ignoring this instance"). Performance gain unclear; not a regression today.

Ampere is currently running 7.0.0-rc3-vp9-test+ (the operator's hand-built kernel from this session). The next linux-ampere-fourier package rebuild from the merged manifest will produce a properly versioned kernel with the same patches.

Closed.
