claude-noether 0182307403 README: add Quickstart section with per-host install + full stack matrix
The TL;DR of 'what packages do I install to watch YouTube on my
Rockchip board with HW acceleration in Firefox' wasn't reachable
from this README without reading three other repos' commit
histories. Fixed.

Now landed at the top:

- Stack matrix: kernel (linux-{fresnel,ampere}-fourier) -> ffmpeg
  (ffmpeg-v4l2-request-fourier) -> libva (libva-v4l2-request-fourier)
  -> browser (firefox-fourier or chromium-fourier + kwin-fourier on
  Wayland).
- Honest acknowledgement that the browser HW path is libavcodec
  hwdevice DRM, not VAAPI-via-libva. This backend matters for mpv /
  ffmpeg-as-vaapi consumers.
- Per-host pacman -S incantations for fresnel (RK3399), ampere
  (RK3588), ohm (RK3566).
- Live marfrit repo URL + signing-key import flow.
- Smoke-test commands (vainfo + MOZ_LOG patterns).
- Honest status flag: ffmpeg-v4l2-request-fourier, chromium-fourier,
  qt6-base-fourier exist in marfrit-packages source tree but NOT
  yet in the live repo. Users building those locally now.
- RK3588 mainline (Feb 2026) called out alongside ampere row.

What hasn't changed: Pi 5 standoff section, technical notes,
existing iter39 / iter40 status tables.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 20:48:53 +00:00


# libva-v4l2-request-fourier
VA-API ICD backend for V4L2 stateless video decoders. Fourier-campaign
fork of the dormant `bootlin/libva-v4l2-request` upstream.
> **TL;DR for "I want hardware-accelerated YouTube in Firefox on my
> Rockchip board":** skip to the [§ Quickstart](#quickstart) below.
> Fresnel (RK3399) and ampere (RK3588) are validated targets; ohm
> (RK3566 PineTab2) is the chromium-fourier validation rig.
## What works
| SoC / host | HW-accelerated codecs | Bit-exact vs `kdirect` |
|---|---|---|
| RK3399 (fresnel, Pinebook Pro) | H.264, HEVC Main, VP9 Profile 0, VP8, MPEG-2 | 5/5 at iter38; preserved through iter40b |
| RK3588 (ampere) | H.264 + HEVC (iter1+iter2 ampere-fourier); **mainline rkvdec / VDPU381 + VDPU383 landed February 2026**; VP9 / AV1 verification next | iter1 H.264 PASS; remaining codecs gated on mainline-driver bring-up |
| RK3568 / RK3566 (ohm, PineTab2) | H.264, MPEG-2, VP8 via hantro multi-planar | iter1-5 baseline (libva-multiplanar campaign) |
| BCM2712 (higgs, Pi 5 / CM5) | none | infrastructure landed (iter40 / iter40b), bit-exact NOT achieved, [see § Pi 5 standoff](#the-pi-5-standoff) |

`kdirect` is the reference: `ffmpeg -hwaccel v4l2request
-hwaccel_output_format drm_prime ...` via Kwiboo's downstream ffmpeg
patches (packaged here as **`ffmpeg-v4l2-request-fourier`**, FFmpeg 8.1
tip @ Kwiboo `v4l2-request-n8.1` commit `b57fbbe`).
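
For readers who want to reproduce a bit-exact check, one possible comparison step is sketched below. It assumes both decode paths have already been dumped as per-frame hash lists with ffmpeg's `framemd5` muxer. The `frame_hashes` and `framemd5_equal` helpers are illustrative names, not part of this campaign's harness.

```bash
# frame_hashes FILE: print only the hash column of a framemd5 dump.
# Header lines start with '#'; the hash is the last comma-separated field.
frame_hashes() {
    awk -F',' '!/^#/ && NF { gsub(/ /, "", $NF); print $NF }' "$1"
}

# framemd5_equal REF OUT: succeed only if REF lists at least one frame and
# both dumps carry the same hashes in the same order.
framemd5_equal() {
    ref_hashes=$(frame_hashes "$1")
    out_hashes=$(frame_hashes "$2")
    [ -n "$ref_hashes" ] && [ "$ref_hashes" = "$out_hashes" ]
}
```

Only the hash column is compared, so differing timestamps or header comments between the two runs do not count as a mismatch. Hypothetical usage: `framemd5_equal kdirect.framemd5 vaapi.framemd5 && echo "bit-exact"`.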
## Quickstart
### What you need for HW-accelerated YouTube in Firefox
The full stack, top to bottom, with the package this campaign provides
at each layer:
| Layer | Package(s) | Notes |
|---|---|---|
| Linux kernel with V4L2 stateless decoders | `linux-fresnel-fourier` (RK3399), `linux-ampere-fourier` (RK3588) | Mainline rkvdec / hantro / VDPU381 / VDPU383. ohm typically rides on a Beryllium OS host kernel. |
| `ffmpeg` with Kwiboo's v4l2-request hwaccel | `ffmpeg-v4l2-request-fourier` | Provides `-hwaccel drm -c:v hevc` (and h264/vp9) routes via libavcodec hwdevice DRM. |
| `libva` VA-API runtime + this backend ICD | `libva` (stock) + **`libva-v4l2-request-fourier`** | This repo. Auto-detects rkvdec / hantro / cedrus on probe. |
| Firefox patched to call libavcodec stateless | `firefox-fourier` | 5-patch series, ~+169 LoC over stock Firefox. Validated on fresnel: **~5 % CPU at 1080p30 H.264** (vs 64 % software). |
| (Wayland alt) Chromium patched for V4L2VDA | `chromium-fourier` + `kwin-fourier` | Validated on ohm under KDE Plasma 6.6.5 Wayland. Needs `kwin-fourier` for the dmabuf-fence latency fix. |
| (Optional) panfrost / panthor GPU stack | `vulkan-panfrost` | Wayland compositor + 3D. |

The actual VA-API path is mostly historical inside this campaign: the
**user-facing browser HW decode story rides libavcodec's
`v4l2_request` hwaccel directly**, not VAAPI-via-libva. Firefox-fourier
attaches an `AV_HWDEVICE_TYPE_DRM` context to libavcodec's generic
`h264`/`hevc`/`vp9` decoder; libavcodec then auto-binds the
`v4l2_request` hwaccel from its `hw_configs`. No `LIBVA_DRIVER_NAME`
incantation needed for browser use. libva-v4l2-request-fourier matters
for mpv, ffmpeg-as-vaapi, and other VA-API direct consumers.
### Install on Arch ALARM (fresnel / ampere / ohm)
Add the marfrit repo if you haven't already:
```ini
# /etc/pacman.conf
[marfrit]
SigLevel = Required
Server = https://packages.reauktion.de/arch/$arch
```
Import the signing key (one-time):
```bash
sudo pacman-key --recv-keys <KEY-ID> # see https://packages.reauktion.de
sudo pacman-key --lsign-key <KEY-ID>
sudo pacman -Sy
```
Then per host:
```bash
# Fresnel: RK3399 Pinebook Pro
sudo pacman -S \
    linux-fresnel-fourier linux-fresnel-fourier-headers \
    libva-v4l2-request-fourier \
    firefox-fourier
# (ffmpeg-v4l2-request-fourier is currently still a local build; see § Status)

# Ampere: RK3588
sudo pacman -S \
    linux-ampere-fourier linux-ampere-fourier-headers \
    libva-v4l2-request-fourier \
    firefox-fourier

# Ohm: RK3566 PineTab2 (chromium-fourier validated path)
sudo pacman -S \
    libva-v4l2-request-fourier \
    kwin-fourier
# (chromium-fourier is currently still a local build; see § Status)
```
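
If you are not sure which of the three package sets applies, the SoC can usually be read from the device tree. The sketch below maps it to the package lists from the install block above. The `fourier_packages_for` helper is hypothetical, and it assumes that matching on the `rockchip,rk3399` / `rockchip,rk3588` / `rockchip,rk3566` compatible strings is enough to tell the hosts apart; any other board with the same SoC will match too.

```bash
# fourier_packages_for [COMPAT_FILE]: print the package set for the SoC named
# in a device-tree "compatible" file (a list of NUL-separated strings).
fourier_packages_for() {
    compat_file=${1:-/proc/device-tree/compatible}
    # Turn the NUL separators into newlines so the strings fit in a variable.
    compat=$(tr '\0' '\n' < "$compat_file")
    case "$compat" in
        *rockchip,rk3399*)
            echo "linux-fresnel-fourier linux-fresnel-fourier-headers libva-v4l2-request-fourier firefox-fourier" ;;
        *rockchip,rk3588*)
            echo "linux-ampere-fourier linux-ampere-fourier-headers libva-v4l2-request-fourier firefox-fourier" ;;
        *rockchip,rk3566*)
            echo "libva-v4l2-request-fourier kwin-fourier" ;;
        *)
            echo "no fourier package set known for this board" >&2
            return 1 ;;
    esac
}
```

Review the printed list before passing it on, for example with `sudo pacman -S $(fourier_packages_for)`.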
Reboot if a new kernel landed. Then:
```bash
# Smoke-test: vainfo should list HEVCMain + H264 entries
LIBVA_DRIVER_NAME=v4l2_request vainfo
# Browser launch with verbose decoder logging
MOZ_LOG="PlatformDecoderModule:5,FFmpegVideo:5" \
firefox-fourier 2>&1 | tee /tmp/fx.log
# Then open a YouTube 1080p H.264 video and grep for:
# "Choosing FFmpeg pixel format for V4L2 video decoding"
# "av_hwdevice_ctx_create(DRM, /dev/dri/renderD128) ok"
# If you DON'T see those: HW path didn't engage, fell back to software.
```
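
The grep step can be scripted. This is a minimal sketch that treats the HW path as engaged only when both marker strings quoted above appear in the log; requiring both is an interpretation, and the render node path is taken verbatim from the smoke test, so it may differ on systems where the node is not `renderD128`. The `hw_path_engaged` helper name is made up for illustration.

```bash
# hw_path_engaged LOGFILE: succeed only if both marker strings are present.
# -F makes grep match them as literal strings rather than regular expressions.
hw_path_engaged() {
    log=$1
    grep -qF 'Choosing FFmpeg pixel format for V4L2 video decoding' "$log" &&
    grep -qF 'av_hwdevice_ctx_create(DRM, /dev/dri/renderD128) ok' "$log"
}
```

Usage against the log written by the smoke test: `hw_path_engaged /tmp/fx.log && echo "HW decode engaged" || echo "software fallback"`.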
### Status of the published vs locally-built packages
As of May 2026, the live marfrit repo at
<https://packages.reauktion.de/arch/aarch64/> has:
- `libva-v4l2-request-fourier-1:1.0.0.r361.cf8cd9d-1` (iter40b tip)
- `firefox-fourier-150.0.1-16` (5-patch series, sandboxed RDD HW
  decode validated on RK3399)
- `linux-fresnel-fourier-7.0-14` + headers (RK3399)
- `linux-ampere-fourier-7.0rc3.kafr1-1` + headers (RK3588)
- `kwin-fourier-1:6.6.5-1` (Wayland dmabuf-fence fix for chromium-fourier)
- `vulkan-panfrost-1:26.0.5-1` (GPU stack)

NOT yet published but **present in `marfrit-packages/arch/` source
tree** (build + publish pending):
- `ffmpeg-v4l2-request-fourier` (Kwiboo's v4l2-request hwaccel
  series on FFmpeg 8.1). firefox-fourier's HW path RELIES on this; the
  validated 5 % CPU measurement was with this companion installed
  manually.
- `chromium-fourier` (Chromium 147 + V4L2VDA-on-mainline patches;
  blocked on Arch ALARM bumping clang 22 → 23).
- `qt6-base-fourier` (GL_ALPHA → GL_R8 fix, needed by KDE Plasma
  Wayland on the panfrost stack).

If you want the fully-validated Firefox-on-fresnel stack today, you'll
need to build `ffmpeg-v4l2-request-fourier` from
`marfrit-packages/arch/ffmpeg-v4l2-request-fourier/` locally:
```bash
git clone ssh://git@git.reauktion.de:2222/marfrit/marfrit-packages.git
cd marfrit-packages/arch/ffmpeg-v4l2-request-fourier
makepkg -si
```
Same recipe for `chromium-fourier` and friends. They'll move to the
live repo as their dependencies settle.
## What does NOT work, and why it's stalled
| Target | Status | Blocker |
|---|---|---|
| H264 Hi10P on RK3399 | enumerated, decode returns all-zero | RK3399 silicon doesn't implement 10-bit despite kernel advertising the profile (iter39 close, Option B applied) |
| HEVC Main10 on RK3399 | not enumerated | same as Hi10P |
| **Pi 5 / CM5 (BCM2712 / `rpi-hevc-dec`)** | infrastructure landed (iter40 / iter40b), bit-exact NOT achieved | see "The Pi 5 standoff" below |
### The Pi 5 standoff
iter40 + iter40b add a third multi-device-probe slot for
`rpi-hevc-dec`, an NC12 SAND128 detile primitive, per-driver gates
around the SPS pre-seed + start-code-prepend + scaling_matrix submission,
and a (fragile, fixture-specific) SPS field override using the
GStreamer 1.28.2 H.265 parser. ICD discovery works, `vainfo` lists
`VAProfileHEVCMain`, S\_FMT / REQBUFS / STREAMON all succeed.

**Decode itself never succeeds**: every CAPTURE DQBUF returns
`V4L2_BUF_FLAG_ERROR`. Driver author John Cox confirmed strict SPS
validation is intentional ("`try_ext_ctrls returned an error (22)` is
expected as it is validating the SPS"), and VAAPI's
`VAPictureParameterBufferHEVC` simply doesn't carry the bitstream-true
scalars (`sps_max_num_reorder_pics`, `sps_max_latency_increase_plus1`,
slice-level `num_entry_point_offsets`) that the driver wants. We can't
fish the SPS out of `source_data` either, because ffmpeg-vaapi parses
the SPS itself and passes only slice NAL bytes to libva backends.

This is not a bug in our backend, in libva, in ffmpeg, or in the kernel
driver. It's an ecosystem coordination failure of long standing:
- **Kwiboo's `ffmpeg-v4l2request` hwaccel** has been in production via
LibreELEC since December 2018. Re-submitted to ffmpeg-devel as a v2
series in August 2024. Still un-merged in May 2026, **more than seven
years after it first shipped**.
- **`libva-v4l2-request`** (this project's upstream) hasn't taken
meaningful commits since ~2021. Nobody wants to own the impedance
mismatch between VAAPI's Intel-shaped "give me raw bitstream, I'll
parse" and V4L2 stateless's kernel-shaped "give me parsed structs,
I'll just drive the HW."
- **`rpi-hevc-dec` mainline submission** is at v4 (July 2025), 17
months in review. The Pi 6.18.x downstream kernel meanwhile has
active HEVC regressions ([raspberrypi/linux#7228](https://github.com/raspberrypi/linux/issues/7228),
[#7306](https://github.com/raspberrypi/linux/issues/7306)) that
aren't being fast-tracked because "the new uAPI is coming."
- **Mozilla is implementing Pi 5 HEVC via ffmpeg's hwaccel-context
path** (bug [1969297](https://bugzilla.mozilla.org/show_bug.cgi?id=1969297)),
not via libva — explicit acknowledgement from David Turner that
libavcodec needs to retain the SPS context for the strict driver to
accept the control batch.

What end-users actually do today: run Pi OS (downstream-patched ffmpeg
+ downstream kernel) or LibreELEC (Kwiboo's patches + downstream
kernel). Anyone on a stock distro outside those two: no HW HEVC on
Pi 5.

Nobody who has authority to merge has skin in the game. Everyone with
skin in the game lacks authority. Result: a stalemate of more than
seven years, three forks of working code, no merged upstream.
### What this means for this backend
We chose to extend `libva-v4l2-request` into Pi 5 territory because
the architecture maps cleanly onto the existing iter38 multi-device
probe. That work landed (iter40 commit `3ffa9d0`, iter40b commit
`071b08d`). It's reusable infrastructure for any future strict V4L2
stateless decoder that ffmpeg ships before libva does.

But the *user-facing* Pi 5 HEVC story will not come from this
backend. The backend was a clean architectural target inside a
coordination dead-end. The actual Pi 5 HEVC path through libva
requires either:
- a VAAPI extension exposing the SPS scalars rpi-hevc-dec validates
against (Intel-driven; no Pi-aligned principal),
- a libva-internal `VABufferType` for raw SPS/PPS NAL bytes (no
maintainer),
- ffmpeg-vaapi forwarding `num_entry_point_offsets` to backends
(small upstream patch; no champion), OR
- the political situation around Kwiboo's series unblocks (no
visible movement).

iter40 + iter40b are **landed but parked**. The fresnel + ampere
sibling paths are unaffected (5/5 fresnel + 9 profiles ampere
verified post-iter40b, no regression). Phase 8 packaging is
deliberately skipped — shipping a `.deb` whose primary advertised
target (Pi 5) doesn't actually decode would mislead users.

See `phase0_pi5_hevc.md`, `phase1_pi5_hevc.md`,
`phase5_pi5_hevc_review.md`, `phase7_pi5_hevc_close.md` for the
chapter's full empirical record.
## Instructions
In order to use this backend, set the `LIBVA_DRIVER_NAME` environment
variable:

    export LIBVA_DRIVER_NAME=v4l2_request

Then a VA-API-capable player can decode supported codecs on a probed
device:

    vlc path/to/video.mp4
    mpv --hwdec=vaapi path/to/video.mp4
    ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i in.mp4 -f null -

The backend auto-detects available decoders via the V4L2 media
topology walk; honors `LIBVA_V4L2_REQUEST_VIDEO_PATH` and
`LIBVA_V4L2_REQUEST_MEDIA_PATH` for explicit device selection.
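
To pin the backend to one decoder for a single command without exporting anything into the shell, a small wrapper can help. This is a sketch only: the `with_v4l2_request` name is made up, the device node paths in the usage line are placeholders, and it assumes both variables are meant to be set together, which this README does not state.

```bash
# with_v4l2_request VIDEO MEDIA CMD...: run CMD with the backend selected and
# both device paths set. The variables apply to that one command only.
with_v4l2_request() {
    video=$1
    media=$2
    shift 2
    env LIBVA_DRIVER_NAME=v4l2_request \
        LIBVA_V4L2_REQUEST_VIDEO_PATH="$video" \
        LIBVA_V4L2_REQUEST_MEDIA_PATH="$media" \
        "$@"
}
```

Example with placeholder nodes: `with_v4l2_request /dev/video1 /dev/media0 vainfo`.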
## Technical Notes
### Multi-device probe (iter38)
A single libva session opens both `rkvdec` and `hantro-vpu` (and, on
hosts where it's present, `rpi-hevc-dec`) at init. `RequestCreateConfig`
re-targets the active fd per profile via
`request_switch_device_for_profile()`. Pool teardown happens at
switch time; the next `CreateContext` rebuilds against the right
device.
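
When debugging the probe it can help to see which V4L2 video nodes the host exposes at all. The sketch below reads the `name` file the kernel publishes under `/sys/class/video4linux`. The `list_v4l2_nodes` helper is illustrative, and the names it prints are whatever each driver registered, which may not match the driver names used in this section exactly.

```bash
# list_v4l2_nodes [SYSFS_DIR]: print "/dev/videoN<TAB>name" for every V4L2
# video node found in sysfs.
list_v4l2_nodes() {
    sysfs=${1:-/sys/class/video4linux}
    for entry in "$sysfs"/video*; do
        # Skip the unexpanded glob (no nodes) and entries without a name file.
        [ -r "$entry/name" ] || continue
        printf '/dev/%s\t%s\n' "${entry##*/}" "$(cat "$entry/name")"
    done
}
```

This only lists nodes; it does not tell you which one the backend picks for a given profile.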
### Surface / Context / Picture / Image
A Surface is an internal data structure containing rendering output.
A Context owns the V4L2 lifecycle (S\_FMT, CAPTURE pool, ctrl-batch
defaults) for one decode session. A Picture is one encoded input
frame's set of buffers. An Image is a standard VA pixel-format view
on a decoded Surface: the backend detiles SAND/COL128 or unpacks
NV15 to NV12/P010 here so consumers see linear pitches.

The real rendering is in `EndPicture`, not `RenderPicture`, because
the kernel needs the full extended-control batch when the OUTPUT
buffer is queued, and `RenderPicture` order is consumer-defined.