libva-v4l2-request-fourier

VA-API ICD backend for V4L2 stateless video decoders. Fourier-campaign fork of the dormant bootlin/libva-v4l2-request upstream.

What works

| SoC / host | Codecs verified bit-exact vs kdirect |
| --- | --- |
| RK3399 (fresnel, Pinebook Pro) | H.264, HEVC Main, VP9 Profile 0, VP8, MPEG-2: 5/5 at iter38 |
| RK3588 (ampere) | H.264 (iter1 ampere-fourier); HEVC EXT_SPS structure clean (iter2); other codecs in progress |
| RK3568 / RK3566 (ohm, PineTab2) | iter1-5 baseline (libva-multiplanar campaign) |

kdirect is the reference path: ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime ... run through Kwiboo's downstream ffmpeg patches. The Rockchip family benefits from years of rkvdec and hantro-vpu iteration in mainline, plus the RK3588/RK3576 video decoder series that landed in mainline in February 2026.
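
"Bit-exact" here means every byte of the decoded frame matches the kdirect reference. The comparison harness itself is not shown in this README, so the following is only a minimal sketch of such a check; `first_mismatch` and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of this backend: compare a frame decoded
 * through the VA-API path against the same frame from the kdirect reference.
 * Returns -1 when the buffers are bit-exact, otherwise the byte offset of
 * the first difference.
 */
static long first_mismatch(const uint8_t *vaapi_frame,
                           const uint8_t *kdirect_frame,
                           size_t len)
{
    assert(vaapi_frame != NULL && kdirect_frame != NULL);

    for (size_t i = 0; i < len; i++) {
        if (vaapi_frame[i] != kdirect_frame[i])
            return (long)i;
    }
    return -1; /* bit-exact */
}
```

Reporting the first differing offset rather than a plain yes/no makes it easier to tell a wrong plane offset or pitch apart from a genuinely corrupt decode.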

What does NOT work, and why it's stalled

| Target | Status | Blocker |
| --- | --- | --- |
| H.264 Hi10P on RK3399 | enumerated, decode returns all-zero | RK3399 silicon doesn't implement 10-bit despite kernel advertising the profile (iter39 close, Option B applied) |
| HEVC Main10 on RK3399 | not enumerated | same as Hi10P |
| Pi 5 / CM5 (BCM2712 / rpi-hevc-dec) | infrastructure landed (iter40 / iter40b), bit-exact NOT achieved | see "The Pi 5 standoff" below |

The Pi 5 standoff

iter40 + iter40b add a third multi-device-probe slot for rpi-hevc-dec, an NC12 SAND128 detile primitive, per-driver gates around the SPS pre-seed, start-code-prepend, and scaling_matrix submission, and a (fragile, fixture-specific) SPS field override using the GStreamer 1.28.2 H.265 parser. ICD discovery works, vainfo lists VAProfileHEVCMain, and S_FMT / REQBUFS / STREAMON all succeed.
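
To give a feel for what a SAND128 detile primitive has to do, here is a simplified luma-only sketch. It is not the backend's implementation. It assumes the 8-bit column layout in which the image is split into 128-byte-wide columns, each column stores its rows back to back, and consecutive columns start `col_stride` bytes apart (a stride that in practice also covers the chroma rows). The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAND_COL_WIDTH 128u /* bytes per column in the assumed 8-bit layout */

/*
 * Copy the luma plane of a column-tiled buffer into a linear buffer.
 * Assumed addressing: byte (x, y) lives at
 *   (x / 128) * col_stride + y * 128 + (x % 128).
 * Chroma handling is deliberately left out of this sketch.
 */
static void sand128_detile_luma(uint8_t *dst, size_t dst_pitch,
                                const uint8_t *src, size_t col_stride,
                                size_t width, size_t height)
{
    assert(dst_pitch >= width);
    assert(col_stride >= (size_t)SAND_COL_WIDTH * height);

    for (size_t x0 = 0; x0 < width; x0 += SAND_COL_WIDTH) {
        /* The last column may be only partially covered by the image. */
        size_t run = width - x0 < SAND_COL_WIDTH ? width - x0 : SAND_COL_WIDTH;
        const uint8_t *col = src + (x0 / SAND_COL_WIDTH) * col_stride;

        for (size_t y = 0; y < height; y++)
            memcpy(dst + y * dst_pitch + x0, col + y * SAND_COL_WIDTH, run);
    }
}
```

Walking column by column keeps the source reads sequential, which tends to matter when the CAPTURE buffer is mapped uncached.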

Decode itself never succeeds: every CAPTURE buffer is dequeued with V4L2_BUF_FLAG_ERROR set. Driver author John Cox confirmed that strict SPS validation is intentional ("try_ext_ctrls returned an error (22) is expected as it is validating the SPS"); error 22 is EINVAL. VAAPI's VAPictureParameterBufferHEVC simply doesn't carry the bitstream-true values the driver wants: the SPS scalars sps_max_num_reorder_pics and sps_max_latency_increase_plus1, and the slice-level num_entry_point_offsets. We can't fish the SPS out of source_data either, because ffmpeg-vaapi parses the SPS itself and passes only slice NAL bytes to libva backends.

This is not a bug in our backend, in libva, in ffmpeg, or in the kernel driver. It's an ecosystem coordination failure of long standing:

  • Kwiboo's ffmpeg-v4l2request hwaccel has been in production via LibreELEC since December 2018. Re-submitted to ffmpeg-devel as a v2 series in August 2024. Still un-merged in May 2026 — eight years in the upstream review queue.
  • libva-v4l2-request (this project's upstream) hasn't taken meaningful commits since ~2021. Nobody wants to own the impedance mismatch between VAAPI's Intel-shaped "give me picture parameters plus slice data, the hardware parses the rest" and V4L2 stateless's kernel-shaped "give me fully parsed structs, I'll just drive the HW."
  • rpi-hevc-dec mainline submission is at v4 (July 2025), 17 months in review. The Pi 6.18.x downstream kernel meanwhile has active HEVC regressions (raspberrypi/linux#7228, #7306) that aren't being fast-tracked because "the new uAPI is coming."
  • Mozilla is implementing Pi 5 HEVC via ffmpeg's hwaccel-context path (bug 1969297), not via libva — explicit acknowledgement from David Turner that libavcodec needs to retain the SPS context for the strict driver to accept the control batch.

What end-users actually do today: run Pi OS (downstream-patched ffmpeg + downstream kernel) or LibreELEC (Kwiboo's patches + downstream kernel). Anyone on a stock distro outside those two: no HW HEVC on Pi 5.

Nobody who has authority to merge has skin in the game. Everyone with skin in the game lacks authority. Result: 8-year stalemate, three forks of working code, no merged upstream.

What this means for this backend

We chose to extend libva-v4l2-request into Pi 5 territory because the architecture maps cleanly onto the existing iter38 multi-device probe. That work landed (iter40 commit 3ffa9d0, iter40b commit 071b08d). It's reusable infrastructure for any future strict V4L2 stateless decoder that ffmpeg ships before libva does.

But the user-facing Pi 5 HEVC story will not come from this backend. The backend was a clean architectural target inside a coordination dead-end. The actual Pi 5 HEVC path through libva requires either:

  • a VAAPI extension exposing the SPS scalars rpi-hevc-dec validates against (Intel-driven; no Pi-aligned principal),
  • a libva-internal VABufferType for raw SPS/PPS NAL bytes (no maintainer),
  • ffmpeg-vaapi forwarding num_entry_point_offsets to backends (small upstream patch; no champion), OR
  • the political situation around Kwiboo's series unblocks (no visible movement).

iter40 + iter40b are landed but parked. The fresnel + ampere sibling paths are unaffected (5/5 fresnel + 9 profiles ampere verified post-iter40b, no regression). Phase 8 packaging is deliberately skipped — shipping a .deb whose primary advertised target (Pi 5) doesn't actually decode would mislead users.

See phase0_pi5_hevc.md, phase1_pi5_hevc.md, phase5_pi5_hevc_review.md, phase7_pi5_hevc_close.md for the chapter's full empirical record.

Instructions

To use this backend, set the LIBVA_DRIVER_NAME environment variable:

export LIBVA_DRIVER_NAME=v4l2_request

Then a VA-API-capable player can decode supported codecs on a probed device:

vlc path/to/video.mp4
mpv --hwdec=vaapi path/to/video.mp4
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i in.mp4 -f null -

The backend auto-detects available decoders by walking the V4L2 media topology. It also honors LIBVA_V4L2_REQUEST_VIDEO_PATH and LIBVA_V4L2_REQUEST_MEDIA_PATH for explicit device selection.
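
The override logic can be pictured roughly as follows. This is a sketch under assumptions, not the backend's code: the helper names are hypothetical, the fallback paths /dev/video0 and /dev/media0 are placeholders standing in for whatever the topology walk would have found, and treating an empty variable as unset is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: prefer the value from the environment, but treat an
 * unset or empty variable as "no override" and use the fallback instead.
 */
static const char *choose_path(const char *env_value, const char *fallback)
{
    assert(fallback != NULL);
    return (env_value != NULL && env_value[0] != '\0') ? env_value : fallback;
}

/* The fallback paths below are placeholders, not the backend's defaults. */
static const char *video_path(void)
{
    return choose_path(getenv("LIBVA_V4L2_REQUEST_VIDEO_PATH"), "/dev/video0");
}

static const char *media_path(void)
{
    return choose_path(getenv("LIBVA_V4L2_REQUEST_MEDIA_PATH"), "/dev/media0");
}
```

Keeping the decision in a function that takes the value as a parameter makes it testable without touching the process environment.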

Technical Notes

Multi-device probe (iter38)

A single libva session opens both rkvdec and hantro-vpu (and, on hosts where it's present, rpi-hevc-dec) at init. RequestCreateConfig re-targets the active fd per profile via request_switch_device_for_profile(). Pool teardown happens at switch time; the next CreateContext rebuilds against the right device.
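
The switch-on-config behaviour described above can be sketched as below. Everything here is illustrative: the enum, struct, and function names are stand-ins rather than the backend's real types, the signature of request_switch_device_for_profile() is not shown in this README, and the profile-to-driver routing table is an assumption for an RK3399-like host (routing to the rpi-hevc-dec slot is omitted).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; the backend's real types are not shown here. */
enum sketch_profile { PROF_H264, PROF_HEVC, PROF_VP9, PROF_VP8, PROF_MPEG2, PROF_COUNT };
enum sketch_slot { SLOT_RKVDEC, SLOT_HANTRO, SLOT_RPI_HEVC, SLOT_COUNT };

struct sketch_session {
    int fds[SLOT_COUNT];  /* one fd per probed driver, -1 if absent */
    int active_slot;      /* index into fds[], -1 before the first config */
    bool pool_allocated;  /* CAPTURE pool currently bound to the active fd */
};

/* Assumed routing for an RK3399-like host; not taken from the source. */
static const enum sketch_slot slot_for_profile[PROF_COUNT] = {
    [PROF_H264]  = SLOT_RKVDEC,
    [PROF_HEVC]  = SLOT_RKVDEC,
    [PROF_VP9]   = SLOT_RKVDEC,
    [PROF_VP8]   = SLOT_HANTRO,
    [PROF_MPEG2] = SLOT_HANTRO,
};

/* Returns the fd that is now active, or -1 if the driver was not probed. */
static int sketch_switch_device(struct sketch_session *s, enum sketch_profile p)
{
    assert(s != NULL && (int)p >= 0 && (int)p < PROF_COUNT);

    int slot = (int)slot_for_profile[p];
    if (s->fds[slot] < 0)
        return -1;

    if (slot != s->active_slot) {
        /* Pool teardown happens at switch time; the next context rebuilds it. */
        s->pool_allocated = false;
        s->active_slot = slot;
    }
    return s->fds[slot];
}
```

Staying on the same device leaves the pool untouched, so back-to-back configs for codecs served by one driver do not pay the teardown cost.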

Surface / Context / Picture / Image

A Surface is the internal data structure that holds rendering output, that is, a decoded frame. A Context owns the V4L2 lifecycle (S_FMT, CAPTURE pool, ctrl-batch defaults) for one decode session. A Picture is the set of buffers for one encoded input frame. An Image is a standard VA pixel-format view of a decoded Surface: the backend detiles SAND/COL128 or unpacks NV15 to NV12/P010 here so consumers see linear pitches.
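
As an illustration of the NV15 unpack step, the sketch below converts one row of NV15 samples into 16-bit P010-style samples. It relies on the NV15 packing of four 10-bit samples into five bytes, little-endian, and on P010 keeping the 10 significant bits in the top of each 16-bit word. The function name is hypothetical and this is not the backend's routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Unpack one row of NV15 samples (four 10-bit values per five bytes) into
 * 16-bit words with the 10 bits stored in the most significant bits.
 * Assumes `samples` is a multiple of 4.
 */
static void nv15_row_to_p010(uint16_t *dst, const uint8_t *src, size_t samples)
{
    assert(samples % 4 == 0);

    for (size_t i = 0; i < samples; i += 4, src += 5) {
        uint16_t p0 = (uint16_t)(src[0] | ((src[1] & 0x03) << 8));
        uint16_t p1 = (uint16_t)((src[1] >> 2) | ((src[2] & 0x0f) << 6));
        uint16_t p2 = (uint16_t)((src[2] >> 4) | ((src[3] & 0x3f) << 4));
        uint16_t p3 = (uint16_t)((src[3] >> 6) | (src[4] << 2));

        /* Shift left by 6 so the 10-bit value occupies bits 15..6. */
        dst[i + 0] = (uint16_t)(p0 << 6);
        dst[i + 1] = (uint16_t)(p1 << 6);
        dst[i + 2] = (uint16_t)(p2 << 6);
        dst[i + 3] = (uint16_t)(p3 << 6);
    }
}
```

An NV12 view would instead keep only the top eight bits of each sample (`p >> 2`), dropping the two least significant bits.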

The real rendering is in EndPicture, not RenderPicture, because the kernel needs the full extended-control batch when the OUTPUT buffer is queued, and RenderPicture order is consumer-defined.
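
The split can be sketched as "collect in RenderPicture, submit in EndPicture". The buffer kinds, struct, and function names below are hypothetical, and the actual kernel submission is reduced to a counter; the point is only that nothing is handed over until the whole batch is present, whatever order the consumer chose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer kinds a consumer may submit for one picture. */
enum sketch_buf { BUF_PICTURE_PARAMS, BUF_SLICE_PARAMS, BUF_SLICE_DATA, BUF_KIND_COUNT };

struct sketch_picture {
    bool seen[BUF_KIND_COUNT];
    int submitted; /* number of complete batches handed to the kernel */
};

/* RenderPicture stage: only record what arrived; order is up to the consumer. */
static void sketch_render_picture(struct sketch_picture *pic, enum sketch_buf kind)
{
    assert(pic != NULL && (int)kind >= 0 && (int)kind < BUF_KIND_COUNT);
    pic->seen[kind] = true;
}

/*
 * EndPicture stage: submit once, and only when the whole batch is available.
 * Returns 0 on success, -1 if a required buffer is still missing.
 */
static int sketch_end_picture(struct sketch_picture *pic)
{
    assert(pic != NULL);

    for (int k = 0; k < BUF_KIND_COUNT; k++) {
        if (!pic->seen[k])
            return -1;
    }

    /* A real backend would set the extended controls and queue OUTPUT here. */
    pic->submitted++;

    for (int k = 0; k < BUF_KIND_COUNT; k++)
        pic->seen[k] = false; /* ready for the next picture */
    return 0;
}
```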
