Empirical higgs probe (sibling session 2026-05-17) confirmed rpi-hevc-dec at /dev/video19 is V4L2 STATELESS, not stateful:
- Section header literally "Stateless Codec Controls"
- OUTPUT V4L2_PIX_FMT_HEVC_SLICE (parsed slices), not full-stream HEVC
- V4L2_CID_STATELESS_HEVC_* control set + slice_param_array[4096]
- CAPTURE NC12 / NC30 (V4L2_PIX_FMT_NV12_COL128 / _10_COL128, SAND 128-column tiled, Pi-specific)

So the Pi 5 HEVC HW path belongs HERE (request/stateless backend), not in a separate stateful project. Replaces the now-deleted libva-v4l2-stateful-fourier scaffold attempt.

phase0_pi5_hevc.md captures:
- Substrate (target host, backend baseline, empirical probe output)
- What carries forward unchanged (most of HEVC plumbing)
- What needs adding (RPI_HEVC_DEC driver_kind, NC12/NC30 video_format + detile primitive, image.c branch; small surface area)
- Six open questions Phase 1 must answer first (EXT_SPS presence, start_code default, SAND tile spec, drm_prime modifier round-trip, rpi-hevc-dec submission ordering quirks, packaging target OS)
- Phase 1 goal sketch (NOT locked) + Phase 3 baseline plan

No code in this commit. Phase 1 opens when higgs is up + first two open questions are answered live.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 0 — Pi 5 / CM5 HEVC chapter
Opened 2026-05-17 evening, after the failed libva-v4l2-stateful-fourier
scaffold attempt. Brother-session empirical Phase 0 on higgs invalidated
the stateful premise: rpi-hevc-dec is V4L2 stateless, so Pi 5 HEVC
belongs in this backend, not a separate sibling.
No code in this chapter yet. This doc is the substrate. Phase 1 picks up from the "Open questions" section.
Substrate
Target host
higgs — Pi CM5 module on Pi CM5 IO board. BCM2712 SoC. VPN-only, often
offline; wake via HIS skill recipe (no Fritz!Box plug — runs on power
when on). Debian-based. Sole HW video decoder is rpi-hevc-dec at
/dev/video19 + /dev/media1.
Backend baseline at chapter open
libva-v4l2-request-fourier master tip cf8cd9d (iter39 + Option B +
h265 ref-list cap fix). Multi-device probe (iter38) already opens
rkvdec + hantro slots; adding a third decoder slot for rpi-hevc-dec is
a natural extension of that architecture.
iter2 (ampere VDPU381 HEVC EXT_SPS) added the GStreamer 1.28.2 H.265
parser vendor + EXT_SPS_ST_RPS / _LT_RPS dynamic-array submission. That
plumbing is probe-gated (has_hevc_ext_sps_rps_rkvdec), so it stays
dormant on hosts where the controls don't exist.
Empirical higgs probe (brother session)
v4l2-ctl -d /dev/video19 --list-formats-ext --list-ctrls:
Stateless Codec Controls
hevc_sequence_parameter_set (compound, V4L2_CID_STATELESS_HEVC_SPS)
hevc_picture_parameter_set (compound, V4L2_CID_STATELESS_HEVC_PPS)
slice_param_array (compound dynamic-array dims=[4096])
hevc_scaling_matrix (compound)
hevc_decode_parameters (compound)
hevc_decode_mode (menu, "Frame-Based")
hevc_start_code (menu, default "No Start Code")
OUTPUT formats:
S265 V4L2_PIX_FMT_HEVC_SLICE (parsed slice payload)
CAPTURE formats:
NC12 V4L2_PIX_FMT_NV12_COL128 (8-bit SAND 128-column tiled)
NC30 V4L2_PIX_FMT_NV12_10_COL128 (10-bit SAND 128-column tiled)
Conclusion: this is the standard V4L2_CID_STATELESS_HEVC_* control set
exposed under the V4L2-request uAPI, exactly the same family our backend
already drives for rkvdec/hantro/cedrus HEVC paths. The novel parts are
two pixel formats (NC12, NC30) and one driver-id (rpi-hevc-dec).
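The stateless-versus-stateful call above rests on which fourcc the OUTPUT queue advertises. A minimal sketch of that decision as a pure function follows. The fourcc values match the kernel's V4L2_PIX_FMT_HEVC and V4L2_PIX_FMT_HEVC_SLICE definitions; the enum, the function name, and the idea of passing in a pre-collected list (which a real probe would presumably fill via VIDIOC_ENUM_FMT) are illustrative assumptions, not this backend's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same byte packing as the kernel's v4l2_fourcc() macro, redefined locally so
 * the sketch builds without <linux/videodev2.h>. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define PIX_FMT_HEVC       FOURCC('H', 'E', 'V', 'C') /* full bitstream (stateful) */
#define PIX_FMT_HEVC_SLICE FOURCC('S', '2', '6', '5') /* parsed slices (stateless) */

/* Hypothetical names, for illustration only. */
enum hevc_uapi { HEVC_UAPI_NONE, HEVC_UAPI_STATEFUL, HEVC_UAPI_STATELESS };

/* Classify a decoder from the fourccs enumerated on its OUTPUT queue.
 * A slice format wins over a full-stream format if both are present. */
enum hevc_uapi classify_hevc_uapi(const uint32_t *output_fourccs, size_t count)
{
    enum hevc_uapi result = HEVC_UAPI_NONE;

    assert(output_fourccs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (output_fourccs[i] == PIX_FMT_HEVC_SLICE)
            return HEVC_UAPI_STATELESS;
        if (output_fourccs[i] == PIX_FMT_HEVC)
            result = HEVC_UAPI_STATEFUL;
    }
    return result;
}
```

Fed the single S265 entry from the higgs probe, this returns HEVC_UAPI_STATELESS, which is the same conclusion the control listing supports.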
What carries forward unchanged
- VAAPI HEVC profile enumeration (config.c)
- h265_set_controls core path (h265.c): same compound ctrl set
- Synthetic SPS pre-seed pattern (iter25/26): already runs pre-CAPTURE-alloc
- Multi-device dispatch in RequestCreateConfig (iter38)
- VAAPI slice / picture / IQ matrix buffer parsing
- HEVC h264-style start-code policy (we already DON'T prepend for HEVC)
What needs adding
| Item | Location | Sizing |
|---|---|---|
| RPI_HEVC_DEC enum in driver_kind_t | request.h | trivial |
| Multi-device probe extends to /dev/video19 discovery | context.c / request.c init | small (mirror hantro slot) |
| V4L2_PIX_FMT_NV12_COL128 (NC12) video_format entry | video.c | small |
| V4L2_PIX_FMT_NV12_10_COL128 (NC30) video_format entry | video.c | small |
| NC12 → NV12 detile primitive | new nv12_col128.c | mid (column tile layout, see kernel docs) |
| NC30 → P010 detile primitive | new nv12_col128.c | mid (10-bit variant of above) |
| copy_surface_to_image branch for NC12/NC30 | image.c | small (mirror NV15→P010 gating) |
| Per-driver gating for any rpi-specific quirks discovered | various | per per-driver-kludge-gating |
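To size the detile rows, here is a rough sketch of a column-to-linear copy for one 8-bit plane. It ASSUMES a layout that open question 3 still has to confirm against the kernel documentation: 128-byte-wide columns, rows stored top to bottom inside a column, and consecutive columns separated by a fixed number of rows (col_stride_rows, a hypothetical parameter name). The function names are illustrative as well. It does not cover NC30, which additionally needs 10-bit sample unpacking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COL_WIDTH 128u /* bytes per tile column, taken from the COL128 name */

/* Byte offset of (x_byte, y) inside a column-tiled plane.
 * ASSUMPTION: rows are contiguous within a column and each column occupies
 * col_stride_rows rows of 128 bytes. Verify against the kernel doc. */
size_t col128_offset(uint32_t x_byte, uint32_t y, uint32_t col_stride_rows)
{
    size_t column = x_byte / COL_WIDTH;

    return column * COL_WIDTH * (size_t)col_stride_rows
         + (size_t)y * COL_WIDTH
         + x_byte % COL_WIDTH;
}

/* Copy one plane from the assumed column-tiled layout into a linear buffer.
 * width_bytes is the visible row length in bytes; rows is the plane height. */
void col128_detile_plane(uint8_t *dst, size_t dst_stride,
                         const uint8_t *src, uint32_t col_stride_rows,
                         uint32_t width_bytes, uint32_t rows)
{
    assert(rows <= col_stride_rows);
    assert(dst_stride >= width_bytes);

    for (uint32_t y = 0; y < rows; y++) {
        for (uint32_t x = 0; x < width_bytes; x += COL_WIDTH) {
            /* The last column of a row may be only partly visible. */
            uint32_t left = width_bytes - x;
            uint32_t run = left < COL_WIDTH ? left : COL_WIDTH;

            memcpy(dst + (size_t)y * dst_stride + x,
                   src + col128_offset(x, y, col_stride_rows), run);
        }
    }
}
```

Copying 128-byte runs with memcpy rather than going byte by byte keeps the inner loop short. Whether the UV plane can reuse the same routine depends on where chroma sits relative to luma in each column, which this sketch deliberately leaves open.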
Open questions for Phase 1
Lock these before Phase 1 commits to a goal.
1. EXT_SPS controls on rpi-hevc-dec? Brother's --list-ctrls output above shows the standard V4L2_CID_STATELESS_HEVC_* family, NOT the EXT_SPS_ST_RPS / EXT_SPS_LT_RPS extensions that VDPU381 needs. Verify: does slice_param_array[4096] accept st_rps_bits / lt_rps_bits in the per-slice payload, or does rpi-hevc-dec parse RPS itself from the slice header? If the latter, the iter2 EXT_SPS path stays dormant (probe-gated already), and rpi-hevc-dec just needs the picture->st_rps_bits → slice_params->short_term_ref_pic_set_size plumbing that iter31 α-29 already wired. Expectation: works out of the box. Confirm before assuming.
2. hevc_start_code ctrl: "No Start Code" vs Annex B? Brother saw default "No Start Code", which matches our behavior (we don't prepend on HEVC). But the ctrl is configurable. Verify the menu values exposed and confirm "No Start Code" passes our raw slice-NAL payload as-is. If it doesn't, set the ctrl explicitly per unconditional-codec-state gating.
3. NC12 / NC30 SAND tile layout: exact spec. Read Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst for the COL128 variants. Confirm: column stride = 128 bytes (Y) / 128 bytes (UV interleaved). Row count = ALIGN(height, 16) or ALIGN(height, 8)? Get the exact alignment and tile-traversal order before writing the detile primitive. Cite from kernel doc, NOT inferred from a hex dump.
4. drm_prime / SAND modifier round-trip. Does ffmpeg-vaapi (and Firefox) accept the NC12 buffer via DRM_PRIME export carrying the DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT modifier, allowing zero-copy to a SAND-aware compositor? Or is libva-side detile to a linear NV12 buffer the only viable Firefox path? If detile is required for the consumer, the rockchip-pixel-verify-path rule (DMA-BUF GL preferred over cached mmap) might NOT apply since SAND is Pi-specific and not in the wider Wayland modifier ecosystem.
5. rpi-hevc-dec quirks on first SPS submission. rkvdec needs image_fmt pre-seed before CAPTURE alloc (iter25). Does rpi-hevc-dec have an analogous "must set OUTPUT pix_fmt + SPS before CAPTURE" ordering? Verify with strace early.
6. higgs OS + libva versioning. Brother probed on Debian. We package for Arch ALARM. What's the install path on higgs: Arch / Debian / Raspberry Pi OS? If Debian, the package needs a debian/ tree, not just PKGBUILD. Decide packaging target before Phase 8.
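For the start-code question, a small check like the one below could guard the submission path while the ctrl behavior is being verified. It is only a sketch with a hypothetical function name: it reports whether a payload begins with an Annex B start code, so a debug build can assert that what we queue matches the "No Start Code" mode. The test is unambiguous for HEVC because a raw NAL unit cannot begin with two zero bytes (the second header byte carries nuh_temporal_id_plus1, which must be nonzero).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of the Annex B start code at the head of buf:
 * 3 for 00 00 01, 4 for 00 00 00 01, 0 if the payload starts with a raw NAL. */
size_t annexb_start_code_len(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 3;
    if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 4;
    return 0;
}
```

With the ctrl left at "No Start Code", the expectation is that this returns 0 for every slice payload the backend queues. A nonzero result would point at the VAAPI client handing over Annex B data.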
Phase 1 goal sketch (NOT locked)
Firefox HW HEVC playback on higgs at ≥30fps for 1080p Main, byte-exact libva-vs-kdirect for ≥3 reference fixtures (8-bit Main and 10-bit Main10).
Two measurable subgoals follow naturally:
- libva (this backend, NV12 image output) == kdirect (ffmpeg-v4l2request, NV12 image output) byte-exact for the same input.
- Firefox VA-API path engages (verify via chrome://gpu equivalent / log inspection: MOZ_LOG=PlatformDecoderModule:5).
Phase 3 baseline plan
Before any backend code touches rpi-hevc-dec:
- kdirect floor: ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime -i bbb_720p10s_hevc.mp4 -vf hwdownload,format=nv12 -frames:v 10 ... and sha256 the YUV.
- SW reference: same ffmpeg without -hwaccel, sha256 the YUV.
- Both runs N=3 per replicate-baseline-first.
- Capture strace -f -e ioctl of the kdirect run: gives the canonical ioctl sequence rpi-hevc-dec expects.
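A sha256 mismatch says only that two dumps differ, not where. When the floors disagree, a helper along these lines could locate the first differing byte and map it to a frame and plane. This is a sketch with hypothetical names; it assumes linear 8-bit NV12 dumps with no row padding, and the 1280x720 geometry in the usage note is inferred from the fixture name rather than stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes in one linear 8-bit NV12 frame: a full-size Y plane followed by a
 * half-height interleaved UV plane. */
size_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return (size_t)width * height * 3 / 2;
}

/* Offset of the first differing byte, or len if the buffers are identical. */
size_t first_mismatch(const uint8_t *a, const uint8_t *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            return i;
    }
    return len;
}

/* Returns 1 if a byte offset into a multi-frame dump lands in a UV plane,
 * 0 if it lands in a Y plane; stores the frame index in *frame. */
int mismatch_in_chroma(size_t offset, uint32_t width, uint32_t height,
                       size_t *frame)
{
    size_t frame_bytes = nv12_frame_bytes(width, height);

    *frame = offset / frame_bytes;
    return offset % frame_bytes >= (size_t)width * height;
}
```

For a 1280x720 fixture each frame is 1,382,400 bytes, so a first mismatch at offset 1,382,400 would be the first luma byte of frame 1.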
Phase 0 closing
This doc commits the substrate. Phase 1 starts when:
- higgs is up + reachable
- Open questions 1+2 (EXT_SPS + start_code) are answered live, in one short probe session
- Phase 3 baseline floors are captured
No work blocks the close of iter39 / fresnel campaign — those are shipped.