25b8a15e09
Empirical higgs probe (sibling session 2026-05-17) confirmed rpi-hevc-dec at
/dev/video19 is V4L2 STATELESS, not stateful:

- Section header literally "Stateless Codec Controls"
- OUTPUT V4L2_PIX_FMT_HEVC_SLICE (parsed slices), not full-stream HEVC
- V4L2_CID_STATELESS_HEVC_* control set + slice_param_array[4096]
- CAPTURE NC12 / NC30 (V4L2_PIX_FMT_NV12_COL128 / _10_COL128, SAND
  128-column tiled, Pi-specific)

So the Pi 5 HEVC HW path belongs HERE (request/stateless backend), not in a
separate stateful project. Replaces the now-deleted
libva-v4l2-stateful-fourier scaffold attempt.

phase0_pi5_hevc.md captures:

- Substrate (target host, backend baseline, empirical probe output)
- What carries forward unchanged (most of HEVC plumbing)
- What needs adding (RPI_HEVC_DEC driver_kind, NC12/NC30 video_format +
  detile primitive, image.c branch; small surface area)
- Six open questions Phase 1 must answer first (EXT_SPS presence,
  start_code default, SAND tile spec, drm_prime modifier round-trip,
  rpi-hevc-dec submission ordering quirks, packaging target OS)
- Phase 1 goal sketch (NOT locked) + Phase 3 baseline plan

No code in this commit. Phase 1 opens when higgs is up + first two open
questions are answered live.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
161 lines
7.3 KiB
Markdown
# Phase 0: Pi 5 / CM5 HEVC chapter

Opened 2026-05-17 evening, after the failed `libva-v4l2-stateful-fourier`
scaffold attempt. Brother-session empirical Phase 0 on higgs invalidated
the stateful premise: rpi-hevc-dec is V4L2 **stateless**, so Pi 5 HEVC
belongs in this backend, not a separate sibling.

No code in this chapter yet. This doc is the substrate. Phase 1 picks up
from the "Open questions" section.

## Substrate

### Target host

higgs: Pi CM5 module on a Pi CM5 IO board. BCM2712 SoC. VPN-only, often
offline; wake via HIS skill recipe (no Fritz!Box plug; runs on power
when on). Debian-based. Sole HW video decoder is rpi-hevc-dec at
`/dev/video19` + `/dev/media1`.

### Backend baseline at chapter open

`libva-v4l2-request-fourier` master tip `cf8cd9d` (iter39 + Option B +
h265 ref-list cap fix). Multi-device probe (iter38) already opens
rkvdec + hantro slots; adding a third decoder slot for rpi-hevc-dec is
a natural extension of that architecture.

iter2 (ampere VDPU381 HEVC EXT_SPS) added a vendored copy of the
GStreamer 1.28.2 H.265 parser plus EXT_SPS_ST_RPS / _LT_RPS dynamic-array
submission. That plumbing is probe-gated (`has_hevc_ext_sps_rps_rkvdec`),
so it stays dormant on hosts where the controls don't exist.

### Empirical higgs probe (brother session)

`v4l2-ctl -d /dev/video19 --list-formats-ext --list-ctrls`:

```
Stateless Codec Controls

hevc_sequence_parameter_set (compound, V4L2_CID_STATELESS_HEVC_SPS)
hevc_picture_parameter_set (compound, V4L2_CID_STATELESS_HEVC_PPS)
slice_param_array (compound dynamic-array dims=[4096])
hevc_scaling_matrix (compound)
hevc_decode_parameters (compound)
hevc_decode_mode (menu, "Frame-Based")
hevc_start_code (menu, default "No Start Code")

OUTPUT formats:
S265 V4L2_PIX_FMT_HEVC_SLICE (parsed slice payload)

CAPTURE formats:
NC12 V4L2_PIX_FMT_NV12_COL128 (8-bit SAND 128-column tiled)
NC30 V4L2_PIX_FMT_NV12_10_COL128 (10-bit SAND 128-column tiled)
```

Conclusion: this is the standard `V4L2_CID_STATELESS_HEVC_*` control set
exposed under the V4L2-request uAPI, exactly the same family our backend
already drives for rkvdec/hantro/cedrus HEVC paths. The novel parts are
two pixel formats (NC12, NC30) and one driver-id (rpi-hevc-dec).

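The four-character codes in the probe output (`S265`, `NC12`, `NC30`) use the usual little-endian packing of the kernel's `v4l2_fourcc()` macro. A minimal standalone sketch of that packing, handy when logging formats in the backend; the helper names `fourcc_le` and `fourcc_to_str` are hypothetical and not taken from the codebase:

```c
#include <stdint.h>

/* Same little-endian byte packing as the kernel's v4l2_fourcc() macro. */
static uint32_t fourcc_le(char a, char b, char c, char d)
{
    return (uint32_t)(unsigned char)a |
           ((uint32_t)(unsigned char)b << 8) |
           ((uint32_t)(unsigned char)c << 16) |
           ((uint32_t)(unsigned char)d << 24);
}

/* Unpack a fourcc into a NUL-terminated 4-character string for logs. */
static void fourcc_to_str(uint32_t fcc, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fcc >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

In real backend code the constants would come from `<linux/videodev2.h>` on a kernel that defines the COL128 formats; the sketch only avoids that header so it builds anywhere.
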
## What carries forward unchanged

- VAAPI HEVC profile enumeration (`config.c`)
- `h265_set_controls` core path (`h265.c`): same compound ctrl set
- Synthetic SPS pre-seed pattern (iter25/26): already runs pre-CAPTURE-alloc
- Multi-device dispatch in `RequestCreateConfig` (iter38)
- VAAPI slice / picture / IQ matrix buffer parsing
- HEVC h264-style start-code policy (we already DON'T prepend for HEVC)

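The start-code policy in the last bullet can be guarded at debug time. The sketch below reports whether a slice buffer begins with an Annex B start code, so the OUTPUT payload can be checked before it is queued. `annexb_start_code_len` is a hypothetical helper, not an existing function in the backend:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of an Annex B start code at the head of buf
 * (3 for 00 00 01, 4 for 00 00 00 01), or 0 if there is none.
 */
static size_t annexb_start_code_len(const uint8_t *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 3;
    if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 4;
    return 0;
}
```

With the driver left at "No Start Code", a non-zero return on a buffer about to be submitted would indicate that something upstream prepended a start code.
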
## What needs adding

| Item | Location | Sizing |
|------|----------|--------|
| `RPI_HEVC_DEC` enum in `driver_kind_t` | `request.h` | trivial |
| Multi-device probe extends to `/dev/video19` discovery | `context.c` / `request.c` init | small (mirror hantro slot) |
| `V4L2_PIX_FMT_NV12_COL128` (NC12) `video_format` entry | `video.c` | small |
| `V4L2_PIX_FMT_NV12_10_COL128` (NC30) `video_format` entry | `video.c` | small |
| NC12 → NV12 detile primitive | new `nv12_col128.c` | mid (column tile layout, see kernel docs) |
| NC30 → P010 detile primitive | new `nv12_col128.c` | mid (10-bit variant of above) |
| `copy_surface_to_image` branch for NC12/NC30 | `image.c` | small (mirror NV15→P010 gating) |
| Per-driver gating for any rpi-specific quirks discovered | various | per [[per-driver-kludge-gating]] |

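As a rough illustration of the first two rows, the sketch below classifies a decoder by the driver name that `VIDIOC_QUERYCAP` returns in `struct v4l2_capability.driver`. Everything except the `RPI_HEVC_DEC` and `driver_kind_t` names is an assumption: the other enum members are hypothetical, and the name strings have to be checked against live QUERYCAP output on each host.

```c
#include <string.h>

/* Hypothetical layout; the real driver_kind_t lives in request.h. */
typedef enum {
    DRIVER_KIND_UNKNOWN = 0,
    DRIVER_KIND_RKVDEC,
    DRIVER_KIND_HANTRO,
    RPI_HEVC_DEC,
} driver_kind_t;

/*
 * Map a V4L2 driver name to a driver kind. The strings in the table
 * are assumptions, not values confirmed by the higgs probe.
 */
static driver_kind_t driver_kind_from_name(const char *name)
{
    static const struct {
        const char *name;
        driver_kind_t kind;
    } table[] = {
        { "rkvdec",       DRIVER_KIND_RKVDEC },
        { "hantro-vpu",   DRIVER_KIND_HANTRO },
        { "rpi-hevc-dec", RPI_HEVC_DEC },
    };

    if (!name)
        return DRIVER_KIND_UNKNOWN;
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(name, table[i].name) == 0)
            return table[i].kind;
    return DRIVER_KIND_UNKNOWN;
}
```

Keying on the reported driver name rather than the `/dev/video19` node number would keep the probe independent of device enumeration order, which is one possible design choice, not a decision recorded in this chapter.
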
## Open questions for Phase 1

Lock these before Phase 1 commits to a goal.

1. **EXT_SPS controls on rpi-hevc-dec?** Brother's `--list-ctrls` output
   above shows the standard `V4L2_CID_STATELESS_HEVC_*` family, NOT the
   `EXT_SPS_ST_RPS` / `EXT_SPS_LT_RPS` extensions that VDPU381 needs.
   Verify: does `slice_param_array[4096]` accept `st_rps_bits` /
   `lt_rps_bits` in the per-slice payload, or does rpi-hevc-dec parse RPS
   itself from the slice header? If the latter, the iter2 EXT_SPS path
   stays dormant (probe-gated already), and rpi-hevc-dec just needs the
   `picture->st_rps_bits` → `slice_params->short_term_ref_pic_set_size`
   plumbing that iter31 α-29 already wired. Expectation: works out of the
   box. Confirm before assuming.

2. **`hevc_start_code` ctrl: "No Start Code" vs Annex B?** Brother saw
   default `"No Start Code"`, which matches our behavior (we don't prepend
   on HEVC). But the ctrl is configurable. Verify the menu values exposed
   and confirm "No Start Code" passes our raw slice-NAL payload as-is.
   If it doesn't, set the ctrl explicitly per [[unconditional-codec-state]]
   gating.

3. **NC12 / NC30 SAND tile layout: exact spec.** Read
   `Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst` for the
   COL128 variants. Confirm: column stride = 128 bytes (Y) / 128 bytes
   (UV interleaved). Row count = `ALIGN(height, 16)` or `ALIGN(height, 8)`?
   Get the exact alignment and tile-traversal order before writing the
   detile primitive. Cite from kernel doc, NOT inferred from a hex dump.

4. **drm_prime / SAND modifier round-trip.** Does ffmpeg-vaapi (and
   Firefox) accept the NC12 buffer via DRM_PRIME export carrying the
   `DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT` modifier, allowing
   zero-copy to a SAND-aware compositor? Or is libva-side detile to a
   linear NV12 buffer the only viable Firefox path? If detile is
   required for the consumer, the [[rockchip-pixel-verify-path]] rule
   (DMA-BUF GL preferred over cached mmap) might NOT apply since SAND
   is Pi-specific and not in the wider Wayland modifier ecosystem.

5. **rpi-hevc-dec quirks on first SPS submission.** rkvdec needs
   image_fmt pre-seed before CAPTURE alloc (iter25). Does rpi-hevc-dec
   have an analogous "must set OUTPUT pix_fmt + SPS before CAPTURE"
   ordering? Verify with strace early.

6. **higgs OS + libva versioning.** Brother probed on Debian. We package
   for Arch ALARM. What's the install path on higgs: Arch / Debian /
   Raspberry Pi OS? If Debian, the package needs a `debian/` tree, not
   just PKGBUILD. Decide packaging target before Phase 8.

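To make question 3 concrete, here is a minimal sketch of what an 8-bit column-to-linear copy could look like for a single plane. It is not the detile primitive itself. It ASSUMES that rows inside a column are 128 bytes apart and that consecutive columns start `col_pitch` bytes apart; `col_pitch` is a caller-supplied placeholder for exactly the value the kernel doc has to settle, and `col128_plane_to_linear` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COL_W 128u /* column width in bytes, per the COL128 name */

/*
 * Copy one 8-bit plane from a 128-byte-column layout to a linear buffer.
 * ASSUMPTION: byte (x, y) lives at
 *   (x / 128) * col_pitch + y * 128 + (x % 128)
 * where col_pitch must come from the kernel doc, not from this sketch.
 */
static void col128_plane_to_linear(uint8_t *dst, size_t dst_stride,
                                   const uint8_t *src, size_t col_pitch,
                                   size_t width_bytes, size_t height)
{
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width_bytes; x += COL_W) {
            /* The last column may be only partially covered by the image. */
            size_t n = width_bytes - x < COL_W ? width_bytes - x : COL_W;
            const uint8_t *s = src + (x / COL_W) * col_pitch + y * COL_W;

            memcpy(dst + y * dst_stride + x, s, n);
        }
    }
}
```

Under these assumptions NC12 would need one call for Y and one for the interleaved UV plane, and where the UV data starts is part of the same open question. NC30 is not covered here, since its 10-bit samples would also need unpacking to P010.
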
## Phase 1 goal sketch (NOT locked)

> Firefox HW HEVC playback on higgs at ≥30fps for 1080p Main, byte-exact
> libva-vs-kdirect for ≥3 reference fixtures (8-bit Main and 10-bit Main10).

Two measurable subgoals follow naturally:

- libva (this backend, NV12 image output) == kdirect (ffmpeg-v4l2request,
  NV12 image output) byte-exact for the same input.
- Firefox VA-API path engages (verify via `about:support`, Firefox's
  `chrome://gpu` equivalent, or log inspection with
  `MOZ_LOG=PlatformDecoderModule:5`).

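A sha256 mismatch only says that two dumps differ, not where. The sketch below is one way to localize the first differing byte in two NV12 dumps. It assumes tightly packed frames (stride equal to width), and both helper names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the offset of the first differing byte between two frame dumps,
 * or (size_t)-1 when they are byte-identical over len bytes.
 */
static size_t first_mismatch(const uint8_t *a, const uint8_t *b, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            return i;
    return (size_t)-1;
}

typedef struct {
    int plane;   /* 0 = Y, 1 = interleaved UV */
    size_t row;
    size_t col;  /* byte column within the row */
} nv12_pos;

/* Map a byte offset in a tightly packed NV12 frame to plane/row/col. */
static nv12_pos nv12_locate(size_t off, size_t width, size_t height)
{
    nv12_pos p;
    size_t luma = width * height;

    p.plane = off >= luma;
    if (p.plane)
        off -= luma;
    p.row = off / width;
    p.col = off % width;
    return p;
}
```

Knowing whether the first mismatch falls in the Y or UV plane, and on which row, tends to separate detile or stride bugs from genuine decode differences.
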
## Phase 3 baseline plan

Before any backend code touches rpi-hevc-dec:

- `kdirect` floor: `ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime
  -i bbb_720p10s_hevc.mp4 -vf hwdownload,format=nv12 -frames:v 10 ...` and
  sha256 the YUV.
- `SW reference`: same ffmpeg without `-hwaccel`, sha256 the YUV.
- Both runs N=3 per [[replicate-baseline-first]].
- Capture `strace -f -e ioctl` of the kdirect run; it gives the canonical
  ioctl sequence rpi-hevc-dec expects.

## Phase 0 closing

This doc commits the substrate. Phase 1 starts when:

- higgs is up + reachable
- Open questions 1+2 (EXT_SPS + start_code) are answered live, in one
  short probe session
- Phase 3 baseline floors are captured

No work blocks the close of iter39 / fresnel campaign; those are shipped.