phase0_pi5_hevc: open Pi 5 / CM5 HEVC chapter (substrate doc only)
Empirical higgs probe (sibling session 2026-05-17) confirmed rpi-hevc-dec
at /dev/video19 is V4L2 STATELESS, not stateful:

- Section header literally "Stateless Codec Controls"
- OUTPUT V4L2_PIX_FMT_HEVC_SLICE (parsed slices), not full-stream HEVC
- V4L2_CID_STATELESS_HEVC_* control set + slice_param_array[4096]
- CAPTURE NC12 / NC30 (V4L2_PIX_FMT_NV12_COL128 / _10_COL128, SAND
  128-column tiled, Pi-specific)

So the Pi 5 HEVC HW path belongs HERE (request/stateless backend), not in
a separate stateful project. Replaces the now-deleted
libva-v4l2-stateful-fourier scaffold attempt.

phase0_pi5_hevc.md captures:

- Substrate (target host, backend baseline, empirical probe output)
- What carries forward unchanged (most of HEVC plumbing)
- What needs adding (RPI_HEVC_DEC driver_kind, NC12/NC30 video_format +
  detile primitive, image.c branch; small surface area)
- Six open questions Phase 1 must answer first (EXT_SPS presence,
  start_code default, SAND tile spec, drm_prime modifier round-trip,
  rpi-hevc-dec submission ordering quirks, packaging target OS)
- Phase 1 goal sketch (NOT locked) + Phase 3 baseline plan

No code in this commit. Phase 1 opens when higgs is up + first two open
questions are answered live.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,160 @@
# Phase 0 — Pi 5 / CM5 HEVC chapter

Opened 2026-05-17 evening, after the failed `libva-v4l2-stateful-fourier`
scaffold attempt. Brother-session empirical Phase 0 on higgs invalidated
the stateful premise: rpi-hevc-dec is V4L2 **stateless**, so Pi 5 HEVC
belongs in this backend, not a separate sibling.

No code in this chapter yet. This doc is the substrate. Phase 1 picks up
from the "Open questions" section.

## Substrate

### Target host

higgs: Pi CM5 module on Pi CM5 IO board. BCM2712 SoC. VPN-only, often
offline; wake via HIS skill recipe (no Fritz!Box plug; runs on power
when on). Debian-based. Sole HW video decoder is rpi-hevc-dec at
`/dev/video19` + `/dev/media1`.

### Backend baseline at chapter open

`libva-v4l2-request-fourier` master tip `cf8cd9d` (iter39 + Option B +
h265 ref-list cap fix). Multi-device probe (iter38) already opens
rkvdec + hantro slots; adding a third decoder slot for rpi-hevc-dec is
a natural extension of that architecture.

iter2 (ampere VDPU381 HEVC EXT_SPS) added the GStreamer 1.28.2 H.265
parser vendor + EXT_SPS_ST_RPS / _LT_RPS dynamic-array submission. That
plumbing is probe-gated (`has_hevc_ext_sps_rps_rkvdec`), so it stays
dormant on hosts where the controls don't exist.
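
The "third decoder slot" idea above can be sketched as a driver-name classifier. This is a minimal illustration, not the backend's actual code: only the `RPI_HEVC_DEC` and `driver_kind_t` names come from this doc, while `driver_kind_from_name`, the other enum members, and the driver strings compared against are assumptions to be checked against `VIDIOC_QUERYCAP` output on each host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the backend's driver_kind_t. Only RPI_HEVC_DEC
 * is named in this doc; the other members are illustrative. */
typedef enum {
    DRIVER_UNKNOWN = 0,
    DRIVER_RKVDEC,
    DRIVER_HANTRO,
    RPI_HEVC_DEC,
} driver_kind_t;

/* Map the driver string reported by VIDIOC_QUERYCAP to a slot kind.
 * ASSUMPTION: the literal strings below; verify them on real hosts. */
static driver_kind_t driver_kind_from_name(const char *name)
{
    if (name == NULL)
        return DRIVER_UNKNOWN;
    if (strcmp(name, "rkvdec") == 0)
        return DRIVER_RKVDEC;
    if (strncmp(name, "hantro", 6) == 0)
        return DRIVER_HANTRO;
    if (strcmp(name, "rpi-hevc-dec") == 0)
        return RPI_HEVC_DEC;
    return DRIVER_UNKNOWN;
}

/* Usage example: unknown drivers must never land in a decoder slot. */
static void driver_kind_self_check(void)
{
    assert(driver_kind_from_name("rpi-hevc-dec") == RPI_HEVC_DEC);
    assert(driver_kind_from_name("not-a-decoder") == DRIVER_UNKNOWN);
}
```

Keeping the classification in one pure function keeps any later per-driver gating testable without hardware.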
### Empirical higgs probe (brother session)

`v4l2-ctl -d /dev/video19 --list-formats-ext --list-ctrls`:

```
Stateless Codec Controls

hevc_sequence_parameter_set (compound, V4L2_CID_STATELESS_HEVC_SPS)
hevc_picture_parameter_set (compound, V4L2_CID_STATELESS_HEVC_PPS)
slice_param_array (compound dynamic-array dims=[4096])
hevc_scaling_matrix (compound)
hevc_decode_parameters (compound)
hevc_decode_mode (menu, "Frame-Based")
hevc_start_code (menu, default "No Start Code")

OUTPUT formats:
S265 V4L2_PIX_FMT_HEVC_SLICE (parsed slice payload)

CAPTURE formats:
NC12 V4L2_PIX_FMT_NV12_COL128 (8-bit SAND 128-column tiled)
NC30 V4L2_PIX_FMT_NV12_10_COL128 (10-bit SAND 128-column tiled)
```

Conclusion: this is the standard `V4L2_CID_STATELESS_HEVC_*` control set
exposed under the V4L2-request uAPI, exactly the same family our backend
already drives for rkvdec/hantro/cedrus HEVC paths. The novel parts are
two pixel formats (NC12, NC30) and one driver-id (rpi-hevc-dec).
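
For reference, the three FourCC codes from the probe output can be spelled out numerically. The sketch below re-implements the packing used by the kernel's `v4l2_fourcc()` macro so that it compiles without kernel headers. The `FOURCC`, `FMT_*`, and `sand_col128_bit_depth` names are local to this example (assumptions, not backend symbols).

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the kernel's v4l2_fourcc(): first char in the low byte. */
#define FOURCC(a, b, c, d)                                        \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | \
     ((uint32_t)(d) << 24))

#define FMT_HEVC_SLICE     FOURCC('S', '2', '6', '5') /* OUTPUT queue  */
#define FMT_NV12_COL128    FOURCC('N', 'C', '1', '2') /* CAPTURE, 8-bit  */
#define FMT_NV12_10_COL128 FOURCC('N', 'C', '3', '0') /* CAPTURE, 10-bit */

/* Return the sample bit depth for the two SAND capture formats,
 * or 0 when the format is not a COL128 variant. */
static int sand_col128_bit_depth(uint32_t pixfmt)
{
    switch (pixfmt) {
    case FMT_NV12_COL128:
        return 8;
    case FMT_NV12_10_COL128:
        return 10;
    default:
        return 0;
    }
}

/* Usage example: the slice format is an OUTPUT format, never SAND. */
static void fourcc_self_check(void)
{
    assert(sand_col128_bit_depth(FMT_NV12_COL128) == 8);
    assert(sand_col128_bit_depth(FMT_HEVC_SLICE) == 0);
}
```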
## What carries forward unchanged

- VAAPI HEVC profile enumeration (`config.c`)
- `h265_set_controls` core path (`h265.c`): same compound ctrl set
- Synthetic SPS pre-seed pattern (iter25/26), which already runs pre-CAPTURE-alloc
- Multi-device dispatch in `RequestCreateConfig` (iter38)
- VAAPI slice / picture / IQ matrix buffer parsing
- HEVC h264-style start-code policy (we already DON'T prepend for HEVC)

## What needs adding

| Item | Location | Sizing |
|------|----------|--------|
| `RPI_HEVC_DEC` enum in `driver_kind_t` | `request.h` | trivial |
| Multi-device probe extends to `/dev/video19` discovery | `context.c` / `request.c` init | small (mirror hantro slot) |
| `V4L2_PIX_FMT_NV12_COL128` (NC12) `video_format` entry | `video.c` | small |
| `V4L2_PIX_FMT_NV12_10_COL128` (NC30) `video_format` entry | `video.c` | small |
| NC12 → NV12 detile primitive | new `nv12_col128.c` | mid (column tile layout, see kernel docs) |
| NC30 → P010 detile primitive | new `nv12_col128.c` | mid (10-bit variant of above) |
| `copy_surface_to_image` branch for NC12/NC30 | `image.c` | small (mirror NV15→P010 gating) |
| Per-driver gating for any rpi-specific quirks discovered | various | per [[per-driver-kludge-gating]] |
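
To size the detile rows in the table, here is a rough sketch of the byte-level traversal for a single plane. It rests on a loud assumption that is NOT yet confirmed from the kernel doc: the plane is split into 128-byte-wide columns, rows within a column are contiguous, and columns sit `col_stride` bytes apart. The column stride is deliberately a parameter because the exact alignment is still an open question, and the function name `col128_detile_plane` is hypothetical. NC30 sample unpacking is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COL128_WIDTH 128u /* bytes per column row (assumed layout) */

/* Copy one plane from an assumed column-tiled layout into a linear buffer.
 *   src        : start of the plane inside the first column
 *   col_stride : bytes between the starts of adjacent columns (ASSUMPTION:
 *                supplied by the caller, e.g. from the negotiated format)
 *   dst_stride : bytes per row in the linear destination
 *   width_bytes: visible row width in bytes; rows: number of rows to copy */
static void col128_detile_plane(const uint8_t *src, size_t col_stride,
                                uint8_t *dst, size_t dst_stride,
                                size_t width_bytes, size_t rows)
{
    /* A column must be tall enough to hold every row we read from it. */
    assert(col_stride >= rows * COL128_WIDTH);

    for (size_t x0 = 0; x0 < width_bytes; x0 += COL128_WIDTH) {
        size_t span = width_bytes - x0;
        if (span > COL128_WIDTH)
            span = COL128_WIDTH; /* last column may be partially visible */

        const uint8_t *col = src + (x0 / COL128_WIDTH) * col_stride;
        for (size_t y = 0; y < rows; y++)
            memcpy(dst + y * dst_stride + x0, col + y * COL128_WIDTH, span);
    }
}
```

If the layout assumption holds, the chroma plane of NC12 could reuse the same routine with a different `src` offset and half the row count. That, too, needs confirming against the spec before any of this lands in `nv12_col128.c`.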
## Open questions for Phase 1

Lock these before Phase 1 commits to a goal.

1. **EXT_SPS controls on rpi-hevc-dec?** Brother's `--list-ctrls` output
   above shows the standard `V4L2_CID_STATELESS_HEVC_*` family, NOT the
   `EXT_SPS_ST_RPS` / `EXT_SPS_LT_RPS` extensions that VDPU381 needs.
   Verify: does `slice_param_array[4096]` accept `st_rps_bits` /
   `lt_rps_bits` in the per-slice payload, or does rpi-hevc-dec parse RPS
   itself from the slice header? If the latter, the iter2 EXT_SPS path
   stays dormant (probe-gated already), and rpi-hevc-dec just needs the
   `picture->st_rps_bits` → `slice_params->short_term_ref_pic_set_size`
   plumbing that iter31 α-29 already wired. Expectation: works out of the
   box. Confirm before assuming.

2. **`hevc_start_code` ctrl: "No Start Code" vs Annex B?** Brother saw
   default `"No Start Code"`, which matches our behavior (we don't prepend
   on HEVC). But the ctrl is configurable. Verify the menu values exposed
   and confirm "No Start Code" passes our raw slice-NAL payload as-is.
   If it doesn't, set the ctrl explicitly per [[unconditional-codec-state]]
   gating.

3. **NC12 / NC30 SAND tile layout: exact spec.** Read
   `Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst` for the
   COL128 variants. Confirm: column stride = 128 bytes (Y) / 128 bytes
   (UV interleaved). Row count = `ALIGN(height, 16)` or `ALIGN(height, 8)`?
   Get the exact alignment and tile-traversal order before writing the
   detile primitive. Cite from kernel doc, NOT inferred from a hex dump.

4. **drm_prime / SAND modifier round-trip.** Does ffmpeg-vaapi (and
   Firefox) accept the NC12 buffer via DRM_PRIME export carrying the
   DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT modifier, allowing
   zero-copy to a SAND-aware compositor? Or is libva-side detile to a
   linear NV12 buffer the only viable Firefox path? If detile is
   required for the consumer, the [[rockchip-pixel-verify-path]] rule
   (DMA-BUF GL preferred over cached mmap) might NOT apply since SAND
   is Pi-specific and not in the wider Wayland modifier ecosystem.

5. **rpi-hevc-dec quirks on first SPS submission.** rkvdec needs
   image_fmt pre-seed before CAPTURE alloc (iter25). Does rpi-hevc-dec
   have an analogous "must set OUTPUT pix_fmt + SPS before CAPTURE"
   ordering? Verify with strace early.

6. **higgs OS + libva versioning.** Brother probed on Debian. We package
   for Arch ALARM. What's the install path on higgs: Arch / Debian /
   Raspberry Pi OS? If Debian, the package needs a `debian/` tree, not
   just PKGBUILD. Decide packaging target before Phase 8.
## Phase 1 goal sketch (NOT locked)

> Firefox HW HEVC playback on higgs at ≥30fps for 1080p Main, byte-exact
> libva-vs-kdirect for ≥3 reference fixtures (8-bit Main and 10-bit Main10).

Two measurable subgoals follow naturally:

- libva (this backend, NV12 image output) == kdirect (ffmpeg-v4l2request,
  NV12 image output) byte-exact for the same input.
- Firefox VA-API path engages (verify via `chrome://gpu` equivalent / log
  inspection: `MOZ_LOG=PlatformDecoderModule:5`).
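
A sha256 mismatch only says that two dumps differ, not where. For the byte-exact subgoal, a small locator like the sketch below could narrow a failure down to a frame and plane. It assumes tightly packed 8-bit NV12 dumps with even, non-zero dimensions (10-bit output would need a different frame size), and the type and function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    long frame;             /* index of first differing frame, -1 if equal */
    int in_chroma;          /* 1 if the difference is in the CbCr plane */
    size_t offset_in_frame; /* byte offset inside that frame */
} nv12_mismatch_t;

/* Find the first differing byte between two raw NV12 dumps of equal length.
 * ASSUMPTION: 8-bit NV12, no row padding, width and height even and > 0. */
static nv12_mismatch_t nv12_first_mismatch(const uint8_t *a, const uint8_t *b,
                                           size_t len, unsigned width,
                                           unsigned height)
{
    nv12_mismatch_t m = { -1, 0, 0 };
    size_t luma = (size_t)width * height;
    size_t frame = luma + luma / 2; /* Y plane + interleaved CbCr plane */

    assert(frame > 0);
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            m.frame = (long)(i / frame);
            m.offset_in_frame = i % frame;
            m.in_chroma = m.offset_in_frame >= luma;
            return m;
        }
    }
    return m; /* frame == -1: buffers are byte-exact */
}
```

A luma-only mismatch and a chroma-only mismatch point at different suspects, so reporting the plane helps triage.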
## Phase 3 baseline plan

Before any backend code touches rpi-hevc-dec:

- `kdirect` floor: `ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime
  -i bbb_720p10s_hevc.mp4 -vf hwdownload,format=nv12 -frames:v 10 ...` and
  sha256 the YUV.
- `SW reference`: same ffmpeg without `-hwaccel`, sha256 the YUV.
- Both runs N=3 per [[replicate-baseline-first]].
- Capture `strace -f -e ioctl` of the kdirect run, which gives the canonical
  ioctl sequence rpi-hevc-dec expects.

## Phase 0 closing

This doc commits the substrate. Phase 1 starts when:

- higgs is up + reachable
- Open questions 1+2 (EXT_SPS + start_code) are answered live, in one
  short probe session
- Phase 3 baseline floors are captured

No work blocks the close of iter39 / fresnel campaign; those are shipped.