phase0_pi5_hevc: close addendum with empirical higgs probe data

Live probe of rpi-hevc-dec on higgs (Pi CM5, kernel 6.12.75-rpt-rpi-2712,
Debian 13 trixie) answers Phase 0 open questions Q1, Q2, Q5, Q6
empirically; Q3 partial; Q4 still open.

Q1 (EXT_SPS): NOT present. Only standard V4L2_CID_STATELESS_HEVC_*.
  Probe ctrl id 0xa97 returns EINVAL — same gate iter2's
  has_hevc_ext_sps_rps_rkvdec uses. iter31 alpha-29 plumbing applies.

Q2 (hevc_start_code): default 0 "No Start Code"; matches our behaviour.

Q3 (NC12 SAND tile layout): partial. CAPTURE S_FMT for 1280x720 NC12
  returns sizeimage=1382400 (linear NV12 byte count) but
  bytesperline=1080 (suspect, encodes SAND col count not linear stride).
  Need kernel-doc / driver-source read before writing detile primitive.

Q4 (DRM modifier round-trip): hwdownload rejects SAND-tiled drm_prime
  (-38 Function not implemented). Backend CPU-detile to NV12 is the
  safe path for Firefox.

Q5 (submission ordering): empirical ioctl trace shows canonical V4L2
  stateless flow. Two notes for the backend: kdirect uses
  V4L2_MEMORY_DMABUF for both queues (we use MMAP for CAPTURE on
  rkvdec); kdirect does NOT need the iter25 SPS pre-seed pattern -
  rpi-hevc-dec takes explicit NC12 + dims directly.

Q6 (packaging): Debian 13 trixie. Phase 8 needs a debian/ tree, not
  just PKGBUILD. Decision in Phase 1.

Other findings: ffmpeg 7.1.3 from stock Debian is built with
--enable-v4l2-request. kdirect engagement line:
  Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19;
  buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8
No libva ICD installed (only armada-drm_dri.so). mpv installable.
Firefox 145 + rpi-firefox-mods present.

Phase 0 closed. Phase 1 opens with goal:
  HEVC bit-exact libva-vs-kdirect on higgs for 1280x720 Main 8-bit
  via the new RPI_HEVC_DEC driver_kind slot + NC12 detile primitive.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 18:54:08 +00:00
parent 25b8a15e09
commit b6a65fc692
@@ -158,3 +158,141 @@ This doc commits the substrate. Phase 1 starts when:
- Phase 3 baseline floors are captured
No work blocks the close of iter39 / fresnel campaign; those are shipped.
## Phase 0 close addendum (2026-05-17 evening, higgs probe session)
Empirical probes on higgs answered Q1, Q2, Q5, and Q6 in full and Q3 in
part. Q4 (DRM modifier round-trip) remains open. Phase 0 is closed;
Phase 1 opens with the findings below.
### Q1 — EXT_SPS controls on rpi-hevc-dec: NOT present
`v4l2-ctl -d /dev/video19 --list-ctrls` confirms ONLY the standard
`V4L2_CID_STATELESS_HEVC_*` set:
- `hevc_sequence_parameter_set` (0x00a40a90)
- `hevc_picture_parameter_set` (0x00a40a91)
- `slice_param_array` (0x00a40a92, dynamic-array dims=[4096])
- `hevc_scaling_matrix` (0x00a40a93)
- `hevc_decode_parameters` (0x00a40a94)
- `hevc_decode_mode` (0x00a40a95, menu min=1 max=1 default=1 = Frame-Based)
- `hevc_start_code` (0x00a40a96, menu min=0 max=1 default=0 = No Start Code)
- 0x00a40a97 returns EINVAL (no EXT_SPS_*_RPS controls)
ioctl trace confirms ffmpeg's `VIDIOC_QUERY_EXT_CTRL` for `0xa97` returns
EINVAL — same probe pattern our backend uses for
`has_hevc_ext_sps_rps_rkvdec`. **The iter2 path stays dormant; the
iter31 α-29 `slice_params->short_term_ref_pic_set_size` plumbing is the
correct one for rpi-hevc-dec.**
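The ids listed above can be cross-checked against the stateless codec control base. This is a minimal sketch: the base value and the HEVC offsets are assumptions recalled from the mainline `linux/v4l2-controls.h`, not read from the higgs kernel headers, so verify them against the header the backend builds with.

```python
# ASSUMPTION: values recalled from mainline linux/v4l2-controls.h;
# verify against the header the backend is actually built with.
V4L2_CTRL_CLASS_CODEC_STATELESS = 0x00A40000
V4L2_CID_CODEC_STATELESS_BASE = V4L2_CTRL_CLASS_CODEC_STATELESS | 0x900

# Offsets of the HEVC stateless controls relative to the base,
# keyed by the names v4l2-ctl printed on higgs.
HEVC_CTRL_OFFSETS = {
    "hevc_sequence_parameter_set": 400,
    "hevc_picture_parameter_set": 401,
    "slice_param_array": 402,
    "hevc_scaling_matrix": 403,
    "hevc_decode_parameters": 404,
    "hevc_decode_mode": 405,
    "hevc_start_code": 406,
}


def hevc_ctrl_id(offset: int) -> int:
    """Return the full 32-bit control id for a given offset from the base."""
    return V4L2_CID_CODEC_STATELESS_BASE + offset


HEVC_CTRL_IDS = {name: hevc_ctrl_id(off) for name, off in HEVC_CTRL_OFFSETS.items()}
PROBED_ID = hevc_ctrl_id(407)  # the id that returned EINVAL on higgs

for name, cid in HEVC_CTRL_IDS.items():
    print(f"{name:32s} 0x{cid:08x}")
print(f"{'probed (EINVAL on higgs)':32s} 0x{PROBED_ID:08x}")
```

In the mainline header as I recall it, offset 407 is named `V4L2_CID_STATELESS_HEVC_ENTRY_POINT_OFFSETS`, so it is worth confirming which header defines the EXT_SPS ids that the `0xa97` probe targets.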
### Q2 — hevc_start_code: default 0 (No Start Code), values {0, 1}
Default 0 matches our backend's "don't prepend HEVC start code" stance.
Confirm in Phase 1: rpi-hevc-dec accepts our raw NAL slice payload as-is.
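As an illustration of what "No Start Code" means for the payload, the sketch below strips an Annex B prefix if one is present, so the buffer starts at the NAL header. The function name is hypothetical and is not taken from the backend; whether rpi-hevc-dec accepts the result as-is is still the Phase 1 check named above.

```python
def strip_annexb_start_code(nal: bytes) -> bytes:
    """Return the NAL unit without a leading 3- or 4-byte Annex B start code.

    Hypothetical helper for illustration: input that already starts at the
    NAL header is returned unchanged.
    """
    for prefix in (b"\x00\x00\x00\x01", b"\x00\x00\x01"):
        if nal.startswith(prefix):
            return nal[len(prefix):]
    return nal


# 0x40 0x01 is the two-byte header of an HEVC VPS NAL unit.
print(strip_annexb_start_code(b"\x00\x00\x00\x01\x40\x01\xaa").hex())
```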
### Q3 — NC12 / NC30 SAND tile layout: PARTIAL
CAPTURE S_FMT result for 1280×720 NC12:
- `sizeimage=1382400` = `1280 × 720 × 1.5` (linear NV12 byte count)
- `bytesperline=1080` (NOT 1280)
The bytesperline=1080 for a 1280-wide CAPTURE buffer is suspect: it
likely encodes SAND column geometry rather than a linear stride. Read
the kernel source under `drivers/staging/media/rpivid/` (or wherever
NC12_COL128 lives in 6.12) together with `drm_fourcc.h` and
`nv12_col128.rst` (if it exists) for the exact tile layout BEFORE writing
the detile primitive. Do NOT infer the layout from this single observation.
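The two reported numbers can at least be checked arithmetically. The sketch below does only that: it assumes 128-byte columns purely from the COL128 name, and it does not establish the tile layout, which still needs the driver-source read.

```python
WIDTH, HEIGHT = 1280, 720
SIZEIMAGE, BYTESPERLINE = 1382400, 1080  # values S_FMT returned on higgs

# Byte count of a linear 8-bit 4:2:0 frame (luma plane + half-size chroma).
linear_nv12_size = WIDTH * HEIGHT * 3 // 2

# Luma rows plus interleaved-chroma rows for one full-height strip.
luma_plus_chroma_rows = HEIGHT * 3 // 2

# ASSUMPTION: 128-byte-wide columns, guessed from the COL128 name only.
columns_if_128_wide = WIDTH // 128

print("sizeimage matches linear NV12 size:", linear_nv12_size == SIZEIMAGE)
print("bytesperline equals height * 3/2:  ", luma_plus_chroma_rows == BYTESPERLINE)
print("columns if 128 bytes wide:         ", columns_if_128_wide)
print("sizeimage / bytesperline:          ", SIZEIMAGE // BYTESPERLINE)
```

The value 1080 equals 720 × 3/2, whereas a count of 128-byte columns would be 10, so the number looks more like a per-column row count than a column count. That reading is unverified.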
### Q4 — DRM modifier round-trip: BLOCKED on hwdownload
ffmpeg `-hwaccel drm -hwaccel_output_format drm_prime -vf
hwmap=mode=read,format=nv12` returns `Failed to map frame: -38`
(`Function not implemented`). hwdownload cannot consume the SAND
modifier directly.
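For reading codes like the `-38` above, a small helper that maps a negative errno-style return to its symbolic name can be handy in probe scripts. The helper name is hypothetical, and the numeric mapping is platform-dependent (38 is ENOSYS on Linux).

```python
import errno
import os


def describe_negative_errno(code: int) -> str:
    """Map a negative errno-style return value to 'NAME: message'."""
    num = -code
    name = errno.errorcode.get(num, "UNKNOWN")
    return f"{name}: {os.strerror(num)}"


# On Linux, -38 is -ENOSYS.
print(describe_negative_errno(-38))
```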
The ffmpeg path that DOES work: `-hwaccel drm -c:v hevc` WITHOUT
`-hwaccel_output_format drm_prime` lets ffmpeg's internal pipeline pull
the frame back, detile it (presumably via a Pi-specific helper or libdrm
transform), and present NV12 to the next filter. The output is bit-exact
against SW decode for the test fixture (1280×720 Main 8-bit); combined
with the engagement log line, that confirms a working HW decode.
Phase 1 / Phase 4 will need to decide:
- Detile in the backend (CPU SIMD), exposing NV12 via VAImage; or
- Pass-through DRM_PRIME with SAND modifier and let the consumer
(compositor / Firefox) detile. Firefox almost certainly can't, so
CPU detile is the safe bet.
### Q5 — rpi-hevc-dec submission ordering: empirically locked
`strace -e ioctl` of the kdirect run shows:
1. `MEDIA_IOC_DEVICE_INFO` + `MEDIA_IOC_G_TOPOLOGY` (per media node)
2. `VIDIOC_QUERYCAP` per video node — `driver="rpi-hevc-dec"` identifies
the right one
3. `VIDIOC_ENUM_FMT` OUTPUT → S265 only
4. `VIDIOC_S_FMT` OUTPUT (HEVC_SLICE, placeholder dims)
5. `VIDIOC_REQBUFS` OUTPUT (DMABUF, count=N) — count=6 in kdirect
6. `VIDIOC_S_FMT` CAPTURE (NC12, actual dims from SPS parse)
7. `VIDIOC_CREATE_BUFS` CAPTURE (DMABUF, count=16)
8. `VIDIOC_STREAMON` both queues
9. `VIDIOC_QUERY_EXT_CTRL` enumeration
10. `VIDIOC_S_EXT_CTRLS` (decode_mode + start_code) — global ctrls
11. Per frame: `VIDIOC_S_EXT_CTRLS` (SPS+PPS+decode_params+slice_array,
class=0xf010000 = per-request) + `VIDIOC_QBUF` CAPTURE + `VIDIOC_QBUF`
OUTPUT (with `V4L2_BUF_FLAG_IN_REQUEST | V4L2_BUF_FLAG_REQUEST_FD`) +
`VIDIOC_DQBUF` OUTPUT + `VIDIOC_DQBUF` CAPTURE
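The setup portion of the sequence above can be encoded as an ordered list and checked against a captured trace. This is a sketch only: the step strings are a hypothetical normalisation of strace output, not a format any tool emits, and the check tests ordering, not arguments.

```python
# Matches the class=0xf010000 value seen in the per-frame S_EXT_CTRLS calls.
V4L2_CTRL_WHICH_REQUEST_VAL = 0x0F010000

# One-time setup steps in the order the kdirect trace shows them.
KDIRECT_SETUP_ORDER = [
    "VIDIOC_S_FMT OUTPUT",
    "VIDIOC_REQBUFS OUTPUT",
    "VIDIOC_S_FMT CAPTURE",
    "VIDIOC_CREATE_BUFS CAPTURE",
    "VIDIOC_STREAMON",
]


def follows_order(trace, expected):
    """True if every step of `expected` occurs in `trace` in the same order.

    Other ioctls may be interleaved; `step in it` advances the shared
    iterator, so each later step must appear after the previous one.
    """
    it = iter(trace)
    return all(step in it for step in expected)


sample_trace = [
    "VIDIOC_QUERYCAP",
    "VIDIOC_ENUM_FMT OUTPUT",
    "VIDIOC_S_FMT OUTPUT",
    "VIDIOC_REQBUFS OUTPUT",
    "VIDIOC_S_FMT CAPTURE",
    "VIDIOC_CREATE_BUFS CAPTURE",
    "VIDIOC_STREAMON",
    "VIDIOC_STREAMON",
]
print(follows_order(sample_trace, KDIRECT_SETUP_ORDER))
```

Running the same check on a trace of our backend against rpi-hevc-dec would show quickly whether its setup order diverges from kdirect's.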
**Two structural notes for the backend:**
- OUTPUT + CAPTURE both use `V4L2_MEMORY_DMABUF` in kdirect. Our backend
currently uses MMAP for CAPTURE on rkvdec/hantro. For Pi 5 we should
either follow kdirect (DMABUF, allows zero-copy DRM_PRIME export) or
use MMAP and CPU-detile. Phase 4 design decision.
- The order `S_FMT OUTPUT → REQBUFS OUTPUT → S_FMT CAPTURE → CREATE_BUFS
CAPTURE → STREAMON` differs from our iter25 rkvdec pre-seed pattern
(where SPS via S_EXT_CTRLS must come BEFORE CAPTURE alloc to resolve
the image_fmt). rpi-hevc-dec apparently DOESN'T need that pre-seed —
CAPTURE S_FMT just takes the explicit NC12 + caller's dims. Confirm
in Phase 1 by trying our existing iter25 pre-seed flow against it.
### Q6 — packaging: Debian 13 trixie, NOT Arch
higgs runs Debian 13 trixie (`PRETTY_NAME="Debian GNU/Linux 13 (trixie)"`),
not Arch ALARM. Phase 8 (per the dev-process Phase 8 packaging rule) for
the Pi 5 chapter needs a `debian/` packaging tree, not just a PKGBUILD.
Decide in Phase 1 whether to:
- Add Debian packaging to `marfrit-packages` as a second target, OR
- Use distrobox/podman with an Arch ALARM container on higgs for
install (test-only, not production), OR
- Pi 5 chapter ships a Debian source pkg via gitea / a personal Debian
repo.
### Other new findings from the probe session
- **ffmpeg 7.1.3 from Debian 13 is built with `--enable-v4l2-request`**
— the kdirect path exists. Invocation is `ffmpeg -hwaccel drm -c:v
hevc` (not just `-hwaccel drm`; the explicit codec flag matters for
the negotiation). Engagement log line is
`Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19;
buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8`. Per
[[hw-decode-engagement-check]], grep for that line to confirm HW path
engaged.
- **No libva ICD installed on higgs** — only `armada-drm_dri.so` ships,
which doesn't apply. We'd be the first VA-API HW path for HEVC on Pi
5 once installed.
- **mpv is apt-installable** (`mpv 0.40.0-3+deb13u1`) — useful as a
pixel-readback verifier once the backend works (`mpv --vo=image` or
`--vo=drm`).
- **Firefox 145.0.1 + rpi-firefox-mods 20251016 installed** (firefox-esr
package status was `rc` = removed but config remains). The mods
package likely contains VA-API plumbing prefs.
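The engagement check from the first bullet can be scripted. This is a minimal sketch, assuming the log line keeps exactly the shape quoted above; the function name is hypothetical.

```python
import re

# Pattern built from the engagement line observed on higgs.
ENGAGEMENT_RE = re.compile(
    r"Hwaccel V4L2 HEVC stateless V4; devices: ([^,\s]+),([^;\s]+);"
)


def hw_engaged(log_text: str):
    """Return (media_node, video_node) if the engagement line is present, else None."""
    match = ENGAGEMENT_RE.search(log_text)
    return (match.group(1), match.group(2)) if match else None


sample = (
    "Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19; "
    "buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8"
)
print(hw_engaged(sample))
```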
### What changes for Phase 1
- Goal is now phrasable: HEVC bit-exact libva-vs-kdirect on higgs for
the 1280×720 Main 8-bit test fixture (same generator as
`/tmp/bbb_main.mp4` here). Kdirect engagement signal is the
`Hwaccel V4L2 HEVC stateless V4` log line.
- Most backend code reuses the existing rkvdec/hantro HEVC path: ctrls,
per-frame submission, request_fd, multi-device probe pattern.
- New code: NC12 video_format entry + detile primitive (sibling to
`nv15_unpack_plane_to_p010`) + RPI_HEVC_DEC driver_kind.
- Packaging target = Debian, not Arch.
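The bit-exact goal can be checked by hashing raw NV12 dumps from the two decode paths frame by frame. This is a sketch under stated assumptions: both dumps are linear NV12 at the fixture resolution, and the function names are hypothetical, not part of the backend or its test harness.

```python
import hashlib

FRAME_BYTES = 1280 * 720 * 3 // 2  # linear NV12, 1280x720, 8-bit


def frame_digests(raw: bytes, frame_bytes: int = FRAME_BYTES):
    """Return one SHA-256 hex digest per complete frame in a raw dump."""
    return [
        hashlib.sha256(raw[off:off + frame_bytes]).hexdigest()
        for off in range(0, len(raw) - frame_bytes + 1, frame_bytes)
    ]


def first_mismatch(a: bytes, b: bytes, frame_bytes: int = FRAME_BYTES):
    """Return the index of the first differing frame, or None if bit-exact.

    A difference in frame count is reported at the index where the
    shorter dump ends.
    """
    da, db = frame_digests(a, frame_bytes), frame_digests(b, frame_bytes)
    for i, (x, y) in enumerate(zip(da, db)):
        if x != y:
            return i
    if len(da) != len(db):
        return min(len(da), len(db))
    return None


# Tiny synthetic example with 4-byte "frames".
print(first_mismatch(b"AAAABBBBCCCC", b"AAAABBBBCCCC", frame_bytes=4))
print(first_mismatch(b"AAAABBBBCCCC", b"AAAAXBBBCCCC", frame_bytes=4))
```

Per-frame digests are used instead of a single whole-file hash so that a failure points at the first bad frame.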