From b6a65fc692761b071fdeb6deba3c522588f62198 Mon Sep 17 00:00:00 2001
From: claude-noether
Date: Sun, 17 May 2026 18:54:08 +0000
Subject: [PATCH] phase0_pi5_hevc: close addendum with empirical higgs probe data
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Live probe of rpi-hevc-dec on higgs (Pi CM5, kernel 6.12.75-rpt-rpi-2712,
Debian 13 trixie) answers Phase 0 open questions Q1, Q2, Q5, Q6
empirically; Q3 partial; Q4 still open.

Q1 (EXT_SPS): NOT present. Only standard V4L2_CID_STATELESS_HEVC_*.
  Probe ctrl id 0xa97 returns EINVAL — same gate iter2's
  has_hevc_ext_sps_rps_rkvdec uses. iter31 alpha-29 plumbing applies.
Q2 (hevc_start_code): default 0 "No Start Code"; matches our behaviour.
Q3 (NC12 SAND tile layout): partial. CAPTURE S_FMT for 1280x720 NC12
  returns sizeimage=1382400 (linear NV12 byte count) but
  bytesperline=1080 (suspect, encodes SAND col count not linear
  stride). Need kernel-doc / driver-source read before writing detile
  primitive.
Q4 (DRM modifier round-trip): hwdownload rejects SAND-tiled drm_prime
  (-38 Function not implemented). Backend CPU-detile to NV12 is the
  safe path for Firefox.
Q5 (submission ordering): empirical ioctl trace shows canonical V4L2
  stateless flow. Two notes for the backend: kdirect uses
  V4L2_MEMORY_DMABUF for both queues (we use MMAP for CAPTURE on
  rkvdec); kdirect does NOT need the iter25 SPS pre-seed pattern -
  rpi-hevc-dec takes explicit NC12 + dims directly.
Q6 (packaging): Debian 13 trixie. Phase 8 needs a debian/ tree, not
  just PKGBUILD. Decision in Phase 1.

Other findings:
  ffmpeg 7.1.3 from stock Debian is built with --enable-v4l2-request.
  kdirect engagement line:
    Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19;
    buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8
  No libva ICD installed (only armada-drm_dri.so). mpv installable.
  Firefox 145 + rpi-firefox-mods present.

Phase 0 closed.
Phase 1 opens with goal: HEVC bit-exact libva-vs-kdirect on higgs for
1280x720 Main 8-bit via the new RPI_HEVC_DEC driver_kind slot + NC12
detile primitive.

Co-Authored-By: Claude Opus 4.7
---
 phase0_pi5_hevc.md | 138 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/phase0_pi5_hevc.md b/phase0_pi5_hevc.md
index db41aa7..61823d6 100644
--- a/phase0_pi5_hevc.md
+++ b/phase0_pi5_hevc.md
@@ -158,3 +158,141 @@ This doc commits the substrate. Phase 1 starts when:
 - Phase 3 baseline floors are captured
 
 No work blocks the close of iter39 / fresnel campaign — those are shipped.
+
+## Phase 0 close addendum (2026-05-17 evening, higgs probe session)
+
+Empirical probes on higgs answered Q1, Q2, partial Q3, full Q5, full Q6.
+Q4 (DRM modifier round-trip) remains open. Phase 0 is closed; Phase 1
+opens with what's below.
+
+### Q1 — EXT_SPS controls on rpi-hevc-dec: NOT present
+
+`v4l2-ctl -d /dev/video19 --list-ctrls` confirms ONLY the standard
+`V4L2_CID_STATELESS_HEVC_*` set:
+- `hevc_sequence_parameter_set` (0x00a40a90)
+- `hevc_picture_parameter_set` (0x00a40a91)
+- `slice_param_array` (0x00a40a92, dynamic-array dims=[4096])
+- `hevc_scaling_matrix` (0x00a40a93)
+- `hevc_decode_parameters` (0x00a40a94)
+- `hevc_decode_mode` (0x00a40a95, menu min=1 max=1 default=1 = Frame-Based)
+- `hevc_start_code` (0x00a40a96, menu min=0 max=1 default=0 = No Start Code)
+- 0x00a40a97 returns EINVAL (no EXT_SPS_*_RPS controls)
+
+ioctl trace confirms ffmpeg's `VIDIOC_QUERY_EXT_CTRL` for `0xa97` returns
+EINVAL — same probe pattern our backend uses for
+`has_hevc_ext_sps_rps_rkvdec`. **The iter2 path stays dormant; the
+iter31 α-29 `slice_params->short_term_ref_pic_set_size` plumbing is the
+correct one for rpi-hevc-dec.**
+
+### Q2 — hevc_start_code: default 0 (No Start Code), values {0, 1}
+
+Default 0 matches our backend's "don't prepend HEVC start code" stance.
+Confirm in Phase 1: rpi-hevc-dec accepts our raw NAL slice payload as-is.
+
+### Q3 — NC12 / NC30 SAND tile layout: PARTIAL
+
+CAPTURE S_FMT result for 1280×720 NC12:
+- `sizeimage=1382400` = `1280 × 720 × 1.5` (linear NV12 byte count)
+- `bytesperline=1080` (NOT 1280)
+
+The bytesperline=1080 for a 1280-wide CAPTURE buffer is suspect — likely
+encodes SAND column count rather than linear stride. Read
+`drivers/staging/media/rpivid/` (or wherever NC12_COL128 lives in 6.12)
+kernel source + `drm_fourcc.h` / `nv12_col128.rst` (if it exists) for
+exact tile layout BEFORE writing the detile primitive. Do NOT infer
+layout from this single observation.
+
+### Q4 — DRM modifier round-trip: BLOCKED on hwdownload
+
+ffmpeg `-hwaccel drm -hwaccel_output_format drm_prime -vf
+hwmap=mode=read,format=nv12` returns `Failed to map frame: -38`
+(`Function not implemented`). hwdownload cannot consume the SAND
+modifier directly.
+
+ffmpeg's path that DOES work: `-hwaccel drm -c:v hevc` WITHOUT
+`-hwaccel_output_format drm_prime` lets ffmpeg's internal pipeline pull
+back, detile (presumably via a Pi-specific helper or libdrm transform),
+and present NV12 to the next filter. Bit-exact vs SW for the test
+fixture (1280×720 Main 8-bit) — confirms HW engagement.
+
+Phase 1 / Phase 4 will need to decide:
+- Detile in the backend (CPU SIMD), exposing NV12 via VAImage; or
+- Pass-through DRM_PRIME with SAND modifier and let the consumer
+  (compositor / Firefox) detile. Firefox almost certainly can't, so
+  CPU detile is the safe bet.
+
+### Q5 — rpi-hevc-dec submission ordering: empirically locked
+
+`strace -e ioctl` of the kdirect run shows:
+1. `MEDIA_IOC_DEVICE_INFO` + `MEDIA_IOC_G_TOPOLOGY` (per media node)
+2. `VIDIOC_QUERYCAP` per video node — `driver="rpi-hevc-dec"` identifies
+   the right one
+3. `VIDIOC_ENUM_FMT` OUTPUT → S265 only
+4. `VIDIOC_S_FMT` OUTPUT (HEVC_SLICE, placeholder dims)
+5. `VIDIOC_REQBUFS` OUTPUT (DMABUF, count=N) — count=6 in kdirect
+6. `VIDIOC_S_FMT` CAPTURE (NC12, actual dims from SPS parse)
+7. `VIDIOC_CREATE_BUFS` CAPTURE (DMABUF, count=16)
+8. `VIDIOC_STREAMON` both queues
+9. `VIDIOC_QUERY_EXT_CTRL` enumeration
+10. `VIDIOC_S_EXT_CTRLS` (decode_mode + start_code) — global ctrls
+11. Per frame: `VIDIOC_S_EXT_CTRLS` (SPS+PPS+decode_params+slice_array,
+    class=0xf010000 = per-request) + `VIDIOC_QBUF` CAPTURE + `VIDIOC_QBUF`
+    OUTPUT (with `V4L2_BUF_FLAG_IN_REQUEST | V4L2_BUF_FLAG_REQUEST_FD`)
+    + `VIDIOC_DQBUF` OUTPUT + `VIDIOC_DQBUF` CAPTURE
+
+**Two structural notes for the backend:**
+- OUTPUT + CAPTURE both use `V4L2_MEMORY_DMABUF` in kdirect. Our backend
+  currently uses MMAP for CAPTURE on rkvdec/hantro. For Pi 5 we should
+  either follow kdirect (DMABUF, allows zero-copy DRM_PRIME export) or
+  use MMAP and CPU-detile. Phase 4 design decision.
+- The order `S_FMT OUTPUT → REQBUFS OUTPUT → S_FMT CAPTURE → CREATE_BUFS
+  CAPTURE → STREAMON` differs from our iter25 rkvdec pre-seed pattern
+  (where SPS via S_EXT_CTRLS must come BEFORE CAPTURE alloc to resolve
+  the image_fmt). rpi-hevc-dec apparently DOESN'T need that pre-seed —
+  CAPTURE S_FMT just takes the explicit NC12 + caller's dims. Confirm
+  in Phase 1 by trying our existing iter25 pre-seed flow against it.
+
+### Q6 — packaging: Debian 13 trixie, NOT Arch
+
+higgs runs Debian 13 trixie (`PRETTY_NAME="Debian GNU/Linux 13 (trixie)"`),
+not Arch ALARM. Phase 8 (per the dev-process Phase 8 packaging rule) for
+the Pi 5 chapter needs a `debian/` packaging tree, not just a PKGBUILD.
+
+Decide in Phase 1 whether to:
+- Add Debian packaging to `marfrit-packages` as a second target, OR
+- Use distrobox/podman with an Arch ALARM container on higgs for
+  install (test-only, not production), OR
+- Pi 5 chapter ships a Debian source pkg via gitea / a personal Debian
+  repo.
+
+### Other new findings from the probe session
+
+- **ffmpeg 7.1.3 from Debian 13 is built with `--enable-v4l2-request`**
+  — the kdirect path exists. Invocation is `ffmpeg -hwaccel drm -c:v
+  hevc` (not just `-hwaccel drm`; the explicit codec flag matters for
+  the negotiation). Engagement log line is
+  `Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19;
+  buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8`. Per
+  [[hw-decode-engagement-check]], grep for that line to confirm HW path
+  engaged.
+- **No libva ICD installed on higgs** — only `armada-drm_dri.so` ships,
+  which doesn't apply. We'd be the first VA-API HW path for HEVC on Pi
+  5 once installed.
+- **mpv is apt-installable** (`mpv 0.40.0-3+deb13u1`) — useful as a
+  pixel-readback verifier once the backend works (`mpv --vo=image` or
+  `--vo=drm`).
+- **Firefox 145.0.1 + rpi-firefox-mods 20251016 installed** (firefox-esr
+  package status was `rc` = removed but config remains). The mods
+  package likely contains VA-API plumbing prefs.
+
+### What changes for Phase 1
+
+- Goal is now phrasable: HEVC bit-exact libva-vs-kdirect on higgs for
+  the 1280×720 Main 8-bit test fixture (same generator as
+  `/tmp/bbb_main.mp4` here). Kdirect engagement signal is the
+  `Hwaccel V4L2 HEVC stateless V4` log line.
+- Most backend code reuses existing rkvdec/hantro HEVC path: ctrls,
+  per-frame submission, request_fd, multi-device probe pattern.
+- New code: NC12 video_format entry + detile primitive (sibling to
+  `nv15_unpack_plane_to_p010`) + RPI_HEVC_DEC driver_kind.
+- Packaging target = Debian, not Arch.
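
The two CAPTURE S_FMT values reported in Q3 can be cross-checked with plain arithmetic, sketched here in Python. Nothing below touches a device. The 128-byte column width is taken from the `NC12_COL128` name, and reading `bytesperline` as "rows stored per column" (luma rows plus chroma rows) is an assumption that the probe did not confirm. The function names are illustrative only.

```python
# Arithmetic cross-check of the Q3 CAPTURE S_FMT values (1280x720 NC12).
# Only the four constants below come from the probe session.

WIDTH, HEIGHT = 1280, 720
SIZEIMAGE = 1382400    # reported by VIDIOC_S_FMT
BYTESPERLINE = 1080    # reported by VIDIOC_S_FMT
COL_WIDTH = 128        # assumed from the NC12_COL128 name


def linear_nv12_size(width: int, height: int) -> int:
    """Bytes in a linear 8-bit NV12 frame: luma plane plus half-height chroma."""
    return width * height * 3 // 2


def column_layout_size(width: int, rows_per_column: int) -> int:
    """HYPOTHESIS ONLY: total bytes if the frame is stored as 128-byte-wide
    columns, each holding rows_per_column rows (luma rows + chroma rows)."""
    columns = -(-width // COL_WIDTH)  # ceiling division
    return columns * COL_WIDTH * rows_per_column


if __name__ == "__main__":
    # sizeimage equals the linear NV12 byte count, as noted in Q3.
    print(linear_nv12_size(WIDTH, HEIGHT) == SIZEIMAGE)
    # bytesperline equals height * 3/2, i.e. luma rows + chroma rows.
    print(BYTESPERLINE == HEIGHT * 3 // 2)
    # Under the column hypothesis, the same sizeimage falls out.
    print(column_layout_size(WIDTH, BYTESPERLINE) == SIZEIMAGE)
```

All three comparisons hold for the reported values, so the numbers are at least self-consistent under that reading. It is still one observation at one resolution, so the kernel-source read called for in Q3 remains the deciding step before any detile code is written.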