forked from marfrit/libva-v4l2-request-fourier
phase0_pi5_hevc: close addendum with empirical higgs probe data
Live probe of rpi-hevc-dec on higgs (Pi CM5, kernel 6.12.75-rpt-rpi-2712, Debian 13 trixie) answers Phase 0 open questions Q1, Q2, Q5, Q6 empirically; Q3 partial; Q4 still open.

Q1 (EXT_SPS): NOT present. Only standard V4L2_CID_STATELESS_HEVC_*. Probe ctrl id 0xa97 returns EINVAL, the same gate iter2's has_hevc_ext_sps_rps_rkvdec uses. iter31 alpha-29 plumbing applies.

Q2 (hevc_start_code): default 0 "No Start Code"; matches our behaviour.

Q3 (NC12 SAND tile layout): partial. CAPTURE S_FMT for 1280x720 NC12 returns sizeimage=1382400 (linear NV12 byte count) but bytesperline=1080 (suspect, encodes SAND col count not linear stride). Need kernel-doc / driver-source read before writing detile primitive.

Q4 (DRM modifier round-trip): hwdownload rejects SAND-tiled drm_prime (-38 Function not implemented). Backend CPU-detile to NV12 is the safe path for Firefox.

Q5 (submission ordering): empirical ioctl trace shows canonical V4L2 stateless flow. Two notes for the backend:
- kdirect uses V4L2_MEMORY_DMABUF for both queues (we use MMAP for CAPTURE on rkvdec);
- kdirect does NOT need the iter25 SPS pre-seed pattern - rpi-hevc-dec takes explicit NC12 + dims directly.

Q6 (packaging): Debian 13 trixie. Phase 8 needs a debian/ tree, not just PKGBUILD. Decision in Phase 1.

Other findings:
- ffmpeg 7.1.3 from stock Debian is built with --enable-v4l2-request. kdirect engagement line: Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19; buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8
- No libva ICD installed (only armada-drm_dri.so).
- mpv installable.
- Firefox 145 + rpi-firefox-mods present.

Phase 0 closed. Phase 1 opens with goal: HEVC bit-exact libva-vs-kdirect on higgs for 1280x720 Main 8-bit via the new RPI_HEVC_DEC driver_kind slot + NC12 detile primitive.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -158,3 +158,141 @@ This doc commits the substrate. Phase 1 starts when:
- Phase 3 baseline floors are captured

No work blocks the close of iter39 / fresnel campaign; those are shipped.

## Phase 0 close addendum (2026-05-17 evening, higgs probe session)

Empirical probes on higgs answered Q1, Q2, Q5, and Q6 in full and Q3
in part. Q4 (DRM modifier round-trip) remains open. Phase 0 is closed;
Phase 1 opens with what's below.

### Q1 (EXT_SPS controls on rpi-hevc-dec): NOT present

`v4l2-ctl -d /dev/video19 --list-ctrls` confirms ONLY the standard
`V4L2_CID_STATELESS_HEVC_*` set:
- `hevc_sequence_parameter_set` (0x00a40a90)
- `hevc_picture_parameter_set` (0x00a40a91)
- `slice_param_array` (0x00a40a92, dynamic-array dims=[4096])
- `hevc_scaling_matrix` (0x00a40a93)
- `hevc_decode_parameters` (0x00a40a94)
- `hevc_decode_mode` (0x00a40a95, menu min=1 max=1 default=1 = Frame-Based)
- `hevc_start_code` (0x00a40a96, menu min=0 max=1 default=0 = No Start Code)
- 0x00a40a97 returns EINVAL (no EXT_SPS_*_RPS controls)

The ioctl trace confirms that ffmpeg's `VIDIOC_QUERY_EXT_CTRL` for
`0xa97` returns EINVAL, the same probe pattern our backend uses for
`has_hevc_ext_sps_rps_rkvdec`. **The iter2 path stays dormant; the
iter31 α-29 `slice_params->short_term_ref_pic_set_size` plumbing is the
correct one for rpi-hevc-dec.**

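The control IDs listed above follow a simple arithmetic pattern. The sketch below reproduces them, assuming the mainline `<linux/v4l2-controls.h>` definitions (stateless codec class `0x00a40000`, base `class | 0x900`, HEVC controls starting at base + 400). The helper names `hevc_ctrl_id` and `short_id` are hypothetical and exist only for this illustration.

```python
# Constants as assumed from the mainline <linux/v4l2-controls.h> UAPI header.
V4L2_CTRL_CLASS_CODEC_STATELESS = 0x00A40000
V4L2_CID_CODEC_STATELESS_BASE = V4L2_CTRL_CLASS_CODEC_STATELESS | 0x900
HEVC_OFFSET = 400  # first stateless HEVC control is BASE + 400

# Control names as printed by v4l2-ctl on higgs, in control-ID order.
OBSERVED = [
    "hevc_sequence_parameter_set",
    "hevc_picture_parameter_set",
    "slice_param_array",
    "hevc_scaling_matrix",
    "hevc_decode_parameters",
    "hevc_decode_mode",
    "hevc_start_code",
]


def hevc_ctrl_id(index: int) -> int:
    """Full control ID of the index-th stateless HEVC control."""
    return V4L2_CID_CODEC_STATELESS_BASE + HEVC_OFFSET + index


def short_id(cid: int) -> int:
    """Low 12 bits of a control ID, the shorthand used in the text (0xa97)."""
    return cid & 0xFFF


if __name__ == "__main__":
    for i, name in enumerate(OBSERVED):
        print(f"{name:32s} 0x{hevc_ctrl_id(i):08x}")
    absent = hevc_ctrl_id(len(OBSERVED))
    print(f"first id that returned EINVAL: 0x{absent:08x} (0x{short_id(absent):x})")
```

The seven observed controls occupy offsets 0 through 6, so the first ID past them is the `0x00a40a97` that the probe reports as EINVAL.
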
### Q2 (hevc_start_code): default 0 (No Start Code), values {0, 1}

Default 0 matches our backend's "don't prepend HEVC start code" stance.
Confirm in Phase 1: rpi-hevc-dec accepts our raw NAL slice payload as-is.

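As a minimal sketch of what the two menu values imply for the OUTPUT buffer payload: with mode 0 the NAL bytes are submitted untouched, and with mode 1 an Annex B start code precedes them. The helper `slice_payload` is hypothetical, and the 3-byte prefix is an assumption (Annex B also permits a 4-byte form).

```python
START_CODE_NONE = 0      # menu value 0, "No Start Code"
START_CODE_ANNEX_B = 1   # menu value 1
ANNEX_B_PREFIX = b"\x00\x00\x01"  # assumed 3-byte Annex B start code


def slice_payload(nal: bytes, start_code_mode: int) -> bytes:
    """Return the bytes to place in the OUTPUT buffer for one slice NAL."""
    if start_code_mode == START_CODE_NONE:
        return nal
    if start_code_mode == START_CODE_ANNEX_B:
        return ANNEX_B_PREFIX + nal
    raise ValueError(f"unknown hevc_start_code value: {start_code_mode}")
```

With the driver default of 0, the backend's existing behaviour corresponds to the first branch.
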
### Q3 (NC12 / NC30 SAND tile layout): PARTIAL

CAPTURE S_FMT result for 1280×720 NC12:
- `sizeimage=1382400` = `1280 × 720 × 1.5` (linear NV12 byte count)
- `bytesperline=1080` (NOT 1280)

The bytesperline=1080 for a 1280-wide CAPTURE buffer is suspect: it
likely encodes SAND column count rather than linear stride. Read the
kernel source under `drivers/staging/media/rpivid/` (or wherever
NC12_COL128 lives in 6.12), plus `drm_fourcc.h` / `nv12_col128.rst` (if
it exists), for the exact tile layout BEFORE writing the detile
primitive. Do NOT infer layout from this single observation.

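The two observed numbers can at least be cross-checked arithmetically. The sketch below only shows which readings of `bytesperline` are numerically consistent with the probe. It does not establish the layout, which still has to come from the driver source. The function name and the "rows per 128-byte column" interpretation are assumptions made for this illustration.

```python
COL_W = 128  # assumed SAND column width in bytes (from the NC12_COL128 name)


def check_s_fmt(width: int, height: int, sizeimage: int, bytesperline: int) -> dict:
    """Report which interpretations of bytesperline fit the observed values."""
    linear_nv12 = width * height * 3 // 2
    return {
        "sizeimage_is_linear_nv12": sizeimage == linear_nv12,
        # A linear stride must hold at least one full row of 8-bit luma.
        "plausible_linear_stride": bytesperline >= width,
        # Reading 1: number of columns across the image.
        "equals_column_count": bytesperline == width // COL_W,
        # Reading 2: rows (luma + interleaved chroma) stored per column.
        "equals_rows_per_column": bytesperline == height * 3 // 2,
        "rows_times_width_is_sizeimage": bytesperline * width == sizeimage,
    }


if __name__ == "__main__":
    for key, value in check_s_fmt(1280, 720, 1382400, 1080).items():
        print(f"{key:32s} {value}")
```

For the higgs values, 1080 is neither a usable linear stride nor the column count (1280 / 128 = 10), but it does equal 720 × 3/2, and 1080 × 1280 reproduces `sizeimage`. That is one data point in favour of a per-column row count, to be confirmed against the kernel source.
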
### Q4 (DRM modifier round-trip): BLOCKED on hwdownload

ffmpeg `-hwaccel drm -hwaccel_output_format drm_prime -vf
hwmap=mode=read,format=nv12` returns `Failed to map frame: -38`
(`Function not implemented`). hwdownload cannot consume the SAND
modifier directly.

The ffmpeg path that DOES work: `-hwaccel drm -c:v hevc` WITHOUT
`-hwaccel_output_format drm_prime` lets ffmpeg's internal pipeline pull
back, detile (presumably via a Pi-specific helper or libdrm transform),
and present NV12 to the next filter. The output is bit-exact vs SW for
the test fixture (1280×720 Main 8-bit), which confirms HW engagement.

Phase 1 / Phase 4 will need to decide:
- Detile in the backend (CPU SIMD), exposing NV12 via VAImage; or
- Pass-through DRM_PRIME with SAND modifier and let the consumer
  (compositor / Firefox) detile. Firefox almost certainly can't, so
  CPU detile is the safe bet.

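To make the "detile in the backend" option concrete, here is a placeholder sketch of what a CPU detile could look like. It rests entirely on an UNVERIFIED layout assumption: 128-byte-wide columns, each stored contiguously as `height` luma rows followed by `height / 2` interleaved chroma rows, with `col_rows` rows between column starts. The names `detile_nc12` and `tile_nv12` are hypothetical; the real primitive must be written from the driver source, not from this guess.

```python
COL_W = 128  # ASSUMED column width in bytes


def _check(width: int, height: int, col_rows: int) -> None:
    if width % COL_W or height % 2 or col_rows < height * 3 // 2:
        raise ValueError("geometry not handled by this sketch")


def detile_nc12(sand: bytes, width: int, height: int, col_rows: int) -> bytes:
    """Column-major (assumed SAND) buffer -> linear NV12, one 128-byte row at a time."""
    _check(width, height, col_rows)
    cols = width // COL_W
    if len(sand) < cols * col_rows * COL_W:
        raise ValueError("source buffer too small")
    rows = height * 3 // 2  # rows 0..height-1 are luma, the rest are chroma
    out = bytearray(width * rows)
    for col in range(cols):
        base = col * col_rows * COL_W
        for row in range(rows):
            src = base + row * COL_W
            dst = row * width + col * COL_W
            out[dst:dst + COL_W] = sand[src:src + COL_W]
    return bytes(out)


def tile_nv12(linear: bytes, width: int, height: int, col_rows: int) -> bytes:
    """Inverse of detile_nc12 under the same assumption; used for round-trip tests."""
    _check(width, height, col_rows)
    rows = height * 3 // 2
    if len(linear) != width * rows:
        raise ValueError("linear buffer has the wrong size")
    out = bytearray((width // COL_W) * col_rows * COL_W)
    for col in range(width // COL_W):
        base = col * col_rows * COL_W
        for row in range(rows):
            src = row * width + col * COL_W
            dst = base + row * COL_W
            out[dst:dst + COL_W] = linear[src:src + COL_W]
    return bytes(out)
```

A round trip (`detile_nc12(tile_nv12(x, ...), ...) == x`) only proves the two functions are mutually consistent. Correctness against real hardware output needs the bit-exact comparison with the kdirect path.
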
### Q5 (rpi-hevc-dec submission ordering): empirically locked

`strace -e ioctl` of the kdirect run shows:
1. `MEDIA_IOC_DEVICE_INFO` + `MEDIA_IOC_G_TOPOLOGY` (per media node)
2. `VIDIOC_QUERYCAP` per video node; `driver="rpi-hevc-dec"` identifies
   the right one
3. `VIDIOC_ENUM_FMT` OUTPUT → S265 only
4. `VIDIOC_S_FMT` OUTPUT (HEVC_SLICE, placeholder dims)
5. `VIDIOC_REQBUFS` OUTPUT (DMABUF, count=N); count=6 in kdirect
6. `VIDIOC_S_FMT` CAPTURE (NC12, actual dims from SPS parse)
7. `VIDIOC_CREATE_BUFS` CAPTURE (DMABUF, count=16)
8. `VIDIOC_STREAMON` both queues
9. `VIDIOC_QUERY_EXT_CTRL` enumeration
10. `VIDIOC_S_EXT_CTRLS` (decode_mode + start_code), the global ctrls
11. Per frame: `VIDIOC_S_EXT_CTRLS` (SPS+PPS+decode_params+slice_array,
    class=0xf010000 = per-request) + `VIDIOC_QBUF` CAPTURE + `VIDIOC_QBUF`
    OUTPUT (with `V4L2_BUF_FLAG_IN_REQUEST | V4L2_BUF_FLAG_REQUEST_FD`) +
    `VIDIOC_DQBUF` OUTPUT + `VIDIOC_DQBUF` CAPTURE

**Two structural notes for the backend:**
- OUTPUT + CAPTURE both use `V4L2_MEMORY_DMABUF` in kdirect. Our backend
  currently uses MMAP for CAPTURE on rkvdec/hantro. For Pi 5 we should
  either follow kdirect (DMABUF, allows zero-copy DRM_PRIME export) or
  use MMAP and CPU-detile. Phase 4 design decision.
- The order `S_FMT OUTPUT → REQBUFS OUTPUT → S_FMT CAPTURE → CREATE_BUFS
  CAPTURE → STREAMON` differs from our iter25 rkvdec pre-seed pattern
  (where SPS via S_EXT_CTRLS must come BEFORE CAPTURE alloc to resolve
  the image_fmt). rpi-hevc-dec apparently DOESN'T need that pre-seed:
  CAPTURE S_FMT just takes the explicit NC12 + caller's dims. Confirm
  in Phase 1 by trying our existing iter25 pre-seed flow against it.

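When the backend is brought up against rpi-hevc-dec, the same `strace -e ioctl` capture can be checked mechanically against the order above. The sketch below is one way to do that, assuming strace prints decoded ioctl names in the usual `ioctl(fd, NAME, ...)` form. `SETUP_ORDER`, `ioctl_names` and `follows_order` are hypothetical names, and only the VIDIOC steps are checked.

```python
import re

# VIDIOC steps from the kdirect trace, in the order they were observed.
SETUP_ORDER = [
    "VIDIOC_QUERYCAP",
    "VIDIOC_ENUM_FMT",
    "VIDIOC_S_FMT",        # OUTPUT
    "VIDIOC_REQBUFS",      # OUTPUT
    "VIDIOC_S_FMT",        # CAPTURE
    "VIDIOC_CREATE_BUFS",  # CAPTURE
    "VIDIOC_STREAMON",
    "VIDIOC_QUERY_EXT_CTRL",
    "VIDIOC_S_EXT_CTRLS",
    "VIDIOC_QBUF",
    "VIDIOC_DQBUF",
]

# Matches the request name in lines such as: ioctl(5, VIDIOC_QUERYCAP, {...}) = 0
IOCTL_RE = re.compile(r"ioctl\(\d+,\s*([A-Z0-9_]+)")


def ioctl_names(trace: str) -> list:
    """Extract ioctl request names from strace text, in order of appearance."""
    return IOCTL_RE.findall(trace)


def follows_order(names, expected=SETUP_ORDER) -> bool:
    """True if `expected` appears in `names` as an ordered subsequence."""
    remaining = iter(names)
    # `step in remaining` consumes the iterator up to the first match,
    # so each later step must be found after the previous one.
    return all(step in remaining for step in expected)
```

The subsequence check tolerates extra calls in between (for example repeated `VIDIOC_QUERY_EXT_CTRL` probes), so it flags only genuine reordering such as buffer allocation before `S_FMT`.
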
### Q6 (packaging): Debian 13 trixie, NOT Arch

higgs runs Debian 13 trixie (`PRETTY_NAME="Debian GNU/Linux 13 (trixie)"`),
not Arch ALARM. Phase 8 (per the dev-process Phase 8 packaging rule) for
the Pi 5 chapter needs a `debian/` packaging tree, not just a PKGBUILD.

Decide in Phase 1 whether to:
- Add Debian packaging to `marfrit-packages` as a second target, OR
- Use distrobox/podman with an Arch ALARM container on higgs for
  install (test-only, not production), OR
- Pi 5 chapter ships a Debian source pkg via gitea / a personal Debian
  repo.

### Other new findings from the probe session

- **ffmpeg 7.1.3 from Debian 13 is built with `--enable-v4l2-request`**:
  the kdirect path exists. Invocation is `ffmpeg -hwaccel drm -c:v
  hevc` (not just `-hwaccel drm`; the explicit codec flag matters for
  the negotiation). Engagement log line is
  `Hwaccel V4L2 HEVC stateless V4; devices: /dev/media1,/dev/video19;
  buffers: src DMABuf, dst DMABuf; swfmt=rpi4_8`. Per
  [[hw-decode-engagement-check]], grep for that line to confirm HW path
  engaged.
- **No libva ICD installed on higgs**: only `armada-drm_dri.so` ships,
  which doesn't apply. We'd be the first VA-API HW path for HEVC on Pi
  5 once installed.
- **mpv is apt-installable** (`mpv 0.40.0-3+deb13u1`), useful as a
  pixel-readback verifier once the backend works (`mpv --vo=image` or
  `--vo=drm`).
- **Firefox 145.0.1 + rpi-firefox-mods 20251016 installed** (firefox-esr
  package status was `rc` = removed but config remains). The mods
  package likely contains VA-API plumbing prefs.

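The engagement check can be scripted rather than eyeballed. The sketch below parses the log line quoted above into its fields, assuming the line keeps exactly the shape observed on higgs. `parse_engagement` is a hypothetical helper, and any other ffmpeg build may word the line differently.

```python
import re

# Field layout taken from the single engagement line observed on higgs.
ENGAGEMENT_RE = re.compile(
    r"Hwaccel V4L2 HEVC stateless V(?P<version>\d+); "
    r"devices: (?P<media>[^,\s]+),(?P<video>[^;\s]+); "
    r"buffers: src (?P<src>\w+), dst (?P<dst>\w+); "
    r"swfmt=(?P<swfmt>\S+)"
)


def parse_engagement(log_text: str):
    """Return the engagement fields as a dict, or None if HW decode did not engage."""
    match = ENGAGEMENT_RE.search(log_text)
    return match.groupdict() if match else None
```

A `None` result means the line was absent, which per the engagement-check rule should be treated as a software fallback rather than a pass.
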
### What changes for Phase 1

- Goal is now phrasable: HEVC bit-exact libva-vs-kdirect on higgs for
  the 1280×720 Main 8-bit test fixture (same generator as
  `/tmp/bbb_main.mp4` here). Kdirect engagement signal is the
  `Hwaccel V4L2 HEVC stateless V4` log line.
- Most backend code reuses existing rkvdec/hantro HEVC path: ctrls,
  per-frame submission, request_fd, multi-device probe pattern.
- New code: NC12 video_format entry + detile primitive (sibling to
  `nv15_unpack_plane_to_p010`) + RPI_HEVC_DEC driver_kind.
- Packaging target = Debian, not Arch.

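The bit-exact goal can be checked per frame once both paths produce raw NV12 dumps. The sketch below compares two such dumps and reports the first differing frame, assuming both are tightly packed 8-bit NV12 at the same resolution. The function names are hypothetical, and how the dumps are produced is left to the test harness.

```python
import hashlib


def frame_size_nv12(width: int, height: int) -> int:
    """Bytes per tightly packed 8-bit NV12 frame."""
    return width * height * 3 // 2


def frame_digests(raw: bytes, frame_size: int) -> list:
    """MD5 digest of every frame in a raw dump."""
    if frame_size <= 0 or len(raw) % frame_size:
        raise ValueError("dump is not a whole number of frames")
    return [
        hashlib.md5(raw[i:i + frame_size]).hexdigest()
        for i in range(0, len(raw), frame_size)
    ]


def first_mismatch(a: bytes, b: bytes, frame_size: int):
    """Index of the first frame that differs, or None if the dumps are bit-exact."""
    da, db = frame_digests(a, frame_size), frame_digests(b, frame_size)
    for index, (x, y) in enumerate(zip(da, db)):
        if x != y:
            return index
    if len(da) != len(db):
        return min(len(da), len(db))  # one dump ended early
    return None
```

For the Phase 1 fixture, `frame_size_nv12(1280, 720)` gives 1382400 bytes per frame, the same figure the driver reported as `sizeimage`.
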