299e376d51
GStreamer's MERGED gst_v4l2_codec_h265_dec_fill_ext_sps_rps in gst-plugins-bad (GStreamer 1.28, MR !10820) is the primary upstream reference. Walks its own gst_h265_parser_*'s GstH265SPS.short_term_ref_pic_set[] array, field names match the H.265 spec, one-to-one mapping to the V4L2 control struct. Header strategy: runtime-optional control probe, NO #ifndef shim. Casanova's FFmpeg WIP branch (v4l2-request-ext-sps-rps-n8.0.1 at gitlab.collabora.com) is the secondary reference; it walks libavcodec internal HEVCSPS->st_rps[] with different field names. Useful as cross-check but not the primary template (renaming gymnastics). cros-codecs has no support yet (would follow GStreamer's shape if added). Casanova's kernel-test framework uses fluster through these two upstream consumers; no other reference exists. Q1 (architecture): resolved, implement H.265 SPS parser in backend, mirror GStreamer pattern with spec-compliant field names. Q2 (UAPI shim): resolved, runtime-optional control probe per GStreamer pattern, NOT #ifndef shim. Remaining sub-question for Phase 1: parser SOURCE (vendor GStreamer's gsth265parser.c, adapt to backend idioms, or implement minimal fresh from H.265 §7.3.7). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
144 lines
13 KiB
Markdown
# Phase 0 — iter2 (HEVC backend EXT_SPS_*_RPS extension) substrate

Closed 2026-05-16 evening, post-meta-iter1-close.

## Research question

**Can a libva-v4l2-request-fourier patch that registers and populates `V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS` and `_LT_RPS` unblock HEVC HW decode on ampere RK3588, and if so, what is the source of the RPS array contents (which VAAPI's `VAPictureParameterBufferHEVC` does NOT expose)?**

## Substrate

**Backend HEVC code layout** (in `~/src/libva-v4l2-request-fourier/src/h265.c` on ampere):

- `h265_fill_sps` at line 96: populates `struct v4l2_ctrl_hevc_sps` from `VAPictureParameterBufferHEVC`. Reads `picture->num_short_term_ref_pic_sets` (line 145) and `picture->num_long_term_ref_pic_sps` (line 146) into the SPS struct. **Does NOT touch RPS arrays.**
- `h265_fill_pps` at line 173: populates `struct v4l2_ctrl_hevc_pps`. Comment at line 238: *"VAAPI does not expose either flag in VAPictureParameterBufferHEVC."*
- `h265_fill_decode_params` at ~line 256: DECODE_PARAMS population; ends with comment at line 325 referencing iter31's `va-st-rps-bits-is-slice-field` correction (the field with the same name in different V4L2 structs has different semantics).
- `h265_fill_slice_params` at line 361: SLICE_PARAMS per slice. Has the iter31 α-29 fix: `slice_params->short_term_ref_pic_set_size = picture->st_rps_bits` (line 477+). VAAPI's `st_rps_bits` is the slice-header bit count, so it belongs here.
- `h265_set_controls` (the call site that registers controls) at ~line 660: registers **5** controls today (SPS, PPS, SLICE_PARAMS, SCALING_MATRIX, DECODE_PARAMS) via `v4l2_set_controls`. DECODE_MODE and START_CODE are registered earlier in `context.c:465-469`.

**No H.265 bitstream parser exists.** The backend has `h264_slice_header.{c,h}` for H.264 slice-header parsing (precedent that the codebase does this when needed), but no `h265_*` parser file.

**VAAPI's `VAPictureParameterBufferHEVC` only exposes RPS COUNTS, not contents.** Confirmed by grepping all `VAPicture*HEVC` field references in h265.c: only `num_short_term_ref_pic_sets` and `num_long_term_ref_pic_sps` are read; there is no `delta_poc_s0_minus1[]`, no `delta_idx_minus1`, and no other per-RPS field. VAAPI's struct simply doesn't carry them.

**Kernel struct shapes for the new controls** (from `~/src/linux-rockchip/include/uapi/linux/v4l2-controls.h`):

```c
struct v4l2_ctrl_hevc_ext_sps_st_rps {  // dynamic array, sized by sps->num_short_term_ref_pic_sets, ≤65 entries
    __u8  delta_idx_minus1;
    __u8  delta_rps_sign;
    __u8  num_negative_pics;
    __u8  num_positive_pics;
    __u32 used_by_curr_pic;
    __u32 use_delta_flag;
    __u16 abs_delta_rps_minus1;
    __u16 delta_poc_s0_minus1[16];
    __u16 delta_poc_s1_minus1[16];
    __u16 flags;  // V4L2_HEVC_EXT_SPS_ST_RPS_FLAG_INTER_REF_PIC_SET_PRED
};

struct v4l2_ctrl_hevc_ext_sps_lt_rps {  // dynamic array, sized by sps->num_long_term_ref_pics_sps, ≤65 entries
    __u16 lt_ref_pic_poc_lsb_sps;
    __u16 flags;  // V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT
};
```

`linux-api-headers 6.19-1` on ampere does NOT define these, so the backend would need a local UAPI shim (there is currently no `hevc-ctrls/` directory in the backend; one would have to be added).

**Kernel function that crashes** (from `rkvdec-hevc-common.c:380-410`):

```c
static void rkvdec_hevc_prepare_hw_st_rps(struct rkvdec_hevc_run *run, struct rkvdec_rps *rps,
                                          struct v4l2_ctrl_hevc_ext_sps_st_rps *cache)
{
    if (!run->ext_sps_st_rps)
        return;  // ← early return for NULL pointer
    if (!memcmp(cache, run->ext_sps_st_rps,  // ← OOPSes here per the stack trace
                sizeof(struct v4l2_ctrl_hevc_ext_sps_st_rps)))
        return;
    /* ... per-element processing */
}
```

The crash IS in this memcmp. For the crash to happen at all:

- `run->ext_sps_st_rps` must be non-NULL (else the early return fires before the memcmp), AND
- `memcmp` must dereference an unmapped / invalid address from one of `cache` or `run->ext_sps_st_rps`.

**Open mechanism question**: how does `run->ext_sps_st_rps` become a non-NULL pointer to invalid memory when userspace never sets the control? Two candidates:

- (a) The V4L2 control framework auto-allocates the control's `p_cur.p` to a default-zeroed buffer; later, `v4l2_ctrl_find` returns a control whose `p_cur.p` is a stale sentinel after some state transition.
- (b) The control storage is lazily allocated only on first set, but `v4l2_ctrl_find` returns the registered control object whose `p_cur.p` is whatever the registration-time stub left it as (likely uninitialized).

Resolving (a) vs (b) requires reading `drivers/media/v4l2-core/v4l2-ctrls-*.c` for the auto-allocation behavior of dynamic-array controls. That is Phase 2 work, not Phase 0.

## In-session baseline anchor for iter2

The HEVC OOPS reproducer remains as captured in `ampere-fourier` iter1 Phase 0:

```sh
LIBVA_DRIVER_NAME=v4l2_request \
ffmpeg -hide_banner -hwaccel vaapi -hwaccel_output_format vaapi \
  -i ~/measurements/encoded/bbb_60s_720p.hevc.mp4 \
  -vf "hwdownload,format=nv12" -frames:v 30 -f null -
# → kernel OOPS in dmesg, v4l2_mem2mem wedges all decoders until reboot
```

This is the iter2 falsifier: if a backend patch makes this stop OOPSing, the survey hypothesis is corroborated. If it still OOPSes the same way, the mechanism is something else.

## Existing precedent: UAPI shim files

The backend currently has NO `hevc-ctrls/` directory (was searched; doesn't exist). The H.264 path uses system kernel headers via `<linux/v4l2-controls.h>`. Adding new HEVC CIDs that aren't in `linux-api-headers 6.19-1` will require:

- Adding a `hevc-ctrls/` directory with a local stub header that defines the missing constants + structs (matching the kernel 7.0 definitions verbatim).
- OR bumping the `linux-api-headers` package on ampere to 7.0+.

Per the fresnel-iter25 / `feedback_rkvdec_image_fmt_pre_seed` precedent, the backend ships local UAPI shims when the kernel side gets ahead of distro headers. Iter2 follows that precedent unless the operator prefers the headers-bump route.

## Open questions tabled into Phase 1

1. **Architecture for RPS data sourcing** (the BIG one): given VAAPI doesn't expose the RPS table contents, how does the backend obtain them?
   - **(A) Implement H.265 SPS bitstream parser in the backend**: ~800-1500 lines of new code, well-defined per H.265 spec §7.3.2.2 + §7.3.7, follows `h264_slice_header.c` precedent. Highest scope, but self-contained and doesn't add dependencies.
   - **(B) Test the "minimal patch with zero-init RPS data" hypothesis first**: if just registering the controls (with `delta_idx_minus1=0, num_*_pics=0` etc.) eliminates the OOPS, then HEVC decode probably produces wrong/black frames but doesn't crash. This stages the risk: the zero-init patch confirms the mechanism first, and real parsing follows as a second stage. This is the staged approach Phase 1 of the META campaign already named as iter2's first concrete action.
   - **(C) Link libavcodec's HEVC parser**: adds a build-time dep on FFmpeg's HEVC code, would expose `H265RawSPS`. Avoids reimplementing the parser. Outside typical campaign practice (the backend is minimal-deps); operator decision.
   - **(D) Some other channel I haven't identified**: e.g. ffmpeg-vaapi's `VABufferType` ecosystem may have an SPS-RPS extension somewhere; Phase 2 would need to confirm.
2. **`linux-api-headers` shim vs bump**: ship `hevc-ctrls/` per iter25 precedent, or bump the package?
3. **Mechanism reconstruction depth**: do we need to read `v4l2-ctrls-*.c` to fully understand WHY the OOPS happens, or is "make ext_sps_*_rps non-NULL with valid data" empirically sufficient to validate the fix?
4. **Test-decode reference clip**: BBB 60s 720p HEVC is the iter1 substrate; works for iter2 too. No new clip needed.
5. **Phase 7 verification anchor**: ampere-fourier iter1 baseline (H.264 + VP8 + MPEG-2 still PASS C1-C6) PLUS new HEVC C1-C6. Iter2's Phase 1 success criteria should mirror iter1's per-codec C1-C6 with HEVC added; the floor for HEVC SSIM Y at f720 is expected in H.264-drift territory (~0.65 ± 0.05) per fresnel iter1 + ampere iter1 convergent observations.

## Phase 0 close

Substrate captured: 5 existing HEVC controls in the backend, no H.265 parser, VAAPI doesn't expose RPS contents, kernel struct shapes documented, mechanism partially understood (memcmp dereferences invalid memory; precise cause = open Q3). 5 open questions for Phase 1, with Q1 (architecture for RPS sourcing) being the load-bearing decision.

---

## Upstream-consumer survey (added 2026-05-16 post-Phase-0)

Per `feedback_upstream_alignment_over_speed`, surveyed real upstream V4L2 stateless HEVC consumers for the `EXT_SPS_*_RPS` pattern. Subagent transcript: `~/.../tasks/aa6f3e6382bc0d721.output`. Findings:

| Consumer | Status | Pattern |
|----------|--------|---------|
| **GStreamer** | **MERGED for GStreamer 1.28** ([!10820](https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/10820)) | Walks its own `gst_h265_parser_*`'s `GstH265SPS.short_term_ref_pic_set[]` array; field names match the H.265 spec, one-to-one mapping to the V4L2 struct. **Header strategy: runtime-optional control probe, NO `#ifndef` shim.** File: `subprojects/gst-plugins-bad/sys/v4l2codecs/gstv4l2codech265dec.c`, function `gst_v4l2_codec_h265_dec_fill_ext_sps_rps` |
| **FFmpeg (Casanova WIP)** | Not yet on ffmpeg-devel (branch `v4l2-request-ext-sps-rps-n8.0.1` at gitlab.collabora.com) | Walks libavcodec's internal `HEVCSPS->st_rps[]` (field names differ from the spec: `rps_predict`, `delta_idx`, `abs_delta_rps`, etc.; requires translation). LT_RPS commented out (incomplete). Function: `fill_ext_sps_st_rps` in `libavcodec/v4l2_request_hevc.c` |
| cros-codecs | No support yet (would parse via own `cros_codecs::codec::h265::parser::Sps` when added, same shape as GStreamer) | n/a |
| Casanova kernel-test framework | fluster through GStreamer 1.28 + Collabora FFmpeg WIP; no separate reference consumer | n/a |
| Bootlin `libva-v4l2-request` | Dormant since 2019, no 7.0-UAPI work | n/a |

**Upstream-aligned pattern is unambiguous**: parse the H.265 SPS NAL ourselves, populate the V4L2 controls from our parser's output. Both active upstream consumers (GStreamer merged, FFmpeg WIP) follow this exactly. VAAPI does not and will not expose the RPS array content, so we must parse.

**GStreamer's mapping is the cleanest reference**: `GstH265ShortTermRefPicSet` field names mirror the H.265 spec, so the V4L2-control assignment is mechanical. FFmpeg's renaming gymnastics are a useful cross-check but should NOT be the primary template.

**Header strategy decided**: no `#ifndef` shim. Mirror GStreamer's "optional control" probe path: at backend init, `VIDIOC_QUERYCTRL` the two new CIDs; if both are present and the active driver-kind is VDPU381/383 HEVC, set them; if absent, log + skip (graceful fallback for older kernels). Constants + struct shapes need to be available at *compile* time, however, so the build pipeline either requires `linux-api-headers` ≥ 7.0 OR ships a minimal internal header with just the two new CIDs + structs (with a comment pointing to the upstream UAPI source). Picking which of those is a tactical Phase 4 detail.

## Phase 0 update — Q1 (architecture) and Q2 (UAPI shim) resolved

- **Q1 (architecture for RPS data sourcing)**: **implement H.265 SPS parser in backend** (option (A) in the Phase 1 list above; referred to as "B" in this update, hence the B1 to B3 labels below), mirroring GStreamer's `gst_v4l2_codec_h265_dec_fill_ext_sps_rps` pattern with one-to-one spec-compliant field names. Covers the per-RPS-set (ST) and LT_RPS arrays.
- **Q2 (UAPI shim vs headers bump)**: **runtime-optional control probe** (not a header `#ifndef` shim). Compile-time access to the new CIDs/structs is handled via either a headers package bump OR a minimal internal header; Phase 4 picks tactically.
- **Q3 (mechanism reconstruction depth)**: now lower-priority. Once the backend populates valid RPS data per the upstream pattern, the OOPS should be gone whatever its precise cause was. If somehow it isn't, loop back to Phase 0 with whatever new evidence the failure surfaces.
- **Q4 (test clip)**: unchanged, BBB iter1 carries.
- **Q5 (Phase 7 anchor)**: unchanged, ampere-fourier iter1 + HEVC C1-C6 added.

Sub-question remaining for Phase 1 lock: **what's the H.265 SPS parser source?** Three options, all upstream-aligned:

- **(B1) Vendor GStreamer's parser**: copy `subprojects/gst-plugins-bad/codecparsers/gsth265parser.c` (LGPL, compatible with libva backend license). Keeps backend self-contained; reuses thoroughly tested code; carries forward GStreamer's spec-compliant field naming. Mostly a copy + minor adaptation (drop GLib dependency or replace with libc equivalents).
- **(B2) Adapt GStreamer's parser to the backend's idioms**: same data flow but rewritten to match `h264_slice_header.c` style (plain C, no GLib). More work; fewer LOC.
- **(B3) Implement minimal SPS-RPS-only parser fresh from H.265 spec §7.3.7**: narrowest scope (just the bits needed for the two controls), but does not benefit from GStreamer's edge-case handling.

(B1) is the most upstream-aligned. (B2) is the same data flow with the project's house style. (B3) is the most minimal but reinvents the parser.

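
To give a feel for the scope of (B3), here is a minimal sketch of the explicit branch of `st_ref_pic_set()` from H.265 §7.3.7, written from the spec syntax and not taken from GStreamer or FFmpeg. Assumptions: all `sketch_*` names are hypothetical; the input is RBSP data with emulation-prevention bytes already removed; inter-RPS prediction (`inter_ref_pic_set_prediction_flag == 1`) is deliberately rejected and left unimplemented; and the packing of the per-entry `used_by_curr_pic_s0/s1` flags into bitmasks is local to this sketch, not the V4L2 control layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_bitreader {
    const uint8_t *buf;
    size_t size_bits; /* number of valid bits in buf */
    size_t pos;       /* next bit to read, MSB first */
};

struct sketch_st_rps {
    uint8_t  num_negative_pics;
    uint8_t  num_positive_pics;
    uint16_t delta_poc_s0_minus1[16];
    uint16_t delta_poc_s1_minus1[16];
    uint32_t used_by_curr_pic_s0; /* bit i = used_by_curr_pic_s0_flag[i] (sketch-local) */
    uint32_t used_by_curr_pic_s1; /* bit i = used_by_curr_pic_s1_flag[i] (sketch-local) */
};

/* u(1): read one bit, failing at end of data. */
static int sketch_read_bit(struct sketch_bitreader *br, uint32_t *out)
{
    if (br->pos >= br->size_bits)
        return -1;
    *out = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
    br->pos++;
    return 0;
}

/* ue(v): unsigned Exp-Golomb, i.e. N leading zeros, a 1, then N suffix bits. */
static int sketch_read_ue(struct sketch_bitreader *br, uint32_t *out)
{
    uint32_t zeros = 0, bit, suffix = 0;

    for (;;) {
        if (sketch_read_bit(br, &bit))
            return -1;
        if (bit)
            break;
        if (++zeros > 31)
            return -1;
    }
    for (uint32_t i = 0; i < zeros; i++) {
        if (sketch_read_bit(br, &bit))
            return -1;
        suffix = (suffix << 1) | bit;
    }
    *out = ((1u << zeros) - 1u) + suffix;
    return 0;
}

/* Read n (delta_poc_sX_minus1, used_by_curr_pic_sX_flag) pairs. */
static int sketch_read_pocs(struct sketch_bitreader *br, unsigned int n,
                            uint16_t *delta, uint32_t *used)
{
    for (unsigned int i = 0; i < n; i++) {
        uint32_t d, flag;

        if (sketch_read_ue(br, &d) || d > 0x7fff || sketch_read_bit(br, &flag))
            return -1;
        delta[i] = (uint16_t)d;
        *used |= flag << i;
    }
    return 0;
}

/* Returns 0 on success, -1 on malformed data, -2 for inter-predicted sets. */
static int sketch_parse_st_rps(struct sketch_bitreader *br, unsigned int st_rps_idx,
                               struct sketch_st_rps *rps)
{
    uint32_t pred = 0, neg, pos;

    memset(rps, 0, sizeof(*rps));
    if (st_rps_idx != 0 && sketch_read_bit(br, &pred))
        return -1;
    if (pred)
        return -2; /* inter_ref_pic_set_prediction_flag: not handled in this sketch */
    if (sketch_read_ue(br, &neg) || sketch_read_ue(br, &pos))
        return -1;
    if (neg > 16 || pos > 16)
        return -1;
    rps->num_negative_pics = (uint8_t)neg;
    rps->num_positive_pics = (uint8_t)pos;
    if (sketch_read_pocs(br, neg, rps->delta_poc_s0_minus1, &rps->used_by_curr_pic_s0))
        return -1;
    return sketch_read_pocs(br, pos, rps->delta_poc_s1_minus1, &rps->used_by_curr_pic_s1);
}
```

A real (B3) parser would also need the SPS fields that precede the RPS loop in §7.3.2.2, the inter-prediction branch, and the long-term entries, which is where most of the edge-case handling mentioned under (B3) lives.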