# Iter4 close: second kernel bug, missing HEVC_SLICE_PARAMS registration

Date: 2026-05-16 (afternoon, immediately following iter3 close)
Branch: `master`
Substrate: ampere `7.0.0-rc3-devices+` with the iter3 fix (ext_sps NULL init) carried in.
Backend: iter3 instrumented build, md5 `404041ea2dcc03c769e0ab8c43ddadd6`, deployed at `/usr/lib/dri/`.

## Bottom line

**The Casanova/Collabora v7.0 HEVC series forgot to register `V4L2_CID_STATELESS_HEVC_SLICE_PARAMS` in the new `vdpu38x_hevc_ctrl_descs[]` table.** The legacy `rkvdec_hevc_ctrl_descs[]` (RK3399 path) has it; the new vdpu381/vdpu383 path doesn't. Result: every per-frame `VIDIOC_S_EXT_CTRLS` returns `-EINVAL` ("cannot find control id 0xa40a92") and userspace falls through to queue requests with no controls committed → decoder runs on zero-init control state → all-zero output (or worse, an OOPS on uninitialized memory before the iter3 fix).

## Falsifier outcome

F1 (kernel rejects 5-ctrl batch with EINVAL): **TRUE pre-patch**. Confirmed by enabling `V4L2_DEV_DEBUG_CTRL` (bit 0x20) on `/sys/class/video4linux/videoN/dev_debug`, which surfaced the previously silent `prepare_ext_ctrls: cannot find control id 0xa40a92` dprintk.

F2 (registering HEVC_SLICE_PARAMS in `vdpu38x_hevc_ctrl_descs` makes the kernel accept the batch): **FALSE → TRUE**. A one-entry patch (6 source lines with formatting) eliminated the EINVAL. ffmpeg exits 0, and dmesg is fully clean of `S_EXT_CTRLS: error` and `cannot find control id`. The decoder runs.

F3 (decoder produces non-empty output post-patch): **FALSE**. Output `/tmp/o.nv12` is 4147200 bytes (the correct size for 3 NV12 frames) but contains only Y=16 (luma "video black") and Cb/Cr=128 (chroma neutral), i.e. solid black. The decoder runs but the bitstream isn't being interpreted. This is the iter5 hand-off bug.

## Root cause (iter4 Phase 6)

`drivers/media/platform/rockchip/rkvdec/rkvdec.c` has two HEVC ctrl_descs arrays:

| array | line | registers SLICE_PARAMS? |
|-------|------|-------------------------|
| `rkvdec_hevc_ctrl_descs[]` (legacy RK3399 path) | 189 | YES (dynamic array, dims={600}) |
| `vdpu38x_hevc_ctrl_descs[]` (Casanova RK3588/RK3576 path) | ~240 | **NO** |

Both are passed to the same `rkvdec_hevc_run_preamble` (rkvdec-hevc-common.c:478), which calls `v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_HEVC_SLICE_PARAMS)`. With the new table, the ctrl isn't in the handler, so userspace `VIDIOC_S_EXT_CTRLS` for this CID fails in `prepare_ext_ctrls` → returns -EINVAL → kernel sets `error_idx = cs->count` (since `set=true`) → backend sees `error_idx=5, count=5, err=-22`.

The reason this stayed silent in earlier debugging: the dprintks for `cannot find control id 0x%x` are gated behind `V4L2_DEV_DEBUG_CTRL = 0x20`. The default `/sys/.../dev_debug` value is 0, so no ctrl-class dprintks are emitted. Setting `0x3f` (all 6 bits) on every video device surfaced the lookup failure immediately.

## Minimal kernel patch (verified working)

```diff
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.c
@@ -242,6 +242,12 @@ static const struct rkvdec_ctrl_desc vdpu38x_hevc_ctrl_descs[] = {
 	{
 		.cfg.id = V4L2_CID_STATELESS_HEVC_DECODE_PARAMS,
 	},
+	{
+		.cfg.id = V4L2_CID_STATELESS_HEVC_SLICE_PARAMS,
+		.cfg.flags = V4L2_CTRL_FLAG_DYNAMIC_ARRAY,
+		.cfg.type = V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS,
+		.cfg.dims = { 600 },
+	},
 	{
 		.cfg.id = V4L2_CID_STATELESS_HEVC_SPS,
 		.cfg.ops = &rkvdec_ctrl_ops,
```

This mirrors the legacy `rkvdec_hevc_ctrl_descs[]` entry. `600` is the absolute maximum slices per frame for HEVC level > 6 (matches `visl` and legacy rkvdec).

## Verification (on-target empirical)

```
$ ssh ampere 'sudo dmesg --clear; LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i bbb_60s_720p.hevc.mp4 -vf hwdownload,format=nv12 -frames:v 3 -f rawvideo -pix_fmt nv12 /tmp/o.nv12; echo exit=$?'
exit=0
$ ssh ampere 'sudo dmesg | grep -E "rkvdec|cannot find|S_EXT_CTRLS: error"'
[Sat May 16 13:17:08 2026] rkvdec fdc40100.video-codec: missing multi-core support, ignoring this instance
$ ssh ampere 'ls -la /tmp/o.nv12; md5sum /tmp/o.nv12; head -c 4147200 /tmp/o.nv12 | od -An -tu1 -w1 | sort -u'
-rw-r--r-- 1 mfritsche mfritsche 4147200 May 16 13:17 /tmp/o.nv12
25ae521379343783da65b1fc80b1e8e8  /tmp/o.nv12
  16
 128
```

No dmesg errors. No EINVAL. ffmpeg exit 0. 3-frame NV12 output (correct size). All bytes are 16 (Y) or 128 (Cb/Cr): solid black, but a structurally valid decode (no OOPS, no truncation).

## Why output is still black (deferred to iter5)

Possible causes for iter5 to investigate:

1. **OUTPUT bitstream not reaching hardware.** The backend assembles slice NALs into `source_data`, but `slices_size` or the QBUF length may be wrong → hardware reads an empty buffer → produces a blank frame.
2. **Slice header field mismatch.** The backend's `h265_fill_slice_params` may put bit_size/data_byte_offset/slice_segment_addr in fields the kernel doesn't expect. Strace shows `bit_size=0x1038` (519 bytes) and `data_byte_offset=17`, which is plausible but unverified against the actual NAL.
3. **start_code prefix handling.** The backend prepends Annex-B `00 00 00 01` when `h264_start_code=true`. For HEVC under DECODE_MODE_FRAME_BASED + START_CODE_ANNEX_B (both registered in `vdpu38x_hevc_ctrl_descs`), this should match. However, the iter2 backend used `h264_start_code` as a profile-independent flag (per `feedback_unconditional_codec_state`); verify that it gates correctly for HEVC.
4. **DECODE_PARAMS dpb/poc fields.** For IDR frame 1, the dpb should be empty (num_active_dpb_entries=0) and num_poc_st_curr_before/after/lt_curr should be 0. If the backend sets non-zero values, the kernel may interpret them as needing references that don't exist.

iter5 starts with: enable `LIBVA_V4L2_DUMP_OUTPUT=