# Iter3 close — kernel-side root cause located Date: 2026-05-16 (afternoon) Branch: `master` Substrate: ampere `7.0.0-rc3-devices+` (mmind v7.0 + Casanova HEVC 7.0 series + ampere DTS) Backend: `libva-v4l2-request-fourier` iter2 + iter3 diagnostic instrumentation (md5 `404041ea`) ## Bottom line **The HEVC OOPS on RK3588 is a one-line kernel bug in the Casanova/Collabora 7.0 HEVC series.** Userspace work to vendor the GStreamer parser and submit `V4L2_CID_STATELESS_HEVC_EXT_SPS_{ST,LT}_RPS` controls is the *correct upstream-aligned approach* for HEVC streams that need it, BUT the OOPS occurs even without those controls because of a missing struct-field initialization in the kernel. ## Falsifier outcome F1 (no kernel OOPS): **TRUE — F1 fires again**, same as iter2. OOPS reproduced deterministically on fresh-boot run at 12:30:15 with iter3 instrumented build. F2 (vendored GStreamer parser extracts SPS from libavcodec-stripped source): **FALSE — verified empirically.** Diagnostic log shows `populate_ext_sps_rps_cache entry: src=0xffff7de54000 src_size=280 cache_valid=0` → `nal_iter=0 pr=5 offset=0 nalu.type=20 nalu.size=277` → `exit: err=-61 cache_valid=0`. First NAL in source_data is type 20 (`IDR_N_LP slice`), not type 33 (`SPS`). ffmpeg-vaapi strips VPS/SPS/PPS — confirms Phase 5 reviewer's prediction. This is consistent with the iter2 strace finding and not the bug. ## Root cause (Q1 + Q3 combined) In `drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-hevc.c:591`: ```c static int rkvdec_hevc_run(struct rkvdec_ctx *ctx) { struct rkvdec_dev *rkvdec = ctx->dev; struct rkvdec_hevc_run run; // ← UNINITIALIZED struct rkvdec_hevc_ctx *hevc_ctx = ctx->priv; struct rkvdec_hevc_priv_tbl *tbl = hevc_ctx->priv_tbl.cpu; rkvdec_hevc_run_preamble(ctx, &run); ... } ``` `rkvdec_hevc_run_preamble` in `rkvdec-hevc-common.c:478` assigns most `run->*` fields unconditionally, but the new EXT_SPS fields are conditional: ```c if (ctx->has_sps_st_rps) { ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS); run->ext_sps_st_rps = ctrl ? ctrl->p_cur.p : NULL; } if (ctx->has_sps_lt_rps) { ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_HEVC_EXT_SPS_LT_RPS); run->ext_sps_lt_rps = ctrl ? ctrl->p_cur.p : NULL; } ``` When `ctx->has_sps_st_rps == false` (the common case for any HEVC stream where userspace doesn't supply EXT_SPS_*_RPS), `run.ext_sps_st_rps` retains **stack garbage** — the leftover bytes from a previous stack frame at the same offset. Empirically on the CoolPi CM5 GenBook this garbage is deterministically `0x51a0` per call, because the calling pattern is fixed. Downstream in `rkvdec_hevc_prepare_hw_st_rps` (rkvdec-hevc-common.c:380): ```c if (!run->ext_sps_st_rps) return; /* ← garbage 0x51a0 is non-NULL */ if (!memcmp(cache, run->ext_sps_st_rps, sizeof(struct v4l2_ctrl_hevc_ext_sps_st_rps))) return; /* ← memcmp derefs 0x51a0 → OOPS */ ``` The NULL-check is bypassed by stack garbage; `memcmp` faults at the bogus address. This is reachable on every HEVC decode that hits the assemble path on `vdpu381_variant` (RK3588) and `vdpu383_variant` (RK3576) — i.e., every stream where the dispatcher gate in `rkvdec-vdpu{381,383}-hevc.c:606` enters the else branch. ## Minimal kernel patch Either of these one-line patches fixes the OOPS without changing any behavior for valid use cases: **Option A — fix at the producer (preamble):** ```diff --- a/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c @@ -495,6 +495,9 @@ void rkvdec_hevc_run_preamble(struct rkvdec_ctx *ctx, ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_HEVC_SCALING_MATRIX); run->scaling_matrix = ctrl ? ctrl->p_cur.p : NULL; + + run->ext_sps_st_rps = NULL; + run->ext_sps_lt_rps = NULL; if (ctx->has_sps_st_rps) { ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, ``` **Option B — fix at the callers (zero-init the run struct):** ```diff --- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-hevc.c +++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-hevc.c @@ -588,7 +588,7 @@ static int rkvdec_hevc_run(struct rkvdec_ctx *ctx) { struct rkvdec_dev *rkvdec = ctx->dev; - struct rkvdec_hevc_run run; + struct rkvdec_hevc_run run = {}; struct rkvdec_hevc_ctx *hevc_ctx = ctx->priv; struct rkvdec_hevc_priv_tbl *tbl = hevc_ctx->priv_tbl.cpu; ``` (same for `rkvdec-vdpu383-hevc.c`) Option A is more localized to the *cause* (the preamble's incomplete initialization). Option B is more defensive (catches any future fields added without explicit init). Either works. ## Why iter2's userspace fix matters separately Independently of the kernel patch, the iter2 work (vendored GStreamer 1.28.2 parser + UAPI shim + h265 cache population) IS the upstream-aligned path for HEVC streams where the SPS parameter set semantics require the new ST/LT RPS controls. When ffmpeg-vaapi forwards a buffer that DOES contain the SPS (e.g. some bitstream wrappers, or future ffmpeg versions that don't strip VPS/SPS/PPS), our parser activates and submits the EXT_SPS_*_RPS controls — exactly the upstream contract from the Casanova series. For the current ffmpeg-vaapi (which strips SPS), our backend correctly returns ENODATA from the parser and falls back to submitting only the standard 5 controls. With the kernel patch above, this fallback succeeds without OOPS. (There remains a separate question: for bbb_60s_720p.hevc, the SPS encoded by libx265 may or may not actually require EXT_SPS_*_RPS — typical streams use simple ST_RPS that the kernel can derive from slice headers alone, which is what the current rkvdec dispatcher path expects. The user-pref direction is "do it correctly aligned with upstream architecture", so the parser+cache+inject path stays.) ## Phase 6 question completion | Q | Answer | |---|--------| | Q1 — what is `0x51a0` | Stack garbage at `run.ext_sps_st_rps` offset from prior frame. Deterministic per calling pattern. | | Q2 — where is `ctrl->p_cur.p = 0x51a0` set | N/A — not a ctrl pointer corruption. The `0x51a0` is local stack memory, not from the V4L2 ctrl handler. | | Q3 — when `ctx->has_sps_st_rps` true | FALSE in our case. The OOPS triggers via the missing-init-of-run, not via the has_sps_st_rps gate. | | Q4 — backend log fires? | YES, after fflush patch + picture.c entry-point instrumentation. Confirmed full path BeginPicture → RenderPicture×2 → EndPicture → h265_set_controls → populate_ext_sps_rps_cache → ENODATA → fallback to 5-ctrl batch → EINVAL → MEDIA_REQUEST_IOC_QUEUE → kernel OOPS. | | Q5 — vdpu381 vs vdpu383 on RK3588 | RK3588 → `vdpu381_variant`. Confirmed in `drivers/media/platform/rockchip/rkvdec/rkvdec.c:1757` (`"rockchip,rk3588-vdec"` → `&vdpu381_variant`). vdpu383 binds only to RK3576. | ## Open follow-ups - **kernel-agent issue (new)**: file the 1-line patch above as an upstream-aligned fix to the Casanova v7.0 series. Tag for: applies to vdpu381 + vdpu383 both; reproducer is "any HEVC stream where ffmpeg-vaapi strips SPS" (i.e., everything). - **iter4 plan**: with kernel patched, re-run HEVC and confirm decode succeeds. Then iterate on the standard-5-controls EINVAL (separate issue — likely a slice_params field shape or DECODE_PARAMS flag the iter2 backend isn't setting correctly). - **Optional userspace robustness**: after kernel patch, the EXT_SPS_*_RPS path stays dormant on standard streams; consider deferring h265_populate to *only when SPS is detected in source_data* (cheap optimization). - **Diagnostic instrumentation removal**: iter3-Q4 logs in `picture.c` + `h265.c` + `utils.c::fflush` should be stripped once iter4 close confirms the kernel patch works. fflush in utils.c can stay (cheap, useful). ## Substrate state at close - Backend `.so`: md5 `404041ea2dcc03c769e0ab8c43ddadd6` on ampere `/usr/lib/dri/` - Backend source: ampere `~/src/libva-v4l2-request-fourier` working tree has uncommitted iter3 diagnostics (picture.c + utils.c + h265.c) - Kernel: unchanged from iter2 substrate (waiting on kernel-agent issue + patch land) Iter3 closes with **root cause identified and a 1-line fix proposed**, going from negative-result (iter2 close) to definitive-finding in one diagnostic iteration. The 8-phase loop's Phase 5 (review) recommendation in iter2 to add backend logging was the unblocking move.