## Iteration 20: Phase 8 (close)

Closes 2026-05-14. iter20 = kernel printk for the `&ctx->ctrl_hdl`, `run.sps`, and `run.decode_params` pointers plus the first 16 bytes of each, executed at the top of `rkvdec_hevc_run` (after `rkvdec_hevc_run_preamble`). FULL close. Mechanism 4 reframed; root cause localized to one kernel layer, the request content copy in the V4L2 control framework.

### Method

`linux-fresnel-fourier 7.0-4` adds a `rkvdec_iter20:` printk to the RK3399 `rkvdec_hevc_run`:

```c
{
	/* Byte views of the control payloads fetched by rkvdec_hevc_run_preamble(). */
	u8 *sps_bytes = (u8 *)run.sps;
	u8 *dp_bytes = (u8 *)run.decode_params;

	/*
	 * %p prints a hashed pointer; %*ph dumps 16 bytes as hex.
	 * Note: the "" fallback is a 1-byte object, so a NULL pointer would
	 * make %*ph read past it. Neither pointer was NULL in this run.
	 */
	pr_info("rkvdec_iter20: ctrl_hdl=%p sps=%p sps[0..16]=%*ph "
		"dp=%p dp[0..16]=%*ph\n",
		&ctx->ctrl_hdl, run.sps, 16, sps_bytes ? sps_bytes : (u8 *)"",
		run.decode_params, 16, dp_bytes ? dp_bytes : (u8 *)"");
}
```

Deployed via scp + `pacman -U` + reboot, with sddm autologin reseating the mfritsche session. Build wall-clock: 50 min on boltzmann.

### Results

**libva HEVC** (13 frames, all identical):

```
rkvdec_iter20: ctrl_hdl=00000000f9b036ba sps=00000000105406cf sps[0..16]=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 dp=00000000117b947e dp[0..16]=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
```

**kdirect HEVC** (15 frames):

```
rkvdec_iter20: ctrl_hdl=00000000d3afe1db sps=0000000095c47ba1 sps[0..16]=00 00 00 05 d0 02 00 00 04 04 02 04 01 01 00 03 dp=00000000599ee83f dp[0..16]=00..04..03
```

The kdirect `dp[0..16]` bytes vary per frame. That is correct: decode_params is per-frame.

### What this proves

1. **`&ctx->ctrl_hdl` differs between processes** (libva `f9b036ba`, kdirect `d3afe1db`). EXPECTED. Each backend opens `/dev/video3` separately, and each gets its own `rkvdec_ctx` with its own private `ctrl_hdl`. This is normal V4L2 m2m.
2. **The `sps` pointer is stable across all libva frames** (`105406cf`). This confirms the SPS control is registered to the handler exactly once (at CreateContext / `rkvdec_init_ctrls`). The allocation exists and `v4l2_ctrl_find()` returns it correctly. Not a registration bug.
3. **libva's `*sps` content is all-zero**, while **kdirect's `*sps` has real bytes** (`00 00 00 05 d0 02 00 00 04 04 02 04 01 01 00 03`). Assuming the `struct v4l2_ctrl_hevc_sps` layout begins with two one-byte parameter-set IDs, bytes 2 and 3 (`00 05`) hold `pic_width_in_luma_samples` in little-endian, `0x0500 = 1280`, which matches kdirect's `rkvdec_hevc_run` printk showing `w=1280`. Under the same assumption the next two bytes (`d0 02`) would be a height of `0x02d0 = 720`. libva's bytes are zero, so its `w=0 h=0` printk follows.
4. **libva's `*decode_params` is also all-zero** across all 13 frames, while kdirect's varies per frame. This confirms that libva's decode_params never reach `ctx->ctrl_hdl` with non-zero values either.

### Mechanism analysis

The SPS control is **registered to `ctx->ctrl_hdl`** (pointer valid, stable, same allocation across 13 frames). What's missing is the **content copy** from the `S_EXT_CTRLS` userspace payload into the registered control's `p_cur.p` memory.

The V4L2 control-framework path for compound controls with `which=V4L2_CTRL_WHICH_REQUEST_VAL=0x0f010000`:

```
userspace VIDIOC_S_EXT_CTRLS (which=REQUEST_VAL, request_fd=R, payload=...)
  → kernel v4l2_s_ext_ctrls()
    → which==REQUEST_VAL branch: looks up R's media_request, stages payload into req->p_new for each control
    → returns 0
userspace MEDIA_REQUEST_IOC_QUEUE on fd R
  → kernel queues req's pending bufs and pending controls
  → m2m schedules job
  → device_run callback
    → rkvdec_hevc_run_preamble():
        v4l2_ctrl_request_setup(req, &ctx->ctrl_hdl): copies req->p_new → ctx->ctrl_hdl[ctrl]->p_cur
    → rkvdec_hevc_run(): printk fires here, reads ctx->ctrl_hdl values
```

For libva, the printk fires at the **read** site and observes all-zero.
Three places this can fail:

| # | Where | Likelihood |
|---|---|---|
| A | `v4l2_s_ext_ctrls` doesn't stage libva's payload into `req->p_new` for SPS | unknown, needs probe |
| B | `req->p_new` has correct bytes but `v4l2_ctrl_request_setup` doesn't run for libva's request | unknown, needs probe |
| C | `v4l2_ctrl_request_setup` runs but doesn't copy SPS for libva's request | unknown, needs probe |

The kernel-direct path WORKS through the same control framework on the same kernel and the same `/dev/video3`, so the bug is in **how libva invokes the request lifecycle**, not in the framework code itself.

### Mechanism status update (post-iter20)

| # | Mechanism | Status |
|---|---|---|
| 1 | request_fd mismatch (S_EXT_CTRLS R1, QUEUE R2) | strongly disfavored (strace shows consistent fd per frame, but worth one explicit verification) |
| 2 | REINIT clears between S_EXT_CTRLS and QUEUE | DISPROVED iter19 |
| 3 | Stack-locals stale | DISPROVED iter18 |
| 4 | ctrl_hdl mismatch (different handlers) | **REFRAMED iter20**: handlers differ (expected per-process), but BOTH register SPS correctly, and `ctx->ctrl_hdl` reads stable pointers. NOT a routing bug. |
| 5 | error_idx silent partial fail | DISPROVED iter18 |
| 6 | **NEW iter20**: `req->p_new` for SPS never receives libva's payload, OR `v4l2_ctrl_request_setup` never copies it into `ctx->ctrl_hdl` | **leading hypothesis** |

### User-level test for iter21

Libva can self-diagnose between A and B/C without kernel patches. After `S_EXT_CTRLS(which=REQUEST_VAL, request_fd=R, payload=...)`, immediately issue:

- `G_EXT_CTRLS(which=REQUEST_VAL, request_fd=R)` for SPS.

If the readback returns non-zero bytes → **`req->p_new` HAS the payload** (mechanism A disproved, B or C remains). If the readback returns zero → **`req->p_new` doesn't have it** (mechanism A confirmed).

The G_EXT_CTRLS path with `which=REQUEST_VAL` reads from `req->p_new` directly; that is the staging slot. The outcome localizes the bug to one of two kernel layers.
### Substrate state at iter20 close

- Backend SHA on fresnel: `c1d4bb53…` (iter15 stable, unchanged).
- Fork tip `415688d` (iter19 state, unchanged).
- Kernel `linux-fresnel-fourier 7.0-4` with iter17 + iter20 printk in `rkvdec_hevc_run`. NOT a shipping kernel; diagnostic only.
- 5-codec anchors: unchanged from iter15. Zero regression.

### iter21 candidate

`α-24`: Add a G_EXT_CTRLS readback in libva's `h265_set_controls` right after every `v4l2_set_controls(... which=REQUEST_VAL ...)` call. Log the first 16 bytes of the returned SPS. ~15 LOC, fully reversible. Test in this single iter, then revert (diagnostic only, not for shipping).

Outcomes:

- **Non-zero readback** → `req->p_new` has libva's payload. The bug is in `v4l2_ctrl_request_setup` not running or not copying. iter22 = kernel printk in `v4l2_ctrl_request_setup` showing what gets copied for libva's request_fd at IOC_QUEUE time.
- **Zero readback** → `req->p_new` doesn't have libva's payload. The bug is in `v4l2_s_ext_ctrls` staging for libva's invocation. iter22 = kernel printk in `v4l2_s_ext_ctrls` showing what libva actually passed.

### Lesson

iter17 + iter20 prove that `&ctx->ctrl_hdl` pointer routing is NOT the failure surface (registered controls are allocated correctly, found correctly, and pointer-stable). The failure surface is the **content copy** from userspace S_EXT_CTRLS into `ctx->ctrl_hdl` across the request lifecycle. Three iterations (17, 19, 20) of kernel printk have walked the bug localization down from "anywhere in the kernel" → "S_EXT_CTRLS staging or `v4l2_ctrl_request_setup` application". Two more printk+probe iterations should reach the line of code.