## Iteration 23 - Phase 8 (close)

Closes 2026-05-14. iter23 = kernel printk inside the `v4l2_ctrl_request_setup` outer loop, BEFORE the `continue` check, logging every iteration. FULL close.

### Method

`linux-fresnel-fourier 7.0-7` added one `pr_info` at the TOP of the outer loop in `v4l2_ctrl_request_setup`, BEFORE `if (ref->req_done || (ctrl->flags & V4L2_CTRL_FLAG_READ_ONLY)) continue;`:

```c
/* Fires for every ref the outer loop visits, including refs the early continue skips. */
pr_info("iter23_loop: id=0x%x req_done=%d flags=0x%x ncontrols=%d cluster0_id=0x%x\n",
	ctrl->id, ref->req_done, ctrl->flags, master->ncontrols,
	master->cluster[0] ? master->cluster[0]->id : 0);
```

### Result - definitive

**libva HEVC** (first setup): `iter23_loop` fires for 16 IDs, ending at 0xa40a90 (HEVC_SPS). **The outer loop EXITS before reaching 0xa40a91.**

**kdirect HEVC** (first setup): `iter23_loop` fires for 22 IDs, ending at 0xa40a96 (HEVC_START_CODE). **The outer loop completes normally.**

The loop body has only two loop-exit paths after the `iter23_loop` printk fires:

1. `goto error` if `req_to_new(r)` returns non-zero.
2. `break` if `try_or_set_cluster(NULL, master, true, 0)` returns non-zero.

For libva, ONE of these fires AT HEVC_SPS, exiting the loop. For kdirect, NEITHER fires.

This **fully overturns iter21/22**:

- The clone-hdl IS complete for libva (iter22 confirmed all 22 controls cloned).
- The setup loop reaches HEVC_SPS for libva (iter23 confirmed).
- The processing of HEVC_SPS in the setup loop FAILS for libva.

The failure of HEVC_SPS processing means:

- `p_cur` for HEVC_SPS is never committed → rkvdec reads zero (iter20 finding).
- All subsequent compound HEVC controls (PPS, SLICE_PARAMS, SCALING_MATRIX, DECODE_PARAMS, DECODE_MODE, START_CODE) NEVER reach their processing → their `req_done` stays false and they are never committed → all zero in `ctx->ctrl_hdl`.

### Why does HEVC_SPS processing fail for libva but not kdirect?

The most likely candidates:

| Function | Failure modes |
|---|---|
| `req_to_new(ref_SPS)` | -ENOENT if `!p_req_valid`. -EINVAL if elem count mismatch (`p_req_elems != p_array_alloc_elems` for non-dyn-array). -ENOMEM if alloc fails for dyn-array resize. |
| `try_or_set_cluster(NULL, master_SPS, true, 0)` | Validator failures (out-of-range field values). Cluster ops failures. Often returns -EINVAL or -ERANGE. |

iter24 will pinpoint which function fails and with what return value.

### iter21/22's interpretation errors

- **iter21**: I concluded the clone-hdl was missing controls. Wrong: the `iter21_setup_ref` printk was inside the loop body but AFTER the early-continue check. The "missing" controls were never iterated at all. SPS processing failed, the loop exited, and they never reached the iter21 printk.
- **iter22**: The clone trace confirmed the clone-hdl is complete. Good. But my mid-conclusion ("clone-hdl is complete; staging fails in setup loop SKIP path") was partially wrong: the loop doesn't SKIP, it EXITS.

### Mechanism status (post-iter23)

| # | Mechanism | Status |
|---|---|---|
| 1 | request_fd mismatch | DISPROVED iter17/18 |
| 2 | REINIT clears | DISPROVED iter19 |
| 3 | Stack-locals stale | DISPROVED iter18 |
| 4 | ctrl_hdl mismatch | DISPROVED iter20-22 |
| 5 | error_idx silent failure | DISPROVED iter18 |
| 6 | req->p_new staging incomplete | DISPROVED iter22 |
| 7 | Clone-hdl missing controls | DISPROVED iter22 |
| 8 | Skip-loop bypass | DISPROVED iter23 (loop EXITS, not skips) |
| 9 | **NEW iter23**: HEVC_SPS processing in v4l2_ctrl_request_setup fails for libva | **LEADING (iter24 candidate)** |

### iter24 candidate

`linux-fresnel-fourier 7.0-8`:

```c
ret = req_to_new(r);
/* Log the result before the existing `if (ret) goto error` check. */
pr_info("iter24_req_to_new: id=0x%x ret=%d p_req_valid=%d p_req_elems=%u\n",
	master->cluster[i]->id, ret, r->p_req_valid, r->p_req_elems);
...
ret = try_or_set_cluster(NULL, master, true, 0);
/* Log the result before the existing `if (ret) break` check. */
pr_info("iter24_try_or_set: master_id=0x%x ret=%d\n", master->id, ret);
```

After 7.0-8 deploys, libva HEVC will show:

- `iter24_req_to_new: id=0xa40a90 ret=X p_req_valid=Y p_req_elems=Z`, where X is the actual return value.
- If `req_to_new` ret != 0 → the bug is in `req_to_new` for HEVC_SPS on libva's staged data. Compare `p_req_elems` to kdirect's value.
- If `req_to_new` ret == 0 → check `iter24_try_or_set`'s ret. If non-zero → the validator rejects libva's SPS but accepts kdirect's. Investigate which field the validator rejects.

### Substrate state at iter23 close

- Backend SHA on fresnel: `c1d4bb53…` (iter15 stable, unchanged).
- Fork tip `e109306`, unchanged.
- Kernel `linux-fresnel-fourier 7.0-7` with iter17 + iter20 + iter21 + iter22 + iter23 printks.
- 5-codec anchors: unchanged.

### iter24 build kicked off

`linux-fresnel-fourier 7.0-8` building on boltzmann (PID 1672261, log /tmp/iter24-kbuild.log).

### Lesson

Three iterations of mid-loop printk (iter21, iter22, iter23) were needed to localize the exit. Each iteration overturned the previous one's partial conclusion. Key methodology: **place the diagnostic printk at the very top of each loop body, BEFORE any continue/break, to distinguish "skipped" from "exited"**. Without that, "missing from printk output" is ambiguous.

The bug is now localized to:

- A specific function: `req_to_new` OR `try_or_set_cluster`.
- A specific control: HEVC_SPS.
- A specific request lifecycle pattern: libva's, not kdirect's.

One more printk iteration (iter24) should give the failing function + return code.
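
The "skipped vs exited" distinction from the Lesson can be reproduced in user space. The following is a minimal sketch, not kernel code: `struct model_ref`, `struct loop_trace`, `model_setup_loop`, and `model_refs_never_visited` are hypothetical names, and `process_ret` stands in for whatever `req_to_new` or `try_or_set_cluster` would return. It assumes only the loop shape described above (early `continue`, then an exit on a non-zero return).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a control ref; NOT the kernel's v4l2_ctrl_ref. */
struct model_ref {
	unsigned int id;
	bool req_done;
	bool read_only;
	int process_ret; /* modelled return of the processing step */
};

/* What the top-of-loop log lets us reconstruct afterwards. */
struct loop_trace {
	size_t seen;          /* iterations that reached the top-of-loop log */
	size_t skipped;       /* iterations that took the early continue */
	unsigned int last_id; /* last id logged before the loop ended */
	int ret;              /* return value that ended the loop, 0 if none */
};

static struct loop_trace model_setup_loop(const struct model_ref *refs, size_t n)
{
	struct loop_trace t = { 0, 0, 0, 0 };
	size_t i;

	assert(refs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct model_ref *ref = &refs[i];

		/* Logged BEFORE the continue check, so skipped refs still appear. */
		printf("loop: id=0x%x req_done=%d read_only=%d\n",
		       ref->id, ref->req_done, ref->read_only);
		t.seen++;
		t.last_id = ref->id;

		if (ref->req_done || ref->read_only) {
			t.skipped++;
			continue;
		}

		t.ret = ref->process_ret;
		if (t.ret)
			break; /* models both `goto error` and `break` */
	}
	return t;
}

/* Refs that never produced a log line: non-zero means the loop exited early. */
static size_t model_refs_never_visited(const struct loop_trace *t, size_t n)
{
	return n - t->seen;
}
```

Feeding it a read-only ref, a ref whose `process_ret` is non-zero, and a third ref yields two log lines, one skip, and one ref never visited. With the log placed after the `continue` check instead, the read-only ref and the unvisited ref would both be absent from the output and indistinguishable.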