fresnel-fourier/phase4_iter21_plan.md
marfrit bf67900cd8 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24:    pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
           to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
           before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
           ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
           still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
           HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
           picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
           dp now match kdirect. HEVC frame 2+ still diverges
           (separate bug, likely DPB entry mapping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:10:56 +00:00

## Iteration 21 — Phase 4 (plan)
Opens 2026-05-14. Continues iter20's localization: rkvdec sees an all-zero SPS in `ctx->ctrl_hdl` for libva but real bytes for kdirect. The break lies somewhere between the userspace `S_EXT_CTRLS` call and rkvdec's read of `ctx->ctrl_hdl[SPS].p_cur.p`.
### Locked research question (iter21)
> *"At the v4l2_ctrl_request_setup() entry for libva's per-frame request_fd, is the V4L2 control-handler object (`obj`) found? For each control_ref in the request's hdl->ctrl_refs, is `p_req_valid == true`?"*
### What this narrows
`v4l2_ctrl_request_setup(req, main_hdl)` runs once the request has been queued (`MEDIA_REQUEST_IOC_QUEUE`), iterates `hdl->ctrl_refs`, and applies only the controls whose `ref->p_req_valid == true`. That flag is set by the staging path in `try_set_ext_ctrls_common` (called from `try_set_ext_ctrls_request`) when `which == V4L2_CTRL_WHICH_REQUEST_VAL`.
If libva's `S_EXT_CTRLS` staged correctly, `p_req_valid` is true for each control libva submitted. If staging failed silently, `p_req_valid` is false and `v4l2_ctrl_request_setup` skips those controls, so `ctx->ctrl_hdl` stays at zero, which matches the iter20 evidence.
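
The skip behaviour described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_ref` and `model_request_setup` are hypothetical stand-ins written for this note, not the kernel's `struct v4l2_ctrl_ref` or the real `v4l2_ctrl_request_setup`, and a single byte stands in for the control payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a control ref; not the kernel layout. */
struct model_ref {
    uint32_t id;          /* control ID */
    bool     p_req_valid; /* set by the staging path when a value was stored */
    uint8_t  req_val;     /* value staged in the request */
    uint8_t  cur_val;     /* value the driver later reads (p_cur analogue) */
};

/*
 * Model of the apply loop: copy the staged value into the current value
 * only for refs whose p_req_valid flag is set. Returns the number applied.
 */
static size_t model_request_setup(struct model_ref *refs, size_t n)
{
    size_t applied = 0;

    assert(refs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!refs[i].p_req_valid)
            continue; /* never staged: the current value is left untouched */
        refs[i].cur_val = refs[i].req_val;
        applied++;
    }
    return applied;
}
```

In this model a ref that was never staged keeps `cur_val == 0` after setup, which is the all-zero SPS symptom iter20 observed on the libva path.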
### Approach (α-24 inert → kernel path)
α-24 (libva G_EXT_CTRLS readback after S_EXT_CTRLS) was implemented in 1547a5d+a9c897f and returned `EACCES` for all 13 libva HEVC frames; it was reverted in e109306. The kernel disallows G_EXT_CTRLS against a not-yet-completed request, so userspace can't probe `req->p_new`. Distinguishing between the candidate mechanisms therefore requires kernel printk.
iter21 patches `v4l2_ctrl_request_setup` with two printk lines:
```c
/* Once per call, after obj has been looked up. */
pr_info("iter21_setup: req=%p main_hdl=%p obj=%p\n",
	req, main_hdl, obj);
/* Once per ref in hdl->ctrl_refs. */
pr_info("iter21_setup_ref: ctrl_id=0x%x p_req_valid=%d have_new=%d\n",
	ctrl->id, ref->p_req_valid, have_new_data);
```
Build `linux-fresnel-fourier 7.0-5` (pkgrel 4→5), deploy, reboot, run libva-HEVC + kdirect-HEVC, capture dmesg.
### Outcome interpretation
| obj at setup entry | p_req_valid (libva run) | Diagnosis |
|---|---|---|
| NULL | n/a | req has no v4l2_ctrl_handler bound at queue time. libva's S_EXT_CTRLS never staged. Bug in libva's request lifecycle. |
| non-NULL | all false | obj found, but staging path never set `p_req_valid`. Bug in `try_set_ext_ctrls_common` for libva's invocation. |
| non-NULL | true for SPS | staging worked but `req_to_new` / `try_or_set_cluster` failed silently. Bug in apply path. Needs another printk after `req_to_new`. |

iter21 finishes when one of these rows is confirmed for libva. Compare against the kdirect baseline, which should always show p_req_valid=true for SPS.
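
The decision table can be written down as a small classifier, which makes the "none of the three rows" case explicit. This is a sketch only: the enum, the function name `iter21_classify`, and its parameters are hypothetical names introduced here, and the inputs are assumed to have been extracted by hand from the libva-run dmesg output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One value per table row, plus a catch-all for observations the table omits. */
enum iter21_diag {
    DIAG_NO_REQ_OBJ,     /* obj == NULL: nothing staged, request lifecycle bug */
    DIAG_STAGING_FAILED, /* obj found, no ref has p_req_valid set */
    DIAG_APPLY_FAILED,   /* SPS staged, so the apply path is suspect */
    DIAG_UNCLASSIFIED,   /* some refs valid but not SPS: not covered by the table */
};

/*
 * obj_non_null: obj was non-NULL in the iter21_setup line.
 * n_valid:      number of iter21_setup_ref lines with p_req_valid=1.
 * sps_valid:    the SPS control was among those valid refs.
 */
static enum iter21_diag iter21_classify(bool obj_non_null, size_t n_valid,
                                        bool sps_valid)
{
    /* A valid SPS ref implies at least one valid ref. */
    assert(!sps_valid || n_valid > 0);

    if (!obj_non_null)
        return DIAG_NO_REQ_OBJ;
    if (n_valid == 0)
        return DIAG_STAGING_FAILED;
    if (sps_valid)
        return DIAG_APPLY_FAILED;
    return DIAG_UNCLASSIFIED;
}
```

A `DIAG_UNCLASSIFIED` result would mean the libva run produced a pattern the table does not anticipate, and the plan would need a fourth row before drawing a conclusion.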
### Substrate state at iter21 open
- Kernel `linux-fresnel-fourier 7.0-5` building on boltzmann (PID 1584834, log /tmp/iter21-kbuild.log).
- Backend SHA `c1d4bb53…` (iter15 stable), unchanged since iter15.
- Fork tip `e109306` (α-24 reverted).
- 5-codec anchors: unchanged. Zero regression.
### Phase 5 review note
Diagnostic kernel patch only: two `pr_info` calls in a well-known V4L2 framework function, with no behavior change. Phase 5 review is skipped per the iter17 precedent for diagnostic-only kernel work.
### Phase 7 plan
After 7.0-5 deploys:
1. Reboot fresnel; sddm autologin reseats mfritsche.
2. `sudo dmesg -C`.
3. Run libva HEVC; capture rkvdec_iter20 + iter21_setup lines.
4. `sudo dmesg -C`.
5. Run kdirect HEVC; capture same.
6. Diff; localize bug to one of the three table-row diagnoses.
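
For step 6, the per-ref lines can be parsed mechanically before diffing. The sketch below assumes the captured lines keep the exact format string of the `iter21_setup_ref` printk; the struct and function names are hypothetical, and any dmesg timestamp or other prefix before the tag is simply skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Fields carried by one iter21_setup_ref line. */
struct iter21_ref_line {
    unsigned int ctrl_id;
    int          p_req_valid;
    int          have_new;
};

/*
 * Parse one dmesg line. Returns true and fills *out when the line contains
 * an iter21_setup_ref record with all three fields; returns false otherwise.
 */
static bool iter21_parse_ref_line(const char *line, struct iter21_ref_line *out)
{
    static const char tag[] = "iter21_setup_ref:";
    const char *p;

    assert(line != NULL && out != NULL);

    /* Skip the timestamp and any other prefix in front of the tag. */
    p = strstr(line, tag);
    if (!p)
        return false;

    return sscanf(p + sizeof(tag) - 1,
                  " ctrl_id=0x%x p_req_valid=%d have_new=%d",
                  &out->ctrl_id, &out->p_req_valid, &out->have_new) == 3;
}
```

Running this over both captures and comparing the `p_req_valid` value per `ctrl_id` between the libva and kdirect runs gives the inputs needed to pick a row in the outcome table.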