iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24: pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
dp now match kdirect. HEVC frame 2+ still diverges
(separate bug, likely DPB entry mapping).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,146 @@
## Iteration 21 — Phase 8 (close)

Closed 2026-05-14 (full close). iter21 adds a kernel printk at the top of `v4l2_ctrl_request_setup` plus a per-ref dump. **Smoking-gun finding: libva's request-clone handler is missing 6 HEVC stateless controls that are registered in main_hdl.**

### Method

`linux-fresnel-fourier 7.0-5` (pkgrel 4→5) adds two `pr_info` calls to `v4l2_ctrl_request_setup` in `drivers/media/v4l2-core/v4l2-ctrls-request.c`:

```c
obj = media_request_object_find(req, &req_ops, main_hdl);
/* iter21: log the request, the main handler, and the object found for it */
pr_info("iter21_setup: req=%p main_hdl=%p obj=%p\n", req, main_hdl, obj);
...
list_for_each_entry(ref, &hdl->ctrl_refs, node) {
	...
	/* iter21: one line per control ref visited */
	pr_info("iter21_setup_ref: ctrl_id=0x%x p_req_valid=%d have_new=%d\n",
		ctrl->id, ref->p_req_valid, have_new_data);
	...
}
```

Built in about 1 min thanks to ccache reuse. Deployed via scp, `pacman -U`, and a reboot.

### α-24 result (predicate: kernel-only path required)

α-24 (libva G_EXT_CTRLS readback after S_EXT_CTRLS) was implemented as 1547a5d, amended to a9c897f, then reverted in e109306. The kernel returned **EACCES** for all 13 libva HEVC frames: this V4L2 build disallows userspace probing of `req->p_new` for an uncompleted request. The probe must therefore run inside the kernel.

### Result — definitive (libva vs kdirect)

**libva HEVC frame 1 setup** (clone-hdl ctrl_refs in ID order, 14 entries):

```
0x990a67 p_req_valid=0
0x990a6b p_req_valid=0
0x990b00 p_req_valid=0
0x990b67 p_req_valid=0
0x990b68 p_req_valid=0             (5 codec-class menu controls)
0xa40900 p_req_valid=0             H264_DECODE_MODE
0xa40901 p_req_valid=0             H264_START_CODE
0xa40902 p_req_valid=0             H264_SPS
0xa40903 p_req_valid=0             H264_PPS
0xa40904 p_req_valid=0             H264_SCALING_MATRIX
0xa40907 p_req_valid=0             H264_DECODE_PARAMS
0xa40a2c p_req_valid=0             (misc stateless)
0xa40a2d p_req_valid=0             (misc stateless)
0xa40a90 p_req_valid=1 have_new=1  HEVC_SPS (CLONE STOPS HERE)
```

**Missing from libva clone (vs kdirect):**

- 0xa40905 H264_PRED_WEIGHTS (compound)
- 0xa40906 H264_SLICE_PARAMS (compound, dyn_array)
- 0xa40a91 HEVC_PPS (compound)
- 0xa40a92 HEVC_SLICE_PARAMS (compound, dyn_array)
- 0xa40a93 HEVC_SCALING_MATRIX (compound)
- 0xa40a94 HEVC_DECODE_PARAMS (compound)
- 0xa40a95 HEVC_DECODE_MODE (menu)
- 0xa40a96 HEVC_START_CODE (menu)

**kdirect HEVC frame 1 setup** (same hdl, 21 entries: all of the above plus the 8 missing):

```
... 14 entries as above ...
0xa40a91 p_req_valid=1 have_new=1  HEVC_PPS
0xa40a92 p_req_valid=1 have_new=1  HEVC_SLICE_PARAMS
0xa40a93 p_req_valid=1 have_new=1  HEVC_SCALING_MATRIX
0xa40a94 p_req_valid=1 have_new=1  HEVC_DECODE_PARAMS
0xa40a95 p_req_valid=0             HEVC_DECODE_MODE (device-init only)
0xa40a96 p_req_valid=0             HEVC_START_CODE (device-init only)
```

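The comparison between the two dumps amounts to a sorted set difference. Here is a minimal userspace sketch, assuming both ID lists are sorted ascending the way `ctrl_refs` is. The name `ctrl_ids_missing` and both arrays are illustrative. `kdirect_ids` is reconstructed as the 14 libva IDs plus the IDs listed as missing, since the kdirect dump is abbreviated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IDs from the libva frame-1 dump (clone-hdl), ascending. */
static const uint32_t libva_ids[] = {
    0x990a67, 0x990a6b, 0x990b00, 0x990b67, 0x990b68,
    0xa40900, 0xa40901, 0xa40902, 0xa40903, 0xa40904, 0xa40907,
    0xa40a2c, 0xa40a2d, 0xa40a90,
};

/* Assumed kdirect set: the IDs above plus the ones listed as missing. */
static const uint32_t kdirect_ids[] = {
    0x990a67, 0x990a6b, 0x990b00, 0x990b67, 0x990b68,
    0xa40900, 0xa40901, 0xa40902, 0xa40903, 0xa40904,
    0xa40905, 0xa40906, 0xa40907,
    0xa40a2c, 0xa40a2d,
    0xa40a90, 0xa40a91, 0xa40a92, 0xa40a93, 0xa40a94, 0xa40a95, 0xa40a96,
};

/*
 * Write every ID that is in `full` but not in `clone` to `out`.
 * Both inputs must be sorted ascending; `out` needs room for n_full entries.
 * Returns the number of IDs written.
 */
size_t ctrl_ids_missing(const uint32_t *full, size_t n_full,
                        const uint32_t *clone, size_t n_clone,
                        uint32_t *out)
{
    size_t i = 0, j = 0, n_out = 0;

    assert(full && clone && out);
    while (i < n_full) {
        if (j < n_clone && clone[j] < full[i]) {
            j++;                        /* ID only in the clone: ignore */
        } else if (j < n_clone && clone[j] == full[i]) {
            i++;                        /* ID present in both lists */
            j++;
        } else {
            out[n_out++] = full[i++];   /* ID only in the full list */
        }
    }
    return n_out;
}
```

Called with `kdirect_ids` as `full` and `libva_ids` as `clone`, it returns the eight IDs of the missing list, starting with 0xa40905.
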
### What this means

`v4l2_ctrl_request_setup(req, main_hdl)`:

- finds `obj` for both libva and kdirect (non-NULL), so the request is properly bound.
- iterates `hdl->ctrl_refs`, but **libva's hdl is the request-clone-hdl, and it contains 14 of the 21 source controls**.
- libva's HEVC_SPS has `p_req_valid=1`: staging worked for that one control.
- The other 6 HEVC controls (PPS, SLICE_PARAMS, SCALING_MATRIX, DECODE_PARAMS, DECODE_MODE, START_CODE) **don't exist in the clone-hdl at all**, so they cannot be staged.

When libva submits its 5-control S_EXT_CTRLS batch (SPS, PPS, SLICE_PARAMS, SCALING_MATRIX, DECODE_PARAMS), only SPS is registered in the clone-hdl. The other four find no ref, so `prepare_ext_ctrls` should return `-EINVAL`. This contradicts the rc=0 seen in iter18 α-22, so the error_idx semantics of the request path need re-investigation: the rc=0 observed in userspace may not reflect the actual kernel error for compound-control lookups in request clones.

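The expected lookup failure can be illustrated with a toy model. This is a userspace sketch, not kernel code: `struct toy_hdl`, `toy_has_ref`, and `toy_prepare_batch` are hypothetical names, and the model only captures "an ID without a ref fails the batch with `-EINVAL`". How the kernel reports `error_idx` back to userspace is deliberately not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a control handler: only the registered IDs matter here. */
struct toy_hdl {
    const uint32_t *ids;
    size_t n;
};

/* Return 1 if `id` is registered in `hdl`, 0 otherwise. */
static int toy_has_ref(const struct toy_hdl *hdl, uint32_t id)
{
    size_t i;

    for (i = 0; i < hdl->n; i++)
        if (hdl->ids[i] == id)
            return 1;
    return 0;
}

/*
 * Resolve every ID of a batch against `hdl`. Returns 0 if all resolve;
 * otherwise -EINVAL, with *first_missing set to the batch position of
 * the first ID that has no ref.
 */
int toy_prepare_batch(const struct toy_hdl *hdl, const uint32_t *batch,
                      size_t count, size_t *first_missing)
{
    size_t i;

    assert(hdl && batch && first_missing);
    for (i = 0; i < count; i++) {
        if (!toy_has_ref(hdl, batch[i])) {
            *first_missing = i;
            return -EINVAL;
        }
    }
    return 0;
}
```

With a handler that holds 0xa40a90 but none of 0xa40a91 to 0xa40a94, the 5-control batch fails at batch position 1 (HEVC_PPS). With all five registered it returns 0.
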
### iter20's "zero SPS bytes" explained

iter20 showed `rkvdec sees sps[0..16]=00..00` for libva. The reason:

- HEVC_SPS *is* in the clone-hdl with `p_req_valid=1`, so it was staged.
- But the **content** in `req->p_new[SPS]` is all-zero.

Two possible reasons for zero content despite `p_req_valid=1`:

1. `user_to_new` ran on a zero payload from libva. iter15 strace ruled this out: libva's SPS payload is non-zero at ioctl entry.
2. `new_to_req` ran, but the data was corrupted along the way. This is possible if the master/cluster lookup is wrong on the clone-hdl.

iter22 candidate: add a printk in `new_to_req` and `req_to_new` that logs each copy: source pointer, destination pointer, first 4 bytes, payload size.

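The shape of such a copy log line can be prototyped in userspace first. The sketch below is only an illustration: `format_copy_head` is a hypothetical helper, it uses `snprintf` where the kernel patch would use `pr_info`, and it leaves out the two pointers because their values vary from run to run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<tag>: size=<n> head=<hex of the first min(size, 4) bytes>"
 * into buf. Returns the length the full line needs, as snprintf does,
 * so a result >= buflen means the line was truncated.
 */
int format_copy_head(char *buf, size_t buflen, const char *tag,
                     const void *src, size_t size)
{
    const unsigned char *p = src;
    size_t shown = size < 4 ? size : 4;
    size_t i;
    int len;

    assert(buf && buflen > 0 && tag && (src || size == 0));
    len = snprintf(buf, buflen, "%s: size=%zu head=", tag, size);
    /* Append two hex digits per byte while there is still room. */
    for (i = 0; i < shown && len >= 0 && (size_t)len < buflen; i++)
        len += snprintf(buf + len, buflen - (size_t)len, "%02x", p[i]);
    return len;
}
```

Logging the same fields on both sides of the copy makes an all-zero head on the destination side stand out immediately next to a non-zero head on the source side.
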
### Mechanism status (post-iter21)

| # | Mechanism | Status |
|---|---|---|
| 1 | request_fd mismatch | DISPROVED iter17/18 |
| 2 | REINIT clears | DISPROVED iter19 |
| 3 | Stack-locals stale | DISPROVED iter18 |
| 4 | ctrl_hdl mismatch | REFRAMED iter20 |
| 5 | error_idx silent failure | DISPROVED iter18 (but warrants re-check given iter21 finding) |
| 6 | req->p_new staging incomplete | **CONFIRMED iter21**: clone-hdl missing controls = staging cannot occur for 6 of 7 HEVC controls |
| 7 | **NEW iter21**: clone-hdl is missing controls that main_hdl has registered | **Root question for iter22** |

### Why is the clone incomplete?

`v4l2_ctrl_request_clone(new_hdl, from=main_hdl)` iterates `main_hdl->ctrl_refs` in ID-sorted order. After cloning HEVC_SPS (0xa40a90), the loop appears to **stop** before HEVC_PPS (0xa40a91). A similar gap starts at H264_PRED_WEIGHTS (0xa40905), although the dump shows the clone resuming at H264_DECODE_PARAMS (0xa40907), so that gap looks like a skip rather than a stop. Both gaps begin after at least one compound control of the same codec block was cloned.

Hypothesis: `handler_new_ref` returns a non-zero error at the first compound control after an SPS-like single-struct compound, but **only when called from the request-clone path**. Or: `kzalloc(sizeof(*new_ref) + size_extra_req)` fails for controls with a larger `elem_size` (HEVC_PPS = 64 bytes, H264_PRED_WEIGHTS = 32 bytes; small and unlikely to OOM, but worth verifying).

Alt hypothesis: `handler_new_ref`'s auto-class-control insertion (`v4l2_ctrl_new_std`) fails for non-compound HEVC menu controls in the request-clone path, which propagates `hdl->error` and breaks subsequent iterations.

The same kernel succeeds for kdirect on the same `from` hdl, so something is **per-request-bind specific**, possibly related to request lifecycle timing in libva (iter6 permanent request_fd at CreateContext) vs kdirect (per-frame request_fd).

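One way to test the "loop breaks" reading against the iter21 dump: a loop that breaks at its first error leaves a prefix of the source list behind. The sketch below checks that property per codec block. It assumes the ID ranges shown (taken from the dump and the missing list) and that the dump reflects the clone handler's real contents. `clone_is_prefix` is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* H264 block: assumed source IDs, and the IDs seen in the libva clone. */
static const uint32_t h264_full[] = {
    0xa40900, 0xa40901, 0xa40902, 0xa40903,
    0xa40904, 0xa40905, 0xa40906, 0xa40907,
};
static const uint32_t h264_clone[] = {
    0xa40900, 0xa40901, 0xa40902, 0xa40903, 0xa40904, 0xa40907,
};

/* HEVC block: assumed source IDs, and the IDs seen in the libva clone. */
static const uint32_t hevc_full[] = {
    0xa40a90, 0xa40a91, 0xa40a92, 0xa40a93, 0xa40a94, 0xa40a95, 0xa40a96,
};
static const uint32_t hevc_clone[] = { 0xa40a90 };

/*
 * Return 1 if `clone` equals the first n_clone entries of `full`, which
 * is what a loop that breaks at its first error would leave behind.
 * Return 0 otherwise.
 */
int clone_is_prefix(const uint32_t *full, size_t n_full,
                    const uint32_t *clone, size_t n_clone)
{
    size_t i;

    assert(full && clone);
    if (n_clone > n_full)
        return 0;
    for (i = 0; i < n_clone; i++)
        if (clone[i] != full[i])
            return 0;
    return 1;
}
```

With these inputs the HEVC block is a prefix but the H264 block is not, because 0xa40907 is present after the gap. Under the stated assumptions, a single `break` would not by itself explain the H264 gap, which may be worth keeping in mind when reading the iter22 output.
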
### Substrate state at iter21 close

- Backend SHA on fresnel: `c1d4bb53…` (iter15 stable, unchanged).
- Fork tip `e109306` (α-24 reverted).
- Kernel `linux-fresnel-fourier 7.0-5` with iter17 + iter20 + iter21 printks. NOT a shipping kernel.
- 5-codec anchors: unchanged. Zero regression.

### iter22 candidate

Add printks to `v4l2_ctrl_request_clone` and `handler_new_ref`:

```c
// in v4l2_ctrl_request_clone
pr_info("iter22_clone_start: new_hdl=%p from=%p\n", hdl, from);

// per iteration
err = handler_new_ref(hdl, ctrl, &new_ref, false, true);
pr_info("iter22_clone_step: id=0x%x err=%d from_other=%d\n",
	ctrl->id, err, ref->from_other_dev);
if (err) {
	pr_info("iter22_clone_break: at id=0x%x err=%d hdl_error=%d\n",
		ctrl->id, err, hdl->error);
	break;
}
```

Once 7.0-6 is deployed, a libva HEVC run will show exactly which ctrl_id breaks the loop and with what error code. We can then localize the failure to `kzalloc`, to `v4l2_ctrl_new_std` (auto-class), or to some other condition.

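Picking the first failing step out of the resulting log can be scripted. The sketch below assumes the `iter22_clone_step` format string proposed above and assumes that each line may carry a dmesg timestamp prefix. `struct clone_step`, `parse_clone_step`, and `first_failing_id` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One parsed iter22_clone_step record. */
struct clone_step {
    unsigned int id;
    int err;
    int from_other;
};

/* Parse one log line; returns 1 if it holds an iter22_clone_step record. */
int parse_clone_step(const char *line, struct clone_step *out)
{
    const char *tag;

    assert(line && out);
    /* Skip any prefix (for example a timestamp) in front of the tag. */
    tag = strstr(line, "iter22_clone_step:");
    if (!tag)
        return 0;
    return sscanf(tag, "iter22_clone_step: id=0x%x err=%d from_other=%d",
                  &out->id, &out->err, &out->from_other) == 3;
}

/* Return the ID of the first parsed step with err != 0, or 0 if none. */
unsigned int first_failing_id(const char *const *lines, size_t n)
{
    struct clone_step step;
    size_t i;

    assert(lines);
    for (i = 0; i < n; i++)
        if (parse_clone_step(lines[i], &step) && step.err != 0)
            return step.id;
    return 0;
}
```

A return value of 0 means no step reported an error. In that case the missing controls would have to be explained by something other than a failing `handler_new_ref` call, such as the `from_other_dev` value logged on each step.
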
### Lesson

iter21 overturns the iter11–iter18 hypothesis space entirely. The S_EXT_CTRLS ioctl wire-byte payload analysis was correct: libva's bytes match kdirect's. But **at the v4l2_ctrl framework level, libva's request-clone is missing the registered controls libva tries to stage**. The bug is in how the V4L2 control framework handles libva's specific request-binding pattern, NOT in libva's ioctl content.

This is the strongest narrowing since iter17. We've gone from "anywhere in kernel" → "kernel control framework" → "request-clone path specifically" → "clone stops right after HEVC_SPS, before the next compound HEVC control".
