iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24: pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
dp now match kdirect. HEVC frame 2+ still diverges
(separate bug, likely DPB entry mapping).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,94 @@
## Iteration 25: Phase 8 (close)

Closes 2026-05-14. iter25 = α-25 synthetic-SPS injection before cap_pool_init. **MAJOR WIN.** PARTIAL close: for HEVC via libva, frame 1 is byte-identical to kdirect; frames 2+ have a separate wire-byte issue (decode_params).

### α-25 implementation

`src/context.c::RequestCreateContext`: after the S_FMT(OUTPUT) + S_FMT(CAPTURE) + G_FMT(CAPTURE) sanity checks, BEFORE `cap_pool_init`:

```c
switch (config_object->profile) {
case VAProfileHEVCMain: {
    struct v4l2_ctrl_hevc_sps dummy_sps;

    memset(&dummy_sps, 0, sizeof(dummy_sps));
    dummy_sps.chroma_format_idc = 1;        /* 4:2:0 */
    dummy_sps.bit_depth_luma_minus8 = 0;    /* 8-bit */
    dummy_sps.bit_depth_chroma_minus8 = 0;
    dummy_sps.pic_width_in_luma_samples = picture_width;
    dummy_sps.pic_height_in_luma_samples = picture_height;
    /* ... v4l2_set_controls(video_fd, request_fd=-1, &SPS, 1) ... */
    break;  /* do not fall through into the H.264 case */
}
/* case VAProfileH264*: similar, with V4L2_CID_STATELESS_H264_SPS */
default:
    break;  /* other profiles: skip the injection */
}
```

Fork commit: `db0b7f9` (single commit).

### Result (definitive)

**Frame 1**: libva CAPTURE bytes = kdirect CAPTURE bytes (`cmp` finds the first 1382400 bytes identical, which is the entire 1280×720 NV12 payload of frame 1).

**Frame 2+**: the two outputs diverge starting at byte 1382401, the first byte of frame 2.

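The 1382400 figure follows from the NV12 layout: a full-resolution Y plane plus an interleaved CbCr plane subsampled by two in each direction. The sketch below reproduces that arithmetic and maps a `cmp`-style 1-based offset back to a frame number. The helper names are hypothetical, and the code assumes even dimensions and frames dumped back to back with no stride padding.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one NV12 frame: Y plane (width * height) plus an interleaved
 * CbCr plane holding two bytes per 2x2 block of pixels.
 * Assumes even dimensions and no stride padding. */
static size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height + 2 * ((width / 2) * (height / 2));
}

/* Return the 1-based offset of the first differing byte (the convention
 * cmp(1) reports), or 0 if the first n bytes are identical. */
static size_t first_diff_1based(const unsigned char *a,
                                const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return i + 1;
    }
    return 0;
}

/* Map a 1-based byte offset in a concatenated dump to a 1-based frame
 * number. */
static size_t frame_of_offset(size_t offset_1based, size_t frame_size)
{
    assert(offset_1based > 0 && frame_size > 0);
    return (offset_1based - 1) / frame_size + 1;
}
```

With these helpers, `nv12_frame_size(1280, 720)` gives the frame-1 payload size, and `frame_of_offset(1382401, nv12_frame_size(1280, 720))` places the first divergence in frame 2.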
### Kernel printk evidence (post-α-25)

```
iter24_req_to_new: id=0xa40a90 ret=0 p_req_valid=1 p_req_elems=1
iter24_try_or_set: master_id=0xa40a90 ret=0 ← was -16 (EBUSY) before
iter24_req_to_new: id=0xa40a91 ret=0
iter24_try_or_set: master_id=0xa40a91 ret=0
iter24_req_to_new: id=0xa40a92 ret=0
iter24_try_or_set: master_id=0xa40a92 ret=0
iter24_req_to_new: id=0xa40a93 ret=0
iter24_try_or_set: master_id=0xa40a93 ret=0
iter24_req_to_new: id=0xa40a94 ret=0
iter24_try_or_set: master_id=0xa40a94 ret=0
rkvdec_iter20: sps[0..16]=00 00 00 05 d0 02 00 00 04 04 04 00 01 01 00 03
    ← non-zero, w=1280, h=720
rkvdec_hevc_run: w=1280 h=720 chroma=1 nal_unit_type=20 slice_type=2 decode_flags=0x3
    ← rkvdec sees CORRECT SPS for the first time
```

`iter24_loop_break-count = 0`: the setup loop NEVER breaks. All 5 staged HEVC controls (ids 0xa40a90 through 0xa40a94 above) commit to ctx->ctrl_hdl successfully.

### Bug 5 root cause: FIXED

The -EBUSY returned by the vb2_is_busy check in rkvdec_s_ctrl is gone. The synthetic SPS injection pre-seeds ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT before any CAPTURE buffer is allocated. Per-frame SPS submissions then find image_fmt_changed=false, skip the format reset, and commit successfully.

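The ordering dependency can be illustrated with a small user-space model. This is a simplified sketch of the behaviour described above, not the kernel's code: the struct, enum, and function names are hypothetical stand-ins, and the real driver has more format states and a separate reset helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the decoder context; not the kernel's types. */
enum img_fmt { IMG_FMT_UNSET = 0, IMG_FMT_420_8BIT, IMG_FMT_420_10BIT };

struct model_ctx {
    enum img_fmt image_fmt;     /* format the CAPTURE side is set up for */
    unsigned int capture_bufs;  /* buffers currently allocated on CAPTURE */
};

/* Model of the SPS set-control path: a change of image format is only
 * allowed while the CAPTURE queue holds no buffers. */
static int model_s_ctrl_sps(struct model_ctx *ctx, enum img_fmt from_sps)
{
    assert(ctx);
    bool changed = ctx->image_fmt != from_sps;

    if (!changed)
        return 0;               /* nothing to reset, the control commits */
    if (ctx->capture_bufs > 0)
        return -EBUSY;          /* queue is busy, format cannot be reset */
    ctx->image_fmt = from_sps;  /* reset the decoded format to match */
    return 0;
}

/* Model of CAPTURE pool allocation (libva allocates 24 up front). */
static void model_alloc_capture(struct model_ctx *ctx, unsigned int n)
{
    assert(ctx);
    ctx->capture_bufs += n;
}
```

In this model, allocating the pool first and then sending an SPS fails with -EBUSY, while sending a synthetic SPS first makes every later SPS with the same format a no-op that returns 0, which is the α-25 ordering.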
### Frame 2+ divergence (separate bug)

`decode_params.short_term_ref_pic_set_size`:
- libva frame 2: bytes 4-5 = `00 00` → 0
- kdirect frame 2: bytes 4-5 = `0a 00` → 10

libva's `h265_fill_decode_params` doesn't populate short_term_ref_pic_set_size. The working assumption at iter25 close was that VAAPI doesn't expose it; per the iter26 commit message, VAAPI does expose it as `picture->st_rps_bits`. kdirect parses the value from the HEVC NAL directly. The missing value affects DPB reference resolution for P/B frames. iter26 candidate.

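The byte-to-value mapping in the list above is a little-endian 16-bit read at offset 4 of the control payload. A minimal sketch of that read follows; the helper name is hypothetical, and only bytes 4-5 are taken from the dumps, so any other bytes in a test buffer are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a raw control dump.
 * The caller must ensure that offset + 2 does not exceed len. */
static uint16_t read_le16(const uint8_t *buf, size_t len, size_t offset)
{
    assert(buf && offset + 2 <= len);
    return (uint16_t)(buf[offset] | (buf[offset + 1] << 8));
}
```

Applied at offset 4, the kdirect bytes `0a 00` decode to 10 and the libva bytes `00 00` decode to 0.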
### Mechanism status

| # | Mechanism | Status |
|---|---|---|
| 9 | rkvdec_s_ctrl -EBUSY on first SPS | **FIXED iter25 α-25** |
| 10 | decode_params.short_term_ref_pic_set_size = 0 | **NEW iter26 candidate** |

### Substrate state at iter25 close

- Backend SHA on fresnel: post-α-25 build (commit `db0b7f9`).
- Fork tip `db0b7f9` (α-25).
- Kernel `linux-fresnel-fourier 7.0-8` (diagnostic printks; should eventually revert to clean 7.0-1 + RFC v2 + iter12 baseline).
- HEVC libva frame 1 = kdirect frame 1 byte-identical. ✓✓✓
- HEVC libva frame 2+: differs.

### Anchors check pending

Need to re-run the 5-codec anchors to verify that α-25 didn't regress VP9/MPEG-2/VP8. It shouldn't: the injection is guarded by `case VAProfileHEVCMain` / `case VAProfileH264*` only.

### Lesson

After 15 iterations chasing wire-byte hypotheses (iter11-iter18) and 5 iterations of kernel printk (iter17-iter24), the actual bug turned out to be an interaction between libva's CAPTURE-pre-allocate design and rkvdec's lazy image_fmt determination. The fix is 90 LOC in libva. The kernel was correct all along; it just needed a way to commit the image_fmt before buffers were locked in.

This validates [[feedback-libva-byte-correct-kernel-bug]] only partially: libva WAS byte-correct in its ioctl content, but it had a CAPTURE-pool-allocation TIMING bug that interacted with kernel state. The bug is in libva, not the kernel, but the symptom only manifested because of kernel-side -EBUSY semantics that aren't well documented.

### iter26 candidate

Fix `h265_fill_decode_params` to populate `short_term_ref_pic_set_size`. At iter25 close the options considered were deriving it from `surface_object->params.h265.slices[0].short_term_ref_pic_set_size` (if VAAPI provides it) or parsing it from the slice header. Per the iter26 commit message, the value is available as `picture->st_rps_bits`.
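As a rough sketch of the iter26 direction described in the commit message (copying `st_rps_bits` from the VA-API picture parameters into the decode params), the assignment could look like the following. The two structs are hypothetical stand-ins holding only the fields discussed here, not the real VA-API or V4L2 definitions, and the clamp is an assumption added for safety rather than something the source describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the subset of the VA-API HEVC picture parameters used here. */
struct pic_params_subset {
    uint32_t st_rps_bits;  /* bits used by the short-term RPS in the slice header */
};

/* Stand-in for the leading fields of the V4L2 HEVC decode params. */
struct decode_params_subset {
    int32_t pic_order_cnt_val;
    uint16_t short_term_ref_pic_set_size;
    uint16_t long_term_ref_pic_set_size;
};

/* In this stand-in layout the field sits at bytes 4-5 of the payload. */
_Static_assert(offsetof(struct decode_params_subset,
                        short_term_ref_pic_set_size) == 4,
               "short_term_ref_pic_set_size expected at byte offset 4");

static void fill_st_rps_size(struct decode_params_subset *dp,
                             const struct pic_params_subset *pic)
{
    assert(dp && pic);
    /* Clamp defensively: the destination field is only 16 bits wide. */
    dp->short_term_ref_pic_set_size =
        pic->st_rps_bits > UINT16_MAX ? UINT16_MAX
                                      : (uint16_t)pic->st_rps_bits;
}
```

With `st_rps_bits = 10`, the field becomes 10, which would serialize as the `0a 00` seen in the kdirect dump.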