fresnel-fourier/phase8_iteration25_close.md
marfrit bf67900cd8 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24:    pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
           to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
           before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
           ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
           still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
           HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
           picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
           dp now match kdirect. HEVC frame 2+ still diverges
           (separate bug, likely DPB entry mapping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:10:56 +00:00

## Iteration 25 — Phase 8 (close)
Closed 2026-05-14. iter25 = α-25: synthetic-SPS injection before `cap_pool_init`. **MAJOR WIN**, but only a PARTIAL close: for HEVC via libva, frame 1 is byte-identical to kdirect, while frames 2+ have a separate wire-byte issue (decode_params).
### α-25 implementation
`src/context.c::RequestCreateContext` — after S_FMT(OUTPUT) + S_FMT(CAPTURE) + G_FMT(CAPTURE) sanity, BEFORE `cap_pool_init`:
```c
switch (config_object->profile) {
case VAProfileHEVCMain: {
    struct v4l2_ctrl_hevc_sps dummy_sps;

    /* Only the fields that decide the image format and size are set. */
    memset(&dummy_sps, 0, sizeof(dummy_sps));
    dummy_sps.chroma_format_idc = 1;        /* 4:2:0 */
    dummy_sps.bit_depth_luma_minus8 = 0;    /* 8-bit */
    dummy_sps.bit_depth_chroma_minus8 = 0;
    dummy_sps.pic_width_in_luma_samples = picture_width;
    dummy_sps.pic_height_in_luma_samples = picture_height;
    /* ... v4l2_set_controls(video_fd, request_fd=-1, &SPS, 1) ... */
    break;
}
/* case VAProfileH264*: similar, with V4L2_CID_STATELESS_H264_SPS */
default:
    break;  /* other codecs: skip */
}
```
Fork commit `db0b7f9` (single commit).
### Result — definitive
**Frame 1**: libva CAPTURE bytes = kdirect CAPTURE bytes (cmp identical for first 1382400 bytes, the entire frame 1 NV12 payload of 1280×720).
**Frame 2+**: diverge starting at byte 1382401.
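
The byte counts above follow from NV12 arithmetic. A minimal sketch, assuming a tightly packed 8-bit NV12 buffer with no stride padding; the helper names `nv12_frame_bytes` and `frame_first_cmp_offset` are illustrative and not taken from the backend:

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes in one tightly packed NV12 frame: a full-resolution Y plane
 * followed by an interleaved CbCr plane subsampled by two in both
 * dimensions (4:2:0, 8 bits per sample). Assumes even dimensions and
 * no stride padding. */
static size_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    size_t luma = (size_t)width * height;
    size_t chroma = luma / 2;   /* (w/2)*(h/2) CbCr pairs, 2 bytes each */

    return luma + chroma;
}

/* cmp(1) reports 1-based byte offsets, so the first byte of frame n
 * (n counted from 1) is at (n - 1) * frame_bytes + 1. */
static size_t frame_first_cmp_offset(size_t frame_bytes, unsigned frame_no)
{
    return (size_t)(frame_no - 1) * frame_bytes + 1;
}
```

For 1280×720 the first helper gives 921600 + 460800 = 1382400 bytes, and the second puts the first byte of frame 2 at cmp offset 1382401, which is where the divergence is reported.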
### Kernel printk evidence (post-α-25)
```
iter24_req_to_new: id=0xa40a90 ret=0 p_req_valid=1 p_req_elems=1
iter24_try_or_set: master_id=0xa40a90 ret=0 ← was -16 (EBUSY) before
iter24_req_to_new: id=0xa40a91 ret=0
iter24_try_or_set: master_id=0xa40a91 ret=0
iter24_req_to_new: id=0xa40a92 ret=0
iter24_try_or_set: master_id=0xa40a92 ret=0
iter24_req_to_new: id=0xa40a93 ret=0
iter24_try_or_set: master_id=0xa40a93 ret=0
iter24_req_to_new: id=0xa40a94 ret=0
iter24_try_or_set: master_id=0xa40a94 ret=0
rkvdec_iter20: sps[0..16]=00 00 00 05 d0 02 00 00 04 04 04 00 01 01 00 03
← non-zero, w=1280, h=720
rkvdec_hevc_run: w=1280 h=720 chroma=1 nal_unit_type=20 slice_type=2 decode_flags=0x3
← rkvdec sees CORRECT SPS for the first time
```
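
The `sps[0..16]` dump can be checked by hand against the control layout. The sketch below assumes the leading fields of `struct v4l2_ctrl_hevc_sps` are laid out as listed in the comment (two `u8` IDs, then width and height as little-endian `u16`); the helper names are illustrative only:

```c
#include <stdint.h>

/* Little-endian 16-bit read from a raw byte dump. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* The 16 bytes printed by the rkvdec_iter20 printk. */
static const uint8_t sps_dump[16] = {
    0x00, 0x00, 0x00, 0x05, 0xd0, 0x02, 0x00, 0x00,
    0x04, 0x04, 0x04, 0x00, 0x01, 0x01, 0x00, 0x03,
};

/* Assumed offsets of the leading fields of struct v4l2_ctrl_hevc_sps:
 *   0: video_parameter_set_id      (u8)
 *   1: seq_parameter_set_id        (u8)
 *   2: pic_width_in_luma_samples   (u16)
 *   4: pic_height_in_luma_samples  (u16)
 *   6: bit_depth_luma_minus8       (u8)
 *   7: bit_depth_chroma_minus8     (u8) */
static uint16_t sps_dump_width(const uint8_t *sps)
{
    return rd_le16(sps + 2);
}

static uint16_t sps_dump_height(const uint8_t *sps)
{
    return rd_le16(sps + 4);
}
```

Under that layout, `00 05` decodes to 0x0500 = 1280 and `d0 02` to 0x02d0 = 720, matching the `w=1280 h=720` that `rkvdec_hevc_run` prints, and both bit-depth bytes are 0 (8-bit).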
`iter24_loop_break-count = 0` — the setup loop NEVER breaks. All 5 staged HEVC controls commit to ctx->ctrl_hdl successfully.
### Bug 5 root cause: FIXED
The -EBUSY block from rkvdec_s_ctrl's vb2_is_busy check is gone. ctx->image_fmt is pre-seeded to RKVDEC_IMG_FMT_420_8BIT by the synthetic SPS injection before any CAPTURE buffer is allocated. Per-frame SPS submissions find image_fmt_changed=false → skip reset → commit succeeds.
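
The ordering dependency can be illustrated with a toy model. This is not the rkvdec source: `toy_ctx`, `toy_s_ctrl_sps` and the other names are hypothetical, and the model keeps only the two pieces of state the paragraph above mentions (the current image format and whether the CAPTURE queue is busy).

```c
#include <errno.h>
#include <stdbool.h>

enum toy_img_fmt { TOY_IMG_FMT_ANY = 0, TOY_IMG_FMT_420_8BIT };

struct toy_ctx {
    enum toy_img_fmt image_fmt; /* "any" until an SPS has been committed */
    bool capture_busy;          /* stands in for vb2_is_busy(CAPTURE) */
};

/* Simplified SPS handler: a format change needs a reset, and a reset
 * is refused once CAPTURE buffers exist. */
static int toy_s_ctrl_sps(struct toy_ctx *ctx, enum toy_img_fmt sps_fmt)
{
    if (ctx->image_fmt == sps_fmt)
        return 0;               /* image_fmt unchanged: skip reset */
    if (ctx->capture_busy)
        return -EBUSY;          /* cannot reset with buffers allocated */
    ctx->image_fmt = sps_fmt;
    return 0;
}

static void toy_alloc_capture(struct toy_ctx *ctx)
{
    ctx->capture_busy = true;
}

/* Pre-α-25 ordering: CAPTURE pool first, then the first per-frame SPS. */
static int toy_first_sps_without_seed(void)
{
    struct toy_ctx ctx = { .image_fmt = TOY_IMG_FMT_ANY, .capture_busy = false };

    toy_alloc_capture(&ctx);
    return toy_s_ctrl_sps(&ctx, TOY_IMG_FMT_420_8BIT);
}

/* α-25 ordering: synthetic SPS, then CAPTURE pool, then per-frame SPS. */
static int toy_first_sps_with_seed(void)
{
    struct toy_ctx ctx = { .image_fmt = TOY_IMG_FMT_ANY, .capture_busy = false };
    int ret = toy_s_ctrl_sps(&ctx, TOY_IMG_FMT_420_8BIT);

    if (ret)
        return ret;
    toy_alloc_capture(&ctx);
    return toy_s_ctrl_sps(&ctx, TOY_IMG_FMT_420_8BIT);
}
```

In this model the unseeded ordering returns `-EBUSY` (the `-16` seen in the earlier printk output) and the seeded ordering returns 0, because the per-frame SPS no longer implies a format change.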
### Frame 2+ divergence (separate Bug)
`decode_params.short_term_ref_pic_set_size`:
- libva frame 2: bytes 4-5 = `00 00` → 0
- kdirect frame 2: bytes 4-5 = `0a 00` → 10
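libva's `h265_fill_decode_params` doesn't populate `short_term_ref_pic_set_size`. At iter25 close the assumption was that VAAPI doesn't expose it; the iter26 commit message notes that `picture->st_rps_bits` does. kdirect parses the value from the HEVC NAL directly. The missing value affects DPB reference resolution for P/B frames. iter26 candidate.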
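
To see why bytes 4-5 are the field in question, the sketch below mirrors what are assumed to be the first three fields of `struct v4l2_ctrl_hevc_decode_params` (a 32-bit POC followed by two 16-bit set sizes). `dp_prefix` and `dp_st_rps_size` are illustrative names, not backend code:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed leading fields of struct v4l2_ctrl_hevc_decode_params. */
struct dp_prefix {
    int32_t  pic_order_cnt_val;
    uint16_t short_term_ref_pic_set_size;
    uint16_t long_term_ref_pic_set_size;
};

/* With that layout the short-term size starts at byte 4. */
_Static_assert(offsetof(struct dp_prefix, short_term_ref_pic_set_size) == 4,
               "short_term_ref_pic_set_size expected at byte offset 4");

/* Read the field from a raw little-endian dump of the control. */
static uint16_t dp_st_rps_size(const uint8_t *dp)
{
    const uint8_t *p =
        dp + offsetof(struct dp_prefix, short_term_ref_pic_set_size);

    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Applied to the two dumps, `00 00` reads as 0 (libva) and `0a 00` as 10 (kdirect).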
### Mechanism status
| # | Mechanism | Status |
|---|---|---|
| 9 | rkvdec_s_ctrl -EBUSY on first SPS | **FIXED iter25 α-25** |
| 10 | decode_params.short_term_ref_pic_set_size = 0 | **NEW iter26 candidate** |
### Substrate state at iter25 close
- Backend SHA on fresnel: post-α-25 build (commit `db0b7f9`).
- Fork tip `db0b7f9` (α-25).
- Kernel `linux-fresnel-fourier 7.0-8` (diagnostic printks; should eventually revert to clean 7.0-1 + RFC v2 + iter12 baseline).
- HEVC libva frame 1 = kdirect frame 1 byte-identical. ✓✓✓
- HEVC libva frame 2+: differs.
### Anchors check pending
Need to re-run 5-codec anchors to verify α-25 didn't regress VP9/MPEG-2/VP8 (it shouldn't — guard is `case VAProfileHEVCMain` / `case VAProfileH264*` only).
### Lesson
After 15 iterations chasing wire-byte hypotheses (iter11-iter18) and 5 iterations of kernel printk (iter20-iter24), the actual bug was an interaction between libva's CAPTURE-pre-allocate design and rkvdec's lazy image_fmt determination. The fix is 90 LOC in libva. The kernel was correct all along; it just needed a way to commit the image_fmt before buffers were locked in.
This validates [[feedback-libva-byte-correct-kernel-bug]] only partially: libva WAS byte-correct in its ioctl content, but it had a CAPTURE-pool-allocation TIMING bug that interacted with kernel state. The bug is in libva, not the kernel, but the symptom only manifested because of kernel-side -EBUSY semantics that aren't well documented.
### iter26 candidate
Fix `h265_fill_decode_params` to populate `short_term_ref_pic_set_size`. Two candidate sources: a VAAPI-provided value, if one exists (the iter26 commit message identifies `picture->st_rps_bits`), or parsing it from the slice header.