fresnel-fourier/phase8_iteration25_close.md
marfrit bf67900cd8 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24:    pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
           to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
           before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
           ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
           still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
           HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
           picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
           dp now match kdirect. HEVC frame 2+ still diverges
           (separate bug, likely DPB entry mapping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:10:56 +00:00


Iteration 25 — Phase 8 (close)

Closes 2026-05-14. iter25 = α-25 synthetic-SPS injection before cap_pool_init. MAJOR WIN. PARTIAL close — frame 1 byte-identical to kdirect for HEVC libva; frames 2+ have separate wire-byte issue (decode_params).

α-25 implementation

src/context.c::RequestCreateContext — after S_FMT(OUTPUT) + S_FMT(CAPTURE) + G_FMT(CAPTURE) sanity, BEFORE cap_pool_init:

switch (config_object->profile) {
case VAProfileHEVCMain: {
    /* Synthetic SPS: seeds ctx->image_fmt while CAPTURE is still empty. */
    struct v4l2_ctrl_hevc_sps dummy_sps;
    memset(&dummy_sps, 0, sizeof(dummy_sps));
    dummy_sps.chroma_format_idc = 1;          /* 4:2:0 */
    dummy_sps.bit_depth_luma_minus8 = 0;       /* 8-bit */
    dummy_sps.bit_depth_chroma_minus8 = 0;
    dummy_sps.pic_width_in_luma_samples = picture_width;
    dummy_sps.pic_height_in_luma_samples = picture_height;
    /* ... v4l2_set_controls(video_fd, request_fd=-1, &SPS, 1) ... */
    break;  /* without this, control falls through into the next case */
}
/* case VAProfileH264*: same pattern with V4L2_CID_STATELESS_H264_SPS */
default:
    break;  /* all other profiles: skip the injection */
}

Fork commit db0b7f9, a single commit.

Result — definitive

Frame 1: libva CAPTURE bytes = kdirect CAPTURE bytes (cmp identical for first 1382400 bytes, the entire frame 1 NV12 payload of 1280×720).

Frame 2+: diverge starting at byte 1382401.
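
The frame-boundary arithmetic behind those two offsets, as a minimal sketch. It assumes the CAPTURE dump is tightly packed NV12 (no stride padding) and that the divergence offset is 1-based, as cmp reports it. The helper names are illustrative and not part of the backend.

```c
#include <assert.h>
#include <stddef.h>

/* NV12 at 8-bit: one full-resolution Y plane plus one interleaved CbCr
 * plane subsampled by 2 in both directions, i.e. 3/2 bytes per pixel.
 * Assumption: planes are tightly packed, with no stride padding. */
static size_t nv12_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height + 2 * (width / 2) * (height / 2);
}

/* Map a 1-based byte offset (as printed by cmp) to a 1-based frame number. */
static size_t frame_of_cmp_offset(size_t cmp_offset, size_t frame_bytes)
{
    assert(cmp_offset >= 1 && frame_bytes > 0);
    return (cmp_offset - 1) / frame_bytes + 1;
}
```

For 1280×720 this gives 1382400 bytes per frame, so offset 1382400 is the last byte of frame 1 and offset 1382401 is the first byte of frame 2.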

Kernel printk evidence (post-α-25)

iter24_req_to_new: id=0xa40a90 ret=0 p_req_valid=1 p_req_elems=1
iter24_try_or_set: master_id=0xa40a90 ret=0     ← was -16 (EBUSY) before
iter24_req_to_new: id=0xa40a91 ret=0
iter24_try_or_set: master_id=0xa40a91 ret=0
iter24_req_to_new: id=0xa40a92 ret=0
iter24_try_or_set: master_id=0xa40a92 ret=0
iter24_req_to_new: id=0xa40a93 ret=0
iter24_try_or_set: master_id=0xa40a93 ret=0
iter24_req_to_new: id=0xa40a94 ret=0
iter24_try_or_set: master_id=0xa40a94 ret=0
rkvdec_iter20: sps[0..16]=00 00 00 05 d0 02 00 00 04 04 04 00 01 01 00 03
                ← non-zero, w=1280, h=720
rkvdec_hevc_run: w=1280 h=720 chroma=1 nal_unit_type=20 slice_type=2 decode_flags=0x3
                ← rkvdec sees CORRECT SPS for the first time

iter24_loop_break-count = 0 — the setup loop NEVER breaks. All 5 staged HEVC controls commit to ctx->ctrl_hdl successfully.

Bug 5 root cause: FIXED

The -EBUSY block from rkvdec_s_ctrl's vb2_is_busy check is gone. ctx->image_fmt is pre-seeded to RKVDEC_IMG_FMT_420_8BIT by the synthetic SPS injection before any CAPTURE buffer is allocated. Per-frame SPS submissions find image_fmt_changed=false → skip reset → commit succeeds.
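
A simplified model of that gating, to make the ordering dependency concrete. This is not the rkvdec source: the struct, enum, and function names are hypothetical stand-ins, and the logic only mirrors the behaviour described in these notes (a format change is refused while CAPTURE buffers exist).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the real rkvdec types. */
enum model_img_fmt { MODEL_IMG_FMT_ANY = 0, MODEL_IMG_FMT_420_8BIT };

struct model_ctx {
    enum model_img_fmt image_fmt; /* models ctx->image_fmt */
    bool capture_busy;            /* models vb2_is_busy(CAPTURE) */
};

/* Models the SPS path of s_ctrl as described in the notes. */
static int model_s_ctrl_sps(struct model_ctx *ctx, enum model_img_fmt wanted)
{
    bool image_fmt_changed = (ctx->image_fmt != wanted);

    if (!image_fmt_changed)
        return 0;            /* no reset needed, commit succeeds */
    if (ctx->capture_busy)
        return -EBUSY;       /* cannot reset the format under live buffers */
    ctx->image_fmt = wanted; /* queue idle: reset and commit */
    return 0;
}
```

In this model the pre-α-25 sequence (allocate CAPTURE, then send the first SPS) returns -EBUSY, while the α-25 sequence (send a synthetic SPS, allocate CAPTURE, then send per-frame SPS with the same format) returns 0 every time.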

Frame 2+ divergence (separate Bug)

decode_params.short_term_ref_pic_set_size:

  • libva frame 2: bytes 4-5 = 00 00 → 0
  • kdirect frame 2: bytes 4-5 = 0a 00 → 10

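libva's h265_fill_decode_params doesn't populate short_term_ref_pic_set_size. The working assumption at iter25 close was that VAAPI doesn't expose it; the iter26 commit message records that picture->st_rps_bits does carry it. kdirect parses it from the HEVC NAL directly. This affects DPB reference resolution for P/B frames. iter26 candidate.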
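
How the two byte pairs above decode, as a small sketch. It assumes the field is a 16-bit little-endian value at byte offset 4 of the dumped decode_params, which is what the "bytes 4-5" notation implies; read_le16 is an illustrative helper, not an existing function in the backend.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a raw byte dump.
 * The caller must guarantee that offset + 1 is within the buffer. */
static uint16_t read_le16(const uint8_t *buf, size_t offset)
{
    return (uint16_t)(buf[offset] | ((uint16_t)buf[offset + 1] << 8));
}
```

With bytes 4-5 set to 0a 00 this yields 10 (kdirect), and with 00 00 it yields 0 (libva).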

Mechanism status

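| #  | Mechanism                                      | Status               |
|----|------------------------------------------------|----------------------|
| 9  | rkvdec_s_ctrl -EBUSY on first SPS              | FIXED iter25 α-25    |
| 10 | decode_params.short_term_ref_pic_set_size = 0  | NEW iter26 candidate |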

Substrate state at iter25 close

  • Backend SHA on fresnel: post-α-25 build (commit db0b7f9).
  • Fork tip db0b7f9 (α-25).
  • Kernel linux-fresnel-fourier 7.0-8 (diagnostic printks; should eventually revert to clean 7.0-1 + RFC v2 + iter12 baseline).
  • HEVC libva frame 1 = kdirect frame 1 byte-identical. ✓✓✓
  • HEVC libva frame 2+: differs.

Anchors check pending

Need to re-run 5-codec anchors to verify α-25 didn't regress VP9/MPEG-2/VP8 (it shouldn't — guard is case VAProfileHEVCMain / case VAProfileH264* only).

Lesson

After 15 iterations chasing wire-byte hypotheses (iter11-iter18) and 5 iterations of kernel printk (iter17-iter24), the actual bug turned out to be an interaction between libva's CAPTURE-pre-allocate design and rkvdec's lazy image_fmt determination. The fix is 90 LOC in libva. The kernel was correct all along; it just needed a way to commit the image_fmt before buffers were locked in.

This validates feedback-libva-byte-correct-kernel-bug only partially: libva WAS byte-correct in its ioctl content, but it had a CAPTURE-pool-allocation TIMING bug that interacted with kernel state. The bug is in libva, not the kernel, but the symptom only manifested because of kernel-side -EBUSY semantics that aren't well documented.

iter26 candidate

Fix h265_fill_decode_params to populate short_term_ref_pic_set_size. Candidate sources, in order of preference: the picture-parameter field st_rps_bits (which the iter26 commit message reports VAAPI does expose), surface_object->params.h265.slices[0].short_term_ref_pic_set_size (if VAAPI provides it), or parsing the slice header directly.
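
A minimal sketch of the st_rps_bits option, under stated assumptions. The two structs are cut-down stand-ins holding only the fields touched here; the real types would be VAPictureParameterBufferHEVC and struct v4l2_ctrl_hevc_decode_params. The function name and the saturating copy are illustrative choices, not taken from the backend.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the VAAPI picture parameters (only the field used here). */
struct pic_params_sketch {
    uint32_t st_rps_bits;
};

/* Stand-in for the leading fields of decode_params. With this layout the
 * size field sits at byte offset 4, matching "bytes 4-5" in the dumps. */
struct decode_params_sketch {
    int32_t  pic_order_cnt_val;
    uint16_t short_term_ref_pic_set_size;
};

/* Copy the bit count; saturate because the destination is 16 bits wide
 * (assumption: clamping is preferable to silent truncation). */
static void fill_st_rps_size(struct decode_params_sketch *dp,
                             const struct pic_params_sketch *pic)
{
    uint32_t bits = pic->st_rps_bits;

    dp->short_term_ref_pic_set_size =
        bits > UINT16_MAX ? UINT16_MAX : (uint16_t)bits;
}
```

For the frame 2 case above, an input of 10 bits would produce the 0a 00 pattern that kdirect sends.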