bf67900cd8
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24: pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
dp now match kdirect. HEVC frame 2+ still diverges
(separate bug, likely DPB entry mapping).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
77 lines
4.5 KiB
Markdown
## Iteration 26 — Phase 8 (close)
Closes 2026-05-14. iter26 = α-26: populate `decode_params.short_term_ref_pic_set_size` from `VAPictureParameterBufferHEVC.st_rps_bits`. PARTIAL close.
### α-26 fix
`src/h265.c::h265_fill_decode_params`: replaced the comment "VAAPI doesn't expose" with the actual assignment:

```c
decode_params->short_term_ref_pic_set_size = picture->st_rps_bits;
```

VAAPI's `VAPictureParameterBufferHEVC` exposes `st_rps_bits` (u32) as the bit count of the inline `short_term_ref_pic_set` syntax element in the slice header. The previous comment in `h265_fill_decode_params` was wrong: the field IS exposed.

Fork `66ef848`.
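One detail the one-line assignment leaves implicit: `st_rps_bits` is 32 bits wide on the VAAPI side, while the destination occupies only bytes 4-5 of the dumped `decode_params`, which suggests a 16-bit field. A minimal sketch of a guarded copy follows. It uses stand-in structs (`va_pic_stub`, `dp_stub`) and a hypothetical helper name (`fill_st_rps_size`), not the real VAAPI or V4L2 headers, and is not the code in the fork.

```c
#include <stdint.h>

/* Stand-ins for the two fields involved; NOT the real VAAPI/V4L2 structs. */
struct va_pic_stub { uint32_t st_rps_bits; };
struct dp_stub     { uint16_t short_term_ref_pic_set_size; };

/*
 * Copy the bit count, refusing values that would not fit in 16 bits.
 * Returns 0 on success, -1 if the value would be truncated (dp is untouched).
 */
static int fill_st_rps_size(struct dp_stub *dp, const struct va_pic_stub *pic)
{
    if (pic->st_rps_bits > UINT16_MAX)
        return -1;
    dp->short_term_ref_pic_set_size = (uint16_t)pic->st_rps_bits;
    return 0;
}
```

For realistic streams the bit count is small (10 in the frame 2 dump), so the guard is defensive rather than a fix for any observed failure.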
### Empirical result
- **HEVC frame 1** (IDR): libva CAPTURE = kdirect CAPTURE byte-identical. ✓
- **HEVC frames 2-10**: still diverge. Hash unchanged from iter25 (`700aa52d…`).
- **decode_params bytes** (per iter20 kernel printk) NOW match kdirect for frames 1-3:
  - libva frame 2: `dp[0..16] = 04 00 00 00 0a 00 00 00 01 01 00 00 00 00 00 00`
  - kdirect frame 2: `dp[0..16] = 04 00 00 00 0a 00 00 00 01 01 00 00 00 00 00 00` ✓

α-26 made the first 16 bytes of decode_params match kdirect on the libva path, yet the decoded output is identical to iter25, so the divergence-causing bytes are NOT in `decode_params[0..16]`.
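To make the frame 2 dump easier to read, the logged bytes can be decoded field by field. The sketch below assumes the mainline `struct v4l2_ctrl_hevc_decode_params` layout (s32 `pic_order_cnt_val` at offset 0, u16 `short_term_ref_pic_set_size` at offset 4, u16 `long_term_ref_pic_set_size` at offset 6, then four u8 counters) and a little-endian host. `dp_head` and `decode_dp_head` are illustrative names, not kernel symbols, and the offsets should be checked against the `linux/v4l2-controls.h` of the kernel under test.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed mirror of the first 12 bytes of v4l2_ctrl_hevc_decode_params. */
struct dp_head {
    int32_t  pic_order_cnt_val;           /* bytes 0-3 */
    uint16_t short_term_ref_pic_set_size; /* bytes 4-5 */
    uint16_t long_term_ref_pic_set_size;  /* bytes 6-7 */
    uint8_t  num_active_dpb_entries;      /* byte 8 */
    uint8_t  num_poc_st_curr_before;      /* byte 9 */
    uint8_t  num_poc_st_curr_after;       /* byte 10 */
    uint8_t  num_poc_lt_curr;             /* byte 11 */
};

/* Frame 2 bytes as logged by the printk (same for libva and kdirect). */
static const uint8_t dp_frame2[16] = {
    0x04, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00,
    0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Copy the leading bytes of a dump into the mirror struct (little-endian host). */
static struct dp_head decode_dp_head(const uint8_t *dump)
{
    struct dp_head h;

    memcpy(&h, dump, sizeof(h));
    return h;
}
```

Under those assumptions, `decode_dp_head(dp_frame2)` yields `pic_order_cnt_val = 4`, `short_term_ref_pic_set_size = 10`, one active DPB entry, and one `poc_st_curr_before` reference.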
### What still differs
12,234,632 of 13,824,000 bytes diverge, i.e. nearly every byte of frames 2-10 (about 98% of their 12,441,600 bytes). Frame 1 (1,382,400 bytes) is byte-identical.

The `rkvdec_hevc_run` printk for libva still shows `reorder=4` (libva incorrectly sets `sps_max_num_reorder_pics = sps_max_dec_pic_buffering_minus1`) vs kdirect's `reorder=2`. But a kernel source search shows `sps_max_num_reorder_pics` is referenced ONLY in our diagnostic printk; the `rkvdec_hevc_run` hardware setup doesn't use it. So that's not the cause.

The likely candidates for frame 2+ divergence:

1. **DPB entry mapping**: `dpb[i].timestamp` must match the CAPTURE buffer's timestamp. iter26 didn't probe this. Need to dump `dp[64..96]` (`dpb[0..1]` entries) and compare libva vs kdirect.
2. **CAPTURE buffer reuse pattern**: libva's iter5b-β has a 24-slot LRU; kdirect has a different pool size. The kernel's reference-buffer association may differ.
3. **slice_params bytes**: not yet inspected by printk.
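The divergence figures at the top of this section can be reproduced with a byte-wise, per-frame comparison of the two raw CAPTURE dumps. A minimal sketch follows, assuming both dumps are already in memory and every frame has the same size (1,382,400 bytes here, consistent with a 1280×720 4:2:0 8-bit frame). `count_diff_per_frame` is a hypothetical helper, not a tool from this campaign.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count differing bytes per frame between two equally sized dumps.
 * out[i] receives the count for frame i; the return value is the total.
 * Trailing bytes that do not fill a whole frame are ignored.
 */
static size_t count_diff_per_frame(const uint8_t *a, const uint8_t *b,
                                   size_t len, size_t frame_size,
                                   size_t *out, size_t max_frames)
{
    size_t total = 0;
    size_t frames = frame_size ? len / frame_size : 0;

    if (frames > max_frames)
        frames = max_frames;

    for (size_t f = 0; f < frames; f++) {
        const size_t base = f * frame_size;
        size_t diff = 0;

        for (size_t i = 0; i < frame_size; i++)
            diff += (a[base + i] != b[base + i]);

        out[f] = diff;
        total += diff;
    }
    return total;
}
```

With `frame_size = 1382400` on the 10-frame dumps, the expectation from the numbers above is `out[0] == 0` and a total of 12,234,632 spread over frames 2-10.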
### Regression check (non-HEVC anchors)
Run with α-25 + α-26 changes:

| Codec | Result | Notes |
|---|---|---|
| H.264 (bbb_1080p30_h264, 3 frames) | **HW = kdirect byte-equal** | Bug 4 FIXED |
| HEVC (bbb_720p10s_hevc, 1 frame) | HW = kdirect byte-equal | Frame 1 FIXED |
| HEVC (bbb_720p10s_hevc, 10 frames) | HW ≠ kdirect | Frame 2+ separate bug |
| VP9 (bbb_720p10s_vp9) | HW = SW byte-equal | Unchanged |
| MPEG-2 (bbb_720p10s_mpeg2) | **Not testable this boot** | Pre-existing: vainfo only advertises rkvdec profiles, hantro paths not multi-probed |
| VP8 (bbb_720p10s_vp8) | **Not testable this boot** | Same pre-existing issue |

The MPEG-2 / VP8 "not testable" state is due to the libva backend's single-device auto-select, which chose rkvdec at /dev/video1 + /dev/media0 this boot. rkvdec doesn't advertise MPEG-2 / VP8 profiles. Testing these would need a libva-side multi-device profile probe or an explicit env override.
### Major campaign milestone
Root causes of **Bug 4 (H.264 keyframe-partial → all-correct)** and **Bug 5 (HEVC libva all-zero → frame 1 correct)** identified; Bug 4 is **FULLY FIXED** and Bug 5 **PARTIALLY FIXED**, in two iterations after the 6 kernel-printk diagnostic iterations narrowed the failure to `rkvdec_s_ctrl` returning `-EBUSY` on the first SPS.

H.264 is now fully byte-equivalent to kdirect. HEVC has one remaining bug (frame 2+).
### Substrate state at iter26 close
- Backend fork tip: `66ef848` (α-25 + H264-flag-fix + α-26).
- Kernel `7.0-8` with diagnostic printks (will eventually revert to clean baseline).
- 5-codec status: H264 ✅, HEVC ⚠️ (frame 1 ✅), VP9 ✅, MPEG-2/VP8 untestable this boot.
### iter27 candidate
Add an iter20-style kernel printk extension to dump `dp[64..96]`, covering the `dpb[0..1]` entries. Compare libva vs kdirect DPB entries for HEVC frame 2 to identify whether the cause is:

- (a) timestamp mismatch → libva references a non-existent CAPTURE buffer.
- (b) pic_order_cnt_val mismatch.
- (c) DPB entry flags mismatch.

Alternatively, just inspect the slice_params bytes for frame 2 (rkvdec's `run.slices_params[0]` printk extension).
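The three outcomes (a)-(c) map directly onto the fields of a DPB entry, so the comparison can be mechanical once `dp[64..96]` is dumped. A rough sketch follows, assuming the mainline 16-byte `struct v4l2_hevc_dpb_entry` layout (u64 `timestamp`, u8 `flags`, u8 `field_pic`, u16 reserved, s32 `pic_order_cnt_val`) and a little-endian host. The type and function names are illustrative, and the layout should be checked against the running kernel's header.

```c
#include <stdint.h>
#include <string.h>

/* Assumed mirror of struct v4l2_hevc_dpb_entry (16 bytes). */
struct dpb_entry {
    uint64_t timestamp;         /* ns timestamp of the referenced buffer */
    uint8_t  flags;
    uint8_t  field_pic;
    uint16_t reserved;
    int32_t  pic_order_cnt_val;
};

enum dpb_mismatch {
    DPB_MATCH             = 0,
    DPB_TIMESTAMP_DIFFERS = 1 << 0, /* case (a) */
    DPB_POC_DIFFERS       = 1 << 1, /* case (b) */
    DPB_FLAGS_DIFFERS     = 1 << 2, /* case (c) */
};

/* Compare one raw 16-byte entry from each dump; returns a mismatch bitmask. */
static unsigned compare_dpb_entry(const uint8_t *libva_raw,
                                  const uint8_t *kdirect_raw)
{
    struct dpb_entry l, k;
    unsigned result = DPB_MATCH;

    memcpy(&l, libva_raw, sizeof(l));
    memcpy(&k, kdirect_raw, sizeof(k));

    if (l.timestamp != k.timestamp)
        result |= DPB_TIMESTAMP_DIFFERS;
    if (l.pic_order_cnt_val != k.pic_order_cnt_val)
        result |= DPB_POC_DIFFERS;
    if (l.flags != k.flags)
        result |= DPB_FLAGS_DIFFERS;
    return result;
}
```

One caveat on case (a): each client chooses its own buffer timestamps, so a timestamp that differs between libva and kdirect is not a bug by itself. Case (a) is only confirmed if libva's value also fails to match any timestamp it previously queued in the same session.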
### Lesson
Two iterations of libva-side patches (α-25 = synthetic SPS, α-26 = st_rps_bits) after the 24-iteration kernel-printk localization fixed Bug 4 fully and Bug 5 partially. The campaign's wire-byte hypothesis arc (iter11-iter18) was overturned by kernel printk, but THEN the actual fix was almost entirely on the libva side. The kernel was correct.