Final session summary: H264 + VP9 + HEVC frame 1 byte-equal to SW

Bug 4 (H264 keyframe-partial): FIXED.
Bug 5 (HEVC libva all-zero): partial fix, frame 1 byte-equal.
Root cause: rkvdec_s_ctrl -EBUSY when first SPS triggers image_fmt
reset on busy CAPTURE queue (libva pre-allocates buffers at
CreateContext, kernel blocks the reset).
Fix: 90-LOC synthetic SPS injection in libva CreateContext before
cap_pool_init pre-seeds ctx->image_fmt.

Remaining: HEVC frame 2+ (ffmpeg-vaapi slice_data 40-byte inflation),
MPEG-2/VP8 (libva multi-device probe). Both deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 12:10:08 +00:00
parent 02c4192902
commit 8b17bf797a
@@ -0,0 +1,66 @@
## Campaign Session 2026-05-14 — Final Summary
### Starting state
- Bug 4 (H.264 keyframe-partial) and Bug 5 (HEVC libva all-zero CAPTURE) unresolved.
- iter11–iter18 had eliminated 8 wire-byte hypotheses without finding the root cause.
- iter17 (this session) added kernel printk in `rkvdec_hevc_run` and found rkvdec sees all-zero SPS contents for libva.
### Iterations completed
| Iter | Type | Output |
|---|---|---|
| 17 | kernel printk in rkvdec_hevc_run | rkvdec sees w=0 h=0 for libva, w=1280 h=720 for kdirect |
| 18 | α-21/22 mechanism eliminations | Mechanisms 3 (stale stack), 5 (error_idx) DISPROVED |
| 19 | α-23 REINIT test | Mechanism 2 (REINIT clears) DISPROVED |
| 20 | kernel printk for ctrl_hdl pointer + ctrl bytes | ctrl_hdl pointers stable; SPS bytes all-zero for libva |
| 21 | kernel printk in v4l2_ctrl_request_setup loop | HEVC_SPS has p_req_valid=1; loop exits after SPS |
| 22 | kernel printk in v4l2_ctrl_request_clone | Clone IS complete (22 controls cloned with err=0) |
| 23 | kernel printk for skip-reason | Loop EXITS at HEVC_SPS, doesn't skip |
| 24 | kernel printk for req_to_new + try_or_set_cluster returns | `try_or_set_cluster ret=-16` (-EBUSY) for HEVC_SPS |
| | | **ROOT CAUSE: rkvdec_s_ctrl returns -EBUSY when SPS triggers image_fmt reset on busy CAPTURE queue** |
| 25 | **α-25 synthetic SPS injection in libva** | **H.264 fully fixed (10F byte-equal to SW)**; **HEVC frame 1 fixed (byte-equal to SW)** |
| 26 | α-26 decode_params.short_term_ref_pic_set_size from VAAPI | Wire-correct; rkvdec doesn't use field |
| 27 | α-27 num_entry_point_offsets from VAAPI | No-op (VAAPI returns 0; rkvdec doesn't use) |
| 28 | α-28 bit_size = (slice_data_size - data_byte_offset) * 8 | No-op (rkvdec doesn't use bit_size) |
### Final 5-codec state
| Codec | Status | Notes |
|---|---|---|
| H.264 | **PASS** (byte-equal SW, 10 frames) | Bug 4 fixed |
| HEVC | **frame 1 PASS** (byte-equal SW); frames 2+ DIVERGE | Bug 5 partial; frame 2+ rooted in ffmpeg-vaapi slice_data buffer 40-byte inflation vs ffmpeg-v4l2request — deferred |
| VP9 | **PASS** (byte-equal SW) | Unchanged |
| MPEG-2 | untestable on this kernel boot | Pre-existing libva single-device profile-probe limitation |
| VP8 | untestable on this kernel boot | Same |
### Root cause discovered
`rkvdec_s_ctrl` on first HEVC_SPS / H264_SPS resolves `image_fmt` via `get_image_fmt()` and, if it differs from cached `ctx->image_fmt` (default `RKVDEC_IMG_FMT_ANY`), tries to reset the CAPTURE format. Reset blocked by `vb2_is_busy` (any CAPTURE buffer allocated → returns true). libva pre-allocates 24 CAPTURE buffers at CreateContext (iter5b-β design) BEFORE the first per-frame S_EXT_CTRLS, so:
- First per-frame SPS staged via `try_or_set_cluster` → `s_ctrl` → `rkvdec_s_ctrl` returns -EBUSY.
- `v4l2_ctrl_request_setup` outer loop breaks → SPS never committed to `ctx->ctrl_hdl`.
- `rkvdec_hevc_run_preamble` reads `ctx->ctrl_hdl[SPS]->p_cur` which is zero.
- Hardware sees w=0 h=0 → all-zero CAPTURE.
### Fix delivered
`src/context.c::RequestCreateContext` (α-25 commit `db0b7f9`, fixed `d062fec`):
inject one S_EXT_CTRLS with a synthetic minimal HEVC_SPS or H264_SPS (chroma + bit_depth from profile) at CreateContext, BEFORE `cap_pool_init`. CAPTURE queue is empty at this point → `vb2_is_busy=false` → `rkvdec_s_ctrl` resets and updates `ctx->image_fmt` → from then on per-frame SPS submissions see `image_fmt_changed=false` → no reset → no -EBUSY → SPS commits correctly.
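
The "chroma + bit_depth from profile" step could look roughly like the sketch below. It is an illustration under assumptions, not the code in `db0b7f9`: the `toy_profile` enum and `toy_sps` struct are hypothetical, only the three SPS syntax elements that would plausibly drive the image format are modelled, and choosing 10-bit for Main 10 is a guess at what a pre-seed would want.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical profile tags for this sketch; not the libva VAProfile enum. */
enum toy_profile {
    TOY_H264_HIGH,
    TOY_HEVC_MAIN,
    TOY_HEVC_MAIN10,
};

/* Field names follow the H.264/HEVC SPS syntax elements. */
struct toy_sps {
    uint8_t chroma_format_idc;        /* 1 means 4:2:0 */
    uint8_t bit_depth_luma_minus8;
    uint8_t bit_depth_chroma_minus8;
};

/*
 * Fill a minimal SPS from the profile alone.
 * Returns 0 on success, -1 for a profile this sketch does not know.
 */
static int toy_synthetic_sps(enum toy_profile profile, struct toy_sps *out)
{
    assert(out != NULL);

    memset(out, 0, sizeof(*out));
    out->chroma_format_idc = 1;       /* all three profiles here are 4:2:0 */

    switch (profile) {
    case TOY_H264_HIGH:
    case TOY_HEVC_MAIN:
        return 0;                     /* 8-bit: both minus8 fields stay 0 */
    case TOY_HEVC_MAIN10:
        out->bit_depth_luma_minus8 = 2;     /* assumed: seed as 10-bit */
        out->bit_depth_chroma_minus8 = 2;
        return 0;
    }
    return -1;
}
```

The point of deriving these fields from the profile is that the pre-seeded format should match what the first real SPS resolves to, so later per-frame submissions see no format change.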
### Remaining work
1. **HEVC frame 2+ divergence**: 40-byte slice_data buffer inflation in ffmpeg-vaapi relative to ffmpeg-v4l2request. Need either (a) ffmpeg-vaapi-side investigation/patch to ensure a consistent `size` parameter, or (b) a libva-backend bitstream parser that finds the HEVC rbsp_trailing_bits and trims. Deferred.
2. **MPEG-2 / VP8 multi-device probe**: libva backend's `find_codec_device` picks ONE device for the entire session. For RK3399 with both rkvdec (H.264/HEVC/VP9) and hantro (MPEG-2/VP8), the backend should multi-probe and aggregate profiles. Deferred.
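
For option (b) of item 1, a first approximation of the trim might be the sketch below. It assumes the extra bytes are plain zero padding after the slice data, which this session has not established, and the function name `toy_trim_after_stop_bit` is hypothetical. It relies on the fact that rbsp_trailing_bits is a single 1 bit followed by zero bits, so the last non-zero byte is the one holding the stop bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of buf up to and including the last non-zero byte,
 * i.e. the byte that carries the rbsp_trailing_bits stop bit.
 * Returns 0 for an empty or all-zero buffer.
 *
 * Limitation: cabac_zero_words protected by emulation prevention
 * (00 00 03 sequences) end in 0x03 and are not stripped here.
 */
static size_t toy_trim_after_stop_bit(const uint8_t *buf, size_t size)
{
    assert(buf != NULL || size == 0);

    while (size > 0 && buf[size - 1] == 0x00)
        size--;
    return size;
}
```

If the inflation turns out to be something other than zero padding, this helper leaves the buffer untouched, which makes it a cheap way to test the padding hypothesis before writing a real parser.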
### Commits
Campaign repo: `bf67900`, `02c4192`.
Backend fork: `db0b7f9`, `d062fec`, `66ef848`, `719d813`, `c9bfa21`, `754be1d`, `cd286d9` (final tip).
Kernel substrate: `linux-fresnel-fourier 7.0-1` (clean baseline) was used; 7.0-2..7.0-9 added incremental diagnostic printks for iter12 RFC v2 + iter17–iter27 root-cause investigation. The diagnostic kernels are NOT shipping; should revert to clean 7.0-X for production once campaign exits diagnostic mode.
### Lesson
The wire-byte hypothesis arc (iter11–iter18) chased an empirical illusion: libva's ioctl payloads WERE byte-correct but the BUG was in the interaction between libva's CAPTURE-pool TIMING and rkvdec's lazy `image_fmt` determination. 6 kernel-printk iterations narrowed the failure to one function returning one error code. The fix is 90 LOC in libva. The kernel was correct all along.
The `[[feedback-libva-byte-correct-kernel-bug]]` memory entry was partially overturned: kernel-side -EBUSY semantics interact with libva-side allocation TIMING. Memory entry updated.