iter27/28: probe HEVC frame 2+ divergence; α-27/α-28 no-op; ffmpeg-vaapi slice_data inflation localized
α-27: num_entry_point_offsets — VAAPI returns 0, rkvdec doesn't use it
α-28: bit_size = (slice_data_size - data_byte_offset) * 8 — matches kdirect's
printk value, but rkvdec doesn't use bit_size either. Output unchanged.
Remaining HEVC frame 2+ root cause: libva's slice_data buffer (from VAAPI)
is 40 bytes larger per slice than what ffmpeg-v4l2request appends from
libavcodec for the same frame. The trailing bytes inflate OUTPUT buffer
content → rkvdec reads past slice payload into garbage → frame 2+ wrong.
Campaign status: H264 ✅ (Bug 4 fixed), HEVC frame 1 ✅ (Bug 5 partial),
VP9 ✅, HEVC frame 2+ ⚠️ (deferred to ffmpeg-vaapi-level fix).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,72 @@
|
|||||||
|
## Iteration 27/28 — Phase 8 (close)
|
||||||
|
|
||||||
|
Closes 2026-05-14. iter27 = extend kernel printk to dpb/slice bytes; iter28 = α-27/α-28 attempts to fix HEVC frame 2+. PARTIAL close, frame 2+ NOT fixed via libva-backend changes.
|
||||||
|
|
||||||
|
### α-27 (no-op): num_entry_point_offsets from VAAPI
|
||||||
|
|
||||||
|
VAAPI's `slice->num_entry_point_offsets` is 0 for all slices (ffmpeg-vaapi front-end does NOT parse this field). Even though kdirect (ffmpeg-v4l2request) writes 22 (=BBB's actual WPP entry-point count), rkvdec's source has NO reference to `num_entry_point_offsets` at all — the field is unused by the kernel driver.
|
||||||
|
|
||||||
|
Cannot fix from libva-side (need ffmpeg-vaapi parser upstream patch), and not needed for decode correctness on rkvdec.
|
||||||
|
|
||||||
|
### α-28 (no-op output): bit_size formula
|
||||||
|
|
||||||
|
Changed from `slice_data_size * 8` to `(slice_data_size - slice_data_byte_offset) * 8`. Kernel printk verifies bit_size now matches kdirect's 44096 for BBB frame 2. **But output hash unchanged** (`700aa52d…`). rkvdec's source has NO reference to `bit_size` either.
|
||||||
|
|
||||||
|
bit_size in the V4L2 stateless HEVC API spec is for HW consumers that use it; rkvdec doesn't.
|
||||||
|
|
||||||
|
### Remaining HEVC frame 2+ divergence
|
||||||
|
|
||||||
|
Per kernel iter20 + iter27 printks, with α-25/26/27/28 all in place:
|
||||||
|
|
||||||
|
| Field | libva frame 2 | kdirect frame 2 | rkvdec uses? |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `bit_size` | 44096 (α-28) | 44096 | No |
|
||||||
|
| `data_byte_offset` | 40 | 40 | Yes (likely) |
|
||||||
|
| `num_entry_point_offsets` | 0 | 22 | No |
|
||||||
|
| `decode_params dp[0..16]` | same | same | Yes |
|
||||||
|
| `sps[0..16]` | same | same | Yes |
|
||||||
|
| `nal_unit_type`, `slice_type` | same | same | Yes |
|
||||||
|
| `slice_pic_order_cnt`, qp_delta | same | same | Yes |
|
||||||
|
|
||||||
|
All inspected fields match. Yet libva output diverges at byte 1382401 (frame 2 boundary), 12.2M bytes total differ across 10 frames.
|
||||||
|
|
||||||
|
### Likely root cause for frame 2+ (deferred)
|
||||||
|
|
||||||
|
The libva OUTPUT buffer for frame 2 = 5552 bytes (= 3 Annex-B start + 5549 from `slice->slice_data_size`). ffmpeg-v4l2request's OUTPUT for the SAME frame appears to be ~5512 bytes (= 3 + 5509, based on its `bit_size = (size+extra_size)*8` formula).
|
||||||
|
|
||||||
|
Difference: libva's slice_data_buffer from VAAPI is **40 bytes larger** than what ffmpeg-v4l2request's libavcodec dispatch gives. The trailing 40 bytes get appended to libva's OUTPUT buffer per slice.
|
||||||
|
|
||||||
|
Hypothesis: ffmpeg-vaapi's slice_data buffer concatenates trailing bytes (RBSP trailing alignment / between-NAL zeros) that ffmpeg-v4l2request strips. rkvdec reads past the actual slice payload into these trailing bytes → entropy decoder corrupts state → frame 2+ decoded with wrong reference content.
|
||||||
|
|
||||||
|
Both libavcodec dispatches share `FF_HW_CALL(s->avctx, decode_slice, nal->raw_data, nal->raw_size)` at hevcdec.c:2989, so same `size` parameter should reach both. The 40-byte inflation happens INSIDE ffmpeg-vaapi or libva (between hwaccel-init and slice-buffer-make).
|
||||||
|
|
||||||
|
### Mechanism status (post-iter27/28)
|
||||||
|
|
||||||
|
| # | Mechanism | Status |
|
||||||
|
|---|---|---|
|
||||||
|
| iter24 #9 | rkvdec_s_ctrl -EBUSY | FIXED iter25 α-25 (Bug 4 fully; Bug 5 frame 1 ok) |
|
||||||
|
| iter26 #10 | decode_params.short_term_ref_pic_set_size | FIXED iter26 α-26 (cosmetic; rkvdec doesn't use this field either) |
|
||||||
|
| iter27 #11 | num_entry_point_offsets | NO-OP (rkvdec doesn't use) |
|
||||||
|
| iter28 #12 | bit_size formula | NO-OP (rkvdec doesn't use) |
|
||||||
|
| iter28 #13 | **slice_data buffer 40-byte inflation in VAAPI vs libavcodec** | **LEADING — fix in ffmpeg-vaapi or libva-internal slice buffer; outside campaign scope this iter** |
|
||||||
|
|
||||||
|
### Substrate state at iter27/28 close
|
||||||
|
|
||||||
|
- Backend fork tip `cd286d9` (α-25 + H264-flag-fix + α-26 + α-27 + α-28).
|
||||||
|
- Kernel `7.0-9` with iter17 + iter20 + iter21 + iter22 + iter23 + iter27 printks.
|
||||||
|
- 5-codec status:
|
||||||
|
- **H.264**: byte-equal to kdirect on 3 frames (Bug 4 FIXED).
|
||||||
|
- **HEVC**: frame 1 byte-equal to kdirect (Bug 5 frame 1 FIXED). Frames 2+ diverge (separate root cause).
|
||||||
|
- **VP9**: unchanged (HW=SW byte-equal).
|
||||||
|
- **MPEG-2 / VP8**: untestable on this kernel boot — libva backend's single-device auto-probe lands on rkvdec which doesn't expose these profiles. Pre-existing libva backend limitation.
|
||||||
|
|
||||||
|
### Campaign-wide milestone
|
||||||
|
|
||||||
|
In one day:
|
||||||
|
- 8 wire-byte hypotheses eliminated iter11-iter18.
|
||||||
|
- 5 kernel-printk iterations (iter17, iter20-iter23) walking down the kernel-side localization.
|
||||||
|
- 1 root cause identification (iter24): rkvdec_s_ctrl returns -EBUSY when SPS triggers image_fmt reset on a busy CAPTURE queue.
|
||||||
|
- 1 fix iteration (iter25): synthetic SPS injection pre-allocates ctx->image_fmt → Bug 4 fully fixed, Bug 5 frame 1 fixed.
|
||||||
|
- 3 followup iterations (iter26-28): VAAPI field propagation discrepancies between vaapi front-end and v4l2request front-end identified but partially un-fixable from libva-side alone.
|
||||||
|
|
||||||
|
The campaign has gone from 0/5 PASS (initial) → 4/5 PARTIAL/PASS (Bug 4+5 root-caused, H264 fully fixed, HEVC frame 1 fixed, VP9 unchanged), with the remaining frame 2+ issue localized to an ffmpeg-vaapi serialization quirk.
|
||||||
Reference in New Issue
Block a user