Files
fresnel-fourier/phase8_iteration27_close.md
T
marfrit 02c4192902 iter27/28: probe HEVC frame 2+ divergence; α-27/α-28 no-op; ffmpeg-vaapi slice_data inflation localized
α-27: num_entry_point_offsets — VAAPI returns 0, rkvdec doesn't use it
α-28: bit_size = (slice_data_size - data_byte_offset) * 8 — matches kdirect's
      printk value, but rkvdec doesn't use bit_size either. Output unchanged.

Remaining HEVC frame 2+ root cause: libva's slice_data buffer (from VAAPI)
is 40 bytes larger per slice than what ffmpeg-v4l2request appends from
libavcodec for the same frame. The trailing bytes inflate OUTPUT buffer
content → rkvdec reads past slice payload into garbage → frame 2+ wrong.

Campaign status: H264  (Bug 4 fixed), HEVC frame 1  (Bug 5 partial),
VP9 , HEVC frame 2+ ⚠️ (deferred to ffmpeg-vaapi-level fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:28:34 +00:00

73 lines
4.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
## Iteration 27/28 — Phase 8 (close)
Closes 2026-05-14. iter27 = extend kernel printk to dpb/slice bytes; iter28 = α-27/α-28 attempts to fix HEVC frame 2+. PARTIAL close, frame 2+ NOT fixed via libva-backend changes.
### α-27 (no-op): num_entry_point_offsets from VAAPI
VAAPI's `slice->num_entry_point_offsets` is 0 for all slices (ffmpeg-vaapi front-end does NOT parse this field). Even though kdirect (ffmpeg-v4l2request) writes 22 (=BBB's actual WPP entry-point count), rkvdec's source has NO reference to `num_entry_point_offsets` at all — the field is unused by the kernel driver.
Cannot fix from libva-side (need ffmpeg-vaapi parser upstream patch), and not needed for decode correctness on rkvdec.
### α-28 (no-op output): bit_size formula
Changed from `slice_data_size * 8` to `(slice_data_size - slice_data_byte_offset) * 8`. Kernel printk verifies bit_size now matches kdirect's 44096 for BBB frame 2. **But output hash unchanged** (`700aa52d…`). rkvdec's source has NO reference to `bit_size` either.
bit_size in the V4L2 stateless HEVC API spec is for HW consumers that use it; rkvdec doesn't.
### Remaining HEVC frame 2+ divergence
Per kernel iter20 + iter27 printks, with α-25/26/27/28 all in place:
| Field | libva frame 2 | kdirect frame 2 | rkvdec uses? |
|---|---|---|---|
| `bit_size` | 44096 (α-28) | 44096 | No |
| `data_byte_offset` | 40 | 40 | Yes (likely) |
| `num_entry_point_offsets` | 0 | 22 | No |
| `decode_params dp[0..16]` | same | same | Yes |
| `sps[0..16]` | same | same | Yes |
| `nal_unit_type`, `slice_type` | same | same | Yes |
| `slice_pic_order_cnt`, qp_delta | same | same | Yes |
All inspected fields match. Yet libva output diverges at byte 1382401 (frame 2 boundary), 12.2M bytes total differ across 10 frames.
### Likely root cause for frame 2+ (deferred)
The libva OUTPUT buffer for frame 2 = 5552 bytes (= 3 Annex-B start + 5549 from `slice->slice_data_size`). ffmpeg-v4l2request's OUTPUT for the SAME frame appears to be ~5512 bytes (= 3 + 5509, based on its `bit_size = (size+extra_size)*8` formula).
Difference: libva's slice_data_buffer from VAAPI is **40 bytes larger** than what ffmpeg-v4l2request's libavcodec dispatch gives. The trailing 40 bytes get appended to libva's OUTPUT buffer per slice.
Hypothesis: ffmpeg-vaapi's slice_data buffer concatenates trailing bytes (RBSP trailing alignment / between-NAL zeros) that ffmpeg-v4l2request strips. rkvdec reads past the actual slice payload into these trailing bytes → entropy decoder corrupts state → frame 2+ decoded with wrong reference content.
Both libavcodec dispatches share `FF_HW_CALL(s->avctx, decode_slice, nal->raw_data, nal->raw_size)` at hevcdec.c:2989, so same `size` parameter should reach both. The 40-byte inflation happens INSIDE ffmpeg-vaapi or libva (between hwaccel-init and slice-buffer-make).
### Mechanism status (post-iter27/28)
| # | Mechanism | Status |
|---|---|---|
| iter24 #9 | rkvdec_s_ctrl -EBUSY | FIXED iter25 α-25 (Bug 4 fully; Bug 5 frame 1 ok) |
| iter26 #10 | decode_params.short_term_ref_pic_set_size | FIXED iter26 α-26 (cosmetic; rkvdec doesn't use this field either) |
| iter27 #11 | num_entry_point_offsets | NO-OP (rkvdec doesn't use) |
| iter28 #12 | bit_size formula | NO-OP (rkvdec doesn't use) |
| iter28 #13 | **slice_data buffer 40-byte inflation in VAAPI vs libavcodec** | **LEADING — fix in ffmpeg-vaapi or libva-internal slice buffer; outside campaign scope this iter** |
### Substrate state at iter27/28 close
- Backend fork tip `cd286d9` (α-25 + H264-flag-fix + α-26 + α-27 + α-28).
- Kernel `7.0-9` with iter17 + iter20 + iter21 + iter22 + iter23 + iter27 printks.
- 5-codec status:
- **H.264**: byte-equal to kdirect on 3 frames (Bug 4 FIXED).
- **HEVC**: frame 1 byte-equal to kdirect (Bug 5 frame 1 FIXED). Frames 2+ diverge (separate root cause).
- **VP9**: unchanged (HW=SW byte-equal).
- **MPEG-2 / VP8**: untestable on this kernel boot — libva backend's single-device auto-probe lands on rkvdec which doesn't expose these profiles. Pre-existing libva backend limitation.
### Campaign-wide milestone
In one day:
- 8 wire-byte hypotheses eliminated iter11-iter18.
- 5 kernel-printk iterations (iter17, iter20-iter23) walking down the kernel-side localization.
- 1 root cause identification (iter24): rkvdec_s_ctrl returns -EBUSY when SPS triggers image_fmt reset on a busy CAPTURE queue.
- 1 fix iteration (iter25): synthetic SPS injection pre-allocates ctx->image_fmt → Bug 4 fully fixed, Bug 5 frame 1 fixed.
- 3 followup iterations (iter26-28): VAAPI field propagation discrepancies between vaapi front-end and v4l2request front-end identified but partially un-fixable from libva-side alone.
The campaign has gone from 0/5 PASS (initial) → 4/5 PARTIAL/PASS (Bug 4+5 root-caused, H264 fully fixed, HEVC frame 1 fixed, VP9 unchanged), with the remaining frame 2+ issue localized to an ffmpeg-vaapi serialization quirk.