fresnel-fourier/phase4_iter11_plan.md
marfrit a18ba53d6b iter11 Phase 3 + 4: HEVC SPS wire-byte diff narrows Bug 5 to α-13
Phase 3 deep strace: only meaningful SPS diff is bytes 10-11.
  libva   bytes 10-11 = 00 00 (sps_max_num_reorder_pics=0, latency=0)
  kdirect bytes 10-11 = 02 04 (reorder=2, latency=4)

Hardcoded at h265.c:110-111 with comment "/* not exposed */". VAAPI's
VAPictureParameterBufferHEVC doesn't forward these; kdirect parses
SPS NAL directly. sps_max_num_reorder_pics = 0 tells rkvdec "no
reordering" -> B-frame decode blocked -> all-zero output (Bug 5 fits).

Secondary diffs (Phase 4b candidates if α-13 doesn't close):
  - SLICE_PARAMS num_entry_point_offsets = 0 (hardcoded at h265.c:356
    with "iter2 doesn't do tiles" comment); kdirect submits 22.
  - PPS UNIFORM_SPACING flag bit 20 (don't-care for non-tiled).

Phase 4 α-13: ~2 LOC fix. Set sps_max_num_reorder_pics =
sps_max_dec_pic_buffering_minus1 (safe upper bound per H.265 §A.4.2).
Leave sps_max_latency_increase_plus1 = 0 (spec "unconstrained").

Phase 5b review required before Phase 6b implementation per
"reviews never skippable".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:58:03 +00:00

# Iteration 11 — Phase 4 plan
Drafted 2026-05-14 after Phase 3 narrowed Bug 5 to three concrete wire-byte diffs (SPS bytes 10-11, PPS flags bit 20, SLICE_PARAMS bytes 0-3 + 8-11 + 22).
## Mechanism
`src/h265.c::h265_fill_sps` lines 110-111 hardcode:
```c
sps->sps_max_num_reorder_pics = 0; /* not exposed */
sps->sps_max_latency_increase_plus1 = 0; /* not exposed */
```
The kernel uses `sps_max_num_reorder_pics` to size reorder buffers and to permit B-frame decode reordering. A value of 0 blocks reordering for streams that need it (BBB HEVC has B-frames in an IBBP pattern). kdirect derives the real value (2 for BBB) by parsing the SPS NAL directly. The libva backend never sees the SPS NAL: the VAAPI client parses it and forwards only the fields carried in `VAPictureParameterBufferHEVC`, which does not include these two.
Spec constraint: `sps_max_num_reorder_pics[i] ≤ sps_max_dec_pic_buffering_minus1[i]` (allowed reorder ≤ DPB capacity). Using `dec_pic_buffering_minus1` as the value is a safe upper bound — allows full DPB-depth reordering, which is always sufficient and never violates the spec inequality.
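The bound above can be written as a pair of small checks. This is a minimal sketch; `reorder_is_conformant` and `reorder_upper_bound` are hypothetical helper names used for illustration, not functions in `src/h265.c`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the reorder count respects the DPB bound,
 * i.e. sps_max_num_reorder_pics <= sps_max_dec_pic_buffering_minus1. */
static bool reorder_is_conformant(uint8_t max_num_reorder_pics,
                                  uint8_t max_dec_pic_buffering_minus1)
{
    return max_num_reorder_pics <= max_dec_pic_buffering_minus1;
}

/* Hypothetical helper: the largest reorder count that still satisfies the
 * bound. Equality is allowed, so the DPB value itself is the upper bound. */
static uint8_t reorder_upper_bound(uint8_t max_dec_pic_buffering_minus1)
{
    return max_dec_pic_buffering_minus1;
}
```

With the BBB values quoted in this plan, both kdirect's reorder count of 2 and the proposed value of 4 pass the check against a `dec_pic_buffering_minus1` of 4.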
For `sps_max_latency_increase_plus1 = 0`: per spec, 0 means "SpsMaxLatencyPictures is not specified" (i.e., no constraint). Leaving at 0 is correct.
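For reference, when the field is non-zero the spec derives `SpsMaxLatencyPictures = sps_max_num_reorder_pics + sps_max_latency_increase_plus1 - 1`. The sketch below encodes that rule; the function name and the `SPS_LATENCY_UNCONSTRAINED` sentinel are illustrative assumptions, not existing identifiers.

```c
#include <stdint.h>

/* Illustrative sentinel for "no latency limit expressed" (plus1 field is 0). */
#define SPS_LATENCY_UNCONSTRAINED (-1)

/*
 * SpsMaxLatencyPictures = sps_max_num_reorder_pics
 *                       + sps_max_latency_increase_plus1 - 1
 * The derivation applies only when the plus1 field is non-zero.
 */
static int sps_max_latency_pictures(uint8_t max_num_reorder_pics,
                                    uint8_t max_latency_increase_plus1)
{
    if (max_latency_increase_plus1 == 0)
        return SPS_LATENCY_UNCONSTRAINED;
    return (int)max_num_reorder_pics + (int)max_latency_increase_plus1 - 1;
}
```

With kdirect's wire values (reorder 2, plus1 4) the function returns a limit of 5 pictures; with the plus1 field left at 0 it reports no limit, whatever the reorder count is.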
## Proposed fix (α-13)
Two-line change in `h265_fill_sps`:
```c
/*
* iter11 α-13 (Bug 5 fix): VAAPI's VAPictureParameterBufferHEVC does
* not expose sps_max_num_reorder_pics. Hardcoding 0 told rkvdec "no
* reordering needed" and blocked B-frame decode — observed empirically
* as all-zero CAPTURE output through libva for BBB HEVC (B-frames in
* IBBP pattern). kdirect parses the SPS NAL directly and submits the
* stream's actual value (2 for BBB level 4.x Main).
*
* Use sps_max_dec_pic_buffering_minus1 as a safe upper bound: per the
* H.265 SPS semantics (§7.4.3.2), sps_max_num_reorder_pics ≤
* sps_max_dec_pic_buffering_minus1 always holds, so substituting the
* latter is conformant and gives the kernel enough budget to reorder up
* to the DPB depth. May allocate slightly more reorder slots than the
* bitstream's tight constraint (kdirect's 2 vs our 4 for BBB), but
* kernel-side allocation cost is negligible for typical content.
*
* sps_max_latency_increase_plus1 stays at 0, the spec value meaning
* "SpsMaxLatencyPictures unconstrained" (SPS semantics, §7.4.3.2).
* Setting kdirect's value of 4 (latency_increase = 3) would be more
* restrictive, not more correct.
*/
sps->sps_max_num_reorder_pics = picture->sps_max_dec_pic_buffering_minus1;
sps->sps_max_latency_increase_plus1 = 0;
```
LOC: ~2 functional + ~20 comment.
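The effect of the change on the wire can be sketched with a stand-in struct. This assumes the leading fields of the kernel's SPS control are laid out as below, which is consistent with Phase 3 finding the two fields at bytes 10-11; `struct sps_prefix` and `fill_reorder_fields` are hypothetical names, and the real fill code writes the kernel struct directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the first 12 bytes of the SPS control (assumed layout). */
struct sps_prefix {
    uint8_t  video_parameter_set_id;
    uint8_t  seq_parameter_set_id;
    uint16_t pic_width_in_luma_samples;
    uint16_t pic_height_in_luma_samples;
    uint8_t  bit_depth_luma_minus8;
    uint8_t  bit_depth_chroma_minus8;
    uint8_t  log2_max_pic_order_cnt_lsb_minus4;
    uint8_t  sps_max_dec_pic_buffering_minus1;   /* wire byte 9  */
    uint8_t  sps_max_num_reorder_pics;           /* wire byte 10 */
    uint8_t  sps_max_latency_increase_plus1;     /* wire byte 11 */
};

/* Fail the build if the stand-in does not put the fields at bytes 10-11. */
_Static_assert(offsetof(struct sps_prefix, sps_max_num_reorder_pics) == 10,
               "reorder field expected at wire byte 10");
_Static_assert(offsetof(struct sps_prefix, sps_max_latency_increase_plus1) == 11,
               "latency field expected at wire byte 11");

/* Mirrors the proposed two-line change: reorder takes the DPB value,
 * latency stays at 0. */
static void fill_reorder_fields(struct sps_prefix *sps, uint8_t dpb_minus1)
{
    sps->sps_max_dec_pic_buffering_minus1 = dpb_minus1;
    sps->sps_max_num_reorder_pics = dpb_minus1;
    sps->sps_max_latency_increase_plus1 = 0;
}
```

Under that layout, a re-strace after the fix should show byte 10 equal to byte 9 and byte 11 still 0.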
## Scope
In scope: `src/h265.c::h265_fill_sps`.
Out of scope:
- `num_entry_point_offsets` (slice-header field; needs full slice header parse).
- PPS `UNIFORM_SPACING` flag (don't-care for non-tiled content).
- VP9 / VP8 / H.264 / MPEG-2 paths (HEVC-specific change by construction).
## Phase 5 review concerns
- Is `sps_max_num_reorder_pics = sps_max_dec_pic_buffering_minus1` safe? Reviewer should verify the spec inequality.
- Could it produce wrong decode for streams where actual reorder < dec_pic_buf? Reviewer should consider whether over-reporting reorder causes any kernel-side issue.
- Per `feedback_unconditional_codec_state.md`: HEVC fill is gated by VAProfileHEVCMain dispatch in picture.c, so no risk to non-HEVC codecs.
- Per `feedback_wire_vs_behavior.md`: Phase 7 must verify criterion-1 hash match, not just wire-byte diff vs kdirect.
## Phase 7 verification matrix
1. Build, install on fresnel.
2. Run libva HEVC sweep. Compare hash against kdirect's `9340b832…`.
3. Re-strace to confirm SPS byte 10 now = `sps_max_dec_pic_buffering_minus1` value.
4. Run 5-codec regression sweep. H.264 / VP9 / MPEG-2 / VP8 anchors must hold.
Iter11 PASS = libva_hevc.yuv == kdirect_hevc.yuv + zero regression.
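The PASS condition is plain byte equality of the two output files. The sweep itself compares hashes; the sketch below swaps in a direct byte-wise comparison as an equivalent check, and `files_identical` is a hypothetical helper rather than part of the existing tooling.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if both files have identical contents, 0 if they differ,
 * and -1 if either file cannot be opened or read. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = -1;

    if (a && b) {
        unsigned char buf_a[4096], buf_b[4096];
        size_t n_a, n_b;

        result = 1;
        do {
            n_a = fread(buf_a, 1, sizeof buf_a, a);
            n_b = fread(buf_b, 1, sizeof buf_b, b);
            /* A length mismatch or any differing byte means "not equal". */
            if (n_a != n_b || memcmp(buf_a, buf_b, n_a) != 0) {
                result = 0;
                break;
            }
        } while (n_a == sizeof buf_a);   /* a short read marks end of file */

        if (ferror(a) || ferror(b))
            result = -1;
    }
    if (a)
        fclose(a);
    if (b)
        fclose(b);
    return result;
}
```

Calling `files_identical("libva_hevc.yuv", "kdirect_hevc.yuv")` and requiring a return value of 1 matches the criterion stated above.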
## Risks
- **R-1**: Setting sps_max_num_reorder_pics to dec_pic_buf may over-report and confuse the kernel. Probability: low (spec inequality always holds; kernel uses field as upper bound). Mitigation: rollback is 1 line.
- **R-2**: The `num_entry_point_offsets = 0` issue is the actual cause. α-13 then doesn't fix Bug 5. Probability: medium. Mitigation: Phase 7 hash mismatch → Phase 4b parses slice header for entry points.
- **R-3**: PPS UNIFORM_SPACING or another byte diff is the actual cause. Probability: low (UNIFORM_SPACING is don't-care for non-tiled content; if BBB does use tiles, that shows up through num_entry_point_offsets, see R-2).
- **R-4**: Regression on VP9/MPEG-2/H.264/VP8. Probability: zero by construction (HEVC-only change path).
## Predicted iter11 cadence
- Phase 5b review: 15-20 min.
- Phase 6: 10 min.
- Phase 7: 15 min.
- Phase 8: 10 min.
Total: ~1 hour wallclock if α-13 lands the fix.