# Iteration 11 — Phase 3 (HEVC libva vs kdirect deep strace) Captured 2026-05-14 on fresnel (bare hostname, back on LAN). Backend SHA `a17e3c39…` (iter9 close). Fixture `bbb_720p10s_hevc.mp4` 3-frame sweep. ## Anchors | Path | Hash | |---|---| | libva HEVC | `06b2c5a0c01e515d…` (all-zero, Bug 5 reproduced) | | kdirect HEVC | `9340b832e08c23c4d4b2f732cecffb1d4104f2fc3257e561fac217c2d70d84c2` (correct decoded BBB) | ## HEVC control payload comparison — frame 1 Per-request `S_EXT_CTRLS` (ctrl_class=0xf010000): | Control | libva size | kdirect size | Diff? | |---|---|---|---| | HEVC_SPS (0xa40a90) | 40 | 40 | **DIFF at bytes 10-11** | | HEVC_PPS (0xa40a91) | 64 | 64 | **DIFF at bytes 56-57** | | HEVC_SLICE_PARAMS (0xa40a92) | 280 | 280 | **DIFF at bytes 0-3 + 8-11 + 22** | | HEVC_SCALING_MATRIX (0xa40a93) | 1000 | 1000 | identical (all-zero) | | HEVC_DECODE_PARAMS (0xa40a94) | 328 | 328 | DIFF at tail (poc / flags) | ### Diff 1 — SPS bytes 10-11 ``` libva bytes 0..15: 00 00 00 05 d0 02 00 00 04 04 [00 00] 01 01 00 03 kdirect bytes 0..15: 00 00 00 05 d0 02 00 00 04 04 [02 04] 01 01 00 03 ``` Field mapping per `struct v4l2_ctrl_hevc_sps`: - byte 10: `sps_max_num_reorder_pics` (u8) - libva: 0 - kdirect: 2 - byte 11: `sps_max_latency_increase_plus1` (u8) - libva: 0 - kdirect: 4 **Highly suspect**. `sps_max_num_reorder_pics = 0` tells the decoder "no reordering needed in this stream" — but BBB HEVC has B-frames in IBBP order. The kernel may refuse to set up reorder buffers, producing zero output for B-frames. ### Diff 2 — PPS bytes 56-63 (flags u64) ``` libva PPS flags: 0x0000_0000_0000_7040 kdirect PPS flags: 0x0000_0000_0010_7040 ``` Diff = bit 20 = `V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING` (set in kdirect, missing in libva). UNIFORM_SPACING is meaningful only when `num_tile_columns_minus1 > 0 || num_tile_rows_minus1 > 0`. For non-tiled streams it's don't-care. BBB HEVC may or may not use tiles — check `num_entry_point_offsets` in slice (Diff 3). ### Diff 3 — SLICE_PARAMS bytes 0-31 ``` libva head32: d0d30b00 2d000000 00000000 14010200 00000000 0000ff00 07000000 00000000 kdirect head32: 80d20b00 2d000000 16000000 14010200 00000000 00000000 07000000 00000000 ^^^^^^^^ ^^^^^^^^ ^^ bit_size num_entry_pt byte 22 ``` - bytes 0-3 `bit_size` u32 LE: libva=0x000bd3d0 (774096), kdirect=0x000bd280 (773760). Diff = 336 bits ≈ 42 bytes. Small but present. - bytes 8-11 `num_entry_point_offsets` u32 LE: **libva=0, kdirect=22**. BBB HEVC slice IS tiled / wavefront-parallel — kdirect submits 22 entry points; libva hardcodes 0. - byte 22: libva=0xff, kdirect=0x00. Per struct layout this is `slice_qp_delta` (s8): libva=-1, kdirect=0. ### libva source code — where the zeros come from `src/h265.c::h265_fill_sps` lines 110-111: ```c sps->sps_max_num_reorder_pics = 0; /* not exposed */ sps->sps_max_latency_increase_plus1 = 0; /* not exposed */ ``` VAAPI's `VAPictureParameterBufferHEVC` does not include these fields. Libva fills 0 by default. `src/h265.c::h265_fill_slice_params` line 356: ```c slice_params->num_entry_point_offsets = 0; /* iter2 doesn't do tiles */ ``` Hardcoded 0 with a comment from iter2 acknowledging the scope-cut. ## Conclusion Bug 5 is caused by **at least one** of: - SPS `sps_max_num_reorder_pics` hardcoded 0 (vs kdirect's 2 from SPS-NAL parse). Strongest suspect — directly blocks B-frame reordering. - SLICE_PARAMS `num_entry_point_offsets` hardcoded 0 (vs kdirect's 22). Affects tile/WPP parallel decode. - PPS UNIFORM_SPACING flag missing. Don't-care for non-tiled but BBB does use tiles per Diff 3. The classes are analogous to iter8's H.264 `constraint_set_flags` situation: VAAPI doesn't forward the field, libva fills a placeholder, the kernel cares. ## Phase 4 plan (preview) Single iter11 commit, 5-10 LOC change in h265.c: - Set `sps->sps_max_num_reorder_pics` from `picture->sps_max_dec_pic_buffering_minus1` (safe upper bound). - Leave `sps->sps_max_latency_increase_plus1 = 0` (spec default: "no constraint"). Secondary attempts if first doesn't fix: - Set PPS UNIFORM_SPACING flag when tiles are used (`num_tile_columns_minus1 > 0 || num_tile_rows_minus1 > 0`). - Parse slice header for `num_entry_point_offsets` (analogous to h264_slice_header.c). Costlier; defer to Phase 4b if needed. Phase 5 review will scrutinize: - Whether `sps_max_num_reorder_pics = sps_max_dec_pic_buffering_minus1` is safe vs producing too many reorder pictures. - Whether reaching for the bitstream's true value (SPS NAL parse) is needed for correctness. - That the change doesn't regress VP9 / VP8 / H.264 / MPEG-2.