diff --git a/phase4_iter2_plan.md b/phase4_iter2_plan.md new file mode 100644 index 0000000..480c792 --- /dev/null +++ b/phase4_iter2_plan.md @@ -0,0 +1,656 @@ +# Iteration 2 — Phase 4 (plan) + +Implementation plan for iter2 HEVC Main on rkvdec. Inputs: + +- [`phase0_findings_iter2.md`](phase0_findings_iter2.md) — Phase 1 lock (5 boolean criteria). +- [`phase2_iter2_situation.md`](phase2_iter2_situation.md) — six bugs identified in HEVC path. +- [`phase3_iter2_baseline.md`](phase3_iter2_baseline.md) — substrate verified post-upgrade, HEVC cross-validator anchor captured (5-control per-frame batch). + +Per `feedback_dev_process.md` Phase 6 contract-before-code: this plan opens with the contract clauses (kernel UAPI + FFmpeg reference + Phase 3 Baseline B verbatim citations), then specifies code changes that map 1:1 to those clauses. + +## Phase 1 criteria (re-stated; no Phase 3 → Phase 1 loopback this time) + +Per [`phase0_findings_iter2.md`](phase0_findings_iter2.md), all 5 criteria as locked. No Phase 3 surprises required adjustment (criterion 3 already anchored on ffmpeg-direct from the start, mirroring iter1's Phase 5 Q4 amendment). + +1. **vainfo enumeration regression**: `VAProfileHEVCMain` continues to be listed on the rkvdec env binding. (Already passes; iter2 must not strip.) +2. **vaCreateConfig success**: `vaCreateConfig(VAProfileHEVCMain, VAEntrypointVLD)` returns `VA_STATUS_SUCCESS`. (Currently `VA_STATUS_ERROR_UNSUPPORTED_PROFILE = 12`.) +3. **End-to-end ffmpeg-direct decode**: `ffmpeg -hwaccel vaapi -i bbb_720p10s_hevc.mp4 -frames:v 5 -f null -` exits 0; libva trace shows `vaCreateConfig SUCCESS`; no `Failed to create decode configuration` lines; no `EINVAL` from `VIDIOC_S_EXT_CTRLS`. +4. **DMA-BUF GL HW=SW byte-identical at +02s**: 2 distinct frames hash-equal across HW (`mpv --hwdec=vaapi --vo=image`) and SW (`--hwdec=no`); frames 1 vs 2 hash-differ (real motion). +5. 
**Regression on iter1 MPEG-2 AND T4 H.264**: both prior-iteration cells continue to pass with their reference hashes. + +## Contract clauses (cite-before-code) + +### Clause 1 — Per-frame batched VIDIOC_S_EXT_CTRLS with 5 controls + +**Authority**: Linux mainline `include/uapi/linux/v4l2-controls.h:2090-2300` defines the 5 mandatory + 2 device-wide + 1 conditional HEVC stateless controls (8 CIDs, offsets 400-407): + +```c +#define V4L2_CID_STATELESS_HEVC_SPS (V4L2_CID_CODEC_STATELESS_BASE+400) /* 0xa40a90 */ +#define V4L2_CID_STATELESS_HEVC_PPS (V4L2_CID_CODEC_STATELESS_BASE+401) /* 0xa40a91 */ +#define V4L2_CID_STATELESS_HEVC_SLICE_PARAMS (V4L2_CID_CODEC_STATELESS_BASE+402) /* 0xa40a92 */ +#define V4L2_CID_STATELESS_HEVC_SCALING_MATRIX (V4L2_CID_CODEC_STATELESS_BASE+403) /* 0xa40a93 */ +#define V4L2_CID_STATELESS_HEVC_DECODE_PARAMS (V4L2_CID_CODEC_STATELESS_BASE+404) /* 0xa40a94 */ +#define V4L2_CID_STATELESS_HEVC_DECODE_MODE (V4L2_CID_CODEC_STATELESS_BASE+405) /* 0xa40a95 */ +#define V4L2_CID_STATELESS_HEVC_START_CODE (V4L2_CID_CODEC_STATELESS_BASE+406) /* 0xa40a96 */ +#define V4L2_CID_STATELESS_HEVC_ENTRY_POINT_OFFSETS (V4L2_CID_CODEC_STATELESS_BASE+407) /* not iter2 — tile/wavefront */ +``` + +**Reference implementation**: FFmpeg `libavcodec/v4l2_request_hevc.c:505-565` (`v4l2_request_hevc_queue_decode`) builds a 5-element `v4l2_ext_control` array and submits via `ff_v4l2_request_decode_frame` (single `VIDIOC_S_EXT_CTRLS` per frame). 
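The hex ids annotated next to the defines above can be re-derived by hand. The following is a minimal sketch, assuming `V4L2_CTRL_CLASS_CODEC_STATELESS` is `0x00a40000` and `V4L2_CID_CODEC_STATELESS_BASE` is that class OR'd with `0x900`; the macro names and the `hevc_cid()` helper are local to this example and are not part of the backend.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of the UAPI values (assumed), so this builds without kernel headers. */
#define DEMO_CLASS_CODEC_STATELESS 0x00a40000u
#define DEMO_CODEC_STATELESS_BASE  (DEMO_CLASS_CODEC_STATELESS | 0x900u)

/* Hypothetical helper: CID for HEVC control offset n (400 = SPS ... 407 = ENTRY_POINT_OFFSETS). */
static inline uint32_t hevc_cid(uint32_t n)
{
    return DEMO_CODEC_STATELESS_BASE + n;
}

/* Compile-time cross-check of two of the annotated ids. */
static_assert((DEMO_CODEC_STATELESS_BASE + 400u) == 0x00a40a90u, "SPS id");
static_assert((DEMO_CODEC_STATELESS_BASE + 404u) == 0x00a40a94u, "DECODE_PARAMS id");
```

Offset 400 is `0x190`, so the SPS id is `0xa40900 + 0x190 = 0xa40a90`, which matches the ids seen in the Baseline B batch.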
+ +**Empirical anchor**: Phase 3 Baseline B strace verbatim ([`phase3_iter2_baseline.md`](phase3_iter2_baseline.md) + `phase0_evidence/2026-05-08/iter2_phase3/ffmpeg_v4l2req.strace.*` gitignored) shows: + +``` +ioctl(/dev/video1, VIDIOC_S_EXT_CTRLS, + {ctrl_class=0xf010000 /* V4L2_CTRL_WHICH_REQUEST_VAL */, + count=5, + controls=[ + {id=0xa40a90 SPS, size=40, ...}, + {id=0xa40a91 PPS, size=64, ...}, + {id=0xa40a92 SLICE_PARAMS, size=N, ...}, /* dynamic-array */ + {id=0xa40a93 SCALING_MATRIX, size=M, ...}, /* conditional on kernel availability */ + {id=0xa40a94 DECODE_PARAMS, size=328, ...} + ]}) = 0 +``` + +**Implication for iter2**: `h265_set_controls()` builds a 5-entry `struct v4l2_ext_control` array and submits via the existing `v4l2_set_controls(driver_data->video_fd, surface_object->request_fd, controls, 5)` API. One `VIDIOC_S_EXT_CTRLS` per frame, mirroring iter1 MPEG-2 + iter6/7/8 H.264 patterns. + +### Clause 2 — `v4l2_ctrl_hevc_sps` field layout (40 bytes) + +**Authority**: `:2096+` `struct v4l2_ctrl_hevc_sps`: + +```c +struct v4l2_ctrl_hevc_sps { + __u8 video_parameter_set_id; + __u8 seq_parameter_set_id; + __u16 pic_width_in_luma_samples; + __u16 pic_height_in_luma_samples; + __u8 bit_depth_luma_minus8; + __u8 bit_depth_chroma_minus8; + __u8 log2_max_pic_order_cnt_lsb_minus4; + __u8 sps_max_dec_pic_buffering_minus1; + __u8 sps_max_num_reorder_pics; + __u8 sps_max_latency_increase_plus1; + __u8 log2_min_luma_coding_block_size_minus3; + __u8 log2_diff_max_min_luma_coding_block_size; + __u8 log2_min_luma_transform_block_size_minus2; + __u8 log2_diff_max_min_luma_transform_block_size; + __u8 max_transform_hierarchy_depth_inter; + __u8 max_transform_hierarchy_depth_intra; + __u8 pcm_sample_bit_depth_luma_minus1; + __u8 pcm_sample_bit_depth_chroma_minus1; + __u8 log2_min_pcm_luma_coding_block_size_minus3; + __u8 log2_diff_max_min_pcm_luma_coding_block_size; + __u8 num_short_term_ref_pic_sets; + __u8 num_long_term_ref_pics_sps; + __u8 chroma_format_idc; + 
__u8 sps_max_sub_layers_minus1; + __u8 reserved[6]; + __u64 flags; +}; +``` + +Total 40 bytes (verified against Phase 3 Baseline B verbatim payload size). 9 boolean fields collapsed into u64 `flags`: + +```c +#define V4L2_HEVC_SPS_FLAG_SEPARATE_COLOUR_PLANE (1ULL << 0) +#define V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED (1ULL << 1) +#define V4L2_HEVC_SPS_FLAG_AMP_ENABLED (1ULL << 2) +#define V4L2_HEVC_SPS_FLAG_SAMPLE_ADAPTIVE_OFFSET (1ULL << 3) +#define V4L2_HEVC_SPS_FLAG_PCM_ENABLED (1ULL << 4) +#define V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED (1ULL << 5) +#define V4L2_HEVC_SPS_FLAG_LONG_TERM_REF_PICS_PRESENT (1ULL << 6) +#define V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED (1ULL << 7) +#define V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED (1ULL << 8) +``` + +**VAAPI source mapping** (mostly preserved from current `src/h265.c::h265_fill_sps`, just routed to `flags` collapsed bitmask): + +| New SPS field | Source: VAPictureParameterBufferHEVC `picture` | +|---|---| +| `pic_width_in_luma_samples` | `picture->pic_width_in_luma_samples` | +| `pic_height_in_luma_samples` | `picture->pic_height_in_luma_samples` | +| `bit_depth_luma_minus8` | `picture->bit_depth_luma_minus8` | +| `bit_depth_chroma_minus8` | `picture->bit_depth_chroma_minus8` | +| `chroma_format_idc` | `picture->pic_fields.bits.chroma_format_idc` | +| `log2_max_pic_order_cnt_lsb_minus4` | `picture->log2_max_pic_order_cnt_lsb_minus4` | +| `sps_max_dec_pic_buffering_minus1` | `picture->sps_max_dec_pic_buffering_minus1` | +| `sps_max_num_reorder_pics` | 0 (current code hardcodes; VAAPI doesn't expose) | +| `sps_max_latency_increase_plus1` | 0 (same) | +| `log2_min_luma_coding_block_size_minus3` | `picture->log2_min_luma_coding_block_size_minus3` | +| `log2_diff_max_min_luma_coding_block_size` | `picture->log2_diff_max_min_luma_coding_block_size` | +| `log2_min_luma_transform_block_size_minus2` | `picture->log2_min_transform_block_size_minus2` | +| `log2_diff_max_min_luma_transform_block_size` | 
`picture->log2_diff_max_min_transform_block_size` | +| `max_transform_hierarchy_depth_inter/intra` | same fields in VAAPI | +| `pcm_sample_bit_depth_luma_minus1`, etc. | same fields | +| `num_short_term_ref_pic_sets` | `picture->num_short_term_ref_pic_sets` | +| `num_long_term_ref_pics_sps` | `picture->num_long_term_ref_pic_sps` | +| `sps_max_sub_layers_minus1` | 0 (VAAPI doesn't expose; placeholder) | +| `video_parameter_set_id` | 0 (VAAPI doesn't expose) | +| `seq_parameter_set_id` | 0 (VAAPI doesn't expose) | +| `flags` (OR of:) | | +| `_SEPARATE_COLOUR_PLANE` | `picture->pic_fields.bits.separate_colour_plane_flag` | +| `_SCALING_LIST_ENABLED` | `picture->pic_fields.bits.scaling_list_enabled_flag` | +| `_AMP_ENABLED` | `picture->pic_fields.bits.amp_enabled_flag` | +| `_SAMPLE_ADAPTIVE_OFFSET` | `picture->slice_parsing_fields.bits.sample_adaptive_offset_enabled_flag` | +| `_PCM_ENABLED` | `picture->pic_fields.bits.pcm_enabled_flag` | +| `_PCM_LOOP_FILTER_DISABLED` | `picture->pic_fields.bits.pcm_loop_filter_disabled_flag` | +| `_LONG_TERM_REF_PICS_PRESENT` | `picture->slice_parsing_fields.bits.long_term_ref_pics_present_flag` | +| `_SPS_TEMPORAL_MVP_ENABLED` | `picture->slice_parsing_fields.bits.sps_temporal_mvp_enabled_flag` | +| `_STRONG_INTRA_SMOOTHING_ENABLED` | `picture->pic_fields.bits.strong_intra_smoothing_enabled_flag` | +| `reserved[6]` | zero (via `memset`) | + +**Phase 3 Baseline B verbatim sanity**: BBB SPS bytes decode to: 1280×720, 8-bit, 4:2:0, no PCM, flags=`SAMPLE_ADAPTIVE_OFFSET | STRONG_INTRA_SMOOTHING_ENABLED` (0x108). iter2 implementation must produce the same 40 bytes for this fixture (Phase 7 byte-compare check). + +### Clause 3 — `v4l2_ctrl_hevc_pps` field layout (64 bytes) + +**Authority**: `:2150+` `struct v4l2_ctrl_hevc_pps`. Total 64 bytes. 
19 boolean PPS fields collapsed into u64 `flags`: + +```c +#define V4L2_HEVC_PPS_FLAG_DEPENDENT_SLICE_SEGMENT_ENABLED (1ULL << 0) +#define V4L2_HEVC_PPS_FLAG_OUTPUT_FLAG_PRESENT (1ULL << 1) +#define V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED (1ULL << 2) +#define V4L2_HEVC_PPS_FLAG_CABAC_INIT_PRESENT (1ULL << 3) +#define V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED (1ULL << 4) +#define V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED (1ULL << 5) +#define V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED (1ULL << 6) +#define V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT (1ULL << 7) +#define V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED (1ULL << 8) +#define V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED (1ULL << 9) +#define V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED (1ULL << 10) +#define V4L2_HEVC_PPS_FLAG_TILES_ENABLED (1ULL << 11) +#define V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED (1ULL << 12) +#define V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED (1ULL << 13) +#define V4L2_HEVC_PPS_FLAG_PPS_LOOP_FILTER_ACROSS_SLICES_ENABLED (1ULL << 14) +#define V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_OVERRIDE_ENABLED (1ULL << 15) +#define V4L2_HEVC_PPS_FLAG_PPS_DISABLE_DEBLOCKING_FILTER (1ULL << 16) +#define V4L2_HEVC_PPS_FLAG_LISTS_MODIFICATION_PRESENT (1ULL << 17) +#define V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT (1ULL << 18) +``` + +**VAAPI source mapping**: extracted from BOTH `picture` (VAPictureParameterBufferHEVC) AND `slice` (VASliceParameterBufferHEVC for `dependent_slice_segment_flag`). 
The current `src/h265.c::h265_fill_pps` (lines 48-102) does the field extraction correctly; iter2 just collapses booleans into the new u64 `flags` bitmask: + +| New PPS field source | Old h265.c location | +|---|---| +| `pps->dependent_slice_segment_flag` (now `flags & DEPENDENT_SLICE_SEGMENT_ENABLED`) | `slice->LongSliceFlags.fields.dependent_slice_segment_flag` (line 54) | +| `pps->output_flag_present_flag` (now `flags & OUTPUT_FLAG_PRESENT`) | `picture->slice_parsing_fields.bits.output_flag_present_flag` | +| `pps->num_extra_slice_header_bits` (kept as field) | `picture->num_extra_slice_header_bits` | +| ... (15 more boolean field-to-flag conversions, mechanical) || +| `pps->init_qp_minus26` (kept) | `picture->init_qp_minus26` | +| `pps->diff_cu_qp_delta_depth` (kept) | `picture->diff_cu_qp_delta_depth` | +| `pps->pps_cb_qp_offset` (kept) | `picture->pps_cb_qp_offset` | +| `pps->pps_cr_qp_offset` (kept) | `picture->pps_cr_qp_offset` | +| `pps->num_tile_columns_minus1` (kept) | `picture->num_tile_columns_minus1` | +| `pps->num_tile_rows_minus1` (kept) | `picture->num_tile_rows_minus1` | +| `pps->pps_beta_offset_div2` (kept) | `picture->pps_beta_offset_div2` | +| `pps->pps_tc_offset_div2` (kept) | `picture->pps_tc_offset_div2` | +| `pps->log2_parallel_merge_level_minus2` (kept) | `picture->log2_parallel_merge_level_minus2` | +| Field added: `column_width_minus1[20]`, `row_height_minus1[22]`, `num_extra_slice_header_bits`, `reserved` | populate from VAAPI (or zero if VAAPI doesn't expose) | +| `flags` u64 with the 19 bits OR'd | (mechanical boolean collapse) | + +### Clause 4 — `v4l2_ctrl_hevc_slice_params` (variable; dynamic-array per frame) + +**Authority**: `` `struct v4l2_ctrl_hevc_slice_params`. Contains per-slice info: bit_size, data_bit_offset, slice_type, slice_pic_order_cnt, slice flags, QP deltas, ref_idx_l0/l1[15], pred_weight_table, num_entry_point_offsets, slice_segment_addr, etc. 
+ +**Phase 0 inventory** confirms rkvdec advertises: + +``` +hevc_slice_parameters 0x00a40a92 (hevc-slice-params): elems=1 dims=[600] flags=has-payload, dynamic-array +``` + +So kernel accepts up to 600 slice_params entries per submission. iter2's bbb_720p10s_hevc.mp4 fixture is x265-ultrafast — typical 1 slice per frame; multi-slice would still fit in the 600-entry envelope. + +**Submission shape**: `size = sizeof(struct v4l2_ctrl_hevc_slice_params) * num_slices_in_frame`. FFmpeg `libavcodec/v4l2_request_hevc.c:540-547` shows the pattern: + +```c +if (ctx->max_slice_params && controls->num_slice_params) { + control[count++] = (struct v4l2_ext_control) { + .id = V4L2_CID_STATELESS_HEVC_SLICE_PARAMS, + .ptr = controls->frame_slice_params, + .size = sizeof(*controls->frame_slice_params) * + FFMIN(controls->num_slice_params, ctx->max_slice_params), + }; +} +``` + +**libva backend behavioral change (NEW for iter2)**: VAAPI clients submit `VASliceParameterBufferType` once per slice via `vaRenderPicture`. The current `src/picture.c::codec_store_buffer:115-135` for HEVC `memcpy(&surface->params.h265.slice, …)` **overwrites** the previous slice's params. iter2 must change to **append**: each VASliceParameterBufferType arrival appends a new entry to a `params.h265.slices[N]` array, with `params.h265.num_slices++`. At end_picture, `h265_set_controls` reads the array and submits as one dynamic-array control. + +**VAAPI source mapping**: existing `src/h265.c::h265_fill_slice_params` (lines 160-365) does the field extraction per-slice correctly. iter2 preserves the extraction logic (NAL header parse, data_bit_offset bit-search, ref_idx, pred_weight) but routes per-slice into an array slot rather than a single struct. + +Critical: NAL header parsing at `h265.c:184-209` extracts `nal_unit_type` and `data_bit_offset` from the slice bitstream. **This logic is preserved** — the new V4L2 API still requires per-slice `bit_size` and `data_bit_offset`. 
The new struct keeps these fields (they're per-slice metadata, not per-frame). + +**Several fields MOVE OUT of slice_params**: the DPB array (`dpb[15]`) and `num_active_dpb_entries` / `num_rps_poc_st_curr_before/after` / `num_rps_poc_lt_curr` migrate to **DECODE_PARAMS** (Clause 6). iter2's per-slice fill no longer populates the DPB. + +### Clause 5 — `v4l2_ctrl_hevc_scaling_matrix` (size M; conditional submission) + +**Authority**: `` `struct v4l2_ctrl_hevc_scaling_matrix`. Contains 4 scaling lists (4×4, 8×8, 16×16, 32×32) for luma + chroma intra/inter — substantial struct. + +**Conditional submission per FFmpeg pattern**: query kernel availability once at init via `VIDIOC_QUERY_EXT_CTRL` for the SCALING_MATRIX CID. If the kernel advertises the control (rkvdec on fresnel does, per Phase 3 Baseline B), include it in every per-frame batch. If the kernel doesn't advertise it, omit it. + +**Phase 3 evidence**: BBB fixture's per-frame batch always contains SCALING_MATRIX (see Baseline B verbatim 30 occurrences across 5 frames + queries). FFmpeg gates on `ctx->has_scaling_matrix` set at init from `ff_v4l2_request_query_control_default_value(...SCALING_MATRIX)`. iter2 mirrors: probe at init, store boolean in the libva backend's per-context state, include in batch if true. + +**VAAPI source mapping**: `VAIQMatrixBufferHEVC` provides the four scaling lists (`ScalingList4x4[6][16]`, `ScalingList8x8[6][64]`, `ScalingList16x16[6][64]`, `ScalingList32x32[2][64]`) plus the DC lists (`ScalingListDC16x16[6]`, `ScalingListDC32x32[2]`). When `iqmatrix_set==true`, copy from VAAPI struct to V4L2 struct. When `iqmatrix_set==false`, populate with HEVC spec default scaling matrices (per ISO/IEC 23008-2 Tables 7-5 and 7-6: the 4×4 defaults and the DC values are flat 16, but the 8×8 and larger defaults are not flat, so they must be transcribed as tables rather than filled with a constant). + +The Phase 3 Baseline B SCALING_MATRIX verbatim payload has not been field-decoded yet (deferred to Phase 6 transcription); its bytes will be compared against the backend-generated payload at Phase 7 verification time. 
+ +### Clause 6 — `v4l2_ctrl_hevc_decode_params` field layout (328 bytes) + +**Authority**: `` `struct v4l2_ctrl_hevc_decode_params`. NEW in modern API (didn't exist in staging-era). Contains: + +- `pic_order_cnt_val` (s32) — current picture POC. +- `short_term_ref_pic_set_size`, `long_term_ref_pic_set_size` — RPS sizes. +- `num_active_dpb_entries` — count of valid DPB entries. +- `num_poc_st_curr_before/after, num_poc_lt_curr` — short-term + long-term ref counts. +- `poc_st_curr_before[16]`, `poc_st_curr_after[16]`, `poc_lt_curr[16]` — reference-set arrays for ref pic ordering; each element is an index into `dpb[]`. The arrays are 16 entries each (`V4L2_HEVC_DPB_ENTRIES_NUM_MAX`), which is the size consistent with the 328-byte total below. +- `dpb[16]` — DPB entries: `{timestamp, flags, field_pic, pic_order_cnt_val, _padding}` per entry (16 bytes each). +- `flags` (u64) — `IRAP_PIC`, `IDR_PIC`, `NO_OUTPUT_OF_PRIOR_PICS`, etc. + +Total **328 bytes** (verified against Phase 3 Baseline B verbatim payload size): 12 bytes of leading scalars + 3 × 16 index bytes + 4 bytes of count/reserved + 16 × 16-byte DPB entries + 8-byte `flags`. + +**VAAPI source mapping**: largely preserved from current `src/h265.c::h265_fill_slice_params` lines 269-315 (DPB iteration over `picture->ReferenceFrames[15]`), just routed to a new struct. The existing logic for `dpb[i].timestamp`, `dpb[i].rps`, `dpb[i].pic_order_cnt[0]`, `field_pic` migrates verbatim to `decode_params.dpb[i].timestamp` etc. The DPB-counting logic (`num_rps_poc_st_curr_before/after, num_rps_poc_lt_curr`) migrates to the `num_poc_*` fields of decode_params. + +**Submission**: per-frame, after SPS + PPS in the batch. 
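The 328-byte figure for Clause 6 can be checked at compile time. The sketch below uses local mirror structs written from memory of the mainline layout, with the three reference-set arrays at 16 entries each; the `_mirror` type names are hypothetical, and the field names (in particular the byte counted just before `reserved`) are assumptions to confirm against the installed `v4l2-controls.h`.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HEVC_DPB_ENTRIES_MAX 16 /* mirrors V4L2_HEVC_DPB_ENTRIES_NUM_MAX (assumed) */

/* Assumed mirror of the per-entry DPB struct: 8 + 1 + 1 + 2 + 4 = 16 bytes. */
struct hevc_dpb_entry_mirror {
    uint64_t timestamp;
    uint8_t  flags;
    uint8_t  field_pic;
    uint16_t reserved;
    int32_t  pic_order_cnt_val;
};

/* Assumed mirror of the decode-params control payload. */
struct hevc_decode_params_mirror {
    int32_t  pic_order_cnt_val;
    uint16_t short_term_ref_pic_set_size;
    uint16_t long_term_ref_pic_set_size;
    uint8_t  num_active_dpb_entries;
    uint8_t  num_poc_st_curr_before;
    uint8_t  num_poc_st_curr_after;
    uint8_t  num_poc_lt_curr;
    uint8_t  poc_st_curr_before[DEMO_HEVC_DPB_ENTRIES_MAX];
    uint8_t  poc_st_curr_after[DEMO_HEVC_DPB_ENTRIES_MAX];
    uint8_t  poc_lt_curr[DEMO_HEVC_DPB_ENTRIES_MAX];
    uint8_t  num_delta_pocs_of_ref_rps_idx;
    uint8_t  reserved[3];
    struct hevc_dpb_entry_mirror dpb[DEMO_HEVC_DPB_ENTRIES_MAX];
    uint64_t flags;
};

/* 12 + 48 + 4 + 256 + 8 = 328, the size observed in Baseline B. */
static_assert(sizeof(struct hevc_dpb_entry_mirror) == 16, "DPB entry size");
static_assert(sizeof(struct hevc_decode_params_mirror) == 328, "decode_params size");
```

In the backend itself the same kind of assertion could be applied to the real UAPI structs, which would turn a struct-size mismatch (the first loopback trigger in the pass/fail section) into a build failure instead of a runtime `EINVAL`.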
+ +### Clause 7 — Device-wide DECODE_MODE + START_CODE menu controls + +**Authority**: `` defines: + +```c +#define V4L2_CID_STATELESS_HEVC_DECODE_MODE (V4L2_CID_CODEC_STATELESS_BASE+405) +#define V4L2_CID_STATELESS_HEVC_START_CODE (V4L2_CID_CODEC_STATELESS_BASE+406) + +enum v4l2_stateless_hevc_decode_mode { + V4L2_STATELESS_HEVC_DECODE_MODE_SLICE_BASED, + V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED, +}; +enum v4l2_stateless_hevc_start_code { + V4L2_STATELESS_HEVC_START_CODE_NONE, + V4L2_STATELESS_HEVC_START_CODE_ANNEX_B, +}; +``` + +**Phase 0 inventory** confirms fresnel rkvdec advertises: + +``` +hevc_decode_mode 0x00a40a95 (menu): min=1 max=1 default=1 (Frame-Based) flags=has-min-max +hevc_start_code 0x00a40a96 (menu): min=1 max=1 default=1 (Annex B Start Code) flags=has-min-max +``` + +So rkvdec accepts ONLY `FRAME_BASED` decode mode and `ANNEX_B` start code — same constraints as H.264 + MPEG-2. Set both at decoder init via `v4l2_set_controls(driver_data->video_fd, /* request_fd= */ -1, dev_ctrls, 2)` with values `FRAME_BASED` + `ANNEX_B`. + +**Where to set**: extend `src/context.c:142-155`'s existing H.264 device-init block to also set HEVC's two device controls when context is HEVC-profile-bound. Current pattern: 2 ext_controls in one batched call with `request_fd=-1`. iter2 adds 2 more controls (or a separate call) for the HEVC variants. + +Alternative: set them inside `h265_set_controls` once per context (with a "first call" guard). Cleaner location-wise but requires per-context state. Phase 6 implementer chooses. + +### Clause 8 — `RequestCreateConfig` HEVCMain case must `break;` + +**Authority**: C language semantics. `src/config.c:67` `case VAProfileHEVCMain:` falls through to `default:` (line 68) which returns the error. iter1 added `break;` for MPEG-2 cases; HEVCMain is the last case in the same fall-through bucket. 
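The fall-through that Clause 8 relies on can be reproduced in isolation. This is a minimal sketch with hypothetical stand-in enums (`demo_profile`, `DEMO_*`) rather than the real VAAPI types; only the value 12 is taken from the plan, where it is given for `VA_STATUS_ERROR_UNSUPPORTED_PROFILE`.

```c
#include <assert.h>

/* Hypothetical stand-ins for the VAAPI profile enum and status codes. */
enum demo_profile { DEMO_H264_MAIN, DEMO_MPEG2_MAIN, DEMO_HEVC_MAIN, DEMO_OTHER };
enum { DEMO_SUCCESS = 0, DEMO_UNSUPPORTED_PROFILE = 12 };

/* Before: the HEVC label has no break, so control falls into default. */
static int create_config_before(enum demo_profile profile)
{
    switch (profile) {
    case DEMO_H264_MAIN:
    case DEMO_MPEG2_MAIN:
        break;
    case DEMO_HEVC_MAIN:
    default:
        return DEMO_UNSUPPORTED_PROFILE;
    }
    return DEMO_SUCCESS;
}

/* After: break ends the HEVC case before default is reached. */
static int create_config_after(enum demo_profile profile)
{
    switch (profile) {
    case DEMO_H264_MAIN:
    case DEMO_MPEG2_MAIN:
        break;
    case DEMO_HEVC_MAIN:
        break;
    default:
        return DEMO_UNSUPPORTED_PROFILE;
    }
    return DEMO_SUCCESS;
}
```

Only the HEVC path changes between the two variants; the other supported profiles and the unsupported default behave identically, which is why the change is expected to leave the iter1 cells untouched.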
+ +**Empirical anchor**: Phase 3 Baseline D verified the patch shape in scratch — adding `break;` for HEVCMain lets `vaCreateConfig` return `VA_STATUS_SUCCESS` without affecting iter1 MPEG-2 or T4 H.264 hashes. + +**Fix shape**: 5 lines (case label preserved; comment + break added; matches iter1 Commit A pattern verbatim). + +### Clause 9 — `picture.c::codec_set_controls` HEVCMain dispatch + +**Authority**: existing `src/picture.c:186-191` MPEG-2 dispatch pattern from iter1: + +```c +case VAProfileMPEG2Simple: +case VAProfileMPEG2Main: + rc = mpeg2_set_controls(driver_data, context, surface_object); + if (rc < 0) return VA_STATUS_ERROR_OPERATION_FAILED; + break; +``` + +iter2 replaces the explicit `case VAProfileHEVCMain: return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;` (lines 204-206) with the same shape, dispatching to `h265_set_controls`. Comment updated to remove the stale `Fourier-local: HEVC stripped, no HW support on RK3566.` reference. + +### Clause 10 — Per-slice accumulation in `codec_store_buffer` + +**Authority**: HEVC kernel API requires per-slice slice_params (Clause 4). VAAPI clients submit `VASliceParameterBufferType` once per slice via `vaRenderPicture`. The current `src/picture.c:115-135` for HEVC `VASliceParameterBufferType` does: + +```c +case VAProfileHEVCMain: + memcpy(&surface_object->params.h265.slice, buffer_object->data, sizeof(...)); + break; +``` + +**Behavior change**: replace single-slot copy with array-append: + +```c +case VAProfileHEVCMain: + if (surface_object->params.h265.num_slices < HEVC_MAX_SLICES_PER_FRAME) { + memcpy(&surface_object->params.h265.slices[surface_object->params.h265.num_slices], + buffer_object->data, + sizeof(VASliceParameterBufferHEVC)); + surface_object->params.h265.num_slices++; + } else { + /* exceeded array bound — log and drop; Phase 7 verification flags */ + } + break; +``` + +`HEVC_MAX_SLICES_PER_FRAME` = e.g. 64 (kernel max is 600; conservative). 
For the BBB fixture this maxes at 1 per frame; the bound is for safety. + +**At BeginPicture**: reset `num_slices = 0` per-frame. Currently `picture.c:287` only resets `params.h264.matrix_set = false`; iter2 adds `params.h265.num_slices = 0` reset for HEVC surfaces. (Or per-profile: switch on `config_object->profile` and reset accordingly. iter2 adds `params.h265.num_slices = 0` unconditionally for now — benign for non-HEVC since the union aliasing puts num_slices in a region overwritten by RenderPicture's per-buffer copies.) + +## Diff scope + +### File 1: `src/config.c` — add `break;` for HEVCMain case (5 lines) + +```diff +@@ -68,6 +68,11 @@ VAStatus RequestCreateConfig(VADriverContextP context, VAProfile profile, + // submission time. + break; + case VAProfileHEVCMain: ++ // fresnel-fourier iter2: HEVC enabled. Same shape as H.264/ ++ // MPEG-2 above — no profile-specific config validation in the ++ // libva backend; validation happens at vaCreateContext / ++ // control submission time. ++ break; + default: + return VA_STATUS_ERROR_UNSUPPORTED_PROFILE; +``` + +### File 2: `src/picture.c` — replace HEVCMain reject with dispatch + per-slice slice_params accumulation (~25 lines) + +Two distinct changes: + +(a) **Dispatch HEVCMain in `codec_set_controls`** (lines 204-206): + +```diff +- case VAProfileHEVCMain: +- /* Fourier-local: HEVC stripped, no HW support on RK3566. 
*/ +- return VA_STATUS_ERROR_UNSUPPORTED_PROFILE; ++ case VAProfileHEVCMain: ++ rc = h265_set_controls(driver_data, context, surface_object); ++ if (rc < 0) ++ return VA_STATUS_ERROR_OPERATION_FAILED; ++ break; +``` + +(b) **Per-slice accumulation in `codec_store_buffer`** (HEVC VASliceParameterBufferType case, lines 127-131): + +```diff +- case VAProfileHEVCMain: +- memcpy(&surface_object->params.h265.slice, +- buffer_object->data, +- sizeof(surface_object->params.h265.slice)); +- break; ++ case VAProfileHEVCMain: { ++ unsigned int n = surface_object->params.h265.num_slices; ++ if (n < HEVC_MAX_SLICES_PER_FRAME) { ++ memcpy(&surface_object->params.h265.slices[n], ++ buffer_object->data, ++ sizeof(VASliceParameterBufferHEVC)); ++ surface_object->params.h265.num_slices = n + 1; ++ } ++ /* note: also keep .slice (singular) populated as last-slice ++ * mirror for h265_fill_pps which reads dependent_slice_segment_flag ++ * from VASliceParameterBufferHEVC->LongSliceFlags */ ++ memcpy(&surface_object->params.h265.slice, ++ buffer_object->data, ++ sizeof(surface_object->params.h265.slice)); ++ break; ++ } +``` + +(c) **Reset `num_slices` in `RequestBeginPicture`** at line 287: + +```diff + surface_object->params.h264.matrix_set = false; ++ surface_object->params.h265.num_slices = 0; +``` + +### File 3: `src/surface.h` — extend `params.h265` to hold slice_params array + +Add inside the `union { ... } params` block: + +```diff + struct { + VAPictureParameterBufferHEVC picture; + VASliceParameterBufferHEVC slice; ++ VASliceParameterBufferHEVC slices[HEVC_MAX_SLICES_PER_FRAME]; ++ unsigned int num_slices; + VAIQMatrixBufferHEVC iqmatrix; + bool iqmatrix_set; + } h265; +``` + +`HEVC_MAX_SLICES_PER_FRAME` = `64` defined in surface.h (or h265.h). Total memory cost: `sizeof(VASliceParameterBufferHEVC)` ≈ 264 bytes × 64 = ~17 KB extra per surface union — significant but acceptable. 
+ +Alternative (smaller memory): heap-allocate `slices` array dynamically (malloc on first slice arrival, realloc on grow, free at surface destroy). More plumbing; defer to Phase 4 plan revision if Phase 7 surfaces memory concerns. iter2 default: fixed-size array of 64 embedded in the surface's `params` union (no heap allocation). + +### File 4: `src/h265.c` — full rewrite against new split API (~400 lines) + +Per Clauses 2-7. The bulk of iter2 work. Structure mirrors current h265.c but routes to new struct layouts: + +- `h265_fill_sps()` → fill `struct v4l2_ctrl_hevc_sps` (40 bytes, flags collapsed). ~40 lines. +- `h265_fill_pps()` → fill `struct v4l2_ctrl_hevc_pps` (64 bytes, flags collapsed). ~50 lines. +- `h265_fill_slice_params()` → fill ONE `struct v4l2_ctrl_hevc_slice_params` (per-slice; called from a loop in h265_set_controls over surface->params.h265.slices[]). ~80 lines (preserves NAL header parse, data_bit_offset bit-search, ref_idx, pred_weight). +- **NEW** `h265_fill_decode_params()` → fill `struct v4l2_ctrl_hevc_decode_params` (328 bytes: DPB array, POC, num_active_dpb_entries, etc.). ~60 lines. +- **NEW** `h265_fill_scaling_matrix()` → fill `struct v4l2_ctrl_hevc_scaling_matrix` from `VAIQMatrixBufferHEVC` (or spec defaults if `iqmatrix_set==false`). ~30 lines. +- **NEW** `h265_init_device_controls()` → set DECODE_MODE + START_CODE menus once per context. ~15 lines. Called from h265_set_controls with first-call guard, OR from context.c device-init block. +- `h265_set_controls()` → orchestrator: build SPS, PPS, all slice_params (loop over array), DECODE_PARAMS, SCALING_MATRIX (conditional on init-time probe); submit batched. ~50 lines. + +Plus the static const default scaling matrices (luma + chroma intra/inter, 4 × 64 bytes per scan-size with extra DC values) for the iqmatrix_set==false branch. Per Phase 5 Lesson L2 (`feedback_review_empirical_over_theoretical.md`): transcribe from Phase 3 Baseline B SCALING_MATRIX verbatim payload, NOT from spec recall. 
Phase 6 protocol: capture the BBB SCALING_MATRIX bytes via verbose strace, decode into the four 64-byte arrays, transcribe with byte-equality assertion. + +### File 5: `src/h265.h` — re-enable + +`meson.build:73` currently has the `'h265.h'` entry commented out (`# 'h265.h'`). Uncomment it. + +`h265.h` exposes only the `int h265_set_controls(...)` declaration; the new helpers (`h265_fill_decode_params`, `h265_fill_scaling_matrix`, `h265_init_device_controls`) stay file-static. + +### File 6: `src/meson.build` — uncomment h265.c + h265.h + +```diff +@@ -47,7 +47,7 @@ sources = [ + 'request_pool.c', + 'cap_pool.c', +-# 'h265.c' ++ 'h265.c' + ] +@@ -70,7 +70,7 @@ headers = [ + 'cap_pool.h', +-# 'h265.h' ++ 'h265.h' + ] +``` + +### File 7: `src/context.c` — extend device-init for HEVC (optional) + +**Decision (defer to Phase 6 implementer)**: either extend `src/context.c:142-155`'s device-init block to also set the HEVC `DECODE_MODE` + `START_CODE` controls (this would fire EINVAL on hantro-vpu-dec, same as the existing H.264 controls; auxiliary noise, intentionally swallowed by `(void)v4l2_set_controls`), or set them inside `h265_set_controls` on first call. + +Lower-risk path: extend context.c's existing block (mirrors the existing pattern, minimal new code). This accepts the cosmetic EINVAL noise on non-HEVC devices but matches existing behavior. Phase 6 default: extend context.c. + +### File 8: `include/hevc-ctrls.h` — leave as-is + +The 9-line shim is harmless (per Phase 2 Bug 7 verify-only). NOT deleted in iter2 (lower-risk path; iter1 Phase 5 Nit 6 deferral continues). + +## Phase 6 implementation order + +Phase 6 lands in 2 logical commits + optional fix-forward: + +1. **Commit A — `src/config.c` HEVCMain break**: 5-line diff. Verifies the substrate fix in isolation (Phase 3 Baseline D already proved it). Phase 7 partial verification: criteria 1 + 2 should pass (vainfo enum unchanged, `vaCreateConfig` SUCCESS), and criterion 5 should be unaffected (Baseline D showed the `break;` does not change the iter1 MPEG-2 or T4 H.264 hashes); criteria 3-4 still fail because the picture.c reject is in place. + +2. 
**Commit B — h265.c rewrite + picture.c HEVCMain dispatch + slice_params accumulation + meson re-enable + surface.h extension + context.c device-init extension**: the bulk of iter2 work. Phase 7 verification: all 5 criteria green. + +3. **Commit C (optional)** — fix-forward if Phase 7 surfaces a regression. Per [`memory/feedback_header_deletion_check.md`](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_header_deletion_check.md), iter2 doesn't delete `hevc-ctrls.h`, so the iter1 Commit-D-style header-completeness oversight doesn't apply. Other fix-forward triggers are Phase 7 → Phase 4 loopback signals; pre-identified below. + +Implementation strategy for Commit B: develop incrementally inside h265.c with `printf` instrumentation showing each per-frame fill (SPS struct hex dump, PPS, decode_params, slice_params count, scaling_matrix presence). After build passes and mpv-vaapi runs without crash, decode 2 frames and compare HW vs SW JPEG hashes. Iterate until match. Strip instrumentation at close (per [`phase8_iteration1_close.md`](phase8_iteration1_close.md) iter1 sweep precedent). + +## Phase 7 verification harness + +Re-uses iter1's 5-criterion shape with HEVC fixture substituted. All 5 run in one pass; raw output captured to `phase0_evidence/2026-05-08-or-later/iter2_phase7/`. 
+ +```bash +# Re-build + install +ssh fresnel ' +cd ~/src/libva-v4l2-request-fourier +git pull --ff-only +ninja -C build && sudo ninja -C build install +sha256sum /usr/lib/dri/v4l2_request_drv_video.so +' + +# Criterion 1: vainfo lists VAProfileHEVCMain on rkvdec bind +ssh fresnel ' +LIBVA_DRIVER_NAME=v4l2_request \ +LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \ +LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \ +vainfo --display drm --device /dev/dri/renderD128 2>&1 | \ + grep -E "VAProfileHEVCMain" +' + +# Criteria 2 + 3: vaCreateConfig + ffmpeg-direct decode +ssh fresnel ' +mkdir -p /tmp/iter2_phase7 +LIBVA_DRIVER_NAME=v4l2_request \ +LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \ +LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \ +LIBVA_TRACE=/tmp/iter2_phase7/libva.trace \ +ffmpeg -hide_banner -loglevel info -hwaccel vaapi \ + -i ~/fourier-test/bbb_720p10s_hevc.mp4 -frames:v 5 -f null - +' +# Expected: exit 0, no Failed-to-create-decode-config, libva trace +# shows vaCreateConfig SUCCESS, no EINVAL on S_EXT_CTRLS. + +# Criterion 4: DMA-BUF GL HW vs SW byte-identical at +02s +ssh fresnel ' +mkdir -p /tmp/iter2_phase7/png_hw /tmp/iter2_phase7/png_sw +WAYLAND_DISPLAY=wayland-0 XDG_RUNTIME_DIR=/run/user/1000 \ +LIBVA_DRIVER_NAME=v4l2_request \ +LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \ +LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \ +mpv --hwdec=vaapi --frames=2 --vo=image --no-audio \ + --no-input-default-bindings --start=00:00:02 \ + --vo-image-outdir=/tmp/iter2_phase7/png_hw \ + ~/fourier-test/bbb_720p10s_hevc.mp4 + +mpv --hwdec=no --frames=2 --vo=image --no-audio \ + --no-input-default-bindings --start=00:00:02 \ + --vo-image-outdir=/tmp/iter2_phase7/png_sw \ + ~/fourier-test/bbb_720p10s_hevc.mp4 + +sha256sum /tmp/iter2_phase7/png_hw/*.jpg /tmp/iter2_phase7/png_sw/*.jpg +' +# Expected: HW frame 1 hash == SW frame 1 hash; HW frame 2 hash == +# SW frame 2 hash; frame 1 hash != frame 2 hash (real motion). 
+# Per memory feedback_rockchip_pixel_verify_path.md — DMA-BUF GL is
+# the cache-coherency-safe verifier; do NOT use ffmpeg-vaapi+hwdownload
+# (cache-stale class on RK3399 for both H.264 + MPEG-2; HEVC expected same).
+
+# Criterion 5: iter1 MPEG-2 + T4 H.264 reference hashes still match
+ssh fresnel '
+# H.264 (T4 reference)
+mkdir -p /tmp/iter2_phase7/h264_hw /tmp/iter2_phase7/h264_sw
+LIBVA_DRIVER_NAME=v4l2_request \
+LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \
+LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \
+mpv --hwdec=vaapi --frames=2 --vo=image --no-audio \
+  --no-input-default-bindings --start=00:00:30 \
+  --vo-image-outdir=/tmp/iter2_phase7/h264_hw \
+  ~/fourier-test/bbb_1080p30_h264.mp4
+mpv --hwdec=no --frames=2 --vo=image --no-audio \
+  --no-input-default-bindings --start=00:00:30 \
+  --vo-image-outdir=/tmp/iter2_phase7/h264_sw \
+  ~/fourier-test/bbb_1080p30_h264.mp4
+
+# MPEG-2 (iter1 reference)
+mkdir -p /tmp/iter2_phase7/mpeg2_hw /tmp/iter2_phase7/mpeg2_sw
+LIBVA_DRIVER_NAME=v4l2_request \
+LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 \
+LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1 \
+mpv --hwdec=vaapi --frames=2 --vo=image --no-audio \
+  --no-input-default-bindings --start=00:00:02 \
+  --vo-image-outdir=/tmp/iter2_phase7/mpeg2_hw \
+  ~/fourier-test/bbb_720p10s_mpeg2.ts
+mpv --hwdec=no --frames=2 --vo=image --no-audio \
+  --no-input-default-bindings --start=00:00:02 \
+  --vo-image-outdir=/tmp/iter2_phase7/mpeg2_sw \
+  ~/fourier-test/bbb_720p10s_mpeg2.ts
+
+sha256sum /tmp/iter2_phase7/h264_hw/*.jpg /tmp/iter2_phase7/h264_sw/*.jpg \
+  /tmp/iter2_phase7/mpeg2_hw/*.jpg /tmp/iter2_phase7/mpeg2_sw/*.jpg
+'
+# Expected:
+#   H.264 frames at +30s: f623d5f7... (frame 1) and 7d7bc6f2... (frame 2)
+#   MPEG-2 frames at +02s: 6e7873030dbf... (frame 1) and ccc7ce08810d... (frame 2)
+
+# Bonus byte-compare: post-fix S_EXT_CTRLS payload vs Baseline B verbatim
+ssh fresnel '
+mkdir -p /tmp/iter2_phase7/cross
+LIBVA_DRIVER_NAME=v4l2_request \
+LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \
+LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \
+strace -ff -tt -y -v -s 8192 -e trace=ioctl \
+  -o /tmp/iter2_phase7/cross/ffmpeg.strace \
+  ffmpeg -hide_banner -loglevel error -hwaccel vaapi \
+  -i ~/fourier-test/bbb_720p10s_hevc.mp4 -frames:v 2 -f null -
+grep "VIDIOC_S_EXT_CTRLS.*ctrl_class=0xf010000.*count=5" \
+  /tmp/iter2_phase7/cross/ffmpeg.strace.* | head -2
+'
+# Expected per Baseline B: per frame, count=5 with ids 0xa40a90/91/92/93/94
+# in order; the first 40 SPS bytes should match Baseline B's BBB-SPS verbatim
+# (1280x720, 8-bit, 4:2:0, flags=SAO|STRONG_INTRA_SMOOTHING).
+```
+
+## Pass/fail decision
+
+All 5 criteria PASS → Phase 7 closes green; proceed to Phase 8 (memory update + close iter2).
+
+Any criterion FAIL → Phase 7 → Phase 4 loopback per `feedback_dev_process.md`. Pre-identified loopback triggers:
+
+1. **`VIDIOC_S_EXT_CTRLS` returns EINVAL post-fix on per-frame batch**. Likely causes:
+   - Struct size mismatch between iter2's stack-allocated structs and kernel-expected sizes. Mitigation: `pahole` against kernel UAPI; compare to Phase 3 Baseline B verbatim sizes (40 + 64 + 328 = 432 bytes for the fixed-size controls).
+   - SCALING_MATRIX size encoding wrong (depends on whether kernel expects fixed or runtime-discovered size).
+   - Reserved fields not zeroed (`memset` was forgotten on a struct).
+
+2. **HW pixel hashes differ from SW**. Likely causes:
+   - DPB ordering wrong (FFmpeg populates `poc_st_curr_before/after` in specific order; iter2's translation from VAAPI ReferenceFrames must match).
+   - Slice_params bit_size or data_bit_offset off-by-N from NAL header byte alignment quirks (preserved logic from old h265.c, but the dynamic-array shape might affect slice boundaries).
+   - SPS/PPS flags bitmask wrong bit position (e.g., `_SAMPLE_ADAPTIVE_OFFSET` is bit 3, not bit 4 — easy off-by-1).
+   - SCALING_MATRIX values wrong (transcribed from spec rather than from Baseline B verbatim — per Lesson L2, this is the common trap).
+
+3. **mpv `--hwdec=vaapi` filters HEVC out** (analogous to vaapi-copy filtering MPEG-2). Mitigation: per Phase 5 Q4 amendment in iter1, fall forward to the ffmpeg `-vf hwdownload` path, keeping in mind the cache-stale caveat recorded against `hwdownload` in the criterion 4 harness comment. Less likely than for MPEG-2 because mpv-vaapi DID engage MPEG-2 in iter1.
+
+4. **iter1 MPEG-2 OR T4 H.264 regression**. Bug 1 + picture.c HEVCMain dispatch must not touch MPEG-2 / H.264 paths. Mitigation: verify Phase 3 Baseline D-style scratch was scoped right; re-read the diffs against the dispatch tables.
+
+5. **Slice_params dynamic-array submission shape rejected by kernel**. Possible if kernel expects `count` as element count rather than `size` as bytes (the kernel UAPI might want a different size encoding). Mitigation: cross-validator anchor in Phase 3 Baseline B has the verbatim `size=N` value for one frame's batch; iter2's submission must produce a matching size for matching slice count. If dynamic-array semantics are confusing, FFmpeg `v4l2_request_hevc.c:540-547` has the canonical pattern.
+
+6. **SCALING_MATRIX availability detection wrong**. iter2 assumes the kernel always advertises the control (matches Baseline B). If the kernel on a different host (e.g., ohm) doesn't advertise it, the unconditional submission would fail. Mitigation: probe via `VIDIOC_QUERY_EXT_CTRL` at h265_init_device_controls; gate inclusion in batch on probe result. **Defer this defensive path to Phase 6 if Phase 3 Baseline B is anchor enough**.
+
+7. **Latent bug B3 (h264.matrix_set=false writes inside h265.picture)** — for HEVC surfaces, byte 240 of the `params` union lands inside `h265.picture` (Phase 2 Bug 8 verified). RenderPicture's `VAPictureParameterBufferType` per-frame copy overwrites it. Iter1 Bug 8 documentation explains the masking; iter2 inherits the same masking via ffmpeg-vaapi sender pattern (always sends VAPictureParameterBufferType per frame). If a VAAPI client turns up that does not send per-frame picture params, iter2 won't catch it; this is the same latent bug as in iter1.
+
+## Out of scope (LOCKED for iter2)
+
+- VP9, VP8 work (iter3/iter4).
+- HEVC Main 10 (10-bit) profile.
+- HEVC Main Still Picture profile.
+- HEVC range extensions (SCC, REXT) — `EXT_SPS_ST_RPS`, `EXT_SPS_LT_RPS` controls.
+- HEVC tile / wavefront parallel processing — `ENTRY_POINT_OFFSETS` control.
+- Performance metrics (Phase 1+ separate iteration).
+- Long-duration HEVC stress (>10s).
+- Slice-mode decoding (`SLICE_BASED` decode mode) — rkvdec only does FRAME_BASED.
+- Phase 4 cross-cutting backlog items B1 (V4L2 device-discovery), B3 (BeginPicture profile-aware reset), B4 (context.c log suppression), B5 (vbv_buffer_size negotiation), L3 (vaDeriveImage cache-stale fix).
+- chromium-fourier 149 install on fresnel.
+- Upstream Linux engagement.
+- `include/hevc-ctrls.h` deletion (carries forward from iter1 Phase 5 Nit 6).
+
+## Phase 5 entry point
+
+Phase 5 (second-model review) inputs: this plan + the Phase 3 Baseline B verbatim payloads. Per `feedback_dev_process.md`:
+
+> Goal, situation, measurements, plan get pasted into DokuWiki. Markus reviews and redacts, then initiates the handover to a fresh model instance. Claude does not curate the artifact going to the reviewer — that would re-introduce the blind-spot accumulation the review is meant to escape. Do not summarize when handing over; paste the actual artifacts.
+
+Concretely: artifacts to hand over are the four primary documents in this campaign repo (`phase0_findings_iter2.md`, `phase2_iter2_situation.md`, `phase3_iter2_baseline.md`, `phase4_iter2_plan.md`) plus the `phase0_evidence/2026-05-08/iter2_phase3/` raw output. No summary, no executive overview, no "the gist is" framing — Markus has the raw bundle, the reviewer reads it directly.
+
+Per [`memory/feedback_review_empirical_over_theoretical.md`](../../.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/feedback_review_empirical_over_theoretical.md): when the reviewer flags a numerical mismatch, the right response is "I'll empirically check during Phase 7" — NOT a same-day source-read rebuttal.
+
+## Predicted iter2 outcome
+
+The fix is structurally larger than iter1 (10 contract clauses vs 6) but bounded:
+
+- Trivial: Bugs 1, 8, 9 (config break + meson re-enable + dispatch) total ~15 lines.
+- Substantial: Bugs 3, 4, 5, 7, 10 (h265.c rewrite + DECODE_PARAMS + SCALING_MATRIX + slice_params dynamic-array + per-slice accumulation in picture.c) — ~400 lines combined.
+
+Expected Phase 7 outcome: criteria 1+2 pass after Commit A. Criteria 3+4+5 pass after Commit B. Likely 1-2 Phase 7 → Phase 4 loopbacks for off-by-one bit positions in flags bitmasks or DPB ordering nuances. Phase 8 close estimated to land 4-6 commits on the fork (vs iter1's 4).
+
+If a major surprise fires (e.g., slice_params dynamic-array submission requires a different ioctl path, or scaling_matrix structure differs significantly between FFmpeg and kernel UAPI), Phase 7 → Phase 4 → Phase 2 loopback to source-read deeper. Substrate is well-understood; major surprises unlikely.