# Iteration 8: Phase 4b plan (α-1 SPS constraint_set_flags fix)

Drafted 2026-05-13 after Phase 7 γ + IMP-1 narrowed Bug 4 to kernel-side partial decode. Phase 4b proposes the targeted α-1 fix: per-profile derivation of `sps.constraint_set_flags` in `h264_set_controls`.

## Empirical anchor for the fix

Phase 3 strace evidence (phase3_iter8_findings.md lines 70-78): the ONLY meaningful SPS byte diff between libva (broken) and kdirect (working) is `constraint_set_flags`. All other 1047 bytes of the SPS are identical.

```
libva:    M\0 ... (constraint_set_flags = 0x00)
kdirect:  M\2 ... (constraint_set_flags = 0x02)
```

Phase 7 narrowing: the kernel writes only 512 bytes of luma-neutral data + UV scratch markers, deterministic across pre-zeroed runs. kdirect succeeds and libva fails, and the only documented control diff between them is constraint_set_flags = 0x00 vs 0x02. That makes this byte the leading hypothesis for "what triggers rkvdec's early-out."

## What constraint_set_flags means

The flag semantics come from ITU-T H.264 §7.4.2.1.1 (SPS RBSP semantics). The bit positions below are those of the `constraint_set_flags` control field; in the bitstream itself, `constraint_set0_flag` is the most significant bit of the byte that follows `profile_idc`.

- bit 0: `constraint_set0_flag`, Baseline conformance.
- bit 1: `constraint_set1_flag`, Main conformance subset.
- bit 2: `constraint_set2_flag`, Extended conformance subset.
- bit 3: `constraint_set3_flag`, multiple-use indicator (Baseline/Extended/High).
- bit 4: `constraint_set4_flag`, High-profile constraint.
- bit 5: `constraint_set5_flag`, High-profile constraint.
- bits 6-7: reserved.

These are informational bits in the H.264 bitstream. The spec does NOT require a decoder to read them. However, **kernel decoders for stateless V4L2 have been observed to use them as profile-detection hints**: a profile_idc=77 (Main) stream with constraint_set_flags=0 might be classified differently from one with constraint_set_flags=0x02. The working hypothesis is that rkvdec's hardware decoder picks an internal decoder-config table based on the profile plus the constraint hints; if the table for "Main with bit 1 unset" is unmapped, the hardware may abort early.
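As a minimal sketch of the bit layout listed above, the masks below mirror that list one-to-one. The names `CSF_SET0` ... `CSF_RESERVED`, `csf_has_main_hint`, and `csf_has_reserved_bits` are hypothetical and exist only for illustration; the kernel uAPI headers are understood to carry their own constants for the same bits, and real code should prefer those.

```c
#include <stdint.h>

/* Hypothetical mask names (not from the source tree); bit positions
 * follow the list above: bit N corresponds to constraint_setN_flag. */
enum {
    CSF_SET0     = 1u << 0, /* Baseline conformance */
    CSF_SET1     = 1u << 1, /* Main conformance subset */
    CSF_SET2     = 1u << 2, /* Extended conformance subset */
    CSF_SET3     = 1u << 3, /* multiple-use indicator */
    CSF_SET4     = 1u << 4, /* High-profile constraint */
    CSF_SET5     = 1u << 5, /* High-profile constraint */
    CSF_RESERVED = 0xC0     /* bits 6-7 */
};

/* Returns 1 if the Main-conformance hint (bit 1) is set, else 0. */
static int csf_has_main_hint(uint8_t flags)
{
    return (flags & CSF_SET1) != 0;
}

/* Returns 1 if any reserved bit (6 or 7) is set, else 0. */
static int csf_has_reserved_bits(uint8_t flags)
{
    return (flags & CSF_RESERVED) != 0;
}
```

With these masks, `csf_has_main_hint(0x00)` is false and `csf_has_main_hint(0x02)` is true, which is exactly the libva vs kdirect difference in the Phase 3 diff. A reserved-bit check of this kind is also a cheap sanity test for any value a backend derives on its own.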
## VAAPI gap: constraint_set_flags is not forwarded

VAAPI's `VAPictureParameterBufferH264` and `VAConfigParameterBuffer` do NOT include constraint_set_flags. ffmpeg-vaapi (the consumer) parses the SPS NAL client-side but doesn't forward this field through libva's API surface. The libva backend has only the VAProfile (e.g., `VAProfileH264Main`) to work with.

This is the same constraint that led to `h264_derive_level_idc` (h264.c:778): VAAPI doesn't forward level_idc either, so the backend derives it from frame size.

## Proposed fix

### Mechanism

Add a per-profile `h264_constraint_set_flags(VAProfile profile)` helper in `h264.c`, similar to `h264_profile_to_idc`. Map per H.264 typical-stream conventions:

| VAProfile | profile_idc | constraint_set_flags | Rationale |
|---|---|---|---|
| VAProfileH264Main | 77 | **0x02** | Conformant Main streams set bit 1 |
| VAProfileH264High | 100 | 0x00 | High has no Baseline/Main subset hint |
| VAProfileH264ConstrainedBaseline | 66 | 0x02 | Constrained Baseline is profile_idc 66 with bit 1 (`constraint_set1_flag`) set; bits 6-7 are reserved and stay clear |
| VAProfileH264MultiviewHigh | 118 | 0x00 | Multi-view doesn't use subset bits |
| VAProfileH264StereoHigh | 128 | 0x00 | Stereo doesn't use subset bits |

The BBB H.264 fixture is Main → 0x02, matching kdirect.

### Code shape

```c
/* Derive SPS constraint_set_flags from the VAProfile, since VAAPI does not
 * forward the field. Only bits 0-5 are defined; bits 6-7 are reserved. */
static inline __u8 h264_constraint_set_flags(VAProfile profile)
{
    switch (profile) {
    case VAProfileH264Main:
    case VAProfileH264ConstrainedBaseline:
        return 0x02; /* constraint_set1_flag */
    case VAProfileH264High:
    case VAProfileH264MultiviewHigh:
    case VAProfileH264StereoHigh:
    default:
        return 0x00;
    }
}
```

And in `h264_set_controls` around line 909:

```c
sps.profile_idc = h264_profile_to_idc(profile);
sps.constraint_set_flags = h264_constraint_set_flags(profile);  /* NEW */
sps.level_idc = h264_derive_level_idc(...);
```

### LOC

~15 LOC: new helper (~10) + one line in `h264_set_controls`.

### Scope

In scope: `src/h264.c` only.
Out of scope: VP9 / VP8 / HEVC / MPEG-2 (profile gating ensures isolation per `feedback_unconditional_codec_state.md`).

### Risks

- **R-1**: The constraint_set_flags hypothesis is wrong. The kernel doesn't actually use this byte for decoder dispatch, so the fix doesn't change H.264 output. Probability: low-to-medium. Mitigation: the γ dump after the change will confirm whether the kernel writes more or the same.
- **R-2**: The per-profile mapping is incorrect for some BBB-class content (e.g., bit 5 should be set for High but isn't). Probability: low; High typically has 0x00. Mitigation: Phase 7b also runs the existing H.264 sweep on multiple variants if available.
- **R-3**: VP9/HEVC/MPEG-2/VP8 regression. Probability: zero by construction, since the change is profile-gated.

## Phase 5b review

Per CLAUDE.md "reviews are never skippable," this Phase 4b plan must receive a Phase 5b architect review before Phase 6b implementation. The Phase 5b reviewer should validate:

- The H.264 spec mapping (per-profile constraint_set_flags values).
- That ffmpeg-v4l2request really does take the value from `sps->constraint_set_flags` parsed from the SPS NAL (already empirically verified per Phase 3 strace + ffmpeg source at `~/src/aur/ffmpeg-git/src/FFmpeg/libavcodec/v4l2_request_h264.c:140`).
- That the empirical hypothesis (this single byte is load-bearing for rkvdec's H.264 path) is plausible.

The reviewer may push for alternative α-2 fixes (e.g., DECODE_PARAMS.dpb[].flags upper bits, which also differ per Phase 3 line 85).

## Phase 7b verification matrix

1. Build, install on fresnel.
2. Run the γ-instrumented H.264 sweep. Confirm the dump now shows full-plane writes (or quantify the change).
3. Hash the libva H.264 YUV. Compare against kdirect's `1e7a0bc9…`.
4. Run the 5-codec regression sweep: VP9, HEVC, MPEG-2, VP8 hashes unchanged.
5. Run the control-payload anchor regression: confirm no unrelated payload changes.

Iter8 PASS criteria (6/6 needed for clean close):

- C1: libva_h264 == kdirect_h264 hash.
- C2-C5: 4 other codecs unchanged.
- C6: Control-payload anchors hold for 4 non-H.264 codecs.

If α-1 doesn't fix → iter8 PARTIAL with Bug 4 narrowed to "constraint_set_flags isn't load-bearing; investigate DPB flags or other diff."

## Predicted iter8b cadence

- Phase 4b: this doc.
- Phase 5b: sonnet-architect review (15-20 min).
- Phase 6b: implement, build, install (15-20 min).
- Phase 7b: verify, 5-codec sweep (15-20 min).
- Phase 8: close (10 min).

Total: ~1 hour wallclock contingent on fresnel uptime + reviewer turnaround.

## What "iter8 PASS" looks like

If α-1 closes Bug 4 cleanly:

- iter8 PASS, codec scoreboard H.264 row: rkvdec / iter8 / **PASS direct**.
- One-line memory entry candidate: "H.264 SPS.constraint_set_flags must be derived per-VAProfile for rkvdec; VAAPI doesn't forward it. See h264_constraint_set_flags() helper."
- iter5b-β, iter6, iter7, iter8 = three direct PASS + one quality-of-life close in four iterations.

If α-1 PARTIAL:

- iter8 = PARTIAL (criterion 1 partial, criteria 2-6 PASS).
- Bug 4 narrowed; iter9 candidate.