Single-byte fix candidate. Add h264_constraint_set_flags(VAProfile) helper to h264.c, mirroring the pattern of h264_profile_to_idc and the level_idc derivation. VAAPI doesn't forward this field; the libva backend must derive it per profile.

Mapping per H.264 typical-stream conventions:
- Main → 0x02 (constraint_set1_flag, matches BBB + kdirect)
- ConstrainedBaseline → 0x42
- High / MultiviewHigh / StereoHigh → 0x00

LOC ~15 in h264.c only. Per-VAProfile-gated; no risk to VP9/VP8/HEVC/MPEG-2.

Phase 5b architect review required before Phase 6b implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 8 — Phase 4b plan (α-1 SPS constraint_set_flags fix)
Drafted 2026-05-13 after Phase 7 γ + IMP-1 narrowed Bug 4 to kernel-side partial decode. Phase 4b proposes the targeted α-1 fix: per-profile derivation of sps.constraint_set_flags in h264_set_controls.
Empirical anchor for the fix
Phase 3 strace evidence (phase3_iter8_findings.md lines 70-78): the ONLY meaningful SPS byte diff between libva (broken) and kdirect (working) is constraint_set_flags. All other 1047 bytes of the SPS are identical.
libva: M\0 ... (constraint_set_flags = 0x00)
kdirect: M\2 ... (constraint_set_flags = 0x02)
Phase 7 narrowing: the kernel writes only 512 bytes of luma-neutral data plus UV scratch markers, and the result is deterministic across pre-zeroed runs. kdirect succeeds and libva fails, and the only SPS difference between them is constraint_set_flags = 0x00 vs 0x02, so this byte is the leading hypothesis for "what triggers rkvdec's early-out."
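The byte-level comparison behind this anchor can be sketched as a small diff helper. This is a minimal illustration, not the tooling used in Phase 3: the function name payload_diff_offsets is hypothetical, and it assumes both payloads were captured as equally sized raw buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Record the offsets at which two equally sized control payloads differ.
 * Returns the total number of differing bytes; at most max_out offsets
 * are stored in out. (Hypothetical helper, for illustration only.)
 */
size_t payload_diff_offsets(const uint8_t *a, const uint8_t *b, size_t len,
                            size_t *out, size_t max_out)
{
    size_t count = 0;

    assert(a != NULL && b != NULL);
    assert(out != NULL || max_out == 0);

    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            if (count < max_out)
                out[count] = i;
            count++;
        }
    }
    return count;
}
```

Fed the libva and kdirect SPS payloads, a result of exactly one differing offset (the byte right after profile_idc) would match the strace observation above.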
What constraint_set_flags means
Per ITU-T H.264 §7.4.2.1.1 (SPS RBSP semantics), with bit positions as packed in the V4L2 constraint_set_flags field:
- bit 0: constraint_set0_flag, Baseline conformance.
- bit 1: constraint_set1_flag, Main conformance subset.
- bit 2: constraint_set2_flag, Extended conformance subset.
- bit 3: constraint_set3_flag, multiple-use indicator (Baseline/Extended/High).
- bit 4: constraint_set4_flag, High-profile constraint.
- bit 5: constraint_set5_flag, High-profile constraint.
- bits 6-7: reserved.
These are informational bits in the H.264 bitstream. The spec does NOT require a decoder to read them. However, kernel decoders for stateless V4L2 have been observed to use them as profile-detection hints: a profile_idc=77 (Main) with constraint_set_flags=0 might be classified differently from constraint_set_flags=0x02. rkvdec's hardware decoder typically picks an internal decoder-config table based on the profile + constraint hints; if the table for "Main with bit-1 unset" is unmapped, the hardware may abort early.
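The bit layout above can be written down as masks, together with the repacking from the SPS NAL byte. In the bitstream the six flags follow profile_idc MSB-first (constraint_set0_flag is bit 7 of that byte), whereas the list above numbers them LSB-first. The sketch below is illustrative only; every CSF_* and csf_* name is hypothetical and not part of h264.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LSB-first layout from the list above: bit n carries constraint_setN_flag. */
#define CSF_SET0 (1u << 0)
#define CSF_SET1 (1u << 1)
#define CSF_SET2 (1u << 2)
#define CSF_SET3 (1u << 3)
#define CSF_SET4 (1u << 4)
#define CSF_SET5 (1u << 5)
#define CSF_RESERVED_MASK 0xC0u /* bits 6-7 */

/*
 * Repack the byte that follows profile_idc in the SPS NAL (flags stored
 * MSB-first, then two reserved zero bits) into the LSB-first layout.
 */
uint8_t csf_from_sps_byte(uint8_t sps_byte)
{
    uint8_t flags = 0;

    for (unsigned n = 0; n <= 5; n++) {
        if (sps_byte & (0x80u >> n))
            flags |= (uint8_t)(1u << n);
    }
    return flags;
}

/* True when a reserved bit (6 or 7) is set in the LSB-first layout. */
bool csf_has_reserved_bits(uint8_t flags)
{
    return (flags & CSF_RESERVED_MASK) != 0;
}

/* Usage example: a Main stream with only constraint_set1_flag set. */
void csf_self_check(void)
{
    assert(csf_from_sps_byte(0x40) == CSF_SET1);
    assert(!csf_has_reserved_bits(CSF_SET1));
}
```

Under this layout, a value such as 0x42 sets bit 1 and reserved bit 6, which is worth keeping in mind when reviewing any per-profile mapping.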
VAAPI gap: constraint_set_flags is not forwarded
VAAPI's VAPictureParameterBufferH264 and VAConfigParameterBuffer do NOT include constraint_set_flags. ffmpeg-vaapi (the consumer) parses the SPS NAL client-side but doesn't forward this field through libva's API surface. The libva backend has only the VAProfile (e.g., VAProfileH264Main) to work with.
This is the same constraint that led to h264_derive_level_idc (h264.c:778) — VAAPI doesn't forward level_idc either, so the backend derives it from frame size.
Proposed fix
Mechanism
Add a per-profile h264_constraint_set_flags(VAProfile profile) helper in h264.c, similar to h264_profile_to_idc. Map per H.264 typical-stream conventions:
| VAProfile | profile_idc | constraint_set_flags | Rationale |
|---|---|---|---|
| VAProfileH264Main | 77 | 0x02 | Conformant Main streams set bit 1 |
| VAProfileH264High | 100 | 0x00 | High has no Baseline/Main subset hint |
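| VAProfileH264ConstrainedBaseline | 66 | 0x42 | bit 1 (constraint_set1_flag) for the "Constrained" subset, plus bit 6; bit 6 is reserved in the bit layout above, so Phase 5b should confirm this value |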
| VAProfileH264MultiviewHigh | 118 | 0x00 | Multi-view doesn't use subset bits |
| VAProfileH264StereoHigh | 128 | 0x00 | Stereo doesn't use subset bits |
The BBB H.264 fixture is Main → 0x02, matching kdirect.
Code shape
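    /* Derive SPS constraint_set_flags from the VAProfile; VAAPI does not forward it. */
    static inline __u8 h264_constraint_set_flags(VAProfile profile)
    {
        switch (profile) {
        case VAProfileH264Main:
            return 0x02; /* constraint_set1_flag */
        case VAProfileH264ConstrainedBaseline:
            return 0x42; /* bit 1 + bit 6; bit 6 is listed as reserved, confirm in Phase 5b */
        case VAProfileH264High:
        case VAProfileH264MultiviewHigh:
        case VAProfileH264StereoHigh:
        default:
            return 0x00; /* no subset bits */
        }
    }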
And in h264_set_controls around line 909:
    sps.profile_idc = h264_profile_to_idc(profile);
    sps.constraint_set_flags = h264_constraint_set_flags(profile); /* NEW */
    sps.level_idc = h264_derive_level_idc(...);
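The mapping can be checked in isolation before touching h264.c. The sketch below is a stand-alone harness under stated assumptions: enum demo_profile is a hypothetical stand-in for VAProfile so that the file builds without libva headers, and demo_constraint_set_flags simply mirrors the proposed helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for VAProfile. The enumerator names and values
 * are illustrative; the real ones come from the libva headers.
 */
enum demo_profile {
    DEMO_H264_MAIN,
    DEMO_H264_HIGH,
    DEMO_H264_CONSTRAINED_BASELINE,
    DEMO_H264_MULTIVIEW_HIGH,
    DEMO_H264_STEREO_HIGH,
    DEMO_NON_H264 /* any profile that should never get subset bits */
};

/* Mirrors the proposed per-profile mapping over the stand-in enum. */
uint8_t demo_constraint_set_flags(enum demo_profile profile)
{
    switch (profile) {
    case DEMO_H264_MAIN:
        return 0x02;
    case DEMO_H264_CONSTRAINED_BASELINE:
        return 0x42;
    default:
        return 0x00;
    }
}

/* Table-style check of every row in the mapping. */
void demo_check_mapping(void)
{
    assert(demo_constraint_set_flags(DEMO_H264_MAIN) == 0x02);
    assert(demo_constraint_set_flags(DEMO_H264_CONSTRAINED_BASELINE) == 0x42);
    assert(demo_constraint_set_flags(DEMO_H264_HIGH) == 0x00);
    assert(demo_constraint_set_flags(DEMO_H264_MULTIVIEW_HIGH) == 0x00);
    assert(demo_constraint_set_flags(DEMO_H264_STEREO_HIGH) == 0x00);
    assert(demo_constraint_set_flags(DEMO_NON_H264) == 0x00);
}
```

If the Phase 5b review changes any value in the table, only the expected constants in demo_check_mapping need to follow.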
LOC
~15 LOC: new helper (~10) + one line in h264_set_controls.
Scope
In scope: src/h264.c only.
Out of scope: VP9 / VP8 / HEVC / MPEG-2 (profile gating ensures isolation per feedback_unconditional_codec_state.md).
Risks
- R-1: constraint_set_flags hypothesis is wrong. The kernel doesn't actually use this byte for decoder dispatch; fix doesn't change H.264 output. Probability: low-to-medium. Mitigation: γ dump after the change will confirm whether the kernel writes more or the same.
- R-2: Per-profile mapping is incorrect for some BBB-class content (e.g., bit-5 should be set for High but isn't). Probability: low — High typically has 0x00. Mitigation: Phase 7b also runs the existing H.264 sweep on multiple variants if available.
- R-3: VP9/HEVC/MPEG-2/VP8 regression. Probability: zero by construction — change is profile-gated.
Phase 5b review
Per CLAUDE.md "reviews are never skippable," this Phase 4b plan must receive a Phase 5b architect review before Phase 6b implementation. Phase 5b reviewer should validate:
- The H.264 spec mapping (per-profile constraint_set_flags values).
- That ffmpeg-v4l2request really does take the value from sps->constraint_set_flags parsed from SPS NAL (already empirically verified per Phase 3 strace + ffmpeg source at ~/src/aur/ffmpeg-git/src/FFmpeg/libavcodec/v4l2_request_h264.c:140).
- The empirical hypothesis that this single byte is load-bearing for rkvdec's H.264 path is plausible. Reviewer may push for alternative α-2 fixes (e.g., DECODE_PARAMS.dpb[].flags upper bits which also differ per Phase 3 line 85).
Phase 7b verification matrix
- Build, install on fresnel.
- Run γ-instrumented H.264 sweep. Confirm dump now shows full-plane writes (or quantify change).
- Hash the libva H.264 YUV. Compare against kdirect's 1e7a0bc9….
- Run 5-codec regression sweep: VP9, HEVC, MPEG-2, VP8 hashes unchanged.
- Run control-payload anchor regression: confirm no unrelated payload changes.
Iter8 PASS criteria (6/6 needed for clean close):
- C1: libva_h264 == kdirect_h264 hash.
- C2-C5: 4 other codecs unchanged.
- C6: Control-payload anchors hold for 4 non-H.264 codecs.
If α-1 doesn't fix Bug 4, iter8 closes as PARTIAL, with Bug 4 narrowed to "constraint_set_flags isn't load-bearing; investigate DPB flags or other diff."
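The close-out rule above can be sketched as a small classifier over criteria C1 to C6. This is an illustration only: the names are hypothetical, and the ITER8_REGRESSION outcome is an assumption, since the plan does not say how a failure of C2 to C6 would be labeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum iter8_verdict {
    ITER8_PASS,      /* all six criteria hold */
    ITER8_PARTIAL,   /* only C1 (H.264 hash match) fails */
    ITER8_REGRESSION /* assumption: any of C2..C6 fails */
};

/* criteria[0] is C1, criteria[5] is C6. */
enum iter8_verdict iter8_classify(const bool criteria[6])
{
    assert(criteria != NULL);

    /* C2..C6 guard the other codecs and the control-payload anchors. */
    for (size_t i = 1; i < 6; i++) {
        if (!criteria[i])
            return ITER8_REGRESSION;
    }
    return criteria[0] ? ITER8_PASS : ITER8_PARTIAL;
}
```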
Predicted iter8b cadence
- Phase 4b: this doc.
- Phase 5b: sonnet-architect review (15-20 min).
- Phase 6b: implement, build, install (15-20 min).
- Phase 7b: verify, 5-codec sweep (15-20 min).
- Phase 8: close (10 min).
Total: ~1 hour wallclock contingent on fresnel uptime + reviewer turnaround.
What "iter8 PASS" looks like
If α-1 closes Bug 4 cleanly:
- iter8 PASS, codec scoreboard H.264 row: rkvdec / iter8 / PASS direct.
- One-line memory entry candidate: "H.264 SPS.constraint_set_flags must be derived per-VAProfile for rkvdec; VAAPI doesn't forward it. See h264_constraint_set_flags() helper."
- iter5b-β, iter6, iter7, iter8 = three direct PASS + one quality-of-life close in four iterations.
If α-1 PARTIAL:
- iter8 = PARTIAL (criterion 1 partial, criteria 2-6 PASS).
- Bug 4 narrowed; iter9 candidate.