iter8 Phase 4b: α-1 plan — per-profile SPS constraint_set_flags

Single-byte fix candidate. Add h264_constraint_set_flags(VAProfile)
helper to h264.c, mirror pattern of h264_profile_to_idc + level_idc
derivation. VAAPI doesn't forward this field; libva backend must derive
per profile.

Mapping per H.264 typical-stream conventions:
  Main → 0x02 (constraint_set1_flag, matches BBB + kdirect)
  ConstrainedBaseline → 0x03 (constraint_set0_flag + constraint_set1_flag)
  High / MultiviewHigh / StereoHigh → 0x00

LOC ~15 in h264.c only. Per-VAProfile-gated; no risk to VP9/VP8/HEVC/
MPEG-2. Phase 5b architect review required before Phase 6b implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:25:23 +00:00
parent 84c939692f
commit 678c072d75
+134
@@ -0,0 +1,134 @@
# Iteration 8 — Phase 4b plan (α-1 SPS constraint_set_flags fix)
Drafted 2026-05-13 after Phase 7 γ + IMP-1 narrowed Bug 4 to kernel-side partial decode. Phase 4b proposes the targeted α-1 fix: per-profile derivation of `sps.constraint_set_flags` in `h264_set_controls`.
## Empirical anchor for the fix
Phase 3 strace evidence (phase3_iter8_findings.md lines 70-78): the ONLY meaningful SPS byte diff between libva (broken) and kdirect (working) is `constraint_set_flags`. All other 1047 bytes of the SPS are identical.
```
libva: M\0 ... (constraint_set_flags = 0x00)
kdirect: M\2 ... (constraint_set_flags = 0x02)
```
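In the strace excerpt, `M` is ASCII 77 (the Main `profile_idc`) and the following byte is `constraint_set_flags`. A byte-level comparison of the two captured payloads could be sketched as below. The helper name `payload_diff` is hypothetical and is not part of the Phase 3 tooling; it assumes both payloads have already been extracted into equally sized buffers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: counts the bytes that differ between two equally
 * sized control payloads and records the offset of the first mismatch.
 * *first_off is written only when at least one byte differs.
 */
static size_t payload_diff(const uint8_t *a, const uint8_t *b, size_t len,
                           size_t *first_off)
{
    size_t ndiff = 0;

    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            if (ndiff == 0)
                *first_off = i;   /* remember where the payloads first diverge */
            ndiff++;
        }
    }
    return ndiff;
}
```

A result of one differing byte at offset 1 would correspond to the single-byte diff described above.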
Phase 7 narrowing: the kernel writes only 512 bytes of luma-neutral data plus UV scratch markers, deterministically across pre-zeroed runs. Since kdirect succeeds where libva fails, and the only documented SPS diff between them is constraint_set_flags = 0x00 vs 0x02, this byte is the leading hypothesis for "what triggers rkvdec's early-out."
## What constraint_set_flags means
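Semantics per ITU-T H.264 §7.4.2.1.1 (SPS RBSP semantics). Bit positions below follow the V4L2 `constraint_set_flags` layout, in which `constraint_set0_flag` is the least significant bit: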
- bit 0: `constraint_set0_flag` — Baseline conformance.
- bit 1: `constraint_set1_flag` — Main conformance subset.
- bit 2: `constraint_set2_flag` — Extended conformance subset.
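- bit 3: `constraint_set3_flag` — Multiple-use indicator: signals level 1b for Baseline/Main/Extended, and intra-only operation for the High-family intra profiles.
- bit 4: `constraint_set4_flag` — High-profile constraint (frame-only coding, as in Progressive High).
- bit 5: `constraint_set5_flag` — High-profile constraint (no B slices, as in Constrained High).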
- bits 6-7: reserved.
These are informational bits in the H.264 bitstream. The spec does NOT require a decoder to read them. However, **kernel decoders for stateless V4L2 have been observed to use them as profile-detection hints**: a profile_idc=77 (Main) with constraint_set_flags=0 might be classified differently from constraint_set_flags=0x02. rkvdec's hardware decoder typically picks an internal decoder-config table based on the profile + constraint hints; if the table for "Main with bit-1 unset" is unmapped, the hardware may abort early.
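The bit layout above can be expressed as a handful of masks. This is a minimal sketch: the `EX_`-prefixed macros and the `ex_` functions are illustrative names introduced here, not identifiers from `h264.c` or from any kernel header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the bit positions listed above. */
#define EX_CONSTRAINT_SET_BIT(n)  ((uint8_t)(1u << (n)))  /* valid for n = 0..5 */
#define EX_CONSTRAINT_RESERVED    ((uint8_t)0xc0)         /* bits 6-7 */

/* True if constraint_set<n>_flag is set in the packed byte. */
static bool ex_constraint_set(uint8_t flags, unsigned n)
{
    return n <= 5 && (flags & EX_CONSTRAINT_SET_BIT(n)) != 0;
}

/* True if the byte sets either of the two reserved bits. */
static bool ex_has_reserved_bits(uint8_t flags)
{
    return (flags & EX_CONSTRAINT_RESERVED) != 0;
}
```

With these, `ex_constraint_set(0x02, 1)` holds for the kdirect value, and any candidate mapping value can be checked for reserved bits before it is written into the control.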
## VAAPI gap: constraint_set_flags is not forwarded
VAAPI's `VAPictureParameterBufferH264` and `VAConfigParameterBuffer` do NOT include constraint_set_flags. ffmpeg-vaapi (the consumer) parses the SPS NAL client-side but doesn't forward this field through libva's API surface. The libva backend has only the VAProfile (e.g., `VAProfileH264Main`) to work with.
This is the same constraint that led to `h264_derive_level_idc` (h264.c:778) — VAAPI doesn't forward level_idc either, so the backend derives it from frame size.
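For readers unfamiliar with this kind of derivation, a frame-size-based level lookup might look like the sketch below. It is not the actual `h264_derive_level_idc` implementation, which is not reproduced in this plan. The function name `ex_level_idc_from_frame_size` is hypothetical, and the sketch assumes only the MaxFS column of H.264 Table A-1 matters, ignoring the macroblock-rate and DPB limits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: return the lowest level_idc whose MaxFS limit
 * (maximum frame size in macroblocks, H.264 Table A-1) covers the frame.
 * Levels that share a MaxFS value are collapsed to the lowest one.
 */
static uint8_t ex_level_idc_from_frame_size(unsigned width, unsigned height)
{
    static const struct {
        unsigned max_fs;
        uint8_t level_idc;
    } tbl[] = {
        {    99, 10 }, {   396, 11 }, {   792, 21 }, {  1620, 22 },
        {  3600, 31 }, {  5120, 32 }, {  8192, 40 }, {  8704, 42 },
        { 22080, 50 }, { 36864, 51 },
    };
    /* Frame size in 16x16 macroblocks, rounding partial macroblocks up. */
    unsigned mbs = ((width + 15) / 16) * ((height + 15) / 16);

    for (size_t i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        if (mbs <= tbl[i].max_fs)
            return tbl[i].level_idc;
    }
    return 51;  /* clamp oversized frames to the highest level in the table */
}
```

Under these assumptions a 1920x1080 frame (8160 macroblocks) maps to level 4.0. The point is only that the backend reconstructs an SPS field from information it does have, which is the same pattern proposed here for `constraint_set_flags`.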
## Proposed fix
### Mechanism
Add a per-profile `h264_constraint_set_flags(VAProfile profile)` helper in `h264.c`, similar to `h264_profile_to_idc`. Map per H.264 typical-stream conventions:
| VAProfile | profile_idc | constraint_set_flags | Rationale |
|---|---|---|---|
| VAProfileH264Main | 77 | **0x02** | Conformant Main streams set bit 1 |
| VAProfileH264High | 100 | 0x00 | High has no Baseline/Main subset hint |
| VAProfileH264ConstrainedBaseline | 66 | 0x03 | bit 0 + bit 1: Baseline conformance plus the Main-compatible subset that defines "Constrained" (bits 6-7 are reserved) |
| VAProfileH264MultiviewHigh | 118 | 0x00 | Multi-view doesn't use subset bits |
| VAProfileH264StereoHigh | 128 | 0x00 | Stereo doesn't use subset bits |
The BBB H.264 fixture is Main → 0x02, matching kdirect.
### Code shape
```c
static inline __u8 h264_constraint_set_flags(VAProfile profile)
{
    switch (profile) {
    case VAProfileH264Main:
        /* constraint_set1_flag */
        return 0x02;
    case VAProfileH264ConstrainedBaseline:
        /* constraint_set0_flag | constraint_set1_flag; bits 6-7 are reserved */
        return 0x03;
    case VAProfileH264High:
    case VAProfileH264MultiviewHigh:
    case VAProfileH264StereoHigh:
    default:
        return 0x00;
    }
}
```
And in `h264_set_controls` around line 909:
```c
sps.profile_idc = h264_profile_to_idc(profile);
sps.constraint_set_flags = h264_constraint_set_flags(profile); /* NEW */
sps.level_idc = h264_derive_level_idc(...);
```
### LOC
~15 LOC: new helper (~10) + one line in `h264_set_controls`.
### Scope
In scope: `src/h264.c` only.
Out of scope: VP9 / VP8 / HEVC / MPEG-2 (profile gating ensures isolation per `feedback_unconditional_codec_state.md`).
### Risks
- **R-1**: constraint_set_flags hypothesis is wrong. The kernel doesn't actually use this byte for decoder dispatch; fix doesn't change H.264 output. Probability: low-to-medium. Mitigation: γ dump after the change will confirm whether the kernel writes more or the same.
- **R-2**: Per-profile mapping is incorrect for some BBB-class content (e.g., bit-5 should be set for High but isn't). Probability: low — High typically has 0x00. Mitigation: Phase 7b also runs the existing H.264 sweep on multiple variants if available.
- **R-3**: VP9/HEVC/MPEG-2/VP8 regression. Probability: zero by construction — change is profile-gated.
## Phase 5b review
Per CLAUDE.md "reviews are never skippable," this Phase 4b plan must receive a Phase 5b architect review before Phase 6b implementation. Phase 5b reviewer should validate:
- The H.264 spec mapping (per-profile constraint_set_flags values).
- That ffmpeg-v4l2request really does take the value from `sps->constraint_set_flags` parsed from SPS NAL (already empirically verified per Phase 3 strace + ffmpeg source at `~/src/aur/ffmpeg-git/src/FFmpeg/libavcodec/v4l2_request_h264.c:140`).
- The empirical hypothesis that this single byte is load-bearing for rkvdec's H.264 path is plausible. Reviewer may push for alternative α-2 fixes (e.g., DECODE_PARAMS.dpb[].flags upper bits which also differ per Phase 3 line 85).
## Phase 7b verification matrix
1. Build, install on fresnel.
2. Run γ-instrumented H.264 sweep. Confirm dump now shows full-plane writes (or quantify change).
3. Hash the libva H.264 YUV. Compare against kdirect's `1e7a0bc9…`.
4. Run 5-codec regression sweep: VP9, HEVC, MPEG-2, VP8 hashes unchanged.
5. Run control-payload anchor regression: confirm no unrelated payload changes.
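Step 2's "quantify change" could be approached by measuring how much of a pre-filled capture buffer the kernel actually touched. The sketch below assumes the dump is available as a flat byte buffer that was filled with a known value before decode. The struct and function names are hypothetical and do not describe the existing γ instrumentation.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_write_extent {
    size_t touched;   /* bytes that no longer hold the pre-fill value */
    size_t last_off;  /* offset of the last touched byte (valid if touched > 0) */
};

/*
 * Scan a dump that was pre-filled with `prefill` and report how much of it
 * changed. Bytes the kernel wrote with a value equal to `prefill` are not
 * counted, so `touched` is a lower bound on the written extent.
 */
static struct ex_write_extent ex_measure_writes(const uint8_t *buf, size_t len,
                                                uint8_t prefill)
{
    struct ex_write_extent e = { 0, 0 };

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != prefill) {
            e.touched++;
            e.last_off = i;
        }
    }
    return e;
}
```

Comparing `touched` before and after the change gives a number to report even if the hash in step 3 still mismatches.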
Iter8 PASS criteria (6/6 needed for clean close):
- C1: libva_h264 == kdirect_h264 hash.
- C2-C5: 4 other codecs unchanged.
- C6: Control-payload anchors hold for 4 non-H.264 codecs.
If α-1 doesn't fix → iter8 PARTIAL with Bug 4 narrowed to "constraint_set_flags isn't load-bearing; investigate DPB flags or other diff."
## Predicted iter8b cadence
- Phase 4b: this doc.
- Phase 5b: sonnet-architect review (15-20 min).
- Phase 6b: implement, build, install (15-20 min).
- Phase 7b: verify, 5-codec sweep (15-20 min).
- Phase 8: close (10 min).
Total: ~1 hour wallclock contingent on fresnel uptime + reviewer turnaround.
## What "iter8 PASS" looks like
If α-1 closes Bug 4 cleanly:
- iter8 PASS, codec scoreboard H.264 row: rkvdec / iter8 / **PASS direct**.
- One-line memory entry candidate: "H.264 SPS.constraint_set_flags must be derived per-VAProfile for rkvdec; VAAPI doesn't forward it. See h264_constraint_set_flags() helper."
- iter5b-β, iter6, iter7, iter8: three direct PASS + one quality-of-life close in four iterations.
If α-1 PARTIAL:
- iter8 = PARTIAL (criterion 1 partial, criteria 2-6 PASS).
- Bug 4 narrowed; iter9 candidate.