From 678c072d75207a2eae3759dd51995e37a7ff111f Mon Sep 17 00:00:00 2001 From: Markus Fritsche Date: Wed, 13 May 2026 12:25:23 +0000 Subject: [PATCH] =?UTF-8?q?iter8=20Phase=204b:=20=CE=B1-1=20plan=20?= =?UTF-8?q?=E2=80=94=20per-profile=20SPS=20constraint=5Fset=5Fflags?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Single-byte fix candidate. Add h264_constraint_set_flags(VAProfile) helper to h264.c, mirror pattern of h264_profile_to_idc + level_idc derivation. VAAPI doesn't forward this field; libva backend must derive per profile. Mapping per H.264 typical-stream conventions: Main → 0x02 (constraint_set1_flag, matches BBB + kdirect) ConstrainedBaseline → 0x42 High / MultiviewHigh / StereoHigh → 0x00 LOC ~15 in h264.c only. Per-VAProfile-gated; no risk to VP9/VP8/HEVC/ MPEG-2. Phase 5b architect review required before Phase 6b implementation. Co-Authored-By: Claude Opus 4.7 (1M context) --- phase4b_iter8_plan.md | 134 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 134 insertions(+) create mode 100644 phase4b_iter8_plan.md diff --git a/phase4b_iter8_plan.md b/phase4b_iter8_plan.md new file mode 100644 index 0000000..7c4843d --- /dev/null +++ b/phase4b_iter8_plan.md @@ -0,0 +1,134 @@ +# Iteration 8 — Phase 4b plan (α-1 SPS constraint_set_flags fix) + +Drafted 2026-05-13 after Phase 7 γ + IMP-1 narrowed Bug 4 to kernel-side partial decode. Phase 4b proposes the targeted α-1 fix: per-profile derivation of `sps.constraint_set_flags` in `h264_set_controls`. + +## Empirical anchor for the fix + +Phase 3 strace evidence (phase3_iter8_findings.md lines 70-78): the ONLY meaningful SPS byte diff between libva (broken) and kdirect (working) is `constraint_set_flags`. All other 1047 bytes of the SPS are identical. + +``` +libva: M\0 ... (constraint_set_flags = 0x00) +kdirect: M\2 ... (constraint_set_flags = 0x02) +``` + +Phase 7 narrowing: kernel writes only 512 bytes of luma-neutral data + UV scratch markers, deterministic across pre-zeroed runs. Reading kdirect's success vs libva's failure with the only documented control diff being constraint_set_flags = 0x00 vs 0x02 makes this the leading hypothesis for "what triggers rkvdec's early-out." + +## What constraint_set_flags means + +Per ITU-T H.264 §7.4.2.1.1 (SPS RBSP semantics): +- bit 0: `constraint_set0_flag` — Baseline conformance. +- bit 1: `constraint_set1_flag` — Main conformance subset. +- bit 2: `constraint_set2_flag` — Extended conformance subset. +- bit 3: `constraint_set3_flag` — Multiple-use indicator (Baseline/Extended/High). +- bit 4: `constraint_set4_flag` — High-profile constraint. +- bit 5: `constraint_set5_flag` — High-profile constraint. +- bits 6-7: reserved. + +These are informational bits in the H.264 bitstream. The spec does NOT require a decoder to read them. However, **kernel decoders for stateless V4L2 have been observed to use them as profile-detection hints**: a profile_idc=77 (Main) with constraint_set_flags=0 might be classified differently from constraint_set_flags=0x02. rkvdec's hardware decoder typically picks an internal decoder-config table based on the profile + constraint hints; if the table for "Main with bit-1 unset" is unmapped, the hardware may abort early. + +## VAAPI gap: constraint_set_flags is not forwarded + +VAAPI's `VAPictureParameterBufferH264` and `VAConfigParameterBuffer` do NOT include constraint_set_flags. ffmpeg-vaapi (the consumer) parses the SPS NAL client-side but doesn't forward this field through libva's API surface. The libva backend has only the VAProfile (e.g., `VAProfileH264Main`) to work with. + +This is the same constraint that led to `h264_derive_level_idc` (h264.c:778) — VAAPI doesn't forward level_idc either, so the backend derives it from frame size. + +## Proposed fix + +### Mechanism + +Add a per-profile `h264_constraint_set_flags(VAProfile profile)` helper in `h264.c`, similar to `h264_profile_to_idc`. Map per H.264 typical-stream conventions: + +| VAProfile | profile_idc | constraint_set_flags | Rationale | +|---|---|---|---| +| VAProfileH264Main | 77 | **0x02** | Conformant Main streams set bit 1 | +| VAProfileH264High | 100 | 0x00 | High has no Baseline/Main subset hint | +| VAProfileH264ConstrainedBaseline | 66 | 0x42 | bit 1 + bit 6 per the "Constrained" subset | +| VAProfileH264MultiviewHigh | 118 | 0x00 | Multi-view doesn't use subset bits | +| VAProfileH264StereoHigh | 128 | 0x00 | Stereo doesn't use subset bits | + +The BBB H.264 fixture is Main → 0x02, matching kdirect. + +### Code shape + +```c +static inline __u8 h264_constraint_set_flags(VAProfile profile) +{ + switch (profile) { + case VAProfileH264Main: + return 0x02; + case VAProfileH264ConstrainedBaseline: + return 0x42; + case VAProfileH264High: + case VAProfileH264MultiviewHigh: + case VAProfileH264StereoHigh: + default: + return 0x00; + } +} +``` + +And in `h264_set_controls` around line 909: +```c +sps.profile_idc = h264_profile_to_idc(profile); +sps.constraint_set_flags = h264_constraint_set_flags(profile); /* NEW */ +sps.level_idc = h264_derive_level_idc(...); +``` + +### LOC + +~15 LOC: new helper (~10) + one line in `h264_set_controls`. + +### Scope + +In scope: `src/h264.c` only. + +Out of scope: VP9 / VP8 / HEVC / MPEG-2 (profile gating ensures isolation per `feedback_unconditional_codec_state.md`). + +### Risks + +- **R-1**: constraint_set_flags hypothesis is wrong. The kernel doesn't actually use this byte for decoder dispatch; fix doesn't change H.264 output. Probability: low-to-medium. Mitigation: γ dump after the change will confirm whether the kernel writes more or the same. +- **R-2**: Per-profile mapping is incorrect for some BBB-class content (e.g., bit-5 should be set for High but isn't). Probability: low — High typically has 0x00. Mitigation: Phase 7b also runs the existing H.264 sweep on multiple variants if available. +- **R-3**: VP9/HEVC/MPEG-2/VP8 regression. Probability: zero by construction — change is profile-gated. + +## Phase 5b review + +Per CLAUDE.md "reviews are never skippable," this Phase 4b plan must receive a Phase 5b architect review before Phase 6b implementation. Phase 5b reviewer should validate: +- The H.264 spec mapping (per-profile constraint_set_flags values). +- That ffmpeg-v4l2request really does take the value from `sps->constraint_set_flags` parsed from SPS NAL (already empirically verified per Phase 3 strace + ffmpeg source at `~/src/aur/ffmpeg-git/src/FFmpeg/libavcodec/v4l2_request_h264.c:140`). +- The empirical hypothesis that this single byte is load-bearing for rkvdec's H.264 path is plausible. Reviewer may push for alternative α-2 fixes (e.g., DECODE_PARAMS.dpb[].flags upper bits which also differ per Phase 3 line 85). + +## Phase 7b verification matrix + +1. Build, install on fresnel. +2. Run γ-instrumented H.264 sweep. Confirm dump now shows full-plane writes (or quantify change). +3. Hash the libva H.264 YUV. Compare against kdirect's `1e7a0bc9…`. +4. Run 5-codec regression sweep: VP9, HEVC, MPEG-2, VP8 hashes unchanged. +5. Run control-payload anchor regression: confirm no unrelated payload changes. + +Iter8 PASS criteria (5/5 needed for clean close): +- C1: libva_h264 == kdirect_h264 hash. +- C2-C5: 4 other codecs unchanged. +- C6: Control-payload anchors hold for 4 non-H.264 codecs. + +If α-1 doesn't fix → iter8 PARTIAL with Bug 4 narrowed to "constraint_set_flags isn't load-bearing; investigate DPB flags or other diff." + +## Predicted iter8b cadence + +- Phase 4b: this doc. +- Phase 5b: sonnet-architect review (15-20 min). +- Phase 6b: implement, build, install (15-20 min). +- Phase 7b: verify, 5-codec sweep (15-20 min). +- Phase 8: close (10 min). + +Total: ~1 hour wallclock contingent on fresnel uptime + reviewer turnaround. + +## What "iter8 PASS" looks like + +If α-1 closes Bug 4 cleanly: +- iter8 PASS, codec scoreboard H.264 row: rkvdec / iter8 / **PASS direct**. +- One-line memory entry candidate: "H.264 SPS.constraint_set_flags must be derived per-VAProfile for rkvdec; VAAPI doesn't forward it. See h264_constraint_set_flags() helper." +- iter5b-β iter6 iter7 iter8 = three direct PASS + one quality-of-life close in four iterations. + +If α-1 PARTIAL: +- iter8 = PARTIAL (criterion 1 partial, criteria 2-6 PASS). +- Bug 4 narrowed; iter9 candidate.