diff --git a/phase5b_iter8_review.md b/phase5b_iter8_review.md new file mode 100644 index 0000000..03a677f --- /dev/null +++ b/phase5b_iter8_review.md @@ -0,0 +1,171 @@ +# Iteration 8 — Phase 5b Review (α-1 SPS constraint_set_flags fix) + +Reviewed 2026-05-13. Reviewer: claude-sonnet-4-6. + +Plan under review: `/home/mfritsche/src/fresnel-fourier/phase4b_iter8_plan.md` +Empirical anchor: `/home/mfritsche/src/fresnel-fourier/phase7_iter8_verification.md` + +Sources read (empirical, not theoretical): +- `rkvdec-h264.c` at `/home/mfritsche/src/linux-rfc/drivers/staging/media/rkvdec/rkvdec-h264.c` +- `v4l2_request_h264.c` at `/home/mfritsche/src/libva-multiplanar/references/ffmpeg-kwiboo/libavcodec/v4l2_request_h264.c` +- `h264.c` (libva backend) at `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/src/h264.c` +- `v4l2.c` (libva backend) at `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/src/v4l2.c` +- `/usr/include/linux/v4l2-controls.h` — V4L2 UAPI struct `v4l2_ctrl_h264_sps` + +--- + +## CRIT findings + +### CRIT-1: rkvdec does NOT read constraint_set_flags — the hypothesis is empirically falsified + +The plan's core claim is that `constraint_set_flags` may trigger a "decoder-config table dispatch" or "profile-detection hint" inside rkvdec. **The driver source falsifies this.** + +In `assemble_hw_pps()` (rkvdec-h264.c:633), the function translates the entire `v4l2_ctrl_h264_sps` struct into the hardware PPS packet. Crucially: + +- `sps->constraint_set_flags` is **never read** anywhere in `rkvdec-h264.c`. Zero references. Confirmed by `grep -n "constraint_set" /home/mfritsche/src/linux-rfc/drivers/staging/media/rkvdec/rkvdec-h264.c` returning no output. +- The hardware PPS packet defines `CONSTRAINT_SET3_FLAG` at bit offset 12 (line 45), but that slot is **hardcoded to 1** unconditionally (line 660: `WRITE_PPS(1, CONSTRAINT_SET3_FLAG)`). The incoming `sps->constraint_set_flags` does not feed it. +- `PROFILE_IDC` in the hardware packet is also hardcoded to `0xFF` (line 659). The incoming `sps->profile_idc` does not feed it. + +The kernel's `rkvdec_h264_validate_sps()` (line 1029–1065) is the only SPS validation that can abort decoding. It checks: +- `chroma_format_idc <= 1` +- `bit_depth_luma_minus8 == bit_depth_chroma_minus8` +- `bit_depth_luma_minus8 == 0` +- Frame dimensions within `coded_fmt` + +`constraint_set_flags` is not checked. The field is fully ignored by the rkvdec driver — it does not branch on it, does not write it to hardware, and does not validate it. + +**Verdict**: The α-1 fix will compile and land cleanly, but cannot unblock decoding. `constraint_set_flags = 0x02` versus `0x00` makes zero difference to rkvdec's hardware dispatch or validation path. R-1 in the plan ("constraint_set_flags hypothesis is wrong — probability: low-to-medium") should be rated **near-certain based on source evidence**. + +This is a plan-killer in the strict sense: α-1 will not fix Bug 4. It is low-risk (no regression hazard), but it will consume a build/install/test cycle without advancing the criterion-1 hash match. + +**Recommendation**: Either (a) implement α-1 as a correctness hygiene patch knowing it will not fix Bug 4 and move immediately to the next hypothesis, or (b) skip α-1 and triage the next candidate directly. If α-1 is implemented, the Phase 7b γ dump will show the same 16×32 partial fill, providing one more confirmed-excluded hypothesis, which has diagnostic value. + +--- + +### CRIT-2: The ConstrainedBaseline mapping 0x42 is wrong — bit 6 does not exist + +The plan's mapping table assigns `VAProfileH264ConstrainedBaseline → 0x42`. The plan's own SPS bit table lists: +- bit 0: constraint_set0_flag +- bit 1: constraint_set1_flag +- bit 2: constraint_set2_flag +- bit 3: constraint_set3_flag +- bit 4: constraint_set4_flag +- bit 5: constraint_set5_flag +- bits 6-7: reserved + +`0x42 = 0b01000010 = bit 1 + bit 6`. But bit 6 is a reserved bit — there is no `constraint_set6_flag`. The `v4l2_ctrl_h264_sps` UAPI struct confirms bits 6-7 are unspecified/reserved. + +Per H.264 §A.2.1.1 (Constrained Baseline profile, Amendment 2 and later), Constrained Baseline is signalled as `profile_idc=66, constraint_set1_flag=1, constraint_set4_flag=1`. That yields `bit 1 | bit 4 = 0x02 | 0x10 = 0x12`, not `0x42`. + +The value `0x40` that the plan mentions as an alternative ("or 0x40") is `bit 6` only — also reserved, also wrong. + +Since CRIT-1 already establishes that rkvdec ignores `constraint_set_flags` entirely, a wrong value in the mapping table causes no runtime damage. However, if α-1 is landed as a hygiene patch, the ConstrainedBaseline mapping is incorrect and would produce a non-conformant SPS for any ConstrainedBaseline stream sent to a future driver that does read this field. + +**Amendment required**: Change `VAProfileH264ConstrainedBaseline → 0x12` (bit 1 + bit 4). + +--- + +## IMP findings + +### IMP-1: DECODE_PARAMS DPB flags diff is the strongest remaining candidate — plan under-invests in it + +Phase 3 established a second diff beyond `constraint_set_flags`: + +``` +libva frame-2 DPB flags: 0x00000003 (VALID | ACTIVE) +kdirect frame-2 DPB flags: 0x00030001 (contains bit 16+17 beyond the defined flags) +``` + +The plan classifies this as "alternative α-2" and defers it. But after CRIT-1 confirms `constraint_set_flags` is inert, the DPB flags diff is now the leading candidate. The plan should be amended to make α-2 the **primary** Phase 6b/7b investigation, not a fallback after α-1 fails. + +The rkvdec driver reads `dpb[i].flags` in `config_registers()` at lines 965-973: +```c +if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE) + refer_addr |= RKVDEC_COLMV_USED_FLAG_REF; +if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD) + refer_addr |= RKVDEC_FIELD_REF; +``` + +And `dpb[i].fields` at lines 970-973: +```c +if (dpb[i].fields & V4L2_H264_TOP_FIELD_REF) + refer_addr |= RKVDEC_TOPFIELD_USED_REF; +if (dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF) + refer_addr |= RKVDEC_BOTFIELD_USED_REF; +``` + +The driver also reads DPB flags in `lookup_ref_buf_idx()` at line 747: `if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)`. If ACTIVE is not set correctly, `run->ref_buf[i]` remains NULL, and the driver substitutes the destination buffer as reference — producing corrupted motion compensation without emitting FLAG_ERROR. + +The Phase 3 diff for frame 2 (first P-frame) showed libva DPB `flags=0x3` while kdirect had `flags=0x00030001`. The undefined upper bits in kdirect's value are suspicious but may be a strace read artifact (the reference_ts u64 and flags u32 are adjacent in the struct; a misaligned read could merge them). Before building on this finding, it needs re-examination. The Phase 3 note already flagged this uncertainty (line 107: "may be a kernel-aligned padding artifact"). + +**Required for Phase 6b plan**: Trace the libva backend's `fill_dpb_entry` analog — in `h264.c`, the DPB entry construction — and compare it field-by-field against the ffmpeg `fill_dpb_entry` (v4l2_request_h264.c:71). Specifically: does libva correctly set `V4L2_H264_DPB_ENTRY_FLAG_ACTIVE` for short-ref frames? Does it set `top_field_order_cnt` / `bottom_field_order_cnt`? + +### IMP-2: The plan does not address PPS diff + +Phase 3 noted PPS first bytes were identical. But Phase 3 only compared the first 12 bytes. The PPS struct has 28+ fields; a diff in `num_ref_idx_l0_default_active_minus1` or `pic_init_qp_minus26` or `CABAC` flag would also cause silent decode errors. This was acknowledged as "PPS identical" but with a shallow check. + +Before the Phase 6b plan finalises its α-2 candidate, a full PPS byte comparison between libva and kdirect should be on the checklist. This is a strace re-run (no code change needed), not a code audit. Low effort, eliminates a variable. + +### IMP-3: The success/partial-close criteria need tightening + +The plan states: "If α-1 doesn't fix → iter8 = PARTIAL — Bug 4 narrowed; iter9 candidate." + +With CRIT-1 established before any build, we already know α-1 will not fix Bug 4. The partial-close framing should be revised: + +- iter8 is not partial because α-1 failed — iter8 is partial because Bug 4's root cause is not yet identified. The empirical narrowing achieved (kernel-side partial decode, constraint_set_flags inert, rkvdec validate_sps not triggering) is meaningful progress and should be committed to the memory log regardless of whether α-1 is implemented. +- The Phase 8 close should enumerate what is definitively excluded: (a) libva readback bug — excluded by γ. (b) slot binding / CAPTURE buffer lifecycle — excluded by IMP-1 memset. (c) constraint_set_flags hypothesis — excluded by source read. (d) SPS validate_sps rejection — excluded by source read (none of the checked fields differ between libva and kdirect). + +Naming what's excluded is more useful than noting α-1 failed, because it constrains iter9's search space. + +--- + +## MIN findings + +### MIN-1: The plan conflates "informational per spec" with "not inspected by drivers" + +The plan argues that because H.264 §7.4.2.1.1 marks constraint_set_flags as "informational," the kernel may still use it. This reasoning is correct in principle — drivers often rely on informational SPS fields for hardware quirk routing. However, the plan then asserts that "kernel decoders have been observed to use them as profile-detection hints" without citing a specific driver. This is generic background, not evidence. Having now confirmed rkvdec does not use it, the language in the plan should not be reused as a template for future hypotheses without grounding. + +### MIN-2: `default:` fallthrough returns 0x00 for unrecognised profiles + +The helper in the plan code: +```c +default: + return 0x00; +``` +This is correct fallback behaviour — unknown profiles get no constraint bits. No change needed; note it for hygiene. + +### MIN-3: The `h264_derive_level_idc` parallel in the plan is valid motivation + +The plan correctly observes that `level_idc` was previously derived from frame dimensions because VAAPI doesn't forward it, making `constraint_set_flags` derivation the same class of problem. This framing is accurate and worth keeping as code documentation if α-1 lands as a hygiene patch. + +--- + +## Amendments to Phase 6b + +1. **Drop α-1 as the primary fix hypothesis.** The kernel driver does not read `constraint_set_flags`. α-1 may be landed as a conformance/hygiene one-liner if desired, but Phase 6b should not treat it as the iter8 criterion-1 candidate. + +2. **Fix the ConstrainedBaseline mapping before landing α-1.** Change the value from `0x42` to `0x12` (constraint_set1_flag | constraint_set4_flag). Value: correct conformance for any future driver that reads this field. + +3. **Elevate α-2 (DECODE_PARAMS / DPB diff) to primary Phase 6b candidate.** The next empirical step is a full field-by-field comparison of the libva DPB construction against the ffmpeg DPB construction for frame 2 (first P-frame). Specifically inspect: + - `top_field_order_cnt`, `bottom_field_order_cnt` (libva may be zero-ing these for progressive frames where ffmpeg sends real values). + - `flags` field — does libva set VALID|ACTIVE correctly for short-ref frames? + - `fields` field — does libva set V4L2_H264_TOP_FIELD_REF for progressive frame references? + +4. **Add a PPS full-byte comparison** to the Phase 7b checklist (strace re-run, no build required) to exclude PPS as a diff source. + +5. **Revise the iter8 close criteria.** Phase 8 should log the definitively excluded hypotheses regardless of whether a criterion-1 PASS is achieved. + +--- + +## Summary table + +| ID | Severity | Finding | Impact | +|---|---|---|---| +| CRIT-1 | CRIT | rkvdec ignores constraint_set_flags entirely — driver source confirmed | α-1 will not fix Bug 4 | +| CRIT-2 | CRIT | ConstrainedBaseline mapping 0x42 is wrong (bit 6 = reserved); correct value is 0x12 | Wrong SPS if α-1 landed as-is | +| IMP-1 | IMP | DPB flags diff is now primary candidate; Phase 6b plan should triage it before building | Redirection required | +| IMP-2 | IMP | PPS was only shallowly compared in Phase 3 | Add full-byte PPS check to Phase 7b | +| IMP-3 | IMP | Partial-close framing incomplete; excluded hypotheses not enumerated | Phase 8 close quality | +| MIN-1 | MIN | "Drivers have been observed" claim lacks citation | Language quality | +| MIN-2 | MIN | `default: return 0x00` is correct; no change needed | Hygiene confirmation | +| MIN-3 | MIN | level_idc derivation parallel is valid motivation for code comment | Code documentation |