α-2 (POC strip removal) changed wire bytes (POC now matches kdirect's sentinel-encoded 0x10000) but H.264 output unchanged. POC not load-bearing.

5-codec regression sweep on α-2 backend: all 4 non-H.264 anchors hold. Zero regression.

Iter8 close: 5/6 PASS, criterion-1 PARTIAL. Bug 4 narrowed but not fixed.

Eliminations achieved:
1. libva-readback bug (γ dump)
2. Slot-binding wrong (γ dump shows correct slot per surface)
3. Stale residue (IMP-1 memset confirmed deterministic kernel write)
4. constraint_set_flags (Phase 5b CRIT-1: rkvdec source review)
5. POC sentinel strip (α-2 wire change, no output change)

Remaining candidates for iter9: PPS diff (α-3), DECODE_PARAMS post-DPB fields (α-6), DPB entry order (α-4), slice data encoding (α-5).

Fork tip 0226684 carries γ + IMP-1 diagnostic + α-2 hygiene. All env-gated off by default; α-2 is a wire-payload cleanup with zero behavior effect.

Lessons distilled:
- Reviews are never skippable: Phase 5b CRIT-1 saved a build cycle.
- Wire-byte equivalence ≠ behavior equivalence.
- Per-driver kludges in shared codec code need explicit gating.
- Bug carryover labels can mislead (Bug 4 != "inter race-loss").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 8 — Phase 7c (α-2 verification + 5-codec regression sweep)
Captured 2026-05-13. Backend SHA b6a3958a5bca9451… (iter8 P6 γ + IMP-1 + α-2 POC strip removal). Fork tip 0226684.
Verdict on α-2
α-2 (remove POC sentinel strip) did NOT fix Bug 4.
| Metric | Before α-2 | After α-2 |
|---|---|---|
| H.264 libva YUV hash | 71ac099b… | 71ac099b… (unchanged) |
| Frame 1 non-zero bytes | 512 (16×32 leak) | 512 (same pattern) |
| Frame 2 non-zero | 32 sparse UV markers | 32 same |
| Frame 3 non-zero | 16 sparse UV markers | 16 same |
α-2 DID change the wire-level POC bytes (now matches kdirect's sentinel-encoded 0x00010000 for tfoc/bfoc instead of libva's previously-stripped 0), but the kernel produces identical output. POC values are not load-bearing for Bug 4.
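The two kinds of metric in the table above (a whole-dump hash and per-frame non-zero byte counts) can be reproduced from a raw YUV dump with a few lines. The following is a minimal sketch under stated assumptions: `frame_stats` is a hypothetical helper, SHA-256 is assumed as the digest (the report does not say which hash produced `71ac099b…`), and the caller is assumed to know the byte size of one frame.

```python
import hashlib

def frame_stats(raw: bytes, frame_size: int):
    """Return (sha256 hex digest of the whole dump, non-zero byte count per frame).

    Assumption: the dump is a plain concatenation of equally sized frames.
    A trailing partial frame, if any, is counted as its own entry.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    counts = []
    for off in range(0, len(raw), frame_size):
        frame = raw[off:off + frame_size]
        # bytes.count(0) counts zero bytes; the remainder are non-zero.
        counts.append(len(frame) - frame.count(0))
    return hashlib.sha256(raw).hexdigest(), counts
```

Running this on the before and after dumps and comparing both return values gives the same "unchanged hash, same leak pattern" check without relying on the hash alone.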
What Phase 7c eliminated
This iteration's α-2 attempt eliminated one more hypothesis. Combined with prior phases:
| Hypothesis | Status | Eliminated by |
|---|---|---|
| libva mis-reads CAPTURE buffer | ❌ Eliminated | γ dump shows libva reads what's in buffer |
| Slot-binding wrong | ❌ Eliminated | γ shows correct slot index per surface |
| Stale residue from prior runs | ❌ Eliminated | IMP-1 memset experiment (same hash) |
| constraint_set_flags (SPS byte 1) | ❌ Eliminated | Phase 5b CRIT-1: rkvdec source review |
| H.264 POC sentinel strip | ❌ Eliminated | α-2 changed wire, no output change |
What remains as candidate
DECODE_PARAMS / DPB construction beyond POC and flags:
- DPB entry ORDER. Phase 4c strace re-decode shows libva fills DPB in entry-array order (entry[0]=frame_1, entry[1]=frame_2), kdirect fills in short_ref[] order (DPB[0]=most-recent-ref = frame_2). May or may not matter; rkvdec docs unclear.
- DECODE_PARAMS fields after dpb[16] (bytes 512–559 of the 560-byte struct). Phase 3 didn't deep-compare these. Includes nal_ref_idc, frame_num, current frame's tfoc/bfoc, pic_order_cnt_bit_size, dec_ref_pic_marking_bit_size, delta_pic_order_cnt*, flags.
- PPS field-by-field. Phase 5b IMP-2 flagged Phase 3 only shallow-compared 12 bytes. Full byte diff needed.
- Slice data encoding. The OUTPUT bitstream bytes the kernel sees per slice. Phase 3 didn't validate this.
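For the DECODE_PARAMS candidate, the comparison only needs to cover the tail of the struct. A minimal sketch, assuming the two 560-byte payloads (libva and kdirect) have already been extracted from the strace output into raw bytes; the extraction step is not shown and `diff_range` is a hypothetical helper name:

```python
def diff_range(a: bytes, b: bytes, start: int = 512, end: int = 560):
    """List (offset, byte_a, byte_b) for every differing byte in [start, end).

    Defaults cover bytes 512-559, the fields after dpb[16] in the
    560-byte struct described above.
    """
    if len(a) < end or len(b) < end:
        raise ValueError("both dumps must cover the requested range")
    return [(i, a[i], b[i]) for i in range(start, end) if a[i] != b[i]]
```

An empty result would eliminate the post-DPB fields as a candidate; any non-empty result gives exact offsets to map back to struct members.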
5-codec regression sweep on α-2 backend
| Codec | Anchor (iter7) | iter8 α-2 hash | Verdict |
|---|---|---|---|
| H.264 | 71ac099b… | 71ac099b… | unchanged (Bug 4 still present) |
| HEVC | 06b2c5a0… | 06b2c5a0… | unchanged (Bug 5 still present) |
| VP9 | 4f1565e8… | 4f1565e8… | unchanged (PASS direct) |
| MPEG-2 | 19eefbf4… | 19eefbf4… | unchanged (PASS) |
| VP8 | bcc57ed5… | bcc57ed5… | unchanged (Bug 6 still present) |
Zero regression on the 4 non-H.264 codecs. α-2 is a hygiene cleanup that aligns libva's wire payload more closely with kdirect's, even though it doesn't close Bug 4.
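The sweep itself is a per-codec comparison of the current output hash against the iter7 anchor. A minimal sketch of that check, assuming both sets of hashes are available as full strings keyed by codec name; `sweep_verdicts` is a hypothetical helper, not part of the fork:

```python
def sweep_verdicts(anchors: dict, current: dict) -> dict:
    """Map each anchored codec to 'unchanged', 'CHANGED', or 'missing'."""
    verdicts = {}
    for codec, anchor_hash in anchors.items():
        got = current.get(codec)
        if got is None:
            verdicts[codec] = "missing"      # no output captured for this codec
        elif got == anchor_hash:
            verdicts[codec] = "unchanged"    # matches the anchor exactly
        else:
            verdicts[codec] = "CHANGED"      # regression or intended fix
    return verdicts
```

Note that "unchanged" only means "same as the anchor", not "correct": H.264, HEVC and VP8 are unchanged here while still carrying Bugs 4, 5 and 6.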
Iter8 criteria final reckoning
| # | Criterion | Verdict |
|---|---|---|
| 1 | libva H.264 == kdirect H.264 | FAIL (PARTIAL — Bug 4 narrowed but unfixed) |
| 2 | VP9 unchanged | PASS |
| 3 | MPEG-2 unchanged | PASS |
| 4 | HEVC unchanged | PASS |
| 5 | VP8 unchanged | PASS |
| 6 | Control-payload anchors hold for 4 non-H.264 codecs | PASS (4 codec hashes unchanged ⇒ controls unchanged) |
5 of 6 PASS. Criterion 1 PARTIAL: Bug 4 significantly narrowed (3 hypotheses eliminated) but not fixed.
Lasting code changes from iter8
The fork now carries 4 commits beyond iter7:
| SHA | Description | Ship status |
|---|---|---|
| 7eae6ea | γ env-gated CAPTURE buffer diagnostic dump | Diagnostic, env-gated off |
| 66ecbef | IMP-1 env-gated CAPTURE pre-zero diagnostic | Diagnostic, env-gated off |
| 6f4e583 | stdlib.h include for picture.c | Mechanical fix |
| 0226684 | α-2: POC sentinel strip removal | Hygiene cleanup, no behavior change |
All four commits are zero-regression: the two diagnostics are env-gated off, the stdlib.h include is mechanically necessary, and α-2 left all five codec hashes unchanged in the sweep. Net effect on default behavior: the only change is that the POC values sent to rkvdec are now sentinel-encoded, which does not change decoder output for any of the 5 codecs.
Backend SHA chain
| Stage | SHA |
|---|---|
| iter7 close | 520507f6d0a1a7eb3797bed42c6f74e0f3a4826ac8a22ed2655e01a6f20aa874 |
| iter8 P6 γ | 94467f0c0aaba804233206d01666302e42a5cbcaf513a8157045623a66fae58e |
| iter8 P7 IMP-1 | e4649a4854a24ec3209ea5ce6705cef8ed9b80cb277701ecb34eb822516c6030 |
| iter8 P6c α-2 | b6a3958a5bca945164262339dea5cc28f17accce13d57bd9f0c5a5dabbdf1b53 |
iter8 → iter9 handoff
iter8 closes PARTIAL with significant Bug 4 narrowing. Three hypotheses eliminated (constraint_set_flags, POC strip, libva-readback). iter9 candidates:
- α-3 PPS full-byte diff: re-strace and dump the full PPS payload for libva vs kdirect, not just the 12 bytes Phase 3 compared. If a diff exists, target it. Cost: 30 min for strace + analysis + small fix.
- α-4 DPB entry order: change h264_fill_dpb to fill in most-recent-first order matching kdirect. Cost: ~10 LOC + verify.
- α-5 Slice data encoding: deep-dump the OUTPUT buffer bytes libva writes vs kdirect's. Look for NAL-header / start-code / EBSP differences. Cost: 60 min + maybe small fix.
- α-6 DECODE_PARAMS post-DPB fields: re-strace with -s 1024 to capture full DECODE_PARAMS; diff bytes 512-559. Cost: 30 min.
Recommended order: α-3 (cheapest) → α-6 → α-4 → α-5.
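The ordering change proposed in α-4 can be illustrated independently of the C code. This is only a sketch of the ordering, not of h264_fill_dpb itself: `RefEntry` and its `decode_order` field are hypothetical, and long-term references and frame_num wrap-around are ignored.

```python
from typing import NamedTuple

class RefEntry(NamedTuple):
    name: str          # label for illustration only, e.g. "frame_1"
    decode_order: int  # hypothetical field: larger means decoded more recently

def most_recent_first(entries):
    """Return reference entries with the most recently decoded one first."""
    return sorted(entries, key=lambda e: e.decode_order, reverse=True)

# Entry-array order, as the Phase 4c strace shows for libva.
libva_order = [RefEntry("frame_1", 1), RefEntry("frame_2", 2)]
# Most-recent-first order, as observed for kdirect (DPB[0] = frame_2).
kdirect_like = most_recent_first(libva_order)
```

If the reordered DPB yields the same decoder output, entry order can be struck from the candidate list in the same way POC was.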
Lessons distilled
- Reviews caught a dead-end early. Phase 5b CRIT-1's empirical rkvdec source read saved a ~15 LOC futile change. Memory rule reinforcement: feedback_review_empirical_over_theoretical.md direction (reviewer reads source > author's plan-level speculation).
- Wire-byte changes aren't always behavior changes. α-2 visibly changed the V4L2 ioctl bytes (POC values flipped) but the decoder output was identical. Wire-level diffs are necessary but not sufficient.
- γ-then-α was the right empirical sequence. The γ dump confirmed libva-readback wasn't the issue; without it, iter9 would still be considering libva-side fixes.