iter8 Phase 7c + 8: close iter8 PARTIAL — Bug 4 narrowed via 5 eliminations

α-2 (POC strip removal) changed wire bytes (POC now matches kdirect's
sentinel-encoded 0x10000) but H.264 output unchanged. POC not load-bearing.

5-codec regression sweep on α-2 backend: all 4 non-H.264 anchors hold.
Zero regression.

Iter8 close: 5/6 PASS, criterion-1 PARTIAL. Bug 4 narrowed but not fixed.

Eliminations achieved:
  1. libva-readback bug (γ dump)
  2. Slot-binding wrong (γ dump shows correct slot per surface)
  3. Stale residue (IMP-1 memset confirmed deterministic kernel write)
  4. constraint_set_flags (Phase 5b CRIT-1: rkvdec source review)
  5. POC sentinel strip (α-2 wire change, no output change)

Remaining candidates for iter9: PPS diff (α-3), DECODE_PARAMS post-DPB
fields (α-6), DPB entry order (α-4), slice data encoding (α-5).

Fork tip 0226684 carries γ + IMP-1 diagnostic + α-2 hygiene. All
env-gated off by default; α-2 is a wire-payload cleanup with zero
behavior effect.

Lessons distilled:
- Reviews are never skippable — Phase 5b CRIT-1 saved a build cycle.
- Wire-byte equivalence ≠ behavior equivalence.
- Per-driver kludges in shared codec code need explicit gating.
- Bug carryover labels can mislead (Bug 4 != "inter race-loss").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:01:36 +00:00
parent 16034152a8
commit 3ed1e454fb
3 changed files with 368 additions and 0 deletions
@@ -0,0 +1,101 @@
# Iteration 8 — Phase 7c (α-2 verification + 5-codec regression sweep)
Captured 2026-05-13. Backend SHA `b6a3958a5bca9451…` (iter8 P6 γ + IMP-1 + α-2 POC strip removal). Fork tip `0226684`.
## Verdict on α-2
**α-2 (remove POC sentinel strip) did NOT fix Bug 4.**
| Metric | Before α-2 | After α-2 |
|---|---|---|
| H.264 libva YUV hash | `71ac099b…` | `71ac099b…` (unchanged) |
| Frame 1 non-zero bytes | 512 (16×32 leak) | 512 (same pattern) |
| Frame 2 non-zero | 32 sparse UV markers | 32 same |
| Frame 3 non-zero | 16 sparse UV markers | 16 same |
α-2 DID change the wire-level POC bytes (now matches kdirect's sentinel-encoded `0x00010000` for tfoc/bfoc instead of libva's previously-stripped `0`), but the kernel produces identical output. **POC values are not load-bearing for Bug 4.**
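The two kinds of metric in the table above (a hash of the decoded YUV dump and a per-frame count of non-zero bytes) can be reproduced with a short script. This is a minimal sketch, not the tooling used for this report: the function name `frame_stats`, the choice of SHA-256, and the idea that the dump is a flat concatenation of equally sized frames are all assumptions.

```python
import hashlib


def frame_stats(yuv: bytes, frame_size: int):
    """Return (hex digest of the whole dump, non-zero byte count per frame).

    Assumptions (not taken from the report): the dump is a flat
    concatenation of frames of `frame_size` bytes, and SHA-256 is the
    hash of interest.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    digest = hashlib.sha256(yuv).hexdigest()
    counts = []
    for offset in range(0, len(yuv), frame_size):
        frame = yuv[offset:offset + frame_size]
        # Non-zero bytes = total bytes minus the zero bytes in this frame.
        counts.append(len(frame) - frame.count(0))
    return digest, counts


if __name__ == "__main__":
    # Synthetic two-frame dump: first frame all zero, second has two markers.
    dump = bytes(8) + b"\x01\x00\x02\x00" + bytes(4)
    digest, counts = frame_stats(dump, frame_size=8)
    print(digest[:8], counts)
```

Running the same function on a before and an after dump and comparing both return values gives the "unchanged" verdict in one step: equal digests imply equal per-frame counts.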
## What Phase 7c eliminated
This iteration's α-2 attempt eliminated one more hypothesis. Combined with prior phases:
| Hypothesis | Status | Eliminated by |
|---|---|---|
| libva mis-reads CAPTURE buffer | ❌ Eliminated | γ dump shows libva reads what's in buffer |
| Slot-binding wrong | ❌ Eliminated | γ shows correct slot index per surface |
| Stale residue from prior runs | ❌ Eliminated | IMP-1 memset experiment (same hash) |
| `constraint_set_flags` (SPS byte 1) | ❌ Eliminated | Phase 5b CRIT-1: rkvdec source review |
| H.264 POC sentinel strip | ❌ Eliminated | α-2 changed wire, no output change |
## What remains as candidate
DECODE_PARAMS / DPB construction beyond POC and flags:
1. **DPB entry ORDER**. Phase 4c strace re-decode shows libva fills DPB in entry-array order (entry[0]=frame_1, entry[1]=frame_2), kdirect fills in short_ref[] order (DPB[0]=most-recent-ref = frame_2). May or may not matter; rkvdec docs unclear.
2. **DECODE_PARAMS fields after dpb[16]** (bytes 512-559 of the 560-byte struct). Phase 3 didn't deep-compare these. Includes nal_ref_idc, frame_num, current frame's tfoc/bfoc, pic_order_cnt_bit_size, dec_ref_pic_marking_bit_size, delta_pic_order_cnt*, flags.
3. **PPS field-by-field**. Phase 5b IMP-2 flagged that Phase 3 shallow-compared only 12 bytes. A full byte diff is needed.
4. **Slice data encoding**. The OUTPUT bitstream bytes the kernel sees per slice. Phase 3 didn't validate this.
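The DPB ordering difference in candidate 1 can be illustrated with a toy model. This is a sketch under stated assumptions, not the fork's `h264_fill_dpb` logic: the `Ref` type, its `decode_index` field, and the three helper functions are hypothetical, and real short-term reference ordering depends on more state than a single index.

```python
from typing import List, NamedTuple


class Ref(NamedTuple):
    name: str
    decode_index: int  # hypothetical: position of the frame in decode order


def entry_array_order(refs: List[Ref]) -> List[Ref]:
    """Order described for libva: oldest reference in slot 0."""
    return sorted(refs, key=lambda r: r.decode_index)


def most_recent_first(refs: List[Ref]) -> List[Ref]:
    """Order described for kdirect: most recent reference in slot 0."""
    return sorted(refs, key=lambda r: r.decode_index, reverse=True)


def differing_slots(a: List[Ref], b: List[Ref]) -> List[int]:
    """Slot indices where the two DPB layouts hold different frames."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    refs = [Ref("frame_1", 1), Ref("frame_2", 2)]
    libva_like = entry_array_order(refs)
    kdirect_like = most_recent_first(refs)
    print([r.name for r in libva_like])
    print([r.name for r in kdirect_like])
    print(differing_slots(libva_like, kdirect_like))
```

With two references every slot differs, so if slot order matters to the decoder, the effect should show up as soon as a frame has more than one reference.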
## 5-codec regression sweep on α-2 backend
| Codec | Anchor (iter7) | iter8 α-2 hash | Verdict |
|---|---|---|---|
| H.264 | `71ac099b…` | `71ac099b…` | unchanged (Bug 4 still present) |
| HEVC | `06b2c5a0…` | `06b2c5a0…` | unchanged (Bug 5 still present) |
| VP9 | `4f1565e8…` | `4f1565e8…` | unchanged (PASS direct) |
| MPEG-2 | `19eefbf4…` | `19eefbf4…` | unchanged (PASS) |
| VP8 | `bcc57ed5…` | `bcc57ed5…` | unchanged (Bug 6 still present) |
**Zero regression on the 4 non-H.264 codecs.** α-2 is a hygiene cleanup that aligns libva's wire payload more closely with kdirect's, even though it doesn't close Bug 4.
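The sweep reduces to comparing each codec's current output hash against its iter7 anchor. A minimal sketch of that comparison follows. The function name `sweep_verdicts` is hypothetical, and the values are the truncated prefixes shown in the table above; a real check would compare full digests.

```python
from typing import Dict


def sweep_verdicts(anchors: Dict[str, str], current: Dict[str, str]) -> Dict[str, str]:
    """Classify each anchored codec as 'unchanged', 'CHANGED', or 'missing'."""
    verdicts = {}
    for codec, anchor in anchors.items():
        got = current.get(codec)
        if got is None:
            verdicts[codec] = "missing"
        elif got == anchor:
            verdicts[codec] = "unchanged"
        else:
            verdicts[codec] = "CHANGED"
    return verdicts


# Hash prefixes as printed in the table (the full digests are elided there).
ITER7_ANCHORS = {
    "H.264": "71ac099b",
    "HEVC": "06b2c5a0",
    "VP9": "4f1565e8",
    "MPEG-2": "19eefbf4",
    "VP8": "bcc57ed5",
}

if __name__ == "__main__":
    iter8_alpha2 = dict(ITER7_ANCHORS)  # the sweep found identical hashes
    for codec, verdict in sweep_verdicts(ITER7_ANCHORS, iter8_alpha2).items():
        print(f"{codec}: {verdict}")
```

Note that "unchanged" here means "same output as the anchor", not "correct": H.264, HEVC, and VP8 are anchored to outputs that still exhibit their known bugs.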
## Iter8 criteria final reckoning
| # | Criterion | Verdict |
|---|---|---|
| 1 | libva H.264 == kdirect H.264 | **FAIL** (PARTIAL — Bug 4 narrowed but unfixed) |
| 2 | VP9 unchanged | PASS |
| 3 | MPEG-2 unchanged | PASS |
| 4 | HEVC unchanged | PASS |
| 5 | VP8 unchanged | PASS |
| 6 | Control-payload anchors hold for 4 non-H.264 codecs | PASS (4 codec hashes unchanged ⇒ controls unchanged) |
**5 of 6 PASS.** Criterion 1 PARTIAL: Bug 4 significantly narrowed (5 hypotheses eliminated) but not fixed.
## Lasting code changes from iter8
The fork now carries 4 commits beyond iter7:
| SHA | Description | Ship status |
|---|---|---|
| `7eae6ea` | γ env-gated CAPTURE buffer diagnostic dump | Diagnostic, env-gated off |
| `66ecbef` | IMP-1 env-gated CAPTURE pre-zero diagnostic | Diagnostic, env-gated off |
| `6f4e583` | stdlib.h include for picture.c | Mechanical fix |
| `0226684` | α-2: POC sentinel strip removal | Hygiene cleanup, no behavior change |
All four commits are zero-regression: the two diagnostics are env-gated off by default, the include is a mechanical fix, and α-2 was verified by the 5-codec sweep. Net effect on default behavior: the only change is that the internal POC values sent to rkvdec are now sentinel-encoded, which does not change decoder output for any of the 5 codecs.
## Backend SHA chain
| Stage | SHA |
|---|---|
| iter7 close | `520507f6d0a1a7eb3797bed42c6f74e0f3a4826ac8a22ed2655e01a6f20aa874` |
| iter8 P6 γ | `94467f0c0aaba804233206d01666302e42a5cbcaf513a8157045623a66fae58e` |
| iter8 P7 IMP-1 | `e4649a4854a24ec3209ea5ce6705cef8ed9b80cb277701ecb34eb822516c6030` |
| iter8 P7c α-2 | `b6a3958a5bca945164262339dea5cc28f17accce13d57bd9f0c5a5dabbdf1b53` |
## iter8 → iter9 handoff
iter8 closes PARTIAL with significant Bug 4 narrowing. Five hypotheses eliminated (libva readback, slot binding, stale residue, constraint_set_flags, POC strip). iter9 candidates:
- **α-3 PPS full-byte diff**: re-strace and dump every byte of the PPS for libva vs kdirect, not only the 12 bytes Phase 3 compared. If a diff exists, target it. Cost: 30 min for strace + analysis + small fix.
- **α-4 DPB entry order**: change `h264_fill_dpb` to fill in most-recent-first order matching kdirect. Cost: ~10 LOC + verify.
- **α-5 Slice data encoding**: deep-dump the OUTPUT buffer bytes libva writes vs kdirect's. Look for NAL-header / start-code / EBSP differences. Cost: 60 min + maybe small fix.
- **α-6 DECODE_PARAMS post-DPB fields**: re-strace with `-s 1024` to capture full DECODE_PARAMS; diff bytes 512-559. Cost: 30 min.
Recommended order: α-3 (cheapest) → α-6 → α-4 → α-5.
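The byte-range comparisons in α-3 and α-6 come down to the same operation on two captured payloads. The sketch below assumes the libva and kdirect payloads have already been extracted from the strace output into `bytes` objects of equal length; the function name `diff_range` is hypothetical and the extraction step is not shown.

```python
from typing import List, Tuple


def diff_range(a: bytes, b: bytes, start: int, end: int) -> List[Tuple[int, int, int]]:
    """Return (offset, a_byte, b_byte) for every differing offset in [start, end].

    Both bounds are inclusive, matching the "bytes 512-559" notation.
    """
    if len(a) != len(b):
        raise ValueError("payloads must have the same length")
    if not (0 <= start <= end < len(a)):
        raise ValueError("range must lie inside the payload")
    return [(i, a[i], b[i]) for i in range(start, end + 1) if a[i] != b[i]]


if __name__ == "__main__":
    # Synthetic 560-byte payloads standing in for the two DECODE_PARAMS dumps.
    libva_like = bytes(560)
    kdirect_like = bytearray(560)
    kdirect_like[100] = 0x01   # inside dpb[16], outside the range of interest
    kdirect_like[520] = 0x7F   # inside the post-DPB range
    for offset, x, y in diff_range(libva_like, bytes(kdirect_like), 512, 559):
        print(f"offset {offset}: {x:#04x} vs {y:#04x}")
```

Restricting the diff to an explicit range keeps already-eliminated differences (such as the POC bytes inside the DPB entries) out of the report.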
## Lessons distilled
1. **Reviews caught a dead end early**. Phase 5b CRIT-1's empirical read of the rkvdec source avoided a futile ~15 LOC change. This reinforces the memory rule in `feedback_review_empirical_over_theoretical.md`: a reviewer reading the source outweighs the author's plan-level speculation.
2. **Wire-byte changes aren't always behavior changes**. α-2 visibly changed the V4L2 ioctl bytes (POC values flipped) but the decoder output was identical. Wire-level diffs are necessary but not sufficient.
3. **γ-then-α was the right empirical sequence**. The γ dump confirmed libva-readback wasn't the issue; without it, iter9 would still be considering libva-side fixes.