Verification on linux-fresnel-fourier 7.0-1:
PASS:
- Criterion 1: vainfo enumerates VAProfileVP9Profile0 via auto-detect.
- Criterion 2: vaCreateConfig SUCCESS (implicit).
- Criterion 3: ffmpeg-vaapi VP9 5-frame decode exit 0 at 0.307x, no
ioctl errors.
FAIL — three distinguishable bug classes:
Bug 1 (VP9-specific, my Clause 6 parser):
Strace of frame-1 keyframe FRAME control vs Phase 3 anchor:
- byte 8 (lf.flags): mine=0x01 (DELTA_ENABLED only) vs ref=0x03
(ENABLED|UPDATE).
- byte 16 (base_q_idx): mine=0x41 (65) vs ref=0x2e (46).
- byte 17 (delta_q_y_dc): mine=8 vs ref=0.
Bit-trace shows my parser is 2 bits ahead of correct position by
the time it reaches lf_delta_enabled. Fix path: faithful port of
FFmpeg vp9.c::decode_frame_header.
Bug 2 (substrate-wide, cap_pool readback):
Constant RGB(0, 0x4c, 0) "0x4c gray" pattern across all codecs
(VP9, HEVC, MPEG-2, VP8). H.264 keyframe DOES read correctly with
real RGB(0, 0xe3, 0) content; H.264 inter frames revert to 0x4c.
Kernel decode succeeds (Phase 3 strace + ffmpeg-v4l2request
standalone confirm). libva readback returns cap_pool init scratch.
Sibling of iter3 dma_resv blocker but with different signature
(constant 0x4c instead of all-zero 0x00).
Bug 3 (hantro UAPI drift):
MPEG-2 + VP8 produce kernel "Unable to set control(s): Invalid
argument" errors. UAPI struct sizes/fields likely shifted between
6.19.9 and 7.0 (sibling of Phase 3 VP9 struct-size correction
144/1947 -> 168/2040).
Three loopback options proposed (decision pending user):
- A: VP9-only fix (Clause 6 parser); accept Bug 2/3 as substrate
pre-existing; criterion 4 transitive-only per iter3.
- B: Full loopback covering all 3 bugs; possibly requires kernel
patches (vb2_dma_resv RFC v2).
- C: Phase 0 reset; substrate is the primary issue; pause iter4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 4 — Phase 7 (verification)
Verification of iter4 Commits Z+A+B+C against the 5 Phase 1 success criteria. Captured 2026-05-10 09:00–09:10 CEST on fresnel linux-fresnel-fourier 7.0-1.
Verdict: criteria 1, 2, and 3 PASS. Criteria 4 and 5 FAIL with three distinguishable bug classes (two of them behind criterion 4) that combine into a Phase 7 → Phase 4 (or possibly Phase 0) loopback.
Summary table
| Criterion | Test | Result |
|---|---|---|
| 1 | vainfo enumerates VAProfileVP9Profile0 | PASS (auto-detect rkvdec) |
| 2 | vaCreateConfig SUCCESS | PASS (implicit via 3) |
| 3 | ffmpeg-vaapi VP9 decode exit 0 | PASS (5 frames at 0.307x, no errors) |
| 4 | HW=SW byte-identical | FAIL (two issues, see below) |
| 5 | 4-codec regression | FAIL (substrate-wide cap_pool-readback regression) |
Fork tip: beaa914 (iter4 Commit C). Backend SHA256: f2ff6598....
Criterion 4 — VP9 HW=SW
Test: ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vf hwdownload bbb_720p10s_vp9.webm -frames:v 5 hw_%04d.png → SHA256 vs Phase 3 SW reference PNGs.
Result: All 5 HW PNGs have the SAME hash 93dd9db51385.... Pixels are constant RGB(0, 0x4c, 0) across all frames. Phase 3 SW reference has varied content RGB(0x70, 0x6d, 0x55) etc.
Conclusion: HW path is producing constant-fill scratch instead of decoded video. Two contributing root causes identified:
Bug 1 — Clause 6 uncompressed-header parser bit-misalignment (VP9-specific)
Strace of my backend's submission vs Phase 3 anchor (frame-1 keyframe):
                 lf struct (16 B)                    quant struct (8 B)
mine:            01 00 ff ff 00 00 03 00 01 00 ...   41 08 00 08 00 00 00 00
Phase 3 anchor:  01 00 ff ff 00 00 03 00 03 00 ...   2e 00 00 00 00 00 00 00
Match: bytes 0–7 (lf prefix is correct, level=3, sharpness=0).
Diverge byte 8 (lf.flags): 0x01 (DELTA_ENABLED only) vs 0x03 (ENABLED|UPDATE).
Diverge byte 16 (base_q_idx): 0x41 (65) vs 0x2e (46).
Diverge byte 17 (delta_q_y_dc): 8 vs 0.
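The divergence list can be reproduced mechanically from the two payloads. The following is a minimal C sketch that uses only the bytes quoted in the dump; the six lf bytes hidden behind "..." are left out rather than guessed. The helper diff_offsets and the array names are hypothetical and not part of the backend.

```c
#include <stddef.h>
#include <stdint.h>

/* Collect the offsets at which two equally sized byte ranges differ.
 * Writes at most max_out offsets to out and returns how many were written. */
static size_t diff_offsets(const uint8_t *a, const uint8_t *b, size_t len,
                           size_t *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < len && n < max_out; i++) {
        if (a[i] != b[i])
            out[n++] = i;
    }
    return n;
}

/* First 10 bytes of the lf struct as quoted in the report (bytes 10..15 are
 * elided there and deliberately not reproduced here). */
static const uint8_t mine_lf[10] = {
    0x01, 0x00, 0xff, 0xff, 0x00, 0x00, 0x03, 0x00, 0x01, 0x00
};
static const uint8_t ref_lf[10] = {
    0x01, 0x00, 0xff, 0xff, 0x00, 0x00, 0x03, 0x00, 0x03, 0x00
};

/* The 8-byte quant struct; it starts at absolute byte 16 of the control. */
static const uint8_t mine_quant[8] = {
    0x41, 0x08, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00
};
static const uint8_t ref_quant[8] = {
    0x2e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
```

On the quoted bytes, the lf prefix differs only at offset 8. The quant struct differs at offsets 0, 1, and 3, which are absolute bytes 16, 17, and 19, so the fourth quant byte (mine=0x08, ref=0x00) diverges as well.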
The parser is 2 bits ahead by the time it reaches lf_delta_enabled. A bit-trace of the BBB keyframe bitstream (82 49 83 42 40 4f f0 2c f6 06 38 ...) gives the correct bit positions for these fields per VP9 spec section 6.2; my parser's output matches what you get by reading each field 2 bits past its correct position.
The 2-bit drift origin is somewhere in the keyframe color_config / refresh_frame_context section. Likely candidates:
- The if (1) /* color_space != CS_RGB */ shortcut may be reading the wrong bits when color_space == CS_RGB.
- The conditional refresh_frame_context / parallel_decoding_mode reading is off-spec for some path.
- An off-by-one in the keyframe else (inter) branch may carry over.
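One way to narrow down these candidates is to replay the header with a small MSB-first bit reader and compare each field at its expected position and at +2 bits. The sketch below is illustrative only: struct bitreader, br_read, and br_peek_at are hypothetical names, not the backend's parser. It relies on two properties of a profile-0 VP9 keyframe header: frame_marker is the 2-bit value 2, and the 24-bit sync code 0x498342 starts at bit 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader (hypothetical helper for illustration). */
struct bitreader {
    const uint8_t *buf;
    size_t len;     /* buffer length in bytes */
    size_t bitpos;  /* absolute index of the next bit to read */
};

/* Read n bits (n <= 32), most significant bit first.
 * Bits past the end of the buffer read as zero. */
static uint32_t br_read(struct bitreader *br, unsigned n)
{
    uint32_t v = 0;
    while (n--) {
        size_t byte = br->bitpos >> 3;
        unsigned shift = 7u - (unsigned)(br->bitpos & 7u);
        unsigned bit = byte < br->len ? (br->buf[byte] >> shift) & 1u : 0u;
        v = (v << 1) | bit;
        br->bitpos++;
    }
    return v;
}

/* Read an n-bit field at an explicit bit offset, leaving no reader state
 * behind. Useful for comparing "position p" against "position p + 2". */
static uint32_t br_peek_at(const uint8_t *buf, size_t len,
                           size_t bitpos, unsigned n)
{
    struct bitreader br = { buf, len, bitpos };
    return br_read(&br, n);
}

/* First bytes of the BBB keyframe as quoted in this report. */
static const uint8_t bbb_keyframe[] = {
    0x82, 0x49, 0x83, 0x42, 0x40, 0x4f, 0xf0, 0x2c, 0xf6, 0x06, 0x38
};
```

Reading the sync code at bit 8 yields 0x498342, while the same 24-bit read at bit 10 yields 0x260d09. Applying the same pair of reads to each later field, and noting the first field where the parser's value matches the +2 read instead of the in-place read, should bracket where the drift is introduced.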
The wrong base_q_idx=65 produces garbage dequantization → kernel decodes incorrectly. This alone would invalidate VP9 decode output even if the readback path were perfect.
Fix path: rewrite Clause 6 as a faithful port of FFmpeg vp9.c::decode_frame_header lines 540–700 (the canonical parser). ~150 LOC, replacing my current ~120 LOC partial port. Validate by extracting Phase 3 anchor's verbatim payload bits and confirming byte-by-byte match.
Bug 2 — Cap-pool readback returns scratch buffer (substrate-wide)
The constant RGB(0, 0x4c, 0) pattern also appears for all 4 already-shipping codecs, not just VP9 (see Criterion 5 below). This is a substrate-level issue on linux-fresnel-fourier 7.0-1, not a VP9-specific one. Even if Bug 1 is fixed, criterion 4 will still fail until Bug 2 is resolved.
Symptom: kernel decodes successfully (Phase 3's strace + ffmpeg-v4l2request standalone test confirm decode succeeds), but our libva backend's cap_pool readback returns the buffer init pattern, not the kernel's decoded pixels. The pages may be cached or the slot may be unbound.
This is a sibling of the iter3 dma_resv issue, which is documented in memory reference_dmabuf_resv_blocker.md, but its signature differs: constant 0x4c instead of all-zero 0x00. The new substrate may have made the issue worse, broader, or simply different.
Criterion 5 — 4-codec regression block
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vf hwdownload for each codec, 3 frames each, SHA256 of resulting PNGs.
| Codec | Driver | Frame 1 hash | Frames 2-3 hash | Verdict |
|---|---|---|---|---|
| H.264 | rkvdec auto | 50ca48b6... (real RGB(0, 0xe3, 0) content) | 7c70fdda... (stuck RGB(0, 0x4c, 0)) | PARTIAL: keyframe decodes, inter frames stuck |
| HEVC | rkvdec auto | 85243d2c... (constant RGB(0, 0x4c, 0)) | same | FAIL |
| MPEG-2 | hantro env | 93dd9db5... (same constant as VP9) | same | FAIL + kernel Unable to set control(s) errors |
| VP8 | hantro env | f8103ae1... (constant) | same | FAIL + kernel Unable to set control(s) errors |
Hash 93dd9db5... appears across VP9 and MPEG-2 — confirms shared cap_pool init pattern (0x4c = decimal 76).
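Hash equality only shows that two frames are identical; whether a frame is the constant fill can be checked directly on the decoded pixels. The following is a minimal C sketch, assuming the frame has already been converted to a packed RGB24 buffer. The function name rgb24_is_constant is hypothetical and not part of the backend or of any library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true when every pixel of a packed RGB24 buffer equals (r, g, b).
 * px must hold 3 * npixels bytes. An empty buffer is reported as not
 * constant, so a missing frame is never mistaken for a scratch fill. */
static bool rgb24_is_constant(const uint8_t *px, size_t npixels,
                              uint8_t r, uint8_t g, uint8_t b)
{
    if (npixels == 0)
        return false;
    for (size_t i = 0; i < npixels; i++) {
        const uint8_t *p = px + 3 * i;
        if (p[0] != r || p[1] != g || p[2] != b)
            return false;
    }
    return true;
}
```

Calling it with (0, 0x4c, 0) on each readback frame would flag the scratch pattern without depending on PNG encoder details, and calling it with (0, 0, 0) would cover the all-zero signature seen in iter3.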
Cap-pool readback regression: H.264 keyframe DOES read correctly, indicating the readback path partially works. Inter frames return scratch — the cap_pool slot rotation between QBUF cycles isn't being driven by the kernel's actual decoded frame data. This is the sibling of Bug 2 above.
Hantro Unable to set control(s) errors: a kernel-side rejection on hantro for MPEG-2/VP8. Substrate change appears to have shifted hantro's expected control structure or fields; iter1 (MPEG-2) and iter3 (VP8) were tested on 6.19.9 — UAPI likely drifted between 6.19.9 and 7.0 the same way VP9 did (Phase 3 baseline showed VP9 struct sizes 168/2040 differ from Phase 2's 6.19.9-based estimates 144/1947).
Distinguishing the three bug classes
Bug 1 (VP9 parser) — pure iter4 work; localized to src/vp9.c::vp9_parse_uncompressed_header_lf_quant. Doesn't affect H.264/HEVC/MPEG-2/VP8.
Bug 2 (cap-pool readback) — substrate-wide; affects ALL codecs on linux-fresnel-fourier 7.0-1. Was masked on linux-eos-arm 6.19.9-99 because rkvdec's cap_pool worked there; iter3 hit the hantro variant via dma_resv. Now broader.
Bug 3 (hantro UAPI drift) — substrate-related; affects MPEG-2 + VP8 specifically with kernel-rejection ioctl errors. Probably struct-size or new-required-field in the kernel UAPI for hantro stateless controls. Distinct from Bug 2's silent-pass-but-wrong-buffer pattern.
Path forward — three options
Option A — VP9-only Phase 4 loopback for Bug 1 alone
Fix Clause 6 parser. Re-run Phase 6 + Phase 7. Accept Bug 2 as "criterion-4 transitive proof only" per iter3 precedent (memory reference_dmabuf_resv_blocker.md). Accept Bug 3 as substrate-induced regression of iter1+iter3 (file as new memory entry, defer fix).
Pro: fastest path to a "VP9 controls correct" demonstration; preserves iter4's nominal scope.
Con: criterion 4 stays effectively-not-direct; campaign scoreboard slips from "5/5 direct" to "5/5 transitive". Bug 2 + Bug 3 are campaign-wide issues that this iter shouldn't paper over.
Option B — Full Phase 4 loopback covering Bugs 1 + 2 + 3
Fix Clause 6 parser. Investigate Bug 2 (cap-pool readback regression) — likely requires kernel-side patch (vb2_dma_resv RFC v2 per memory reference_fresnel_kernel_substrate.md) OR libva-side cap_pool refactor. Investigate Bug 3 (hantro UAPI drift) — re-run iter1 + iter3 strace baselines on 7.0 to see what changed.
Pro: addresses root causes; restores 4-codec block; criterion 4 direct.
Con: scope expansion, possibly substantial. Bug 2 may require landing kernel patches that aren't ready yet (RFC v1 rejected, v2 in design).
Option C — Phase 0 reset: investigate substrate as the primary issue
Step back from iter4. Document that 6.19 → 7.0 substrate change has broken the campaign's Phase 1 baseline assumptions. Choose one of:
- Roll fresnel back to a kernel that has the working cap_pool path (NOT 6.19.9 since decommissioned, but possibly some intermediate kernel before whatever regressed).
- Land kernel patches in kernel-agent to fix Bug 2 + Bug 3 substrate issues, rebuild linux-fresnel-fourier, retest from iter1 baseline.
Pro: addresses the real underlying issue. iter4 + iter1 + iter2 + iter3 all become re-anchorable.
Con: large scope. Multi-week. iter4 specifically gets paused.
Substrate state at Phase 7 close
- Fork at iter4 Commit C tip beaa914 (Phase 6 build clean, Criterion 1+2+3 PASS).
- Phase 7 captures persisted: fresnel:/tmp/iter4_phase7/ and noether:~/src/fresnel-fourier/iter4_phase7.tgz.
- Phase 4 plan ready for amendment after option-decision.
- Memory rules carry forward; no new memory entries added in Phase 7 (the findings are iteration-specific, not cross-cutting yet).
Decision point
Pick A / B / C before Phase 4 loopback or campaign re-orientation.