iter5b-β surfaced 3 explicit bugs (Bug 4 H.264 inter, Bug 5 HEVC DQBUF ERROR, Bug 6 VP8 partial output) plus carried backlog items (iter4-B1 device discrimination, B2-B6, L3, Q6, COLOR_RANGE).

Candidates F-J laid out for user lock:
- F: Bug 5 HEVC kernel-rejection (highest claim-vs-reality stigma)
- G: Bug 6 VP8 partial output (smallest suspect surface)
- H: Bug 4 H.264 inter race (highest consumer impact)
- I: Re-anchor regression hashes on β substrate
- J: iter4-B1 auto-detect harden

Recommendation: G → H → F sequence if multiple iters planned; otherwise H for impact or J for architectural-cleanup fit. Phase 1 lock pending user pick.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 6 — Phase 0 (substrate / motivation / inventory) → Phase 1 lock
Opens 2026-05-12 immediately after iter5b-β close (phase8_iteration5b_close.md, commit 9a14cc2). Per feedback_dev_process.md Phase 0, this document captures iter6's substrate state, the iter5b-β-surfaced bug inventory, and candidate research questions for Phase 1 lock.
iter5b-β was the first iteration to break the "codec N + 1" pattern. iter6 inherits an even messier menu: 3 named bugs from iter5b, the iter4-B1+B4 carry-overs, and the option-Φ candidates from the original iter5 Phase 0 doc that weren't picked.
Substrate state (verified 2026-05-12 at iter5b-β close)
| Property | Value | Notes |
|---|---|---|
| Kernel | `7.0.0-fresnel-fourier` | `linux-fresnel-fourier 7.0-1` kernel-agent product; unchanged through iter5b. |
| Boot device numbering (today) | rkvdec `/dev/video1`+`/dev/media0`, hantro-vpu-dec `/dev/video3`+`/dev/media1` | Different from yesterday; iter4-B1 still open. |
| Fork tip | `70196f8` (β + Commit D) | On noether + fresnel + gitea. |
| Backend installed | `/usr/lib/dri/v4l2_request_drv_video.so` SHA `2c6ff82cbdc156ff8910d0c7fe58e75eeecdfd6e6a1caabb049c8adf43a098b8` | β architecture; OUTPUT lifecycle owned by CreateContext. |
| Codec scoreboard | 5/5 with 2 direct (VP9, MPEG-2) + 3 mixed (H.264 keyframe-partial Bug 4, VP8 partial Bug 6, HEVC transitive-only direct-FAIL Bug 5) | iter5b-β closed VP9 directly; others remain mixed. |
Bug inventory after iter5b-β
Active bugs with explicit reproduction signatures
Bug 4 — H.264 inter-frame race-loss (carried from iter4 Phase 7)
- Signature: H.264 keyframe decodes correctly through libva; inter frames return all-zero pages.
- Reproduce: `ffmpeg -hwaccel vaapi -i bbb_1080p30_h264.mp4 -frames:v 3 -vf hwdownload,format=nv12,format=yuv420p -f rawvideo out.yuv`. Hash `71ac099b…`. 99.99% zero; frame 1 first 16 bytes = `81 81 80 80 80 7f 7f 7f 7f 7f 7f 80 80 80 81 81` (real chroma); frames 2, 3 fully zero.
- Hypothesis surface: slot rotation, partial DPB, or some inter-specific submission gap.
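The per-frame zero observations above can be re-checked mechanically. A minimal sketch, assuming `out.yuv` is raw 8-bit yuv420p with even dimensions (as the `format=yuv420p -f rawvideo` chain suggests); the helper names are illustrative and not part of the fork:

```python
from __future__ import annotations


def frame_size_yuv420p(width: int, height: int) -> int:
    """Bytes per 8-bit yuv420p frame: full-res Y plus two quarter-res chroma planes."""
    return width * height * 3 // 2


def zero_fraction_per_frame(data: bytes, width: int, height: int) -> list[float]:
    """Return the fraction of zero bytes in each complete frame of a raw stream."""
    size = frame_size_yuv420p(width, height)
    fractions = []
    # Step frame by frame; a trailing partial frame is ignored.
    for start in range(0, len(data) - size + 1, size):
        frame = data[start:start + size]
        fractions.append(frame.count(0) / size)
    return fractions
```

For the 3-frame capture this would be called with the file's bytes and the fixture's dimensions (1920x1080 is assumed from the fixture name). A result close to `[small, 1.0, 1.0]` would match the "frames 2, 3 fully zero" signature.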
Bug 5 — HEVC libva DQBUF returns FLAG_ERROR
- Signature: every HEVC libva DQBUF (both OUTPUT and CAPTURE) sets `V4L2_BUF_FLAG_ERROR`. Kernel rkvdec rejects the decode. CAPTURE stays at cap_pool init pattern (all-zero).
- Reproduce: same shape as Bug 4, with `bbb_720p10s_hevc.mp4`. Hash `06b2c5a0…` = all-zero.
- Pre-existing: Phase 3 baseline anchor trace (pre-iter5b) also showed FLAG_ERROR on every HEVC DQBUF. iter2's "PASS via transitive proof" verified backend's control PAYLOAD matched kdirect's payload, but the kernel rejected the libva submission regardless. Some V4L2 protocol contract aspect differs between libva backend and ffmpeg-v4l2request that the transitive proof didn't capture (request_fd binding order, sequence number, ioctl sequencing, extra control needed, etc.).
- Difficulty estimate: medium-to-high. Need to diff the actual V4L2 ioctl streams between libva-vaapi-HEVC and ffmpeg-v4l2request-HEVC and find what's different on the wire.
Bug 6 — VP8 libva produces non-zero but non-matching output
- Signature: VP8 libva runs decode (no DQBUF ERROR), produces real-looking content (256 unique bytes, frame 1 first 16 = `93 8e 8a 89 85 72 8c 6d 82 79 92 7e 80 80 80 80`), but output bytes diverge from kdirect's `136ce5cb…`. Pre-iter5b VP8 was all-zero (the format-mismatch issue); β unblocked decode; what's left is a different bug.
- Reproduce: VP8 fixture, same harness. Hash `bcc57ed5…` (libva) vs `136ce5cb…` (kdirect == sw). 74.8% zero in libva output suggests partial fill.
- Hypothesis surface: cap_pool slot rotation off-by-one, partial buffer fill per frame, or per-frame DPB sync issue.
- Difficulty estimate: medium.
Backend-class backlog items (carried forward, none touched in iter5b-β)
- iter4-B1 — auto-detect picks wrong device on per-boot enumeration shuffle. Cost: 1-2 min per session re-mapping. Backend-only fix; medium scope (proper media-topology decoder/encoder discrimination via `MEDIA_ENT_F_PROC_VIDEO_DECODER`).
- iter4-B2 — mpv-vaapi `Could not create device` for VP9. Consumer-side.
- iter4-Q6 — per-segment quant-scale lossy mapping (VP9).
- iter4-COLOR_RANGE — VAAPI exposes no color_range field.
- B3 — picture.c BeginPicture profile-aware reset.
- B4 — context.c log suppression for unsupported codec controls (the EINVAL B4 cosmetic noise on hantro init probes).
- B5 — mpeg2 vbv_buffer_size polish.
- B6 — h265 SPS bitstream-parse fidelity gap.
- L3 — vaDeriveImage cache-stale.
Substrate-class items (iter5 originally targeted, now contextualized)
- vb2_dma_resv RFC v2 patches: tracked at `~/src/linux-rfc/`. Verified by iter5 Phase 5 review to NOT be the right fix for the libva backend's MMAP+EXPBUF readback path (different memory model than the patches address). Still useful for DMABUF-import compositor paths (KWin, Mesa). Not in iter6's libva-decode-correctness critical path.
- panfrost IOMMU_CACHE: separate sibling work-stream; not in iter6 scope.
Candidate research questions for iter6
Candidate F — Bug 5: HEVC libva decode kernel-rejection
"Identify and fix the V4L2 protocol contract difference that causes kernel rkvdec to reject HEVC decode via the libva backend while accepting it via ffmpeg-v4l2request. After fix: libva HEVC == kdirect HEVC == SW (byte-identical YUV)."
Approach sketch: strace both libva-vaapi-HEVC and ffmpeg-v4l2request-HEVC on the same fixture; diff the ioctl streams; find the divergence (likely in S_EXT_CTRLS sequencing, MEDIA_REQUEST_IOC_QUEUE ordering, or buffer-binding ordering); patch libva backend's HEVC path or shared infrastructure.
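The ioctl-stream diff described in this approach can be prototyped before any backend change. A minimal sketch, assuming both runs were captured with `strace` and that the installed strace decodes request names such as `VIDIOC_S_EXT_CTRLS` or `MEDIA_REQUEST_IOC_QUEUE` (undecoded requests would show up as `_IOC`). The function names are illustrative:

```python
from __future__ import annotations

import re
from itertools import zip_longest

# Capture only the request name; fd numbers differ between processes and are dropped.
IOCTL_RE = re.compile(r"ioctl\(\d+, ([A-Z_][A-Z0-9_]*)")


def ioctl_names(strace_text: str) -> list[str]:
    """Extract the ordered list of ioctl request names from strace output."""
    return [m.group(1) for line in strace_text.splitlines()
            if (m := IOCTL_RE.search(line))]


def first_divergence(a: list[str], b: list[str]):
    """Return (index, name_in_a, name_in_b) at the first difference, or None if equal."""
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return i, x, y
    return None
```

This only compares call ordering. Differences in control payloads or buffer flags would still need a field-level comparison of the decoded arguments around the reported index.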
Pros: HEVC has the strongest "we claimed PASS but the proof was partial" stigma. Closing it directly upgrades iter2's transitive PASS to direct PASS, matching VP9's iter5b-β upgrade.
Cons: HEVC backend code is substantial (h265.c was rewritten at iter2). Diff debugging via strace + source-read may take multiple sessions. Risk: the difference may be a long-standing assumption requiring substantial refactor.
Candidate G — Bug 6: VP8 libva partial output
"Identify and fix the cause of VP8 libva producing 74.8%-zero output (rather than the byte-identical kdirect output). After fix: libva VP8 == kdirect VP8 == SW."
Approach sketch: characterize the zero regions in libva_vp8 (which frames? which rows/columns? which planes?); compare with kdirect_vp8 at byte level; trace cap_pool slot binding and per-frame DPB submission. Likely a slot-rotation or partial-fill bug.
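The byte-level comparison can report where the two outputs first disagree in picture terms rather than as a raw offset. A minimal sketch, assuming both dumps are raw 8-bit yuv420p with even dimensions; frame numbers are 1-based to match the "frame 1" wording used in this document, and the helper names are hypothetical:

```python
from __future__ import annotations


def first_mismatch(a: bytes, b: bytes) -> int | None:
    """Offset of the first differing byte, or None if the streams are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Same common prefix: a length difference counts as a mismatch at the shorter end.
    return None if len(a) == len(b) else min(len(a), len(b))


def locate(offset: int, width: int, height: int) -> tuple[int, str, int, int]:
    """Map a byte offset to (frame, plane, row, col) in a yuv420p stream."""
    y_size = width * height
    c_w, c_h = width // 2, height // 2
    c_size = c_w * c_h
    frame, rem = divmod(offset, y_size + 2 * c_size)
    if rem < y_size:
        row, col = divmod(rem, width)
        return frame + 1, "Y", row, col
    rem -= y_size
    plane = "U" if rem < c_size else "V"
    row, col = divmod(rem % c_size, c_w)
    return frame + 1, plane, row, col
```

If the first mismatch consistently falls at a frame boundary, that would point toward slot rotation; a mismatch partway down a plane would point toward partial fill.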
Pros: VP8 is simpler than HEVC. The decode is already running (DQBUF success). Probably a small fix once root-caused.
Cons: Diagnostic surface is fuzzier than HEVC's (kernel succeeds, output diverges).
Candidate H — Bug 4: H.264 inter-frame race-loss
"Fix H.264 inter-frame decode through libva so all frames (not just keyframes) produce correct pixels."
Approach sketch: similar to Bug 6. H.264 has the strongest keyframe-vs-inter discrimination — decode happens for keyframes (consistent real content in frame 1) but inter frames produce zero. Likely DPB-related: reference frame indices, request_fd lifecycle, or per-frame ordering.
Pros: H.264 is the primary codec for most consumers. Fixing inter unlocks real video playback through libva. iter2-iter5b touched H.264 in passing; deeper investigation has been deferred since iter4.
Cons: H.264 DPB management is intricate (B-slice L1 ref lists, fresh request_fd per frame, etc. — iter4 already touched these). The remaining bug may be subtle.
Candidate I — Re-anchor iter6+ regression hashes on β substrate
"Lock the now-stable per-codec hashes (VP9
`4f1565e8…`, MPEG-2 `19eefbf4…`, H.264 keyframe-partial `71ac099b…`, VP8 partial `bcc57ed5…`, HEVC all-zero `06b2c5a0…`) as iter6+ regression invariants. Verify each iter6+ patch against these anchors."
Approach sketch: codify the Phase 7 v2 sweep as a regression test. Add to tests/ in the fork. Each iter6+ PR runs the sweep and compares.
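One possible shape for that regression check is sketched below. The hash algorithm is not stated in this document, so SHA-256 is an assumption and is left configurable. Only the prefixes quoted above are compared, because the full digests are elided here. The codec keys and function names are illustrative:

```python
from __future__ import annotations

import hashlib

# Prefixes exactly as quoted in this document; full digests are not reproduced here.
ANCHORS = {
    "vp9": "4f1565e8",
    "mpeg2": "19eefbf4",
    "h264": "71ac099b",
    "vp8": "bcc57ed5",
    "hevc": "06b2c5a0",
}


def digest(data: bytes, algo: str = "sha256") -> str:
    """Hex digest of a decoded-output dump (algorithm is an assumption)."""
    return hashlib.new(algo, data).hexdigest()


def check(outputs: dict[str, bytes], anchors: dict[str, str] = ANCHORS,
          algo: str = "sha256") -> dict[str, bool]:
    """Return, per codec present in `outputs`, whether its digest matches the anchor prefix."""
    return {codec: digest(outputs[codec], algo).startswith(prefix)
            for codec, prefix in anchors.items() if codec in outputs}
```

A sweep would read each codec's output dump into `outputs` and fail the run if any value in the returned mapping is `False`. Anchors for the three mixed codecs pin known-bad output, so they are expected to change when the corresponding bug is fixed.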
Pros: cheap; establishes a reproducible regression baseline.
Cons: doesn't fix any bug. Maintenance work, not delivery work.
Candidate J — iter4-B1 auto-detect device discrimination
"Make backend auto-detect select the right V4L2 decode device on every boot regardless of
`/dev/media*` enumeration order. No more env-override-per-session."
Approach sketch: walk media topology, require MEDIA_ENT_F_PROC_VIDEO_DECODER on the entities, prefer decoder-by-codec mapping. ~100 LOC in request.c.
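The selection rule can be modelled independently of the ioctl plumbing. A minimal sketch, assuming the topology walk has already been flattened into one record per processing entity with its video node resolved. The `Entity` type and `pick_decoder` are hypothetical; the function constants are the values I believe `linux/media.h` defines and should be checked against the installed header:

```python
from __future__ import annotations

from dataclasses import dataclass

# Entity function codes (assumed from linux/media.h; verify before relying on them).
MEDIA_ENT_F_PROC_VIDEO_ENCODER = 0x4007
MEDIA_ENT_F_PROC_VIDEO_DECODER = 0x4008


@dataclass(frozen=True)
class Entity:
    media_node: str   # e.g. "/dev/media0"
    video_node: str   # e.g. "/dev/video1"
    driver: str       # e.g. "rkvdec"
    function: int     # MEDIA_ENT_F_* code of the processing entity


def pick_decoder(entities: list[Entity], preferred_driver: str | None = None) -> Entity | None:
    """Pick a decoder entity independent of enumeration order."""
    decoders = [e for e in entities if e.function == MEDIA_ENT_F_PROC_VIDEO_DECODER]
    if preferred_driver is not None:
        for e in decoders:
            if e.driver == preferred_driver:
                return e
    # Fall back to a stable choice: sort by driver name, not by node number.
    return min(decoders, key=lambda e: e.driver, default=None)
```

Because the choice keys on entity function and driver name, the same decoder is returned whether rkvdec enumerates as `/dev/media0` or `/dev/media1`. The codec-to-driver preference passed in as `preferred_driver` would come from a mapping the backend maintains.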
Pros: removes per-session friction. Mechanical fix.
Cons: doesn't fix any decode bug. Quality-of-life.
Out-of-scope items (carried unchanged)
- Performance metrics (Candidate D from iter5 Phase 0) — still blocked by pixel-correctness gaps in HEVC, VP8, H.264-inter. Defer to a post-correctness iteration.
- Front-end libva.
- Other hardware (ohm, ampere/boltzmann).
- AV1.
- `cros-codecs` Rust replacement.
- Bootlin / Collabora upstreaming.
Recommendation
If pressed: Candidate H (Bug 4 H.264 inter) for impact (H.264 is the most consumer-relevant codec), Candidate F (Bug 5 HEVC) for diagnostic-clarity practice (the kernel-direct comparison strace is a clean delta-finding exercise that re-validates the β architecture from iter5b-β), or Candidate G (Bug 6 VP8) for fast iteration (simpler codec, smaller suspect surface).
If multiple iterations are planned, the natural sequence is G → H → F: fix the simplest first (VP8), build technique, apply to harder cases.
If iter6 should specifically MATCH the difficulty of iter5b-β (medium): G or H.
If iter6 should specifically EXPAND on iter5b-β's architectural cleanup work: J (auto-detect harden) is the architectural fit; small backend change, removes a long-standing fragility.
Phase 1 lock pending
This document does NOT lock Phase 1. The user picks the iter6 research question from candidates F-J (or proposes K). After that pick, this doc becomes "iter6 Phase 0 final" and feeds Phase 1 lock.