Files
fresnel-fourier/phase3_iter5_baseline.md
T
marfrit 3c05564e99 iter5 Phase 3: baseline — 4/5 libva codecs race-lose, MPEG-2 wins, kdirect clean
5-codec sweep matrix on linux-fresnel-fourier 7.0-1 confirms:
- libva path returns all-zero cap_pool init pattern for H.264 (mostly)
  HEVC, VP9, VP8 (always). MPEG-2 wins the race (fastest hantro decode).
- kernel-direct ffmpeg-v4l2request hwdownload byte-matches SW for all
  4 race-losing codecs.
- B4 cosmetic init-probe EINVAL noise reproduced on hantro (2 ioctl per
  codec); MPEG-2 + VP8 stateless control submissions follow at = 0.

iter4 P7's "RGB(0,0x4c,0)" pattern corrected to all-zero raw bytes
(the 0x4c was YUV→RGB conversion of all-zero NV12). Same SHA shape
as iter3's hantro b34860e0 blocker fingerprint.

Control-payload strace anchors persisted as phase-7 invariants.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 05:14:57 +00:00

8.0 KiB
Raw Blame History

Iteration 5 — Phase 3 (baseline)

Captured 2026-05-11 06:23 07:07 CEST on fresnel linux-fresnel-fourier 7.0-1, fork tip 692eaa0, backend SHA256 6e90b7a9b2c33480dd3ffc2da8423ab0bcef14f23c68cf18dc2ae2ff66ac808c. Establishes the pre-fix anchor for iter5 Phase 7 four-criterion verification.

Substrate state at Phase 3 open

  • Kernel: 7.0.0-fresnel-fourier #1 SMP PREEMPT Sat May 9 19:24:42 CEST 2026 aarch64.
  • Device topology this boot: /dev/video2 + /dev/media0 = hantro-vpu (decoder, MPEG-2 + VP8); /dev/video3 + /dev/media1 = rkvdec (H.264 + HEVC + VP9); /dev/media2 = uvcvideo.
  • Fork tip: 692eaa0 (iter4 Phase 7 fix-forward). Unchanged from iter5 Phase 0.
  • Test fixtures intact in ~/fourier-test/: bbb_1080p30_h264.mp4, bbb_720p10s_hevc.mp4, bbb_720p10s_vp9.webm, bbb_720p10s_mpeg2.ts, bbb_720p10s_vp8.webm.

Pre-fix YUV sweep — Bug 2 reproduction matrix

Method: ffmpeg-vaapi-hwdownload (via libva backend), ffmpeg-v4l2request-hwdownload (kernel-direct, bypasses cap_pool), and ffmpeg SW. All in format=nv12,format=yuv420p,-f rawvideo, 3 frames each. Hash is SHA256 of the raw YUV bytes. Hantro env override used for MPEG-2 + VP8; rkvdec env override for the others.

Codec Fixture libva hash kdirect hash sw hash libva==kdirect libva==sw kdirect==sw
H.264 1080p30 bbb_1080p30_h264.mp4 71ac099b… 1e7a0bc9… 1e7a0bc9…
HEVC 720p bbb_720p10s_hevc.mp4 06b2c5a0… 9340b832… 9340b832…
VP9 720p bbb_720p10s_vp9.webm 06b2c5a0… 4f1565e8… 4f1565e8…
MPEG-2 720p bbb_720p10s_mpeg2.ts 19eefbf4… 19eefbf4… 7be8cad7…
VP8 720p bbb_720p10s_vp8.webm 06b2c5a0… 136ce5cb… 136ce5cb…

Findings:

  1. Bug 2 reproduces on 4 of 5 codecs (H.264, HEVC, VP9, VP8). The libva-backend cap_pool readback returns wrong bytes — not kernel-decoded pixels. Kernel-direct path is byte-identical to SW reference for the same 4 codecs (cleanly proves the kernel decode itself is correct).

  2. MPEG-2 actually passes through libvalibva_mpeg2 == kdirect_mpeg2. SW differs from both at the same magnitude as in other codecs' HW↔SW comparisons (well-known hantro G1 MPEG-2 idct/dequant precision drift; not a Bug-2 issue). MPEG-2 wins the cap_pool race because hantro MPEG-2 decode completes faster than the userspace cap_pool readback path can issue its read.

  3. Three 720p codecs share the same libva-broken hash 06b2c5a0c01e515d009c0bfbe0e61fafb105a54da5ec621104915cd5949849e8. Verified to be SHA256 of 4 147 200 = 1280 × 720 × 1.5 × 3 zero bytes (python3 -c "import hashlib; print(hashlib.sha256(b'\x00' * 4147200).hexdigest())"). The cap_pool init pattern is all-zero, not RGB(0, 0x4c, 0) as iter4 Phase 7 reported. The earlier 0x4c description was the BT.709 limited-range YUV→RGB conversion of all-zero NV12 rendered through Pillow / ffmpeg PNG encoding.

  4. H.264 1080p has a different broken hash 71ac099b… because the buffer is 1920×1080 (9 331 200 zero bytes ≠ same SHA). Per-byte histogram: 99.99% of bytes are 0x00, with traces of frame-1 keyframe content (16 first bytes of frame 1 = 81818080807f7f7f7f7f7f8080808181 — neutral grey chroma). Frame 2 + 3 are fully zero. Confirms the "H.264 keyframe sometimes wins the race, inter frames always lose" pattern iter4 Phase 7 documented.

  5. Timing-race framing validated. Decode-speed ordering matches race-loss pattern:

    • MPEG-2 hantro: fastest → never loses → libva works.
    • H.264 rkvdec: keyframe-only intermittent → loses on inter → libva mostly broken.
    • HEVC / VP9 / VP8: always loses → libva always broken.

    The race is between kernel vb2_buffer_done(VB2_BUF_STATE_DONE) (which currently signals nothing in dma_resv) and the userspace cap_pool readback path's read() / mmap() of the dmabuf. The RFC v2 patches close the window by attaching a real dma_fence at buf_queue and signalling it at buffer-done; userspace consumers waiting on dma_resv-implicit-sync (or anything that resolves through it) then block correctly until decode finishes.

Control-payload anchors per codec

Method: strace -ff -v -x -s 999999 -e trace=ioctl over ffmpeg-vaapi 3-frame decode for each codec. Per-codec VIDIOC_S_EXT_CTRLS submission summary:

Codec Device Total S_EXT_CTRLS Success EINVAL EINVAL cause
H.264 rkvdec 14 14 0
HEVC rkvdec 15 15 0
VP9 rkvdec 13 13 0
MPEG-2 hantro 8 6 2 B4: H.264 + HEVC init probes EINVAL
VP8 hantro 15 13 2 B4: same

The 2 EINVAL submissions on hantro paths are the B4 cosmetic init-probe noise documented in Phase 0: backend submits 0xa40900 H264_DECODE_MODE / 0xa40901 H264_START_CODE and 0xa40a95 HEVC_DECODE_MODE / 0xa40a96 HEVC_START_CODE at context-init unconditionally. Hantro doesn't expose these CIDs (it's MPEG-2 + VP8 only), so the kernel rejects with EINVAL. The backend proceeds; the actual codec controls (0xa409dc/dd/de MPEG-2 or 0xa409c8 VP8) submit cleanly afterward at = 0.

Full per-PID strace artifacts persisted at iter5_phase3_baseline.tgz/anchors_<codec>/trace.* — bundle pulled to noether at ~/src/fresnel-fourier/iter5_phase3_baseline.tgz (25 MB).

These are the iter5 baseline control-payload anchors. Phase 7 will re-run the same strace on the kernel-agent-produced linux-fresnel-fourier 7.0.kafr2 substrate; each codec's VIDIOC_S_EXT_CTRLS payload should byte-match the iter5 baseline. Predicted: byte-identical match for all 5 codecs, because the kernel patches in iter5 don't touch control-handling code (they only add the vb2_buffer_attach_release_fence opt-in call at *_buf_queue).

Open questions resolved at Phase 3

  • "Bug 3 — hantro UAPI drift"Resolved null at Phase 2 source-read. Empirical Phase 3 confirms: MPEG-2 controls submit at = 0, VP8 controls submit at = 0 (modulo the 2 cosmetic EINVAL B4 init probes).
  • "Does the cap_pool race window manifest as all-zero or non-zero pattern?" — All-zero, confirmed.
  • "Does kernel-direct decode produce SW-equivalent YUV for all 5 codecs?" — Yes for 4 of 5 (H.264, HEVC, VP9, VP8). MPEG-2 HW differs from SW at the codec-precision level, unrelated to Bug 2.

Open questions still pending Phase 4

  • Does the videobuf2-core.c RFC v2 helper patch apply cleanly v6.12 → v7.0? Pending boltzmann reconnection for the git apply --3way verification.
  • Does vb2_buffer_attach_release_fence() need any v7.0-specific adjustments (e.g., dma_resv API rename or signature drift)? Phase 4 verifies during rebase.
  • Will the rkvdec consumer patch need any rkvdec-specific buf_queue refactor (e.g., the buf_queue happens before a delayed-decode worker, in which case the fence should be attached later)? Phase 4 reviews rkvdec.c around line 954 in context.

Artifacts persisted

  • iter5_phase3_baseline.tgz (25 MB): sweep.sh (runner), anchors.sh (anchor capture), *.yuv (15 raw YUV files: 5 libva + 5 kdirect + 5 sw), *.log (15 ffmpeg stderr logs), anchors_<codec>/trace.* (5 codecs × per-PID strace files, ~85 files total).

Phase 4 preview

With the empirical confirmation that:

  • Bug 2 is consistently the timing race on all 4 broken codecs (and the MPEG-2 path that "works" actually wins the same race);
  • Kernel-direct decode is byte-clean for all 5 codecs vs SW (for the 4 that have HW=SW expectation);
  • Backend control-payload submissions are stable and identical across H.264/HEVC/VP9 (all 14/15/13 = 0) and modulo cosmetic noise on hantro;

Phase 4 plan is now fully informed: produce the 4-patch series (3 rebased + 1 new rkvdec consumer), update fleet/fresnel.yaml, build via kernel-agent (or manual fallback) on boltzmann, install on fresnel, re-run the sweep, verify libva_<codec>.yuv == kdirect_<codec>.yuv for all 5 codecs (the post-fix-equivalent of the "all 5 pass" criterion).