Files
fresnel-fourier/phase3_iter5_baseline.md
marfrit 3c05564e99 iter5 Phase 3: baseline — 4/5 libva codecs race-lose, MPEG-2 wins, kdirect clean
5-codec sweep matrix on linux-fresnel-fourier 7.0-1 confirms:
- libva path returns all-zero cap_pool init pattern for H.264 (mostly)
  HEVC, VP9, VP8 (always). MPEG-2 wins the race (fastest hantro decode).
- kernel-direct ffmpeg-v4l2request hwdownload byte-matches SW for all
  4 race-losing codecs.
- B4 cosmetic init-probe EINVAL noise reproduced on hantro (2 ioctl per
  codec); MPEG-2 + VP8 stateless control submissions follow at = 0.

iter4 P7's "RGB(0,0x4c,0)" pattern corrected to all-zero raw bytes
(the 0x4c was YUV→RGB conversion of all-zero NV12). Same SHA shape
as iter3's hantro b34860e0 blocker fingerprint.

Control-payload strace anchors persisted as phase-7 invariants.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 05:14:57 +00:00

83 lines
8.0 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Iteration 5 — Phase 3 (baseline)
Captured 2026-05-11 06:23 07:07 CEST on fresnel `linux-fresnel-fourier 7.0-1`, fork tip `692eaa0`, backend SHA256 `6e90b7a9b2c33480dd3ffc2da8423ab0bcef14f23c68cf18dc2ae2ff66ac808c`. Establishes the pre-fix anchor for iter5 Phase 7 four-criterion verification.
## Substrate state at Phase 3 open
- Kernel: `7.0.0-fresnel-fourier #1 SMP PREEMPT Sat May 9 19:24:42 CEST 2026 aarch64`.
- Device topology this boot: `/dev/video2 + /dev/media0` = hantro-vpu (decoder, MPEG-2 + VP8); `/dev/video3 + /dev/media1` = rkvdec (H.264 + HEVC + VP9); `/dev/media2` = uvcvideo.
- Fork tip: `692eaa0` (iter4 Phase 7 fix-forward). Unchanged from iter5 Phase 0.
- Test fixtures intact in `~/fourier-test/`: `bbb_1080p30_h264.mp4`, `bbb_720p10s_hevc.mp4`, `bbb_720p10s_vp9.webm`, `bbb_720p10s_mpeg2.ts`, `bbb_720p10s_vp8.webm`.
## Pre-fix YUV sweep — Bug 2 reproduction matrix
Method: ffmpeg-vaapi-hwdownload (via libva backend), ffmpeg-v4l2request-hwdownload (kernel-direct, bypasses cap_pool), and ffmpeg SW. All in `format=nv12,format=yuv420p,-f rawvideo`, 3 frames each. Hash is SHA256 of the raw YUV bytes. Hantro env override used for MPEG-2 + VP8; rkvdec env override for the others.
| Codec | Fixture | libva hash | kdirect hash | sw hash | libva==kdirect | libva==sw | kdirect==sw |
|---|---|---|---|---|---|---|---|
| H.264 1080p30 | `bbb_1080p30_h264.mp4` | `71ac099b…` | `1e7a0bc9…` | `1e7a0bc9…` | ✗ | ✗ | **✓** |
| HEVC 720p | `bbb_720p10s_hevc.mp4` | `06b2c5a0…` | `9340b832…` | `9340b832…` | ✗ | ✗ | **✓** |
| VP9 720p | `bbb_720p10s_vp9.webm` | `06b2c5a0…` | `4f1565e8…` | `4f1565e8…` | ✗ | ✗ | **✓** |
| MPEG-2 720p | `bbb_720p10s_mpeg2.ts` | `19eefbf4…` | `19eefbf4…` | `7be8cad7…` | **✓** | ✗ | ✗ |
| VP8 720p | `bbb_720p10s_vp8.webm` | `06b2c5a0…` | `136ce5cb…` | `136ce5cb…` | ✗ | ✗ | **✓** |
Findings:
1. **Bug 2 reproduces on 4 of 5 codecs** (H.264, HEVC, VP9, VP8). The libva-backend cap_pool readback returns wrong bytes — not kernel-decoded pixels. Kernel-direct path is byte-identical to SW reference for the same 4 codecs (cleanly proves the kernel decode itself is correct).
2. **MPEG-2 actually passes through libva**`libva_mpeg2 == kdirect_mpeg2`. SW differs from both at the same magnitude as in other codecs' HW↔SW comparisons (well-known hantro G1 MPEG-2 idct/dequant precision drift; not a Bug-2 issue). MPEG-2 wins the cap_pool race because hantro MPEG-2 decode completes faster than the userspace cap_pool readback path can issue its read.
3. **Three 720p codecs share the same libva-broken hash** `06b2c5a0c01e515d009c0bfbe0e61fafb105a54da5ec621104915cd5949849e8`. Verified to be SHA256 of `4 147 200 = 1280 × 720 × 1.5 × 3` zero bytes (`python3 -c "import hashlib; print(hashlib.sha256(b'\x00' * 4147200).hexdigest())"`). The cap_pool init pattern is **all-zero**, not `RGB(0, 0x4c, 0)` as iter4 Phase 7 reported. The earlier `0x4c` description was the BT.709 limited-range YUV→RGB conversion of all-zero NV12 rendered through Pillow / ffmpeg PNG encoding.
4. **H.264 1080p has a different broken hash** `71ac099b…` because the buffer is 1920×1080 (9 331 200 zero bytes ≠ same SHA). Per-byte histogram: 99.99% of bytes are 0x00, with traces of frame-1 keyframe content (16 first bytes of frame 1 = `81818080807f7f7f7f7f7f8080808181` — neutral grey chroma). Frame 2 + 3 are fully zero. Confirms the "H.264 keyframe sometimes wins the race, inter frames always lose" pattern iter4 Phase 7 documented.
5. **Timing-race framing validated.** Decode-speed ordering matches race-loss pattern:
- MPEG-2 hantro: fastest → never loses → libva works.
- H.264 rkvdec: keyframe-only intermittent → loses on inter → libva mostly broken.
- HEVC / VP9 / VP8: always loses → libva always broken.
The race is between kernel `vb2_buffer_done(VB2_BUF_STATE_DONE)` (which currently signals nothing in dma_resv) and the userspace cap_pool readback path's `read()` / `mmap()` of the dmabuf. The RFC v2 patches close the window by attaching a real dma_fence at buf_queue and signalling it at buffer-done; userspace consumers waiting on dma_resv-implicit-sync (or anything that resolves through it) then block correctly until decode finishes.
## Control-payload anchors per codec
Method: `strace -ff -v -x -s 999999 -e trace=ioctl` over ffmpeg-vaapi 3-frame decode for each codec. Per-codec `VIDIOC_S_EXT_CTRLS` submission summary:
| Codec | Device | Total S_EXT_CTRLS | Success | EINVAL | EINVAL cause |
|---|---|---|---|---|---|
| H.264 | rkvdec | 14 | 14 | 0 | — |
| HEVC | rkvdec | 15 | 15 | 0 | — |
| VP9 | rkvdec | 13 | 13 | 0 | — |
| MPEG-2 | hantro | 8 | 6 | 2 | B4: H.264 + HEVC init probes EINVAL |
| VP8 | hantro | 15 | 13 | 2 | B4: same |
The 2 EINVAL submissions on hantro paths are the B4 cosmetic init-probe noise documented in Phase 0: backend submits `0xa40900 H264_DECODE_MODE / 0xa40901 H264_START_CODE` and `0xa40a95 HEVC_DECODE_MODE / 0xa40a96 HEVC_START_CODE` at context-init unconditionally. Hantro doesn't expose these CIDs (it's MPEG-2 + VP8 only), so the kernel rejects with EINVAL. The backend proceeds; the actual codec controls (`0xa409dc/dd/de` MPEG-2 or `0xa409c8` VP8) submit cleanly afterward at `= 0`.
Full per-PID strace artifacts persisted at `iter5_phase3_baseline.tgz/anchors_<codec>/trace.*` — bundle pulled to noether at `~/src/fresnel-fourier/iter5_phase3_baseline.tgz` (25 MB).
These are the iter5 baseline control-payload anchors. Phase 7 will re-run the same strace on the kernel-agent-produced `linux-fresnel-fourier 7.0.kafr2` substrate; each codec's `VIDIOC_S_EXT_CTRLS` payload should byte-match the iter5 baseline. **Predicted: byte-identical match for all 5 codecs**, because the kernel patches in iter5 don't touch control-handling code (they only add the `vb2_buffer_attach_release_fence` opt-in call at `*_buf_queue`).
## Open questions resolved at Phase 3
- ~~"Bug 3 — hantro UAPI drift"~~ — **Resolved null** at Phase 2 source-read. Empirical Phase 3 confirms: MPEG-2 controls submit at `= 0`, VP8 controls submit at `= 0` (modulo the 2 cosmetic EINVAL B4 init probes).
- "Does the cap_pool race window manifest as all-zero or non-zero pattern?" — **All-zero**, confirmed.
- "Does kernel-direct decode produce SW-equivalent YUV for all 5 codecs?" — **Yes for 4 of 5** (H.264, HEVC, VP9, VP8). MPEG-2 HW differs from SW at the codec-precision level, unrelated to Bug 2.
## Open questions still pending Phase 4
- Does the videobuf2-core.c RFC v2 helper patch apply cleanly v6.12 → v7.0? Pending boltzmann reconnection for the `git apply --3way` verification.
- Does `vb2_buffer_attach_release_fence()` need any v7.0-specific adjustments (e.g., dma_resv API rename or signature drift)? Phase 4 verifies during rebase.
- Will the rkvdec consumer patch need any rkvdec-specific buf_queue refactor (e.g., the buf_queue happens before a delayed-decode worker, in which case the fence should be attached later)? Phase 4 reviews rkvdec.c around line 954 in context.
## Artifacts persisted
- `iter5_phase3_baseline.tgz` (25 MB): `sweep.sh` (runner), `anchors.sh` (anchor capture), `*.yuv` (15 raw YUV files: 5 libva + 5 kdirect + 5 sw), `*.log` (15 ffmpeg stderr logs), `anchors_<codec>/trace.*` (5 codecs × per-PID strace files, ~85 files total).
## Phase 4 preview
With the empirical confirmation that:
- Bug 2 is consistently the timing race on all 4 broken codecs (and the MPEG-2 path that "works" actually wins the same race);
- Kernel-direct decode is byte-clean for all 5 codecs vs SW (for the 4 that have HW=SW expectation);
- Backend control-payload submissions are stable and identical across H.264/HEVC/VP9 (all 14/15/13 `= 0`) and modulo cosmetic noise on hantro;
Phase 4 plan is now fully informed: produce the 4-patch series (3 rebased + 1 new rkvdec consumer), update `fleet/fresnel.yaml`, build via kernel-agent (or manual fallback) on boltzmann, install on fresnel, re-run the sweep, verify `libva_<codec>.yuv == kdirect_<codec>.yuv` for all 5 codecs (the post-fix-equivalent of the "all 5 pass" criterion).