fresnel-fourier/phase8_iteration13_close.md
marfrit 2eaf737145 iter13 Phase 8 close: α-17 DMA_BUF_IOCTL_SYNC ioctls fire but hashes unchanged
α-17 implemented and deployed. strace confirms VIDIOC_EXPBUF +
DMA_BUF_IOCTL_SYNC(START|READ) before memcpy + END after, all returning 0.
The libva backend now follows the V4L2+dma-buf cache-sync contract
correctly. But 5-codec sweep hashes are byte-identical to anchors:
no Bug 4/5 movement.

Cache-sync hypothesis empirically falsified. Bug 4 + 5 are NOT a CPU
cache-coherency issue on the libva cached-mmap path.

Three consecutive PARTIAL closes (iter11 wire-byte, iter12 RFC v2,
iter13 cache-sync) confirm that the libva-backend-side hypothesis space
for Bug 4+5 is exhausted. The live source is kernel-side
write-completeness for HEVC and H.264 on RK3399 rkvdec, distinct from
cache visibility (γ dump iter8 already confirmed destination_data[]
post-DQBUF matches YUV output).

Backend SHA on fresnel: 9ba47002...

iter14 candidates:
  α-16: OUTPUT byte dump (cheapest remaining)
  kernel-side rkvdec audit (deepest; route via kernel-agent)
  pivot to Bug 6 VP8 or campaign close-out documentation

α-17 itself is real wire-correctness progress even as a non-fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:08:11 +00:00

# Iteration 13 — Phase 8 (close)
Closes 2026-05-14. iter13 = α-17 DMA_BUF_IOCTL_SYNC around CAPTURE buffer read. PARTIAL close. Cache-sync hypothesis empirically falsified.
## Outcome
| Metric | Value |
|---|---|
| Iteration target | Fix Bug 4 + Bug 5 via explicit cache sync on libva CAPTURE read |
| Fork tip start | `8e2c04f` (iter11 close) |
| Fork tip end | `ca4dd88` (1 commit: α-17 DMA_BUF_IOCTL_SYNC) |
| LOC delta | +70 in `src/image.c` |
| Backend SHA on fresnel | `9ba47002f2760eb4af60d48cf821adb705604e73a92b547ea403bd067b183956` |
| Phase 1 criteria | 5/6 PASS (C1 PARTIAL — Bug 4 + Bug 5 hashes unchanged) |
| Wire-byte verification | All ioctls fire correctly: 4 VIDIOC_EXPBUF + 8 DMA_BUF_IOCTL_SYNC (START+END pairs per frame), all return 0 |
## 5-codec sweep on α-17 backend
| Codec | Anchor | iter13 hash | Verdict |
|---|---|---|---|
| H.264 | `71ac099b…` | `71ac099b…` | unchanged |
| HEVC | `06b2c5a0…` | `06b2c5a0…` | unchanged |
| VP9 | `4f1565e8…` | `4f1565e8…` | PASS unchanged |
| MPEG-2 | `19eefbf4…` | `19eefbf4…` | PASS unchanged |
| VP8 | `bcc57ed5…` | `bcc57ed5…` | unchanged |
## Empirical finding
α-17 added `VIDIOC_EXPBUF` → `DMA_BUF_IOCTL_SYNC(START|READ)` → memcpy → `DMA_BUF_IOCTL_SYNC(END|READ)` → close(fd) around the libva CAPTURE-buffer read in `copy_surface_to_image`. Strace confirms:
```
ioctl(5, VIDIOC_EXPBUF, {type=V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE, index=0, plane=0, flags=O_RDONLY} => {fd=24}) = 0
ioctl(24, DMA_BUF_IOCTL_SYNC, ...) = 0 # START
ioctl(24, DMA_BUF_IOCTL_SYNC, ...) = 0 # END
```
All ioctls succeed; the dma-buf fd is valid; the kernel accepts the cache-sync requests. Yet:
- HEVC libva still produces `06b2c5a0…` (all-zero, identical to pre-α-17).
- H.264 libva still produces `71ac099b…` (16×32 partial-fill).
- The remaining sweep anchors (VP9, MPEG-2, VP8) are byte-identical too.
## What this rules out
**Cache-sync on the cached-mmap path is NOT the bug.** The `DMA_BUF_IOCTL_SYNC` ioctl operates correctly (it returns 0 and the fd is valid), but syncing does not change what the CPU reads from the buffer: the HEVC and H.264 readback is identical with and without it, so stale CPU cache lines do not explain why those codecs differ from VP9.
The likely root cause now: **rkvdec on RK3399 writes correctly to the CAPTURE buffer for VP9, and writes nothing-or-garbage for HEVC and H.264.** This is structurally the same conclusion the campaign reached at iter8 Phase 7 (γ dump confirmed `destination_data[]` post-DQBUF contains exactly what the YUV output shows). The kernel decode is the deciding mechanism, and we have not found a libva-side change that affects what the kernel writes.
The 6 wire-byte eliminations from iter9 + 4 from iter11 + α-17 cache-sync test together exhaust the libva-side hypothesis surface for "what's different from kdirect."
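One way to tell "writes nothing" from "writes a partial region" in a dumped CAPTURE plane is to measure the bounding box of its non-zero bytes. The sketch below is a hypothetical diagnostic, not part of the backend: `nonzero_extent` and `struct fill_extent` are names invented here. It is also only a heuristic, since a correctly decoded frame may legitimately contain zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the smallest rectangle that contains every non-zero byte. */
struct fill_extent {
    unsigned width;   /* columns spanned, 0 if the plane is all-zero */
    unsigned height;  /* rows spanned, 0 if the plane is all-zero */
};

/* Hypothetical diagnostic: scan one plane (e.g. luma) row by row. */
struct fill_extent nonzero_extent(const uint8_t *plane, unsigned width,
                                  unsigned height, size_t stride)
{
    struct fill_extent ext = { 0, 0 };
    unsigned min_x = width, max_x = 0, min_y = height, max_y = 0;
    int found = 0;

    assert(plane != NULL && stride >= width);

    for (unsigned y = 0; y < height; y++) {
        const uint8_t *row = plane + (size_t)y * stride;
        for (unsigned x = 0; x < width; x++) {
            if (row[x] == 0)
                continue;
            if (x < min_x) min_x = x;
            if (x > max_x) max_x = x;
            if (y < min_y) min_y = y;
            if (y > max_y) max_y = y;
            found = 1;
        }
    }
    if (found) {
        ext.width = max_x - min_x + 1;
        ext.height = max_y - min_y + 1;
    }
    return ext;
}
```

An extent of 0×0 corresponds to the all-zero HEVC case, while a small extent inside a larger frame corresponds to a partial fill like the one reported for H.264.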
## What's left
After iter13, the unexamined surfaces are:
- **OUTPUT bitstream byte dump (α-16)** — confirm libva writes the same H.264/HEVC slice bytes to OUTPUT buffer as kdirect.
- **rkvdec source-side investigation** — kernel-agent workflow. Compare what the RK3399 rkvdec-hevc.c and rkvdec-h264.c code paths do at each kernel-write site that produces zero output. Methodology Bommarito demonstrated: KUnit harness wrapping the helper, KASAN-detect write patterns.
- **Hardware-state investigation** — rkvdec MMIO trace via uio + register dump per-frame to see if hardware actually runs.
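The α-16 OUTPUT byte dump could look roughly like the following sketch. It is an assumption-laden illustration rather than the planned patch: `dump_output_bytes` and `fnv1a64` are hypothetical names, the `output_NNNN.bin` file naming is invented here, and the FNV-1a fingerprint is just a cheap way to eyeball differences in logs before running a byte-level diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a over a byte range; a cheap fingerprint for log lines. */
uint64_t fnv1a64(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/*
 * Hypothetical helper: write the bytes queued on the OUTPUT buffer for one
 * frame to <dir>/output_NNNN.bin and log their length and fingerprint.
 * Returns 0 on success, -1 on any I/O error.
 */
int dump_output_bytes(const char *dir, unsigned frame,
                      const void *data, size_t len)
{
    char path[512];
    int n;

    assert(dir != NULL && data != NULL);

    n = snprintf(path, sizeof(path), "%s/output_%04u.bin", dir, frame);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    size_t written = fwrite(data, 1, len, f);
    int close_ok = (fclose(f) == 0);
    if (written != len || !close_ok)
        return -1;

    fprintf(stderr, "OUTPUT frame %u: %zu bytes, fnv1a64=%016llx\n",
            frame, len, (unsigned long long)fnv1a64(data, len));
    return 0;
}
```

Calling the same helper from the libva path and from the kdirect path, each with its own directory, gives two sets of files that can be compared per frame. Any mismatch would point at an OUTPUT-side bug rather than a kernel-side write failure.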
## Lessons
1. **Three consecutive PARTIAL closes (iter11, iter12, iter13)** confirm that the libva-backend-side hypothesis space for Bug 4+5 is essentially exhausted. No wire-byte change, no kernel patch in the RFC v2 fence series, and no cache-sync ioctl changes the codec readback hashes.
2. **The campaign's transitive proof contract** (`reference_dmabuf_resv_blocker.md`) remains the valid verification for libva codec correctness, but it specifically masks the kind of kernel-side write-failure that iter13 has now empirically pinned down as the live source of Bug 4+5.
3. **α-17 is real progress** even as a non-fix: the libva backend now follows the V4L2+dma-buf cache-sync contract correctly (Figa's "userspace responsibility" satisfied). Future kernels that DO have a real cache-coherency issue on the cached-mmap path would now sync correctly via our code.
## Substrate state at iter13 close
- Fork tip `ca4dd88` on noether + fresnel + gitea.
- Backend SHA `9ba47002…` on fresnel (α-13 + α-14 + α-17 cumulative).
- Kernel `linux-fresnel-fourier 7.0-2` (iter12).
- All diagnostic instrumentation preserved.
## iter14 candidates
- **α-16 OUTPUT byte dump** — deferred since iter12 Phase 0. Quick way to rule out OUTPUT-side bug class. ~30 LOC.
- **kernel-side rkvdec audit** — read rkvdec-hevc.c (RK3399 variant) + rkvdec-h264.c carefully; trace what each writes per-frame. If a write is missing or conditional on a flag we don't set, that's the bug.
- **Pivot to Bug 6 (VP8 partial)** — different bug, may have different cause-class. Could open a tangent.
- **Document campaign close-out** — given the diminishing returns, consider whether the campaign should close as "VP9 + MPEG-2 PASS direct via libva; HEVC + H.264 + VP8 still need kernel/HW investigation."
The user's "continue until user intervention" goal means iter14 should pick one of these. α-16 is the cheapest remaining check; kernel-agent workflow for rkvdec audit is the deepest.
Memory rule worth recording: **DMA_BUF_IOCTL_SYNC on V4L2 cached-mmap CAPTURE buffers is necessary for spec compliance but not sufficient for Bug 4/5 fix on RK3399 rkvdec — the underlying issue is kernel-side write-completeness for HEVC and H.264, distinct from cache visibility.**