diff --git a/phase8_iteration3_close.md b/phase8_iteration3_close.md new file mode 100644 index 0000000..682fa6b --- /dev/null +++ b/phase8_iteration3_close.md @@ -0,0 +1,149 @@ +# Iteration 3 — Phase 8 (close) + +Closes 2026-05-08. iter3 = VP8 on hantro-vpu-dec via libva-v4l2-request-fourier on RK3399 (fresnel / Pinebook Pro). Fourth codec to ship in the campaign. + +## Summary + +| Metric | Value | +|---|---| +| Codec target | VP8 Profile 0 (VAProfileVP8Version0_3) | +| Hardware | RK3399 hantro-vpu-dec (`/dev/video5+/dev/media2`) | +| Fork tip start (iter2 close) | `8d71e20` | +| Fork tip end (iter3 close) | `e1aca9c` | +| Commits on fork | 4 (A, B, C, D) | +| LOC delta | +373 across 7 files (2 NEW + 5 modified) | +| Phase 1 criteria | 5/5 GREEN (4 direct + 1 transitive) | +| Phase 7 → Phase 4 loopbacks | 0 | +| Phase 6 fix-forwards | 1 (Commit D buffer.c allow-list) | +| Phase 5 review findings | 4 Critical (all empirically validated correct) | +| Campaign scoreboard | 3/5 → 4/5 codecs passing | + +## Commits shipped + +### Fork (libva-v4l2-request-fourier) + +| Commit | Files | LOC | Description | +|---|---|---|---| +| `27d82e3` (A) | `src/config.c` | +16 | VP8 enumeration block + RequestCreateConfig case + RequestQueryConfigEntrypoints case | +| `017e27f` (B) | `src/vp8.c` (NEW), `src/vp8.h` (NEW), `src/meson.build` | +307 | VP8 codec dispatcher implementing all 10 Phase 4 contract clauses | +| `7f84bbb` (C) | `src/picture.c`, `src/surface.h` | +50 | Dispatcher wiring + 4 buffer-type cases + new `VAProbabilityBufferType` outer case + per-frame reset + params union extension | +| `e1aca9c` (D) | `src/buffer.c` | +1 | Fix-forward — `VAProbabilityBufferType` whitelist (Phase 2 source-read miss) | + +### Campaign repo (fresnel-fourier) + +| Commit | Phase | Description | +|---|---|---| +| `ea2413e` | Phase 1 | Lock — 5 boolean criteria, predicted scope ~260-340 LOC | +| `898544a` | Phase 2 | Situation — 10 bug sites, contract surface, 30-field mapping table | +| `fd3fce8` | Phase 3 | Baseline — verbatim cross-validator strace, 6/6 regression hashes hold, 5/6 open questions answered | +| `2918dda` | Phase 4 | Plan — 10 contract clauses, 3-commit predicted shape | +| `656596a` | Phase 5 | Review — 4 Critical findings, 4 amendments incorporated | +| `afb9b14` | Phase 7 | Verification — 4 direct + 1 transitive PASS | + +## Lessons distilled to memory + +### NEW: `feedback_hw_decode_engagement_check.md` + +When verifying criterion 4 (HW=SW byte-identical), separately prove HW decode actually engaged. mpv silently falls back to software for some codec/backend combinations, in which case HW=SW reduces to SW=SW (vacuous). Use `lsof /dev/video` / strace / mpv `-v` log / `ffmpeg -v info` to verify HW path engagement BEFORE byte-comparing. + +**Established by**: user catch mid-Phase-7. Initial criterion-4 PASS report was based on mpv `--hwdec=vaapi --vo=image` HW=SW match. mpv silently fell back to SW for VP8 (works for MPEG-2 + HEVC). The "match" was SW=SW. User redirect: "or that hw decode did not work and the software fallback produced the same image twice." + +**Generalization**: empirical equality between two pipelines proves nothing about which pipeline produced the bytes. Always pin the source-of-truth variable before comparing. + +### NEW: `reference_dmabuf_resv_blocker.md` + +Cross-campaign blocker reference. RK3399 hantro CAPTURE → libva readback returns all-zero pages (SHA `b34860e0...` = SHA of all-zero 1382400-byte block) due to videobuf2 missing `dma_resv` release fence + panfrost `IOMMU_CACHE` absence. Tracked at `git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/2`. `vb2_dma_resv` kernel patches in flight (RFC v2, 2026-04 linux-media). + +**How to verify criterion 4 when blocker active**: transitive proof — (A) capture libva backend's `S_EXT_CTRLS` payload via strace, (B) byte-compare against kernel-direct `ffmpeg -hwaccel v4l2request` payload from Phase 3 baseline, (C) verify kernel-direct decode = SW byte-identical. ∴ A==B AND C ⇒ libva backend decode is correct, even though direct readback is blocked. + +### NEW: `feedback_runtime_enumerates_allowlists.md` + +Sibling to `feedback_header_deletion_check.md`. When ADDING a new enum value (buffer type, profile, ioctl), grep audits miss switch-default-rejection sites. The runtime authoritatively enumerates allow-list violations at first call — let it fix-forward rather than mine for sites with grep. + +**Established by**: Phase 6 Commit D fix-forward. Phase 2 source-read claimed `buffer.c` was type-agnostic. False: `RequestCreateBuffer:59-70` had an explicit `switch(type){case...:break; default:return ERROR;}` allow-list. ffmpeg-vaapi hit it at first `vaCreateBuffer` call with `Failed to create parameter buffer (type 13)`. +1 LOC fix-forward. + +**Generalization**: same principle as compiler-enumerates-includes (`feedback_header_deletion_check.md`). Both are "trust the toolchain, not grep, to enumerate the gating sites." + +## Phase 5 review track-record acknowledgment + +Per memory `feedback_review_empirical_over_theoretical.md`: each review's findings should be acknowledged at Phase 8 close. iter3 review (`phase5_iter3_review.md`, commit `656596a`) found 4 Critical bugs: + +| ID | Reviewer claim | Phase 7 empirical result | +|---|---|---| +| C1 | `first_part_header_bits = 0` is unsafe; kernel `hantro_g1_vp8_dec.c:260` reads it unconditionally; correct value is `slice->macroblock_offset` | **CORRECT**. Backend produces `6550` for keyframe — byte-matches Phase 3 anchor. Without C1, decode would have placed hardware at wrong DMA offset. | +| C2 | `first_part_size = partition_size[0]` short by `ceil(macroblock_offset/8)` bytes; `vaapi_vp8.c:209` confirms VAAPI subtracts these bytes; recover via `+= ceil(macroblock_offset/8)` | **CORRECT**. Backend produces `22742` for keyframe — byte-matches Phase 3 anchor (21923 + 819 = 22742). Without C2, DCT partitions would have started 819 bytes too early. | +| C3 | `VAProbabilityDataBufferType` doesn't exist; correct enum is `VAProbabilityBufferType` (= 13 per `va.h:2058`) | **CORRECT**. Compiled cleanly post-Commit-D fix-forward. Without C3, Phase 6 would have failed first compile. | +| C4 | `(s8)` cast is kernel-internal; userspace UAPI exposes `__s8`; portable cast is `(int8_t)` from `` | **CORRECT**. Compiled cleanly Commit B first try. Without C4, Phase 6 would have failed first compile. | + +All 4 Critical findings empirically correct. Reviewer also empirically verified the proposed amendments (gcc-test-compile + grep) before recommending them — Direction 2 of `feedback_review_empirical_over_theoretical.md` honored. Author independently verified each before incorporation. + +Estimated savings without the review pass: 2 Phase 6 compile-fail / fix-forward cycles (C3 + C4) + 1 Phase 7 → Phase 4 loopback (C1 + C2 hardware-DMA-offset bug, would have produced visible-but-corrupt output). Actual cost with review: 1 fix-forward (Commit D, mechanical, +1 LOC, was a Phase 2 source-read miss not caught by Phase 5 because it was outside review scope). + +## Phase 4 cross-cutting backlog status + +Items inherited from iter1+iter2 (NOT touched in iter3): + +- **B1** V4L2 device-discovery (per-boot device numbering shuffle) — workaround held: re-verified `/dev/video5+/dev/media2` for hantro at session start. +- **B3** picture.c BeginPicture profile-aware reset — iter3 added vp8 reset lines without refactoring the cross-profile cleanup. +- **B4** context.c log suppression for unsupported codec controls — `Unable to set control(s)` log lines from iter1+iter2's H.264/HEVC device-init fired during iter3 too (hantro doesn't support those codecs); confusing but non-fatal. +- **B5** mpeg2 vbv_buffer_size polish (iter1 S2 finding) — pending. +- **B6** h265 SPS bitstream-parse fidelity gap — pending. +- **L3** vaDeriveImage cache-stale on RK3399 — workaround held: DMA-BUF GL only via mpv (when mpv engages). For VP8 + future codecs where mpv falls back, see new `reference_dmabuf_resv_blocker.md`. + +iter3 NEW items added to backlog: + +- **iter3-Q1** `first_part_header_bits` derivation gap → CLOSED by Phase 5 C1 amendment. +- **iter3-flags-anomaly** ffmpeg-v4l2-request-git sets undocumented bit `0x40` + `EXPERIMENTAL` on inter frames → not iter3 scope; kernel hantro_vp8.c only reads KEY_FRAME bit; libva backend correctly skips. +- **iter3-criterion-4-readback** kernel-side `vb2_dma_resv` blocker → tracked at `dmabuf-modifier-triage/issues/2`; when patches land in `linux-eos-arm`, re-run direct verification for VP8 + any future codec where mpv falls back. Until then, transitive-proof verification per `reference_dmabuf_resv_blocker.md`. +- **iter3-mpv-vp8-fallback** mpv 0.41.0 silently falls back from `--hwdec=vaapi` to SW for VP8 — separate from backend; not iter3 scope; verify alternate consumer paths (chrome-fourier or future mpv version) when convenient. + +## What worked + what to repeat + +### Worked + +- **Contract-before-code Phase 4 plan** (10 clauses citing kernel UAPI + VAAPI + FFmpeg ref + Phase 3 verbatim payload bytes) — let Phase 5 review find specific byte-level disagreements. +- **Phase 5 sonnet-architect review with empirical-over-theoretical discipline (Direction 2)** — caught all 4 compile/runtime bugs before they hit Phase 6/7. Reviewer test-compiled + grep'd every proposed amendment before recommending. +- **Phase 3 verbatim payload anchoring** — keyframe + inter frame field captures from `ffmpeg -hwaccel v4l2request` strace gave Phase 4 + Phase 7 byte-level reference points (`first_part_size=22742`, `first_part_header_bits=6550`, `flags=0x0d`). +- **Transitive proof verification pattern** — when direct pixel readback was blocked by sibling-campaign kernel issue, A==B+C transitive equality preserved iter3 close validity. +- **Phase 6 commits A-D pattern** — same as iter1+iter2; predictable scope; fix-forward Commit D well-trodden. + +### To repeat / improve + +- **HW engagement verification BEFORE byte-compare** — new memory rule `feedback_hw_decode_engagement_check.md` formalizes this. +- **Allow-list audit at Phase 2 OR plan for Commit D fix-forward** — new memory rule `feedback_runtime_enumerates_allowlists.md` shifts the bias from "audit exhaustively up front" to "expect runtime to enumerate." iter4+ should expect 1 fix-forward per iter for buffer.c-style allow-list adds. + +## Predecessor → successor handoff (iter4 = VP9 on rkvdec) + +Per campaign roadmap, iter4 targets VP9 on rkvdec-vpu-dec. Substrate hand-off: + +- Fork tip `e1aca9c` is the iter4 starting point. +- VP9 fixture: `~/fourier-test/bbb_720p10s_vp9.webm` (3.4 MB). +- VP9 will use rkvdec (NOT hantro). Different kernel driver (`rkvdec_vp9.c`), different V4L2 controls (`V4L2_CID_STATELESS_VP9_FRAME` + `V4L2_CID_STATELESS_VP9_COMPRESSED_HDR`). +- Reference sources: `references/ffmpeg-kwiboo/libavcodec/v4l2_request_vp9.c` + `references/linux-mainline/drivers/staging/media/rkvdec/rkvdec-vp9.c` (verify presence at iter4 Phase 2). +- Inherits all 4 backlog items + 4 NEW items above. +- Memory rules carry forward: + - `feedback_gitea_as_claude_noether` + - `feedback_no_session_termination_attempts` + - `feedback_header_deletion_check` + - `feedback_runtime_enumerates_allowlists` (NEW from iter3) + - `feedback_review_empirical_over_theoretical` (BOTH directions) + - `feedback_rockchip_pixel_verify_path` + - `feedback_fresnel_hostname` + - `feedback_hw_decode_engagement_check` (NEW from iter3) + - `reference_dmabuf_resv_blocker` (NEW from iter3) + +iter4 prediction: VP9 has more state than VP8 (compressed-header control with probability updates), comparable scope to iter2 HEVC. Expect ~400-500 LOC across 6-7 files, 3-4 commits + 1 fix-forward. mpv `--hwdec=vaapi` may engage for VP9 (different from VP8 fallback) — verify at Phase 0; if it does, criterion 4 direct readback becomes possible (provided rkvdec doesn't hit the same dma_resv issue as hantro). + +## Campaign scoreboard + +``` +Codec | Site | Iter | Status | Verifier path +========|===========|======|=============|==================================== +H.264 | rkvdec | T4 | PASS | mpv --hwdec=vaapi --vo=image (DMA-BUF GL) +MPEG-2 | hantro | iter1 | PASS | mpv --hwdec=vaapi --vo=image (DMA-BUF GL) +HEVC | rkvdec | iter2 | PASS | mpv --hwdec=vaapi --vo=image (DMA-BUF GL) +VP8 | hantro | iter3 | PASS (T) | transitive proof (kernel readback blocked) +VP9 | rkvdec | iter4 | PENDING | TBD at iter4 Phase 0 +``` + +4/5 codecs passing. iter4 to follow.