# Phase 3 FINAL close: 10 iterations, structural stall, PIVOT recommended

Date: 2026-05-17 ~08:35.

Phase 3 closes at first-light-impossible after 10 register-tuning iterations + 2 architect-review cycles.

## What Janet's structural review (post phase3_close.md) added

Janet's prescription: zero `reg197_cabactbl_base` + struct-size sanity check, then PIVOT if still stuck. Acted on it; the diagnostic IOMMU faults revealed substantial new info:

| Iter | Change | Result |
|---|---|---|
| 7 | `reg197 = 0` + sizeof asserts | IOMMU fault @ iova=0x700 (HW reads reg197 at offset 0x700!) |
| 8 | `reg197 = 16KB zero scratch` | IOMMU fault @ iova=0x300 (next register HW reads) |
| 9 | iter8 + `reg161/163 = scratch` (pps_base / rps_base) | No faults; HW hangs silently |
| 10 | iter9 + `reg160/162/172 = scratch` (zero-byte prob bases) | Identical hang to iter9 |

**Key new findings:**

- HW reads `reg197_cabactbl_base`, `reg161_pps_base`, `reg163_rps_base`, and the prob bases UNCONDITIONALLY for VP9, despite BSP not explicitly populating most of these. BSP works because its kernel driver treats `fd=0` as "no buffer" and skips translation; mainline writes raw IOVAs so any 0 register IS read by HW.
- **Probability buffer content does NOT matter for the stall.** Zero-byte scratch (iter10) and legacy-format `priv_tbl.probs` (iter1-6) both hang HW identically. The "wrong prob format" hypothesis is falsified.
- Struct size check: `vp9_param = 196 bytes` vs needed `256 bytes` (60-byte tail unwritten for reg113..127), `common_addr = 60 bytes` vs full `128 bytes` (gap of reg143..159). HEVC has the same gaps and works → unwritten tail isn't the cause.

## Final hypothesis space (all 3 require pivot or major effort)

1. **BSP-specific kernel-side init we're missing**: cache config (`RKVDEC_REG_CACHE0/1/2_SIZE_BASE` + clear), AXI/QoS setup (`reg256/257/270`), or SRAM pool routing. Mainline HEVC works without these so they may not be VP9-relevant, but VP9's HW pipeline stages may differ.
2. **vdpu381 mode-2 (VP9) was reserved but not validated**: Casanova's v7.0 series shipped HEVC + H.264 register definitions; `VDPU381_MODE_VP9=2` constant is in the header but no working reference exists. We're the first to drive it via mainline.
3. **Probability buffer requires VP9-specific INITIALIZATION** (per BSP `hal_vp9d_prob_default`), not just any pointer. Iter10 (zero-byte scratch) hung the same way as legacy-format, but maybe HW needs SPECIFIC byte patterns (CABAC-like lookup tables) that we'd need to reverse-engineer.

## Branch state

`boltzmann:~/src/linux-rockchip:vp9-enablement-iter1`, head `3d7ffae30626`. 7 commits total. 1620 LoC across 4 files. Compiles clean. Format enumerates correctly. HW does not decode.

## Ampere state

Currently loaded: iter10 (HW hangs on every VP9 frame).

Recovery to sibling-campaign close: `sudo cp ~/vp9-iter1-backup/rockchip-vdec.ko.sibling-campaign-close /lib/modules/$(uname -r)/kernel/drivers/media/platform/rockchip/rkvdec/rockchip-vdec.ko && sudo depmod -a && sudo modprobe -r rockchip-vdec && sudo modprobe rockchip-vdec`. HEVC bit-perfect restored.
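The struct-size figures in the key findings follow directly from 4-byte registers, and can be re-derived with a small userspace sketch. This is not the driver code: `vp9_param_regs`, `common_addr_regs`, `reg_span_bytes` and `unwritten_tail` are hypothetical names, and the two structs are stand-ins sized to match the measured 196 and 60 bytes rather than the real field layout.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Each hardware register is one 32-bit word. */
#define REG_BYTES sizeof(uint32_t)

/* Bytes covered by the inclusive register range reg<first>..reg<last>. */
static inline size_t reg_span_bytes(unsigned int first, unsigned int last)
{
	return (size_t)(last - first + 1) * REG_BYTES;
}

/* Hypothetical stand-ins, sized like the structs measured in the report. */
struct vp9_param_regs   { uint32_t reg[49]; };
struct common_addr_regs { uint32_t reg[15]; };

/* Compile-time size checks, in the spirit of the iter 7 sizeof asserts. */
static_assert(sizeof(struct vp9_param_regs) == 196, "vp9_param should be 196 bytes");
static_assert(sizeof(struct common_addr_regs) == 60, "common_addr should be 60 bytes");

/* Unwritten tail = full segment size minus what the struct covers. */
static inline size_t unwritten_tail(size_t full_bytes, size_t struct_bytes)
{
	return full_bytes - struct_bytes;
}
```

With these helpers, `reg_span_bytes(113, 127)` gives the 60-byte tail of the 256-byte param segment, and `reg_span_bytes(143, 159)` gives 68 bytes, which is exactly the 128 minus 60 left unwritten in the common-address segment.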
## Pivot options (Janet recommended PIVOT)

| Option | Effort | Outcome |
|---|---|---|
| **A: AV1 on vdpu383** | 1-2 days port (Casanova has full vdpu383 + AV1 already in v7.0) | New HW codec on ampere working; VP9 stays SW-only |
| **B: Add full BSP `mpp_rkvdec2` rkvdec2_run sequence to mainline** | Weeks; won't upstream | VP9 might work; not maintainable |
| **C: Coordinate with Collabora upstream** | Indefinite wait | Eventually a clean port exists |
| **D: Reverse-engineer probability buffer init from BSP `hal_vp9d_prob_default`** | Days of careful work | Maybe completes the current attempt |
| **E: Document findings + abandon** | 0 | Campaign closes at "structural impossibility identified" |

## Campaign total

- 0:50: sibling campaign ampere-kernel-decoders closes (HEVC bit-perfect)
- 1:00: Phase 0 opens
- 1:35: Phase 1 plan v1 → Janet AMEND (2 BLOCKERs)
- 2:00: Phase 1 plan v2 → Janet AMEND (2 amendments)
- 2:15: Phase 1 v3 amendment → PROCEED
- 2:30: Phase 2 implementation begins
- 3:00: Phase 2.1 complete (full register-packing translation, compiles clean)
- 8:00: Phase 3 install begins after morning resume
- 8:15: Phase 3 first-light fails (6 iterations)
- 8:20: Janet structural review → PIVOT (with one final test prescription)
- 8:30: Iters 7-10 confirm structural impossibility
- 8:35: Phase 3 final close

Total: ~7h45m wall-clock from 0:50 to 8:35. Excluding the overnight break (3:00 to 8:00), active work comes to roughly 2h45m.

## Honest assessment for the user

We have a structurally-correct kernel module that exposes VP9F format on `/dev/video1`, compiles clean, has the right segmented register layout per the BSP, and writes to all the right register addresses. The hardware reads our register configuration but never decodes: it hangs at some stage we cannot see through the IRQ status interface. Without HW logic-analyzer access or Collabora's internal validation suite, the gap between "register state we provide" and "register state HW needs" is opaque to mainline.

BSP works because it includes complete kernel + userspace + firmware-table init; mainline's vdpu381 path is HEVC + H.264 only and that's not coincidence.

Per Janet's PIVOT verdict, recommend Option A (AV1 on vdpu383) for shortest-path-to-working-HW-decode on ampere. VP9-on-vdpu381 should be coordinated with Collabora or deferred. Our work is preserved as the substrate for whoever takes this forward.
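For whoever picks this up, the workaround built up over iterations 8 to 10 (pointing every otherwise-zero VP9 base register at a zeroed scratch buffer so the hardware never dereferences IOVA 0) can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the code on the branch: `NUM_REGS`, `vp9_base_regs` and `fill_unset_bases` are hypothetical names, and the register list is only the set named in the iteration table. Per the results above, this only removes the IOMMU faults; it does not make the hardware decode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shadow-register array; 271 entries covers reg0..reg270. */
#define NUM_REGS 271

/* Base-address registers that iterations 8-10 pointed at scratch. */
static const unsigned int vp9_base_regs[] = { 160, 161, 162, 163, 172, 197 };

/*
 * Replace every still-zero base register with the scratch buffer's IOVA.
 * Registers that already hold a real address are left untouched.
 * Returns the number of registers that were patched.
 */
static unsigned int fill_unset_bases(uint32_t regs[NUM_REGS], uint32_t scratch_iova)
{
	unsigned int patched = 0;

	for (size_t i = 0; i < sizeof(vp9_base_regs) / sizeof(vp9_base_regs[0]); i++) {
		unsigned int idx = vp9_base_regs[i];

		if (regs[idx] == 0) {
			regs[idx] = scratch_iova;
			patched++;
		}
	}
	return patched;
}
```

Checking for zero instead of overwriting unconditionally keeps any base a later experiment populates with real data, such as a properly initialised probability table under option D.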