diff --git a/phase0_findings_iter8.md b/phase0_findings_iter8.md
new file mode 100644
index 0000000..1da05bf
--- /dev/null
+++ b/phase0_findings_iter8.md
@@ -0,0 +1,97 @@
+# Phase 0 — iteration 8 substrate (libva-multiplanar campaign — final iteration)
+
+Opened 2026-05-06 immediately after iter7 close + post-close research. iter8 is the **campaign-closing iteration**: it anchors the deliverables to measured numbers (candidate E — performance binding cell), then formally closes the campaign.
+
+iter6 met the operator's primary goal end-to-end (Firefox HW decode of YouTube avc1 on PineTab2 with sandbox enabled). iter7 closed three internal carry items (msync verify + slot-leak recovery + cap_pool race harness). Post-iter7 research dropped Track F (DMABUF on OUTPUT, on technical grounds) and Track D (upstreaming, on philosophical grounds — see `memory/project_no_upstreaming_philosophical.md`). Track E is the last remaining candidate within the campaign. Follow-on top-level campaigns (`fourier-fresnel`, `panvk-bifrost`) are chartered separately and are not part of this campaign's iteration sequence.
+
+## Predecessor close-out summary (iteration 7 → iteration 8)
+
+iter7 landed three fork commits:
+
+- `988b848` — main A+B+C: slot-leak `request_pool_force_release`, cap_pool race synthetic harness in `tests/`, msync pixel-verify shell harness in `tests/`.
+- `7bd0818` — Phase 7 finalization: OUTPUT-pool teardown on resolution-change in CreateSurfaces2 (latent bug surfaced by the synthetic harness).
+- `dcaa1f1` — silicon-ID nomenclature fix (PineTab2 = RK3566 silicon, hantro driver via the `rockchip,rk3568-vpu` DT compatible).
+
+iter7 carried into iter8:
+- **STREAMON-on-context-recreate after resolution change** — corner case (real consumers don't trigger it), low priority
+- **Pool-size parameterization** — iter6 sonnet review carry, low priority
+- **Fault-inject build for Track B** — empirical hard-guarantee for the slot-leak recovery code path; sonnet code-review covered semantic correctness, deferred unless concretely needed
+
+None of those are blockers for iter8 close.
+
+## Iteration 8 candidate research question (single track)
+
+### E. Performance binding cell (carried iter1..iter7 — finally locked iter8)
+
+> Anchor measured numbers for the four primary consumer paths on `bbb_1080p30_h264.mp4`. Drop count, CPU%, frame timing, GPU/VPU freq, memory footprint. Reproducible from a documented script.
+
+**Why now**: iter1-iter7 each prioritized closing a binary blocker over measurement. Measuring a broken decoder is useless; the iter7-end driver is the first stable substrate where numbers are meaningful and won't drift between consumer probes. This is the campaign's empirical anchor and its closing artifact.
+
+**Why this matters even without upstreaming** (per `project_no_upstreaming_philosophical.md` Track D drop):
+- Personal regression detection: any future change to the fork has a measured "before" to reference.
+- Realism check on the campaign's own qualitative claims (iter5/iter6/iter7 closes used "GREEN" without numbers — E forces honesty about what HW decode actually saves).
+- Calibrates expectations for the follow-on campaigns (`fourier-fresnel` will compare RK3399 numbers against PineTab2's anchor; `panvk-bifrost` will reference the GLES-vs-future-Vulkan delta).
+
+**Plan**: shell script in `tests/run_perf_binding_cell.sh`. Runs each of the four consumer configurations for 30s on the campaign fixture and captures:
+- `pidstat -u -p <pid> 1 30` → per-second CPU% timeseries → median, p90
+- `/sys/class/devfreq/fde60000.gpu/cur_freq` polled at 100ms cadence → freq residency histogram
+- mpv `--term-status-msg='${frame-drop-count} ${time-pos} ${vsync-jitter}'` → drops + actual position + jitter
+- Firefox via `top -p` snapshot during steady-state playback (RDD process) since `about:processes` isn't programmatically scrapeable
+- `/proc/<pid>/status` VmRSS at start + end → memory delta
+- Optional: `/sys/kernel/debug/...hantro...` if exposed
+
+Four consumer configurations:
+1. **mpv `--hwdec=vaapi`** — DMA-BUF zero-copy path (full HW)
+2. **mpv `--hwdec=vaapi-copy`** — HW decode + VAImage readback to userspace
+3. **Firefox 150 (iter5-amend, sandbox enabled)** — production HW path through libva
+4. **mpv `--hwdec=no` (SW baseline)** — control
+
+**Risk**: low. Measurement-only. No driver code changes.
+
+**Effort**: 3-4 hours including script + run + parsing + markdown table generation.
+
+## In-scope (LOCKED 2026-05-06 for iteration 8) — E
+
+Operator locked **E** as the sole iter8 track. iter8 is the campaign-closing iteration.
+
+D (upstreaming) was dropped 2026-05-06 on philosophical grounds (`memory/project_no_upstreaming_philosophical.md`).
+F (DMABUF on OUTPUT) was dropped 2026-05-06 on technical grounds (`track_F_research_2026-05-06.md`).
+A, B, C closed iter7. iter1-iter6 carries all closed.
+
+iter7 carries (STREAMON-on-context-recreate, pool-size parameterization, slot-leak fault-inject) remain as low-priority items in the campaign-close doc, not iter8 scope.
+
+## Out-of-scope (LOCKED 2026-05-06 for iteration 8)
+
+- iter1-iter7 completed work — done.
+- Codecs outside H.264 (MPEG-2 dropped iter6, others out per iter1 lock).
+- New target hardware (fresnel, ampere) — separate top-level campaigns.
+- Upstreaming — dropped on philosophical grounds.
+- DMABUF on OUTPUT — dropped on technical grounds.
+- Driver code changes — measurement only.
+
+## Phase 1 success criterion (LOCKED 2026-05-06 for iteration 8)
+
+> 1. **Reproducible measurement script** committed to `tests/run_perf_binding_cell.sh` (or similar) that runs each of the four consumer configurations for ≥30 seconds against `bbb_1080p30_h264.mp4` on ohm and emits a markdown-formatted table with the following columns per row: consumer, CPU% median, CPU% p90, drops in measurement window, p50 frame interval (ms), GPU freq median (MHz), VmRSS delta (MiB).
+>
+> 2. **Anchored numbers** for all four consumers captured into a campaign artifact (`phase7_iter8_perf_anchor.md` or similar). Numbers must come from a clean ohm run on the iter7-end driver (sha `54999017…` or rebuild from iter7 HEAD `7bd0818`).
+>
+> 3. **Honest qualitative interpretation** in the close doc. If the numbers are uglier than expected (e.g., HW decode only saves 30% browser CPU rather than 80%), document that. The campaign's prior qualitative descriptors get re-anchored to the actual data.
+>
+> 4. **Phase 5 sonnet review** confirms: (a) script is fixture-agnostic (works for any H.264 file the operator passes), (b) measurements aren't fixture-hardcoded, (c) results are presented honestly without spin.
+>
+> 5. **Campaign close doc** (`phase8_iteration8_close.md`) explicitly states "campaign closes" and lists residual carries for any future operator who picks this up.
+
+## Phase 1 LOCKED. Iteration 8 proceeds.
+
+iter8 = candidate **E** alone. Phases 2..8 + campaign-close:
+- Phase 2: situation analysis — measurement methodology, parsing approach, edge cases (the SW baseline is expected to drop dozens of frames at 1080p30; Firefox numbers are limited by what can be scraped without Firefox-internal hooks)
+- Phase 3: baseline anchor — quick smoke run of `pidstat` + `/sys/class/devfreq` polling on ohm to confirm tooling availability
+- Phase 4: implement `tests/run_perf_binding_cell.sh`
+- Phase 5: sonnet review
+- Phase 6: deploy script (commit + sync to ohm)
+- Phase 7: run, capture, generate table
+- Phase 8: close iteration AND close campaign
+
+## Stop point
+
+After Phase 8 close, the campaign formally closes. Future operator-initiated work would re-open as a new top-level campaign (e.g., `fourier-fresnel` per `project_followon_campaigns.md`).
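
Review note: the CPU% reduction step in the Plan (a `pidstat` timeseries reduced to median and p90) could be sketched roughly as follows. This is an illustration only and not the contents of `tests/run_perf_binding_cell.sh`. The helper names `cpu_column` and `percentile` are hypothetical. The percentile uses the nearest-rank definition, so the median of an even-length series is the lower-middle sample rather than an average. The parser assumes `pidstat` is run under `LC_ALL=C` so that decimals use a dot.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the campaign's script.

# cpu_column: read `pidstat -u` output on stdin and print one %CPU value per
# sample row. The column index is taken from the most recent header row.
cpu_column() {
    awk '
        {
            is_header = 0
            for (i = 1; i <= NF; i++)
                if ($i == "%CPU") { col = i; is_header = 1 }
            # Skip header rows, anything before the first header, and the
            # trailing "Average:" summary row.
            if (is_header || col == 0 || $1 == "Average:") next
            if ($col ~ /^[0-9]+(\.[0-9]+)?$/) print $col
        }'
}

# percentile P: read one number per line on stdin and print the nearest-rank
# P-th percentile (P is an integer in 1..100). Fails if stdin is empty.
percentile() {
    LC_ALL=C sort -n | awk -v p="$1" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

# Example wiring (assumes $pid holds the consumer PID; cpu.txt is a scratch file):
#   LC_ALL=C pidstat -u -p "$pid" 1 30 | cpu_column > cpu.txt
#   printf 'median=%s p90=%s\n' "$(percentile 50 < cpu.txt)" "$(percentile 90 < cpu.txt)"
```

Locating `%CPU` from the header row, rather than hardcoding a field number, is a deliberate choice in this sketch: the field position shifts when the timestamp carries an AM/PM suffix.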