iter8 Phase 0+1: lock E (perf binding cell) — campaign-closing iteration

iter8 is the final iteration. Locks Track E (carried iter1..iter7) as the empirical-anchor closing artifact: measure CPU%, drops, frame timing, GPU freq, memory across four consumer configurations (mpv DMA-BUF, mpv vaapi-copy, Firefox-fourier, SW baseline) on bbb_1080p30_h264.mp4 against the iter7-end driver.

Why now: iter1-iter7 prioritized binary blockers; measuring a broken decoder is useless. iter7-end driver is the first stable substrate where numbers don't drift between consumer probes.

Why this matters even without upstreaming (D dropped 2026-05-06):
- Personal regression detection for any future fork change
- Realism check on the campaign's own qualitative claims
- Calibration for follow-on campaigns (fourier-fresnel will compare RK3399 numbers against this anchor)

Phase 1 success criterion (5 parts):
1. Reproducible script in tests/
2. Anchored numbers in a campaign artifact
3. Honest qualitative interpretation (no spin)
4. Phase 5 sonnet review confirms script is fixture-agnostic
5. Campaign close doc states "campaign closes"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,97 @@
# Phase 0: iteration 8 substrate (libva-multiplanar campaign, final iteration)

Opened 2026-05-06 immediately after iter7 close + post-close research. iter8 is the **campaign-closing iteration**: anchors the deliverables to measured numbers (candidate E, the performance binding cell), then formally closes the campaign.

iter6 met the operator's primary goal end-to-end (Firefox HW decode of YouTube avc1 on PineTab2 with sandbox enabled). iter7 closed three internal carry items (msync verify + slot-leak recovery + cap_pool race harness). Post-iter7 research dropped Track F (DMABUF on OUTPUT, technical merit) and Track D (upstreaming, philosophical; see `memory/project_no_upstreaming_philosophical.md`). Track E is the last remaining candidate within the campaign. Follow-on top-level campaigns (`fourier-fresnel`, `panvk-bifrost`) are chartered separately and not part of this campaign's iteration sequence.

## Predecessor close-out summary (iteration 7 → iteration 8)

iter7 landed three commits:

- `988b848` (main A+B+C): slot-leak `request_pool_force_release`, cap_pool race synthetic harness in `tests/`, msync pixel-verify shell harness in `tests/`.
- `7bd0818` (Phase 7 finalization): OUTPUT-pool teardown on resolution-change in CreateSurfaces2 (latent bug surfaced by the synthetic harness).
- `dcaa1f1`: silicon-ID nomenclature fix (PineTab2 = RK3566 silicon, hantro driver via the `rockchip,rk3568-vpu` DT compatible).

iter7 carried into iter8:
- **STREAMON-on-context-recreate after resolution change**: corner case (real consumers don't trigger it), low priority
- **Pool-size parameterization**: iter6 sonnet review carry, low priority
- **Fault-inject build for Track B**: empirical hard-guarantee for the slot-leak recovery code path; sonnet code-review covered semantic correctness, deferred unless concretely needed

None of those are blockers for iter8 close.

## Iteration 8 candidate research question (single track)

### E. Performance binding cell (carried iter1..iter7, finally locked iter8)

> Anchor measured numbers for the four primary consumer paths on `bbb_1080p30_h264.mp4`. Drop count, CPU%, frame timing, GPU/VPU freq, memory footprint. Reproducible from a documented script.

**Why now**: iter1-iter7 each prioritized closing a binary blocker over measurement. Measuring a broken decoder is useless; iter7-end driver is the first stable substrate where numbers are meaningful and won't drift between consumer probes. This is the campaign's empirical anchor, the closing artifact.

**Why this matters even without upstreaming** (per `project_no_upstreaming_philosophical.md` Track D drop):
- Personal regression detection: any future change to the fork has a measured "before" to reference.
- Realism check on the campaign's own qualitative claims (iter5/iter6/iter7 closes used "GREEN" without numbers; E forces honesty about what HW decode actually saves).
- Calibrates expectations for the follow-on campaigns (`fourier-fresnel` will compare RK3399 numbers against PineTab2's anchor; `panvk-bifrost` will reference the GLES-vs-future-Vulkan delta).

**Plan**: shell script in `tests/run_perf_binding_cell.sh`. Runs each of four consumer configurations for 30s on the campaign fixture, captures:
- `pidstat -u -p <PID> 1 30` → per-second CPU% timeseries → median, p90
- `/sys/class/devfreq/fde60000.gpu/cur_freq` polled at 100ms cadence → freq residency histogram
- mpv `--term-status-msg='${frame-drop-count} ${time-pos} ${vsync-jitter}'` → drops + actual position + jitter
- Firefox via `top -p` snapshot during steady-state playback (RDD process) since `about:processes` isn't programmatically scrapeable
- `/proc/<PID>/status` VmRSS at start + end → memory delta
- Optional: `/sys/kernel/debug/...hantro...` if exposed

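
The CPU% and VmRSS reductions in the capture list could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the campaign's actual script: the function names `cpu_samples`, `percentile`, and `vmrss_kib` are hypothetical, the percentile uses the nearest-rank definition (the plan does not say which definition is intended), and the parser assumes `pidstat` ran under `LC_ALL=C` so the header carries a `%CPU` column and decimals use a dot.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not taken from the campaign repo.

# Extract the %CPU column from `pidstat -u` output on stdin.
# The column index is taken from the header line, so an extra AM/PM
# field in the timestamp does not shift the result.
cpu_samples() {
    awk '
        /^Average:/ { next }                 # skip the trailing summary row
        col == 0 {                           # still looking for the header
            for (i = 1; i <= NF; i++) if ($i == "%CPU") col = i
            next
        }
        NF >= col && $col ~ /^[0-9]+(\.[0-9]+)?$/ { print $col }
    '
}

# Nearest-rank percentile of the numbers on stdin: percentile 50, percentile 90.
percentile() {
    LC_ALL=C sort -n | awk -v p="$1" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            rank = int((p * NR + 99) / 100)  # ceil(p * NR / 100) for integer p
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

# Read VmRSS (kB) from a status file such as /proc/<PID>/status.
vmrss_kib() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Example (not run here; assumes pidstat is installed and $pid is the consumer):
#   LC_ALL=C pidstat -u -p "$pid" 1 30 | cpu_samples > cpu.txt
#   percentile 50 < cpu.txt
#   percentile 90 < cpu.txt
#   vmrss_kib "/proc/$pid/status"
```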

Four consumer configurations:
1. **mpv `--hwdec=vaapi`**: DMA-BUF zero-copy path (full HW)
2. **mpv `--hwdec=vaapi-copy`**: HW decode + VAImage readback to userspace
3. **Firefox 150 (iter5-amend, sandbox enabled)**: production HW path through libva
4. **mpv `--hwdec=no` (SW baseline)**: control

**Risk**: low. Measurement-only. No driver code changes.

**Effort**: 3-4 hours including script + run + parsing + markdown table generation.
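
The three mpv rows of the consumer list differ only in the `--hwdec` value, so one loop could drive them. The sketch below assumes hypothetical labels (`mpv-dmabuf`, `mpv-vaapi-copy`, `mpv-sw`) and hypothetical function names. It also assumes that mpv's status updates arrive separated by CR or LF with no terminal escape sequences when output is piped, which should be confirmed on the target before relying on it. Firefox is deliberately not covered, since it is not launched through mpv.

```shell
#!/bin/sh
# Hypothetical helpers; labels and names are illustrative only.

# Map a configuration label to the mpv --hwdec value from the list above.
hwdec_for() {
    case "$1" in
        mpv-dmabuf)     echo vaapi ;;
        mpv-vaapi-copy) echo vaapi-copy ;;
        mpv-sw)         echo no ;;
        *)              return 1 ;;   # e.g. firefox: not driven through mpv
    esac
}

# Keep the last complete "drops time-pos jitter" triple from captured
# status output on stdin (assumes CR- or LF-separated updates).
last_status() {
    tr '\r' '\n' | awk 'NF == 3 { last = $0 } END { if (last != "") print last }'
}

# Example (not run here; assumes mpv is installed and $fixture is set):
#   for cfg in mpv-dmabuf mpv-vaapi-copy mpv-sw; do
#       mpv --no-config --length=30 --hwdec="$(hwdec_for "$cfg")" \
#           --term-status-msg='${frame-drop-count} ${time-pos} ${vsync-jitter}' \
#           "$fixture" 2>&1 | last_status
#   done
```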
## In-scope (LOCKED 2026-05-06 for iteration 8): E

Operator locked **E** as the sole iter8 track. iter8 is the campaign-closing iteration.

- D (upstreaming) was dropped 2026-05-06 on philosophical grounds (`memory/project_no_upstreaming_philosophical.md`).
- F (DMABUF on OUTPUT) was dropped 2026-05-06 on technical grounds (`track_F_research_2026-05-06.md`).
- A, B, C closed iter7. iter1-iter6 carries all closed.

iter7 carries (STREAMON-on-context-recreate, pool-size parameterization, slot-leak fault-inject) remain as low-priority items in the campaign-close doc, not iter8 scope.

## Out-of-scope (LOCKED 2026-05-06 for iteration 8)

- iter1-iter7 completed work: done.
- Codecs outside H.264 (MPEG-2 dropped iter6, others out per iter1 lock).
- New target hardware (fresnel, ampere): separate top-level campaigns.
- Upstreaming: dropped on philosophical grounds.
- DMABUF on OUTPUT: dropped on technical grounds.
- Driver code changes: measurement only.

## Phase 1 success criterion (LOCKED 2026-05-06 for iteration 8)

> 1. **Reproducible measurement script** committed to `tests/run_perf_binding_cell.sh` (or similar) that runs each of the four consumer configurations for ≥30 seconds against `bbb_1080p30_h264.mp4` on ohm and emits a markdown-formatted table with the following columns per row: consumer, CPU% median, CPU% p90, drops in measurement window, p50 frame interval (ms), GPU freq median (MHz), VmRSS delta (MiB).
>
> 2. **Anchored numbers** for all four consumers captured into a campaign artifact (`phase7_iter8_perf_anchor.md` or similar). Numbers must come from a clean ohm run on the iter7-end driver (sha `54999017…` or rebuild from iter7 HEAD `7bd0818`).
>
> 3. **Honest qualitative interpretation** in the close doc. If the numbers are uglier than expected (e.g., HW decode only saves 30% browser CPU rather than 80%), document that. The campaign's prior qualitative descriptors get re-anchored to the actual data.
>
> 4. **Phase 5 sonnet review** confirms: (a) script is fixture-agnostic (works for any H.264 file the operator passes), (b) measurements aren't fixture-hardcoded, (c) results are presented honestly without spin.
>
> 5. **Campaign close doc** (`phase8_iteration8_close.md`) explicitly states "campaign closes" and lists residual carries for any future operator who picks this up.
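
Criterion 1 fixes the table columns and criterion 4 asks for a fixture-agnostic script, so the output stage could look something like this. It is a sketch only: the function names are hypothetical, the unit conversions assume that devfreq reports `cur_freq` in Hz and that VmRSS is reported in kB, and the fixture path is expected to come from the caller rather than being hardcoded.

```shell
#!/bin/sh
# Hypothetical output helpers; names are illustrative only.

# Print the markdown header for the seven columns named in criterion 1.
table_header() {
    printf '| consumer | CPU%% median | CPU%% p90 | drops in measurement window | p50 frame interval (ms) | GPU freq median (MHz) | VmRSS delta (MiB) |\n'
    printf '|---|---:|---:|---:|---:|---:|---:|\n'
}

# Print one row; expects exactly seven arguments in column order.
table_row() {
    [ "$#" -eq 7 ] || return 1
    printf '| %s | %s | %s | %s | %s | %s | %s |\n' "$@"
}

# Fixture check: succeed only if the caller passed a readable file.
require_fixture() {
    [ -n "$1" ] && [ -r "$1" ]
}

# Unit conversions (assumption: cur_freq is in Hz, VmRSS is in kB).
hz_to_mhz()  { awk -v hz="$1" 'BEGIN { printf "%d\n", hz / 1000000 }'; }
kib_to_mib() { awk -v k="$1"  'BEGIN { printf "%.1f\n", k / 1024 }'; }

# Example (not run here):
#   require_fixture "$1" || { echo "usage: $0 <h264-file>" >&2; exit 2; }
#   table_header
#   table_row "$label" "$cpu_p50" "$cpu_p90" "$drops" "$frame_p50" "$mhz" "$rss_delta"
```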
## Phase 1 LOCKED. Iteration 8 proceeds.

iter8 = candidate **E** alone. Phases 2..8 + campaign-close:
- Phase 2: situation analysis covering measurement methodology, parsing approach, and edge cases (the SW baseline is expected to drop dozens of frames at 1080p30; Firefox numbers are expected to be limited to what we can scrape without Firefox-internal hooks)
- Phase 3: baseline anchor, a quick smoke run of `pidstat` + `/sys/class/devfreq` polling on ohm to confirm tooling availability
- Phase 4: implement `tests/run_perf_binding_cell.sh`
- Phase 5: sonnet review
- Phase 6: deploy script (commit + sync to ohm)
- Phase 7: run, capture, generate table
- Phase 8: close iteration AND close campaign
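
The Phase 3 smoke run could be as small as a tool check plus a short devfreq poll. The sketch below is illustrative only: the function names are hypothetical, frequencies are assumed to be reported in Hz, and the fractional `sleep 0.1` in the usage example is not guaranteed by POSIX (GNU coreutils and BusyBox accept it).

```shell
#!/bin/sh
# Hypothetical Phase 3 helpers; names are illustrative only.

# Print the name of each required tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Read NODE COUNT times, sleeping INTERVAL seconds between reads.
poll_node() {
    node=$1 count=$2 interval=$3 i=0
    while [ "$i" -lt "$count" ]; do
        cat "$node" || return 1
        sleep "$interval"
        i=$((i + 1))
    done
}

# Turn one frequency in Hz per line into "MHz samples" residency pairs.
residency() {
    LC_ALL=C sort -n | uniq -c | awk '{ printf "%d %d\n", $2 / 1000000, $1 }'
}

# Example (not run here; 300 reads at 100 ms covers a 30 s window):
#   missing_tools pidstat mpv awk
#   poll_node /sys/class/devfreq/fde60000.gpu/cur_freq 300 0.1 | residency
```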
## Stop point

After Phase 8 close, the campaign formally closes. Future operator-initiated work would re-open as a new top-level campaign (e.g., `fourier-fresnel` per `project_followon_campaigns.md`).