iter8 is the final iteration. Locks Track E (carried iter1..iter7) as the empirical-anchor closing artifact: measure CPU%, drops, frame timing, GPU freq, memory across four consumer configurations (mpv DMA-BUF, mpv vaapi-copy, Firefox-fourier, SW baseline) on bbb_1080p30_h264.mp4 against the iter7-end driver.

Why now: iter1-iter7 prioritized binary blockers; measuring a broken decoder is useless. iter7-end driver is the first stable substrate where numbers don't drift between consumer probes.

Why this matters even without upstreaming (D dropped 2026-05-06):
- Personal regression detection for any future fork change
- Realism check on the campaign's own qualitative claims
- Calibration for follow-on campaigns (fourier-fresnel will compare RK3399 numbers against this anchor)

Phase 1 success criterion (5 parts):
1. Reproducible script in tests/
2. Anchored numbers in a campaign artifact
3. Honest qualitative interpretation (no spin)
4. Phase 5 sonnet review confirms script is fixture-agnostic
5. Campaign close doc states "campaign closes"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 0 — iteration 8 substrate (libva-multiplanar campaign — final iteration)
Opened 2026-05-06 immediately after iter7 close + post-close research. iter8 is the campaign-closing iteration: anchors the deliverables to measured numbers (candidate E — performance binding cell), then formally closes the campaign.
iter6 met the operator's primary goal end-to-end (Firefox HW decode of YouTube avc1 on PineTab2 with sandbox enabled). iter7 closed three internal carry items (msync verify + slot-leak recovery + cap_pool race harness). Post-iter7 research dropped Track F (DMABUF on OUTPUT, technical merit) and Track D (upstreaming, philosophical — see memory/project_no_upstreaming_philosophical.md). Track E is the last remaining candidate within the campaign. Follow-on top-level campaigns (fourier-fresnel, panvk-bifrost) are chartered separately and not part of this campaign's iteration sequence.
Predecessor close-out summary (iteration 7 → iteration 8)
iter7 landed two fork commits:
- 988b848 (main A+B+C): slot-leak request_pool_force_release, cap_pool race synthetic harness in tests/, msync pixel-verify shell harness in tests/.
- 7bd0818 (Phase 7 finalization): OUTPUT-pool teardown on resolution-change in CreateSurfaces2 (latent bug surfaced by the synthetic harness).
- dcaa1f1: silicon-ID nomenclature fix (PineTab2 = RK3566 silicon, hantro driver via the rockchip,rk3568-vpu DT compatible).
iter7 carried into iter8:
- STREAMON-on-context-recreate after resolution change — corner case (real consumers don't trigger), low priority
- Pool-size parameterization — iter6 sonnet review carry, low priority
- Fault-inject build for Track B — empirical hard-guarantee for the slot-leak recovery code path; sonnet code-review covered semantic correctness, deferred unless concretely needed
None of those are blockers for iter8 close.
Iteration 8 candidate research question (single track)
E. Performance binding cell (carried iter1..iter7 — finally locked iter8)
Anchor measured numbers for the four primary consumer paths on bbb_1080p30_h264.mp4. Drop count, CPU%, frame timing, GPU/VPU freq, memory footprint. Reproducible from a documented script.
Why now: iter1-iter7 each prioritized closing a binary blocker over measurement. Measuring a broken decoder is useless; iter7-end driver is the first stable substrate where numbers are meaningful and won't drift between consumer probes. This is the campaign's empirical anchor, the closing artifact.
Why this matters even without upstreaming (per project_no_upstreaming_philosophical.md Track D drop):
- Personal regression detection: any future change to the fork has a measured "before" to reference.
- Realism check on the campaign's own qualitative claims (iter5/iter6/iter7 closes used "GREEN" without numbers — E forces honesty about what HW decode actually saves).
- Calibrates expectations for the follow-on campaigns (fourier-fresnel will compare RK3399 numbers against PineTab2's anchor; panvk-bifrost will reference the GLES-vs-future-Vulkan delta).
Plan: shell script in tests/run_perf_binding_cell.sh. Runs each of four consumer configurations for 30s on the campaign fixture, captures:
- pidstat -u -p <PID> 1 30 → per-second CPU% timeseries → median, p90
- /sys/class/devfreq/fde60000.gpu/cur_freq polled at 100ms cadence → freq residency histogram
- mpv --term-status-msg='${frame-drop-count} ${time-pos} ${vsync-jitter}' → drops + actual position + jitter
- Firefox via top -p snapshot during steady-state playback (RDD process) since about:processes isn't programmatically scrapeable
- /proc/<PID>/status VmRSS at start + end → memory delta
- Optional: /sys/kernel/debug/...hantro... if exposed
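The reductions named in the capture list (median, p90, residency histogram) can be sketched in plain POSIX shell. This is a sketch under stated assumptions, not the campaign script: the helper names percentile, residency_histogram, and poll_file are hypothetical, the percentile uses the nearest-rank definition (so an even-length series yields the lower median), and devfreq cur_freq is assumed to report Hz. Extracting the CPU% column from pidstat output is left out because its position varies between sysstat versions.

```shell
# Hypothetical helpers; names are illustrative, not from the campaign script.

# percentile P: read one number per line on stdin, print the nearest-rank
# P-th percentile (P is an integer 1..100; P=50 is the median).
percentile() (
    LC_ALL=C sort -n | LC_ALL=C awk -v p="$1" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            k = int((p * NR + 99) / 100)   # ceil(p * NR / 100)
            if (k < 1) k = 1
            print v[k]
        }'
)

# residency_histogram: read one frequency sample (assumed Hz) per line on
# stdin, print "<MHz> <percent of samples>" per distinct value, ascending.
residency_histogram() (
    LC_ALL=C sort -n | uniq -c | LC_ALL=C awk '
        { count[NR] = $1; freq[NR] = $2; total += $1 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%d %.1f\n", freq[i] / 1000000, 100 * count[i] / total
        }'
)

# poll_file PATH INTERVAL_S COUNT: print the contents of PATH COUNT times,
# sleeping INTERVAL_S between reads (fractional sleep needs GNU sleep).
poll_file() (
    i=0
    while [ "$i" -lt "$3" ]; do
        cat "$1"
        sleep "$2"
        i=$((i + 1))
    done
)

# Example wiring (not executed here): 300 samples at 100ms is roughly 30s.
#   poll_file /sys/class/devfreq/fde60000.gpu/cur_freq 0.1 300 > freq.txt
#   residency_histogram < freq.txt
#   percentile 50 < cpu_percent.txt; percentile 90 < cpu_percent.txt
```

Keeping each reduction as a stdin filter means the same helpers apply to the CPU% series and the frequency samples without knowing where they came from.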
Four consumer configurations:
- mpv --hwdec=vaapi: DMA-BUF zero-copy path (full HW)
- mpv --hwdec=vaapi-copy: HW decode + VAImage readback to userspace
- Firefox 150 (iter5-amend, sandbox enabled): production HW path through libva
- mpv --hwdec=no (SW baseline): control
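One way to keep the three mpv rows from diverging is to derive the --hwdec value from a row label. A minimal sketch, assuming hypothetical row labels (mpv-dmabuf, mpv-vaapi-copy, mpv-sw) and a hypothetical helper name hwdec_for; Firefox is not an mpv invocation, so the helper rejects it and that row has to be driven separately.

```shell
# hwdec_for LABEL: print the mpv --hwdec value for one consumer row.
# Labels are illustrative; only the --hwdec values come from the plan.
hwdec_for() {
    case "$1" in
        mpv-dmabuf)     echo "vaapi" ;;
        mpv-vaapi-copy) echo "vaapi-copy" ;;
        mpv-sw)         echo "no" ;;
        *)
            echo "hwdec_for: not an mpv consumer: $1" >&2
            return 1
            ;;
    esac
}

# Example wiring (not executed here); --length bounds the run to 30s:
#   for row in mpv-dmabuf mpv-vaapi-copy mpv-sw; do
#       mpv --hwdec="$(hwdec_for "$row")" --length=30 \
#           --term-status-msg='${frame-drop-count} ${time-pos} ${vsync-jitter}' \
#           "$fixture"
#   done
```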
Risk: low. Measurement-only. No driver code changes.
Effort: 3-4 hours including script + run + parsing + markdown table generation.
In-scope (LOCKED 2026-05-06 for iteration 8) — E
Operator locked E as the sole iter8 track. iter8 is the campaign-closing iteration.
D (upstreaming) was dropped 2026-05-06 on philosophical grounds (memory/project_no_upstreaming_philosophical.md).
F (DMABUF on OUTPUT) was dropped 2026-05-06 on technical grounds (track_F_research_2026-05-06.md).
A, B, C closed iter7. iter1-iter6 carries all closed.
iter7 carries (STREAMON-on-context-recreate, pool-size parameterization, slot-leak fault-inject) remain as low-priority items in the campaign-close doc, not iter8 scope.
Out-of-scope (LOCKED 2026-05-06 for iteration 8)
- iter1-iter7 completed work — done.
- Codecs outside H.264 (MPEG-2 dropped iter6, others out per iter1 lock).
- New target hardware (fresnel, ampere) — separate top-level campaigns.
- Upstreaming — dropped on philosophical grounds.
- DMABUF on OUTPUT — dropped on technical grounds.
- Driver code changes — measurement only.
Phase 1 success criterion (LOCKED 2026-05-06 for iteration 8)
1. Reproducible measurement script committed to tests/run_perf_binding_cell.sh (or similar) that runs each of the four consumer configurations for ≥30 seconds against bbb_1080p30_h264.mp4 on ohm and emits a markdown-formatted table with the following columns per row: consumer, CPU% median, CPU% p90, drops in measurement window, p50 frame interval (ms), GPU freq median (MHz), VmRSS delta (MiB).
2. Anchored numbers for all four consumers captured into a campaign artifact (phase7_iter8_perf_anchor.md or similar). Numbers must come from a clean ohm run on the iter7-end driver (sha 54999017… or rebuild from iter7 HEAD 7bd0818).
3. Honest qualitative interpretation in the close doc. If the numbers are uglier than expected (e.g., HW decode only saves 30% browser CPU rather than 80%), document that. The campaign's prior qualitative descriptors get re-anchored to the actual data.
4. Phase 5 sonnet review confirms: (a) script is fixture-agnostic (works for any H.264 file the operator passes), (b) measurements aren't fixture-hardcoded, (c) results are presented honestly without spin.
5. Campaign close doc (phase8_iteration8_close.md) explicitly states "campaign closes" and lists residual carries for any future operator who picks this up.
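The table shape from criterion 1 and the fixture-agnostic requirement from criterion 4 can be sketched as follows. All helper names (require_fixture, vmrss_kib, vmrss_delta_mib, table_header, table_row) are hypothetical; the column order follows the criterion text, and the VmRSS line in /proc/<PID>/status is taken to be in KiB (it is labelled kB there).

```shell
# require_fixture PATH: fail unless the operator passed a readable file,
# so no fixture name is hardcoded in the script.
require_fixture() {
    if [ -z "$1" ] || [ ! -r "$1" ]; then
        echo "usage: run_perf_binding_cell.sh <h264-file>" >&2
        return 1
    fi
}

# vmrss_kib PID: print the VmRSS value (KiB) from /proc/PID/status.
vmrss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# vmrss_delta_mib START_KIB END_KIB: print the delta in MiB, one decimal.
vmrss_delta_mib() {
    LC_ALL=C awk -v s="$1" -v e="$2" 'BEGIN { printf "%.1f\n", (e - s) / 1024 }'
}

# table_header: print the markdown header for the seven required columns.
table_header() {
    echo "| consumer | CPU% median | CPU% p90 | drops | p50 frame interval (ms) | GPU freq median (MHz) | VmRSS delta (MiB) |"
    echo "|---|---|---|---|---|---|---|"
}

# table_row CONSUMER MEDIAN P90 DROPS P50_MS FREQ_MHZ RSS_MIB: one row.
table_row() {
    printf '| %s | %s | %s | %s | %s | %s | %s |\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# Example wiring (not executed here):
#   require_fixture "$1" || exit 1
#   start="$(vmrss_kib "$pid")"; ...; end="$(vmrss_kib "$pid")"
#   table_header
#   table_row "$row" "$cpu_med" "$cpu_p90" "$drops" "$p50_ms" "$freq_mhz" \
#       "$(vmrss_delta_mib "$start" "$end")"
```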
Phase 1 LOCKED. Iteration 8 proceeds.
iter8 = candidate E alone. Phases 2..8 + campaign-close:
- Phase 2: situation analysis covering measurement methodology, parsing approach, and edge cases (expect the SW baseline to show dozens of drops at 1080p30; expect Firefox numbers to be limited to what can be scraped without Firefox-internal hooks)
- Phase 3: baseline anchor, a quick smoke run of pidstat + /sys/class/devfreq polling on ohm to confirm tooling availability
- Phase 4: implement tests/run_perf_binding_cell.sh
- Phase 5: sonnet review
- Phase 6: deploy script (commit + sync to ohm)
- Phase 7: run, capture, generate table
- Phase 8: close iteration AND close campaign
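The Phase 3 tooling check could look something like the sketch below. The helper names missing_tools and missing_paths are hypothetical, and the tool and path list in the usage comment is only what the plan mentions (pidstat, mpv, top, the devfreq node); whether all of them exist on ohm is exactly what the smoke run is meant to find out.

```shell
# missing_tools TOOL...: print each named command that is not on PATH.
missing_tools() (
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
)

# missing_paths PATH...: print each path that is not readable.
missing_paths() (
    for path in "$@"; do
        [ -r "$path" ] || echo "$path"
    done
)

# Example wiring (not executed here):
#   missing="$(missing_tools pidstat mpv top)
#   $(missing_paths /sys/class/devfreq/fde60000.gpu/cur_freq)"
#   [ -z "$(printf '%s' "$missing" | tr -d '[:space:]')" ] || {
#       echo "smoke check failed, missing:" >&2
#       printf '%s\n' "$missing" >&2
#       exit 1
#   }
```

Reporting every missing item in one pass, rather than stopping at the first, keeps the smoke run to a single round trip on the target.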
Stop point
After Phase 8 close, the campaign formally closes. Future operator-initiated work would re-open as a new top-level campaign (e.g., fourier-fresnel per project_followon_campaigns.md).