4536dd3283
Eight iterations (2026-05-04 → 2026-05-06) close. Operator's primary goal — Firefox + mpv hardware-decode H.264 on PineTab2 (RK3566 silicon via hantro/rk3568-vpu DT compatible) end-to-end with sandboxes enabled — was met at iter6 and is anchored with measured numbers this iteration. iter8 perf binding cell (30s per consumer, bbb_1080p30_h264.mp4): - Firefox-fourier RDD process: 8% CPU during HW decode - mpv vaapi-copy: 66% CPU vs SW baseline 97% (-31pp, ~32% relative) - mpv vaapi-dmabuf: silent SW fallback in --vo=null (documented limitation; needs a working VO that this hardware doesn't have) - mpv SW baseline: 97% CPU - All four configs: zero drops in 30s, decode keeps up with realtime Phase 5 sonnet review caught 3 issues pre-commit, all fixed: - pidstat $8 column heuristic broken — replaced with header-driven %CPU field detection - GPU freq median's nested-subshell /dev/stdin pipeline unreliable — replaced with temp-file path - --frames=$((DURATION*30)) hardcoded 30fps — replaced with --length=$DURATION (framerate-agnostic) Phase 1 success criterion: 5/5 gates met. Tracks dropped (recorded for honest accounting): - D (upstreaming) — philosophical, AI-slop-buster review climate - F (DMABUF on OUTPUT) — technical, no consumer exercises it - MPEG-2 — CPU handles it fine, no user need Residual carries documented for any future operator: - STREAMON-on-context-recreate corner case - Pool-size parameterization - Fault-inject build for slot-leak recovery - DMABUF zero-copy mpv perf measurement (needs different harness) - Firefox-with-HW-disabled SW baseline measurement Follow-on campaigns chartered separately: - fourier-fresnel (RK3399 / Pinebook Pro port) - panvk-bifrost (Vulkan-on-Mali for Bifrost) Twelve fork commits, three test harnesses, one Firefox sandbox patch, eight iterations of campaign documentation. All on git.reauktion.de under claude-noether <claude@reauktion.de> from iter5 onward. Campaign closes. Done. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
137 lines
11 KiB
Markdown
137 lines
11 KiB
Markdown
# Iteration 8 close (Phase 8) — Track E GREEN; campaign closes
|
||
|
||
Opened 2026-05-06 immediately after iter7 close + post-close research. Locked candidate **E** (performance binding cell) as the sole iter8 track. iter8 was operator-declared as the **campaign-closing iteration**: anchors the deliverables to measured numbers, then formally closes the campaign.
|
||
|
||
## Verdict
|
||
|
||
GREEN with one documented limitation (mpv vaapi-dmabuf path was unmeasured in this run; see Phase 7 anchor for the reason). Campaign formally closes after this iteration.
|
||
|
||
## What landed
|
||
|
||
### Fork commit (libva-v4l2-request-fourier)
|
||
|
||
- `65969da` — `tests/run_perf_binding_cell.sh` (297-line shell harness). Runs four consumer configurations against a fixture for `$DURATION` seconds, captures CPU% (median + p90 from pidstat by-header parsing), GPU freq median (devfreq sysfs polled at 100ms cadence), drops in window (mpv `--term-status-msg`), p50 frame interval (mpv only), VmRSS delta (`/proc/PID/status`). Emits a markdown table with raw numbers per consumer — no aggregation, no improvement ratios, no curated framing.
|
||
|
||
### Campaign artifacts (libva-multiplanar)
|
||
|
||
- `phase0_findings_iter8.md` — substrate + Phase 1 lock (E only)
|
||
- `phase7_iter8_perf_anchor.md` — measured-numbers anchor (this iteration's data)
|
||
- `phase8_iteration8_close.md` — this file (iteration close + campaign close)
|
||
|
||
## Phase 5 sonnet review
|
||
|
||
APPROVE-WITH-CHANGES. Three findings, all addressed before commit:
|
||
|
||
1. **pidstat `$8` column heuristic was broken** — the original parser scanned right-to-left for a numeric field, ignored the result, and unconditionally printed `$8` (which is `%usr`, not `%CPU`, on sysstat 12.7). Fixed: header-driven `%CPU` field detection. Robust across sysstat point releases.
|
||
|
||
2. **GPU freq median used unreliable `/dev/stdin` in nested subshell-over-pipe** — implementation-defined behavior, often returns empty. Fixed: temp-file path.
|
||
|
||
3. **`--frames=$((DURATION * 30))` hardcoded 30fps** — fixture-hardcoding violation per `feedback_no_fixture_hardcoding.md`. Fixed: `--length=$DURATION` (wall-time bounded, framerate-agnostic).
|
||
|
||
Plus minor: empty `cpu_pct.log` now emits `ERR` rather than silent `0`, distinguishing measurement failure from "process used no CPU."
|
||
|
||
## Phase 7 results (raw numbers, 30s per consumer)
|
||
|
||
| Consumer | CPU% p50 | CPU% p90 | Drops | p50 frame ms | GPU MHz median | VmRSS Δ MiB |
|
||
|---|---|---|---|---|---|---|
|
||
| mpv-vaapi-dmabuf | 90.00 | 146.00 | 0 | — | 200 | 0.0 |
|
||
| mpv-vaapi-copy | 66.00 | 68.00 | 0 | — | 200 | 0.0 |
|
||
| firefox-fourier-hw | 8.00 | 9.00 | — | — | 400 | 9.7 |
|
||
| mpv-sw-baseline | 97.00 | 145.00 | 0 | — | 200 | 0.0 |
|
||
|
||
Full interpretation in `phase7_iter8_perf_anchor.md`. Headlines:
|
||
|
||
- **Firefox HW decode**: RDD process at 8% CPU during sustained 30s 1080p30 decode. The work is in the hantro VPU; RDD orchestrates.
|
||
- **mpv vaapi-copy**: 66% CPU vs SW baseline 97% — **31 percentage points, ~32% relative CPU reduction.** The remaining 66% is mpv's userspace readback (`vaGetImage`) + demux/parse/scheduling overhead, not decode.
|
||
- **mpv vaapi-dmabuf**: silent SW fallback in `--vo=null` configuration. The DMABUF zero-copy path requires `--vo=gpu` (libplacebo, broken on this hardware due to Mali-G52 Vulkan unsupported state) or `--vo=drm` (KMS access not available from sudo'd shell). Not a campaign deliverable failure — a measurement-harness limitation. Documented in the anchor doc.
|
||
- **GPU MHz median column tracks Mali (compositor freq), not hantro VPU.** Misleading for decode-cost reasoning. Future measurement efforts wanting VPU utilization need a separate path.
|
||
|
||
## Phase 1 success criterion — final per gate
|
||
|
||
| Criterion | Result |
|
||
|---|---|
|
||
| Reproducible measurement script committed to `tests/run_perf_binding_cell.sh` | ✓ HIT — `65969da` in fork |
|
||
| Anchored numbers captured into a campaign artifact | ✓ HIT — `phase7_iter8_perf_anchor.md` |
|
||
| Honest qualitative interpretation in close doc | ✓ HIT — limitations of the dmabuf measurement path AND of the GPU MHz column documented above |
|
||
| Phase 5 sonnet review confirms script is fixture-agnostic, no fixture-hardcoding, results presented honestly | ✓ HIT — APPROVE-WITH-CHANGES, all 3 findings addressed |
|
||
| Campaign close doc explicitly states "campaign closes" | ✓ HIT — see "Campaign close" section below |
|
||
|
||
**Joint success: 5/5 gates met.** iter8 closes GREEN.
|
||
|
||
---
|
||
|
||
# Campaign close (libva-multiplanar)
|
||
|
||
After eight iterations spanning 2026-05-04 through 2026-05-06, the libva-multiplanar campaign formally **closes**.
|
||
|
||
## Operator's primary goal — MET
|
||
|
||
**Goal**: make Firefox + mpv hardware-decode H.264 video on PineTab2 (RK3566 silicon, hantro driver via the `rockchip,rk3568-vpu` DT compatible) end-to-end, with sandboxes enabled, on the v4l2_request libva backend.
|
||
|
||
**Met at iter6** (Firefox sustained 50s+ on bbb fixture without errors, RDD process holds `/dev/video1` + `/dev/media0` throughout, lsof verified). iter5-amendment closed Firefox sandbox; iter6 closed the per-OUTPUT-slot REINIT race that broke Firefox's MediaSource pipeline; iter7 hardened the carry items (msync verify, slot-leak recovery, cap_pool race harness).
|
||
|
||
iter8 anchors the empirical claim with measured numbers (this iteration).
|
||
|
||
## Iteration outcomes
|
||
|
||
| Iter | Locked tracks | Outcome | Date closed |
|
||
|---|---|---|---|
|
||
| 1 | initial multi-planar bring-up | iter1 known bugs identified + 3-fix list | 2026-05-04 |
|
||
| 2 | A+B+E (the three iter1 known bugs) | mpv vaapi DMA-BUF "smooth" per operator inspection | 2026-05-04 |
|
||
| 3 | F+A (Firefox sandbox + frame-11 EINVAL diagnosis) | F GREEN with patch; A diagnosed (fix deferred) | 2026-05-05 |
|
||
| 4 | A solo (frame-11 EINVAL fix) | GREEN — fresh request_fd per frame, DPB FFmpeg-semantics matching | 2026-05-05 |
|
||
| 5 | A+G+B+E (sweep + PGO Firefox + libplacebo + multi-context) | GREEN, all four | 2026-05-05 |
|
||
| 5-amend | iter5-G Firefox seccomp gap surfaced in real use | iter3 patch extended to UtilitySandboxPolicy; sandbox closes | 2026-05-05 |
|
||
| 6 | I→A∪I (Firefox QBUF EINVAL → cap_pool race merger) | GREEN — per-OUTPUT-slot REINIT discipline replaces close+alloc | 2026-05-05 |
|
||
| 7 | A+B+C (msync verify + slot-leak recovery + cap_pool harness) | GREEN — all three; bonus OUTPUT-pool teardown fix | 2026-05-06 |
|
||
| 8 | E (perf binding cell) | GREEN — numbers anchored | 2026-05-06 |
|
||
|
||
Eight iterations. Twelve fork commits since the campaign opened (against the bootlin baseline). Three test harnesses in `tests/`. One firefox-fourier patch (combined broker + RDD seccomp + Utility seccomp, three gates closed).
|
||
|
||
## Tracks dropped + reasons
|
||
|
||
- **Track D** (Bootlin / Mozilla upstreaming) — dropped 2026-05-06 on philosophical grounds. The AI-slop-buster review climate in 2026 maintainership makes the social cost of submission exceed the benefit when personal requirements are met. See `memory/project_no_upstreaming_philosophical.md` for the operator-verbatim rationale.
|
||
|
||
- **Track F** (V4L2_MEMORY_DMABUF on OUTPUT) — dropped 2026-05-06 on technical merit. Sonnet architect research found: every production V4L2 stateless H.264 consumer (FFmpeg, GStreamer, Chromium) uses MMAP on OUTPUT. Kernel-side DMABUF capability advertised by hantro/rkvdec but unexercised for H.264. Original cap_pool race justification closed organically by iter5/iter6/iter7 fixes. See `track_F_research_2026-05-06.md`.
|
||
|
||
- **Multi-codec (MPEG-2)** — dropped 2026-05-06 at iter6 close. CPU handles MPEG-2 trivially on the A55 cluster; the campaign's user audience doesn't need MPEG-2 HW path.
|
||
|
||
## Residual carries (low-priority, listed for any future operator picking this up)
|
||
|
||
1. **STREAMON-on-context-recreate after resolution change** — corner case surfaced by the iter7 cap_pool harness when sonnet's pre-commit suggestion to add `vaCreateContext` was tested. Real consumers (Firefox, mpv) don't trigger this — they create one context per decoder lifetime. iter7 reverted the test to the no-context iter5 sonnet C4 specification; the new bug stays latent.
|
||
|
||
2. **Pool-size parameterization** — iter6 sonnet review suggested `max(surfaces_count, DPB_SIZE_H264_MAX)` instead of the hardcoded 16. Empirically 16 is fine; not a current bottleneck.
|
||
|
||
3. **Fault-injection build for slot-leak Track B recovery** — Phase 1 success criterion B partial: sonnet code-reviewed the semantics, happy-path regression confirmed clean. A debug build with `-DITER7_FAULT_INJECT_REINIT` would close the gap empirically. Deferred unless concretely needed.
|
||
|
||
4. **DMABUF zero-copy mpv perf measurement** — iter8 harness couldn't measure this path (`--vo=null` falls back; `--vo=gpu` blocked by Mali-G52 Vulkan unavailability; `--vo=drm` blocked by sudo-shell KMS access). A dedicated harness running as a desktop-session user with a working VO would close this.
|
||
|
||
5. **Firefox-with-HW-disabled SW baseline** — iter8 only measured Firefox's HW path. A complementary Firefox-SW row would frame the saving precisely (estimated 60-80pp+ saving extrapolated from mpv-SW vs mpv-HW).
|
||
|
||
## Memory inheritance for future campaigns
|
||
|
||
The campaign-specific memory at `~/.claude/projects/-home-mfritsche-src-libva-multiplanar/memory/` contains 12 entries. Most relevant for follow-on work:
|
||
|
||
- `feedback_request_fd_lifecycle.md` — REINIT-vs-close+alloc lessons (iter4 → iter6 evolution)
|
||
- `feedback_kernel_obfuscation_compound.md` — V4L2 cluster validation pattern (per-control TRY isolation as the diagnostic escape)
|
||
- `feedback_seccomp_returns_enosys.md` — Firefox sandbox debugging signature
|
||
- `reference_ffmpeg_v4l2_request_is_authority.md` — FFmpeg's H.264 control semantics as the empirical authority
|
||
- `feedback_no_fixture_hardcoding.md` — test-harness honesty principle
|
||
- `project_no_upstreaming_philosophical.md` — campaign-defining stance recorded 2026-05-06
|
||
|
||
## Follow-on campaigns (chartered iter5, separate top-level)
|
||
|
||
- **`fourier-fresnel`** — port the fork from PineTab2 (RK3566 via hantro/rk3568-vpu) to fresnel RK3399 (Pinebook Pro). Validates iter1-iter8 fixes on a second hardware target. Charter at `~/src/libva-multiplanar/firefox-fourier/...` or as a fresh repo when opened.
|
||
- **`panvk-bifrost`** — Vulkan-on-Mali for Bifrost-gen GPUs (Mali-G52 on RK3566/RK3568, etc.). Document-only at `~/src/panvk-bifrost/`. Sequenced after fourier-fresnel.
|
||
|
||
Per `project_followon_campaigns.md`, neither opens without explicit operator instruction.
|
||
|
||
## Final state
|
||
|
||
- **Driver**: `libva-v4l2-request-fourier` HEAD `65969da`, `7aa59a1b...` sha256 of installed `.so`. iter1..iter8 substantive work landed across 12 commits.
|
||
- **Firefox**: `firefox-150.0.1-1.1` with the iter5-amendment combined sandbox patch (broker + RDD seccomp + Utility seccomp). Build infrastructure: `firefox-fourier` LXD on boltzmann, persistent.
|
||
- **Test harnesses**: 3 in `tests/` — `cap_pool_probe_pattern.c` (race regression), `run_msync_pixel_verify.sh` (pixel-correctness), `run_perf_binding_cell.sh` (perf anchor).
|
||
- **Campaign documentation**: `phase0_findings_iter[1..8].md`, `phase8_iteration[1..8]_close.md`, plus per-iteration situation/plan/review docs as needed. All committed to `git.reauktion.de:marfrit/libva-multiplanar`. All committed under `claude-noether <claude@reauktion.de>` from iter5 onward.
|
||
|
||
The campaign closes with a working deliverable, anchored numbers, and an honest accounting. **Done.**
|