Files
libva-multiplanar/phase8_iteration8_close.md
T
claude-noether 4536dd3283 iter8 + campaign close: Track E GREEN, libva-multiplanar campaign closes
Eight iterations (2026-05-04 → 2026-05-06) close. Operator's primary
goal — Firefox + mpv hardware-decode H.264 on PineTab2 (RK3566 silicon
via hantro/rk3568-vpu DT compatible) end-to-end with sandboxes
enabled — was met at iter6 and is anchored with measured numbers
this iteration.

iter8 perf binding cell (30s per consumer, bbb_1080p30_h264.mp4):
- Firefox-fourier RDD process: 8% CPU during HW decode
- mpv vaapi-copy: 66% CPU vs SW baseline 97% (-31pp, ~32% relative)
- mpv vaapi-dmabuf: silent SW fallback in --vo=null (documented
  limitation; needs a working VO that this hardware doesn't have)
- mpv SW baseline: 97% CPU
- All four configs: zero drops in 30s, decode keeps up with realtime

Phase 5 sonnet review caught 3 issues pre-commit, all fixed:
- pidstat $8 column heuristic broken — replaced with header-driven
  %CPU field detection
- GPU freq median's nested-subshell /dev/stdin pipeline unreliable
  — replaced with temp-file path
- --frames=$((DURATION*30)) hardcoded 30fps — replaced with
  --length=$DURATION (framerate-agnostic)

Phase 1 success criterion: 5/5 gates met.

Tracks dropped (recorded for honest accounting):
- D (upstreaming) — philosophical, AI-slop-buster review climate
- F (DMABUF on OUTPUT) — technical, no consumer exercises it
- MPEG-2 — CPU handles it fine, no user need

Residual carries documented for any future operator:
- STREAMON-on-context-recreate corner case
- Pool-size parameterization
- Fault-inject build for slot-leak recovery
- DMABUF zero-copy mpv perf measurement (needs different harness)
- Firefox-with-HW-disabled SW baseline measurement

Follow-on campaigns chartered separately:
- fourier-fresnel (RK3399 / Pinebook Pro port)
- panvk-bifrost (Vulkan-on-Mali for Bifrost)

Twelve fork commits, three test harnesses, one Firefox sandbox
patch, eight iterations of campaign documentation. All on
git.reauktion.de under claude-noether <claude@reauktion.de> from
iter5 onward.

Campaign closes. Done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:45:04 +00:00

11 KiB
Raw Blame History

Iteration 8 close (Phase 8) — Track E GREEN; campaign closes

Opened 2026-05-06 immediately after iter7 close + post-close research. Locked candidate E (performance binding cell) as the sole iter8 track. iter8 was operator-declared as the campaign-closing iteration: anchors the deliverables to measured numbers, then formally closes the campaign.

Verdict

GREEN with one documented limitation (mpv vaapi-dmabuf path was unmeasured in this run; see Phase 7 anchor for the reason). Campaign formally closes after this iteration.

What landed

Fork commit (libva-v4l2-request-fourier)

  • 65969datests/run_perf_binding_cell.sh (297-line shell harness). Runs four consumer configurations against a fixture for $DURATION seconds, captures CPU% (median + p90 from pidstat by-header parsing), GPU freq median (devfreq sysfs polled at 100ms cadence), drops in window (mpv --term-status-msg), p50 frame interval (mpv only), VmRSS delta (/proc/PID/status). Emits a markdown table with raw numbers per consumer — no aggregation, no improvement ratios, no curated framing.

Campaign artifacts (libva-multiplanar)

  • phase0_findings_iter8.md — substrate + Phase 1 lock (E only)
  • phase7_iter8_perf_anchor.md — measured-numbers anchor (this iteration's data)
  • phase8_iteration8_close.md — this file (iteration close + campaign close)

Phase 5 sonnet review

APPROVE-WITH-CHANGES. Three findings, all addressed before commit:

  1. pidstat $8 column heuristic was broken — the original parser scanned right-to-left for a numeric field, ignored the result, and unconditionally printed $8 (which is %usr, not %CPU, on sysstat 12.7). Fixed: header-driven %CPU field detection. Robust across sysstat point releases.

  2. GPU freq median used unreliable /dev/stdin in nested subshell-over-pipe — implementation-defined behavior, often returns empty. Fixed: temp-file path.

  3. --frames=$((DURATION * 30)) hardcoded 30fps — fixture-hardcoding violation per feedback_no_fixture_hardcoding.md. Fixed: --length=$DURATION (wall-time bounded, framerate-agnostic).

Plus minor: empty cpu_pct.log now emits ERR rather than silent 0, distinguishing measurement failure from "process used no CPU."

Phase 7 results (raw numbers, 30s per consumer)

Consumer CPU% p50 CPU% p90 Drops p50 frame ms GPU MHz median VmRSS Δ MiB
mpv-vaapi-dmabuf 90.00 146.00 0 200 0.0
mpv-vaapi-copy 66.00 68.00 0 200 0.0
firefox-fourier-hw 8.00 9.00 400 9.7
mpv-sw-baseline 97.00 145.00 0 200 0.0

Full interpretation in phase7_iter8_perf_anchor.md. Headlines:

  • Firefox HW decode: RDD process at 8% CPU during sustained 30s 1080p30 decode. The work is in the hantro VPU; RDD orchestrates.
  • mpv vaapi-copy: 66% CPU vs SW baseline 97% — 31 percentage points, ~32% relative CPU reduction. The remaining 66% is mpv's userspace readback (vaGetImage) + demux/parse/scheduling overhead, not decode.
  • mpv vaapi-dmabuf: silent SW fallback in --vo=null configuration. The DMABUF zero-copy path requires --vo=gpu (libplacebo, broken on this hardware due to Mali-G52 Vulkan unsupported state) or --vo=drm (KMS access not available from sudo'd shell). Not a campaign deliverable failure — a measurement-harness limitation. Documented in the anchor doc.
  • GPU MHz median column tracks Mali (compositor freq), not hantro VPU. Misleading for decode-cost reasoning. Future measurement efforts wanting VPU utilization need a separate path.

Phase 1 success criterion — final per gate

Criterion Result
Reproducible measurement script committed to tests/run_perf_binding_cell.sh ✓ HIT — 65969da in fork
Anchored numbers captured into a campaign artifact ✓ HIT — phase7_iter8_perf_anchor.md
Honest qualitative interpretation in close doc ✓ HIT — limitations of the dmabuf measurement path AND of the GPU MHz column documented above
Phase 5 sonnet review confirms script is fixture-agnostic, no fixture-hardcoding, results presented honestly ✓ HIT — APPROVE-WITH-CHANGES, all 3 findings addressed
Campaign close doc explicitly states "campaign closes" ✓ HIT — see "Campaign close" section below

Joint success: 5/5 gates met. iter8 closes GREEN.


Campaign close (libva-multiplanar)

After eight iterations spanning 2026-05-04 through 2026-05-06, the libva-multiplanar campaign formally closes.

Operator's primary goal — MET

Goal: make Firefox + mpv hardware-decode H.264 video on PineTab2 (RK3566 silicon, hantro driver via the rockchip,rk3568-vpu DT compatible) end-to-end, with sandboxes enabled, on the v4l2_request libva backend.

Met at iter6 (Firefox sustained 50s+ on bbb fixture without errors, RDD process holds /dev/video1 + /dev/media0 throughout, lsof verified). iter5-amendment closed Firefox sandbox; iter6 closed the per-OUTPUT-slot REINIT race that broke Firefox's MediaSource pipeline; iter7 hardened the carry items (msync verify, slot-leak recovery, cap_pool race harness).

iter8 anchors the empirical claim with measured numbers (this iteration).

Iteration outcomes

Iter Locked tracks Outcome Date closed
1 initial multi-planar bring-up iter1 known bugs identified + 3-fix list 2026-05-04
2 A+B+E (the three iter1 known bugs) mpv vaapi DMA-BUF "smooth" per operator inspection 2026-05-04
3 F+A (Firefox sandbox + frame-11 EINVAL diagnosis) F GREEN with patch; A diagnosed (fix deferred) 2026-05-05
4 A solo (frame-11 EINVAL fix) GREEN — fresh request_fd per frame, DPB FFmpeg-semantics matching 2026-05-05
5 A+G+B+E (sweep + PGO Firefox + libplacebo + multi-context) GREEN, all four 2026-05-05
5-amend iter5-G Firefox seccomp gap surfaced in real use iter3 patch extended to UtilitySandboxPolicy; sandbox closes 2026-05-05
6 I→AI (Firefox QBUF EINVAL → cap_pool race merger) GREEN — per-OUTPUT-slot REINIT discipline replaces close+alloc 2026-05-05
7 A+B+C (msync verify + slot-leak recovery + cap_pool harness) GREEN — all three; bonus OUTPUT-pool teardown fix 2026-05-06
8 E (perf binding cell) GREEN — numbers anchored 2026-05-06

Eight iterations. Twelve fork commits since the campaign opened (against the bootlin baseline). Three test harnesses in tests/. One firefox-fourier patch (combined broker + RDD seccomp + Utility seccomp, three gates closed).

Tracks dropped + reasons

  • Track D (Bootlin / Mozilla upstreaming) — dropped 2026-05-06 on philosophical grounds. The AI-slop-buster review climate in 2026 maintainership makes the social cost of submission exceed the benefit when personal requirements are met. See memory/project_no_upstreaming_philosophical.md for the operator-verbatim rationale.

  • Track F (V4L2_MEMORY_DMABUF on OUTPUT) — dropped 2026-05-06 on technical merit. Sonnet architect research found: every production V4L2 stateless H.264 consumer (FFmpeg, GStreamer, Chromium) uses MMAP on OUTPUT. Kernel-side DMABUF capability advertised by hantro/rkvdec but unexercised for H.264. Original cap_pool race justification closed organically by iter5/iter6/iter7 fixes. See track_F_research_2026-05-06.md.

  • Multi-codec (MPEG-2) — dropped 2026-05-06 at iter6 close. CPU handles MPEG-2 trivially on the A55 cluster; the campaign's user audience doesn't need MPEG-2 HW path.

Residual carries (low-priority, listed for any future operator picking this up)

  1. STREAMON-on-context-recreate after resolution change — corner case surfaced by the iter7 cap_pool harness when sonnet's pre-commit suggestion to add vaCreateContext was tested. Real consumers (Firefox, mpv) don't trigger this — they create one context per decoder lifetime. iter7 reverted the test to the no-context iter5 sonnet C4 specification; the new bug stays latent.

  2. Pool-size parameterization — iter6 sonnet review suggested max(surfaces_count, DPB_SIZE_H264_MAX) instead of the hardcoded 16. Empirically 16 is fine; not a current bottleneck.

  3. Fault-injection build for slot-leak Track B recovery — Phase 1 success criterion B partial: sonnet code-reviewed the semantics, happy-path regression confirmed clean. A debug build with -DITER7_FAULT_INJECT_REINIT would close the gap empirically. Deferred unless concretely needed.

  4. DMABUF zero-copy mpv perf measurement — iter8 harness couldn't measure this path (--vo=null falls back; --vo=gpu blocked by Mali-G52 Vulkan unavailability; --vo=drm blocked by sudo-shell KMS access). A dedicated harness running as a desktop-session user with a working VO would close this.

  5. Firefox-with-HW-disabled SW baseline — iter8 only measured Firefox's HW path. A complementary Firefox-SW row would frame the saving precisely (estimated 60-80pp+ saving extrapolated from mpv-SW vs mpv-HW).

Memory inheritance for future campaigns

The campaign-specific memory at ~/.claude/projects/-home-mfritsche-src-libva-multiplanar/memory/ contains 12 entries. Most relevant for follow-on work:

  • feedback_request_fd_lifecycle.md — REINIT-vs-close+alloc lessons (iter4 → iter6 evolution)
  • feedback_kernel_obfuscation_compound.md — V4L2 cluster validation pattern (per-control TRY isolation as the diagnostic escape)
  • feedback_seccomp_returns_enosys.md — Firefox sandbox debugging signature
  • reference_ffmpeg_v4l2_request_is_authority.md — FFmpeg's H.264 control semantics as the empirical authority
  • feedback_no_fixture_hardcoding.md — test-harness honesty principle
  • project_no_upstreaming_philosophical.md — campaign-defining stance recorded 2026-05-06

Follow-on campaigns (chartered iter5, separate top-level)

  • fourier-fresnel — port the fork from PineTab2 (RK3566 via hantro/rk3568-vpu) to fresnel RK3399 (Pinebook Pro). Validates iter1-iter8 fixes on a second hardware target. Charter at ~/src/libva-multiplanar/firefox-fourier/... or as a fresh repo when opened.
  • panvk-bifrost — Vulkan-on-Mali for Bifrost-gen GPUs (Mali-G52 on RK3566/RK3568, etc.). Document-only at ~/src/panvk-bifrost/. Sequenced after fourier-fresnel.

Per project_followon_campaigns.md, neither opens without explicit operator instruction.

Final state

  • Driver: libva-v4l2-request-fourier HEAD 65969da, 7aa59a1b... sha256 of installed .so. iter1..iter8 substantive work landed across 12 commits.
  • Firefox: firefox-150.0.1-1.1 with the iter5-amendment combined sandbox patch (broker + RDD seccomp + Utility seccomp). Build infrastructure: firefox-fourier LXD on boltzmann, persistent.
  • Test harnesses: 3 in tests/cap_pool_probe_pattern.c (race regression), run_msync_pixel_verify.sh (pixel-correctness), run_perf_binding_cell.sh (perf anchor).
  • Campaign documentation: phase0_findings_iter[1..8].md, phase8_iteration[1..8]_close.md, plus per-iteration situation/plan/review docs as needed. All committed to git.reauktion.de:marfrit/libva-multiplanar. All committed under claude-noether <claude@reauktion.de> from iter5 onward.

The campaign closes with a working deliverable, anchored numbers, and an honest accounting. Done.