Campaign reopen: iter8's "campaign-closing" status was contingent on "mpv --hwdec=vaapi smooth", which doesn't hold against fresh-install interactive testing.

iter9 single-track scope:
- Bug #1 (libva-v4l2-request-fourier#1) only
- mpv H.264 fresh-login through ≥30s of decode without any of: cap_pool double-init, REQBUFS EBUSY, REINIT bad-fd, OUTPUT ENOMEM
- Phase 0 will source-read cap_pool + request_pool + iter6 REINIT, build a vo=null reproduction harness, prepare bisect against iter5 baseline, and a libva-direct C probe for minimal repro

Bug #2 (presentation green) is dmabuf-modifier-triage's job: peer campaign opened 2026-05-08 at ~/src/dmabuf-modifier-triage/. README cross-link now points at it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 0 — iteration 9 substrate (libva-multiplanar campaign — REOPEN)
Opened 2026-05-08 after the iter8 production-tip artifact (libva-v4l2-request-fourier-1.0.0.r280.65969da-1, shipped to [marfrit] 2026-05-08) was found to fail under fresh-install interactive mpv H.264 playback on ohm. iter8's "campaign-closing" status (per phase0_findings_iter8.md line 3) was contingent on the iter5/8 close claim of "mpv --hwdec=vaapi smooth" — that claim does not hold against a fresh-install + fresh-Plasma-session test path.
This is a campaign reopen, not a continuation. iter8 still represents the validated state under the test paths it was measured against (the tests/run_perf_binding_cell.sh harness). iter9 exists to address what the harness didn't catch.
Predecessor close-out summary (iter8 → iter9)
iter8 landed three fork commits on top of iter7:
- dcaa1f1 (2026-05-06): docs/silicon-ID fix (PineTab2 = RK3566 silicon).
- 65969da (2026-05-06): tests/run_perf_binding_cell.sh harness for measured per-consumer drop/CPU/freq/memory numbers.
- (iter8 close commit not in fork log; close artifact phase8_iteration8_close.md records GREEN for E on 2026-05-06.)
iter8 then sat for two days. On 2026-05-08:
- The fork was packaged as libva-v4l2-request-fourier (PKGBUILD at ~/src/marfrit-packages/arch/libva-v4l2-request-fourier/), pinned to _commit=65969da. CI built and published to [marfrit] via Gitea Actions run #65 (success).
- ohm pulled the package via pacman -Syu. [marfrit] repo enabled in /etc/pacman.conf. /etc/profile.d/libva-v4l2-request.sh exports LIBVA_DRIVER_NAME=v4l2_request + LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 + LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0.
- vainfo confirmed the new driver loads cleanly and enumerates the H.264 + MPEG-2 profile list (same shape as predecessor).
- Interactive mpv --hwdec=vaapi --vo=dmabuf-wayland fourier-test/bbb_1080p30_h264.mp4 immediately hit the cap_pool / REQBUFS / REINIT cascade described in marfrit/libva-v4l2-request-fourier#1.
- Separately: even the --hwdec=v4l2request (libva-bypassed) path produced solid green frames via --vo=dmabuf-wayland, isolated to a different bug. That second bug moved to its own peer campaign at ~/src/dmabuf-modifier-triage/ and does not gate iter9.
Locked research question — iter9
Triage and fix the probe-then-decode lifecycle cascade exposed in interactive mpv H.264 playback at the iter8 production tip — fresh login through to ≥30s of decode without any of the following events:
- cap_pool_init firing twice for overlapping slot ranges in a single mpv invocation,
- VIDIOC_REQBUFS returning EBUSY,
- Unable to reinit media request: Bad file descriptor,
- Unable to create buffer for type 9: No buffer space available.

The campaign re-closes only when this test path passes from a freshly-logged-in Plasma session.
This is the test path iter5/8 closes implicitly claimed worked. iter9 makes the claim explicit and verifiable.
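The four events above can be turned into a mechanical pass/fail check. The following is a minimal sketch in Python, not the harness itself. The two "Unable to ..." strings are quoted from the list above; the log shapes matched for cap_pool_init and REQBUFS (CAP_POOL_INIT_RE, REQBUFS_EBUSY_RE) and the signature names are assumptions for illustration and would need to be adjusted to the driver's real output.

```python
import re

# Literal messages quoted in the research question.
REINIT_BAD_FD = "Unable to reinit media request: Bad file descriptor"
CREATE_BUF_FAIL = "Unable to create buffer for type 9: No buffer space available"

# ASSUMPTION: hypothetical log shapes, not confirmed against the fork's logging.
CAP_POOL_INIT_RE = re.compile(r"cap_pool_init.*?slots?\s+(\d+)\.\.(\d+)")
REQBUFS_EBUSY_RE = re.compile(r"REQBUFS.*(EBUSY|Device or resource busy)")


def find_cascade_signatures(log_text):
    """Return the set of cascade signature names seen in one run's log."""
    hits = set()
    seen_ranges = []  # (lo, hi) slot ranges already initialised in this run
    for line in log_text.splitlines():
        m = CAP_POOL_INIT_RE.search(line)
        if m:
            lo, hi = int(m.group(1)), int(m.group(2))
            # Two inclusive ranges overlap when each starts before the other ends.
            if any(lo <= p_hi and p_lo <= hi for p_lo, p_hi in seen_ranges):
                hits.add("cap_pool_double_init")
            seen_ranges.append((lo, hi))
        if REQBUFS_EBUSY_RE.search(line):
            hits.add("reqbufs_ebusy")
        if REINIT_BAD_FD in line:
            hits.add("reinit_bad_fd")
        if CREATE_BUF_FAIL in line:
            hits.add("create_buffer_failed")
    return hits
```

An empty result corresponds to a passing run; any non-empty result names the signatures that fired.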
Hypothesis space (Phase 0 must read source to confirm)
Three layers can produce the observed cascade:
- cap_pool lifecycle in src/cap_pool.c + callers. Two cap_pool_init events for slot range 0..23 in close succession before the first decoded frame strongly suggests probe-context + decode-context double-init without teardown between. mpv's VA-API call sequence is roughly: vaInitialize → vaQueryConfigProfiles → vaCreateConfig → vaCreateContext → vaCreateSurfaces → decode loop. If cap_pool_init is wired into vaCreateContext rather than vaCreateConfig, both the probe context and the actual context would init separately and require teardown to be symmetric.

- request_pool lifecycle + iter6's REINIT. "Unable to reinit media request: Bad file descriptor" is a direct iter6 output (commit a09c03c, "iter6 fix: per-OUTPUT-slot request_fd binding via REINIT"). The fd is being closed before REINIT runs. Possible causes: request_pool teardown closes the fd unconditionally, or the iter7 slot-leak fix (commit 988b848 adds request_pool_force_release) mistakenly closes a still-bound request fd.

- VIDIOC_REQBUFS without prior STREAMOFF. EBUSY on REQBUFS most likely means the queue is still in STREAMING state. The fork's STREAMOFF call sites need to be audited: every REQBUFS(count=N) after a previous successful REQBUFS(count=M, M>0) must be preceded by STREAMOFF if the queue was started in between.
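The ordering rule in the third hypothesis can be checked mechanically once the ioctl sequence is known. A minimal sketch, assuming the source-read or a trace yields an ordered list of (queue, op, count) tuples; that tuple format and the function name audit_queue_trace are hypothetical, not something the fork provides.

```python
def audit_queue_trace(events):
    """Flag REQBUFS calls issued while the same queue is still streaming.

    events: ordered iterable of (queue, op, count) tuples, where op is
    "REQBUFS", "STREAMON" or "STREAMOFF" (count matters only for REQBUFS).
    Returns a list of (index, queue) pairs, one per violating REQBUFS.
    """
    streaming = {}   # queue name -> True while between STREAMON and STREAMOFF
    violations = []
    for i, (queue, op, _count) in enumerate(events):
        if op == "STREAMON":
            streaming[queue] = True
        elif op == "STREAMOFF":
            streaming[queue] = False
        elif op == "REQBUFS" and streaming.get(queue, False):
            # This is the call pattern expected to come back with EBUSY.
            violations.append((i, queue))
    return violations
```

Tracking state per queue keeps OUTPUT and CAPTURE independent, so a STREAMOFF on one queue does not mask a missing STREAMOFF on the other.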
Phase 0 will deliver
1. Source-read of cap_pool + request_pool + surface.c + context.c at commit 65969da. Output: phase0_iter9_source_read.md capturing the actual call graph: which vaXxx entry point triggers cap_pool_init, when STREAMON happens, when STREAMOFF happens, when REINIT happens, who owns each fd. Read against the iter5 sweep commits (951233a, 848fc0c, 843febc, d3a299b, c8b6ede, b993355) and the iter6/7 fix commits (a09c03c, 988b848, 7bd0818) so the diff that introduced the leak is identifiable.

2. Reproduction harness: tests/run_iter9_lifecycle_repro.sh. Wraps a single mpv --hwdec=vaapi --vo=null --frames=300 ... call (vo=null isolates the bug from any presentation issues) with LIBVA_TRACE capture, parses output for the four cascade signatures, exits non-zero if any signature fires. Anchored to bbb_1080p30_h264.mp4. Critical: must launch from a fresh subshell with no leftover env / VA state so the probe-then-decode lifecycle is exercised.

3. Bisection plan against the iter5..iter8 commit range. If the source-read in (1) doesn't unambiguously identify the regression-introducing commit, prepare a git bisect script using the harness in (2) so phase 4 can mechanically narrow.

4. iter5 close re-validation. Re-run the harness from (2) against the iter5-state commit (c8b6ede = "iter5 sweep follow-up"). Two outcomes, both useful:
   - If iter5 also fails → the bug pre-dates iter6/7's additions and the iter5 close was over-claimed.
   - If iter5 passes → iter6 (request_fd REINIT) or iter7 (slot-leak) introduced the regression. Bisect (3) narrows further.

5. Sanity check against a minimal libva-direct probe. Build a small C harness that calls vaInitialize + vaQueryConfigProfiles + vaCreateConfig + vaCreateContext + vaCreateSurfaces + a single decode + teardown, all via libva direct (no mpv, no ffmpeg). If the harness reproduces the cascade with no mpv complexity, the test surface area for phase 4's fix shrinks dramatically.
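For the bisection in (3), git bisect run decides good/bad/skip purely from the exit status of the script it is given: 0 marks the commit good, 125 skips it, and any other status from 1 to 127 marks it bad. A minimal sketch of that mapping follows; the function name bisect_verdict and its two inputs (whether the commit built, and the set of cascade signatures that fired) are hypothetical and stand in for whatever the real harness reports.

```python
# Exit statuses interpreted by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_verdict(build_ok, signatures):
    """Map one harness run to a `git bisect run` exit status.

    build_ok:   False when the commit could not be built or installed.
    signatures: collection of cascade signature names that fired (may be empty).
    """
    if not build_ok:
        return SKIP   # untestable commit; bisect moves on to a neighbour
    if signatures:
        return BAD    # at least one cascade signature fired
    return GOOD
```

A wrapper script would end with sys.exit(bisect_verdict(...)). Returning 125 for unbuildable commits keeps a broken intermediate commit from being blamed for the regression.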
After Phase 0 closes, Phase 1 will replicate the baseline from items 2 + 4 (per feedback_replicate_baseline_first.md). Phase 2 will source-deep-dive on the layer item 1 fingered. Phase 3 will write the deterministic regression test. Phase 4 will fix. Phase 5 review will be sonnet.
In-scope (LOCKED 2026-05-08 for iteration 9)
Single-track. Decoder-side cascade only.
- Bug #1 (libva-v4l2-request-fourier#1) only.
- Test fixture: ~/fourier-test/bbb_1080p30_h264.mp4 (already on ohm).
- Target host: ohm. fresnel sits out: fresnel-fourier is a peer campaign with separate iteration cadence; whatever iter9 fixes on ohm will be auto-inherited when fresnel-fourier rebuilds against the same fork master.
Out-of-scope (LOCKED 2026-05-08 for iteration 9)
- Bug #2 (libva-multiplanar#1): owned by ~/src/dmabuf-modifier-triage/. Not gating iter9.
- Performance / measurement. iter8's perf binding cell is already in the fork (tests/run_perf_binding_cell.sh) and re-runs as part of any future iteration close. iter9 only needs to demonstrate that the cascade no longer fires; numbers are not a deliverable.
- Other consumers (Firefox, chromium-fourier, vainfo). mpv is the consumer that surfaced the bug; mpv is the consumer iter9 closes against. Sweep to other consumers is iter10's call.
- Other codecs (HEVC, VP9). H.264 only.
- Other hardware. ohm only.
- Upstreaming. Per feedback_no_upstream.md.
Reference history
- phase0_findings.md: original campaign Phase 0 substrate.
- phase0_findings_iter[2-8].md: per-iter substrate.
- phase8_iteration[1-8]_close.md: per-iter close artifacts. Re-read iter5 + iter7 + iter8 closes specifically; the iter9 hypothesis space refers back to their explicit fix commits.
- ~/src/libva-multiplanar/libva-v4l2-request-fourier/: fork at master = 229d6d1 today (with fresnel-fourier MPEG-2 commits past 65969da); iter9 work happens on this master, with potentially a feature branch if the fix becomes large enough to warrant one.
- ~/src/marfrit-packages/arch/libva-v4l2-request-fourier/PKGBUILD: bump _commit after iter9 close + close validation passes.