iter9 input: cap_pool double-init + REQBUFS EBUSY + REINIT bad-fd cascade under mpv H.264 playback (ohm/RK3568) #1
Summary
At the iter8 production tip (commit 65969da3, packaged as libva-v4l2-request-fourier-1.0.0.r280.65969da-1 and shipped to [marfrit] 2026-05-08), interactive mpv H.264 playback on ohm hits a CAPTURE-pool / OUTPUT-queue lifecycle bug that the iter5/8 closes did not catch. The decoder produces no picture; ffmpeg gives up after several "Failed to create surface" errors.
Trace (verbatim from user's session, mfritsche@ohm 2026-05-08)
Diagnosis
- cap_pool_init fires twice for the same slot range (0..23) before any decoded frame, then a third time at 24..47 (pool growth). This strongly suggests a probe-then-decode lifecycle: the VA-API capability probe creates a context that initializes the pool, and mpv then creates the actual decode context, which initializes the pool again without the first being torn down.
- VIDIOC_REQBUFS returns EBUSY → the queue is still STREAMON'd from the first init. Iter5/6/7's lifecycle code is not STREAMOFF'ing before reconfiguring.
- "Unable to reinit media request: Bad file descriptor" → iter6's per-OUTPUT-slot REINIT (commit a09c03c) is using a stale fd, suggesting the request_fd binding lifecycle is racing with context teardown.
- The "Unable to create buffer for type 9: No buffer space available" cascade is the kernel's OUTPUT queue running out of memory because failed REQBUFS calls leave the previous count allocated.
Verification that bug is libva-side
- mpv --hwdec=v4l2request --vo=gpu fourier-test/bbb_1080p30_h264.mp4 → correct picture displayed (slow due to the GPU shader path on Mali-G52, but the content is real). This bypasses libva entirely and uses ffmpeg's native v4l2-request hwaccel against /dev/media0 directly.
Iter5/8 close did not catch this because
- The tests/run_perf_binding_cell.sh harness exercises a different consumer matrix and likely doesn't trigger the probe-then-decode handshake that mpv does (mpv calls vaQueryConfigProfiles + vaCreateConfig + vaCreateContext in close succession; the harness probably reuses one context).
- [...] --hwdec=vaapi smooth", but the validation evidence files don't show a re-test against a freshly installed package in a freshly logged-in session, so the lifecycle path here may have been masked by a long-running mpv process that survived the bug.
Suggested iter9 starting points
- vainfo profile-list + vaCreateContext + decode path back-to-back, mirroring mpv's exact call sequence. The iter7 cap_pool harness alone is insufficient.
- cap_pool + request_pool on vaDestroyContext: verify STREAMOFF is called before the next REQBUFS.
Environment
- libva-v4l2-request-fourier-1.0.0.r280.65969da-1 from [marfrit] (commit 65969da3 = iter8 close)
- [...] ffmpeg-v4l2-request 2:8.1-3 installed)
- LIBVA_DRIVER_NAME=v4l2_request LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0
Workaround
Use --hwdec=v4l2request instead of --hwdec=vaapi; this bypasses libva entirely. Same hardware decode, different code path.
Not reproducible at iter39 (cf8cd9d). Closing as fixed.
Empirical re-test (2026-05-18, ohm, RK3568 / hantro G1)
Upgraded libva-v4l2-request-fourier 1.0.0.r280.65969da-1 (iter8, the version this issue was filed against) → 1:1.0.0.r361.cf8cd9d-1 (iter39 + Hi10P/Main10 work).
Reproducer A: the issue's exact mpv command (workaround equivalent: vaapi-copy)
Bugs from the original trace: ✅ all gone
- cap_pool_init double-init for same slot range → GONE (single init only)
- "Unable to request buffers: Device or resource busy" (REQBUFS EBUSY) → GONE
- "Unable to reinit media request: Bad file descriptor" (iter6 REINIT path) → GONE
- "No buffer space available" → GONE
- "Failed to create surface: 2" cascade → GONE
Reproducer B: direct ffmpeg-vaapi decode (30 frames, real content sanity check)
Root-cause class — fixed by which iterations?
The issue's diagnosis section called out three distinct lifecycle bugs:
- RequestDestroyContext now tears down output_pool BEFORE REQBUFS(0) AND cap_pool_destroy BEFORE the next CreateContext (see comment block at src/context.c:756–767 "Phase 5 v2 review CRIT-2").
- src/context.c:735–742: both OUTPUT and CAPTURE STREAMOFF unconditionally on context destroy.
- MEDIA_IOC_REQUEST_ALLOC per request lifecycle.
Plus separately, the iter38 multi-device probe and the iter39 Hi10P/Main10 sub-profile work both exercised the create/destroy cycle heavily and would have re-surfaced any lingering lifecycle bugs.
Caveats / out of scope
- --hwdec=vaapi (no-copy) still fails on ohm with "[vd] Could not create device. Using software decoding.", but that's an ffmpeg-side av_hwdevice_ctx_create failure during the vaapi-drm/vaapi-wayland adapter probe, NOT the libva CAPTURE-pool / OUTPUT-queue lifecycle bug this issue documented. Separate ticket if it matters; the bypass --hwdec=vaapi-copy works, and direct ffmpeg-vaapi works.
- The "Unable to set control(s): Invalid argument" line during cap_pool_init is transient and harmless: the decode continues and produces correct content. Worth a follow-up cleanup but not a regression.
Closing.