iter9 input: cap_pool double-init + REQBUFS EBUSY + REINIT bad-fd cascade under mpv H.264 playback (ohm/RK3568) #1

Closed
opened 2026-05-08 13:35:44 +00:00 by marfrit · 1 comment
Owner

Summary

At the iter8 production tip (commit 65969da3, packaged as libva-v4l2-request-fourier-1.0.0.r280.65969da-1 and shipped to [marfrit] 2026-05-08), interactive mpv H.264 playback on ohm hits a CAPTURE-pool / OUTPUT-queue lifecycle bug that the iter5/8 closes did not catch. The decoder produces no picture; ffmpeg gives up after several Failed to create surface errors.

Trace (verbatim from user's session, mfritsche@ohm 2026-05-08)

mpv --hwdec=vaapi --vo=dmabuf-wayland --target-colorspace-hint=no fourier-test/bbb_1080p30_h264.mp4
● Video --vid=1 --vlang=eng (h264 1920x1080 24 fps) [default]
v4l2-request: cap_pool_init: 24 slots ready (v4l2_index=0..23, 1 plane(s) per slot)
v4l2-request: cap_pool_init: 24 slots ready (v4l2_index=0..23, 1 plane(s) per slot)   ← double-init
Using hardware decoding (vaapi).
v4l2-request: Unable to request buffers: Device or resource busy                       ← EBUSY on REQBUFS
v4l2-request: Unable to request buffers: Device or resource busy
v4l2-request: cap_pool_init: 24 slots ready (v4l2_index=24..47, 1 plane(s) per slot)   ← pool grows
v4l2-request: Unable to reinit media request: Bad file descriptor                      ← iter6 REINIT path
[ffmpeg/video] h264: Failed to end picture decode issue: 1 (operation failed).
v4l2-request: Unable to query buffer: Invalid argument
v4l2-request: cap_pool_init: query_buffer failed for slot 16 (v4l2_index=64)
v4l2-request: Unable to create buffer for type 9: No buffer space available            ← OUTPUT (V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) exhausted
[ffmpeg] AVHWFramesContext: Failed to create surface: 2 (resource allocation failed).
[ffmpeg/video] h264: get_buffer() failed
[ffmpeg/video] h264: thread_get_buffer() failed

Diagnosis

  • cap_pool_init fires twice for the same slot range (0..23) before any frame is decoded, then a third time at 24..47 (pool growth). This strongly suggests a probe-then-decode lifecycle problem: the VA-API capability probe creates a context that initializes the pool, then mpv creates the actual decode context, which initializes the pool again without the first one being torn down.
  • VIDIOC_REQBUFS returns EBUSY → the queue is still STREAMON'd from the first init. Iter5/6/7's lifecycle code is not STREAMOFF'ing before reconfiguring.
  • Unable to reinit media request: Bad file descriptor → iter6's per-OUTPUT-slot REINIT (commit a09c03c) is using a stale fd, suggesting the request_fd binding lifecycle is racing with context teardown.
  • The Unable to create buffer for type 9: No buffer space available cascade (ENOBUFS) is most likely the kernel's OUTPUT queue hitting its buffer-count limit, because failed REQBUFS calls leave the previously allocated buffers in place.
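The double-init signature can be pulled out of a saved log mechanically. The following is a minimal sketch, assuming the driver keeps the `cap_pool_init: N slots ready (v4l2_index=A..B, ...)` message format shown in the trace; the function name `dup_pool_ranges` is hypothetical and not part of the project.

```shell
#!/bin/sh
# Print every v4l2_index range that cap_pool_init reported more than once.
# Reads a driver log on stdin. The message format is assumed to match the
# trace in this issue; dup_pool_ranges is an illustrative name.
dup_pool_ranges() {
    grep -o 'cap_pool_init: [0-9]* slots ready (v4l2_index=[0-9]*\.\.[0-9]*' |
        sed 's/.*v4l2_index=//' |   # keep only the "A..B" range
        sort |
        uniq -d                     # print ranges seen at least twice
}
```

Fed the trace above, this should print `0..23` and nothing for `24..47`, which was initialized only once. For example: `mpv ... 2>&1 | dup_pool_ranges`.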

Verification that bug is libva-side

  • mpv --hwdec=v4l2request --vo=gpu fourier-test/bbb_1080p30_h264.mp4 → correct picture displayed (slow due to GPU shader path on Mali-G52, but content is real). This bypasses libva entirely and uses ffmpeg's native v4l2-request hwaccel against /dev/media0 directly.
  • Therefore the hantro decoder, kernel V4L2 stateless API, mpv, and the dmabuf path are all functional. The bug is in libva-v4l2-request-fourier's CAPTURE pool / request lifecycle code (iter6/7 territory).

Iter5/8 close did not catch this because

  • iter8's tests/run_perf_binding_cell.sh harness exercises a different consumer matrix and likely doesn't trigger the probe-then-decode handshake that mpv does (mpv calls vaQueryConfigProfiles + vaCreateConfig + vaCreateContext in close succession; the harness probably reuses one context).
  • iter5/8 close said "mpv --hwdec=vaapi smooth" — but the validation evidence files don't show a re-test against a freshly-installed package on a freshly-logged-in session, so the lifecycle path here may have been masked by a long-running mpv process that survived the bug.

Suggested iter9 starting points

  1. Add a regression test that exercises the vainfo profile-list + vaCreateContext + decode path back-to-back, mirroring mpv's exact call sequence. The iter7 cap_pool harness alone is insufficient.
  2. Audit teardown of cap_pool + request_pool on vaDestroyContext — verify STREAMOFF is called before the next REQBUFS.
  3. Audit iter6's REINIT path for fd lifetime: which entity owns the request fd and when does it get closed vs reused.
  4. Consider whether the per-OUTPUT-slot REINIT introduced by iter6 needs to be conditionalized on a context-stable flag, not run unconditionally on every request.
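As a starting point for item 1, a coarse process-level approximation could look like the sketch below. It is not the in-process sequence mpv performs: `vainfo` and `ffmpeg` run as separate processes, so a real regression test would still need to call `vaQueryConfigProfiles`, `vaCreateConfig` and `vaCreateContext` from one process. The function names are hypothetical, the error strings are copied from the trace in this issue, and the sketch assumes the driver writes its `v4l2-request:` messages to stderr and that the `LIBVA_*` variables from the Environment section are exported.

```shell
#!/bin/sh
# Error strings copied verbatim from the trace in this issue, one per line.
LIFECYCLE_ERRORS='Unable to request buffers: Device or resource busy
Unable to reinit media request: Bad file descriptor
Unable to create buffer for type 9: No buffer space available'

# Return 0 (true) if the given log file contains any known lifecycle error.
# grep -F treats each line of the pattern list as a separate fixed string.
has_lifecycle_errors() {
    grep -F -q -e "$LIFECYCLE_ERRORS" "$1"
}

# Probe first, then decode a few frames, collecting all output in one log.
# Returns 0 if clean, 1 if a lifecycle error was logged, 2 if a tool failed.
probe_then_decode() {
    clip=$1
    log=$2
    vainfo >"$log" 2>&1 || return 2
    ffmpeg -nostdin -hwaccel vaapi -i "$clip" -frames:v 30 -f null - \
        >>"$log" 2>&1 || return 2
    if has_lifecycle_errors "$log"; then
        return 1
    fi
    return 0
}
```

A run might look like `probe_then_decode fourier-test/bbb_1080p30_h264.mp4 /tmp/probe.log`. Checking the log rather than only the exit status matters because ffmpeg can fall back to software decoding and still exit 0.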

Environment

  • Host: ohm (PineTab2, Rockchip RK3568, hantro G1)
  • Kernel: as shipped by EndeavourOS-ARM custom
  • Package: libva-v4l2-request-fourier-1.0.0.r280.65969da-1 from [marfrit] (commit 65969da3 = iter8 close)
  • libva: 2.23.0-1 → VA-API 1.23
  • mpv: v0.41.0
  • ffmpeg runtime: n8.1 (ffmpeg-v4l2-request 2:8.1-3 installed)
  • Compositor: KWin 6.x via Plasma 6 Wayland
  • env: LIBVA_DRIVER_NAME=v4l2_request LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0

Workaround

Use --hwdec=v4l2request instead of --hwdec=vaapi — bypasses libva entirely. Same hardware decode, different code path.

Collaborator

Not reproducible at iter39 (cf8cd9d). Closing as fixed.

Empirical re-test (2026-05-18, ohm, RK3568 / hantro G1)

Upgraded libva-v4l2-request-fourier 1.0.0.r280.65969da-1 (iter8, the version this issue was filed against) → 1:1.0.0.r361.cf8cd9d-1 (iter39 + Hi10P/Main10 work).

Reproducer A: the issue's mpv command, switched to the vaapi-copy path with --vo=null --frames=60 (the dmabuf-direct --hwdec=vaapi path is covered under Caveats)

env LIBVA_DRIVER_NAME=v4l2_request \
    LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \
    LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \
    mpv --hwdec=vaapi-copy --vo=null --frames=60 \
        ~/fourier-test/bbb_1080p30_h264.mp4

[vaapi] libva: Trying to open /usr/lib/dri/v4l2_request_drv_video.so
[vaapi] libva: Found init function __vaDriverInit_1_23
[vaapi] libva: va_openDriver() returns 0
[vaapi] Initialized VAAPI: version 1.23
[vd] Using hardware decoding (vaapi-copy).
v4l2-request: cap_pool_init: 24 slots ready (v4l2_index=0..23, 1 plane(s) per slot)
v4l2-request: Unable to set control(s): Invalid argument           ← transient, decode continues
...
Exiting... (End of file)   exit=0

Bugs from the original trace: all gone

  • cap_pool_init double-init for same slot range → GONE (single init only)
  • Unable to request buffers: Device or resource busy (REQBUFS EBUSY) → GONE
  • Unable to reinit media request: Bad file descriptor (iter6 REINIT path) → GONE
  • OUTPUT queue exhaustion No buffer space available → GONE
  • Failed to create surface: 2 cascade → GONE

Reproducer B — direct ffmpeg-vaapi decode (30 frames, real content sanity check)

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi \
       -i bbb_1080p30_h264.mp4 -vf hwdownload,format=nv12 \
       -frames:v 30 -f rawvideo out.nv12
  • Exit 0, 1.31× realtime
  • 93,312,000 bytes written (correct NV12 size for 30 × 1920×1080)
  • 127 unique byte values (real content; not the 2-unique-byte "all black" artifact)
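The two sanity checks above can be reproduced with standard tools. NV12 stores a full-resolution Y plane plus a half-size interleaved UV plane, so one frame takes width × height × 3/2 bytes. The helper names below are hypothetical; this is a sketch of one way to do the check, not necessarily how the numbers above were obtained.

```shell
#!/bin/sh
# Expected size in bytes of N raw NV12 frames: width * height * 3 / 2 per frame.
# Arguments: width height frame_count
nv12_bytes() {
    echo $(( $1 * $2 * 3 / 2 * $3 ))
}

# Count the distinct byte values in a file. An all-black NV12 dump has only
# two (one luma value, one chroma value); real content has many more.
unique_bytes() {
    od -An -v -tu1 "$1" |     # dump every byte as an unsigned decimal
        tr -s ' ' '\n' |      # one value per line
        grep -v '^$' |        # drop the empty line left by leading spaces
        sort -un |            # numeric sort, unique values only
        wc -l |
        tr -d ' '             # strip padding some wc implementations add
}
```

`nv12_bytes 1920 1080 30` prints 93312000, matching the size reported above. Against the dump, `[ "$(wc -c < out.nv12)" -eq "$(nv12_bytes 1920 1080 30)" ]` checks the size and `unique_bytes out.nv12` gives the value count.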

Root-cause class — fixed by which iterations?

The issue's diagnosis section called out three distinct lifecycle bugs:

  1. Probe-then-decode double-init: covered by the iter5b-β rework. RequestDestroyContext now tears down output_pool BEFORE REQBUFS(0) and calls cap_pool_destroy BEFORE the next CreateContext (see comment block at src/context.c:756–767 "Phase 5 v2 review CRIT-2").
  2. STREAMOFF-before-REQBUFS: now explicit at src/context.c:735–742, where both the OUTPUT and CAPTURE queues are STREAMOFF'd unconditionally on context destroy.
  3. iter6 REINIT stale-fd: the iter5b-β rework also retired the per-OUTPUT-slot REINIT logic in favor of fresh MEDIA_IOC_REQUEST_ALLOC per request lifecycle.

Plus separately, iter38 multi-device probe + iter39 Hi10P/Main10 sub-profile work both exercised the create/destroy cycle heavily and would have re-surfaced any lingering lifecycle bugs.

Caveats / out of scope

  • The dmabuf-direct path --hwdec=vaapi (no -copy) still fails on ohm with [vd] Could not create device. Using software decoding. — but that's an ffmpeg-side av_hwdevice_ctx_create failure during the vaapi-drm/vaapi-wayland adapter probe, NOT the libva CAPTURE-pool / OUTPUT-queue lifecycle bug this issue documented. Separate ticket if it matters; the bypass --hwdec=vaapi-copy works, and direct ffmpeg-vaapi works.
  • The Unable to set control(s): Invalid argument line during cap_pool_init is transient and harmless — the decode continues and produces correct content. Worth a follow-up cleanup but not a regression.

Closing.

Reference: marfrit/libva-v4l2-request-fourier#1