forked from marfrit/libva-v4l2-request-fourier
iter2 Fix 3: decoupled CAPTURE buffer pool with LRU recycling

Before iter2, each VA surface was permanently bound 1:1 to one V4L2 CAPTURE
buffer. When mpv reused a surface for a new decode while the compositor
still held an EXPBUF'd dma_buf fd to the prior frame, the kernel wrote
fresh decode output into the same physical memory the compositor was
reading. This was visible as stutter / back-and-forth frame swapping on
mpv --hwdec=vaapi --vo=gpu playback.

Architecture:
- New cap_pool abstraction (cap_pool.{h,c}) owns N CAPTURE buffers
  (N = max(surfaces_count, MIN_CAP_POOL), with MIN_CAP_POOL = 24) and
  tracks per-slot state {FREE, IN_DECODE, DECODED, EXPORTED}, guarded
  by a pthread_mutex_t.
- Surfaces no longer own buffers. Each vaBeginPicture acquires the
  oldest FREE slot (LRU) and binds it for the decode cycle; the slot
  then cycles IN_DECODE -> DECODED (post-DQBUF) -> EXPORTED (post-EXPBUF).
- A slot is released on the next vaBeginPicture for the same surface or
  on vaDestroySurfaces.
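
The LRU acquisition described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in cap_pool.c: every sketch_* name, the fixed SKETCH_POOL_MAX bound, and the use of a logical release counter to decide which FREE slot is "oldest" are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_POOL_MAX 32 /* hypothetical upper bound for this sketch */

enum sketch_slot_state {
	SKETCH_FREE, SKETCH_IN_DECODE, SKETCH_DECODED, SKETCH_EXPORTED
};

struct sketch_slot {
	enum sketch_slot_state state;
	uint64_t freed_at; /* logical time of the last release */
};

struct sketch_pool {
	pthread_mutex_t lock;
	struct sketch_slot slots[SKETCH_POOL_MAX];
	unsigned int count;
	uint64_t clock; /* incremented on every release */
};

void sketch_pool_init(struct sketch_pool *p, unsigned int count)
{
	assert(count > 0 && count <= SKETCH_POOL_MAX);
	pthread_mutex_init(&p->lock, NULL);
	p->count = count;
	p->clock = 0;
	for (unsigned int i = 0; i < count; i++) {
		p->slots[i].state = SKETCH_FREE;
		p->slots[i].freed_at = 0;
	}
}

/* Return the index of the FREE slot that was released longest ago, or -1. */
int sketch_pool_acquire(struct sketch_pool *p)
{
	int best = -1;

	pthread_mutex_lock(&p->lock);
	for (unsigned int i = 0; i < p->count; i++) {
		if (p->slots[i].state != SKETCH_FREE)
			continue;
		if (best < 0 || p->slots[i].freed_at < p->slots[best].freed_at)
			best = (int)i;
	}
	if (best >= 0)
		p->slots[best].state = SKETCH_IN_DECODE;
	pthread_mutex_unlock(&p->lock);
	return best;
}

/* Mark a slot FREE again and stamp it so it goes to the back of the LRU order. */
void sketch_pool_release(struct sketch_pool *p, int idx)
{
	assert(idx >= 0 && (unsigned int)idx < p->count);
	pthread_mutex_lock(&p->lock);
	p->slots[idx].state = SKETCH_FREE;
	p->slots[idx].freed_at = ++p->clock;
	pthread_mutex_unlock(&p->lock);
}
```

Picking the least recently released slot maximizes the time between a slot being released and being written again, which is what gives a consumer that still holds the old frame time to finish with it.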

Limitations (Sonnet Phase 5 review iter2 9.x, deferred to iter3+):
- This is the Option-A statistical mitigation, not a complete fix: the
  race window narrows to the case "pool exhausted, force-recycle of
  oldest EXPORTED slot." For typical mpv 16-surface playback with
  MIN_CAP_POOL=24 the fallback never fires.
- Multi-context concurrent use is not addressed (one V4L2 device,
  multiple cap_pools -- iter3 scope).
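
To make the remaining race concrete, here is a hedged sketch of what a force-recycle fallback could look like. The fb_* names, the stamp field, and the exact selection rule are assumptions for illustration; the commit message only states that the oldest EXPORTED slot is force-recycled and that the driver's own EXPBUF fd is closed.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

enum fb_state { FB_FREE, FB_IN_DECODE, FB_DECODED, FB_EXPORTED };

struct fb_slot {
	enum fb_state state;
	uint64_t stamp; /* logical time of the last state change */
	int export_fd;  /* our own EXPBUF fd, or -1 if none */
};

/*
 * Prefer the oldest FREE slot; if none exists, force-recycle the oldest
 * EXPORTED slot. Sets *forced to 1 when the fallback fired. Returns the
 * slot index, or -1 if every slot is IN_DECODE or DECODED.
 * The caller is assumed to hold the pool lock.
 */
int fb_acquire(struct fb_slot *slots, unsigned int count, uint64_t now,
	       int *forced)
{
	int free_idx = -1, exp_idx = -1, idx;

	assert(slots && forced);
	for (unsigned int i = 0; i < count; i++) {
		if (slots[i].state == FB_FREE &&
		    (free_idx < 0 || slots[i].stamp < slots[free_idx].stamp))
			free_idx = (int)i;
		if (slots[i].state == FB_EXPORTED &&
		    (exp_idx < 0 || slots[i].stamp < slots[exp_idx].stamp))
			exp_idx = (int)i;
	}

	*forced = 0;
	idx = free_idx;
	if (idx < 0) {
		idx = exp_idx;
		if (idx < 0)
			return -1;
		*forced = 1;
		if (slots[idx].export_fd >= 0) {
			/* Drops only our reference to the dma_buf. */
			close(slots[idx].export_fd);
			slots[idx].export_fd = -1;
		}
	}
	slots[idx].state = FB_IN_DECODE;
	slots[idx].stamp = now;
	return idx;
}
```

Closing the driver's fd does not revoke a descriptor another process already holds for the same dma_buf, so a consumer may still be reading the buffer when the forced path fires. That is why this path is the residual race rather than a fix.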

Other call sites updated:
- picture.c::BeginPicture acquires and binds a slot, releasing the
  prior slot if any.
- surface.c::SyncSurface marks the slot DECODED after DQBUF.
- surface.c::ExportSurfaceHandle marks the slot EXPORTED and retains
  our own EXPBUF fd so a force-recycle can close() it.
- surface.c::DestroySurfaces releases via surface_unbind_slot;
  cap_pool owns the mmaps now.
- surface.c::CreateSurfaces2 destroys the pool in the resolution-change
  path before REQBUFS(0); otherwise slots keep a stale v4l2_index after
  Fix 1's REQBUFS.
- context.c::DestroyContext invokes cap_pool_destroy.
- image.c::DeriveImage skips copy_surface_to_image when current_slot is
  NULL (ffmpeg's av_hwframe_ctx_init probes derive on undecoded surfaces).
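
The per-surface lifecycle implied by these call sites can be sketched as below. This is an illustration only: the lc_* names are hypothetical, the slot is passed in already selected, and releasing the prior slot before binding the new one is an assumed ordering that the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lc_state { LC_FREE, LC_IN_DECODE, LC_DECODED, LC_EXPORTED };

struct lc_slot {
	enum lc_state state;
};

struct lc_surface {
	struct lc_slot *current_slot; /* NULL until the first BeginPicture */
};

/* BeginPicture: release the previously bound slot, if any, then bind a new one. */
void lc_begin_picture(struct lc_surface *surf, struct lc_slot *fresh)
{
	assert(fresh->state == LC_FREE);
	if (surf->current_slot)
		surf->current_slot->state = LC_FREE;
	fresh->state = LC_IN_DECODE;
	surf->current_slot = fresh;
}

/* SyncSurface: after DQBUF the bound slot holds a finished frame. */
void lc_sync_surface(struct lc_surface *surf)
{
	if (surf->current_slot && surf->current_slot->state == LC_IN_DECODE)
		surf->current_slot->state = LC_DECODED;
}

/* ExportSurfaceHandle: returns false when there is nothing bound to export. */
bool lc_export(struct lc_surface *surf)
{
	if (!surf->current_slot)
		return false;
	surf->current_slot->state = LC_EXPORTED;
	return true;
}

/* DeriveImage: copy pixels only when a slot is bound to the surface. */
bool lc_derive_should_copy(const struct lc_surface *surf)
{
	return surf->current_slot != NULL;
}

/* DestroySurfaces: hand the slot back and clear the binding. */
void lc_unbind(struct lc_surface *surf)
{
	if (surf->current_slot) {
		surf->current_slot->state = LC_FREE;
		surf->current_slot = NULL;
	}
}
```

The NULL check in lc_derive_should_copy mirrors the DeriveImage guard: a surface that has never been through BeginPicture has no buffer to read from, so a probe on it must not dereference a slot.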

Verified: mpv vaapi-copy, 200 frames of bbb_1080p30, 0 drops, LRU visibly
recycling slot indices, decoded output shows a real luma gradient.
mpv vaapi --vo=gpu operator-inspection follows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@@ -32,6 +32,7 @@
 #include "context.h"
 #include "object_heap.h"
 #include "request_pool.h"
+#include "cap_pool.h"
 #include "video.h"
 #include <va/va.h>
 
@@ -63,6 +64,22 @@ struct request_data {
 	 * RequestCreateContext, torn down at driver Terminate.
 	 */
 	struct request_pool output_pool;
+
+	/*
+	 * CAPTURE (decoded-frame) buffer pool, decoupled from VA
+	 * surfaces (iter2 Fix 3). Each surface acquires a slot at
+	 * vaBeginPicture time and releases it on the next acquisition
+	 * or vaDestroySurfaces. Pool sized to max(surfaces_count,
+	 * MIN_CAP_POOL) at first vaCreateSurfaces2; torn down at
+	 * vaDestroyContext.
+	 *
+	 * Background: pre-iter2 each surface was 1:1 bound to one
+	 * CAPTURE buffer index; mpv re-using a surface for a new decode
+	 * caused V4L2 to re-QBUF the same physical buffer while a
+	 * compositor still held an EXPBUF'd dma_buf fd, producing
+	 * visible stutter on mpv vaapi --vo=gpu.
+	 */
+	struct cap_pool capture_pool;
 };
 
 VAStatus VA_DRIVER_INIT_FUNC(VADriverContextP context);
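
The sizing rule in the comment above, max(surfaces_count, MIN_CAP_POOL), amounts to the following. The helper name sketch_cap_pool_size is hypothetical; the value 24 for MIN_CAP_POOL is taken from the commit message.

```c
#include <assert.h>

#define MIN_CAP_POOL 24 /* value stated in the commit message */

static_assert(MIN_CAP_POOL > 0, "pool must hold at least one buffer");

/* Number of CAPTURE buffers to allocate for a given VA surface count. */
unsigned int sketch_cap_pool_size(unsigned int surfaces_count)
{
	return surfaces_count > MIN_CAP_POOL ? surfaces_count : MIN_CAP_POOL;
}
```

With 16 surfaces this yields 24 slots, leaving 8 spare for LRU rotation. Once surfaces_count reaches 24 or more the pool has exactly one slot per surface and no spare capacity, so the force-recycle fallback becomes reachable.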