forked from marfrit/libva-v4l2-request-fourier
iter6 fix: per-OUTPUT-slot request_fd binding via REINIT
iter4 (385dee1) replaced the original media_request_reinit pattern with close+media_request_alloc per frame to escape an EINVAL on S_EXT_CTRLS that turned out to be a DPB-payload bug (74d8dd1, FFmpeg V4L2_H264_FRAME_REF semantics). The per-frame close+alloc model worked for mpv vaapi-copy (single-surface recycle) but raced under Firefox 150's MediaSource pipeline (multi-surface rotation): fd=30 got reused via lowest-free-fd allocation faster than the kernel- side per-buffer state-machine could tear down the prior request, producing intermittent VIDIOC_QBUF EINVAL on OUTPUT after 1..53 successful frames. Phase 2 telemetry confirmed: - DQBUF returned the index we passed (no FIFO mismatch) - SPS/PPS/DECODE_PARAMS/SCALING_MATRIX byte-identical between mpv and Firefox first 64 bytes - Pool size bump 4 -> 16 only delayed the failure (62 frames) - Different OUTPUT slot indices failed across runs (race signature) Fix: each OUTPUT pool slot owns a permanent request_fd allocated once at request_pool_init and REINIT'd between uses in RequestSyncSurface. 1:1 slot-to-fd binding eliminates cross-slot fd reuse entirely. Pool stays driver-wide (multi-context safe per iter5 Track E); slots cycle through 16 distinct fds in round-robin acquire. Files: - request_pool.h: add request_fd field to slot struct; init signature takes media_fd - request_pool.c: alloc per-slot fd at init, close at destroy - context.c: pass driver_data->media_fd; pool size 4 -> 16 - picture.c: BeginPicture binds slot->request_fd to surface; EndPicture's per-frame media_request_alloc removed - surface.c: RequestSyncSurface uses media_request_reinit instead of close+alloc; DestroySurfaces close removed (slot owns fd); error path close removed; surface_object NULL-init for the -Wmaybe-uninitialized warning fix Empirical verification (clean build sha ebe396d5..., no diagnostic instrumentation): - Firefox 150 + bbb_1080p30_h264.mp4 + LIBVA_DRIVER_NAME=v4l2_request + sandbox enabled: 35s+ playback, zero "Unable to queue buffer" / "Unable to set control(s)", lsof shows RDD process holds /dev/video1 + /dev/media0 throughout. Driver stderr: only the single cap_pool_init: 24 slots ready line. - mpv vaapi-copy 50 frames: zero errors, "Using hardware decoding (vaapi-copy)" - no regression vs iter5-end driver. Pool-size bump diagnostic (Phase 5 sonnet design review feedback): 4 -> 16 alone took 1->62 frames, far short of the 30s success criterion (~900 frames at 30fps). REINIT discipline is the actual fix; pool 16 is comfortable headroom over typical H.264 MaxDpbFrames. Phase 5 sonnet code review: APPROVE-WITH-CHANGES (one comment attribution corrected: cleanup runs at RequestTerminate, not RequestDestroyContext, since the pool is driver-wide). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
+7
-1
@@ -96,8 +96,14 @@ VAStatus RequestCreateContext(VADriverContextP context, VAConfigID config_id,
|
||||
* surfaces present at vaCreateContext time, NULL-derefing on
|
||||
* surfaces created later.
|
||||
*/
|
||||
/*
|
||||
* iter6: pool size 16 gives comfortable headroom over typical H.264
|
||||
* MaxDpbFrames (16) for any consumer that pipelines decode requests.
|
||||
* Each slot owns its own request_fd (REINIT'd per use).
|
||||
*/
|
||||
rc = request_pool_init(&driver_data->output_pool,
|
||||
driver_data->video_fd, output_type, 4);
|
||||
driver_data->video_fd, driver_data->media_fd,
|
||||
output_type, 16);
|
||||
if (rc < 0) {
|
||||
status = VA_STATUS_ERROR_ALLOCATION_FAILED;
|
||||
goto error;
|
||||
|
||||
+14
-7
@@ -274,6 +274,14 @@ VAStatus RequestBeginPicture(VADriverContextP context, VAContextID context_id,
|
||||
surface_object->source_index = slot->index;
|
||||
surface_object->source_data = slot->data;
|
||||
surface_object->source_size = slot->size;
|
||||
/*
|
||||
* iter6: bind the slot's permanent request_fd to this surface for the
|
||||
* duration of the decode cycle. Replaces the iter4 close+alloc-per-
|
||||
* frame model. The fd is REINIT'd (not closed) at RequestSyncSurface,
|
||||
* so the kernel-side request object is reset in place — no fd-reuse
|
||||
* race with another slot's pending decode.
|
||||
*/
|
||||
surface_object->request_fd = slot->request_fd;
|
||||
surface_object->slices_size = 0;
|
||||
surface_object->slices_count = 0;
|
||||
surface_object->params.h264.matrix_set = false;
|
||||
@@ -357,14 +365,13 @@ VAStatus RequestEndPicture(VADriverContextP context, VAContextID context_id)
|
||||
|
||||
gettimeofday(&surface_object->timestamp, NULL);
|
||||
|
||||
/*
|
||||
* iter6: request_fd was bound to the surface in BeginPicture from
|
||||
* the OUTPUT pool slot's permanent fd. Per-frame allocation is gone.
|
||||
*/
|
||||
request_fd = surface_object->request_fd;
|
||||
if (request_fd < 0) {
|
||||
request_fd = media_request_alloc(driver_data->media_fd);
|
||||
if (request_fd < 0)
|
||||
return VA_STATUS_ERROR_OPERATION_FAILED;
|
||||
|
||||
surface_object->request_fd = request_fd;
|
||||
}
|
||||
if (request_fd < 0)
|
||||
return VA_STATUS_ERROR_OPERATION_FAILED;
|
||||
|
||||
rc = codec_set_controls(driver_data, context_object,
|
||||
config_object->profile, surface_object);
|
||||
|
||||
+23
-1
@@ -14,11 +14,13 @@
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <sys/mman.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#include "media.h"
|
||||
#include "utils.h"
|
||||
#include "v4l2.h"
|
||||
|
||||
int request_pool_init(struct request_pool *pool, int video_fd,
|
||||
int request_pool_init(struct request_pool *pool, int video_fd, int media_fd,
|
||||
unsigned int output_type, unsigned int count)
|
||||
{
|
||||
unsigned int index_base;
|
||||
@@ -40,6 +42,9 @@ int request_pool_init(struct request_pool *pool, int video_fd,
|
||||
pool->count = count;
|
||||
pool->next = 0;
|
||||
|
||||
for (i = 0; i < count; i++)
|
||||
pool->slots[i].request_fd = -1;
|
||||
|
||||
rc = v4l2_create_buffers(video_fd, output_type, count, &index_base);
|
||||
if (rc < 0)
|
||||
goto error;
|
||||
@@ -63,6 +68,21 @@ int request_pool_init(struct request_pool *pool, int video_fd,
|
||||
}
|
||||
|
||||
pool->slots[i].size = length;
|
||||
|
||||
/*
|
||||
* iter6: each pool slot owns a permanent media-request fd,
|
||||
* allocated once here and REINIT'd between uses in
|
||||
* RequestSyncSurface. Replaces the iter4 close+alloc-per-
|
||||
* frame model, whose lowest-free fd reuse was racing with
|
||||
* the kernel's per-buffer state-machine teardown when the
|
||||
* consumer rotated through multiple OUTPUT pool slots
|
||||
* faster than the kernel cleanup drained (Firefox's
|
||||
* MediaSource pattern). 1:1 slot-to-fd binding eliminates
|
||||
* cross-slot fd reuse.
|
||||
*/
|
||||
pool->slots[i].request_fd = media_request_alloc(media_fd);
|
||||
if (pool->slots[i].request_fd < 0)
|
||||
goto error;
|
||||
}
|
||||
|
||||
pool->initialized = true;
|
||||
@@ -81,6 +101,8 @@ void request_pool_destroy(struct request_pool *pool)
|
||||
return;
|
||||
|
||||
for (i = 0; i < pool->count; i++) {
|
||||
if (pool->slots[i].request_fd >= 0)
|
||||
close(pool->slots[i].request_fd);
|
||||
if (pool->slots[i].data != NULL && pool->slots[i].size > 0)
|
||||
munmap(pool->slots[i].data, pool->slots[i].size);
|
||||
}
|
||||
|
||||
+7
-1
@@ -37,6 +37,12 @@ struct request_pool_slot {
|
||||
void *data; /* mmap pointer for this slot */
|
||||
unsigned int size; /* mmap size in bytes */
|
||||
bool busy; /* true while borrowed for a request */
|
||||
int request_fd; /* per-slot media-request fd, allocated
|
||||
* once at pool init, REINIT'd between
|
||||
* uses. iter6: replaces iter4 close+
|
||||
* alloc-per-frame to eliminate cross-
|
||||
* slot fd-reuse race that broke Firefox
|
||||
* MediaSource's multi-surface decode. */
|
||||
};
|
||||
|
||||
struct request_pool {
|
||||
@@ -52,7 +58,7 @@ struct request_pool {
|
||||
* VIDIOC_S_FMT on the OUTPUT queue. Returns 0 on success, -1 on
|
||||
* failure.
|
||||
*/
|
||||
int request_pool_init(struct request_pool *pool, int video_fd,
|
||||
int request_pool_init(struct request_pool *pool, int video_fd, int media_fd,
|
||||
unsigned int output_type, unsigned int count);
|
||||
|
||||
/*
|
||||
|
||||
+41
-16
@@ -394,8 +394,13 @@ VAStatus RequestDestroySurfaces(VADriverContextP context,
|
||||
*/
|
||||
surface_unbind_slot(driver_data, surface_object);
|
||||
|
||||
if (surface_object->request_fd > 0)
|
||||
close(surface_object->request_fd);
|
||||
/*
|
||||
* iter6: request_fd is owned by the OUTPUT pool slot, not by
|
||||
* the surface. Do not close here. The pool closes all slot
|
||||
* fds at request_pool_destroy time, which fires from
|
||||
* RequestTerminate (driver unload) — the pool is driver-wide
|
||||
* and survives context destroy/recreate cycles.
|
||||
*/
|
||||
|
||||
object_heap_free(&driver_data->surface_heap,
|
||||
(struct object_base *)surface_object);
|
||||
@@ -408,7 +413,7 @@ VAStatus RequestSyncSurface(VADriverContextP context, VASurfaceID surface_id)
|
||||
{
|
||||
|
||||
struct request_data *driver_data = context->pDriverData;
|
||||
struct object_surface *surface_object;
|
||||
struct object_surface *surface_object = NULL;
|
||||
VAStatus status;
|
||||
struct video_format *video_format;
|
||||
unsigned int output_type, capture_type;
|
||||
@@ -454,18 +459,26 @@ VAStatus RequestSyncSurface(VADriverContextP context, VASurfaceID surface_id)
|
||||
}
|
||||
|
||||
/*
|
||||
* iter4: instead of REINITing for reuse, close the request_fd here
|
||||
* and force the next BeginPicture to allocate a fresh one. The cached
|
||||
* request_fd path was hitting EINVAL on S_EXT_CTRLS for some surface
|
||||
* recycle patterns (4 individual TRY_EXT_CTRLS on the same fd all
|
||||
* fail with EINVAL — the fd state is bad even though queue+wait+reinit
|
||||
* appeared successful). Allocating fresh per frame is unambiguous and
|
||||
* sidesteps any state-lifecycle issue. Tradeoff: ~one extra ioctl per
|
||||
* frame (MEDIA_IOC_REQUEST_ALLOC + close), negligible cost.
|
||||
* iter6: the request_fd belongs to the OUTPUT pool slot, not to the
|
||||
* surface. REINIT to reset its state in place — close+alloc would
|
||||
* reuse the lowest-free fd number against a kernel object whose
|
||||
* teardown hasn't fully drained, racing with QBUF on a slot that
|
||||
* was just released. The pool's 1:1 slot-to-fd binding eliminates
|
||||
* cross-slot fd reuse, and REINIT here resets the request object
|
||||
* for the next decode cycle on the same slot.
|
||||
*
|
||||
* Iter4's frame-11 EINVAL (which prompted the iter4 close+alloc
|
||||
* model) was a control-payload bug — DPB carry-over with FFmpeg's
|
||||
* V4L2_H264_FRAME_REF semantics not yet matched. That's been fixed
|
||||
* since iter4 (`74d8dd1`), so REINIT is no longer compromised by
|
||||
* the cluster-validation EINVAL pattern.
|
||||
*/
|
||||
close(request_fd);
|
||||
rc = media_request_reinit(request_fd);
|
||||
if (rc < 0) {
|
||||
status = VA_STATUS_ERROR_OPERATION_FAILED;
|
||||
goto error;
|
||||
}
|
||||
surface_object->request_fd = -1;
|
||||
(void)0; /* placeholder for the now-removed reinit error path */
|
||||
|
||||
rc = v4l2_dequeue_buffer(driver_data->video_fd, -1, output_type,
|
||||
surface_object->source_index, 1);
|
||||
@@ -509,10 +522,22 @@ VAStatus RequestSyncSurface(VADriverContextP context, VASurfaceID surface_id)
|
||||
goto complete;
|
||||
|
||||
error:
|
||||
if (request_fd >= 0) {
|
||||
close(request_fd);
|
||||
/*
|
||||
* iter6: request_fd is owned by the OUTPUT pool slot. Do not
|
||||
* close here on error. The slot's REINIT may not have run if
|
||||
* we errored before it; the slot stays busy=true and is not
|
||||
* returned to acquire-rotation until RequestTerminate runs
|
||||
* request_pool_destroy. (The pool is driver-wide; it survives
|
||||
* RequestDestroyContext.) That's a bounded leak we accept: with
|
||||
* pool size 16 and rare errors, slot starvation only matters
|
||||
* after many error cycles — at which point acquire returns -1
|
||||
* cleanly.
|
||||
*
|
||||
* TODO: a future iteration could add a request_pool_force_release
|
||||
* that REINITs the fd and frees the slot for error recovery.
|
||||
*/
|
||||
if (surface_object != NULL)
|
||||
surface_object->request_fd = -1;
|
||||
}
|
||||
|
||||
complete:
|
||||
return status;
|
||||
|
||||
Reference in New Issue
Block a user