iter7: A+B+C — slot-leak fix, cap_pool harness, msync verify harness
Closes three internal carry items in one fork commit. iter6 deferred
these as TODOs; iter7 lands the implementations + supporting tests.
# Track B — slot-leak error recovery (src/)
iter6 documented the RequestSyncSurface error paths as a "bounded
leak we accept" — slots stayed busy=true after REINIT/DQBUF failures
until RequestTerminate ran. With pool=16 and rare errors this was
acceptable, but a sustained-error scenario could starve the pool.
Adds request_pool_force_release(pool, index) which:
1. Tries media_request_reinit on the slot's fd (cheap path)
2. Falls back to close + media_request_alloc (recovery)
3. Leaves the slot dead-busy if even alloc fails (other slots
unaffected, pool capacity reduced by 1 until destroy)
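The three-step fallback above can be sketched in isolation. This is a minimal model under stated assumptions, not the driver code: `struct slot`, `struct recover_ops`, `recover_slot`, and the `fake_*` helpers are hypothetical names, and the callbacks merely stand in for media_request_reinit, close, and media_request_alloc so the control flow can be exercised without a media device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool slot: only the fields the fallback touches. */
struct slot {
	int request_fd;
	bool busy;
};

/* Hypothetical callbacks standing in for media_request_reinit, close and
 * media_request_alloc. reinit returns 0 on success; alloc returns an fd or <0. */
struct recover_ops {
	int (*reinit)(int request_fd);
	void (*close_fd)(int request_fd);
	int (*alloc)(int media_fd);
};

enum recover_result {
	RECOVERED_REINIT,	/* step 1: request reset in place, fd unchanged */
	RECOVERED_REALLOC,	/* step 2: old fd closed, fresh request allocated */
	SLOT_DEAD,		/* step 3: alloc failed, slot stays busy */
};

static enum recover_result recover_slot(struct slot *slot, int media_fd,
					const struct recover_ops *ops)
{
	assert(slot != NULL && ops != NULL);

	/* Step 1: cheap path, reset the existing request in place. */
	if (slot->request_fd >= 0 && ops->reinit(slot->request_fd) == 0) {
		slot->busy = false;
		return RECOVERED_REINIT;
	}

	/* Step 2: drop the old fd and allocate a fresh request. */
	if (slot->request_fd >= 0)
		ops->close_fd(slot->request_fd);
	slot->request_fd = ops->alloc(media_fd);
	if (slot->request_fd < 0)
		return SLOT_DEAD;	/* Step 3: busy is left true on purpose. */

	slot->busy = false;
	return RECOVERED_REALLOC;
}

/* Fakes for demonstration only; a real caller would wrap the ioctls. */
static int fake_reinit_ok(int fd) { (void)fd; return 0; }
static int fake_reinit_fail(int fd) { (void)fd; return -1; }
static void fake_close(int fd) { (void)fd; }
static int fake_alloc_ok(int media_fd) { (void)media_fd; return 42; }
static int fake_alloc_fail(int media_fd) { (void)media_fd; return -1; }
```

Swapping `fake_reinit_ok` for `fake_reinit_fail`, and then `fake_alloc_ok` for `fake_alloc_fail`, walks a slot through all three outcomes in order.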
Wires it into the surface.c RequestSyncSurface error paths, but only
for errors that occur before the OUTPUT-DQBUF attempt. After an
OUTPUT-DQBUF failure the V4L2 buffer is in an indeterminate kernel
state, so a separate error label (`error_buffer_indeterminate`)
leaves the slot dead-busy: reusing it would QBUF a buffer the kernel
may still hold, and that QBUF would fail with EINVAL.
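The discriminator described above amounts to a mapping from the stage at which RequestSyncSurface failed to a recovery action. The sketch below is a hypothetical restatement of that rule: the enum names and `slot_action_for_failure` are illustrative and do not appear in the driver, which expresses the same choice through its two goto labels.

```c
#include <assert.h>

/* Hypothetical stage names, listed in the order the sync path runs them. */
enum sync_stage {
	STAGE_REQUEST_QUEUE,
	STAGE_WAIT_COMPLETION,
	STAGE_REQUEST_REINIT,
	STAGE_OUTPUT_DQBUF,
};

enum slot_action {
	SLOT_FORCE_RELEASE,	/* slot may rejoin acquire-rotation */
	SLOT_LEAVE_DEAD_BUSY,	/* slot must never be reused */
};

static enum slot_action slot_action_for_failure(enum sync_stage failed_at)
{
	switch (failed_at) {
	case STAGE_REQUEST_QUEUE:
	case STAGE_WAIT_COMPLETION:
	case STAGE_REQUEST_REINIT:
		/* Failure before the OUTPUT DQBUF attempt: recover the slot. */
		return SLOT_FORCE_RELEASE;
	case STAGE_OUTPUT_DQBUF:
		/* The kernel may still hold the buffer: keep the slot out. */
		return SLOT_LEAVE_DEAD_BUSY;
	}
	/* Unknown stage: fail loudly in debug builds, stay conservative otherwise. */
	assert(!"unknown sync stage");
	return SLOT_LEAVE_DEAD_BUSY;
}
```

Defaulting to the dead-busy action for an unrecognized stage mirrors the commit's preference for losing a slot over reusing a buffer in an unknown state.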
Phase 5 sonnet review caught this discriminator subtlety pre-commit.
Files: request_pool.{h,c}, surface.c.
# Track C — cap_pool race synthetic harness (tests/)
iter5 sonnet C4 / iter6 candidate A: cap_pool resolution-change
race was organically exercised by YT's quality renegotiations
(iter6 close, 4 cap_pool_init events clean) but had no
deterministic regression test.
tests/cap_pool_probe_pattern.c — ~170-line C program: opens
libva display, vaCreateConfig, vaCreateSurfaces(small) +
vaCreateContext (triggers OUTPUT pool init at small resolution),
dispose, vaCreateSurfaces(big) + vaCreateContext (forces S_FMT
on the new resolution against an in-use OUTPUT pool — the actual
race-hitting path).
Phase 5 sonnet flagged that without vaCreateContext the test
would pass trivially (OUTPUT pool never init'd, REQBUFS(0) on
empty queue is a no-op). Fixed before commit.
tests/run_cap_pool_probe.sh — runner; greps driver stderr for
REQBUFS / EBUSY / "Unable to set format" race indicators.
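The runner's race-indicator check is a case-insensitive substring match over log lines. A minimal C sketch of the same predicate follows; `contains_nocase` and `line_has_race_indicator` are hypothetical helpers written for illustration (the runner itself uses grep), and the indicator strings are the ones the runner script searches for.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive substring test (portable stand-in for strcasestr). */
static int contains_nocase(const char *haystack, const char *needle)
{
	size_t i, j;

	assert(haystack != NULL && needle != NULL);
	for (i = 0; haystack[i] != '\0'; i++) {
		for (j = 0; needle[j] != '\0'; j++) {
			if (haystack[i + j] == '\0' ||
			    tolower((unsigned char)haystack[i + j]) !=
			    tolower((unsigned char)needle[j]))
				break;
		}
		if (needle[j] == '\0')
			return 1;	/* whole needle matched at offset i */
	}
	return needle[0] == '\0';	/* empty needle matches anything */
}

/* Returns 1 if the line mentions any of the race indicators. */
static int line_has_race_indicator(const char *line)
{
	static const char *const indicators[] = {
		"REQBUFS",
		"EBUSY",
		"Unable to request buffers",
		"Unable to set format",
	};
	size_t k;

	for (k = 0; k < sizeof(indicators) / sizeof(indicators[0]); k++) {
		if (contains_nocase(line, indicators[k]))
			return 1;
	}
	return 0;
}
```

Because the match is purely textual, any line that mentions one of these tokens counts as a hit, including informational lines printed by a test harness, so only driver output should be fed to such a check.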
# Track A — msync pixel-correctness verify harness (tests/)
iter5 sweep removed msync(MS_SYNC|MS_INVALIDATE) from CAPTURE
DQBUF path. iter5 sonnet C3 flagged: no formal pixel verification.
tests/run_msync_pixel_verify.sh — runs FFmpeg SW decode (libavcodec
reference) and FFmpeg HW decode (via our v4l2_request driver),
compares NV12 byte streams. Probes fixture dimensions via ffprobe
and uses crop=$W:$H after hwdownload to normalize MB-padding
artifacts (hantro pads height to 16-line align; SW returns
crop-aligned).
Phase 5 sonnet flagged the stride-mismatch false-failure risk
pre-commit. Fixed: explicit crop + diagnostic that distinguishes
genuine pixel divergence from MB-padding stride artifacts.
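The size arithmetic behind that diagnostic can be sketched as follows. This is an illustrative model with hypothetical names (`align_up`, `nv12_stream_bytes`, `classify_hw_size`); the script does the same computation in shell arithmetic and, unlike this sketch, first compares the HW size against the measured SW size rather than a computed one. It assumes NV12 frames of `w * h * 3 / 2` bytes and tries 16- and 32-line height alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum size_verdict {
	SIZE_MATCH,		/* HW stream has the visible-frame size */
	SIZE_PADDING_ARTIFACT,	/* HW stream matches an MB-padded height */
	SIZE_MISMATCH,		/* neither: frame count or packing differs */
};

/* Round value up to the next multiple of align. */
static uint64_t align_up(uint64_t value, uint64_t align)
{
	assert(align > 0);
	return (value + align - 1) / align * align;
}

/* NV12: full-resolution luma plane plus half-size interleaved chroma. */
static uint64_t nv12_stream_bytes(uint64_t w, uint64_t h, uint64_t frames)
{
	return w * h * 3 / 2 * frames;
}

static enum size_verdict classify_hw_size(uint64_t w, uint64_t h,
					  uint64_t frames, uint64_t hw_bytes)
{
	static const uint64_t pads[] = { 16, 32 };
	size_t i;

	if (hw_bytes == nv12_stream_bytes(w, h, frames))
		return SIZE_MATCH;

	/* A size equal to the padded layout means the crop did not apply. */
	for (i = 0; i < sizeof(pads) / sizeof(pads[0]); i++) {
		if (hw_bytes == nv12_stream_bytes(w, align_up(h, pads[i]), frames))
			return SIZE_PADDING_ARTIFACT;
	}
	return SIZE_MISMATCH;
}
```

For a 1920x1080 fixture and 100 frames, the visible layout is 311040000 bytes and the 16-line padded layout (height 1088) is 313344000 bytes, so a HW capture of the latter size points at padding, not at corrupted pixels.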
# Phase 5 sonnet code review
Verdict: APPROVE-WITH-CHANGES. Three actionable findings, all
addressed before this commit:
1. surface.c error path: separated OUTPUT-DQBUF-failure into
error_buffer_indeterminate label, slot stays dead-busy
2. cap_pool_probe_pattern.c: added vaCreateContext to actually
exercise the OUTPUT pool init at the small resolution
3. run_msync_pixel_verify.sh: explicit crop on HW path,
stride-mismatch diagnostic distinguished from corruption
Empirical verification (Phase 6+7 deploy + run): pending operator
ohm-tools availability.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
--- a/src/request_pool.c
+++ b/src/request_pool.c
@@ -41,6 +41,7 @@ int request_pool_init(struct request_pool *pool, int video_fd, int media_fd,
 
 	pool->count = count;
 	pool->next = 0;
+	pool->media_fd = media_fd;	/* iter7: kept for force_release re-alloc */
 
 	for (i = 0; i < count; i++)
 		pool->slots[i].request_fd = -1;
@@ -152,6 +153,62 @@ void request_pool_release(struct request_pool *pool, unsigned int index)
 	}
 }
 
+void request_pool_force_release(struct request_pool *pool, unsigned int index)
+{
+	struct request_pool_slot *slot;
+	unsigned int i;
+
+	if (pool == NULL || pool->slots == NULL)
+		return;
+
+	slot = NULL;
+	for (i = 0; i < pool->count; i++) {
+		if (pool->slots[i].index == index) {
+			slot = &pool->slots[i];
+			break;
+		}
+	}
+	if (slot == NULL)
+		return;
+
+	/*
+	 * Try to recover the kernel-side request object via REINIT first.
+	 * REINIT is the cheap path: kernel resets the request in place,
+	 * fd stays valid, slot can be reused immediately.
+	 */
+	if (slot->request_fd >= 0 && media_request_reinit(slot->request_fd) == 0) {
+		slot->busy = false;
+		return;
+	}
+
+	/*
+	 * REINIT failed (or slot's fd was already invalid). Close the fd
+	 * and try to allocate a fresh one. This costs an extra ioctl pair
+	 * relative to the REINIT happy path but keeps the slot usable.
+	 *
+	 * NOTE: alloc may return the same lowest-free fd number that was
+	 * just closed. That's fine here because (a) this is a rare error-
+	 * recovery path, not the per-frame happy path, and (b) the slot's
+	 * V4L2 buffer has already been DQBUF'd by this point (or is in an
+	 * indeterminate state we can't recover from regardless), so the
+	 * iter6 race condition (cross-slot fd-reuse against a kernel
+	 * buffer in mid-cleanup) does not apply.
+	 */
+	if (slot->request_fd >= 0)
+		close(slot->request_fd);
+	slot->request_fd = media_request_alloc(pool->media_fd);
+	if (slot->request_fd < 0) {
+		/*
+		 * Realloc failed. Slot is now permanently dead; leave
+		 * busy=true so acquire skips it. Pool capacity is
+		 * effectively reduced by 1 until pool destroy.
+		 */
+		return;
+	}
+
+	slot->busy = false;
+}
+
 struct request_pool_slot *request_pool_slot(struct request_pool *pool,
 					    unsigned int index)
 {
--- a/src/request_pool.h
+++ b/src/request_pool.h
@@ -49,6 +49,8 @@ struct request_pool {
 	struct request_pool_slot *slots;
 	unsigned int count;
 	unsigned int next;	/* round-robin acquire cursor */
+	int media_fd;		/* iter7: kept for
+				 * force_release re-alloc */
 	bool initialized;
 };
 
@@ -79,6 +81,21 @@ int request_pool_acquire(struct request_pool *pool);
  */
 void request_pool_release(struct request_pool *pool, unsigned int index);
 
+/*
+ * iter7: error-recovery release. Called from RequestSyncSurface error
+ * paths that fail before the OUTPUT DQBUF attempt (request queue, wait,
+ * or REINIT), when the slot's request_fd is in an undefined state.
+ * REINITs the fd; if REINIT fails (kernel-side request object too far
+ * gone), closes the fd and re-allocs a fresh one. If realloc also
+ * fails, the slot is left busy=true (effectively dead; pool->count is
+ * unchanged, usable capacity drops by 1) until driver terminate. Other
+ * slots are unaffected.
+ *
+ * Caller passes the V4L2 buffer index from request_pool_acquire().
+ */
+void request_pool_force_release(struct request_pool *pool,
+				unsigned int index);
+
 /*
  * Look up the pool slot owning a given V4L2 buffer index. Returns
  * pointer to the slot on success, NULL if the index is out of range.
--- a/src/surface.c
+++ b/src/surface.c
@@ -484,7 +484,14 @@ VAStatus RequestSyncSurface(VADriverContextP context, VASurfaceID surface_id)
 				 surface_object->source_index, 1);
 	if (rc < 0) {
 		status = VA_STATUS_ERROR_OPERATION_FAILED;
-		goto error;
+		/*
+		 * iter7: OUTPUT DQBUF failed. The V4L2 buffer is in an
+		 * indeterminate kernel state; it may still be QUEUED. Do
+		 * NOT return the slot to acquire-rotation: the next QBUF
+		 * on it would EINVAL. Jump to the dedicated label, which
+		 * skips force_release so the slot stays dead-busy.
+		 */
+		goto error_buffer_indeterminate;
 	}
 
 	/*
@@ -523,21 +530,58 @@ VAStatus RequestSyncSurface(VADriverContextP context, VASurfaceID surface_id)
 
 error:
 	/*
-	 * iter6: request_fd is owned by the OUTPUT pool slot. Do not
-	 * close here on error. The slot's REINIT may not have run if
-	 * we errored before it; the slot stays busy=true and is not
-	 * returned to acquire-rotation until RequestTerminate runs
-	 * request_pool_destroy. (The pool is driver-wide; it survives
-	 * RequestDestroyContext.) That's a bounded leak we accept: with
-	 * pool size 16 and rare errors, slot starvation only matters
-	 * after many error cycles, at which point acquire returns -1
-	 * cleanly.
+	 * iter7: error recovery for the OUTPUT pool slot. If the surface
+	 * acquired a slot in BeginPicture (source_data != NULL indicates
+	 * an active borrow), reset the slot's request_fd via
+	 * request_pool_force_release so the slot returns to the
+	 * acquire-rotation. force_release tries REINIT first; falls back
+	 * to close+alloc if REINIT fails; leaves the slot dead-busy if
+	 * even alloc fails (other slots unaffected). Replaces iter6's
+	 * accepted bounded leak.
 	 *
-	 * TODO: a future iteration could add a request_pool_force_release
-	 * that REINITs the fd and frees the slot for error recovery.
+	 * Reachable from: media_request_queue / wait_completion / REINIT
+	 * failures. NOT reachable for OUTPUT-DQBUF failure (separate label
+	 * `error_buffer_indeterminate` below) because in that case the
+	 * V4L2 buffer is in an indeterminate kernel state and reusing the
+	 * slot would EINVAL on the next QBUF.
+	 *
+	 * If the surface never acquired a slot (source_data == NULL),
+	 * there is no slot to release; nothing to do.
 	 */
-	if (surface_object != NULL)
+	if (surface_object != NULL) {
+		if (surface_object->source_data != NULL) {
+			request_pool_force_release(&driver_data->output_pool,
+						   surface_object->source_index);
+			surface_object->source_data = NULL;
+			surface_object->source_size = 0;
+		}
 		surface_object->request_fd = -1;
+	}
+	goto complete;
+
+error_buffer_indeterminate:
+	/*
+	 * iter7: OUTPUT DQBUF failed after a successful REINIT. The kernel
+	 * V4L2 buffer is in an unknown state (possibly still QUEUED with
+	 * pending decode result, possibly half-dequeued, possibly stuck
+	 * in driver internals). The slot's request_fd has already been
+	 * REINIT'd to a clean state, but reusing the slot for a new
+	 * decode would QBUF on a buffer the kernel may still hold,
+	 * triggering exactly the iter6 race we eliminated for the happy
+	 * path.
+	 *
+	 * Leave the slot dead-busy: don't release, don't force_release.
+	 * Other slots are unaffected. If this fires repeatedly, the pool
+	 * leaks slots until starvation, at which point acquire returns -1
+	 * and BeginPicture cleanly propagates ALLOCATION_FAILED. This is
+	 * a strictly safer failure mode than reusing an indeterminate
+	 * V4L2 buffer.
+	 */
+	if (surface_object != NULL) {
+		surface_object->source_data = NULL;
+		surface_object->source_size = 0;
+		surface_object->request_fd = -1;
+	}
 
 complete:
 	return status;
--- /dev/null
+++ b/tests/cap_pool_probe_pattern.c
@@ -0,0 +1,174 @@
+/*
+ * cap_pool_probe_pattern.c: synthetic regression test for the
+ * iter5 sonnet C4 / iter6 candidate A "cap_pool resolution-change race."
+ *
+ * Exercises the surface-allocation pattern that originally tripped
+ * REQBUFS-EBUSY on the iter5-end driver: vaCreateSurfaces at one
+ * resolution, then vaDestroySurfaces, then vaCreateSurfaces at a
+ * different resolution. iter6's REINIT discipline + cap_pool's
+ * REQBUFS(0)-on-CAPTURE-and-OUTPUT during S_FMT-on-resolution-change
+ * (CreateSurfaces2 in surface.c) closes this race; this test anchors
+ * that fact with a deterministic repro.
+ *
+ * Build:
+ *   gcc -O2 -Wall -Wextra -o cap_pool_probe_pattern \
+ *       cap_pool_probe_pattern.c \
+ *       $(pkg-config --cflags --libs libva libva-drm)
+ *
+ * Run:
+ *   LIBVA_DRIVER_NAME=v4l2_request \
+ *   LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \
+ *   LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \
+ *   ./cap_pool_probe_pattern
+ *
+ * Pass criterion (on iter6 driver and later):
+ *   - Exit code 0
+ *   - No "REQBUFS" / "EBUSY" / "Unable to request buffers" /
+ *     "Unable to set format" lines on the v4l2-request driver's stderr
+ *   - vainfo or visual inspection confirms the test program reached
+ *     the "PASS" line on stdout
+ *
+ * Fail behavior pre-iter5: vaCreateSurfaces at the second resolution
+ * would emit REQBUFS-EBUSY because OUTPUT/CAPTURE buffers from the
+ * first allocation hadn't been torn down before S_FMT was attempted
+ * on the new resolution. iter5's CreateSurfaces2 added the dual
+ * REQBUFS(0) drain; iter6's REINIT keeps the OUTPUT pool's request_fd
+ * lifecycle clean across the destroy-recreate cycle.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <va/va.h>
+#include <va/va_drm.h>
+
+#define DRM_RENDER_NODE "/dev/dri/renderD128"
+
+static const char *va_status_str(VAStatus s)
+{
+	return vaErrorStr(s);
+}
+
+#define VA_OK_OR_FAIL(call, msg) do {				\
+	VAStatus _vs = (call);					\
+	if (_vs != VA_STATUS_SUCCESS) {				\
+		fprintf(stderr, "FAIL: %s: %s (0x%x)\n",	\
+			(msg), va_status_str(_vs), _vs);	\
+		return 10;					\
+	}							\
+} while (0)
+
+int main(void)
+{
+	int drm_fd;
+	VADisplay dpy;
+	int va_major = 0, va_minor = 0;
+	VAConfigID config = VA_INVALID_ID;
+	VAContextID context = VA_INVALID_ID;
+	VASurfaceID small_surfaces[4];
+	VASurfaceID big_surfaces[4];
+	const unsigned int small_w = 128, small_h = 128;
+	const unsigned int big_w = 1920, big_h = 1080;
+
+	/* Open render node + libva display. */
+	drm_fd = open(DRM_RENDER_NODE, O_RDWR);
+	if (drm_fd < 0) {
+		fprintf(stderr, "FAIL: open(%s): %s\n",
+			DRM_RENDER_NODE, strerror(errno));
+		return 1;
+	}
+
+	dpy = vaGetDisplayDRM(drm_fd);
+	if (dpy == NULL) {
+		fprintf(stderr, "FAIL: vaGetDisplayDRM returned NULL\n");
+		close(drm_fd);
+		return 2;
+	}
+
+	VA_OK_OR_FAIL(vaInitialize(dpy, &va_major, &va_minor),
+		      "vaInitialize");
+	printf("libva %d.%d initialized via %s\n", va_major, va_minor,
+	       DRM_RENDER_NODE);
+
+	/*
+	 * vaCreateConfig with H.264 Main + VLD entrypoint forces our
+	 * driver's RequestCreateConfig to set up the H.264 decode path,
+	 * which is the path that reaches CreateSurfaces2 (and the
+	 * resolution-change handling there).
+	 */
+	VA_OK_OR_FAIL(vaCreateConfig(dpy, VAProfileH264Main, VAEntrypointVLD,
+				     NULL, 0, &config),
+		      "vaCreateConfig(H264Main, VLD)");
+
+	/* Phase 1: allocate small probe-pattern surfaces + context.
+	 *
+	 * vaCreateContext on our driver triggers RequestCreateContext, which
+	 * runs the OUTPUT pool's request_pool_init (allocates 16 OUTPUT
+	 * V4L2 buffers via VIDIOC_CREATE_BUFS at the small resolution) and
+	 * the device-init S_EXT_CTRLS (DECODE_MODE / START_CODE). Without
+	 * the context, vaCreateSurfaces alone would not exercise the path
+	 * that the iter5 C4 race fired on (REQBUFS-EBUSY when the pool
+	 * already has buffers at the previous resolution).
+	 */
+	printf("Phase 1: vaCreateSurfaces %ux%u, count=4; vaCreateContext\n",
+	       small_w, small_h);
+	VA_OK_OR_FAIL(vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
+				       small_w, small_h, small_surfaces, 4,
+				       NULL, 0),
+		      "vaCreateSurfaces (small)");
+	VA_OK_OR_FAIL(vaCreateContext(dpy, config,
+				      (int)small_w, (int)small_h, 0,
+				      small_surfaces, 4, &context),
+		      "vaCreateContext (small)");
+
+	/* Phase 2: dispose context + small surfaces. The driver-wide OUTPUT
+	 * pool stays initialized (RequestDestroyContext does NOT call
+	 * request_pool_destroy; only RequestTerminate does), so the
+	 * REQBUFS(0) drain on the next CreateSurfaces2 is the actual
+	 * race-hitting path.
+	 */
+	printf("Phase 2: vaDestroyContext; vaDestroySurfaces (small)\n");
+	VA_OK_OR_FAIL(vaDestroyContext(dpy, context),
+		      "vaDestroyContext (small)");
+	context = VA_INVALID_ID;
+	VA_OK_OR_FAIL(vaDestroySurfaces(dpy, small_surfaces, 4),
+		      "vaDestroySurfaces (small)");
+
+	/* Phase 3: allocate at the new (much larger) resolution. This is
+	 * where pre-iter5 hit REQBUFS-EBUSY because OUTPUT/CAPTURE buffers
+	 * from the small allocation hadn't been torn down before S_FMT on
+	 * the new size. iter5's CreateSurfaces2 added the dual REQBUFS(0)
+	 * drain; iter6's REINIT keeps the OUTPUT pool's request_fd
+	 * lifecycle clean across the destroy-recreate cycle.
+	 */
+	printf("Phase 3: vaCreateSurfaces %ux%u, count=4 (resolution change); vaCreateContext\n",
+	       big_w, big_h);
+	VA_OK_OR_FAIL(vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
+				       big_w, big_h, big_surfaces, 4,
+				       NULL, 0),
+		      "vaCreateSurfaces (big)");
+	VA_OK_OR_FAIL(vaCreateContext(dpy, config,
+				      (int)big_w, (int)big_h, 0,
+				      big_surfaces, 4, &context),
+		      "vaCreateContext (big)");
+
+	/* Phase 4: clean up. */
+	printf("Phase 4: cleanup\n");
+	VA_OK_OR_FAIL(vaDestroyContext(dpy, context),
+		      "vaDestroyContext (big)");
+	VA_OK_OR_FAIL(vaDestroySurfaces(dpy, big_surfaces, 4),
+		      "vaDestroySurfaces (big)");
+	VA_OK_OR_FAIL(vaDestroyConfig(dpy, config),
+		      "vaDestroyConfig");
+	VA_OK_OR_FAIL(vaTerminate(dpy),
+		      "vaTerminate");
+	close(drm_fd);
+
+	printf("PASS: cap_pool probe-pattern resolution-change handled cleanly.\n");
+	printf("Inspect driver stderr for absence of race-indicator lines.\n");
+	return 0;
+}
--- /dev/null
+++ b/tests/run_cap_pool_probe.sh
new file mode 100755
@@ -0,0 +1,50 @@
+#!/bin/bash
+# run_cap_pool_probe.sh: orchestrate the cap_pool probe-pattern regression test.
+#
+# Runs the cap_pool_probe_pattern test program with the v4l2_request driver
+# and grep-checks driver stderr for race indicators. Exits 0 on PASS, 1 on FAIL.
+#
+# Usage: ./run_cap_pool_probe.sh [path_to_test_binary]
+#   If no argument, looks for ./cap_pool_probe_pattern in the same directory.
+
+set -eu
+
+BIN="${1:-$(dirname "$0")/cap_pool_probe_pattern}"
+
+if [[ ! -x "$BIN" ]]; then
+    echo "FAIL: test binary not found or not executable: $BIN" >&2
+    echo "Build it first:" >&2
+    echo "  gcc -O2 -Wall -Wextra -o $BIN $(dirname "$0")/cap_pool_probe_pattern.c \\" >&2
+    echo "      \$(pkg-config --cflags --libs libva libva-drm)" >&2
+    exit 2
+fi
+
+LOG=$(mktemp -t cap_pool_probe.XXXXXX.log)
+trap 'rm -f "$LOG"' EXIT
+
+rc=0
+env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \
+    LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \
+    "$BIN" >"$LOG" 2>&1 || rc=$?
+
+echo "--- test program output ---"
+cat "$LOG"
+echo "--- end output ---"
+
+if [[ "$rc" -ne 0 ]]; then
+    echo "FAIL: test binary exited with rc=$rc" >&2
+    exit 1
+fi
+
+# Race indicators (case-insensitive grep; the test program's own "Inspect
+# driver stderr" hint is excluded). None should appear on iter6 and later.
+race_lines=$(grep -v '^Inspect driver stderr' "$LOG" | grep -iE 'REQBUFS|EBUSY|Unable to request buffers|Unable to set format' || true)
+if [[ -n "$race_lines" ]]; then
+    echo "FAIL: driver stderr contains race indicators:" >&2
+    echo "$race_lines" >&2
+    exit 1
+fi
+
+echo "PASS: cap_pool probe-pattern test clean (no race indicators)."
+exit 0
--- /dev/null
+++ b/tests/run_msync_pixel_verify.sh
new file mode 100755
@@ -0,0 +1,139 @@
+#!/bin/bash
+# run_msync_pixel_verify.sh: verify decoded pixel correctness post-msync-removal.
+#
+# iter5 sweep commit d3a299b removed msync(MS_SYNC|MS_INVALIDATE) from the
+# CAPTURE buffer DQBUF path alongside the iter1 patch-0010 hex-dump diagnostic.
+# iter5 Phase 5 sonnet caveat C3 flagged: no formal pixel-correctness check
+# was done. This script is that check.
+#
+# Approach:
+#   1. SW reference: ffmpeg libavcodec H.264 decode of bbb_1080p30_h264.mp4,
+#      first 100 frames, NV12 raw output -> sw_ref.yuv.
+#   2. HW subject: same input through our v4l2_request driver via
+#         ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi
+#                -i ... -vf hwdownload,format=nv12 -f rawvideo -pix_fmt nv12
+#      Captures the post-DQBUF buffer through libva readback, exercising
+#      the same code path we removed msync from.
+#   3. Compare: byte-for-byte cmp + per-frame sha256.
+#
+# Pass: byte-for-byte identical (or per-frame sha matches) -> msync
+#       verifiably unnecessary on this hardware/kernel; iter5 sonnet C3 closes.
+# Fail: divergence; restore msync in surface.c, re-run, document outcome.
+#
+# Usage: ./run_msync_pixel_verify.sh [fixture_path]
+#   If no argument, defaults to /home/mfritsche/fourier-test/bbb_1080p30_h264.mp4
+
+set -eu
+
+FIXTURE="${1:-/home/mfritsche/fourier-test/bbb_1080p30_h264.mp4}"
+N_FRAMES=100
+WORKDIR=$(mktemp -d -t msync_verify.XXXXXX)
+trap 'rm -rf "$WORKDIR"' EXIT
+
+if [[ ! -f "$FIXTURE" ]]; then
+    echo "FAIL: fixture not found: $FIXTURE" >&2
+    exit 2
+fi
+
+# Probe fixture dimensions for crop alignment of the HW path.
+# Hantro pads height to MB boundaries (16-line align); FFmpeg SW decode
+# returns crop-aligned (visible) frame size. Without explicit cropping
+# on the HW side, hwdownload + format=nv12 emits MB-padded frames, which
+# would diverge in size from SW even if pixels are correct.
+read -r FIXTURE_W FIXTURE_H < <(ffprobe -v error -select_streams v:0 \
+    -show_entries stream=width,height -of csv=p=0 "$FIXTURE" \
+    | tr ',' ' ') || true
+if [[ -z "${FIXTURE_W:-}" || -z "${FIXTURE_H:-}" ]]; then
+    echo "FAIL: ffprobe could not read width/height from $FIXTURE" >&2
+    exit 2
+fi
+
+echo "Fixture: $FIXTURE ($FIXTURE_W x $FIXTURE_H)"
+echo "Frames:  $N_FRAMES"
+echo "Workdir: $WORKDIR"
+echo
+
+# 1. SW reference
+echo "[1/3] FFmpeg SW decode -> sw_ref.yuv"
+ffmpeg -hide_banner -loglevel error -y \
+    -i "$FIXTURE" \
+    -frames:v "$N_FRAMES" \
+    -f rawvideo -pix_fmt nv12 \
+    "$WORKDIR/sw_ref.yuv"
+SW_BYTES=$(stat -c %s "$WORKDIR/sw_ref.yuv")
+SW_SHA=$(sha256sum "$WORKDIR/sw_ref.yuv" | cut -d' ' -f1)
+echo "      sw_ref.yuv: $SW_BYTES bytes, sha256=$SW_SHA"
+
+# 2. HW subject via libva v4l2_request
+# Explicit crop=$FIXTURE_W:$FIXTURE_H after hwdownload normalizes any
+# MB-padding the HW driver applies (hantro pads height to multiples of
+# 16). Without this crop, an iter6+ correct decode could falsely
+# diverge in total byte count from the SW reference.
+echo "[2/3] FFmpeg HW decode via v4l2_request driver -> hw_capture.yuv"
+env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 \
+    LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media0 \
+    ffmpeg -hide_banner -loglevel error -y \
+        -hwaccel vaapi -hwaccel_output_format vaapi \
+        -i "$FIXTURE" \
+        -vf "hwdownload,format=nv12,crop=$FIXTURE_W:$FIXTURE_H:0:0" \
+        -frames:v "$N_FRAMES" \
+        -f rawvideo -pix_fmt nv12 \
+        "$WORKDIR/hw_capture.yuv"
+HW_BYTES=$(stat -c %s "$WORKDIR/hw_capture.yuv")
+HW_SHA=$(sha256sum "$WORKDIR/hw_capture.yuv" | cut -d' ' -f1)
+echo "      hw_capture.yuv: $HW_BYTES bytes, sha256=$HW_SHA"
+echo
+
+# 3. Compare
+echo "[3/3] Compare"
+if [[ "$SW_BYTES" -ne "$HW_BYTES" ]]; then
+    # Diagnose stride/padding artifacts before declaring pixel
+    # corruption. With explicit crop in step 2 this should not
+    # happen, but if a future kernel change shifts the alignment
+    # we want a clear diagnostic, not a false pixel-corruption
+    # accusation.
+    EXPECTED_SW=$(( FIXTURE_W * FIXTURE_H * 3 / 2 * N_FRAMES ))
+    for PAD in 16 32; do
+        PADDED_H=$(( (FIXTURE_H + PAD - 1) / PAD * PAD ))
+        EXPECTED_PADDED=$(( FIXTURE_W * PADDED_H * 3 / 2 * N_FRAMES ))
+        if [[ "$HW_BYTES" -eq "$EXPECTED_PADDED" ]]; then
+            echo "DIAGNOSTIC: HW size $HW_BYTES matches MB-padded layout" >&2
+            echo "      ($FIXTURE_W x $PADDED_H, $PAD-line align). The crop=$FIXTURE_W:$FIXTURE_H" >&2
+            echo "      filter step did not normalize. Check FFmpeg version / hwdownload behavior." >&2
+            echo "      This is a stride artifact, not pixel corruption." >&2
+            exit 3
+        fi
+    done
+    echo "FAIL: size mismatch (SW=$SW_BYTES vs HW=$HW_BYTES, expected $EXPECTED_SW)" >&2
+    echo "      Different frame count or NV12 packing; investigate." >&2
+    exit 1
+fi
+
+if [[ "$SW_SHA" == "$HW_SHA" ]]; then
+    echo "PASS: byte-for-byte identical."
+    echo "      msync removal verified safe on this hardware/kernel."
+    exit 0
+fi
+
+# Per-frame divergence analysis on full-buffer mismatch.
+echo "Buffer-level sha differs. Computing per-frame divergence..."
+FRAME_SIZE=$(( SW_BYTES / N_FRAMES ))
+DIVERGENT=0
+for ((i = 0; i < N_FRAMES; i++)); do
+    OFFSET=$(( i * FRAME_SIZE ))
+    SW_FRAME_SHA=$(dd if="$WORKDIR/sw_ref.yuv" bs="$FRAME_SIZE" \
+        count=1 skip="$i" 2>/dev/null | sha256sum | cut -d' ' -f1)
+    HW_FRAME_SHA=$(dd if="$WORKDIR/hw_capture.yuv" bs="$FRAME_SIZE" \
+        count=1 skip="$i" 2>/dev/null | sha256sum | cut -d' ' -f1)
+    if [[ "$SW_FRAME_SHA" != "$HW_FRAME_SHA" ]]; then
+        DIVERGENT=$(( DIVERGENT + 1 ))
+        [[ "$DIVERGENT" -le 5 ]] && \
+            echo "  frame $i: SW=$SW_FRAME_SHA HW=$HW_FRAME_SHA"
+    fi
+done
+
+echo "FAIL: $DIVERGENT / $N_FRAMES frames diverge from SW reference."
+echo "      Action: restore msync(MS_SYNC|MS_INVALIDATE) in surface.c"
+echo "      RequestSyncSurface DQBUF path; re-run this script."
+exit 1