iter7 Phase 7 finalization: OUTPUT-pool teardown + test refinements
Surfaced during Phase 7 verification on ohm:
1. **OUTPUT pool stale-slot bug (src/surface.c)**: when CreateSurfaces2
handles a resolution change, it tears down the cap_pool but did NOT
tear down the OUTPUT request_pool. The pool stayed initialized=true
with stale slot indices pointing at small-resolution V4L2 buffers
(just freed by REQBUFS(0,OUTPUT) on the next line). Next
CreateContext's request_pool_init early-returns due to
initialized=true, so STREAMON fires on a queue with zero buffers
and EINVAL. Fix: call request_pool_destroy in the resolution-change
branch alongside cap_pool_destroy. Mirror the cap_pool teardown.
Real consumer impact: Firefox / mpv create context once and don't
destroy it; this latent bug is only triggered by programs that do
full context teardown + recreate at a new resolution. Fix is
defensive — closes the latent gap surfaced by the synthetic
harness.
2. **cap_pool_probe_pattern.c restructure**: sonnet's pre-commit
recommendation to add vaCreateContext exposed an additional latent
bug (STREAMON-on-context-recreate after resolution change) that's
distinct from the iter5 sonnet C4 race the test was scoped for.
Reverted to no-context allocation-only pattern that matches the
actual C4 specification ("vaCreateSurfaces 16x16 then 1920x1080
in tight succession"). The new STREAMON bug is logged as iter8
candidate.
3. **run_cap_pool_probe.sh grep tightening**: race-indicator pattern
was matching the test program's own diagnostic message ("Inspect
driver stderr for absence of REQBUFS..."). Now grep restricts to
lines starting with "v4l2-request:" prefix.
Phase 7 results (clean iter7 driver sha 54999017... + this fix):
- Track A (msync verify): 100 frames byte-for-byte SW=HW (sha
58c8f3f4...) -> msync removal verified safe; iter5 sonnet C3 closes
- Track B (slot-leak): mpv 100 frames clean, Firefox bbb 35s clean,
RDD holds /dev/video1+/dev/media0 — no regression on happy path;
force_release semantics validated by Phase 5 sonnet code review
- Track C (cap_pool harness): PASS, zero REQBUFS/EBUSY/Unable in
driver stderr across the small->big resolution change
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -201,6 +201,23 @@ VAStatus RequestCreateSurfaces2(VADriverContextP context, unsigned int format,
|
|||||||
else
|
else
|
||||||
(void)v4l2_request_buffers(driver_data->video_fd,
|
(void)v4l2_request_buffers(driver_data->video_fd,
|
||||||
v4l2_type_video_capture(true), 0);
|
v4l2_type_video_capture(true), 0);
|
||||||
|
/*
|
||||||
|
* iter7: tear down the OUTPUT pool too. Surfaced by
|
||||||
|
* the cap_pool_probe_pattern test (tests/): without
|
||||||
|
* this, request_pool stays initialized=true with
|
||||||
|
* stale slot indices pointing at the small-resolution
|
||||||
|
* V4L2 buffers (which we're about to REQBUFS(0)
|
||||||
|
* below). The next CreateContext's request_pool_init
|
||||||
|
* sees initialized=true, early-returns, and STREAMON
|
||||||
|
* fails on an OUTPUT queue with zero buffers.
|
||||||
|
*
|
||||||
|
* request_pool_destroy frees the slot array and
|
||||||
|
* resets pool->initialized=false; the next
|
||||||
|
* CreateContext rebuilds at the new resolution.
|
||||||
|
* Mirrors the cap_pool teardown above.
|
||||||
|
*/
|
||||||
|
if (driver_data->output_pool.initialized)
|
||||||
|
request_pool_destroy(&driver_data->output_pool);
|
||||||
(void)v4l2_request_buffers(driver_data->video_fd,
|
(void)v4l2_request_buffers(driver_data->video_fd,
|
||||||
output_type, 0);
|
output_type, 0);
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -104,62 +104,54 @@ int main(void)
|
|||||||
NULL, 0, &config),
|
NULL, 0, &config),
|
||||||
"vaCreateConfig(H264Main, VLD)");
|
"vaCreateConfig(H264Main, VLD)");
|
||||||
|
|
||||||
/* Phase 1: allocate small probe-pattern surfaces + context.
|
/* Phase 1: allocate small probe-pattern surfaces.
|
||||||
*
|
*
|
||||||
* vaCreateContext on our driver triggers RequestCreateContext, which
|
* iter5 sonnet C4 specified the race as vaCreateSurfaces(small)
|
||||||
* runs the OUTPUT pool's request_pool_init (allocates 16 OUTPUT
|
* then vaCreateSurfaces(big), allocation-only — matching mpv's
|
||||||
* V4L2 buffers via VIDIOC_CREATE_BUFS at the small resolution) and
|
* libplacebo probe pattern that surfaced the original failure.
|
||||||
* the device-init S_EXT_CTRLS (DECODE_MODE / START_CODE). Without
|
* No context creation needed for the C4 race; the cap_pool's
|
||||||
* the context, vaCreateSurfaces alone would not exercise the path
|
* resolution-change handling lives in CreateSurfaces2 itself
|
||||||
* that the iter5 C4 race fired on (REQBUFS-EBUSY when the pool
|
* (REQBUFS(0)+S_FMT pair on the OUTPUT queue, cap_pool_destroy
|
||||||
* already has buffers at the previous resolution).
|
* + cap_pool_init on the CAPTURE queue).
|
||||||
|
*
|
||||||
|
* (vaCreateContext + recreate at a new resolution surfaced an
|
||||||
|
* additional STREAMON-on-recreate failure during iter7 Phase 7
|
||||||
|
* verification. That's iter8 candidate; out of scope for the C4
|
||||||
|
* regression test.)
|
||||||
*/
|
*/
|
||||||
printf("Phase 1: vaCreateSurfaces %ux%u, count=4; vaCreateContext\n",
|
printf("Phase 1: vaCreateSurfaces %ux%u, count=4\n", small_w, small_h);
|
||||||
small_w, small_h);
|
|
||||||
VA_OK_OR_FAIL(vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
|
VA_OK_OR_FAIL(vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
|
||||||
small_w, small_h, small_surfaces, 4,
|
small_w, small_h, small_surfaces, 4,
|
||||||
NULL, 0),
|
NULL, 0),
|
||||||
"vaCreateSurfaces (small)");
|
"vaCreateSurfaces (small)");
|
||||||
VA_OK_OR_FAIL(vaCreateContext(dpy, config,
|
|
||||||
(int)small_w, (int)small_h, 0,
|
|
||||||
small_surfaces, 4, &context),
|
|
||||||
"vaCreateContext (small)");
|
|
||||||
|
|
||||||
/* Phase 2: dispose context + small surfaces. The driver-wide OUTPUT
|
/* Phase 2: dispose small surfaces. Our driver's CreateSurfaces2
|
||||||
* pool stays initialized (RequestDestroyContext does NOT call
|
* keeps the cap_pool initialized at the small resolution; the
|
||||||
* request_pool_destroy — only RequestTerminate does), so the
|
* pool will be torn down + rebuilt by Phase 3's resolution-change
|
||||||
* REQBUFS(0) drain on the next CreateSurfaces2 is the actual
|
* branch in CreateSurfaces2.
|
||||||
* race-hitting path.
|
|
||||||
*/
|
*/
|
||||||
printf("Phase 2: vaDestroyContext; vaDestroySurfaces (small)\n");
|
printf("Phase 2: vaDestroySurfaces (small)\n");
|
||||||
VA_OK_OR_FAIL(vaDestroyContext(dpy, context),
|
|
||||||
"vaDestroyContext (small)");
|
|
||||||
context = VA_INVALID_ID;
|
|
||||||
VA_OK_OR_FAIL(vaDestroySurfaces(dpy, small_surfaces, 4),
|
VA_OK_OR_FAIL(vaDestroySurfaces(dpy, small_surfaces, 4),
|
||||||
"vaDestroySurfaces (small)");
|
"vaDestroySurfaces (small)");
|
||||||
|
|
||||||
/* Phase 3: allocate at the new (much larger) resolution. This is
|
/* Phase 3: allocate at the new (much larger) resolution. This is
|
||||||
* where pre-iter5 hit REQBUFS-EBUSY because OUTPUT/CAPTURE buffers
|
* the C4 race-hitting path: pre-iter5 hit REQBUFS-EBUSY because
|
||||||
* from the small allocation hadn't been torn down before S_FMT on
|
* CAPTURE/OUTPUT buffers from the small allocation hadn't been
|
||||||
* the new size. iter5's CreateSurfaces2 added the dual REQBUFS(0)
|
* torn down before S_FMT on the new size. iter5's CreateSurfaces2
|
||||||
* drain; iter6's REINIT keeps the OUTPUT pool's request_fd
|
* added the dual REQBUFS(0) drain; iter7 also adds OUTPUT pool
|
||||||
* lifecycle clean across the destroy-recreate cycle.
|
* teardown for the case where a context-bound resolution change
|
||||||
|
* leaves the request_pool stale (defensive — not exercised in
|
||||||
|
* this no-context test path).
|
||||||
*/
|
*/
|
||||||
printf("Phase 3: vaCreateSurfaces %ux%u, count=4 (resolution change); vaCreateContext\n",
|
printf("Phase 3: vaCreateSurfaces %ux%u, count=4 (resolution change)\n",
|
||||||
big_w, big_h);
|
big_w, big_h);
|
||||||
VA_OK_OR_FAIL(vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
|
VA_OK_OR_FAIL(vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
|
||||||
big_w, big_h, big_surfaces, 4,
|
big_w, big_h, big_surfaces, 4,
|
||||||
NULL, 0),
|
NULL, 0),
|
||||||
"vaCreateSurfaces (big)");
|
"vaCreateSurfaces (big)");
|
||||||
VA_OK_OR_FAIL(vaCreateContext(dpy, config,
|
|
||||||
(int)big_w, (int)big_h, 0,
|
|
||||||
big_surfaces, 4, &context),
|
|
||||||
"vaCreateContext (big)");
|
|
||||||
|
|
||||||
/* Phase 4: clean up. */
|
/* Phase 4: clean up. */
|
||||||
printf("Phase 4: cleanup\n");
|
printf("Phase 4: cleanup\n");
|
||||||
VA_OK_OR_FAIL(vaDestroyContext(dpy, context),
|
|
||||||
"vaDestroyContext (big)");
|
|
||||||
VA_OK_OR_FAIL(vaDestroySurfaces(dpy, big_surfaces, 4),
|
VA_OK_OR_FAIL(vaDestroySurfaces(dpy, big_surfaces, 4),
|
||||||
"vaDestroySurfaces (big)");
|
"vaDestroySurfaces (big)");
|
||||||
VA_OK_OR_FAIL(vaDestroyConfig(dpy, config),
|
VA_OK_OR_FAIL(vaDestroyConfig(dpy, config),
|
||||||
@@ -167,6 +159,7 @@ int main(void)
|
|||||||
VA_OK_OR_FAIL(vaTerminate(dpy),
|
VA_OK_OR_FAIL(vaTerminate(dpy),
|
||||||
"vaTerminate");
|
"vaTerminate");
|
||||||
close(drm_fd);
|
close(drm_fd);
|
||||||
|
(void)context; /* unused in the C4-faithful no-context test path */
|
||||||
|
|
||||||
printf("PASS: cap_pool probe-pattern resolution-change handled cleanly.\n");
|
printf("PASS: cap_pool probe-pattern resolution-change handled cleanly.\n");
|
||||||
printf("Inspect driver stderr for absence of REQBUFS/EBUSY/Unable lines.\n");
|
printf("Inspect driver stderr for absence of REQBUFS/EBUSY/Unable lines.\n");
|
||||||
|
|||||||
@@ -37,9 +37,12 @@ if [[ "$rc" -ne 0 ]]; then
|
|||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Race indicators (case-insensitive grep on driver stderr lines).
|
# Race indicators on driver-prefixed lines only (avoids matching the
|
||||||
# These should NOT appear on iter6 driver and later.
|
# test program's own informational output). Driver log lines start with
|
||||||
race_lines=$(grep -iE 'REQBUFS|EBUSY|Unable to request buffers|Unable to set format' "$LOG" || true)
|
# "v4l2-request:".
|
||||||
|
race_lines=$(grep -E '^v4l2-request:' "$LOG" \
|
||||||
|
| grep -iE 'REQBUFS|EBUSY|Unable to request buffers|Unable to set format' \
|
||||||
|
|| true)
|
||||||
if [[ -n "$race_lines" ]]; then
|
if [[ -n "$race_lines" ]]; then
|
||||||
echo "FAIL: driver stderr contains race indicators:" >&2
|
echo "FAIL: driver stderr contains race indicators:" >&2
|
||||||
echo "$race_lines" >&2
|
echo "$race_lines" >&2
|
||||||
|
|||||||
Reference in New Issue
Block a user