forked from marfrit/libva-v4l2-request-fourier
ca4dd880070255fb62665b617308d35701813338
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
988b848908 |
iter7: A+B+C — slot-leak fix, cap_pool harness, msync verify harness
Closes three internal carry items in one fork commit. iter6 deferred
these as TODOs; iter7 lands the implementations + supporting tests.
# Track B — slot-leak error recovery (src/)
iter6 documented the RequestSyncSurface error paths as a "bounded
leak we accept" — slots stayed busy=true after REINIT/DQBUF failures
until RequestTerminate ran. With pool=16 and rare errors this was
acceptable, but a sustained-error scenario could starve the pool.
Adds request_pool_force_release(pool, index) which:
1. Tries media_request_reinit on the slot's fd (cheap path)
2. Falls back to close + media_request_alloc (recovery)
3. Leaves the slot dead-busy if even alloc fails (other slots
unaffected, pool capacity reduced by 1 until destroy)
Wires it into surface.c RequestSyncSurface error paths only for
errors before the OUTPUT-DQBUF attempt. After OUTPUT-DQBUF failure
the V4L2 buffer is in indeterminate kernel state, so a separate
error label (`error_buffer_indeterminate`) leaves the slot
dead-busy — reusing the slot would QBUF on a kernel-still-held
buffer and EINVAL.
Phase 5 sonnet review caught this discriminator subtlety pre-commit.
Files: request_pool.{h,c}, surface.c.
# Track C — cap_pool race synthetic harness (tests/)
iter5 sonnet C4 / iter6 candidate A: cap_pool resolution-change
race was organically exercised by YT's quality renegotiations
(iter6 close, 4 cap_pool_init events clean) but had no
deterministic regression test.
tests/cap_pool_probe_pattern.c — ~170-line C program: opens
libva display, vaCreateConfig, vaCreateSurfaces(small) +
vaCreateContext (triggers OUTPUT pool init at small resolution),
dispose, vaCreateSurfaces(big) + vaCreateContext (forces S_FMT
on the new resolution against an in-use OUTPUT pool — the actual
race-hitting path).
Phase 5 sonnet flagged that without vaCreateContext the test
would pass trivially (OUTPUT pool never init'd, REQBUFS(0) on
empty queue is a no-op). Fixed before commit.
tests/run_cap_pool_probe.sh — runner; greps driver stderr for
REQBUFS / EBUSY / "Unable to set format" race indicators.
# Track A — msync pixel-correctness verify harness (tests/)
iter5 sweep removed msync(MS_SYNC|MS_INVALIDATE) from CAPTURE
DQBUF path. iter5 sonnet C3 flagged: no formal pixel verification.
tests/run_msync_pixel_verify.sh — runs FFmpeg SW decode (libavcodec
reference) and FFmpeg HW decode (via our v4l2_request driver),
compares NV12 byte streams. Probes fixture dimensions via ffprobe
and uses crop=$W:$H after hwdownload to normalize MB-padding
artifacts (hantro pads height to 16-line align; SW returns
crop-aligned).
Phase 5 sonnet flagged the stride-mismatch false-failure risk
pre-commit. Fixed: explicit crop + diagnostic that distinguishes
genuine pixel divergence from MB-padding stride artifacts.
# Phase 5 sonnet code review
Verdict: APPROVE-WITH-CHANGES. Three actionable findings, all
addressed before this commit:
1. surface.c error path: separated OUTPUT-DQBUF-failure into
error_buffer_indeterminate label, slot stays dead-busy
2. cap_pool_probe_pattern.c: added vaCreateContext to actually
exercise the OUTPUT pool init at the small resolution
3. run_msync_pixel_verify.sh: explicit crop on HW path,
stride-mismatch diagnostic distinguished from corruption
Empirical verification (Phase 6+7 deploy + run): pending operator
ohm-tools availability.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a09c03c154 |
iter6 fix: per-OUTPUT-slot request_fd binding via REINIT
iter4 ( |
||
|
|
565f5c0de4 |
context: introduce request_pool, decouple OUTPUT buffers from surfaces
Commit 3 of the upstreamable plan (upstreamable_design.md §1, §5).
Replaces the prior per-surface OUTPUT-buffer ownership model with a
small driver-wide pool sized by codec pipeline depth (4 H.264 frames
in flight), allocated unconditionally regardless of caller's
num_render_targets.
Prior art (kernel UAPI dev-stateless-decoder.rst, ffmpeg
v4l2_request.c, Chromium V4L2StatelessVideoDecoder, GStreamer
v4l2slh264dec) all decouple OUTPUT and CAPTURE pool sizing. fourier's
"output_count == surfaces_count" model was a category error: OUTPUT
buffers are request-time bitstream slots, CAPTURE buffers are
picture-time DPB slots; their lifecycles and sizing are independent.
Changes:
* NEW src/request_pool.{c,h} (~200 LoC):
- request_pool_init(): CREATE_BUFS + per-slot QUERYBUF + mmap.
- request_pool_destroy(): munmap all, idempotent.
- request_pool_acquire(): round-robin claim; returns V4L2 buffer
index of an unused slot or -1.
- request_pool_release(): mark slot free for reuse.
- request_pool_slot(): accessor for ptr/size given a buffer index.
* src/request.h: add struct request_pool output_pool to request_data.
* src/context.c::RequestCreateContext: replace the per-surface
OUTPUT loop with a single request_pool_init() call (count=4,
independent of surfaces_count). Drop the now-unused locals
(length, offset, source_data, output_buffers_count, index,
index_base, i, surface_object). DELETES patch 0002's
"output_buffers_count = ... ? ... : 4" hack inline — the pool's
own count parameter supersedes it.
* src/picture.c::RequestBeginPicture: borrow a pool slot at frame
start, write its mmap pointer/size/index into the surface's
transient source_* fields. The fields stay (still useful as
a borrow handle that the existing codec_store_buffer memcpys
target), but no longer represent surface-permanent ownership.
Reset slices_size/slices_count here too (was implicit on first
Render).
* src/surface.c::RequestSyncSurface: after VIDIOC_DQBUF returns
the OUTPUT buffer, release the pool slot and clear the surface's
borrow handle. Fixes the segv on second-frame submission.
* src/surface.c::RequestDestroySurfaces: remove the munmap of
source_data — pool owns the mmap.
* src/request.c::RequestTerminate: call request_pool_destroy()
before close(video_fd) so munmaps still target a valid fd.
* src/meson.build: add request_pool.c and request_pool.h to the
sources/headers lists.
This commit removes 0002's OUTPUT-pool hack inline (the
"floor to 4" line is gone). The DECODE_MODE/START_CODE block in 0002
remains until commit 4 lands.
Build-verified clean on aarch64.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|