picture, request_pool: transparent OUTPUT-pool resize on bitstream overrun (#15) #16

Merged
marfrit merged 1 commits from claude-noether/libva-v4l2-request-fourier:noether/output-pool-resize-issue-15 into master 2026-05-21 11:23:09 +00:00
Owner

Closes #15 — the proper fix track for #13.

PR #14 left the OUTPUT-pool slot's sizeimage fixed for the life of the pool, returning VA_STATUS_ERROR_ALLOCATION_FAILED on overrun and forcing the libva consumer to drop the frame. This patch absorbs a mid-session resolution upshift transparently: on overrun, snapshot the in-flight surface's accumulated bytes, STREAMOFF the OUTPUT queue, REQBUFS(0), re-S_FMT with a bigger sizeimage hint, CREATE_BUFS, re-mmap, re-alloc per-slot media_request fds, STREAMON — then restore the bytes and continue. The triggering memcpy succeeds, the frame survives.

What changes

  1. src/request_pool.h / src/request_pool.crequest_pool_init gains three S_FMT-param args (pixelformat, picture_width, picture_height) which it stashes on the pool. New request_pool_resize(pool, new_sizeimage_min) runs the STREAMOFF → REQBUFS(0) → S_FMT(sizeimage-override) → CREATE_BUFS → mmap → media_request_alloc → STREAMON cycle.
  2. src/v4l2.h / src/v4l2.c — new v4l2_set_format_sizeimage() helper. v4l2_set_format() keeps the SOURCE_SIZE_MAX (1 MiB) default for CreateContext-time S_FMT; the resize path passes its own hint.
  3. src/picture.c — new static codec_store_buffer_ensure_capacity(). The three append sites in VASliceDataBufferType call it before each memcpy/memset. On overflow it does the save / release / resize / re-acquire / re-mirror / restore dance.
  4. src/context.c — pass the cached S_FMT params into request_pool_init.

Why the resize is safe at codec_store_buffer time

The inline-Sync-in-EndPicture pattern guarantees that when codec_store_buffer runs, exactly ONE OUTPUT pool slot is borrowed (the current render_surface_id's) and it has NOT yet been VIDIOC_QBUF'd to the kernel. ensure_capacity temporarily releases that slot (clean busy=false flip, no kernel state touched), runs the resize (which the pool guards with a busy-scan precondition), then re-acquires.

CAPTURE side is intentionally untouched. V4L2 stateless treats per-queue streaming independently; STREAMOFF on OUTPUT does not disturb the still-STREAMON CAPTURE queue. A genuine resolution-upshift CAPTURE-side budget mismatch then surfaces as a clean V4L2_BUF_FLAG_ERROR on the next DQBUF, handled by the existing surface error path.

Geometric growth

New sizeimage = max(need × 2, source_size × 2), capped at 1 GiB, rounded up to a 4 KiB page. Compute in size_t so the doubling cannot wrap at 2 GiB on 32-bit unsigned int before the clamp. The post-resize need > source_size re-check returns hard ERROR_ALLOCATION_FAILED rather than retrying — no infinite-loop possibility if the kernel rejects the hint.

On resize failure

The surface's slot has already been released by ensure_capacity when the resize fails. We re-acquire it (it's a clean busy=false slot the inline-Sync invariant guarantees no one else grabbed) so the surface's source_data / source_size / request_fd mirror stays internally consistent. Then surface VA_STATUS_ERROR_ALLOCATION_FAILED to libva — fallback behaviour matches pre-patch (consumer recreates the surface).

Independent review

Ran code-review pass (independent of this author): traced Begin/Render/End/Sync invariants in picture.c + surface.c, traced the OUTPUT-only V4L2 cycle's interaction with CAPTURE for rkvdec / hantro / daedalus_v4l2, audited the geometric-growth math for overflow + infinite-loop. Verdict: sound. One minor observation about the resize-failure surface state, addressed by the explicit re-acquire (see commit message + inline comment in codec_store_buffer_ensure_capacity).

Test plan

  • CI: libva-v4l2-request-fourier-aarch64 + -debian Gitea Actions jobs build clean.
  • On higgs (daedalus path): the #13 SIGSEGV repro (mpv --hwdec=vaapi-copy bbb_1080p30_h264.mp4 against a 720p-default daedalus session) decodes through the upshift instead of crashing; request_log shows the codec_store_buffer: OUTPUT-pool resize line at the first 1080p slice.
  • On fresnel: regress-check the 5 in-spec codecs (rkvdec H.264 / HEVC / VP9, hantro MPEG-2 / VP8) — no resize ever triggers (steady-state sizeimage suffices), vainfo enumerates all 8 codec profiles, no perf regression.
  • Confirm Firefox VAAPI decode no longer leaves the latent heap-overrun hazard (sub-sizeimage slices remain on the happy path, larger slices trigger the resize cleanly).
Closes #15 — the proper fix track for #13. PR #14 left the OUTPUT-pool slot's `sizeimage` fixed for the life of the pool, returning `VA_STATUS_ERROR_ALLOCATION_FAILED` on overrun and forcing the libva consumer to drop the frame. This patch absorbs a mid-session resolution upshift transparently: on overrun, snapshot the in-flight surface's accumulated bytes, STREAMOFF the OUTPUT queue, REQBUFS(0), re-`S_FMT` with a bigger `sizeimage` hint, CREATE_BUFS, re-mmap, re-`alloc` per-slot `media_request` fds, STREAMON — then restore the bytes and continue. The triggering memcpy succeeds, the frame survives. ## What changes 1. **`src/request_pool.h` / `src/request_pool.c`** — `request_pool_init` gains three `S_FMT`-param args (`pixelformat`, `picture_width`, `picture_height`) which it stashes on the pool. New `request_pool_resize(pool, new_sizeimage_min)` runs the STREAMOFF → REQBUFS(0) → S_FMT(sizeimage-override) → CREATE_BUFS → mmap → media_request_alloc → STREAMON cycle. 2. **`src/v4l2.h` / `src/v4l2.c`** — new `v4l2_set_format_sizeimage()` helper. `v4l2_set_format()` keeps the `SOURCE_SIZE_MAX` (1 MiB) default for `CreateContext`-time `S_FMT`; the resize path passes its own hint. 3. **`src/picture.c`** — new static `codec_store_buffer_ensure_capacity()`. The three append sites in `VASliceDataBufferType` call it before each memcpy/memset. On overflow it does the save / release / resize / re-acquire / re-mirror / restore dance. 4. **`src/context.c`** — pass the cached S_FMT params into `request_pool_init`. ## Why the resize is safe at `codec_store_buffer` time The inline-Sync-in-EndPicture pattern guarantees that when `codec_store_buffer` runs, exactly ONE OUTPUT pool slot is borrowed (the current `render_surface_id`'s) and it has NOT yet been `VIDIOC_QBUF`'d to the kernel. `ensure_capacity` temporarily releases that slot (clean `busy=false` flip, no kernel state touched), runs the resize (which the pool guards with a busy-scan precondition), then re-acquires. CAPTURE side is intentionally untouched. V4L2 stateless treats per-queue streaming independently; STREAMOFF on OUTPUT does not disturb the still-STREAMON CAPTURE queue. A genuine resolution-upshift CAPTURE-side budget mismatch then surfaces as a clean `V4L2_BUF_FLAG_ERROR` on the next DQBUF, handled by the existing surface error path. ## Geometric growth New `sizeimage = max(need × 2, source_size × 2)`, capped at 1 GiB, rounded up to a 4 KiB page. Compute in `size_t` so the doubling cannot wrap at 2 GiB on 32-bit `unsigned int` before the clamp. The post-resize `need > source_size` re-check returns hard `ERROR_ALLOCATION_FAILED` rather than retrying — no infinite-loop possibility if the kernel rejects the hint. ## On resize failure The surface's slot has already been released by `ensure_capacity` when the resize fails. We re-acquire it (it's a clean `busy=false` slot the inline-Sync invariant guarantees no one else grabbed) so the surface's `source_data` / `source_size` / `request_fd` mirror stays internally consistent. Then surface `VA_STATUS_ERROR_ALLOCATION_FAILED` to libva — fallback behaviour matches pre-patch (consumer recreates the surface). ## Independent review Ran code-review pass (independent of this author): traced Begin/Render/End/Sync invariants in `picture.c` + `surface.c`, traced the OUTPUT-only V4L2 cycle's interaction with CAPTURE for rkvdec / hantro / daedalus_v4l2, audited the geometric-growth math for overflow + infinite-loop. Verdict: sound. One minor observation about the resize-failure surface state, addressed by the explicit re-acquire (see commit message + inline comment in `codec_store_buffer_ensure_capacity`). ## Test plan - [ ] CI: `libva-v4l2-request-fourier-aarch64` + `-debian` Gitea Actions jobs build clean. - [ ] On higgs (daedalus path): the #13 SIGSEGV repro (`mpv --hwdec=vaapi-copy bbb_1080p30_h264.mp4` against a 720p-default daedalus session) decodes through the upshift instead of crashing; `request_log` shows the `codec_store_buffer: OUTPUT-pool resize` line at the first 1080p slice. - [ ] On fresnel: regress-check the 5 in-spec codecs (rkvdec H.264 / HEVC / VP9, hantro MPEG-2 / VP8) — no resize ever triggers (steady-state sizeimage suffices), `vainfo` enumerates all 8 codec profiles, no perf regression. - [ ] Confirm Firefox VAAPI decode no longer leaves the latent heap-overrun hazard (sub-`sizeimage` slices remain on the happy path, larger slices trigger the resize cleanly).
marfrit added 1 commit 2026-05-21 11:12:45 +00:00
Follow-up to #13 (PR #14, bounds-check floor).  When a stream-level
resolution upshift mid-session pushes an Annex-B start code / VP8
header pad / slice payload past the OUTPUT pool slot's mmap, the
bounds check used to return VA_STATUS_ERROR_ALLOCATION_FAILED and
force the libva consumer to recreate the surface (losing the frame).
This patch absorbs the resize transparently:

  1. codec_store_buffer's three append sites call a new
     codec_store_buffer_ensure_capacity() before each memcpy/memset.
  2. On overflow, ensure_capacity snapshots the in-flight surface's
     accumulated bytes, temporarily releases its OUTPUT pool slot,
     and calls request_pool_resize.
  3. request_pool_resize STREAMOFFs the OUTPUT queue, munmaps every
     slot, closes every per-slot media-request fd, REQBUFS(0)s the
     V4L2 buffers, re-issues S_FMT with a sizeimage hint = 2× the
     required total (capped at 1 GiB, rounded up to a 4 KiB page),
     CREATE_BUFSes the original slot count, per-slot queries +
     mmaps + media_request_allocs, and STREAMONs.
  4. ensure_capacity re-acquires a pool slot, re-mirrors
     source_{index,data,size,request_fd} onto the surface, and
     restores the saved bytes via memcpy.

The cached S_FMT params (pixelformat, picture_width, picture_height)
are stashed on the request_pool at init time so the resize is fully
self-contained — caller passes only the new sizeimage hint.

A new v4l2_set_format_sizeimage() helper accepts an explicit
sizeimage override; v4l2_set_format keeps the SOURCE_SIZE_MAX (1 MiB)
default for CreateContext-time S_FMT.

The pre-condition for the resize is "no pool slot may be borrowed."
The inline-Sync-in-EndPicture pattern (RequestEndPicture calls
RequestSyncSurface before returning) guarantees that during
codec_store_buffer, the only borrowed slot is the current
render_surface_id's — which the resize trigger explicitly releases
before invoking the pool function. request_pool_resize asserts the
invariant via a busy-scan and bails loudly if anyone breaks it
rather than corrupting in-flight V4L2 state.

On resize failure: re-acquire the just-released slot (it was a
clean busy=false flip; the resize aborted before tearing it down
in the common case, or zeroed its mmap fields in the late-abort
case — either way the re-acquire keeps surface_object's mirror
internally consistent) and surface the original
VA_STATUS_ERROR_ALLOCATION_FAILED so libva clients fall back to
surface recreation as before this patch.

CAPTURE side is untouched — the V4L2 stateless API treats per-queue
streaming independently, so STREAMOFF/STREAMON on OUTPUT does not
disrupt the CAPTURE queue, and a resolution-upshift CAPTURE budget
mismatch becomes a clean V4L2_BUF_FLAG_ERROR on the next DQBUF
(handled by the existing surface error path).

Closes marfrit/libva-v4l2-request-fourier#15.
marfrit merged commit c454618ae1 into master 2026-05-21 11:23:09 +00:00
marfrit deleted branch noether/output-pool-resize-issue-15 2026-05-21 11:23:09 +00:00
Sign in to join this conversation.
No Reviewers
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marfrit/libva-v4l2-request-fourier#16