4 Commits

Author SHA1 Message Date
claude-noether bfcb286031 picture: bounds-check codec_store_buffer slice writes against source_size
surface_object->source_data points at an OUTPUT-pool mmap of fixed
size source_size, negotiated by v4l2_query_buffer at request_pool_init
time (kernel sizeimage at S_FMT).  codec_store_buffer's
VASliceDataBufferType branch appended to it at three sites (H.264 Annex-B
start code, VP8 uncompressed-header pad, slice payload) without
consulting that capacity — a stream-level resolution upshift would walk
past the mmap and SIGSEGV inside the memcpy (mpv --hwdec=vaapi-copy on
the daedalus path, issue #13) or corrupt adjacent heap (Firefox RDD).

Add a check at each append site that fails the RenderPicture call with
VA_STATUS_ERROR_ALLOCATION_FAILED when slices_size+payload exceeds
source_size, and logs the over-budget request for postmortem.
libavcodec recreates the surface at the new dimensions on the next
BeginPicture, so a refused upshift slice is recoverable.
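
The check described above can be sketched in isolation. This is a minimal, hypothetical illustration and not the driver's code: `struct out_buf` and `out_buf_append` are made-up names standing in for the `source_data` / `source_size` / `slices_size` bookkeeping, and the comparison is written as `len > cap - used` so the arithmetic cannot wrap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the surface's OUTPUT buffer bookkeeping. */
struct out_buf {
	uint8_t *data; /* fixed-size mapping (source_data in the driver) */
	size_t cap;    /* capacity of the mapping (source_size) */
	size_t used;   /* bytes appended so far (slices_size) */
};

/*
 * Append len bytes, or refuse and leave the buffer untouched.
 * Returns 0 on success, -1 if the append would exceed the capacity.
 */
static int out_buf_append(struct out_buf *buf, const void *src, size_t len)
{
	if (buf->used > buf->cap || len > buf->cap - buf->used)
		return -1;

	memcpy(buf->data + buf->used, src, len);
	buf->used += len;
	return 0;
}
```

A refused append leaves `used` unchanged, so the caller can report the failure and start over with a larger buffer.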

Doesn't address the root cause (surfaces should be re-created on
resolution change, or source_data should be grown on demand) but
removes the memory-safety hazard while the larger refactor waits.

Closes marfrit/libva-v4l2-request-fourier#13.
2026-05-21 12:14:48 +02:00
marfrit 77f9236466 Merge pull request 'av1: populate V4L2_CID_STATELESS_AV1_SEQUENCE in codec_set_controls (#11 libva side)' (#12) from claude-noether/libva-v4l2-request-fourier:noether/av1-set-controls-bug-11 into master
Reviewed-on: #12
2026-05-20 19:14:49 +00:00
claude-noether 9fa18f2312 av1: populate V4L2_CID_STATELESS_AV1_SEQUENCE in codec_set_controls
Implements the libva-side portion of issue #11 — replaces PR #10's
no-op AV1 dispatch with a real av1_set_controls that maps VAAPI's
VADecPictureParameterBufferAV1.seq_info_fields + scalar fields onto
struct v4l2_ctrl_av1_sequence (the kernel uAPI control declared at
linux/v4l2-controls.h:2891-2919).

Daemon-track context (issue #11 daemon side, operator-owned):
ffmpeg-vaapi splits the AV1 bitstream client-side and strips the
OBU_SEQUENCE_HEADER before delivery; the V4L2 OUTPUT buffer contains
only OBU_FRAME_HEADER + OBU_TILE_GROUP.  libdav1d in the daedalus
daemon cannot parse this — it expects a complete OBU stream.  The
daemon side has to synthesise OBU_SEQUENCE_HEADER from the SEQUENCE
ctrl and prepend it to the slice bitstream.  This libva-side change
just makes the SEQUENCE ctrl populated and queued via S_EXT_CTRLS;
the daemon track is the consumer.
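
For orientation, the framing step the daemon side would need can be sketched as below. This is an assumption-laden illustration, not the daemon's code: `demo_leb128_put` and `demo_obu_write` are hypothetical names, the sequence-header payload is treated as opaque bytes (its synthesis from the SEQUENCE ctrl is not shown), and only the OBU header byte layout and LEB128 size field follow the AV1 bitstream format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_OBU_SEQUENCE_HEADER 1 /* obu_type value for a sequence header */

/* Encode value as unsigned LEB128; returns the number of bytes written. */
static size_t demo_leb128_put(uint8_t *out, uint32_t value)
{
	size_t n = 0;

	do {
		uint8_t byte = value & 0x7f;

		value >>= 7;
		if (value)
			byte |= 0x80; /* more bytes follow */
		out[n++] = byte;
	} while (value);

	return n;
}

/*
 * Write one OBU (header byte, size field, payload) into out.
 * Returns the total length, or 0 if out is too small.
 */
static size_t demo_obu_write(uint8_t *out, size_t cap, unsigned int obu_type,
			     const uint8_t *payload, uint32_t payload_len)
{
	uint8_t size_field[5]; /* a 32-bit value needs at most 5 LEB128 bytes */
	size_t size_len = demo_leb128_put(size_field, payload_len);

	if (cap < 1 + size_len || payload_len > cap - 1 - size_len)
		return 0;

	/* forbidden=0 | obu_type (4 bits) | extension=0 | has_size=1 | reserved=0 */
	out[0] = (uint8_t)(((obu_type & 0xf) << 3) | (1u << 1));
	memcpy(out + 1, size_field, size_len);
	memcpy(out + 1 + size_len, payload, payload_len);
	return 1 + size_len + payload_len;
}
```

The framed sequence header would then be written ahead of the OBU_FRAME_HEADER + OBU_TILE_GROUP bytes already present in the OUTPUT buffer.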

Three small touch points beyond the new src/av1.{c,h}:

  - src/surface.h: add an av1 leaf to surface->params holding
    VADecPictureParameterBufferAV1.  Slice params intentionally
    absent — the daedalus daemon consumes the slice OBU bytes
    directly from the OUTPUT buffer; no per-tile-group struct →
    OBU re-synthesis required from libva today.
  - src/picture.c: copy the picture-param buffer into the new leaf
    in RenderPicture, mirror of the per-codec memcpy pattern, plus
    call av1_set_controls from codec_set_controls (replacing the
    no-op).
  - src/meson.build: register src/av1.c.

Sequence-field mapping covers everything VAAPI exposes at the
sequence level (15 of 20 V4L2_AV1_SEQUENCE_FLAG_* bits + the five
scalars).  Bits VAAPI doesn't carry at the sequence level
(WARPED_MOTION, REF_FRAME_MVS, SUPERRES, RESTORATION,
SEPARATE_UV_DELTA_Q) stay clear; per-frame consumers (libdav1d via
the daemon, vpu981 via the hardware path) read those from the
OBU_FRAME_HEADER that is already in the slice buffer anyway.  See
feedback memory `feedback_vaapi_blind_to_some_hevc_sps_fields` for
the precedent.
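
The "stay clear" rule can be illustrated with a reduced sketch. Everything here is a stand-in: the `DEMO_FLAG_*` constants, `struct demo_seq_fields` and `demo_sequence_flags` are hypothetical names covering only a handful of bits, and real code should take the flag values from <linux/v4l2-controls.h>.

```c
#include <stdint.h>

/*
 * Local stand-ins for a few V4L2_AV1_SEQUENCE_FLAG_* bits. The values are
 * illustrative; real code should take them from <linux/v4l2-controls.h>.
 */
#define DEMO_FLAG_STILL_PICTURE        0x00000001u
#define DEMO_FLAG_ENABLE_WARPED_MOTION 0x00000040u
#define DEMO_FLAG_ENABLE_ORDER_HINT    0x00000100u
#define DEMO_FLAG_ENABLE_CDEF          0x00001000u

/* Bits with no sequence-level source on the VAAPI side. */
#define DEMO_FLAGS_NOT_IN_VAAPI DEMO_FLAG_ENABLE_WARPED_MOTION

/* Hypothetical subset of the sequence-level bits VAAPI exposes. */
struct demo_seq_fields {
	unsigned int still_picture : 1;
	unsigned int enable_order_hint : 1;
	unsigned int enable_cdef : 1;
};

/* Translate the observable bits; anything unobservable is never set. */
static uint32_t demo_sequence_flags(const struct demo_seq_fields *fields)
{
	uint32_t flags = 0;

	if (fields->still_picture)
		flags |= DEMO_FLAG_STILL_PICTURE;
	if (fields->enable_order_hint)
		flags |= DEMO_FLAG_ENABLE_ORDER_HINT;
	if (fields->enable_cdef)
		flags |= DEMO_FLAG_ENABLE_CDEF;

	return flags;
}
```

Starting from a zeroed flags word means the bits without a VAAPI source are clear by construction, with no explicit masking step.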

Build verified on higgs (Debian 13 trixie, gcc 14.2.0, libva 2.22.0,
linux uAPI v4l2-controls.h sizeof(struct v4l2_ctrl_av1_sequence)==12):
clean meson + ninja link of v4l2_request_drv_video.so, vainfo
enumerates VAProfileAV1Profile0 via daedalus_v4l2 slot, av1_set_controls
symbol present.

Out of scope on this PR (operator-track, issue #11 follow-up):
  - daedalus-v4l2 kernel module wire-protocol extension
    (daedalus_collect_av1_meta + AV1 ctrl request_setup).
  - daedalus daemon OBU synthesiser (~400 LoC AV1 OBU encoder in
    daemon/src/av1_obu_synth.{c,h}).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 21:13:07 +02:00
marfrit 9a9cfd05db Merge pull request 'picture: no-op codec_set_controls case for VAProfileAV1Profile0' (#10) from noether/picture-av1-noop into master
Reviewed-on: #10
2026-05-20 19:07:12 +00:00
5 changed files with 271 additions and 25 deletions
+155
@@ -0,0 +1,155 @@
/*
* Copyright (C) 2026 Markus Fritsche <fritsche.markus@gmail.com>
*
* AV1 codec dispatcher. Populates V4L2_CID_STATELESS_AV1_SEQUENCE
* (struct v4l2_ctrl_av1_sequence) from VAAPI's VADecPictureParameterBufferAV1.
*
* Why a single SEQUENCE control and not the full V4L2_CID_STATELESS_AV1_*
* family (FRAME, TILE_GROUP_ENTRY, FILM_GRAIN):
*
* - The daedalus_v4l2 daemon path consumes the OUTPUT bitstream
* directly via libavcodec/libdav1d. libdav1d needs a complete OBU
* stream that includes the sequence header — ffmpeg-vaapi strips the
* sequence header on the client side (its parser is split across
* VADecPictureParameterBufferAV1 + slice payload, with OBU_SEQUENCE_HEADER
* consumed and not re-emitted), so the daemon side has to synthesise
* it from the SEQUENCE ctrl. The other AV1 ctrls (FRAME / TILE /
* FILM_GRAIN) are not needed for that synthesis — the OBU_FRAME_HEADER
* + OBU_TILE_GROUP that libdav1d also needs are still in the slice
* bitstream.
*
* - The vpu981 (RK3588 dedicated AV1 hantro) hardware path doesn't
* consult these controls either — vpu981's driver parses the AV1
* bitstream directly. So setting only SEQUENCE is correct for both
* destination decoders.
*
* Reference: marfrit/libva-v4l2-request-fourier issue #11
* (DAEMON-PPS-style sequence-header re-synthesis on the daemon
* side, paralleling the H.264 SPS/PPS work in DAEMON-PPS).
* kernel uAPI: <linux/v4l2-controls.h> @ 2891-2919.
* VAAPI: <va/va_dec_av1.h> typedef
* VADecPictureParameterBufferAV1.
*/
#include "av1.h"
#include "v4l2.h"
#include "utils.h"
#include <stdint.h>
#include <string.h>
#include <linux/v4l2-controls.h>
#include <linux/videodev2.h>
/*
* VADecPictureParameterBufferAV1 reaches us transitively via surface.h →
* va_backend.h → va.h → va_dec_av1.h (va_dec_av1.h alone won't compile
* standalone — it needs va.h's VA_PADDING_LOW / va_deprecated machinery).
*/
/* Compile-time UAPI shift guard, sibling to vp9.c's pattern. */
_Static_assert(sizeof(struct v4l2_ctrl_av1_sequence) == 12,
"v4l2_ctrl_av1_sequence size mismatch — kernel UAPI changed");
/*
* Map VAAPI bit_depth_idx (0/1/2 → 8/10/12) to the kernel ctrl's plain
* uint8_t bit_depth field. ffmpeg-vaapi sets idx from the bitstream
* BitDepth value, so this is an exact inverse of AV1 spec 5.5.2.
*/
static uint8_t av1_bit_depth_from_idx(uint8_t idx)
{
	switch (idx) {
	case 0: return 8;
	case 1: return 10;
	case 2: return 12;
	default:
		/* Spec-illegal index; fall back to 8-bit rather than pass it through. */
		return 8;
	}
}
int av1_set_controls(struct request_data *driver_data,
		     struct object_context *context,
		     struct object_surface *surface_object)
{
	VADecPictureParameterBufferAV1 *picture =
		&surface_object->params.av1.picture;
	struct v4l2_ctrl_av1_sequence sequence;
	struct v4l2_ext_control ctrls[1];
	int rc;

	(void)context;

	memset(&sequence, 0, sizeof sequence);

	/*
	 * Scalar mapping. Names align with kernel uAPI; off-by-one and
	 * idx→value translations are annotated.
	 */
	sequence.seq_profile = picture->profile;
	/* VAAPI stores the count minus one; the kernel wants the count. */
	sequence.order_hint_bits =
		(uint8_t)(picture->order_hint_bits_minus_1 + 1u);
	/* idx 0/1/2 → 8/10/12 bits. */
	sequence.bit_depth = av1_bit_depth_from_idx(picture->bit_depth_idx);
	/*
	 * VAAPI only carries the current frame size, so it stands in for
	 * the sequence maximum here.
	 */
	sequence.max_frame_width_minus_1 = picture->frame_width_minus1;
	sequence.max_frame_height_minus_1 = picture->frame_height_minus1;

	/*
	 * Sequence-header flag mapping. VAAPI exposes most of these directly
	 * in seq_info_fields.fields.*; the ones that don't have a 1:1 mirror
	 * (V4L2_AV1_SEQUENCE_FLAG_ENABLE_WARPED_MOTION, _ENABLE_REF_FRAME_MVS,
	 * _ENABLE_SUPERRES, _ENABLE_RESTORATION, _SEPARATE_UV_DELTA_Q) live in
	 * VAAPI's per-frame pic_info_fields rather than the sequence struct.
	 * For SEQUENCE-control purposes we treat them as best-effort
	 * unobservable from libva and leave the corresponding bits clear; the
	 * daedalus daemon's OBU synthesiser (issue #11 daemon track) carries
	 * the SEQUENCE bytes verbatim, so per-frame consumers (libdav1d) will
	 * still see the full bitstream truth for those toggles via the
	 * OBU_FRAME stream already in the slice buffer. See feedback memory
	 * `feedback_vaapi_blind_to_some_hevc_sps_fields` for the precedent.
	 */
	if (picture->seq_info_fields.fields.still_picture)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_STILL_PICTURE;
	if (picture->seq_info_fields.fields.use_128x128_superblock)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_USE_128X128_SUPERBLOCK;
	if (picture->seq_info_fields.fields.enable_filter_intra)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_FILTER_INTRA;
	if (picture->seq_info_fields.fields.enable_intra_edge_filter)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_INTRA_EDGE_FILTER;
	if (picture->seq_info_fields.fields.enable_interintra_compound)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_INTERINTRA_COMPOUND;
	if (picture->seq_info_fields.fields.enable_masked_compound)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_MASKED_COMPOUND;
	if (picture->seq_info_fields.fields.enable_dual_filter)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_DUAL_FILTER;
	if (picture->seq_info_fields.fields.enable_order_hint)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_ORDER_HINT;
	if (picture->seq_info_fields.fields.enable_jnt_comp)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_JNT_COMP;
	if (picture->seq_info_fields.fields.enable_cdef)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_CDEF;
	if (picture->seq_info_fields.fields.mono_chrome)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_MONO_CHROME;
	if (picture->seq_info_fields.fields.color_range)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_COLOR_RANGE;
	if (picture->seq_info_fields.fields.subsampling_x)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_SUBSAMPLING_X;
	if (picture->seq_info_fields.fields.subsampling_y)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_SUBSAMPLING_Y;
	if (picture->seq_info_fields.fields.film_grain_params_present)
		sequence.flags |= V4L2_AV1_SEQUENCE_FLAG_FILM_GRAIN_PARAMS_PRESENT;

	/* Single-control batched submission. */
	memset(ctrls, 0, sizeof ctrls);
	ctrls[0].id = V4L2_CID_STATELESS_AV1_SEQUENCE;
	ctrls[0].ptr = &sequence;
	ctrls[0].size = sizeof sequence;

	rc = v4l2_set_controls(driver_data->video_fd,
			       surface_object->request_fd,
			       ctrls, 1);
	if (rc < 0)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	return VA_STATUS_SUCCESS;
}
+39
@@ -0,0 +1,39 @@
/*
* Copyright (C) 2026 Markus Fritsche <fritsche.markus@gmail.com>
*
* AV1 codec dispatcher — populates V4L2_CID_STATELESS_AV1_SEQUENCE
* (struct v4l2_ctrl_av1_sequence) from VAAPI's VADecPictureParameterBufferAV1.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
* THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _AV1_H_
#define _AV1_H_
#include "context.h"
#include "request.h"
#include "surface.h"
int av1_set_controls(struct request_data *driver_data,
struct object_context *context,
struct object_surface *surface);
#endif /* _AV1_H_ */
+1
@@ -53,6 +53,7 @@ sources = [
 	'h265.c',
 	'vp8.c',
 	'vp9.c',
+	'av1.c',
 	'codec.c',
 	'nv15.c',
 	'nv12_col128.c',
+64 -25
@@ -36,6 +36,7 @@
 #include "mpeg2.h"
 #include "vp8.h"
 #include "vp9.h"
+#include "av1.h"
 #include <assert.h>
 #include <stdio.h>
@@ -61,16 +62,37 @@ static VAStatus codec_store_buffer(struct request_data *driver_data,
 				   struct object_buffer *buffer_object)
 {
 	switch (buffer_object->type) {
-	case VASliceDataBufferType:
+	case VASliceDataBufferType: {
 		/*
 		 * Since there is no guarantee that the allocation
 		 * order is the same as the submission order (via
 		 * RenderPicture), we can't use a V4L2 buffer directly
 		 * and have to copy from a regular buffer.
+		 *
+		 * Bounds check (issue #13): surface_object->source_data points
+		 * at an OUTPUT-pool mmap of fixed size source_size, negotiated
+		 * at S_FMT time. A stream-level resolution upshift can produce
+		 * a slice larger than this allocation; without the guard, the
+		 * memcpy walks past the mmap and SIGSEGVs
+		 * (mpv --hwdec=vaapi-copy) or corrupts adjacent heap
+		 * (Firefox RDD). Each append site below checks the running
+		 * total against source_size and fails the RenderPicture call
+		 * instead of corrupting memory; libavcodec re-creates the
+		 * surface at the new resolution on the next BeginPicture.
 		 */
+		size_t cap = surface_object->source_size;
+		size_t need;
 		if (context->h264_start_code) {
 			static const char start_code[3] = { 0x00, 0x00, 0x01 };
+			need = (size_t)surface_object->slices_size +
+			       sizeof(start_code);
+			if (need > cap) {
+				request_log("codec_store_buffer: H.264 start code would overflow OUTPUT buffer (%zu > %zu) — resolution upshift mid-stream?\n",
+					    need, cap);
+				return VA_STATUS_ERROR_ALLOCATION_FAILED;
+			}
 			memcpy(surface_object->source_data +
 			       surface_object->slices_size,
 			       start_code, sizeof(start_code));
@@ -104,19 +126,34 @@ static VAStatus codec_store_buffer(struct request_data *driver_data,
 			unsigned int header_size =
 				surface_object->params.vp8.picture.pic_fields.bits.key_frame == 0 ?
 				10 : 3;
+			need = (size_t)surface_object->slices_size + header_size;
+			if (need > cap) {
+				request_log("codec_store_buffer: VP8 header pad would overflow OUTPUT buffer (%zu > %zu)\n",
+					    need, cap);
+				return VA_STATUS_ERROR_ALLOCATION_FAILED;
+			}
 			memset(surface_object->source_data +
 			       surface_object->slices_size,
 			       0, header_size);
 			surface_object->slices_size += header_size;
 		}
-		memcpy(surface_object->source_data +
-		       surface_object->slices_size,
-		       buffer_object->data,
-		       buffer_object->size * buffer_object->count);
-		surface_object->slices_size +=
-			buffer_object->size * buffer_object->count;
+		{
+			size_t payload = (size_t)buffer_object->size *
+					 buffer_object->count;
+			need = (size_t)surface_object->slices_size + payload;
+			if (need > cap) {
+				request_log("codec_store_buffer: slice payload would overflow OUTPUT buffer (%zu > %zu) — resolution upshift mid-stream?\n",
+					    need, cap);
+				return VA_STATUS_ERROR_ALLOCATION_FAILED;
+			}
+			memcpy(surface_object->source_data +
+			       surface_object->slices_size,
+			       buffer_object->data, payload);
+			surface_object->slices_size += payload;
+		}
 		surface_object->slices_count++;
 		break;
+	}
 	case VAPictureParameterBufferType:
 		switch (profile) {
@@ -157,6 +194,12 @@ static VAStatus codec_store_buffer(struct request_data *driver_data,
 			       sizeof(surface_object->params.vp9.picture));
 			break;
+		case VAProfileAV1Profile0:
+			memcpy(&surface_object->params.av1.picture,
+			       buffer_object->data,
+			       sizeof(surface_object->params.av1.picture));
+			break;
 		default:
 			break;
 		}
@@ -320,26 +363,22 @@ static VAStatus codec_set_controls(struct request_data *driver_data,
 	case VAProfileAV1Profile0:
 		/*
-		 * AV1 has no codec-specific V4L2 control dispatch wired up
-		 * yet on this branch (see config.c VAProfileAV1Profile0
-		 * comment). For the daedalus_v4l2 daemon path that's fine:
-		 * AV1 frames are self-describing per-frame (OBU sequence +
-		 * frame headers carry everything libavcodec needs), so the
-		 * bitstream in the V4L2 OUTPUT buffer is sufficient — no
-		 * V4L2_CID_STATELESS_AV1_* controls have to be populated.
+		 * Populates V4L2_CID_STATELESS_AV1_SEQUENCE from
+		 * VADecPictureParameterBufferAV1. The daedalus_v4l2 daemon
+		 * (issue #11 daemon track) synthesises an OBU_SEQUENCE_HEADER
+		 * from this ctrl and prepends it to the slice bitstream
+		 * before handing it to libavcodec/libdav1d, which otherwise
+		 * cannot parse the (sequence-header-stripped) OUTPUT buffer
+		 * that ffmpeg-vaapi delivers.
 		 *
-		 * Per-codec dispatch in request_switch_device_for_profile
-		 * has already retargeted (video_fd, media_fd) to
-		 * video_fd_daedalus (or video_fd_vpu981 on RK3588 if
-		 * present) by the time we get here; the OUTPUT buffer will
-		 * be queued via that fd and the kernel forwards bytes to
-		 * the daemon as a regular REQ_DECODE. No-op is the
-		 * correct shape.
-		 *
-		 * When the vpu981-targeted V4L2_CID_STATELESS_AV1_* dispatch
-		 * lands from the av1-iter1 operator branch, replace this
-		 * with av1_set_controls(...).
+		 * On the RK3588 vpu981 hardware path the same SEQUENCE ctrl
+		 * is harmless: vpu981's driver parses the OBU stream
+		 * directly and ignores the ctrl payload, so no per-decoder
+		 * gating is required here.
 		 */
+		rc = av1_set_controls(driver_data, context, surface_object);
+		if (rc != VA_STATUS_SUCCESS)
+			return VA_STATUS_ERROR_OPERATION_FAILED;
 		break;
 	default:
+12
@@ -122,6 +122,18 @@ struct object_surface {
 			VADecPictureParameterBufferVP9 picture;
 			VASliceParameterBufferVP9 slice;
 		} vp9;
+		struct {
+			/*
+			 * AV1 picture parameter buffer. Slice params are
+			 * intentionally absent — the daedalus daemon track
+			 * (issue #11) consumes the slice OBU bytes directly
+			 * from the OUTPUT bitstream and synthesises only the
+			 * sequence-header OBU from
+			 * V4L2_CID_STATELESS_AV1_SEQUENCE. No per-tile-group
+			 * struct→OBU re-synthesis required from libva today.
+			 */
+			VADecPictureParameterBufferAV1 picture;
+		} av1;
 	} params;
 	int request_fd;