8d71e20bf7
Rewrites src/h265.c (407 lines → 588 lines) and the picture.c HEVC
dispatch + per-slice accumulation against the modern split stateless
controls V4L2_CID_STATELESS_HEVC_{SPS,PPS,SLICE_PARAMS,SCALING_MATRIX,
DECODE_PARAMS,DECODE_MODE,START_CODE}. Replaces the staging-era
V4L2_CID_MPEG_VIDEO_HEVC_{SPS,PPS,SLICE_PARAMS} CIDs that were
removed from the kernel UAPI.
Per-frame submission: ONE batched VIDIOC_S_EXT_CTRLS, count=5,
ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS:
  0xa40a90  SPS             (40 bytes)
  0xa40a91  PPS             (64 bytes)
  0xa40a92  SLICE_PARAMS    (variable; dynamic-array; one entry per slice)
  0xa40a93  SCALING_MATRIX  (1296 bytes; memset-zero when no scaling list)
  0xa40a94  DECODE_PARAMS   (328 bytes; per-frame DPB info)
Plus device-wide menus set once at context.c init (separate batched
S_EXT_CTRLS call so a kernel without HEVC controls, e.g. hantro on
RK3568/RK3399, silently fails its batch without invalidating H.264):
  0xa40a95  DECODE_MODE     (FRAME_BASED on rkvdec)
  0xa40a96  START_CODE      (ANNEX_B on rkvdec)
Reference: FFmpeg libavcodec/v4l2_request_hevc.c:505-565
(v4l2_request_hevc_queue_decode batched submission shape).
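The control IDs listed above follow from the codec-stateless control base
plus a fixed HEVC offset. A minimal sketch of that arithmetic and of the
five-entry per-frame batch layout, assuming a stand-in control struct
(sk_ext_control) and a hypothetical helper (sk_fill_hevc_batch) that are not
part of this commit; the fixed sizes are the ones quoted in this message, and
the real code would use struct v4l2_ext_control from <linux/videodev2.h>:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct v4l2_ext_control (id, size, payload only). */
struct sk_ext_control {
	uint32_t id;
	uint32_t size;
	const void *ptr;
};

/* Codec-stateless class and control base; HEVC controls start at base + 400. */
#define SK_CLASS_CODEC_STATELESS 0x00a40000u
#define SK_CID_STATELESS_BASE    (SK_CLASS_CODEC_STATELESS | 0x900u)
#define SK_CID_HEVC(n)           (SK_CID_STATELESS_BASE + 400u + (uint32_t)(n))

enum sk_hevc_ctrl {
	SK_HEVC_SPS,            /* 0xa40a90 */
	SK_HEVC_PPS,            /* 0xa40a91 */
	SK_HEVC_SLICE_PARAMS,   /* 0xa40a92 */
	SK_HEVC_SCALING_MATRIX, /* 0xa40a93 */
	SK_HEVC_DECODE_PARAMS,  /* 0xa40a94 */
	SK_HEVC_DECODE_MODE,    /* 0xa40a95, device-wide, not in the per-frame batch */
	SK_HEVC_START_CODE      /* 0xa40a96, device-wide, not in the per-frame batch */
};

#define SK_HEVC_PER_FRAME_COUNT 5u

/*
 * Hypothetical helper: fill the five per-frame controls in ID order and
 * return the control count to pass along with the batch.
 */
static unsigned int sk_fill_hevc_batch(struct sk_ext_control *ctrls,
				       const void **payloads,
				       uint32_t slice_entry_size,
				       uint32_t num_slices)
{
	/* Fixed sizes as quoted in the commit message; SLICE_PARAMS is variable. */
	const uint32_t sizes[SK_HEVC_PER_FRAME_COUNT] = {
		40, 64, slice_entry_size * num_slices, 1296, 328,
	};
	unsigned int i;

	assert(ctrls != NULL && payloads != NULL);

	for (i = 0; i < SK_HEVC_PER_FRAME_COUNT; i++) {
		ctrls[i].id = SK_CID_HEVC(i);
		ctrls[i].size = sizes[i];
		ctrls[i].ptr = payloads[i];
	}

	return SK_HEVC_PER_FRAME_COUNT;
}
```

The slice entry size is left as a parameter because this message only
describes SLICE_PARAMS as a variable-length dynamic array.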
Phase 5 review amendments incorporated:
C1 (data_byte_offset NOT data_bit_offset):
Old h265.c at lines 184-209 ran an 8-bit search to compute
bit-granularity offset. New API renames the field to
data_byte_offset (u32 byte offset). Bit-search dropped; replaced
with plain byte offset = source_offset + slice->slice_data_byte_offset.
C2 (dpb_entry.flags only LONG_TERM_REFERENCE; pic_order_cnt_val
singular; poc_st_curr_*[] arrays hold DPB INDICES not POC):
h265_fill_decode_params replaces old slice-params DPB iteration
with explicit DPB classification + index-array population.
For each VAAPI ReferenceFrames[i]:
- Classify into ST_CURR_BEFORE / ST_CURR_AFTER / LT_CURR via
VA_PICTURE_HEVC_RPS_* flags.
- Set dpb[j].timestamp, .pic_order_cnt_val (singular), .field_pic.
- Set dpb[j].flags = LONG_TERM_REFERENCE iff RPS_LT_CURR.
- Append j (DPB index, u8) to poc_st_curr_before[k] /
poc_st_curr_after[k] / poc_lt_curr[k] based on classification.
C3 (union-aliasing reasoning corrected):
BeginPicture's params.h265.num_slices = 0 reset is benign for
non-HEVC profiles because byte ~17764 of the params union is past
any field non-HEVC profiles read, NOT because RenderPicture's
per-buffer copies overwrite that location. Wording amended in
phase4_iter2_plan.md per phase5_iter2_review.md.
S1 (PPS flags 19 + 20 — DEBLOCKING_FILTER_CONTROL_PRESENT and
UNIFORM_SPACING):
Empirically VAAPI does NOT expose either flag in the
VAPictureParameterBufferHEVC pic_fields.bits or
slice_parsing_fields.bits. Both bits left zero. BBB-720p10s_hevc
fixture uses neither tiles nor explicit deblocking-control
parameters, so the omission is correct for the iter2 binding cell.
S2 (3 PPS scalars added):
pic_parameter_set_id (default 0; VAAPI doesn't expose),
num_ref_idx_l0_default_active_minus1 and
num_ref_idx_l1_default_active_minus1 (both populated from the VAAPI
picture struct).
Q2 (slice_segment_addr populated):
Was missing in old h265.c. Now sourced from
VAAPI's slice->slice_segment_address.
S3 (SCALING_MATRIX content choice):
Implementer choice taken: when iqmatrix_set==false (BBB has no
scaling list per SPS flags = SAO|STRONG_INTRA_SMOOTHING),
h265_fill_scaling_matrix sends memset-zero. Matches FFmpeg's
sl=NULL pattern at v4l2_request_hevc.c:384-403 (preserves
byte-equality vs cross-validator anchor).
S4 (FFmpeg function name fix): cosmetic; no code impact.
Plus one Phase 6 inline correction: phase 5 review S1 suggested
VAAPI exposes uniform_spacing_flag in pic_fields.bits; empirical
test-compile shows it doesn't. Comment added in h265_fill_pps
documenting the omission.
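The C2 classification can be sketched as follows. This is an illustration
under stated assumptions, not the h265_fill_decode_params code itself: every
sk_* type, constant and field is a stand-in (the real flag definitions live
in va/va.h and the V4L2 UAPI headers), and skipping invalid reference entries
is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins for the VAAPI RPS flags and the V4L2 DPB flag. */
#define SK_RPS_ST_CURR_BEFORE 0x10u
#define SK_RPS_ST_CURR_AFTER  0x20u
#define SK_RPS_LT_CURR        0x40u
#define SK_DPB_LONG_TERM      0x01u
#define SK_DPB_MAX            16

struct sk_va_ref {
	int valid;          /* assumption: invalid entries are skipped */
	int32_t poc;
	uint32_t flags;
	uint64_t timestamp;
};

struct sk_dpb_entry {
	uint64_t timestamp;
	uint8_t flags;
	int32_t pic_order_cnt_val; /* singular, as in the new UAPI */
};

struct sk_decode_params {
	struct sk_dpb_entry dpb[SK_DPB_MAX];
	uint8_t num_active_dpb_entries;
	uint8_t num_poc_st_curr_before;
	uint8_t num_poc_st_curr_after;
	uint8_t num_poc_lt_curr;
	/* These arrays hold DPB indices, not POC values. */
	uint8_t poc_st_curr_before[SK_DPB_MAX];
	uint8_t poc_st_curr_after[SK_DPB_MAX];
	uint8_t poc_lt_curr[SK_DPB_MAX];
};

static void sk_fill_decode_params(struct sk_decode_params *dp,
				  const struct sk_va_ref *refs,
				  unsigned int count)
{
	unsigned int i;

	assert(dp != NULL && refs != NULL);
	memset(dp, 0, sizeof(*dp));

	for (i = 0; i < count && dp->num_active_dpb_entries < SK_DPB_MAX; i++) {
		const struct sk_va_ref *ref = &refs[i];
		uint8_t j;

		if (!ref->valid)
			continue;

		/* j is the DPB index that the index arrays refer to. */
		j = dp->num_active_dpb_entries++;
		dp->dpb[j].timestamp = ref->timestamp;
		dp->dpb[j].pic_order_cnt_val = ref->poc;

		/* Only the long-term flag survives in dpb[j].flags. */
		if (ref->flags & SK_RPS_LT_CURR)
			dp->dpb[j].flags = SK_DPB_LONG_TERM;

		if (ref->flags & SK_RPS_ST_CURR_BEFORE)
			dp->poc_st_curr_before[dp->num_poc_st_curr_before++] = j;
		else if (ref->flags & SK_RPS_ST_CURR_AFTER)
			dp->poc_st_curr_after[dp->num_poc_st_curr_after++] = j;
		else if (ref->flags & SK_RPS_LT_CURR)
			dp->poc_lt_curr[dp->num_poc_lt_curr++] = j;
	}
}
```

Because the index arrays store j rather than the POC, a reference with POC 16
that lands in DPB slot 1 appears as the value 1 in poc_st_curr_after[].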
picture.c changes (3 edits):
1. codec_set_controls HEVCMain dispatch (lines 204-206 → call
h265_set_controls; replaces explicit Fourier-local: HEVC stripped
reject).
2. codec_store_buffer HEVC VASliceParameterBufferType case: append
VAAPI slice param to params.h265.slices[N] array, increment
num_slices. Single-slice mirror at .slice retained for
h265_fill_pps (which reads dependent_slice_segment_flag from
LongSliceFlags).
3. RequestBeginPicture: add params.h265.num_slices = 0 reset
alongside existing h264.matrix_set = false reset.
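Edits 2 and 3 above amount to an append-and-reset pattern. A minimal sketch,
assuming stand-in types (sk_slice_param, sk_h265_params) and hypothetical
helper names; the bounds check and the choice to mirror the most recent slice
into .slice are assumptions of the sketch, not behaviour confirmed by this
commit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEVC_MAX_SLICES_PER_FRAME 64

/* Stand-in for VASliceParameterBufferHEVC: only two illustrative fields. */
struct sk_slice_param {
	uint32_t slice_data_byte_offset;
	uint32_t slice_segment_address;
};

/* Stand-in for the params.h265 member of the surface params union. */
struct sk_h265_params {
	struct sk_slice_param slice; /* single-slice mirror */
	struct sk_slice_param slices[HEVC_MAX_SLICES_PER_FRAME];
	unsigned int num_slices;
};

/* BeginPicture side: start each frame with an empty slice list. */
static void sk_begin_picture(struct sk_h265_params *p)
{
	assert(p != NULL);
	p->num_slices = 0;
}

/*
 * Slice-parameter buffer side: append one slice and bump the counter.
 * Returns false instead of writing past the array when the frame has
 * more slices than the fixed capacity (assumed policy).
 */
static bool sk_store_slice(struct sk_h265_params *p,
			   const struct sk_slice_param *s)
{
	assert(p != NULL && s != NULL);

	if (p->num_slices >= HEVC_MAX_SLICES_PER_FRAME)
		return false;

	p->slice = *s;
	p->slices[p->num_slices++] = *s;
	return true;
}
```

Without the reset in the BeginPicture path, slices from the previous frame
would remain counted in num_slices when the next frame is submitted.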
surface.h: extend the params.h265 struct with a
slices[HEVC_MAX_SLICES_PER_FRAME=64] array + num_slices counter.
~17 KB extra per surface union; 24 surfaces in iter7 cap_pool = ~400 KB
total surface_heap growth. The object_heap allocator picks up the new
size automatically via sizeof(struct object_surface).
context.c: a separate 2-control batched call sets HEVC DECODE_MODE +
START_CODE device-wide. Same best-effort (void)v4l2_set_controls
pattern as the existing H.264 device-init block; if the kernel doesn't
advertise HEVC controls (hantro on RK3568/RK3399), the batch silently
fails without invalidating the H.264 batch.
meson.build: uncomment 'h265.c' (line 50) and 'h265.h' (line 73)
in the sources + headers lists.
h265.h: added the HEVC_MAX_SLICES_PER_FRAME=64 #define before the struct
forward declarations.
Phase 6 smoke test on fresnel (post Commit A + Commit B):
Criterion 1: vainfo lists VAProfileHEVCMain on rkvdec env binding
(/dev/video1 + /dev/media0). PASS.
Criterion 3: ffmpeg -hwaccel vaapi HEVC decode of bbb_720p10s_hevc.mp4
-frames:v 5 -f null -, exit 0. cap_pool_init: 24 slots
ready. PASS.
Criterion 4: mpv --hwdec=vaapi --vo=image at +02s seek, HEVC fixture:
HW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
SW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
HW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
SW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
HW=SW byte-identical for both frames; frame1 != frame2 (real motion).
PASS.
Criterion 5: regression hashes hold for both prior cells:
H.264 +30s HW frame 1: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9 (T4 ref MATCH)
H.264 +30s HW frame 2: 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8 (T4 ref MATCH)
MPEG-2 +02s HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092 (iter1 ref MATCH)
MPEG-2 +02s HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de (iter1 ref MATCH)
PASS.
All five criteria green on first build attempt — Phase 5 review
caught the 3 Critical UAPI errors (data_bit_offset → data_byte_offset
rename; dpb.rps field gone + pic_order_cnt_val rename + index-array
semantics) that would have been Phase 6 compile failures or silent
Phase 7 byte-compare divergences. Without that review pass, this
commit would have been the start of a 2+ loopback debugging cycle.
Refs:
../fresnel-fourier/phase4_iter2_plan.md (10 contract clauses,
File 4 patch shape)
../fresnel-fourier/phase5_iter2_review.md (C1, C2, C3, S1, S2,
S3, S4, Q2 amendments
all incorporated)
../fresnel-fourier/phase0_evidence/2026-05-08/iter2_phase3/
ffmpeg_v4l2req.stdout (cross-validator anchor — Phase 7
bonus byte-compare verification target)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
300 lines
9.7 KiB
C
/*
 * Copyright (C) 2007 Intel Corporation
 * Copyright (C) 2016 Florent Revest <florent.revest@free-electrons.com>
 * Copyright (C) 2018 Paul Kocialkowski <paul.kocialkowski@bootlin.com>
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the
 * "Software"), to deal in the Software without restriction, including
 * without limitation the rights to use, copy, modify, merge, publish,
 * distribute, sub license, and/or sell copies of the Software, and to
 * permit persons to whom the Software is furnished to do so, subject to
 * the following conditions:
 *
 * The above copyright notice and this permission notice (including the
 * next paragraph) shall be included in all copies or substantial portions
 * of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
 * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
 * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE FOR
 * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

#include "context.h"
#include "config.h"
#include "request.h"
#include "surface.h"

#include <stdlib.h>
#include <string.h>

#include <assert.h>

#include <sys/ioctl.h>
#include <sys/mman.h>

#include <linux/videodev2.h>

#include <hevc-ctrls.h>

#include "utils.h"
#include "v4l2.h"

#include "autoconfig.h"

VAStatus RequestCreateContext(VADriverContextP context, VAConfigID config_id,
			      int picture_width, int picture_height, int flags,
			      VASurfaceID *surfaces_ids, int surfaces_count,
			      VAContextID *context_id)
{
	struct request_data *driver_data = context->pDriverData;
	struct object_config *config_object;
	struct object_context *context_object = NULL;
	struct video_format *video_format;
	VASurfaceID *ids = NULL;
	VAContextID id;
	VAStatus status;
	unsigned int output_type, capture_type;
	int rc;

	video_format = driver_data->video_format;
	if (video_format == NULL)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	output_type = v4l2_type_video_output(video_format->v4l2_mplane);
	capture_type = v4l2_type_video_capture(video_format->v4l2_mplane);

	config_object = CONFIG(driver_data, config_id);
	if (config_object == NULL) {
		status = VA_STATUS_ERROR_INVALID_CONFIG;
		goto error;
	}

	id = object_heap_allocate(&driver_data->context_heap);
	context_object = CONTEXT(driver_data, id);
	if (context_object == NULL) {
		status = VA_STATUS_ERROR_ALLOCATION_FAILED;
		goto error;
	}
	memset(&context_object->dpb, 0, sizeof(context_object->dpb));

	/*
	 * Initialize the OUTPUT (bitstream-input) buffer pool. Sized by
	 * codec pipeline depth (4 H.264 frames in flight is sufficient
	 * for current hantro/rkvdec scheduling); independent of caller-
	 * supplied surfaces_count. Pool is owned by driver_data so it
	 * outlives any single context destroy/recreate cycle.
	 *
	 * This replaces the prior per-surface OUTPUT loop, which (a)
	 * created an empty queue when surfaces_count==0 (ffmpeg vaapi-
	 * copy path) and (b) only populated surface->source_* for
	 * surfaces present at vaCreateContext time, NULL-derefing on
	 * surfaces created later.
	 */
	/*
	 * iter6: pool size 16 gives comfortable headroom over typical H.264
	 * MaxDpbFrames (16) for any consumer that pipelines decode requests.
	 * Each slot owns its own request_fd (REINIT'd per use).
	 */
	rc = request_pool_init(&driver_data->output_pool,
			       driver_data->video_fd, driver_data->media_fd,
			       output_type, 16);
	if (rc < 0) {
		status = VA_STATUS_ERROR_ALLOCATION_FAILED;
		goto error;
	}

	/*
	 * The surface_ids array has been allocated by the caller and
	 * we don't have any indication wrt its life time. Let's make sure
	 * its life span is under our control.
	 */
	if (surfaces_count > 0) {
		ids = malloc(surfaces_count * sizeof(VASurfaceID));
		if (ids == NULL) {
			status = VA_STATUS_ERROR_ALLOCATION_FAILED;
			goto error;
		}

		memcpy(ids, surfaces_ids,
		       surfaces_count * sizeof(VASurfaceID));
	}

	/*
	 * Stateless H.264 device-wide controls. The kernel V4L2 stateless
	 * framework requires DECODE_MODE and START_CODE be set on the
	 * device fd (request_fd=-1) before VIDIOC_STREAMON; per-request
	 * controls (SPS/PPS/etc.) attached to a request_fd come later.
	 *
	 * hantro-vpu via rockchip,rk3568-vpu DT compatible (covers RK3568
	 * and RK3566 — PineTab2 silicon — since they're close enough)
	 * accepts only DECODE_MODE_FRAME_BASED.
	 * START_CODE_ANNEX_B preserves leading 0x00000001 in the slice
	 * payload that h264.c assembles. Errors here are not fatal: not
	 * every backing driver supports both controls (e.g. cedrus may
	 * default to SLICE_BASED without exposing DECODE_MODE).
	 */
	{
		struct v4l2_ext_control dev_ctrls[2] = {
			{
				.id = V4L2_CID_STATELESS_H264_DECODE_MODE,
				.value = V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED,
			},
			{
				.id = V4L2_CID_STATELESS_H264_START_CODE,
				.value = V4L2_STATELESS_H264_START_CODE_ANNEX_B,
			},
		};
		(void)v4l2_set_controls(driver_data->video_fd, -1,
					dev_ctrls, 2);
	}

	/*
	 * iter2: HEVC device-wide controls. Same best-effort pattern as
	 * H.264 above — separate batched call so a kernel that does not
	 * advertise HEVC controls (e.g. hantro-vpu-dec on RK3568/RK3399)
	 * silently fails on this batch without invalidating the H.264
	 * batch. rkvdec on RK3399 advertises HEVC and accepts FRAME_BASED
	 * + ANNEX_B (only supported menu values per Phase 0 v4l2_inventory).
	 */
	{
		struct v4l2_ext_control hevc_dev_ctrls[2] = {
			{
				.id = V4L2_CID_STATELESS_HEVC_DECODE_MODE,
				.value = V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED,
			},
			{
				.id = V4L2_CID_STATELESS_HEVC_START_CODE,
				.value = V4L2_STATELESS_HEVC_START_CODE_ANNEX_B,
			},
		};
		(void)v4l2_set_controls(driver_data->video_fd, -1,
					hevc_dev_ctrls, 2);
	}

	/*
	 * Mirror the ANNEX_B start-code mode set on the device above
	 * into context_object->h264_start_code so picture.c::
	 * codec_store_buffer prepends 0x00 0x00 0x01 to each slice
	 * payload it copies into the OUTPUT buffer. Without this, the
	 * kernel — which we just told to expect ANNEX_B — sees a raw
	 * NAL stream with no start codes, fails to find slice
	 * boundaries, and emits a zeroed CAPTURE buffer (visually a
	 * flat dark-green frame).
	 *
	 * h264_get_controls() exists for this purpose but is never
	 * called in the current code path; the planned probe-then-set
	 * commit will replace this hardcoded assignment with a runtime
	 * read of the kernel's accepted START_CODE value.
	 */
	context_object->h264_start_code = true;

	rc = v4l2_set_stream(driver_data->video_fd, output_type, true);
	if (rc < 0) {
		status = VA_STATUS_ERROR_OPERATION_FAILED;
		goto error;
	}

	rc = v4l2_set_stream(driver_data->video_fd, capture_type, true);
	if (rc < 0) {
		status = VA_STATUS_ERROR_OPERATION_FAILED;
		goto error;
	}

	context_object->config_id = config_id;
	context_object->render_surface_id = VA_INVALID_ID;
	context_object->surfaces_ids = ids;
	context_object->surfaces_count = surfaces_count;
	context_object->picture_width = picture_width;
	context_object->picture_height = picture_height;
	context_object->flags = flags;

	*context_id = id;

	status = VA_STATUS_SUCCESS;
	goto complete;

error:
	if (ids != NULL)
		free(ids);

	if (context_object != NULL)
		object_heap_free(&driver_data->context_heap,
				 (struct object_base *)context_object);

complete:
	return status;
}

VAStatus RequestDestroyContext(VADriverContextP context, VAContextID context_id)
{
	struct request_data *driver_data = context->pDriverData;
	struct object_context *context_object;
	struct video_format *video_format;
	unsigned int output_type, capture_type;
	VAStatus status;
	int rc;

	video_format = driver_data->video_format;
	if (video_format == NULL)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	output_type = v4l2_type_video_output(video_format->v4l2_mplane);
	capture_type = v4l2_type_video_capture(video_format->v4l2_mplane);

	context_object = CONTEXT(driver_data, context_id);
	if (context_object == NULL)
		return VA_STATUS_ERROR_INVALID_CONTEXT;

	rc = v4l2_set_stream(driver_data->video_fd, output_type, false);
	if (rc < 0)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	rc = v4l2_set_stream(driver_data->video_fd, capture_type, false);
	if (rc < 0)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	/* Buffers liberation */

	status = RequestDestroySurfaces(context, context_object->surfaces_ids,
					context_object->surfaces_count);
	if (status != VA_STATUS_SUCCESS)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	free(context_object->surfaces_ids);

	object_heap_free(&driver_data->context_heap,
			 (struct object_base *)context_object);

	rc = v4l2_request_buffers(driver_data->video_fd, output_type, 0);
	if (rc < 0)
		return VA_STATUS_ERROR_OPERATION_FAILED;

	/*
	 * Iter2 Fix 3: cap_pool owns the CAPTURE buffers' mmaps + any
	 * outstanding our_export_fds. Tear it down (which also issues
	 * REQBUFS(0) on CAPTURE), so the next CreateSurfaces2 cycle sees
	 * a clean slate and rebuilds the pool at the new resolution.
	 */
	cap_pool_destroy(&driver_data->capture_pool, driver_data->video_fd,
			 capture_type);

	/*
	 * Iteration 2 Fix 1: the kernel CAPTURE format state is no longer
	 * guaranteed after the dual REQBUFS(0). Invalidate the
	 * LAST_OUTPUT_WIDTH/HEIGHT cache so the next CreateSurfaces2 will
	 * unconditionally re-S_FMT on OUTPUT. Without this, multi-video
	 * Firefox sessions on mozilla.org corrupted the next session's
	 * CAPTURE format query (kernel returned 48x48 instead of the
	 * cached "already 1920x1088"); the exported descriptor encoded
	 * wrong pitch/offset.
	 */
	surface_reset_format_cache(driver_data);

	return VA_STATUS_SUCCESS;
}