ampere-av1 Phase 2.1: implement av1_set_controls body (~500 LoC)

Replaces stub av1_set_controls with full VAAPI → V4L2 stateless AV1
control translation. Four V4L2 controls batched per-frame:
  V4L2_CID_STATELESS_AV1_SEQUENCE (sequence-level flags)
  V4L2_CID_STATELESS_AV1_FRAME (heavy: quant, lf, cdef, lr, gm,
    tile_info, refs, frame flags)
  V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY[] (DYNAMIC_ARRAY, size=MAX(N,1))
  V4L2_CID_STATELESS_AV1_FILM_GRAIN (gated on driver_data->has_av1_film_grain)
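A minimal sketch of the per-frame batch plan described above, not the code in this commit. The enum and both function names are hypothetical stand-ins; real code would use the V4L2_CID_STATELESS_AV1_* ids and submit them through the request API. It only shows the two rules stated in the list: the tile group entry array never shrinks below one element, and the film grain control is included only when the init-time probe found it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the four control slots named above. */
enum av1_ctrl_slot {
    AV1_SLOT_SEQUENCE,
    AV1_SLOT_FRAME,
    AV1_SLOT_TILE_GROUP_ENTRY,
    AV1_SLOT_FILM_GRAIN,
    AV1_SLOT_MAX
};

/* size = MAX(N, 1): a dynamic array is never submitted with zero elements. */
static size_t av1_tile_entry_count(size_t num_tiles)
{
    return num_tiles > 0 ? num_tiles : 1;
}

/* Fill slots[] with the controls to batch for one frame; return the count. */
static size_t av1_plan_batch(bool has_film_grain,
                             enum av1_ctrl_slot slots[AV1_SLOT_MAX])
{
    size_t n = 0;

    slots[n++] = AV1_SLOT_SEQUENCE;
    slots[n++] = AV1_SLOT_FRAME;
    slots[n++] = AV1_SLOT_TILE_GROUP_ENTRY;
    if (has_film_grain)            /* gated on the init-time probe result */
        slots[n++] = AV1_SLOT_FILM_GRAIN;
    return n;
}
```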
Reference: Kwiboo/FFmpeg v4l2-request-n8.1:libavcodec/v4l2_request_av1.c
(636 LoC); same V4L2 output schema, sourced from VAAPI's
VADecPictureParameterBufferAV1 instead of FFmpeg's AV1RawSequenceHeader.
VAAPI gap notes (fields the spec needs but VAAPI doesn't expose):
  - sequence max_frame_{width,height}_minus_1: use current frame size
  - enable_warped_motion / enable_ref_frame_mvs / enable_superres /
    enable_restoration sequence-level: conservative set-true (per-frame
    flags gate actual behavior)
  - order_hints[], reference_frame_ts[]: zero (kernel cross-refs by
    OUTPUT timestamp / surface id)
  - tile_start_col_sb[] / tile_start_row_sb[]: reconstruct via
    prefix-sum on VAAPI's width/height_in_sbs_minus_1[]
  - tile_size_bytes: set to 4 for multi-tile frames (max value), 0
    for single-tile (matches Kwiboo's conditional)
  - render_width/height: fall back to coded dimensions
  - current_frame_id / refresh_frame_flags / skip_mode_frame_idx /
    buffer_removal_time / frame_refs_short_signaling: zero
  - film_grain_params_ref_idx / update_grain: zero (only consulted in
    reuse paths; apply_grain=1 + populated arrays drive decode directly)
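The prefix-sum reconstruction of the tile start offsets can be sketched as below. This is an illustration, not the code in this commit: the function name is hypothetical, and the 16-bit "size minus 1" input type is an assumption about how VAAPI stores width/height_in_sbs_minus_1[].

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rebuild tile start offsets (in superblocks) from per-tile sizes stored
 * as "size minus 1". Writes n entries to starts[] and returns the running
 * total, i.e. the frame extent in superblocks along that axis.
 */
static uint32_t av1_tile_starts_from_sizes(const uint16_t *size_minus_1,
                                           size_t n, uint32_t *starts)
{
    uint32_t acc = 0;

    for (size_t i = 0; i < n; i++) {
        starts[i] = acc;                      /* tile i begins where i-1 ended */
        acc += (uint32_t)size_minus_1[i] + 1; /* undo the minus-1 encoding */
    }
    return acc;
}
```

The same helper would serve both axes: call it once with the width array and tile_cols, and once with the height array and tile_rows.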
F1/F2/F3 risk mitigations per phase1_plan_v2:
  F1: mi_col_starts/mi_row_starts sentinel at index
      [tile_cols]/[tile_rows] = 2 * ((frame_width + 7) >> 3) for
      columns, 2 * ((frame_height + 7) >> 3) for rows; mirrors Kwiboo
      lines 238/244
  F2: superres_denom direct from VAAPI's superres_scale_denominator
      (VAAPI's encoding is the final value; no AV1_SUPERRES_DENOM_MIN
      math). Fallback to AV1_SUPERRES_NUM=8 if zero.
  F3: loop_restoration_size[] gated on USES_LR flag derived from
      y_t != 0 || cb_t != 0 || cr_t != 0; mirrors Kwiboo lines 281-287
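The three mitigations reduce to small pure functions, sketched here under stated assumptions. All names are hypothetical and this is not the code in this commit. For F1 the sketch takes one frame dimension in pixels, on the reading that columns use the width and rows use the height. For F3 the arguments stand for the per-plane restoration types, with 0 meaning none.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SUPERRES_NUM 8 /* stand-in for AV1_SUPERRES_NUM */

/* F1: end-of-array sentinel in 4x4 mode-info units for one dimension. */
static uint32_t av1_mi_sentinel(uint32_t frame_dim_px)
{
    return 2 * ((frame_dim_px + 7) >> 3);
}

/* F2: take the denominator verbatim; fall back to 8 (no scaling) if zero. */
static uint8_t av1_superres_denom(uint8_t va_denominator)
{
    return va_denominator ? va_denominator : SKETCH_SUPERRES_NUM;
}

/* F3: loop restoration is in use if any plane has a non-zero type. */
static bool av1_uses_lr(uint8_t y_t, uint8_t cb_t, uint8_t cr_t)
{
    return y_t != 0 || cb_t != 0 || cr_t != 0;
}
```

For the 208x208 smoke vector mentioned later in the message, av1_mi_sentinel(208) evaluates to 2 * (215 >> 3) = 52.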
Plus:
- request.h: has_av1_film_grain bool on driver_data
- request.c: probe VIDIOC_QUERY_EXT_CTRL for FILM_GRAIN on vpu981 fd
at VA_DRIVER_INIT (Janet v3 amendment A: init-time, not lazy)
Compile-tested on boltzmann (aarch64 native, gcc 15.2.1): clean .so,
0 errors, pre-existing GStreamer #warnings only.
Phase 3 verification on ampere is next: 208x208 smoke + film_grain
stress vector (av1-1-b8-23-film_grain-50.ivf) byte-compare libva vs
kdirect (Phase 0 proved kdirect bit-perfect).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -830,6 +830,24 @@ VAStatus VA_DRIVER_INIT_FUNC(VADriverContextP context)
                "vendored GStreamer parser)\n");
    }

    /*
     * ampere-av1 Phase 2.1: probe V4L2_CID_STATELESS_AV1_FILM_GRAIN
     * on the vpu981 fd. Per Janet v3 amendment, this runs at backend
     * init (not lazily) so any race window with concurrent device
     * switching can't observe an inconsistent flag.
     */
    driver_data->has_av1_film_grain = false;
    if (driver_data->video_fd_vpu981 >= 0) {
        struct v4l2_query_ext_ctrl qec;
        if (v4l2_query_ext_ctrl(driver_data->video_fd_vpu981,
                                V4L2_CID_STATELESS_AV1_FILM_GRAIN,
                                &qec) == 0) {
            driver_data->has_av1_film_grain = true;
            request_log("ampere-av1: vpu981 advertises FILM_GRAIN "
                        "control (will include in per-frame batch)\n");
        }
    }

    status = VA_STATUS_SUCCESS;
    goto complete;