fresnel-fourier iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API

Rewrites src/mpeg2.c to submit MPEG-2 control payload via the new
split V4L2_CID_STATELESS_MPEG2_{SEQUENCE,PICTURE,QUANTISATION}
controls (mainline kernel <linux/v4l2-controls.h>:1985-2105),
replacing the staging-era V4L2_CID_MPEG_VIDEO_MPEG2_{SLICE_PARAMS,
QUANTIZATION} combined-struct API that the kernel removed.

Per-frame submission: one batched VIDIOC_S_EXT_CTRLS, count=3,
which=V4L2_CTRL_WHICH_REQUEST_VAL (0x0f010000), with the three
V4L2_CTRL_CLASS_CODEC_STATELESS (0x00a40000) controls in order:
  - id=0xa409dc (SEQUENCE)      size=12  bytes
  - id=0xa409dd (PICTURE)       size=32  bytes
  - id=0xa409de (QUANTISATION)  size=256 bytes
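
The sizes and IDs listed above can be cross-checked without kernel
headers. A minimal sketch, assuming local mirror structs: the sketch_
and SKETCH_ names are hypothetical, and the field layouts are written
from memory of the mainline UAPI header, so check them against
<linux/v4l2-controls.h> before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local mirrors of the mainline UAPI payload structs. */
struct sketch_mpeg2_sequence {
	uint16_t horizontal_size;
	uint16_t vertical_size;
	uint32_t vbv_buffer_size;
	uint16_t profile_and_level_indication;
	uint8_t chroma_format;
	uint8_t flags;
};

struct sketch_mpeg2_picture {
	uint64_t backward_ref_ts;
	uint64_t forward_ref_ts;
	uint32_t flags;
	uint8_t f_code[2][2];
	uint8_t picture_coding_type;
	uint8_t picture_structure;
	uint8_t intra_dc_precision;
	uint8_t reserved[5];
};

struct sketch_mpeg2_quantisation {
	uint8_t intra_quantiser_matrix[64];
	uint8_t non_intra_quantiser_matrix[64];
	uint8_t chroma_intra_quantiser_matrix[64];
	uint8_t chroma_non_intra_quantiser_matrix[64];
};

/* Control IDs: stateless codec class, 0x900 base, per-control offset. */
#define SKETCH_CID_BASE               (0x00a40000u | 0x900u)
#define SKETCH_CID_MPEG2_SEQUENCE     (SKETCH_CID_BASE + 220)
#define SKETCH_CID_MPEG2_PICTURE      (SKETCH_CID_BASE + 221)
#define SKETCH_CID_MPEG2_QUANTISATION (SKETCH_CID_BASE + 222)

/* Compile-time checks for the sizes quoted in the commit message. */
_Static_assert(sizeof(struct sketch_mpeg2_sequence) == 12, "SEQUENCE");
_Static_assert(sizeof(struct sketch_mpeg2_picture) == 32, "PICTURE");
_Static_assert(sizeof(struct sketch_mpeg2_quantisation) == 256, "QUANTISATION");
```

The static assertions fail the build if a mirror struct drifts from
the 12/32/256-byte shape.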

Matches the FFmpeg libavcodec/v4l2_request_mpeg2.c:130-155 reference
implementation. Verified empirically against the fresnel-fourier
Phase 0 cross-validator anchor (byte-for-byte equivalence on the
SEQUENCE first row + the QUANTISATION 256 bytes).

Six structural changes from old to new API:

  1. Slice header parsing moved to kernel: bit_size,
     data_bit_offset, quantiser_scale_code GONE from new structs.
  2. Reference timestamps moved from slice to picture:
     backward_ref_ts/forward_ref_ts now in
     v4l2_ctrl_mpeg2_picture (offsets 0/8).
  3. Boolean fields collapsed into picture.flags bitmask
     (TOP_FIELD_FIRST 0x01 .. PROGRESSIVE 0x80, 8 bits total).
  4. progressive_sequence collapsed into sequence.flags &
     V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE.
  5. PICTURE_CODING_TYPE renamed to PIC_CODING_TYPE (values same).
  6. Quantisation load_* flags removed; matrices always present;
     British spelling — quantiSation not quantiZation.
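
Change 3 can be sketched as a small helper. This is an illustration
only: the sketch_ struct and function are hypothetical stand-ins for
the VAAPI picture_coding_extension bits, and the SKETCH_ flag values
follow the 0x01 .. 0x80 range quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as described above (TOP_FIELD_FIRST 0x01 .. PROGRESSIVE 0x80). */
#define SKETCH_PIC_FLAG_TOP_FIELD_FIRST 0x01u
#define SKETCH_PIC_FLAG_FRAME_PRED_DCT  0x02u
#define SKETCH_PIC_FLAG_CONCEALMENT_MV  0x04u
#define SKETCH_PIC_FLAG_Q_SCALE_TYPE    0x08u
#define SKETCH_PIC_FLAG_INTRA_VLC       0x10u
#define SKETCH_PIC_FLAG_ALT_SCAN        0x20u
#define SKETCH_PIC_FLAG_REPEAT_FIRST    0x40u
#define SKETCH_PIC_FLAG_PROGRESSIVE     0x80u

/* Hypothetical stand-in for the per-picture booleans of the old API. */
struct sketch_pic_bools {
	unsigned int top_field_first : 1;
	unsigned int frame_pred_frame_dct : 1;
	unsigned int concealment_motion_vectors : 1;
	unsigned int q_scale_type : 1;
	unsigned int intra_vlc_format : 1;
	unsigned int alternate_scan : 1;
	unsigned int repeat_first_field : 1;
	unsigned int progressive_frame : 1;
};

/* Collapse the eight booleans into one flags word. */
static uint32_t sketch_collapse_pic_flags(const struct sketch_pic_bools *b)
{
	uint32_t flags = 0;

	if (b->top_field_first)
		flags |= SKETCH_PIC_FLAG_TOP_FIELD_FIRST;
	if (b->frame_pred_frame_dct)
		flags |= SKETCH_PIC_FLAG_FRAME_PRED_DCT;
	if (b->concealment_motion_vectors)
		flags |= SKETCH_PIC_FLAG_CONCEALMENT_MV;
	if (b->q_scale_type)
		flags |= SKETCH_PIC_FLAG_Q_SCALE_TYPE;
	if (b->intra_vlc_format)
		flags |= SKETCH_PIC_FLAG_INTRA_VLC;
	if (b->alternate_scan)
		flags |= SKETCH_PIC_FLAG_ALT_SCAN;
	if (b->repeat_first_field)
		flags |= SKETCH_PIC_FLAG_REPEAT_FIRST;
	if (b->progressive_frame)
		flags |= SKETCH_PIC_FLAG_PROGRESSIVE;
	return flags;
}
```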

Behavioral correction (latent bug in the old code):
  Old src/mpeg2.c:104-118 self-referenced surface_object timestamp
  when the VAAPI ref picture was VA_INVALID_ID. New code sets the
  ref_ts to 0, matching kernel doc 0-as-sentinel convention
  (verified Phase 3 Baseline C: I-frame has both ts == 0; FFmpeg
  v4l2_request_mpeg2.c:98-108 same convention).
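
The sentinel rule can be sketched in isolation. Assumptions: a NULL
pointer models a VAAPI reference of VA_INVALID_ID, and
sketch_timeval_to_ns is a local stand-in that does the same arithmetic
as the kernel header's v4l2_timeval_to_ns(). Both sketch_ names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Local stand-in: seconds and microseconds to nanoseconds. */
static uint64_t sketch_timeval_to_ns(const struct timeval *tv)
{
	return (uint64_t)tv->tv_sec * 1000000000ULL +
	       (uint64_t)tv->tv_usec * 1000ULL;
}

/*
 * Reference timestamp for one prediction direction.
 * ref_timestamp == NULL means "no reference picture": return the 0
 * sentinel instead of falling back to the current surface's timestamp.
 */
static uint64_t sketch_ref_ts(const struct timeval *ref_timestamp)
{
	if (ref_timestamp == NULL)
		return 0;
	return sketch_timeval_to_ns(ref_timestamp);
}
```

For an I-frame both directions take the NULL path, so both ref_ts
fields come out as 0.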

Quantisation matrix order: zigzag scanning order per kernel doc
v4l2-controls.h:2076. VAAPI VAIQMatrixBufferMPEG2 stores in
zigzag order (per VAAPI spec). Direct memcpy works; no
permutation in libva backend. Kernel hantro_mpeg2.c::
hantro_mpeg2_dec_copy_qtable applies zigzag-to-raster permutation
when copying to the hardware quantisation table.
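
To make the two orders concrete, here is a sketch of a
zigzag-to-raster permutation. It is not the kernel's code: the table
is the conventional 8x8 zigzag scan, and the sketch_ names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional 8x8 zigzag scan: entry i is the raster index of the
 * i-th coefficient in scan order. */
static const uint8_t sketch_zigzag[64] = {
	 0,  1,  8, 16,  9,  2,  3, 10,
	17, 24, 32, 25, 18, 11,  4,  5,
	12, 19, 26, 33, 40, 48, 41, 34,
	27, 20, 13,  6,  7, 14, 21, 28,
	35, 42, 49, 56, 57, 50, 43, 36,
	29, 22, 15, 23, 30, 37, 44, 51,
	58, 59, 52, 45, 38, 31, 39, 46,
	53, 60, 61, 54, 47, 55, 62, 63,
};

/* Scatter a matrix stored in zigzag order into raster (row-major) order. */
static void sketch_zigzag_to_raster(const uint8_t zz[64], uint8_t raster[64])
{
	int i;

	for (i = 0; i < 64; i++)
		raster[sketch_zigzag[i]] = zz[i];
}
```

Applied to the default intra matrix from this commit, the first raster
row comes out as 8, 16, 19, 22, 26, 27, 29, 34.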

Default matrices (when iqmatrix_set==false): MPEG-2 spec defaults
per ISO/IEC 13818-2 Table 7-3. The mpeg2_default_intra_matrix
constant was transcribed from fresnel-fourier Phase 3 Baseline C
QUANTISATION verbatim payload bytes 0..63 (256-byte capture from
ffmpeg-v4l2request decode of bbb_720p10s_mpeg2.ts), per
phase5_iter1_review.md S3 amendment that flagged spec-recall as
unreliable. non_intra and chroma_non_intra are 16s per spec
(verified Baseline C bytes 64..127, 192..255). chroma_intra is
copy of intra (Baseline C bytes 128..191, verified identical).
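
The default layout described above, as a sketch over a flat 256-byte
payload (intra, non_intra, chroma_intra, chroma_non_intra, 64 bytes
each). The sketch_ names are hypothetical; the intra values are copied
from the constant in this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Default intra matrix in zigzag order, as transcribed in this commit. */
static const uint8_t sketch_default_intra[64] = {
	 8, 16, 16, 19, 16, 19, 22, 22,
	22, 22, 22, 22, 26, 24, 26, 27,
	27, 27, 26, 26, 26, 26, 27, 27,
	27, 29, 29, 29, 34, 34, 34, 29,
	29, 29, 27, 27, 29, 29, 32, 32,
	34, 34, 37, 38, 37, 35, 35, 34,
	35, 38, 38, 40, 40, 40, 48, 48,
	46, 46, 56, 56, 58, 69, 69, 83,
};

/* Fill all four matrices with defaults when no IQ matrix was supplied. */
static void sketch_fill_default_quant(uint8_t payload[256])
{
	memcpy(payload +   0, sketch_default_intra, 64); /* intra */
	memset(payload +  64, 16, 64);                   /* non_intra */
	memcpy(payload + 128, sketch_default_intra, 64); /* chroma_intra */
	memset(payload + 192, 16, 64);                   /* chroma_non_intra */
}
```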

Submission shape: one batched v4l2_set_controls call with all
three v4l2_ext_control entries, matching iter6/7/8 H.264 pattern
at src/h264.c:986. Bound to surface_object->request_fd (the
per-OUTPUT-slot permanent request_fd from iter6 binding).

Behavioral details:
  - sequence.vbv_buffer_size = surface_object->source_size, where
    source_size is set in picture.c:276 from request_pool slot->size,
    which is the V4L2-negotiated sizeimage from VIDIOC_QUERYBUF.
    Matches FFmpeg controls->pic.output->size.
  - sequence.profile_and_level_indication = 0; not exposed by
    VAAPI VAPictureParameterBufferMPEG2.
  - sequence.chroma_format = 1 (4:2:0) hardcoded; campaign codec
    scope is 4:2:0.
  - progressive_frame proxies for progressive_sequence (VAAPI does
    not expose the latter); the two flags agree for typical streams.

Phase 6 smoke test (post Commit A + Commit B):
  - vainfo enumerates VAProfileMPEG2Simple + VAProfileMPEG2Main
    on hantro bind. (Phase 1 criterion 1)
  - libva trace: vaCreateConfig(VAProfileMPEG2Main) =
    VA_STATUS_SUCCESS. (Phase 1 criterion 2)
  - ffmpeg -hwaccel vaapi exits 0 with no Failed-to-create-
    decode-configuration. (Phase 1 criterion 3 adjusted)
  - mpv --hwdec=vaapi --vo=image at +02s seek: 2 distinct
    frames with hashes byte-identical to SW reference:
      HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
      SW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
      HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
      SW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
    (Phase 1 criterion 4 — DMA-BUF GL import path; cache-coherency-safe)
  - T4 H.264 reference hashes still match (criterion 5; verified
    Phase 3 Baseline D earlier).

Cache-stale class observation (out-of-scope iter1 work item):
  ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi + hwdownload
  pipeline produces all-zero NV12 for MPEG-2 (same iter1 patch-0011
  cache-coherency bug class observed for H.264 in fresnel-fourier
  T4). Kernel + HW decode is correct (verified via ffmpeg
  -hwaccel v4l2request -hwaccel_output_format drm_prime + hwdownload
  which produces correct non-zero pixels matching SW reference).
  Bug is in libva backend vaDeriveImage path; Phase 4 cross-
  cutting work to add VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC support.
  Not blocking iter1 — DMA-BUF GL import path (mpv --vo=image) is
  cache-coherency-safe and gives bit-exact pixels.
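
  For reference, the CPU-access bracket that the Phase 4 item would
  need looks roughly like the sketch below. It only shows the
  DMA_BUF_IOCTL_SYNC half; it assumes a DMA-BUF fd has already been
  obtained (e.g. via VIDIOC_EXPBUF), and sketch_dmabuf_sync is a
  hypothetical helper, not existing backend code.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Issue one DMA_BUF_IOCTL_SYNC on an exported buffer.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int sketch_dmabuf_sync(int dmabuf_fd, uint64_t flags)
{
	struct dma_buf_sync sync = { .flags = flags };

	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}

/*
 * Intended use around a CPU read of the mapped buffer:
 *
 *   sketch_dmabuf_sync(fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ);
 *   ... read pixels through the mmap()ed pointer ...
 *   sketch_dmabuf_sync(fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
 */
```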

Auxiliary EINVAL noise (out-of-scope iter1 work item):
  src/context.c:142-155 unconditionally sets H.264 device-wide
  controls (V4L2_CID_STATELESS_H264_DECODE_MODE,
  _START_CODE) on every CreateContext, regardless of profile.
  EINVALs on hantro-vpu-dec (no H.264 controls there). Intentional
  best-effort behavior — return value cast to (void) and discarded
  at line 153. The error message "Unable to set control(s):
  Invalid argument" is logged from src/v4l2.c:484 but doesn't
  propagate as a backend error. Stays as documented auxiliary
  noise.

Drop #include <mpeg2-ctrls.h> from src/config.c:37 and src/mpeg2.c
(formerly line 38). The kernel UAPI for MPEG-2 stateless control
IDs comes from <linux/v4l2-controls.h>, pulled transitively via
<linux/videodev2.h> (and explicitly from src/mpeg2.c after this
rewrite). The fork local include/mpeg2-ctrls.h header is deleted
in commit C; this commit removes the last includes of it.
src/config.c:38 still includes <hevc-ctrls.h> — left untouched per
phase5_iter1_review.md Nit 6 (lower-risk path; HEVC iteration
deletes its header).

Refs:
  ../fresnel-fourier/phase4_iter1_plan.md (contract clauses 1-6,
                                           File 2 patch shape)
  ../fresnel-fourier/phase5_iter1_review.md (S3, Q4, Q5 amendments)
  ../fresnel-fourier/phase0_evidence/2026-05-07/iter1_phase3/
    baseline_C_xvalidator/ffmpeg.stdout (cross-validator anchor)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 10:17:40 +02:00
parent e7dad7abb5
commit 5fe873c144
2 changed files with 182 additions and 88 deletions
src/config.c: -1
--- a/src/config.c
+++ b/src/config.c
@@ -34,7 +34,6 @@
 #include <linux/videodev2.h>
-#include <mpeg2-ctrls.h>
 #include <hevc-ctrls.h>
 #include "utils.h"
src/mpeg2.c: +182 -87
--- a/src/mpeg2.c
+++ b/src/mpeg2.c
@@ -23,6 +23,34 @@
  * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  */
+/*
+ * fresnel-fourier iter1 Phase 6 commit B: rewrite against new split
+ * V4L2_CID_STATELESS_MPEG2_{SEQUENCE,PICTURE,QUANTISATION} stateless
+ * controls (mainline kernel <linux/v4l2-controls.h>:1985-2105).
+ *
+ * Replaces the staging-era V4L2_CID_MPEG_VIDEO_MPEG2_{SLICE_PARAMS,
+ * QUANTIZATION} combined-struct API that the fork previously used
+ * via include/mpeg2-ctrls.h (deleted in commit C).
+ *
+ * Per-frame submission: one batched VIDIOC_S_EXT_CTRLS with three
+ * controls (12-byte SEQUENCE + 32-byte PICTURE + 256-byte QUANTISATION),
+ * matching FFmpeg libavcodec/v4l2_request_mpeg2.c:130-155 reference
+ * implementation. Verified empirically in fresnel-fourier Phase 0
+ * cross-validator sweep and Phase 3 Baseline C verbatim payload.
+ *
+ * Quantisation matrix order: zigzag scanning order per kernel doc
+ * v4l2-controls.h:2076. VAAPI VAIQMatrixBufferMPEG2 also stores in
+ * zigzag scanning order (per VAAPI spec). Direct memcpy works; no
+ * permutation in the libva backend. Kernel hantro_mpeg2.c::
+ * hantro_mpeg2_dec_copy_qtable applies the zigzag-to-raster
+ * permutation when copying to the hardware quantisation table.
+ *
+ * Default matrices (when iqmatrix_set==false): MPEG-2 spec defaults
+ * per ISO/IEC 13818-2 Table 7-3, transcribed from Phase 3 Baseline C
+ * QUANTISATION verbatim payload (256 bytes captured from
+ * ffmpeg-v4l2request decode of bbb_720p10s_mpeg2.ts).
+ */
 #include "mpeg2.h"
 #include "context.h"
 #include "request.h"
@@ -35,120 +63,187 @@
 #include <sys/mman.h>
 #include <linux/videodev2.h>
-#include <mpeg2-ctrls.h>
+#include <linux/v4l2-controls.h>
 #include "v4l2.h"
+/*
+ * MPEG-2 default intra quantisation matrix in zigzag scanning order
+ * (ISO/IEC 13818-2 Table 7-3, verified empirically against
+ * fresnel-fourier Phase 3 Baseline C QUANTISATION payload bytes 0..63
+ * from a ffmpeg-v4l2request decode of the BBB 720p10s MPEG-2 fixture).
+ */
+static const __u8 mpeg2_default_intra_matrix[64] = {
+	8, 16, 16, 19, 16, 19, 22, 22,
+	22, 22, 22, 22, 26, 24, 26, 27,
+	27, 27, 26, 26, 26, 26, 27, 27,
+	27, 29, 29, 29, 34, 34, 34, 29,
+	29, 29, 27, 27, 29, 29, 32, 32,
+	34, 34, 37, 38, 37, 35, 35, 34,
+	35, 38, 38, 40, 40, 40, 48, 48,
+	46, 46, 56, 56, 58, 69, 69, 83,
+};
+/*
+ * MPEG-2 default non-intra quantisation matrix is uniformly 16 in spec.
+ * Verified against Phase 3 Baseline C QUANTISATION payload bytes
+ * 64..127 (all 0x10 = 16). Same applies to chroma_non_intra
+ * (bytes 192..255). Filled at runtime via memset rather than a
+ * separate const array to keep the binary smaller.
+ */
 int mpeg2_set_controls(struct request_data *driver_data,
 		       struct object_context *context_object,
 		       struct object_surface *surface_object)
 {
 	VAPictureParameterBufferMPEG2 *picture =
 		&surface_object->params.mpeg2.picture;
-	VASliceParameterBufferMPEG2 *slice =
-		&surface_object->params.mpeg2.slice;
 	VAIQMatrixBufferMPEG2 *iqmatrix =
 		&surface_object->params.mpeg2.iqmatrix;
 	bool iqmatrix_set = surface_object->params.mpeg2.iqmatrix_set;
-	struct v4l2_ctrl_mpeg2_slice_params slice_params;
-	struct v4l2_ctrl_mpeg2_quantization quantization;
+	/* Clause 2: v4l2_ctrl_mpeg2_sequence (12 bytes) */
+	struct v4l2_ctrl_mpeg2_sequence sequence;
+	/* Clause 3: v4l2_ctrl_mpeg2_picture (32 bytes; reserved[5] must be zero) */
+	struct v4l2_ctrl_mpeg2_picture pic;
+	/* Clause 4: v4l2_ctrl_mpeg2_quantisation (256 bytes) */
+	struct v4l2_ctrl_mpeg2_quantisation quant;
 	struct object_surface *forward_reference_surface;
 	struct object_surface *backward_reference_surface;
-	uint64_t timestamp;
-	unsigned int i;
 	int rc;
-	memset(&slice_params, 0, sizeof(slice_params));
+	memset(&sequence, 0, sizeof sequence);
+	memset(&pic, 0, sizeof pic); /* zeros pic.reserved[5] per Clause 3 */
+	memset(&quant, 0, sizeof quant);
-	slice_params.bit_size = surface_object->slices_size * 8;
-	slice_params.data_bit_offset = 0;
-	slice_params.sequence.horizontal_size = picture->horizontal_size;
-	slice_params.sequence.vertical_size = picture->vertical_size;
-	slice_params.sequence.vbv_buffer_size = SOURCE_SIZE_MAX;
-	slice_params.sequence.profile_and_level_indication = 0;
-	slice_params.sequence.progressive_sequence = 0;
-	slice_params.sequence.chroma_format = 1; // 4:2:0
-	slice_params.picture.picture_coding_type = picture->picture_coding_type;
-	slice_params.picture.f_code[0][0] = (picture->f_code >> 12) & 0x0f;
-	slice_params.picture.f_code[0][1] = (picture->f_code >> 8) & 0x0f;
-	slice_params.picture.f_code[1][0] = (picture->f_code >> 4) & 0x0f;
-	slice_params.picture.f_code[1][1] = (picture->f_code >> 0) & 0x0f;
-	slice_params.picture.intra_dc_precision =
-		picture->picture_coding_extension.bits.intra_dc_precision;
-	slice_params.picture.picture_structure =
-		picture->picture_coding_extension.bits.picture_structure;
-	slice_params.picture.top_field_first =
-		picture->picture_coding_extension.bits.top_field_first;
-	slice_params.picture.frame_pred_frame_dct =
-		picture->picture_coding_extension.bits.frame_pred_frame_dct;
-	slice_params.picture.concealment_motion_vectors =
-		picture->picture_coding_extension.bits
-			.concealment_motion_vectors;
-	slice_params.picture.q_scale_type =
-		picture->picture_coding_extension.bits.q_scale_type;
-	slice_params.picture.intra_vlc_format =
-		picture->picture_coding_extension.bits.intra_vlc_format;
-	slice_params.picture.alternate_scan =
-		picture->picture_coding_extension.bits.alternate_scan;
-	slice_params.picture.repeat_first_field =
-		picture->picture_coding_extension.bits.repeat_first_field;
-	slice_params.picture.progressive_frame =
-		picture->picture_coding_extension.bits.progressive_frame;
-	slice_params.quantiser_scale_code = slice->quantiser_scale_code;
+	/* === Clause 2: SEQUENCE ===
+	 *
+	 * VAAPI's VAPictureParameterBufferMPEG2 doesn't expose the
+	 * sequence-extension's progressive_sequence flag separately;
+	 * use progressive_frame from the picture-coding extension as a
+	 * proxy. They're identical for typical streams (BBB is
+	 * progressive throughout).
+	 */
+	sequence.horizontal_size = picture->horizontal_size;
+	sequence.vertical_size = picture->vertical_size;
+	sequence.vbv_buffer_size = surface_object->source_size;
+	sequence.profile_and_level_indication = 0; /* not exposed by VAAPI */
+	sequence.chroma_format = 1; /* 4:2:0 — campaign codec scope */
+	if (picture->picture_coding_extension.bits.progressive_frame)
+		sequence.flags |= V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE;
+	/* === Clause 3: PICTURE ===
+	 *
+	 * Behavioral correction vs. previous mpeg2.c at this iter1:
+	 * old code self-referenced surface_object->timestamp when the
+	 * VAAPI ref picture was VA_INVALID_ID. New code sets ts = 0 for
+	 * missing refs, matching kernel doc's 0-as-sentinel convention
+	 * (verified against Phase 3 Baseline C frame 1: I-frame has both
+	 * forward_ref_ts and backward_ref_ts == 0; FFmpeg
+	 * libavcodec/v4l2_request_mpeg2.c:98-108 uses same convention).
+	 */
 	forward_reference_surface =
 		SURFACE(driver_data, picture->forward_reference_picture);
-	if (forward_reference_surface == NULL)
-		forward_reference_surface = surface_object;
-	timestamp = v4l2_timeval_to_ns(&forward_reference_surface->timestamp);
-	slice_params.forward_ref_ts = timestamp;
+	if (forward_reference_surface != NULL)
+		pic.forward_ref_ts =
+			v4l2_timeval_to_ns(&forward_reference_surface->timestamp);
 	backward_reference_surface =
 		SURFACE(driver_data, picture->backward_reference_picture);
-	if (backward_reference_surface == NULL)
-		backward_reference_surface = surface_object;
-	timestamp = v4l2_timeval_to_ns(&backward_reference_surface->timestamp);
-	slice_params.backward_ref_ts = timestamp;
+	if (backward_reference_surface != NULL)
+		pic.backward_ref_ts =
+			v4l2_timeval_to_ns(&backward_reference_surface->timestamp);
+	if (picture->picture_coding_extension.bits.top_field_first)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_TOP_FIELD_FIRST;
+	if (picture->picture_coding_extension.bits.frame_pred_frame_dct)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_FRAME_PRED_DCT;
+	if (picture->picture_coding_extension.bits.concealment_motion_vectors)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_CONCEALMENT_MV;
+	if (picture->picture_coding_extension.bits.q_scale_type)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_Q_SCALE_TYPE;
+	if (picture->picture_coding_extension.bits.intra_vlc_format)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_INTRA_VLC;
+	if (picture->picture_coding_extension.bits.alternate_scan)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_ALT_SCAN;
+	if (picture->picture_coding_extension.bits.repeat_first_field)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_REPEAT_FIRST;
+	if (picture->picture_coding_extension.bits.progressive_frame)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_PROGRESSIVE;
-	rc = v4l2_set_control(driver_data->video_fd, surface_object->request_fd,
-			      V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
-			      &slice_params, sizeof(slice_params));
+	pic.f_code[0][0] = (picture->f_code >> 12) & 0x0f;
+	pic.f_code[0][1] = (picture->f_code >> 8) & 0x0f;
+	pic.f_code[1][0] = (picture->f_code >> 4) & 0x0f;
+	pic.f_code[1][1] = (picture->f_code >> 0) & 0x0f;
+	pic.picture_coding_type = picture->picture_coding_type;
+	pic.picture_structure =
+		picture->picture_coding_extension.bits.picture_structure;
+	pic.intra_dc_precision =
+		picture->picture_coding_extension.bits.intra_dc_precision;
+	/* pic.reserved[5] zeroed by memset above */
+	/* === Clause 4: QUANTISATION ===
+	 *
+	 * Kernel always reads all four matrices unconditionally
+	 * (no load_* flags in the new API; kernel hantro_mpeg2.c
+	 * doesn't synthesize defaults). When VAAPI's consumer didn't
+	 * send VAIQMatrixBufferType (iqmatrix_set==false), populate
+	 * with MPEG-2 spec default matrices.
+	 *
+	 * VAAPI VAIQMatrixBufferMPEG2 stores matrices in zigzag scanning
+	 * order (per VAAPI spec). Kernel expects zigzag scanning order
+	 * (per v4l2-controls.h:2076). Direct memcpy.
+	 */
+	if (iqmatrix_set) {
+		memcpy(quant.intra_quantiser_matrix,
+		       iqmatrix->intra_quantiser_matrix, 64);
+		memcpy(quant.non_intra_quantiser_matrix,
+		       iqmatrix->non_intra_quantiser_matrix, 64);
+		memcpy(quant.chroma_intra_quantiser_matrix,
+		       iqmatrix->chroma_intra_quantiser_matrix, 64);
+		memcpy(quant.chroma_non_intra_quantiser_matrix,
+		       iqmatrix->chroma_non_intra_quantiser_matrix, 64);
+	} else {
+		memcpy(quant.intra_quantiser_matrix,
+		       mpeg2_default_intra_matrix, 64);
+		memset(quant.non_intra_quantiser_matrix, 16, 64);
+		memcpy(quant.chroma_intra_quantiser_matrix,
+		       mpeg2_default_intra_matrix, 64);
+		memset(quant.chroma_non_intra_quantiser_matrix, 16, 64);
+	}
+	/* === Clause 1+5: batched submission ===
+	 *
+	 * One VIDIOC_S_EXT_CTRLS with all three controls. Matches
+	 * src/h264.c:986 pattern (single v4l2_set_controls call) and
+	 * FFmpeg ff_v4l2_request_decode_frame contract. Bound to the
+	 * surface's permanent request_fd (iter6 per-OUTPUT-slot binding;
+	 * picture.c:284 sets surface_object->request_fd at BeginPicture).
+	 */
+	struct v4l2_ext_control ctrls[3] = {
+		{
+			.id = V4L2_CID_STATELESS_MPEG2_SEQUENCE,
+			.ptr = &sequence,
+			.size = sizeof sequence,
+		},
+		{
+			.id = V4L2_CID_STATELESS_MPEG2_PICTURE,
+			.ptr = &pic,
+			.size = sizeof pic,
+		},
+		{
+			.id = V4L2_CID_STATELESS_MPEG2_QUANTISATION,
+			.ptr = &quant,
+			.size = sizeof quant,
+		},
+	};
+	rc = v4l2_set_controls(driver_data->video_fd,
+			       surface_object->request_fd,
+			       ctrls, 3);
 	if (rc < 0)
 		return VA_STATUS_ERROR_OPERATION_FAILED;
-	if (iqmatrix_set) {
-		quantization.load_intra_quantiser_matrix =
-			iqmatrix->load_intra_quantiser_matrix;
-		quantization.load_non_intra_quantiser_matrix =
-			iqmatrix->load_non_intra_quantiser_matrix;
-		quantization.load_chroma_intra_quantiser_matrix =
-			iqmatrix->load_chroma_intra_quantiser_matrix;
-		quantization.load_chroma_non_intra_quantiser_matrix =
-			iqmatrix->load_chroma_non_intra_quantiser_matrix;
-		for (i = 0; i < 64; i++) {
-			quantization.intra_quantiser_matrix[i] =
-				iqmatrix->intra_quantiser_matrix[i];
-			quantization.non_intra_quantiser_matrix[i] =
-				iqmatrix->non_intra_quantiser_matrix[i];
-			quantization.chroma_intra_quantiser_matrix[i] =
-				iqmatrix->chroma_intra_quantiser_matrix[i];
-			quantization.chroma_non_intra_quantiser_matrix[i] =
-				iqmatrix->chroma_non_intra_quantiser_matrix[i];
-		}
-		rc = v4l2_set_control(driver_data->video_fd,
-				      surface_object->request_fd,
-				      V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION,
-				      &quantization, sizeof(quantization));
-	}
 	return 0;
 }