fresnel-fourier iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API

Rewrites src/mpeg2.c to submit the MPEG-2 control payload via the new
split V4L2_CID_STATELESS_MPEG2_{SEQUENCE,PICTURE,QUANTISATION} controls
(mainline kernel <linux/v4l2-controls.h>:1985-2105), replacing the
staging-era V4L2_CID_MPEG_VIDEO_MPEG2_{SLICE_PARAMS,QUANTIZATION}
combined-struct API that the kernel removed.

Per-frame submission: one batched VIDIOC_S_EXT_CTRLS, count=3,
which=V4L2_CTRL_WHICH_REQUEST_VAL (0xf010000; the which field shares a
union with the legacy ctrl_class field, and V4L2_CTRL_CLASS_CODEC_STATELESS
itself is 0xa40000), with the three controls in order:
- id=0xa409dc (SEQUENCE) size=12 bytes
- id=0xa409dd (PICTURE) size=32 bytes
- id=0xa409de (QUANTISATION) size=256 bytes

Matches the FFmpeg libavcodec/v4l2_request_mpeg2.c:130-155 reference
implementation. Verified empirically against the fresnel-fourier Phase 0
cross-validator anchor (bit-for-bit byte equivalence on SEQUENCE
first-row + QUANTISATION 256 bytes).

Six structural changes from old to new API:
1. Slice header parsing moved to kernel: bit_size, data_bit_offset,
   quantiser_scale_code are GONE from the new structs.
2. Reference timestamps moved from slice to picture:
   forward_ref_ts/backward_ref_ts now in v4l2_ctrl_mpeg2_picture
   (offsets 0/8).
3. Boolean fields collapsed into the picture.flags bitmask
   (TOP_FIELD_FIRST 0x01 .. PROGRESSIVE 0x80, 8 bits total).
4. progressive_sequence collapsed into
   sequence.flags & V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE.
5. PICTURE_CODING_TYPE renamed to PIC_CODING_TYPE (values unchanged).
6. Quantisation load_* flags removed; matrices always present; British
   spelling: quantiSation, not quantiZation.

Behavioral correction (latent bug in the old code):
Old src/mpeg2.c:104-118 self-referenced the surface_object timestamp
when the VAAPI ref picture was VA_INVALID_ID. New code sets the ref_ts
to 0, matching the kernel doc's 0-as-sentinel convention (verified
Phase 3 Baseline C: I-frame has both ts == 0; FFmpeg
v4l2_request_mpeg2.c:98-108 uses the same convention).

Quantisation matrix order: zigzag scanning order per kernel doc
v4l2-controls.h:2076. VAAPI VAIQMatrixBufferMPEG2 also stores matrices
in zigzag order (per VAAPI spec), so a direct memcpy works; no
permutation is needed in the libva backend. Kernel
hantro_mpeg2.c::hantro_mpeg2_dec_copy_qtable applies the
zigzag-to-raster permutation when copying to the hardware quantisation
table.

Default matrices (when iqmatrix_set==false): MPEG-2 spec defaults per
ISO/IEC 13818-2 Table 7-3. The mpeg2_default_intra_matrix constant was
transcribed from fresnel-fourier Phase 3 Baseline C QUANTISATION
verbatim payload bytes 0..63 (256-byte capture from an
ffmpeg-v4l2request decode of bbb_720p10s_mpeg2.ts), per the
phase5_iter1_review.md S3 amendment that flagged spec-recall as
unreliable. non_intra and chroma_non_intra are all 16 per spec
(verified Baseline C bytes 64..127, 192..255). chroma_intra is a copy
of intra (Baseline C bytes 128..191, verified identical).

Submission shape: one batched v4l2_set_controls call with all three
v4l2_ext_control entries, matching the iter6/7/8 H.264 pattern at
src/h264.c:986. Bound to surface_object->request_fd (the
per-OUTPUT-slot permanent request_fd from the iter6 binding).

Behavioral details:
- sequence.vbv_buffer_size = surface_object->source_size, where
  source_size is set in picture.c:276 from request_pool slot->size,
  which is the V4L2-negotiated sizeimage from VIDIOC_QUERYBUF. Matches
  FFmpeg controls->pic.output->size.
- sequence.profile_and_level_indication = 0; not exposed by VAAPI
  VAPictureParameterBufferMPEG2.
- sequence.chroma_format = 1 (4:2:0) hardcoded; campaign codec scope is
  4:2:0.
- progressive_frame proxies for progressive_sequence; the two carry the
  same value for typical streams.

Phase 6 smoke test (post Commit A + Commit B):
- vainfo enumerates VAProfileMPEG2Simple + VAProfileMPEG2Main on hantro
  bind. (Phase 1 criterion 1)
- libva trace: vaCreateConfig(VAProfileMPEG2Main) = VA_STATUS_SUCCESS.
  (Phase 1 criterion 2)
- ffmpeg -hwaccel vaapi exits 0 with no
  Failed-to-create-decode-configuration. (Phase 1 criterion 3 adjusted)
- mpv --hwdec=vaapi --vo=image at +02s seek: 2 distinct frames with
  hashes byte-identical to the SW reference:
    HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
    SW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
    HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
    SW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
  (Phase 1 criterion 4: DMA-BUF GL import path; cache-coherency-safe)
- T4 H.264 reference hashes still match (criterion 5; verified Phase 3
  Baseline D earlier).

Cache-stale class observation (out-of-scope iter1 work item):
The ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi + hwdownload
pipeline produces all-zero NV12 for MPEG-2 (same iter1 patch-0011
cache-coherency bug class observed for H.264 in fresnel-fourier T4).
Kernel + HW decode is correct (verified via ffmpeg -hwaccel v4l2request
-hwaccel_output_format drm_prime + hwdownload, which produces correct
non-zero pixels matching the SW reference). The bug is in the libva
backend vaDeriveImage path; adding VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC
support is Phase 4 cross-cutting work. Not blocking iter1: the DMA-BUF
GL import path (mpv --vo=image) is cache-coherency-safe and gives
bit-exact pixels.

Auxiliary EINVAL noise (out-of-scope iter1 work item):
src/context.c:142-155 unconditionally sets H.264 device-wide controls
(V4L2_CID_STATELESS_H264_DECODE_MODE, _START_CODE) on every
CreateContext, regardless of profile. This EINVALs on hantro-vpu-dec
(no H.264 controls there). Intentional best-effort behavior: the return
value is cast to (void) and discarded at line 153. The error message
"Unable to set control(s): Invalid argument" is logged from
src/v4l2.c:484 but doesn't propagate as a backend error. Stays as
documented auxiliary noise.

Drop #include <mpeg2-ctrls.h> from src/config.c:37 and src/mpeg2.c
(formerly line 38). The kernel UAPI for MPEG-2 stateless control IDs
comes from <linux/v4l2-controls.h>, pulled transitively via
<linux/videodev2.h> (and explicitly from src/mpeg2.c after this
rewrite). The fork-local include/mpeg2-ctrls.h header is deleted in
commit C; this commit removes the last includes of it. src/config.c:38
still includes <hevc-ctrls.h>; left untouched per
phase5_iter1_review.md Nit 6 (lower-risk path; the HEVC iteration
deletes its header).

Refs:
  ../fresnel-fourier/phase4_iter1_plan.md
    (contract clauses 1-6, File 2 patch shape)
  ../fresnel-fourier/phase5_iter1_review.md
    (S3, Q4, Q5 amendments)
  ../fresnel-fourier/phase0_evidence/2026-05-07/iter1_phase3/baseline_C_xvalidator/ffmpeg.stdout
    (cross-validator anchor)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
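The payload sizes and control IDs quoted in the message can be cross-checked without kernel headers. The sketch below is an illustration, not part of the commit: the `mirror_*` structs and `MIRROR_*` macros are hypothetical local copies written from the sizes and IDs stated above, and the field order is an assumption about the mainline UAPI layout that should be confirmed against the installed `<linux/v4l2-controls.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirrors of the three UAPI payload structs (assumed layout). */
struct mirror_mpeg2_sequence {
	uint16_t horizontal_size;
	uint16_t vertical_size;
	uint32_t vbv_buffer_size;
	uint16_t profile_and_level_indication;
	uint8_t chroma_format;
	uint8_t flags;
};

struct mirror_mpeg2_picture {
	uint64_t backward_ref_ts;
	uint64_t forward_ref_ts;
	uint32_t flags;
	uint8_t f_code[2][2];
	uint8_t picture_coding_type;
	uint8_t picture_structure;
	uint8_t intra_dc_precision;
	uint8_t reserved[5];
};

struct mirror_mpeg2_quantisation {
	uint8_t intra_quantiser_matrix[64];
	uint8_t non_intra_quantiser_matrix[64];
	uint8_t chroma_intra_quantiser_matrix[64];
	uint8_t chroma_non_intra_quantiser_matrix[64];
};

/* Control ID arithmetic: class | 0x900 is the stateless base, MPEG-2 starts at +220. */
#define MIRROR_CLASS_CODEC_STATELESS 0x00a40000u
#define MIRROR_CID_STATELESS_BASE    (MIRROR_CLASS_CODEC_STATELESS | 0x900u)
#define MIRROR_CID_MPEG2_SEQUENCE     (MIRROR_CID_STATELESS_BASE + 220u)
#define MIRROR_CID_MPEG2_PICTURE      (MIRROR_CID_STATELESS_BASE + 221u)
#define MIRROR_CID_MPEG2_QUANTISATION (MIRROR_CID_STATELESS_BASE + 222u)

/* Compile-time checks against the sizes quoted in the commit message. */
_Static_assert(sizeof(struct mirror_mpeg2_sequence) == 12, "SEQUENCE is 12 bytes");
_Static_assert(sizeof(struct mirror_mpeg2_picture) == 32, "PICTURE is 32 bytes");
_Static_assert(sizeof(struct mirror_mpeg2_quantisation) == 256, "QUANTISATION is 256 bytes");
```

Compiling this translation unit is the whole check: a layout or ID mismatch fails the build rather than surfacing as an EINVAL at runtime.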
2026-05-08 10:17:40 +02:00
parent e7dad7abb5
commit 5fe873c144
2 changed files with 182 additions and 88 deletions
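The default-matrix fallback described in the commit message can be exercised in isolation. The following is a minimal sketch, assuming the 256-byte QUANTISATION payload is four consecutive 64-byte matrices in the order intra, non_intra, chroma_intra, chroma_non_intra (the byte ranges the message cites). `sketch_fill_default_quantisation` is a hypothetical helper name; the table values are copied from the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Default intra matrix in zigzag order, copied from the diff. */
static const uint8_t sketch_default_intra[64] = {
	 8, 16, 16, 19, 16, 19, 22, 22,
	22, 22, 22, 22, 26, 24, 26, 27,
	27, 27, 26, 26, 26, 26, 27, 27,
	27, 29, 29, 29, 34, 34, 34, 29,
	29, 29, 27, 27, 29, 29, 32, 32,
	34, 34, 37, 38, 37, 35, 35, 34,
	35, 38, 38, 40, 40, 40, 48, 48,
	46, 46, 56, 56, 58, 69, 69, 83,
};

/*
 * Fill a 256-byte payload with the spec defaults:
 *   bytes   0..63   intra            (table above)
 *   bytes  64..127  non_intra        (all 16)
 *   bytes 128..191  chroma_intra     (copy of intra)
 *   bytes 192..255  chroma_non_intra (all 16)
 */
static void sketch_fill_default_quantisation(uint8_t payload[256])
{
	memcpy(payload, sketch_default_intra, 64);
	memset(payload + 64, 16, 64);
	memcpy(payload + 128, sketch_default_intra, 64);
	memset(payload + 192, 16, 64);
}
```

A captured payload such as the Baseline C dump could then be compared against this buffer with a single `memcmp`.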
@@ -34,7 +34,6 @@
 #include <linux/videodev2.h>
-#include <mpeg2-ctrls.h>
 #include <hevc-ctrls.h>
 #include "utils.h"
@@ -23,6 +23,34 @@
 * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
 /*
 * fresnel-fourier iter1 Phase 6 commit B: rewrite against new split
 * V4L2_CID_STATELESS_MPEG2_{SEQUENCE,PICTURE,QUANTISATION} stateless
 * controls (mainline kernel <linux/v4l2-controls.h>:1985-2105).
 *
 * Replaces the staging-era V4L2_CID_MPEG_VIDEO_MPEG2_{SLICE_PARAMS,
 * QUANTIZATION} combined-struct API that the fork previously used
 * via include/mpeg2-ctrls.h (deleted in commit C).
 *
 * Per-frame submission: one batched VIDIOC_S_EXT_CTRLS with three
 * controls (12-byte SEQUENCE + 32-byte PICTURE + 256-byte QUANTISATION),
 * matching FFmpeg libavcodec/v4l2_request_mpeg2.c:130-155 reference
 * implementation. Verified empirically in fresnel-fourier Phase 0
 * cross-validator sweep and Phase 3 Baseline C verbatim payload.
 *
 * Quantisation matrix order: zigzag scanning order per kernel doc
 * v4l2-controls.h:2076. VAAPI VAIQMatrixBufferMPEG2 also stores in
 * zigzag scanning order (per VAAPI spec). Direct memcpy works; no
 * permutation in the libva backend. Kernel hantro_mpeg2.c::
 * hantro_mpeg2_dec_copy_qtable applies the zigzag-to-raster
 * permutation when copying to the hardware quantisation table.
 *
 * Default matrices (when iqmatrix_set==false): MPEG-2 spec defaults
 * per ISO/IEC 13818-2 Table 7-3, transcribed from Phase 3 Baseline C
 * QUANTISATION verbatim payload (256 bytes captured from
 * ffmpeg-v4l2request decode of bbb_720p10s_mpeg2.ts).
 */
 #include "mpeg2.h"
 #include "context.h"
 #include "request.h"
@@ -35,120 +63,187 @@
 #include <sys/mman.h>
 #include <linux/videodev2.h>
-#include <mpeg2-ctrls.h>
+#include <linux/v4l2-controls.h>
 #include "v4l2.h"
 /*
 * MPEG-2 default intra quantisation matrix in zigzag scanning order
 * (ISO/IEC 13818-2 Table 7-3, verified empirically against
 * fresnel-fourier Phase 3 Baseline C QUANTISATION payload bytes 0..63
 * from a ffmpeg-v4l2request decode of the BBB 720p10s MPEG-2 fixture).
 */
 static const __u8 mpeg2_default_intra_matrix[64] = {
 	  8,  16,  16,  19,  16,  19,  22,  22,
 	 22,  22,  22,  22,  26,  24,  26,  27,
 	 27,  27,  26,  26,  26,  26,  27,  27,
 	 27,  29,  29,  29,  34,  34,  34,  29,
 	 29,  29,  27,  27,  29,  29,  32,  32,
 	 34,  34,  37,  38,  37,  35,  35,  34,
 	 35,  38,  38,  40,  40,  40,  48,  48,
 	 46,  46,  56,  56,  58,  69,  69,  83,
 };
 /*
 * MPEG-2 default non-intra quantisation matrix is uniformly 16 in spec.
 * Verified against Phase 3 Baseline C QUANTISATION payload bytes
 * 64..127 (all 0x10 = 16). Same applies to chroma_non_intra
 * (bytes 192..255). Filled at runtime via memset rather than a
 * separate const array to keep the binary smaller.
 */
 int mpeg2_set_controls(struct request_data *driver_data,
 		       struct object_context *context_object,
 		       struct object_surface *surface_object)
 {
 	VAPictureParameterBufferMPEG2 *picture =
 		&surface_object->params.mpeg2.picture;
 	VASliceParameterBufferMPEG2 *slice =
 		&surface_object->params.mpeg2.slice;
 	VAIQMatrixBufferMPEG2 *iqmatrix =
 		&surface_object->params.mpeg2.iqmatrix;
 	bool iqmatrix_set = surface_object->params.mpeg2.iqmatrix_set;
-	struct v4l2_ctrl_mpeg2_slice_params slice_params;
-	struct v4l2_ctrl_mpeg2_quantization quantization;
+
+	/* Clause 2: v4l2_ctrl_mpeg2_sequence (12 bytes) */
+	struct v4l2_ctrl_mpeg2_sequence sequence;
+	/* Clause 3: v4l2_ctrl_mpeg2_picture (32 bytes; reserved[5] must be zero) */
+	struct v4l2_ctrl_mpeg2_picture pic;
+	/* Clause 4: v4l2_ctrl_mpeg2_quantisation (256 bytes) */
+	struct v4l2_ctrl_mpeg2_quantisation quant;
 	struct object_surface *forward_reference_surface;
 	struct object_surface *backward_reference_surface;
 	uint64_t timestamp;
 	unsigned int i;
 	int rc;
-	memset(&slice_params, 0, sizeof(slice_params));
+	memset(&sequence, 0, sizeof sequence);
+	memset(&pic, 0, sizeof pic);  /* zeros pic.reserved[5] per Clause 3 */
+	memset(&quant, 0, sizeof quant);
-	slice_params.bit_size = surface_object->slices_size * 8;
-	slice_params.data_bit_offset = 0;
-
-	slice_params.sequence.horizontal_size = picture->horizontal_size;
-	slice_params.sequence.vertical_size = picture->vertical_size;
-	slice_params.sequence.vbv_buffer_size = SOURCE_SIZE_MAX;
-
-	slice_params.sequence.profile_and_level_indication = 0;
-	slice_params.sequence.progressive_sequence = 0;
-	slice_params.sequence.chroma_format = 1; // 4:2:0
-
-	slice_params.picture.picture_coding_type = picture->picture_coding_type;
-	slice_params.picture.f_code[0][0] = (picture->f_code >> 12) & 0x0f;
-	slice_params.picture.f_code[0][1] = (picture->f_code >> 8) & 0x0f;
-	slice_params.picture.f_code[1][0] = (picture->f_code >> 4) & 0x0f;
-	slice_params.picture.f_code[1][1] = (picture->f_code >> 0) & 0x0f;
-	slice_params.picture.intra_dc_precision =
-		picture->picture_coding_extension.bits.intra_dc_precision;
-	slice_params.picture.picture_structure =
-		picture->picture_coding_extension.bits.picture_structure;
-	slice_params.picture.top_field_first =
-		picture->picture_coding_extension.bits.top_field_first;
-	slice_params.picture.frame_pred_frame_dct =
-		picture->picture_coding_extension.bits.frame_pred_frame_dct;
-	slice_params.picture.concealment_motion_vectors =
-		picture->picture_coding_extension.bits
-			.concealment_motion_vectors;
-	slice_params.picture.q_scale_type =
-		picture->picture_coding_extension.bits.q_scale_type;
-	slice_params.picture.intra_vlc_format =
-		picture->picture_coding_extension.bits.intra_vlc_format;
-	slice_params.picture.alternate_scan =
-		picture->picture_coding_extension.bits.alternate_scan;
-	slice_params.picture.repeat_first_field =
-		picture->picture_coding_extension.bits.repeat_first_field;
-	slice_params.picture.progressive_frame =
-		picture->picture_coding_extension.bits.progressive_frame;
-	slice_params.quantiser_scale_code = slice->quantiser_scale_code;
+	/* === Clause 2: SEQUENCE ===
+	 *
+	 * VAAPI's VAPictureParameterBufferMPEG2 doesn't expose the
+	 * sequence-extension's progressive_sequence flag separately;
+	 * use progressive_frame from the picture-coding extension as a
+	 * proxy. They're identical for typical streams (BBB is
+	 * progressive throughout).
+	 */
+	sequence.horizontal_size = picture->horizontal_size;
+	sequence.vertical_size = picture->vertical_size;
+	sequence.vbv_buffer_size = surface_object->source_size;
+	sequence.profile_and_level_indication = 0;  /* not exposed by VAAPI */
+	sequence.chroma_format = 1;  /* 4:2:0 — campaign codec scope */
+	if (picture->picture_coding_extension.bits.progressive_frame)
+		sequence.flags |= V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE;
+	/* === Clause 3: PICTURE ===
+	 *
+	 * Behavioral correction vs. previous mpeg2.c at this iter1:
+	 * old code self-referenced surface_object->timestamp when the
+	 * VAAPI ref picture was VA_INVALID_ID. New code sets ts = 0 for
+	 * missing refs, matching kernel doc's 0-as-sentinel convention
+	 * (verified against Phase 3 Baseline C frame 1: I-frame has both
+	 * forward_ref_ts and backward_ref_ts == 0; FFmpeg
+	 * libavcodec/v4l2_request_mpeg2.c:98-108 uses same convention).
+	 */
 	forward_reference_surface =
 		SURFACE(driver_data, picture->forward_reference_picture);
-	if (forward_reference_surface == NULL)
-		forward_reference_surface = surface_object;
-
-	timestamp = v4l2_timeval_to_ns(&forward_reference_surface->timestamp);
-	slice_params.forward_ref_ts = timestamp;
+	if (forward_reference_surface != NULL)
+		pic.forward_ref_ts =
+			v4l2_timeval_to_ns(&forward_reference_surface->timestamp);
 	backward_reference_surface =
 		SURFACE(driver_data, picture->backward_reference_picture);
-	if (backward_reference_surface == NULL)
-		backward_reference_surface = surface_object;
-	timestamp = v4l2_timeval_to_ns(&backward_reference_surface->timestamp);
-	slice_params.backward_ref_ts = timestamp;
+	if (backward_reference_surface != NULL)
+		pic.backward_ref_ts =
+			v4l2_timeval_to_ns(&backward_reference_surface->timestamp);
+	if (picture->picture_coding_extension.bits.top_field_first)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_TOP_FIELD_FIRST;
+	if (picture->picture_coding_extension.bits.frame_pred_frame_dct)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_FRAME_PRED_DCT;
+	if (picture->picture_coding_extension.bits.concealment_motion_vectors)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_CONCEALMENT_MV;
+	if (picture->picture_coding_extension.bits.q_scale_type)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_Q_SCALE_TYPE;
+	if (picture->picture_coding_extension.bits.intra_vlc_format)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_INTRA_VLC;
+	if (picture->picture_coding_extension.bits.alternate_scan)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_ALT_SCAN;
+	if (picture->picture_coding_extension.bits.repeat_first_field)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_REPEAT_FIRST;
+	if (picture->picture_coding_extension.bits.progressive_frame)
+		pic.flags |= V4L2_MPEG2_PIC_FLAG_PROGRESSIVE;
-	rc = v4l2_set_control(driver_data->video_fd, surface_object->request_fd,
-			      V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
-			      &slice_params, sizeof(slice_params));
+	pic.f_code[0][0] = (picture->f_code >> 12) & 0x0f;
+	pic.f_code[0][1] = (picture->f_code >>  8) & 0x0f;
+	pic.f_code[1][0] = (picture->f_code >>  4) & 0x0f;
+	pic.f_code[1][1] = (picture->f_code >>  0) & 0x0f;
+	pic.picture_coding_type = picture->picture_coding_type;
+	pic.picture_structure =
+		picture->picture_coding_extension.bits.picture_structure;
+	pic.intra_dc_precision =
+		picture->picture_coding_extension.bits.intra_dc_precision;
+	/* pic.reserved[5] zeroed by memset above */
+	/* === Clause 4: QUANTISATION ===
+	 *
+	 * Kernel always reads all four matrices unconditionally
+	 * (no load_* flags in the new API; kernel hantro_mpeg2.c
+	 * doesn't synthesize defaults). When VAAPI's consumer didn't
+	 * send VAIQMatrixBufferType (iqmatrix_set==false), populate
+	 * with MPEG-2 spec default matrices.
+	 *
+	 * VAAPI VAIQMatrixBufferMPEG2 stores matrices in zigzag scanning
+	 * order (per VAAPI spec). Kernel expects zigzag scanning order
+	 * (per v4l2-controls.h:2076). Direct memcpy.
+	 */
+	if (iqmatrix_set) {
+		memcpy(quant.intra_quantiser_matrix,
+		       iqmatrix->intra_quantiser_matrix, 64);
+		memcpy(quant.non_intra_quantiser_matrix,
+		       iqmatrix->non_intra_quantiser_matrix, 64);
+		memcpy(quant.chroma_intra_quantiser_matrix,
+		       iqmatrix->chroma_intra_quantiser_matrix, 64);
+		memcpy(quant.chroma_non_intra_quantiser_matrix,
+		       iqmatrix->chroma_non_intra_quantiser_matrix, 64);
+	} else {
+		memcpy(quant.intra_quantiser_matrix,
+		       mpeg2_default_intra_matrix, 64);
+		memset(quant.non_intra_quantiser_matrix, 16, 64);
+		memcpy(quant.chroma_intra_quantiser_matrix,
+		       mpeg2_default_intra_matrix, 64);
+		memset(quant.chroma_non_intra_quantiser_matrix, 16, 64);
+	}
+	/* === Clause 1+5: batched submission ===
+	 *
+	 * One VIDIOC_S_EXT_CTRLS with all three controls. Matches
+	 * src/h264.c:986 pattern (single v4l2_set_controls call) and
+	 * FFmpeg ff_v4l2_request_decode_frame contract. Bound to the
+	 * surface's permanent request_fd (iter6 per-OUTPUT-slot binding;
+	 * picture.c:284 sets surface_object->request_fd at BeginPicture).
+	 */
+	struct v4l2_ext_control ctrls[3] = {
+		{
+			.id = V4L2_CID_STATELESS_MPEG2_SEQUENCE,
+			.ptr = &sequence,
+			.size = sizeof sequence,
+		},
+		{
+			.id = V4L2_CID_STATELESS_MPEG2_PICTURE,
+			.ptr = &pic,
+			.size = sizeof pic,
+		},
+		{
+			.id = V4L2_CID_STATELESS_MPEG2_QUANTISATION,
+			.ptr = &quant,
+			.size = sizeof quant,
+		},
+	};
+	rc = v4l2_set_controls(driver_data->video_fd,
+			       surface_object->request_fd,
+			       ctrls, 3);
 	if (rc < 0)
 		return VA_STATUS_ERROR_OPERATION_FAILED;
-	if (iqmatrix_set) {
-		quantization.load_intra_quantiser_matrix =
-			iqmatrix->load_intra_quantiser_matrix;
-		quantization.load_non_intra_quantiser_matrix =
-			iqmatrix->load_non_intra_quantiser_matrix;
-		quantization.load_chroma_intra_quantiser_matrix =
-			iqmatrix->load_chroma_intra_quantiser_matrix;
-		quantization.load_chroma_non_intra_quantiser_matrix =
-			iqmatrix->load_chroma_non_intra_quantiser_matrix;
-		for (i = 0; i < 64; i++) {
-			quantization.intra_quantiser_matrix[i] =
-				iqmatrix->intra_quantiser_matrix[i];
-			quantization.non_intra_quantiser_matrix[i] =
-				iqmatrix->non_intra_quantiser_matrix[i];
-			quantization.chroma_intra_quantiser_matrix[i] =
-				iqmatrix->chroma_intra_quantiser_matrix[i];
-			quantization.chroma_non_intra_quantiser_matrix[i] =
-				iqmatrix->chroma_non_intra_quantiser_matrix[i];
-		}
-		rc = v4l2_set_control(driver_data->video_fd,
-				      surface_object->request_fd,
-				      V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION,
-				      &quantization, sizeof(quantization));
-	}
 	return 0;
 }
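The PICTURE packing in the hunk above combines three small transformations: unpacking the 16-bit VAAPI f_code into a 2x2 array, collapsing eight booleans into one flags byte, and mapping a missing reference to timestamp 0. The sketch below isolates them so each can be tested without a V4L2 device. All `sketch_*` names and `SKETCH_*` constants are hypothetical: the flag values follow the 0x01 .. 0x80 range given in the commit message, in the order the diff sets them, and the nanosecond conversion is an assumed equivalent of `v4l2_timeval_to_ns`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Assumed flag values, mirroring V4L2_MPEG2_PIC_FLAG_* (0x01 .. 0x80). */
enum {
	SKETCH_PIC_FLAG_TOP_FIELD_FIRST = 0x01,
	SKETCH_PIC_FLAG_FRAME_PRED_DCT  = 0x02,
	SKETCH_PIC_FLAG_CONCEALMENT_MV  = 0x04,
	SKETCH_PIC_FLAG_Q_SCALE_TYPE    = 0x08,
	SKETCH_PIC_FLAG_INTRA_VLC       = 0x10,
	SKETCH_PIC_FLAG_ALT_SCAN        = 0x20,
	SKETCH_PIC_FLAG_REPEAT_FIRST    = 0x40,
	SKETCH_PIC_FLAG_PROGRESSIVE     = 0x80,
};

/* Hypothetical stand-in for the VAAPI picture_coding_extension bits. */
struct sketch_coding_ext {
	unsigned top_field_first : 1;
	unsigned frame_pred_frame_dct : 1;
	unsigned concealment_motion_vectors : 1;
	unsigned q_scale_type : 1;
	unsigned intra_vlc_format : 1;
	unsigned alternate_scan : 1;
	unsigned repeat_first_field : 1;
	unsigned progressive_frame : 1;
};

/* Collapse the eight boolean fields into one bitmask. */
static uint32_t sketch_pack_pic_flags(const struct sketch_coding_ext *ext)
{
	uint32_t flags = 0;

	if (ext->top_field_first)            flags |= SKETCH_PIC_FLAG_TOP_FIELD_FIRST;
	if (ext->frame_pred_frame_dct)       flags |= SKETCH_PIC_FLAG_FRAME_PRED_DCT;
	if (ext->concealment_motion_vectors) flags |= SKETCH_PIC_FLAG_CONCEALMENT_MV;
	if (ext->q_scale_type)               flags |= SKETCH_PIC_FLAG_Q_SCALE_TYPE;
	if (ext->intra_vlc_format)           flags |= SKETCH_PIC_FLAG_INTRA_VLC;
	if (ext->alternate_scan)             flags |= SKETCH_PIC_FLAG_ALT_SCAN;
	if (ext->repeat_first_field)         flags |= SKETCH_PIC_FLAG_REPEAT_FIRST;
	if (ext->progressive_frame)          flags |= SKETCH_PIC_FLAG_PROGRESSIVE;
	return flags;
}

/* Split the packed f_code into four 4-bit values, most significant nibble first. */
static void sketch_unpack_f_code(uint16_t f_code, uint8_t out[2][2])
{
	out[0][0] = (f_code >> 12) & 0x0f;
	out[0][1] = (f_code >> 8) & 0x0f;
	out[1][0] = (f_code >> 4) & 0x0f;
	out[1][1] = (f_code >> 0) & 0x0f;
}

/* Reference timestamp in nanoseconds, or 0 when the reference is missing. */
static uint64_t sketch_ref_ts_or_zero(const struct timeval *tv)
{
	if (tv == NULL)
		return 0;
	return (uint64_t)tv->tv_sec * 1000000000ULL +
	       (uint64_t)tv->tv_usec * 1000ULL;
}
```

Keeping the sentinel logic in one helper makes the behavioral correction explicit: a NULL reference yields 0 and can no longer alias the current surface's own timestamp.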