claude-noether/libva-v4l2-request-fourier

forked from marfrit/libva-v4l2-request-fourier

claude-noether 5fe873c144 fresnel-fourier iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API

Rewrites src/mpeg2.c to submit MPEG-2 control payload via the new
split V4L2_CID_STATELESS_MPEG2_{SEQUENCE,PICTURE,QUANTISATION}
controls (mainline kernel <linux/v4l2-controls.h>:1985-2105),
replacing the staging-era V4L2_CID_MPEG_VIDEO_MPEG2_{SLICE_PARAMS,
QUANTIZATION} combined-struct API that the kernel removed.

Per-frame submission: one batched VIDIOC_S_EXT_CTRLS, count=3,
which=V4L2_CTRL_WHICH_REQUEST_VAL (0xf010000; the controls belong to
V4L2_CTRL_CLASS_CODEC_STATELESS, 0xa40000), with the three controls
in order:
  - id=0xa409dc (SEQUENCE)      size=12  bytes
  - id=0xa409dd (PICTURE)       size=32  bytes
  - id=0xa409de (QUANTISATION)  size=256 bytes
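
The batch layout above can be sketched as follows. This is a minimal
illustration, not the code in src/mpeg2.c: every sketch_* name is a
hypothetical stand-in so that the block compiles without kernel
headers. Real code would fill struct v4l2_ext_control entries from
<linux/videodev2.h> and submit them with VIDIOC_S_EXT_CTRLS on the
request fd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Control IDs as listed in the commit message. */
#define SKETCH_CID_MPEG2_SEQUENCE     0xa409dcu
#define SKETCH_CID_MPEG2_PICTURE      0xa409ddu
#define SKETCH_CID_MPEG2_QUANTISATION 0xa409deu

/* Hypothetical stand-in for one v4l2_ext_control entry. */
struct sketch_ext_control {
	uint32_t id;
	uint32_t size;
	const void *ptr;
};

/* Opaque payloads with the sizes listed above (12, 32, 256 bytes). */
struct sketch_mpeg2_payloads {
	uint8_t sequence[12];
	uint8_t picture[32];
	uint8_t quantisation[256];
};

/*
 * Fill a three-entry batch in SEQUENCE, PICTURE, QUANTISATION order.
 * Returns the entry count that would go into the "count" field.
 */
static unsigned int
sketch_build_mpeg2_batch(const struct sketch_mpeg2_payloads *p,
			 struct sketch_ext_control batch[3])
{
	assert(p != NULL && batch != NULL);
	memset(batch, 0, 3 * sizeof(batch[0]));

	batch[0].id = SKETCH_CID_MPEG2_SEQUENCE;
	batch[0].size = sizeof(p->sequence);
	batch[0].ptr = p->sequence;

	batch[1].id = SKETCH_CID_MPEG2_PICTURE;
	batch[1].size = sizeof(p->picture);
	batch[1].ptr = p->picture;

	batch[2].id = SKETCH_CID_MPEG2_QUANTISATION;
	batch[2].size = sizeof(p->quantisation);
	batch[2].ptr = p->quantisation;

	return 3;
}
```

Keeping all three entries in one array is what lets a single ioctl
carry the whole per-frame control state.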

Matches FFmpeg libavcodec/v4l2_request_mpeg2.c:130-155 reference
implementation. Verified empirically against fresnel-fourier
Phase 0 cross-validator anchor (bit-for-bit byte equivalence on
SEQUENCE first-row + QUANTISATION 256 bytes).

Six structural changes from old to new API:

  1. Slice header parsing moved to kernel: bit_size,
     data_bit_offset, quantiser_scale_code GONE from new structs.
  2. Reference timestamps moved from slice to picture:
     backward_ref_ts/forward_ref_ts now in
     v4l2_ctrl_mpeg2_picture (offsets 0/8).
  3. Boolean fields collapsed into picture.flags bitmask
     (TOP_FIELD_FIRST 0x01 .. PROGRESSIVE 0x80, 8 bits total).
  4. progressive_sequence collapsed into sequence.flags &
     V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE.
  5. PICTURE_CODING_TYPE renamed to PIC_CODING_TYPE (values same).
  6. Quantisation load_* flags removed; matrices always present;
     British spelling — quantiSation not quantiZation.
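
Change 3 (booleans collapsed into a bitmask) can be sketched as
below. The struct and function names are hypothetical. Only the end
points 0x01 and 0x80 are stated above; the six intermediate bit
assignments are an assumption based on the V4L2_MPEG2_PIC_FLAG_*
definitions and should be checked against <linux/v4l2-controls.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror V4L2_MPEG2_PIC_FLAG_* (0x01 .. 0x80). */
#define SKETCH_PIC_FLAG_TOP_FIELD_FIRST 0x01u
#define SKETCH_PIC_FLAG_FRAME_PRED_DCT  0x02u
#define SKETCH_PIC_FLAG_CONCEALMENT_MV  0x04u
#define SKETCH_PIC_FLAG_Q_SCALE_TYPE    0x08u
#define SKETCH_PIC_FLAG_INTRA_VLC       0x10u
#define SKETCH_PIC_FLAG_ALT_SCAN        0x20u
#define SKETCH_PIC_FLAG_REPEAT_FIRST    0x40u
#define SKETCH_PIC_FLAG_PROGRESSIVE     0x80u

/* Hypothetical view of the per-picture booleans the old API carried. */
struct sketch_picture_bools {
	bool top_field_first;
	bool frame_pred_frame_dct;
	bool concealment_motion_vectors;
	bool q_scale_type;
	bool intra_vlc_format;
	bool alternate_scan;
	bool repeat_first_field;
	bool progressive_frame;
};

/* Collapse the booleans into one flags word, one bit per field. */
static uint32_t
sketch_pack_picture_flags(const struct sketch_picture_bools *b)
{
	uint32_t flags = 0;

	assert(b != NULL);
	if (b->top_field_first)
		flags |= SKETCH_PIC_FLAG_TOP_FIELD_FIRST;
	if (b->frame_pred_frame_dct)
		flags |= SKETCH_PIC_FLAG_FRAME_PRED_DCT;
	if (b->concealment_motion_vectors)
		flags |= SKETCH_PIC_FLAG_CONCEALMENT_MV;
	if (b->q_scale_type)
		flags |= SKETCH_PIC_FLAG_Q_SCALE_TYPE;
	if (b->intra_vlc_format)
		flags |= SKETCH_PIC_FLAG_INTRA_VLC;
	if (b->alternate_scan)
		flags |= SKETCH_PIC_FLAG_ALT_SCAN;
	if (b->repeat_first_field)
		flags |= SKETCH_PIC_FLAG_REPEAT_FIRST;
	if (b->progressive_frame)
		flags |= SKETCH_PIC_FLAG_PROGRESSIVE;
	return flags;
}
```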

Behavioral correction (latent bug in the old code):
  Old src/mpeg2.c:104-118 self-referenced surface_object timestamp
  when the VAAPI ref picture was VA_INVALID_ID. New code sets the
  ref_ts to 0, matching kernel doc 0-as-sentinel convention
  (verified Phase 3 Baseline C: I-frame has both ts == 0; FFmpeg
  v4l2_request_mpeg2.c:98-108 same convention).
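
A minimal sketch of the corrected sentinel behavior, under the
assumption that a missing reference (VA_INVALID_ID) is represented
here by a NULL pointer. The sketch_surface type and the function name
are hypothetical and do not match the backend's surface object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical surface record: only the timestamp matters here. */
struct sketch_surface {
	uint64_t timestamp_ns;
};

/*
 * Resolve both reference timestamps. A missing reference yields 0
 * (the sentinel) instead of falling back to the current surface's
 * own timestamp, which is what the old code did.
 */
static void
sketch_set_ref_ts(const struct sketch_surface *forward_ref,
		  const struct sketch_surface *backward_ref,
		  uint64_t *forward_ref_ts, uint64_t *backward_ref_ts)
{
	assert(forward_ref_ts != NULL && backward_ref_ts != NULL);
	*forward_ref_ts = forward_ref ? forward_ref->timestamp_ns : 0;
	*backward_ref_ts = backward_ref ? backward_ref->timestamp_ns : 0;
}
```

For an I-frame both pointers are NULL, so both timestamps come out as
0, which is the shape reported for Baseline C above.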

Quantisation matrix order: zigzag scanning order per kernel doc
v4l2-controls.h:2076. VAAPI VAIQMatrixBufferMPEG2 stores in
zigzag order (per VAAPI spec). Direct memcpy works; no
permutation in libva backend. Kernel hantro_mpeg2.c::
hantro_mpeg2_dec_copy_qtable applies zigzag-to-raster permutation
when copying to the hardware quantisation table.
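
For illustration, a zigzag-to-raster permutation of one 64-entry
matrix can be sketched as below. This is not a copy of the hantro
driver code; the function name is hypothetical and the table is the
conventional 8x8 zigzag scan, assumed to be the one the kernel doc
refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sketch_zigzag[i] = raster index of the i-th coefficient in scan order. */
static const uint8_t sketch_zigzag[64] = {
	 0,  1,  8, 16,  9,  2,  3, 10,
	17, 24, 32, 25, 18, 11,  4,  5,
	12, 19, 26, 33, 40, 48, 41, 34,
	27, 20, 13,  6,  7, 14, 21, 28,
	35, 42, 49, 56, 57, 50, 43, 36,
	29, 22, 15, 23, 30, 37, 44, 51,
	58, 59, 52, 45, 38, 31, 39, 46,
	53, 60, 61, 54, 47, 55, 62, 63
};

/* Scatter a matrix stored in zigzag order into raster (row-major) order. */
static void
sketch_zigzag_to_raster(const uint8_t zz[64], uint8_t raster[64])
{
	size_t i;

	assert(zz != NULL && raster != NULL);
	for (i = 0; i < 64; i++)
		raster[sketch_zigzag[i]] = zz[i];
}
```

Because the userspace side hands over zigzag-ordered matrices
unchanged, a permutation of this kind is only needed where the
hardware table expects raster order.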

Default matrices (when iqmatrix_set==false): MPEG-2 spec defaults
per ISO/IEC 13818-2 Table 7-3. The mpeg2_default_intra_matrix
constant was transcribed from fresnel-fourier Phase 3 Baseline C
QUANTISATION verbatim payload bytes 0..63 (256-byte capture from
ffmpeg-v4l2request decode of bbb_720p10s_mpeg2.ts), per
phase5_iter1_review.md S3 amendment that flagged spec-recall as
unreliable. non_intra and chroma_non_intra are 16s per spec
(verified Baseline C bytes 64..127, 192..255). chroma_intra is a
copy of intra (Baseline C bytes 128..191, verified identical).
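
The default-matrix fill described above can be sketched as follows.
The intra table is passed in rather than spelled out, since the
commit deliberately avoids reproducing it from memory. The struct is
a hypothetical mirror of the 256-byte payload whose member order
follows the byte ranges quoted above; the function name is also
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the 256-byte QUANTISATION payload. */
struct sketch_mpeg2_quantisation {
	uint8_t intra_quantiser_matrix[64];            /* bytes   0..63  */
	uint8_t non_intra_quantiser_matrix[64];        /* bytes  64..127 */
	uint8_t chroma_intra_quantiser_matrix[64];     /* bytes 128..191 */
	uint8_t chroma_non_intra_quantiser_matrix[64]; /* bytes 192..255 */
};

/*
 * Fill defaults when the application supplied no IQ matrix:
 * intra from the caller's default table, chroma_intra as a copy of
 * it, and both non-intra matrices set to 16 throughout.
 */
static void
sketch_fill_default_quantisation(struct sketch_mpeg2_quantisation *q,
				 const uint8_t default_intra[64])
{
	assert(q != NULL && default_intra != NULL);
	memcpy(q->intra_quantiser_matrix, default_intra, 64);
	memcpy(q->chroma_intra_quantiser_matrix, default_intra, 64);
	memset(q->non_intra_quantiser_matrix, 16, 64);
	memset(q->chroma_non_intra_quantiser_matrix, 16, 64);
}
```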

Submission shape: one batched v4l2_set_controls call with all
three v4l2_ext_control entries, matching iter6/7/8 H.264 pattern
at src/h264.c:986. Bound to surface_object->request_fd (the
per-OUTPUT-slot permanent request_fd from iter6 binding).

Behavioral details:
  - sequence.vbv_buffer_size = surface_object->source_size, where
    source_size is set in picture.c:276 from request_pool slot->size,
    which is the V4L2-negotiated sizeimage from VIDIOC_QUERYBUF.
    Matches FFmpeg controls->pic.output->size.
  - sequence.profile_and_level_indication = 0; not exposed by
    VAAPI VAPictureParameterBufferMPEG2.
  - sequence.chroma_format = 1 (4:2:0) hardcoded; campaign codec
    scope is 4:2:0.
  - progressive_frame proxies for progressive_sequence; same bit
    for typical streams.
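
The four details above can be sketched as one fill routine. The
struct is a hypothetical 12-byte mirror of the SEQUENCE payload; the
horizontal_size/vertical_size members and the 0x01 value for the
progressive flag are assumptions based on the kernel UAPI header and
are not stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE. */
#define SKETCH_SEQ_FLAG_PROGRESSIVE 0x01u

/* Hypothetical mirror of the 12-byte SEQUENCE payload. */
struct sketch_mpeg2_sequence {
	uint16_t horizontal_size;
	uint16_t vertical_size;
	uint32_t vbv_buffer_size;
	uint16_t profile_and_level_indication;
	uint8_t chroma_format;
	uint8_t flags;
};

static void
sketch_fill_sequence(struct sketch_mpeg2_sequence *seq,
		     uint16_t width, uint16_t height,
		     uint32_t source_size, bool progressive_frame)
{
	assert(seq != NULL);
	memset(seq, 0, sizeof(*seq));
	seq->horizontal_size = width;
	seq->vertical_size = height;
	/* OUTPUT buffer size stands in for the stream's VBV size. */
	seq->vbv_buffer_size = source_size;
	/* Not available from the VAAPI picture parameters. */
	seq->profile_and_level_indication = 0;
	/* 4:2:0 only. */
	seq->chroma_format = 1;
	/* progressive_frame used as a proxy for progressive_sequence. */
	if (progressive_frame)
		seq->flags |= SKETCH_SEQ_FLAG_PROGRESSIVE;
}
```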

Phase 6 smoke test (post Commit A + Commit B):
  - vainfo enumerates VAProfileMPEG2Simple + VAProfileMPEG2Main
    on hantro bind. (Phase 1 criterion 1)
  - libva trace: vaCreateConfig(VAProfileMPEG2Main) =
    VA_STATUS_SUCCESS. (Phase 1 criterion 2)
  - ffmpeg -hwaccel vaapi exits 0 with no Failed-to-create-
    decode-configuration. (Phase 1 criterion 3 adjusted)
  - mpv --hwdec=vaapi --vo=image at +02s seek: 2 distinct
    frames with hashes byte-identical to SW reference:
      HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
      SW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
      HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
      SW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
    (Phase 1 criterion 4 — DMA-BUF GL import path; cache-coherency-safe)
  - T4 H.264 reference hashes still match (criterion 5; verified
    Phase 3 Baseline D earlier).

Cache-stale class observation (out-of-scope iter1 work item):
  ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi + hwdownload
  pipeline produces all-zero NV12 for MPEG-2 (same iter1 patch-0011
  cache-coherency bug class observed for H.264 in fresnel-fourier
  T4). Kernel + HW decode is correct (verified via ffmpeg
  -hwaccel v4l2request -hwaccel_output_format drm_prime + hwdownload
  which produces correct non-zero pixels matching SW reference).
  Bug is in libva backend vaDeriveImage path; Phase 4 cross-
  cutting work to add VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC support.
  Not blocking iter1 — DMA-BUF GL import path (mpv --vo=image) is
  cache-coherency-safe and gives bit-exact pixels.

Auxiliary EINVAL noise (out-of-scope iter1 work item):
  src/context.c:142-155 unconditionally sets H.264 device-wide
  controls (V4L2_CID_STATELESS_H264_DECODE_MODE,
  _START_CODE) on every CreateContext, regardless of profile.
  EINVALs on hantro-vpu-dec (no H.264 controls there). Intentional
  best-effort behavior — return value cast to (void) and discarded
  at line 153. The error message "Unable to set control(s):
  Invalid argument" is logged from src/v4l2.c:484 but doesn't
  propagate as a backend error. Stays as documented auxiliary
  noise.

Drop #include <mpeg2-ctrls.h> from src/config.c:37 and src/mpeg2.c
(formerly line 38). The kernel UAPI for MPEG-2 stateless control
IDs comes from <linux/v4l2-controls.h>, pulled transitively via
<linux/videodev2.h> (and explicitly from src/mpeg2.c after this
rewrite). The fork local include/mpeg2-ctrls.h header is deleted
in commit C; this commit removes the last includes of it.
src/config.c:38 still includes <hevc-ctrls.h> — left untouched per
phase5_iter1_review.md Nit 6 (lower-risk path; HEVC iteration
deletes its header).

Refs:
  ../fresnel-fourier/phase4_iter1_plan.md (contract clauses 1-6,
                                           File 2 patch shape)
  ../fresnel-fourier/phase5_iter1_review.md (S3, Q4, Q5 amendments)
  ../fresnel-fourier/phase0_evidence/2026-05-07/iter1_phase3/
    baseline_C_xvalidator/ffmpeg.stdout (cross-validator anchor)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-08 10:17:40 +02:00

include

fourier-local: stateless control modernization + HEVC strip

2026-05-04 09:40:14 +00:00

src

fresnel-fourier iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API

2026-05-08 10:17:40 +02:00

tests

iter8 Phase 4: tests/run_perf_binding_cell.sh — perf binding cell harness

2026-05-06 11:59:13 +00:00

.clang-format

tree: Run clang-format to conform to the kernel coding style

2018-07-17 10:12:15 +02:00

.gitignore

Rename va_config to config for consistency

2018-04-23 17:09:19 +02:00

AUTHORS

Update AUTHORS file with Maxime and Paul

2018-09-02 21:54:52 +02:00

autogen.sh

autotools: Rewrite configuration in a minimalistic fashion

2019-03-07 11:37:12 +01:00

configure.ac

Lower libva requirement to API version 1.1.0 (lib version 2.1.0)

2019-03-07 14:11:18 +01:00

COPYING

COPYING: Reformulate and make more concise

2018-04-23 15:52:03 +02:00

COPYING.LGPL

Clarify licenses text

2016-08-26 15:43:09 +02:00

COPYING.MIT

Clarify licenses text

2016-08-26 15:43:09 +02:00

CREDITS

CREDITS: add Albin Söderqvist

2018-09-08 08:51:51 +02:00

Makefile.am

autotools: Rewrite configuration in a minimalistic fashion

2019-03-07 11:37:12 +01:00

meson_options.txt

Add option to specify path to up-to-date kernel headers

2019-05-17 13:59:23 +08:00

meson.build

Add option to specify path to up-to-date kernel headers

2019-05-17 13:59:23 +08:00

README.md

Update README.md to mention H265 support

2018-09-02 21:54:18 +02:00

STUDY.md

STUDY.md: pointer to libva-multiplanar campaign Phase 0

2026-05-04 09:45:45 +00:00

README.md

v4l2-request libVA Backend

About

This libVA backend is designed to work with the Linux Video4Linux2 Request API that is used by a number of video codec drivers, including the Video Engine found in most Allwinner SoCs.

Status

The v4l2-request libVA backend currently supports the following formats:

MPEG2 (Simple and Main profiles)
H264 (Baseline, Main and High profiles)
H265 (Main profile)

Instructions

In order to use this libVA backend, the v4l2_request driver has to be specified through the LIBVA_DRIVER_NAME environment variable, as such:

export LIBVA_DRIVER_NAME=v4l2_request

A media player that supports VAAPI (such as VLC) can then be used to decode a video in a supported format:

vlc path/to/video.mpg

Sample media files can be obtained from:

http://samplemedia.linaro.org/MPEG2/
http://samplemedia.linaro.org/MPEG4/SVT/

Technical Notes

Surface

A Surface is an internal data structure, never handled by the VA's user, that contains the output of a rendering. Usually, a set of surfaces is created at the beginning of decoding and they are then used in rotation. When created, a surface is assigned a corresponding v4l capture buffer, which it keeps until the end of decoding. Syncing a surface waits for the v4l buffer to be available and then dequeues it.

Note: since a Surface is kept private from the VA's user, the user can instead ask to render a Surface directly on screen in an X Drawable. A rudimentary implementation is available in PutSurface, but it is intended for development purposes only.

Context

A Context is a global data structure used for rendering a video of a certain format. When a context is created, input buffers are created and v4l's output (which is the compressed data input queue, since capture is the real output) format is set.

Picture

A Picture is an encoded input frame made of several buffers. A single input can contain slice data, headers and an IQ matrix. Each Picture is assigned a request ID when created, and each corresponding buffer may be turned into a v4l buffer or an extended control when rendered. Finally, they are submitted to kernel space when reaching EndPicture.

The real rendering is done in EndPicture instead of RenderPicture because the v4l2 driver expects to have the full corresponding extended control when a buffer is queued and we don't know in which order the different RenderPicture will be called.
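
The deferral described above can be sketched as follows. This is an illustration under simplifying assumptions, not the backend's code: all `sketch_*` names are hypothetical, buffers are plain pointers, and "submission" is reduced to setting a flag once the required buffers are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer kinds a Picture may receive, in any order. */
enum sketch_buffer_type {
	SKETCH_BUF_PICTURE_PARAMS,
	SKETCH_BUF_IQ_MATRIX,
	SKETCH_BUF_SLICE_DATA,
	SKETCH_BUF_COUNT
};

struct sketch_picture {
	const void *buffers[SKETCH_BUF_COUNT];
	bool submitted;
};

/* RenderPicture stage: only record the buffer, submit nothing yet. */
static void
sketch_render_picture(struct sketch_picture *pic,
		      enum sketch_buffer_type type, const void *data)
{
	assert(pic != NULL && type < SKETCH_BUF_COUNT);
	pic->buffers[type] = data;
}

/*
 * EndPicture stage: everything is known now, so controls and slice
 * data can be handed over together. The IQ matrix is optional here.
 */
static bool
sketch_end_picture(struct sketch_picture *pic)
{
	assert(pic != NULL);
	if (pic->buffers[SKETCH_BUF_PICTURE_PARAMS] == NULL ||
	    pic->buffers[SKETCH_BUF_SLICE_DATA] == NULL)
		return false;
	pic->submitted = true;
	return true;
}
```

Recording first and submitting last makes the result independent of the order in which the application calls RenderPicture.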

Image

An Image is a standard data structure containing rendered frames in a usable pixel format. Here we only use NV12 buffers which are converted from sunxi's proprietary tiled pixel format with tiled_yuv when deriving an Image from a Surface.

Languages

C 96.2%

Shell 2%

Meson 0.8%

Assembly 0.4%

Makefile 0.4%

Other 0.2%