This is what was actually broken — and it's much smaller than the STUDY
guessed.
The library is in fact multiplanar-aware throughout the v4l2.c helpers
(v4l2_setup_format, v4l2_get_format, v4l2_query_buffer, v4l2_queue_buffer,
v4l2_dequeue_buffer all branch on v4l2_type_is_mplane). The whole
"context.c / picture.c are single-plane" hypothesis from STUDY.md was wrong —
they derive output_type and capture_type from
video_format->v4l2_mplane and pass them through.
The bug is upstream of all that: driver_data->video_format is set in
RequestCreateSurfaces only, and the probe there only tries
V4L2_BUF_TYPE_VIDEO_CAPTURE (single-plane). On hantro the single-plane
ENUM_FMT returns nothing, video_format stays NULL, and every subsequent
operation hits the `if (video_format == NULL) return OPERATION_FAILED`
guard at the top of context.c / buffer.c / image.c. That's exactly what
Brave's vaCreateContext failure was — not a downstream multiplanar fault,
just the format-detection short-circuit firing.
Three small changes:
- src/video.c: add an NV12 multi-plane entry next to the existing
single-plane NV12 / Sunxi entries. Same pixelformat fourcc (S264 vs NV12
has nothing to do with mplane), distinguished by the v4l2_mplane bit.
- src/video.h + src/video.c: video_format_find() takes a `bool mplane`
parameter and matches both fields. Without this the single-plane and
multi-plane NV12 entries collide on pixelformat.
- src/surface.c: the probe block now tries V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE
as a fallback after the single-plane probes return nothing. On a
single-plane decoder the mplane probe is a no-op; on hantro it picks
the new mplane NV12 entry, video_format gets set, and the rest of the
library — already mplane-aware — does the right thing for OUTPUT
S_FMT, REQBUFS, EXPBUF, QBUF, DQBUF.
Also: src/context.c: the H264 case still referenced V4L2_PIX_FMT_H264_SLICE_RAW
that we renamed in commit c1f5108. Caught now.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v4l2-request libVA Backend
About
This libVA backend is designed to work with the Linux Video4Linux2 Request API that is used by a number of video codecs drivers, including the Video Engine found in most Allwinner SoCs.
Status
The v4l2-request libVA backend currently supports the following formats:
- MPEG2 (Simple and Main profiles)
- H264 (Baseline, Main and High profiles)
- H265 (Main profile)
Instructions
In order to use this libVA backend, the v4l2_request driver has to
be specified through the LIBVA_DRIVER_NAME environment variable, as
such:
export LIBVA_DRIVER_NAME=v4l2_request
A media player that supports VAAPI (such as VLC) can then be used to decode a video in a supported format:
vlc path/to/video.mpg
Sample media files can be obtained from:
http://samplemedia.linaro.org/MPEG2/
http://samplemedia.linaro.org/MPEG4/SVT/
Technical Notes
Surface
A Surface is an internal data structure never handled by the VA's user containing the output of a rendering. Usualy, a bunch of surfaces are created at the begining of decoding and they are then used alternatively. When created, a surface is assigned a corresponding v4l capture buffer and it is kept until the end of decoding. Syncing a surface waits for the v4l buffer to be available and then dequeue it.
Note: since a Surface is kept private from the VA's user, it can ask to directly render a Surface on screen in an X Drawable. Some kind of implementation is available in PutSurface but this is only for development purpose.
Context
A Context is a global data structure used for rendering a video of a certain format. When a context is created, input buffers are created and v4l's output (which is the compressed data input queue, since capture is the real output) format is set.
Picture
A Picture is an encoded input frame made of several buffers. A single input can contain slice data, headers and IQ matrix. Each Picture is assigned a request ID when created and each corresponding buffer might be turned into a v4l buffers or extended control when rendered. Finally they are submitted to kernel space when reaching EndPicture.
The real rendering is done in EndPicture instead of RenderPicture because the v4l2 driver expects to have the full corresponding extended control when a buffer is queued and we don't know in which order the different RenderPicture will be called.
Image
An Image is a standard data structure containing rendered frames in a usable pixel format. Here we only use NV12 buffers which are converted from sunxi's proprietary tiled pixel format with tiled_yuv when deriving an Image from a Surface.