Phase 8.10+8.11: libva consumer integration scaffold

Brings daedalus_v4l2 from "standalone test client" to "VAAPI-
discoverable decoder" by adding the surface formats and
media-controller plumbing that libva-v4l2-request-fourier
(sibling repo) requires.

libva-v4l2-request-fourier patches (pushed separately):
- b5b3acf: daedalus_v4l2 added to known_decoder_drivers
- 2146341: meson option gate

This commit (daedalus-v4l2 side, 3 production changes):

1. V4L2_PIX_FMT_NV12 (single-plane) on CAPTURE
   - Added to daedalus_capture_formats[] alongside NV12M + P010
   - daedalus_fill_capture_fmt handles num_planes=1 case
     (sizeimage = W*H*3/2, bytesperline = W)
   - daemon pack_nv12_single_to_plane: Y at base+0,
     interleaved CbCr at base+(stride*H); same byte content
     as NV12M two-plane, different layout
   - Required because libva-v4l2-request-fourier's video.c
     only knows non-multi-plane NV12 (it advertises
     v4l2_mplane=true but uses the single-plane fourcc).
   - Verified byte-exact via test_m2m_stream against
     ffmpeg -pix_fmt nv12 reference (VP9 1080p 10 frames,
     31 MB).

2. V4L2 Request API media ops
   - daedalus_media_ops = { vb2_request_validate,
     v4l2_m2m_request_queue } assigned to mdev.ops before
     media_device_init.
   - Without this, MEDIA_IOC_REQUEST_ALLOC returned
     -ENOTTY and no VAAPI consumer could allocate a
     media_request.

3. Stateless control registration via v4l2_ctrl_new_custom
   - Switched from v4l2_ctrl_new_std_compound (with a NULL
     p_def) to v4l2_ctrl_new_custom, the pattern that rkvdec,
     cedrus and hantro use. Adds a no-op s_ctrl callback.
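
For reference, the single-plane NV12 layout from item 1 can be
sketched in plain C. This is a minimal illustration, not the
daemon's pack_nv12_single_to_plane: the names
nv12_single_sizeimage and nv12_pack_single are hypothetical, and
it assumes no row padding beyond the stride and an even height.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes in one single-plane NV12 frame: Y (stride*H) followed by
 * interleaved CbCr (stride*H/2), i.e. W*H*3/2 when stride == W. */
size_t nv12_single_sizeimage(size_t stride, size_t height)
{
	return stride * height + stride * height / 2;
}

/* Hypothetical helper: copy separate Y and CbCr planes (the NV12M
 * arrangement) into one contiguous buffer. Y lands at base+0 and
 * CbCr at base+(stride*height); the bytes themselves are unchanged. */
void nv12_pack_single(uint8_t *base, const uint8_t *y,
		      const uint8_t *cbcr, size_t stride, size_t height)
{
	assert(height % 2 == 0);	/* 4:2:0 chroma needs an even height */
	memcpy(base, y, stride * height);
	memcpy(base + stride * height, cbcr, stride * height / 2);
}
```

For 1920x1080 this gives 3110400 bytes per frame, which matches
the roughly 31 MB quoted for the 10-frame reference.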

Verification (hertz, Pi 5, 6.12.75+rpt-rpi-2712):

LibVA trace through `ffmpeg -hwaccel vaapi`:
  vaInitialize / Profiles / Entrypoints / CreateConfig /
  QuerySurfaceAttributes / CreateSurfaces / CreateContext
  (cap_pool: 24 slots, 1 plane each) / CreateBuffer
  (slice + picture params) / MEDIA_IOC_REQUEST_ALLOC
  — all succeed.

Standalone NV12 decode path:
  test_m2m_stream vp9_1080_stream.ivf out.nv12 1920 1080 vp9 nv12
  → 10/10 frames, byte-exact vs ffmpeg -pix_fmt nv12

vainfo (via libva-v4l2-request-fourier with our driver):
  7 VAProfile entries with VAEntrypointVLD
  (H264 Main/High/ConstrainedBaseline/MultiviewHigh/StereoHigh,
   VP9Profile0, AV1Profile0)

What's NOT here (Phase 8.12):

The libva trace stops at VIDIOC_S_EXT_CTRLS returning
EINVAL when populating V4L2_CID_STATELESS_VP9_FRAME on
the request. v4l2-core validates the compound-control payload
against the struct layout it expects and rejects it.
This isn't a "missing line" fix — it needs proper
stateless control plumbing (the SPS/PPS/SliceParams
get_dims, validate, default-value paths that in-tree
rkvdec/cedrus/hantro implement to satisfy v4l2-core's
std_validate). Documented as Phase 8.12 scope.

What ships here is the framework scaffolding: format
negotiation, request allocation and control registration all
work. The one remaining gap is the S_EXT_CTRLS payload
validation described above.

See docs/phase_8_10_11_closure.md for the full trace
analysis + next-phase plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:51:16 +00:00
parent d84efdb125
commit 0de0288dce
4 changed files with 380 additions and 11 deletions
+84 -11
@@ -69,8 +69,18 @@ static const u32 daedalus_output_formats[] = {
#define DAEDALUS_NUM_OUTPUT_FMTS ARRAY_SIZE(daedalus_output_formats)
#define DAEDALUS_DEFAULT_OUTPUT_FOURCC V4L2_PIX_FMT_VP9_FRAME
/*
* NV12 (single-plane Y+CbCr contiguous) listed alongside NV12M
* (two-plane Y / CbCr separate) so legacy MPLANE clients that
* expect single-plane buffer geometry (e.g. libva-v4l2-request-
* fourier's NV12 video_format entry, used by VAAPI consumers via
* ffmpeg vaapi) can negotiate the format successfully. The two
* fourccs differ only in plane layout — bit-exact pixel content
* is identical.
*/
static const u32 daedalus_capture_formats[] = {
V4L2_PIX_FMT_NV12M,
V4L2_PIX_FMT_NV12,
V4L2_PIX_FMT_P010,
};
#define DAEDALUS_NUM_CAPTURE_FMTS ARRAY_SIZE(daedalus_capture_formats)
@@ -177,21 +187,44 @@ static const u32 daedalus_stateless_ctrls[] = {
V4L2_CID_STATELESS_AV1_FILM_GRAIN,
};
/*
* No-op control op set: daemon ignores all stateless control
* values (FFmpeg re-parses the bitstream). But v4l2-core requires
* ops to be present on a ctrl_handler that processes SET requests
* — without it, S_EXT_CTRLS rejects with EINVAL on validate.
* Always-success s_ctrl is the right shape for "we accept whatever
* you tell us but actually act on the OUTPUT buffer payload alone."
*/
static int daedalus_s_ctrl_noop(struct v4l2_ctrl *ctrl)
{
(void) ctrl;
return 0;
}
static const struct v4l2_ctrl_ops daedalus_ctrl_ops = {
.s_ctrl = daedalus_s_ctrl_noop,
};
static int daedalus_register_stateless_ctrls(struct v4l2_ctrl_handler *hdl)
{
size_t i;
/*
* Use v4l2_ctrl_new_custom (the pattern rkvdec / cedrus /
* hantro use) rather than v4l2_ctrl_new_std_compound.
* v4l2-core auto-detects the type from each known
* V4L2_CID_STATELESS_* id and allocates the right payload
* size internally; S_EXT_CTRLS then validates user input
* against that allocated payload. v4l2_ctrl_new_std_compound
* with NULL p_def was rejecting writes (libva-v4l2-request-
* fourier got EINVAL on every stateless ctrl SET).
*/
for (i = 0; i < ARRAY_SIZE(daedalus_stateless_ctrls); i++) {
-v4l2_ctrl_new_std_compound(hdl, NULL,
-daedalus_stateless_ctrls[i],
-v4l2_ctrl_ptr_create(NULL));
-/*
-* Errors here mean the v4l2-core doesn't know about
-* this CID on this kernel (e.g. older trees missing
-* AV1_FILM_GRAIN). hdl->error captures it; we
-* tolerate it — the codec just won't appear as
-* supported through that control.
-*/
struct v4l2_ctrl_config cfg = {
.ops = &daedalus_ctrl_ops,
.id = daedalus_stateless_ctrls[i],
};
v4l2_ctrl_new_custom(hdl, &cfg, NULL);
if (hdl->error) {
pr_debug("daedalus_v4l2: skipping unsupported CID 0x%x (err=%d)\n",
daedalus_stateless_ctrls[i], hdl->error);
@@ -204,9 +237,14 @@ static int daedalus_register_stateless_ctrls(struct v4l2_ctrl_handler *hdl)
/* -- format helpers -------------------------------------------------- */
/*
-* CAPTURE format fill. Two layouts supported:
* CAPTURE format fill. Three layouts supported:
* NV12M (default, 8-bit) — 2 planes: Y (W*H bytes) + interleaved
* CbCr at half-res (W*H/2 bytes).
* NV12 (8-bit, 1 plane) — 1 plane: Y (W*H) followed by
* interleaved CbCr (W*H/2); total
* W*H*3/2 bytes. For legacy MPLANE
* clients that don't speak multi-
* plane (libva-v4l2-request).
* P010 (10-bit HDR) — 1 plane: Y first (W*H*2 bytes) then
* interleaved CbCr at half-res
* (W*H bytes); 16-bit samples,
@@ -230,6 +268,12 @@ static void daedalus_fill_capture_fmt(struct v4l2_pix_format_mplane *f,
f->plane_fmt[0].sizeimage = w * h * 2 + w * h;
f->plane_fmt[1].bytesperline = 0;
f->plane_fmt[1].sizeimage = 0;
} else if (fourcc == V4L2_PIX_FMT_NV12) {
f->num_planes = 1;
f->plane_fmt[0].bytesperline = w;
f->plane_fmt[0].sizeimage = w * h + w * h / 2;
f->plane_fmt[1].bytesperline = 0;
f->plane_fmt[1].sizeimage = 0;
} else {
f->num_planes = 2;
f->plane_fmt[0].bytesperline = w;
@@ -927,6 +971,26 @@ static void daedalus_vdev_release(struct video_device *vdev)
/* embedded in daedalus_dev (devm) — nothing to free here */
}
/* -- media controller request-API ops (Phase 8.11) ------------------ */
/*
* V4L2 Request API plumbing: lets a client allocate a media_request
* (MEDIA_IOC_REQUEST_ALLOC), stage per-buffer controls into it via
* VIDIOC_S_EXT_CTRLS with which=V4L2_CTRL_WHICH_REQUEST_VAL, then
* queue the OUTPUT buffer with the request fd bound — all controls
* + the buffer apply atomically at decode submission.
*
* vb2_request_validate / v4l2_m2m_request_queue are the canonical
* helpers; the daemon doesn't actually use the staged controls
* (FFmpeg re-parses the bitstream) but the wire-level support is
* what libva-v4l2-request-fourier requires to call MEDIA_IOC_
* REQUEST_ALLOC successfully.
*/
static const struct media_device_ops daedalus_media_ops = {
.req_validate = vb2_request_validate,
.req_queue = v4l2_m2m_request_queue,
};
/* -- platform driver bind -------------------------------------------- */
static int daedalus_probe(struct platform_device *pdev)
@@ -964,9 +1028,18 @@ static int daedalus_probe(struct platform_device *pdev)
* are required by spec to expose a media controller (the
* request API rides on it) — v4l2-compliance's DECODER_CMD
* test rejects drivers without it.
*
* Phase 8.11: wire the V4L2 request API media ops so libva-
* v4l2-request-fourier can MEDIA_IOC_REQUEST_ALLOC against
* us. vb2_request_validate + v4l2_m2m_request_queue are the
* canonical helpers — they bundle per-buffer controls with
* the matching qbuf so the decode submission is atomic
* (required for stateless decoders feeding hardware that
* needs all params present before kickoff).
*/
dev->mdev.dev = &pdev->dev;
strscpy(dev->mdev.model, "daedalus-v4l2", sizeof(dev->mdev.model));
dev->mdev.ops = &daedalus_media_ops;
media_device_init(&dev->mdev);
dev->v4l2_dev.mdev = &dev->mdev;