daedalus-v4l2/docs/phase_8_10_11_closure.md
marfrit 0de0288dce Phase 8.10+8.11: libva consumer integration scaffold
Brings daedalus_v4l2 from "standalone test client" to "VAAPI-
discoverable decoder" by adding the surface formats and
media-controller plumbing that libva-v4l2-request-fourier
(sibling repo) requires.

libva-v4l2-request-fourier patches (pushed separately):
- b5b3acf: daedalus_v4l2 added to known_decoder_drivers
- 2146341: meson option gate

This commit (daedalus-v4l2 side, 3 production changes):

1. V4L2_PIX_FMT_NV12 (single-plane) on CAPTURE
   - Added to daedalus_capture_formats[] alongside NV12M + P010
   - daedalus_fill_capture_fmt handles num_planes=1 case
     (sizeimage = W*H*3/2, bytesperline = W)
   - daemon pack_nv12_single_to_plane: Y at base+0,
     interleaved CbCr at base+(stride*H); same byte content
     as NV12M two-plane, different layout
   - Required because libva-v4l2-request-fourier's video.c
     only knows non-multi-plane NV12 (it advertises
     v4l2_mplane=true but uses the single-plane fourcc).
   - Verified byte-exact via test_m2m_stream against
     ffmpeg -pix_fmt nv12 reference (VP9 1080p 10 frames,
     31 MB).

2. V4L2 Request API media ops
   - daedalus_media_ops = { vb2_request_validate,
     v4l2_m2m_request_queue } assigned to mdev.ops before
     media_device_init.
   - Without this, MEDIA_IOC_REQUEST_ALLOC returned
     -ENOTTY and no VAAPI consumer could allocate a
     media_request.

3. Stateless control registration via v4l2_ctrl_new_custom
   - Switched from v4l2_ctrl_new_std_compound(NULL p_def)
     to v4l2_ctrl_new_custom — pattern rkvdec/cedrus/
     hantro use. Adds a no-op s_ctrl callback.

Verification (hertz, Pi 5, 6.12.75+rpt-rpi-2712):

LibVA trace through `ffmpeg -hwaccel vaapi`:
  vaInitialize / Profiles / Entrypoints / CreateConfig /
  QuerySurfaceAttributes / CreateSurfaces / CreateContext
  (cap_pool: 24 slots, 1 plane each) / CreateBuffer
  (slice + picture params) / MEDIA_IOC_REQUEST_ALLOC
  — all succeed.

Standalone NV12 decode path:
  test_m2m_stream vp9_1080_stream.ivf out.nv12 1920 1080 vp9 nv12
  → 10/10 frames, byte-exact vs ffmpeg -pix_fmt nv12

vainfo (via libva-v4l2-request-fourier with our driver):
  7 VAProfile entries with VAEntrypointVLD
  (H264 Main/High/CBaseline/MultiviewHigh/StereoHigh,
   VP9Profile0, AV1Profile0)

What's NOT here (Phase 8.12):

The libva trace stops at VIDIOC_S_EXT_CTRLS returning
EINVAL when populating V4L2_CID_STATELESS_VP9_FRAME on
the request: the kernel validates the compound-control
payload against its expected struct shape and rejects it.
This isn't a "missing line" fix — it needs proper
stateless control plumbing (the SPS/PPS/SliceParams
get_dims, validate, default-value paths that in-tree
rkvdec/cedrus/hantro implement to satisfy v4l2-core's
std_validate). Documented as Phase 8.12 scope.

The shipped integration is itself a meaningful deliverable:
all the framework scaffolding is in place; the remaining
gap is well-characterised and bounded.

See docs/phase_8_10_11_closure.md for the full trace
analysis + next-phase plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:51:16 +00:00


Phase 8.10 + 8.11 closure — libva consumer integration scaffold

Status: closed 2026-05-18.

Two interleaved phases:

  • Phase 8.10 — wire daedalus_v4l2 into the existing libva-v4l2-request-fourier campaign fork (the sibling repo at marfrit/libva-v4l2-request-fourier already had VP9/AV1/H264/HEVC working on Rockchip/Allwinner).
  • Phase 8.11 — extend daedalus_v4l2 with the V4L2 surface format and media-controller plumbing the libva fork needs.

Together they bring the daedalus stack from "standalone test client" to "VAAPI-discoverable decoder with all the ICD-side framework integration in place." Actual decode through the libva path stops at the V4L2 stateless control payload acceptance step — a deeper framework integration that lands in Phase 8.12.

What lands

libva-v4l2-request-fourier (sibling fork, gitea marfrit/)

Two commits pushed to master:

  • b5b3acf — daedalus_v4l2: add to known_decoder_drivers + multi-device-probe slot. Same shape as iter40's rpi-hevc-dec wiring: array entry, driver_data fd slot, primary-driver detection branch, post-probe log line. 34-line diff.
  • 2146341 — daedalus_v4l2: meson option gate (default true). meson setup -Ddaedalus_v4l2=false builds a .so with no daedalus strings at all (verified via strings on both build dirs). Struct fields stay unconditional to avoid ODR risk across translation units.

daedalus-v4l2 (this repo)

Three production changes in this commit:

1. V4L2_PIX_FMT_NV12 (single-plane) on CAPTURE

The libva fork's video_format table only knows NV12 single-plane (W*H Y bytes followed by W*H/2 interleaved CbCr bytes in one buffer plane), not NV12M (two-plane). Added NV12 alongside our existing NV12M + P010 in the CAPTURE format list:

  • daedalus_capture_formats[] grew a V4L2_PIX_FMT_NV12 entry; enum_fmt now lists 3 CAPTURE formats.
  • daedalus_fill_capture_fmt handles the new num_planes=1 layout (sizeimage = W*H*3/2, bytesperline = W).
  • daemon pack_nv12_single_to_plane: Y line-by-line into base+0, interleaved CbCr at base+(stride*H). Mirrors the P010 pack structure minus the depth shift.
  • daedalus_decoder_run_request dispatches on req->capture_pix_fmt to the right pack function.

Verified: VP9 1080p decoded into NV12 single-plane via tools/test_m2m_stream, byte-for-byte match against ffmpeg -pix_fmt nv12 reference (10-frame 31 MB stream).
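The single-plane layout described above can be sketched in userspace C. This is a minimal illustration, not the daemon's pack_nv12_single_to_plane: the function names, parameter order, and the assumption that the source arrives as separate Y and CbCr planes with their own strides are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes in a tightly packed single-plane NV12 frame:
 * W*H luma bytes followed by W*H/2 interleaved CbCr bytes. */
static size_t nv12_single_sizeimage(uint32_t width, uint32_t height)
{
    return (size_t)width * height * 3 / 2;
}

/* Hypothetical packer: copies separate Y and CbCr source planes into one
 * destination buffer. Y starts at dst+0, CbCr at dst + dst_stride*height. */
static void nv12_pack_single(uint8_t *dst, uint32_t dst_stride,
                             const uint8_t *src_y, uint32_t src_y_stride,
                             const uint8_t *src_uv, uint32_t src_uv_stride,
                             uint32_t width, uint32_t height)
{
    uint8_t *dst_uv = dst + (size_t)dst_stride * height;
    uint32_t row;

    /* 4:2:0 subsampling needs even dimensions; stride must cover a row. */
    assert(width % 2 == 0 && height % 2 == 0);
    assert(dst_stride >= width);

    /* Luma: one row of `width` bytes per line. */
    for (row = 0; row < height; row++)
        memcpy(dst + (size_t)row * dst_stride,
               src_y + (size_t)row * src_y_stride, width);

    /* Chroma: height/2 rows, each holding width/2 (Cb, Cr) pairs,
     * which is again `width` bytes per row. */
    for (row = 0; row < height / 2; row++)
        memcpy(dst_uv + (size_t)row * dst_stride,
               src_uv + (size_t)row * src_uv_stride, width);
}
```

For 1920x1080, nv12_single_sizeimage returns 3110400, which matches the sizeimage reported by test_m2m_stream in the verification section.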

2. V4L2 Request API media ops

daedalus_media_ops = { .req_validate = vb2_request_validate, .req_queue = v4l2_m2m_request_queue } assigned to mdev.ops before media_device_init. Before this, MEDIA_IOC_REQUEST_ALLOC returned -ENOTTY, so no VAAPI consumer could allocate a media_request.
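From the client side, the effect of the media ops can be checked with a single ioctl. The sketch below is a hypothetical userspace helper (the name media_request_alloc is not from either repo); it only uses the standard MEDIA_IOC_REQUEST_ALLOC ioctl from linux/media.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/media.h>

/* Hypothetical helper: ask the media device for a new request.
 * Returns 0 and stores the request fd, or a negative errno value.
 * Per the note above, a driver without request ops yields -ENOTTY. */
static int media_request_alloc(int media_fd, int *request_fd)
{
    int fd = -1;

    assert(request_fd != NULL);

    if (ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &fd) < 0)
        return -errno;

    *request_fd = fd;
    return 0;
}
```

A caller would open the media node (for example the /dev/media3 path used in the vainfo run), call the helper, and close the returned request fd when the request is no longer needed.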

3. Stateless control registration via v4l2_ctrl_new_custom

Switched from v4l2_ctrl_new_std_compound(NULL p_def) to v4l2_ctrl_new_custom(&cfg, NULL) — the pattern rkvdec / cedrus / hantro use. Adds a no-op s_ctrl callback so v4l2-core has somewhere to dispatch SET operations.

Verification

Probe + enumeration

$ LIBVA_DRIVER_NAME=v4l2_request \
  LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
  LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
  vainfo --display drm --device /dev/dri/renderD128

v4l2-request: phase 8.10: opened daedalus_v4l2 at video_fd=N media_fd=M
vainfo: Driver version: v4l2-request
vainfo: Supported profile and entrypoints
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264MultiviewHigh      : VAEntrypointVLD
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD

Seven VAAPI profiles enumerated through the libva path.

LibVA trace through ffmpeg -hwaccel vaapi

| Step | Status |
|---|---|
| vaInitialize | ✓ |
| vaQueryConfigProfiles | ✓ |
| vaQueryConfigEntrypoints (VLD) | ✓ |
| vaCreateConfig (VP9 + VLD + NV12) | ✓ |
| vaQuerySurfaceAttributes (NV12 fourcc reported) | ✓ |
| vaCreateSurfaces | ✓ |
| vaCreateContext (cap_pool: 24 slots, 1 plane each) | ✓ |
| vaCreateBuffer (slice + picture params) | ✓ |
| MEDIA_IOC_REQUEST_ALLOC | ✓ |
| VIDIOC_S_EXT_CTRLS (stateless ctrls) | ✗ EINVAL |
| VIDIOC_QBUF with request fd | ✗ "Invalid request descriptor" |
| vaEndPicture | ✗ OPERATION_FAILED |

Everything up to and including discovery, probe, context, surface, buffer, and request allocation works. The blocker is VIDIOC_S_EXT_CTRLS returning EINVAL when libva tries to populate V4L2_CID_STATELESS_VP9_FRAME on the request: the kernel validates the payload against its expected compound-control struct shape and rejects it.

This isn't a "missing line" fix — it needs proper stateless control plumbing (the SPS/PPS/SliceParams/etc. get_dims, validate, default-value paths that the in-tree rkvdec / cedrus / hantro decoders implement to satisfy v4l2-core's std_validate machinery). That's Phase 8.12 scope.

Standalone NV12 verification

$ sudo ./tools/test_m2m_stream /tmp/vp9_1080_stream.ivf \
     /tmp/nv12_out.nv12 1920 1080 vp9 nv12
parsed 10 frames, 1920x1080
CAPTURE fmt=NV12 planes=1 sizeimage=[3110400,0]
decoded 10 / 10 frames
$ cmp /tmp/nv12_out.nv12 /tmp/vp9_1080_stream_ref.nv12
$ echo $?
0

Byte-exact through the daedalus-internal path with the NV12 single-plane format. Confirms pack_nv12_single_to_plane produces the same pixels as pack_nv12_to_planes (just re-laid-out into one buffer).

Design decisions

Why ship even though decode-via-libva is blocked

The framework integration up to MEDIA_IOC_REQUEST_ALLOC is itself a significant deliverable:

  • Other VAAPI consumers (testing tools, future patched ffmpeg paths, custom VA clients) get the same scaffolding for free.
  • The remaining gap is well-characterised: it's the stateless control payload acceptance, a single named V4L2 framework integration. Phase 8.12's surface is clearly defined.
  • All the new code is correct on its own merits (NV12 single-plane decode is byte-exact via our own path; request API ops are the canonical helpers).

Why register stateless controls if we don't act on them

libva-v4l2-request-fourier's per-codec dispatch requires the standard V4L2 stateless controls to be visible — that's how it validates that the kernel supports the right profile. Without the registered controls vainfo would not enumerate the profile.

Our daemon ignores the control values because FFmpeg re-parses the bitstream on its own. The plumbing exists to satisfy the V4L2 stateless contract; the actual decode logic doesn't depend on it.

Why v4l2_ctrl_new_custom over _std_compound

v4l2_ctrl_new_std_compound with NULL default rejected SET requests (same EINVAL libva is hitting today — removing the NULL default didn't fix it either). v4l2_ctrl_new_custom is the pattern in-tree decoders use; v4l2-core auto-fills the type/dims/size from the std control table when given just a .id.

The remaining issue is not the registration pattern but the payload validation path. v4l2-core expects more than a registered control: it expects min/max/step/def to be set up for each compound field. v4l2_ctrl_new_std_compound does this internally for known CIDs, but our handler setup does not yet get it right.

What's NOT here (Phase 8.12 scope)

  • Stateless control payload acceptance: the S_EXT_CTRLS EINVAL. Needs proper v4l2-core validation hooks — likely meaning the daemon needs to actually consume the per-frame controls (not just ignore them), so the validation path has something to hand off to.
  • Per-codec control wiring: even if S_EXT_CTRLS succeeds, the actual decode submission to the daemon needs to bundle the per-buffer controls (or document why they're ignored — and convince v4l2-core to allow the request to validate).
  • First end-to-end decoded frame via libva: the payoff for Phase 8.12.

Phase 8.12 plan

  1. Study cedrus or rkvdec's stateless control validation to understand what std_validate expects beyond just registration.
  2. Either:
    • (a) Add proper compound-control validation hooks so S_EXT_CTRLS succeeds without us doing real work (control values become "advisory" since daemon re-parses bitstream), OR
    • (b) Wire the daemon to actually use the per-frame control payload (skip FFmpeg's parse step, trust libva's parsed values). Bigger change but more correct.
  3. Verify first frame decoded through the libva path.
  4. Run the full vainfo --display drm --decode test if that exists, or a small VA decode snippet.