# Phase 8.10 + 8.11 closure: libva consumer integration scaffold

**Status:** closed 2026-05-18.

Two interleaved phases:

- **Phase 8.10**: wire daedalus_v4l2 into the existing
  `libva-v4l2-request-fourier` campaign fork (the sibling repo at
  `marfrit/libva-v4l2-request-fourier` already had VP9/AV1/H264/HEVC
  working on Rockchip/Allwinner).
- **Phase 8.11**: extend daedalus_v4l2 with the V4L2 surface format and
  media-controller plumbing the libva fork needs.

Together they bring the daedalus stack from "standalone test client" to
"VAAPI-discoverable decoder with all the ICD-side framework integration
in place." Actual decode through the libva path stops at the V4L2
stateless control payload acceptance step, a deeper framework
integration that lands in Phase 8.12.

## What lands

### libva-v4l2-request-fourier (sibling fork, gitea `marfrit/`)

Two commits pushed to `master`:

- `b5b3acf` "daedalus_v4l2: add to known_decoder_drivers +
  multi-device-probe slot". Same shape as iter40's rpi-hevc-dec wiring:
  array entry, driver_data fd slot, primary-driver detection branch,
  post-probe log line. 34-line diff.
- `2146341` "daedalus_v4l2: meson option gate (default true)".
  `meson setup -Ddaedalus_v4l2=false` builds a .so with no daedalus
  strings at all (verified via `strings` on both build dirs). Struct
  fields stay unconditional to avoid ODR risk across translation units.

### daedalus-v4l2 (this repo)

Three production changes in this commit:

**1. `V4L2_PIX_FMT_NV12` (single-plane) on CAPTURE**

The libva fork's `video_format` table only knows NV12 single-plane
(W*H Y bytes followed by W*H/2 interleaved CbCr bytes in one buffer
plane), not NV12M (two-plane). Added NV12 alongside our existing
NV12M + P010 in the CAPTURE format list:

- `daedalus_capture_formats[]` grew a `V4L2_PIX_FMT_NV12` entry;
  `enum_fmt` now lists 3 CAPTURE formats.
- `daedalus_fill_capture_fmt` handles the new num_planes=1 layout
  (sizeimage = W*H*3/2, bytesperline = W).
- daemon `pack_nv12_single_to_plane`: Y line-by-line into base+0,
  interleaved CbCr at base+(stride*H). Mirrors the P010 pack structure
  minus the depth shift.
- `daedalus_decoder_run_request` dispatches on `req->capture_pix_fmt`
  to the right pack function.

**Verified**: VP9 1080p decoded into NV12 single-plane via
`tools/test_m2m_stream`, byte-for-byte match against
`ffmpeg -pix_fmt nv12` reference (10-frame 31 MB stream).

**2. V4L2 Request API media ops**

`daedalus_media_ops = { .req_validate = vb2_request_validate, .req_queue = v4l2_m2m_request_queue }`
assigned to `mdev.ops` before `media_device_init`. Before this,
`MEDIA_IOC_REQUEST_ALLOC` returned `-ENOTTY` and any VAAPI consumer
couldn't even allocate a media_request.

**3. Stateless control registration via `v4l2_ctrl_new_custom`**

Switched from `v4l2_ctrl_new_std_compound(NULL p_def)` to
`v4l2_ctrl_new_custom(&cfg, NULL)`, the pattern rkvdec / cedrus /
hantro use. Adds a no-op `s_ctrl` callback so v4l2-core has somewhere
to dispatch SET operations.

## Verification

### Probe + enumeration

```
$ LIBVA_DRIVER_NAME=v4l2_request \
  LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
  LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
  vainfo --display drm --device /dev/dri/renderD128
v4l2-request: phase 8.10: opened daedalus_v4l2 at video_fd=N media_fd=M
vainfo: Driver version: v4l2-request
vainfo: Supported profile and entrypoints
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264MultiviewHigh      : VAEntrypointVLD
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
```

Seven VAAPI profiles enumerated through the libva path.
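
The `-ENOTTY` behaviour described for `MEDIA_IOC_REQUEST_ALLOC` can also be checked from user space without going through libva. The following is a minimal sketch, not code from either repo: `probe_request_alloc` is a hypothetical helper name, and the media node path is whatever the caller passes in. It only assumes the standard `linux/media.h` UAPI header.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/media.h>

/*
 * Hypothetical helper (not part of daedalus-v4l2 or the libva fork).
 * Returns 0 if the media device hands out a request fd, or a negative
 * errno value, e.g. -ENOTTY when the driver registers no request ops.
 */
int probe_request_alloc(const char *media_path)
{
    int request_fd = -1;
    int err = 0;
    int media_fd = open(media_path, O_RDWR);

    if (media_fd < 0)
        return -errno;

    /* The ioctl writes the newly allocated request fd into request_fd. */
    if (ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &request_fd) < 0)
        err = -errno;
    else
        close(request_fd);

    close(media_fd);
    return err;
}
```

Called with the media node used above (`/dev/media3` in this setup), a return value of 0 would indicate that the request ops are wired up; a nonexistent path yields `-ENOENT`.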
### LibVA trace through `ffmpeg -hwaccel vaapi`

| Step | Status |
|------|--------|
| `vaInitialize` | ✓ |
| `vaQueryConfigProfiles` | ✓ |
| `vaQueryConfigEntrypoints` (VLD) | ✓ |
| `vaCreateConfig` (VP9 + VLD + NV12) | ✓ |
| `vaQuerySurfaceAttributes` (NV12 fourcc reported) | ✓ |
| `vaCreateSurfaces` | ✓ |
| `vaCreateContext` (cap_pool: 24 slots, 1 plane each) | ✓ |
| `vaCreateBuffer` (slice + picture params) | ✓ |
| `MEDIA_IOC_REQUEST_ALLOC` | ✓ |
| `VIDIOC_S_EXT_CTRLS` (stateless ctrls) | ✗ EINVAL |
| `VIDIOC_QBUF` with request fd | ✗ "Invalid request descriptor" |
| `vaEndPicture` | ✗ OPERATION_FAILED |

Everything up to and including discovery / probe / context / surface /
buffer / request alloc works. The blocker is `VIDIOC_S_EXT_CTRLS`
returning EINVAL when libva tries to populate
`V4L2_CID_STATELESS_VP9_FRAME` on the request: validation of the
payload against the kernel's expected compound-control struct shape
rejects it.

This isn't a "missing line" fix; it needs proper stateless control
plumbing (the SPS/PPS/SliceParams/etc. get_dims, validate, and
default-value paths that the in-tree rkvdec / cedrus / hantro decoders
implement to satisfy v4l2-core's `std_validate` machinery). That's
Phase 8.12 scope.

### Standalone NV12 verification

```
$ sudo ./tools/test_m2m_stream /tmp/vp9_1080_stream.ivf \
    /tmp/nv12_out.nv12 1920 1080 vp9 nv12
parsed 10 frames, 1920x1080
CAPTURE fmt=NV12 planes=1 sizeimage=[3110400,0]
decoded 10 / 10 frames
$ cmp /tmp/nv12_out.nv12 /tmp/vp9_1080_stream_ref.nv12
$ echo $?
0
```

Byte-exact through the daedalus-internal path with the NV12
single-plane format. Confirms `pack_nv12_single_to_plane` produces the
same pixels as `pack_nv12_to_planes` (just re-laid-out into one
buffer).
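
To make the single-plane layout concrete, here is a minimal user-space sketch of the packing described above: luma rows at `base+0`, interleaved CbCr rows at `base+(stride*H)`, and a total size of W*H*3/2. It is not the daemon's `pack_nv12_single_to_plane`; the function names, the signature, and the assumption that the source frame is planar 4:2:0 (separate Cb and Cr planes, even width and height) are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of a single-plane NV12 buffer: stride*H luma + stride*H/2 chroma. */
size_t nv12_sizeimage(size_t stride, size_t height)
{
    return stride * height * 3 / 2;
}

/*
 * Hypothetical stand-in for the daemon's pack routine.
 * Assumes a planar 4:2:0 source with even width and height:
 *   src_y        : height rows, y_stride bytes apart
 *   src_cb/src_cr: height/2 rows of width/2 samples, c_stride bytes apart
 * Writes luma at base+0 and interleaved CbCr at base + stride*height.
 */
void pack_nv12_single(uint8_t *base, size_t stride,
                      const uint8_t *src_y, size_t y_stride,
                      const uint8_t *src_cb, const uint8_t *src_cr,
                      size_t c_stride, size_t width, size_t height)
{
    uint8_t *dst_uv = base + stride * height;

    /* Luma: copy line by line so source and destination strides may differ. */
    for (size_t row = 0; row < height; row++)
        memcpy(base + row * stride, src_y + row * y_stride, width);

    /* Chroma: interleave Cb and Cr into one row of `width` bytes. */
    for (size_t row = 0; row < height / 2; row++) {
        uint8_t *dst = dst_uv + row * stride;
        for (size_t col = 0; col < width / 2; col++) {
            dst[2 * col]     = src_cb[row * c_stride + col];
            dst[2 * col + 1] = src_cr[row * c_stride + col];
        }
    }
}
```

With stride 1920 and height 1080, `nv12_sizeimage` gives 3110400, which matches the `sizeimage` value reported by `test_m2m_stream` above.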
## Design decisions

### Why ship even though decode-via-libva is blocked

The framework integration up to MEDIA_IOC_REQUEST_ALLOC is itself a
significant deliverable:

- Other VAAPI consumers (testing tools, future patched ffmpeg paths,
  custom VA clients) get the same scaffolding for free.
- The remaining gap is well-characterised: it's the stateless control
  payload acceptance, a single named V4L2 framework integration.
  Phase 8.12's surface is clearly defined.
- All the new code is correct on its own merits (NV12 single-plane
  decode is byte-exact via our own path; request API ops are the
  canonical helpers).

### Why register stateless controls if we don't act on them

libva-v4l2-request-fourier's per-codec dispatch requires the standard
V4L2 stateless controls to be visible; that's how it validates that
the kernel supports the right profile. Without the registered controls,
vainfo would not enumerate the profile.

Our daemon ignores the control values because FFmpeg re-parses the
bitstream on its own. The plumbing exists to satisfy the V4L2 stateless
contract; the actual decode logic doesn't depend on it.

### Why `v4l2_ctrl_new_custom` over `_std_compound`

`v4l2_ctrl_new_std_compound` with a `NULL` default rejected SET
requests (the same EINVAL libva is hitting today; removing the `NULL`
default didn't fix it either). `v4l2_ctrl_new_custom` is the pattern
in-tree decoders use; v4l2-core auto-fills the type/dims/size from the
std control table when given just a `.id`.

The remaining issue isn't the registration pattern but the payload
validation path. v4l2-core expects more than just a registered control:
it expects the driver to have set up `min/max/step/def` for each
compound field, which `v4l2_ctrl_new_std_compound` does internally for
known CIDs but which our handler does not yet get right.

## What's NOT here (Phase 8.12 scope)

- **Stateless control payload acceptance**: the S_EXT_CTRLS EINVAL.
  Needs proper v4l2-core validation hooks, likely meaning the daemon
  needs to actually consume the per-frame controls (not just ignore
  them), so the validation path has something to hand off to.
- **Per-codec control wiring**: even if S_EXT_CTRLS succeeds, the
  actual decode submission to the daemon needs to bundle the
  per-buffer controls (or document why they're ignored, and convince
  v4l2-core to allow the request to validate).
- **First end-to-end decoded frame via libva**: the payoff for
  Phase 8.12.

## Phase 8.12 plan

1. Study cedrus or rkvdec's stateless control validation to understand
   what `std_validate` expects beyond just registration.
2. Either:
   - (a) Add proper compound-control validation hooks so S_EXT_CTRLS
     succeeds without us doing real work (control values become
     "advisory" since the daemon re-parses the bitstream), OR
   - (b) Wire the daemon to actually use the per-frame control payload
     (skip FFmpeg's parse step, trust libva's parsed values). Bigger
     change but more correct.
3. Verify first frame decoded through the libva path.
4. Run the full `vainfo --display drm --decode` test if that exists, or
   a small VA decode snippet.