# Phase 8.13 closure: byte-exact end-to-end via libva

**Status:** closed 2026-05-18.

The project's consumer-side goal landed: a real VAAPI consumer (ffmpeg with `-hwaccel vaapi`) drives our libva backend → V4L2 driver → daemon → byte-exact NV12 output back to ffmpeg.

```
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
       -hwaccel_output_format nv12 -i vp9_small.ivf \
       -f rawvideo -y /tmp/vp9_via_libva.nv12
cmp /tmp/vp9_via_libva.nv12 /tmp/vp9_ref_for_libva.nv12   # ← match
```

The output is 18432 bytes (one 128x96 NV12 frame), byte-for-byte identical to a plain `ffmpeg -pix_fmt nv12 -f rawvideo` software decode of the same VP9 keyframe.

The deliverable shape recorded in the `project_consumer_target` memory, "V4L2 stateless node consumed by a real VAAPI client", is achieved.

## What lands

Two related kernel changes that unstick the libva request completion handshake, plus one diagnostic improvement on the libva side:

### 1. Stateless control handler initialisation

Call `v4l2_ctrl_handler_setup(&ctx->hdl)` after control registration, matching rkvdec/cedrus/hantro. This brings each registered compound control out of the "uninitialised" state via the `std_init_compound` defaults (e.g. VP9_FRAME gets `profile=0, bit_depth=8`).

### 2. Per-request control completion in the decode path

The actual root cause of "Timeout when waiting for media request":

- vb2-core's `vb2_buffer_done` unbinds the BUFFER's req_obj on normal decode completion.
- But the per-request CONTROL object stays bound until `v4l2_ctrl_request_complete` runs.
- The vb2 `buf_request_complete` op fires only from queue-cancel paths (vb2-core line 2284), NOT from normal buf_done.
- The driver must call `v4l2_ctrl_request_complete(req, hdl)` explicitly from its decode-completion path.

Fix (in `kernel/daedalus_v4l2_main.c`):

```c
struct daedalus_inflight {
    ...
    struct media_request *req;  /* captured from src_buf */
};

static void daedalus_device_run(void *priv)
{
    ...
    inf->req = src_buf->vb2_buf.req_obj.req;
    ...
}

void daedalus_complete_resp_frame(...)
{
    ...
    if (inf->req)
        v4l2_ctrl_request_complete(inf->req, &inf->ctx->hdl);
    v4l2_m2m_buf_done_and_job_finish(...);
}
```

For non-request flows (test_m2m_stream's direct QBUF) `inf->req` is NULL, so the conditional skips the `v4l2_ctrl_request_complete` call. Both consumer styles work concurrently.

### 3. Diagnostic improvements

- libva-v4l2-request-fourier `src/v4l2.c`: better error logging in `v4l2_set_controls` (logs `error_idx`, failing control id, size). This made the diagnosis above tractable.

## Verification

### End-to-end via ffmpeg + libva, byte-exact

```
$ pkill -f daedalus_v4l2_daemon; sudo rmmod daedalus_v4l2
$ sudo insmod kernel/daedalus_v4l2.ko
$ daedalus_v4l2_daemon -v daemon &
$ LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
  LIBVA_DRIVER_NAME=v4l2_request \
  LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
  LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
  ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
         -hwaccel_output_format nv12 -i /tmp/vp9_small.ivf \
         -f rawvideo -y /tmp/vp9_via_libva.nv12
v4l2-request: cap_pool_init: 24 slots ready
v4l2-request: Unable to set control(s): EINVAL   (H264 probe, harmless)
v4l2-request: Unable to set control(s): EINVAL   (HEVC probe, harmless)
(no timeout, no decode error)

daemon log:
  REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
  decoder: opened vp9 context
  decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe luma=12288 chroma=6144

$ ls -la /tmp/vp9_via_libva.nv12
-rw-r--r-- 1 root root 18432 May 18 20:13 /tmp/vp9_via_libva.nv12

$ ffmpeg -i /tmp/vp9_small.ivf -pix_fmt nv12 -f rawvideo \
         -y /tmp/vp9_ref_for_libva.nv12
$ cmp /tmp/vp9_via_libva.nv12 /tmp/vp9_ref_for_libva.nv12
$ echo $?
0
```

Byte-for-byte match. The two `Unable to set control(s): EINVAL` messages are libva probing the H264 and HEVC PROFILE/LEVEL integer controls during config negotiation. We don't register those (since we expose VP9/AV1/H264 stateless), so libva gets EINVAL, logs it, and moves on.
Functional flow is unaffected.

## Design analysis

### Why was Phase 8.12 close but not complete

8.12 wired all the request API hooks (`supports_requests`, `buf_out_validate`, `buf_request_complete`) and the daemon-side decode worked byte-exact, but the per-request control object stayed bound forever because `buf_request_complete` only fires from queue-cancel paths in vb2-core, not from normal buf_done. Result: the request never transitioned to COMPLETE and the libva poll timed out.

Phase 8.13 closes that loop by capturing the media_request from the OUTPUT vb2_buffer's req_obj at device_run time and calling `v4l2_ctrl_request_complete` explicitly when the decode finishes (chardev RESP_FRAME path). This mirrors what rkvdec does from its IRQ handler and cedrus from its device_run completion.

### Why the EINVAL noise was misleading

Earlier phases (8.10–8.12) kept seeing "Unable to set control(s): Invalid argument" and assumed it pointed at our stateless control registration. strace revealed three separate S_EXT_CTRLS calls per frame:

| # | controls | result | meaning |
|---|----------|--------|---------|
| 1 | H264_PROFILE + H264_LEVEL | EINVAL | libva probes H264; we don't register it |
| 2 | HEVC_PROFILE + HEVC_LEVEL | EINVAL | libva probes HEVC; we don't register it |
| 3 | VP9_FRAME + VP9_COMPRESSED_HDR | OK | actual decode controls |

Calls 1 and 2 are harmless: libva detects we don't support H264/HEVC integer probes and falls back to the stateless controls it does have. Call 3 (the actual VP9 stateless controls) succeeded all along. Only the completion handshake was broken.

Phase 8.13's added `error_idx` logging in v4l2.c (`failing_ctrl_id=0xa40900 size=0` etc.) is what made the distinction visible.

### Why one fix unblocked both 8.13 and 8.14

The original plan split Phase 8.13 ("trace the EINVAL") from Phase 8.14 ("call request_complete from the right place").
Once strace clarified that the EINVAL was probe noise, the real fix was just the request_complete call from the decode path, a 10-line change. Doing both in one shot avoided a phase boundary that wouldn't have shipped anything additional.

## What's NOT here (Phase 8.14+ scope)

- **Multi-frame stream via libva.** Verified for a single keyframe only; P-frame reference handling across requests is untested. It likely works, since the daemon's AVCodecContext is persistent across REQ_DECODE calls (already proven via test_m2m_stream).
- **AV1 + H.264 via libva.** Different stateless control sets; these need the same control-payload validation path and may need similar `v4l2_ctrl_request_complete` adjustments per codec.
- **mpv + Firefox end-to-end.** The lower-level harness (ffmpeg vaapi) works; higher-level consumers should follow, but each has its own VAAPI quirks.
- **The two harmless EINVALs from H264/HEVC profile probes.** These could be suppressed by registering those integer controls too (always rejecting writes), but that's a polish item.

## Phase 8.14 plan

1. Multi-frame VP9 stream via libva (re-use vp9_60s.ivf from the Phase 8.9 stress test).
2. AV1 + H.264 single-frame via libva (likely needs codec-specific tweaks).
3. Document any remaining libva-side quirks for higher-level consumers (mpv, Firefox).
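
For plan item 1, the single-frame `cmp` check from the Verification section has to cover many frames. The following is a minimal sketch of how that check could be generalised, assuming the multi-frame dump is a plain concatenation of NV12 frames with even width and height. The helper names `nv12_frame_bytes` and `check_nv12_dump` are hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the repository.

# nv12_frame_bytes WIDTH HEIGHT
# One NV12 frame is a full-resolution luma plane plus an interleaved
# chroma plane of half that size (assumes even WIDTH and HEIGHT).
nv12_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# check_nv12_dump WIDTH HEIGHT CANDIDATE REFERENCE
# Succeeds only if CANDIDATE is non-empty, holds a whole number of
# frames, and is byte-identical to REFERENCE.
check_nv12_dump() {
    [ -f "$3" ] && [ -f "$4" ] || { echo "missing input file" >&2; return 1; }
    frame=$(nv12_frame_bytes "$1" "$2")
    size=$(( $(wc -c < "$3") ))
    [ "$size" -gt 0 ] || { echo "empty dump: $3" >&2; return 1; }
    [ $(( size % frame )) -eq 0 ] || { echo "partial frame in $3" >&2; return 1; }
    cmp -s "$3" "$4" || { echo "mismatch: $3 vs $4" >&2; return 1; }
    echo "$(( size / frame )) frame(s) match"
}
```

For the keyframe above, `nv12_frame_bytes 128 96` gives 18432, which agrees with the `luma=12288 chroma=6144` reported in the daemon log. The whole-frame check is there to catch a truncated dump before `cmp` reports a less specific mismatch.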