The project's consumer-side goal landed: a real VAAPI consumer
(ffmpeg with -hwaccel vaapi) drives our libva backend → V4L2
driver → daemon → byte-exact NV12 output back to ffmpeg.
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format nv12 -i vp9_small.ivf \
-f rawvideo -y /tmp/vp9_via_libva.nv12
cmp /tmp/vp9_via_libva.nv12 /tmp/vp9_ref_for_libva.nv12 → match
18432-byte NV12 byte-for-byte identical to plain ffmpeg
-pix_fmt nv12 software decode. The project_consumer_target
memory's deliverable shape — "V4L2 stateless node consumed by
a real VAAPI client" — is achieved.
Two related kernel changes:
1. v4l2_ctrl_handler_setup(&ctx->hdl) after registration —
matches rkvdec/cedrus/hantro. Brings each registered
compound control out of "uninitialised" state via
std_init_compound defaults.
2. Per-request control completion in the decode path —
the real fix for "Timeout when waiting for media request".
vb2-core's vb2_buffer_done unbinds the BUFFER's req_obj
on normal decode completion, but the per-request CONTROL
object stays bound. buf_request_complete fires only from
queue-cancel paths (vb2-core line 2284), NOT from normal
buf_done. The driver must call
v4l2_ctrl_request_complete(req, hdl) explicitly from the
completion path.
struct daedalus_inflight gained a `struct media_request
*req` field, captured from src_buf->vb2_buf.req_obj.req
in device_run. daedalus_complete_resp_frame then calls
v4l2_ctrl_request_complete before
v4l2_m2m_buf_done_and_job_finish — triggers
MEDIA_REQUEST_STATE_COMPLETE and wakes the request fd
poll.
For non-request flows (test_m2m_stream direct QBUF)
inf->req is NULL; the conditional skips the call.
Both consumer styles work concurrently.
Diagnostic clarification (was Phase 8.13a):
strace traced three S_EXT_CTRLS calls per frame:
1. H264_PROFILE + H264_LEVEL → EINVAL (we don't register)
2. HEVC_PROFILE + HEVC_LEVEL → EINVAL (we don't register)
3. VP9_FRAME + VP9_COMPRESSED_HDR → SUCCESS
The first two are harmless: libva probes whether we support
H264/HEVC integer profile/level controls during config
negotiation; we don't register them (we expose the codecs only
through stateless controls), so EINVAL just falls through. The
actual VP9 stateless controls (#3)
succeeded all along — the libva-side "Unable to set control(s)"
log was misleading us into thinking the control path was the
bug.
Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):
daemon log:
REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
decoder: opened vp9 context
decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe ...
ffmpeg side:
no Timeout, no Decoding error
/tmp/vp9_via_libva.nv12: 18432 bytes
cmp vs reference: byte-for-byte identical.
Roadmap update:
- 8.10/8.11, 8.12, 8.13 marked closed with closure docs.
- 8.14 = multi-frame VP9 via libva, AV1 + H.264, mpv/Firefox
higher-level consumers.
Per correctness-before-speed:
- strace + kernel-source-reading found the actual root cause
rather than guessing.
- Conditional v4l2_ctrl_request_complete preserves the existing
test_m2m_stream non-request path — both consumer styles work
concurrently without per-flow branching elsewhere.
- Byte-exact pixel comparison, not "frame size matches."
Phase 8.14 next: multi-frame stream + multi-codec via libva +
mpv/Firefox.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 8.13 closure — byte-exact end-to-end via libva
Status: closed 2026-05-18.
The project's consumer-side goal landed: a real VAAPI consumer
(ffmpeg with -hwaccel vaapi) drives our libva backend → V4L2
driver → daemon → byte-exact NV12 output back to ffmpeg.
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format nv12 -i vp9_small.ivf \
-f rawvideo -y /tmp/vp9_via_libva.nv12
cmp /tmp/vp9_via_libva.nv12 /tmp/vp9_ref_for_libva.nv12 # ← match
18432 bytes, byte-for-byte identical to plain
ffmpeg -pix_fmt nv12 -f rawvideo software decode of the same
VP9 keyframe. The project_consumer_target memory's deliverable
shape — "V4L2 stateless node consumed by a real VAAPI client" —
is achieved.
What lands
Two related kernel changes that unstick the libva request completion handshake:
1. Stateless control handler initialisation
v4l2_ctrl_handler_setup(&ctx->hdl) after registration —
matches rkvdec/cedrus/hantro. Brings each registered compound
control out of "uninitialised" state via the std_init_compound
defaults (e.g. VP9_FRAME gets profile=0, bit_depth=8).
2. Per-request control completion in the decode path
The actual root cause of "Timeout when waiting for media request":
- vb2-core's vb2_buffer_done unbinds the BUFFER's req_obj on normal decode completion.
- But the per-request CONTROL object stays bound until v4l2_ctrl_request_complete runs.
- The vb2 buf_request_complete op fires only from queue-cancel paths (vb2-core line 2284), NOT from normal buf_done.
- The driver must call v4l2_ctrl_request_complete(req, hdl) explicitly from its decode-completion path.
Fix (in kernel/daedalus_v4l2_main.c):
struct daedalus_inflight {
...
struct media_request *req; /* captured from src_buf */
};
static void daedalus_device_run(void *priv) {
...
inf->req = src_buf->vb2_buf.req_obj.req;
...
}
void daedalus_complete_resp_frame(...) {
...
if (inf->req)
v4l2_ctrl_request_complete(inf->req, &inf->ctx->hdl);
v4l2_m2m_buf_done_and_job_finish(...);
}
For non-request flows (test_m2m_stream's direct QBUF)
inf->req is NULL; the conditional skips the
v4l2_ctrl_request_complete call. Both consumer styles
work concurrently.
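Seen from the consumer side, the completion this fix triggers is what a client waits for on the request fd. The sketch below is a minimal illustration of that wait, assuming the client uses poll() with POLLPRI as described in the media request API documentation. The function name wait_request_complete and its return convention are hypothetical, and libva-v4l2-request may implement the wait differently.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: wait until a media request fd signals completion.
 * Returns 1 when POLLPRI is raised (request completed), 0 on timeout,
 * and -1 on a poll error or an unexpected event such as POLLERR.
 */
int wait_request_complete(int request_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = request_fd, .events = POLLPRI };
    int ret;

    /* Retry if a signal interrupts the wait. */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;   /* the "Timeout when waiting for media request" case */

    return (pfd.revents & POLLPRI) ? 1 : -1;
}
```

Before this change, a wait like this would run into its timeout because the request never reached MEDIA_REQUEST_STATE_COMPLETE. With the explicit v4l2_ctrl_request_complete call, the wait returns as soon as the decode finishes.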
3. Diagnostic improvements
- libva-v4l2-request-fourier src/v4l2.c: better error logging in v4l2_set_controls (logs error_idx, failing control id, size). Made the diagnosis above tractable.
Verification
End-to-end via ffmpeg + libva, byte-exact
$ pkill -f daedalus_v4l2_daemon; sudo rmmod daedalus_v4l2
$ sudo insmod kernel/daedalus_v4l2.ko
$ daedalus_v4l2_daemon -v daemon &
$ LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
LIBVA_DRIVER_NAME=v4l2_request \
LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format nv12 -i /tmp/vp9_small.ivf \
-f rawvideo -y /tmp/vp9_via_libva.nv12
v4l2-request: cap_pool_init: 24 slots ready
v4l2-request: Unable to set control(s): EINVAL (H264 probe — harmless)
v4l2-request: Unable to set control(s): EINVAL (HEVC probe — harmless)
(no timeout, no decode error)
daemon log:
REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
decoder: opened vp9 context
decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe luma=12288 chroma=6144
$ ls -la /tmp/vp9_via_libva.nv12
-rw-r--r-- 1 root root 18432 May 18 20:13 /tmp/vp9_via_libva.nv12
$ ffmpeg -i /tmp/vp9_small.ivf -pix_fmt nv12 -f rawvideo \
-y /tmp/vp9_ref_for_libva.nv12
$ cmp /tmp/vp9_via_libva.nv12 /tmp/vp9_ref_for_libva.nv12
$ echo $?
0
Byte-for-byte match. The two Unable to set control(s): EINVAL messages are libva probing H264 + HEVC PROFILE/LEVEL
integer controls during config negotiation — we don't
register those (since we expose VP9/AV1/H264 stateless), libva
gets EINVAL, logs it, and moves on. Functional flow is
unaffected.
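The sizes in the log above follow directly from the NV12 layout, and the arithmetic can be checked with a short sketch. It assumes a tightly packed frame with no stride padding. The fnv1a32 function is the standard 32-bit FNV-1a hash, but whether the daemon's fnv1a= field uses exactly this variant, and over which bytes, is an assumption. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plane sizes of a tightly packed NV12 frame (no stride padding assumed). */
struct nv12_layout {
    size_t luma;    /* Y plane: one byte per pixel */
    size_t chroma;  /* interleaved UV plane, subsampled 2x2 */
    size_t total;
};

static struct nv12_layout nv12_layout_for(size_t width, size_t height)
{
    struct nv12_layout l;

    /* 4:2:0 subsampling needs even dimensions. */
    assert(width % 2 == 0 && height % 2 == 0);

    l.luma = width * height;
    l.chroma = width * height / 2;
    l.total = l.luma + l.chroma;
    return l;
}

/* Standard 32-bit FNV-1a: XOR each byte in, then multiply by the prime. */
static uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}
```

For the 128x96 test frame, nv12_layout_for(128, 96) gives luma=12288, chroma=6144 and total=18432, matching both the daemon log and the output file size.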
Design analysis
Why Phase 8.12 was close but not complete
8.12 wired all the request API hooks (supports_requests,
buf_out_validate, buf_request_complete) and the daemon-side
decode worked byte-exact — but the per-request control object
stayed bound forever because buf_request_complete only fires
from queue-cancel paths in vb2-core, not from normal buf_done.
Result: request never transitioned to COMPLETE, libva poll
timed out.
Phase 8.13 closes that loop by capturing the media_request from
the OUTPUT vb2_buffer's req_obj at device_run time and calling
v4l2_ctrl_request_complete explicitly when the decode
finishes (chardev RESP_FRAME path). Mirrors what rkvdec does
from its IRQ handler and cedrus from its device_run completion.
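The reasoning above can be illustrated with a toy userspace model. This is not kernel code: it only mimics the idea that a request completes once every object bound to it has been completed or unbound. All toy_* names are hypothetical, and the real media request bookkeeping is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a request completes when no bound objects remain. */
enum toy_req_state { TOY_REQ_QUEUED, TOY_REQ_COMPLETE };

struct toy_request {
    int bound_objects;          /* objects still holding the request open */
    enum toy_req_state state;
};

static void toy_object_bind(struct toy_request *req)
{
    req->bound_objects++;
}

static void toy_object_complete(struct toy_request *req)
{
    assert(req->bound_objects > 0);
    if (--req->bound_objects == 0)
        req->state = TOY_REQ_COMPLETE;  /* this is where poll would wake */
}

/*
 * Models the decode-completion path. The buffer object is always released
 * (standing in for vb2_buffer_done); the control object is released only
 * when complete_ctrls is true (standing in for the explicit
 * v4l2_ctrl_request_complete call added in Phase 8.13).
 */
static enum toy_req_state toy_decode_done(struct toy_request *req,
                                          bool complete_ctrls)
{
    if (complete_ctrls)
        toy_object_complete(req);
    toy_object_complete(req);
    return req->state;
}
```

With one buffer object and one control object bound, toy_decode_done(&req, false) leaves the request in TOY_REQ_QUEUED, which corresponds to the 8.12 timeout. toy_decode_done(&req, true) reaches TOY_REQ_COMPLETE, which corresponds to the 8.13 behaviour.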
Why the EINVAL noise was misleading
Earlier phases (8.10–8.12) kept seeing "Unable to set control(s): Invalid argument" and assumed it pointed at our stateless control registration. strace revealed three separate S_EXT_CTRLS calls per frame:
| # | controls | result | meaning |
|---|---|---|---|
| 1 | H264_PROFILE + H264_LEVEL | EINVAL | libva probes H264; we don't register it |
| 2 | HEVC_PROFILE + HEVC_LEVEL | EINVAL | libva probes HEVC; we don't register it |
| 3 | VP9_FRAME + VP9_COMPRESSED_HDR | OK | actual decode controls |
Calls 1 and 2 are harmless: libva detects we don't support H264/HEVC integer probes and falls back to the stateless controls it does have. Call 3 (the actual VP9 stateless controls) succeeded all along. Only the completion handshake was broken.
Phase 8.13's added error_idx logging in v4l2.c
(failing_ctrl_id=0xa40900 size=0 etc.) is what made the
distinction visible.
Why one fix unblocked both 8.13 and 8.14
The original plan split Phase 8.13 ("trace the EINVAL") from Phase 8.14 ("call request_complete from the right place"). Once strace clarified that the EINVAL was probe noise, the real fix was just the request_complete call from the decode path — a 10-line change. Doing both in one shot avoided a phase boundary that wouldn't have shipped anything additional.
What's NOT here (Phase 8.14+ scope)
- Multi-frame stream via libva. Verified single keyframe; P-frame reference handling across requests untested. Likely works (the daemon's AVCodecContext is persistent across REQ_DECODE calls — already proven via test_m2m_stream).
- AV1 + H.264 via libva. Different stateless control sets; needs the same control-payload validation path. May need similar v4l2_ctrl_request_complete adjustments per codec.
- mpv + Firefox end-to-end. The lower-level harness (ffmpeg vaapi) works; higher-level consumers should follow but each has its own VAAPI quirks.
- The two harmless EINVALs from H264/HEVC profile probes. Could be suppressed by registering those integer controls too (always rejecting writes) but that's a polish item.
Phase 8.14 plan
- Multi-frame VP9 stream via libva (re-use vp9_60s.ivf from Phase 8.9 stress test).
- AV1 + H.264 single-frame via libva (likely needs codec-specific tweaks).
- Document any remaining libva-side quirks for higher-level consumers (mpv, Firefox).