The project's consumer-side goal landed: a real VAAPI consumer
(ffmpeg with -hwaccel vaapi) drives our libva backend → V4L2
driver → daemon → byte-exact NV12 output back to ffmpeg.
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format nv12 -i vp9_small.ivf \
-f rawvideo -y /tmp/vp9_via_libva.nv12
cmp /tmp/vp9_via_libva.nv12 /tmp/vp9_ref_for_libva.nv12 → match
The 18432-byte NV12 output is byte-for-byte identical to a plain
ffmpeg -pix_fmt nv12 software decode. The project_consumer_target
memory's deliverable shape — "V4L2 stateless node consumed by
a real VAAPI client" — is achieved.
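For reference, the 18432-byte figure follows from the NV12 layout at the 128x96 capture size reported in the daemon log. A minimal sketch of that arithmetic; the helper name nv12_frame_size is illustrative and not part of the project, and it assumes 8-bit samples with no row padding:

```c
#include <stddef.h>

/* NV12 at 8 bits per sample: a full-resolution Y plane followed by one
 * interleaved CbCr plane subsampled 2x2, i.e. 1.5 bytes per pixel.
 * Assumption: tightly packed rows (no stride padding).
 * Returns 0 for zero or odd dimensions, which this sketch does not handle. */
size_t nv12_frame_size(size_t width, size_t height)
{
    if (width == 0 || height == 0 || (width % 2) != 0 || (height % 2) != 0)
        return 0;

    size_t luma = width * height;   /* Y plane */
    size_t chroma = luma / 2;       /* interleaved CbCr plane */
    return luma + chroma;
}
```

nv12_frame_size(128, 96) evaluates to 12288 + 6144 = 18432, matching the output file size.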
Two related kernel changes:
1. v4l2_ctrl_handler_setup(&ctx->hdl) after registration —
matches rkvdec/cedrus/hantro. Brings each registered
compound control out of "uninitialised" state via
std_init_compound defaults.
2. Per-request control completion in the decode path —
the real fix for "Timeout when waiting for media request".
vb2-core's vb2_buffer_done unbinds the BUFFER's req_obj
on normal decode completion, but the per-request CONTROL
object stays bound. buf_request_complete fires only from
queue-cancel paths (vb2-core line 2284), NOT from normal
buf_done. The driver must call
v4l2_ctrl_request_complete(req, hdl) explicitly from the
completion path.
struct daedalus_inflight gained a `struct media_request
*req` field, captured from src_buf->vb2_buf.req_obj.req
in device_run. daedalus_complete_resp_frame then calls
v4l2_ctrl_request_complete before
v4l2_m2m_buf_done_and_job_finish — triggers
MEDIA_REQUEST_STATE_COMPLETE and wakes the request fd
poll.
For non-request flows (test_m2m_stream direct QBUF)
inf->req is NULL; the conditional skips the call.
Both consumer styles work concurrently.
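The ordering described in change 2 can be sketched as a userspace model. This is not the driver code: the structs and both functions are stand-ins defined here so the logic compiles outside the kernel, the trimmed daedalus_inflight layout is hypothetical, and buf_done_and_job_finish_stub only marks where v4l2_m2m_buf_done_and_job_finish would run.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types of the same name.
 * The counter exists only so the sketch can be observed. */
struct media_request { int completed_ctrls; };
struct v4l2_ctrl_handler { int unused; };

/* Hypothetical, trimmed view of the in-flight bookkeeping. */
struct daedalus_inflight {
    struct media_request *req;      /* NULL for direct-QBUF (non-request) flows */
    struct v4l2_ctrl_handler *hdl;
    int bufs_done;
};

/* Stub with the same name and argument order as the kernel helper. */
void v4l2_ctrl_request_complete(struct media_request *req,
                                struct v4l2_ctrl_handler *hdl)
{
    (void)hdl;
    req->completed_ctrls++;
}

/* Placeholder for v4l2_m2m_buf_done_and_job_finish(). */
void buf_done_and_job_finish_stub(struct daedalus_inflight *inf)
{
    inf->bufs_done++;
}

/* Complete the per-request control object first (request flows only),
 * then hand the buffers back and finish the job. */
void complete_resp_frame_sketch(struct daedalus_inflight *inf)
{
    if (inf->req)
        v4l2_ctrl_request_complete(inf->req, inf->hdl);

    buf_done_and_job_finish_stub(inf);
}
```

With req set, the control object is completed exactly once before the buffers are returned; with req NULL, only the buffer completion runs, which is the path test_m2m_stream exercises.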
Diagnostic clarification (was Phase 8.13a):
strace traced three S_EXT_CTRLS calls per frame:
1. H264_PROFILE + H264_LEVEL → EINVAL (we don't register)
2. HEVC_PROFILE + HEVC_LEVEL → EINVAL (we don't register)
3. VP9_FRAME + VP9_COMPRESSED_HDR → SUCCESS
The first two are harmless: during config negotiation libva
probes whether we support the H264/HEVC integer profile/level
controls. We don't register them (the driver is stateless), so
EINVAL just falls through. The actual VP9 stateless controls (#3)
succeeded all along; the libva-side "Unable to set control(s)"
log misled us into thinking the control path was the bug.
Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):
daemon log:
REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
decoder: opened vp9 context
decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe ...
ffmpeg side:
no Timeout, no Decoding error
/tmp/vp9_via_libva.nv12: 18432 bytes
cmp vs reference: byte-for-byte identical.
Roadmap update:
- 8.10/8.11, 8.12, 8.13 marked closed with closure docs.
- 8.14 = multi-frame VP9 via libva, AV1 + H.264, mpv/Firefox
higher-level consumers.
Per correctness-before-speed:
- strace + kernel-source-reading found the actual root cause
rather than guessing.
- Conditional v4l2_ctrl_request_complete preserves the existing
test_m2m_stream non-request path — both consumer styles work
concurrently without per-flow branching elsewhere.
- Byte-exact pixel comparison, not "frame size matches."
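A byte-exact check of the kind cmp performs can be sketched in C as follows. compare_files is an illustrative helper, not project code; it treats a length mismatch as a difference, which is why it is stricter than comparing frame sizes.

```c
#include <stdio.h>

/* Returns 0 if both files have identical contents, 1 if they differ
 * (storing the first differing byte offset in *diff_off when non-NULL),
 * and -1 if either file cannot be opened. */
int compare_files(const char *path_a, const char *path_b, long *diff_off)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 0;
    long off = 0;

    if (!a || !b) {
        result = -1;
    } else {
        for (;;) {
            int ca = fgetc(a);
            int cb = fgetc(b);
            if (ca != cb) {         /* byte mismatch, or one file ended early */
                result = 1;
                if (diff_off)
                    *diff_off = off;
                break;
            }
            if (ca == EOF)          /* both ended at the same offset */
                break;
            off++;
        }
    }

    if (a)
        fclose(a);
    if (b)
        fclose(b);
    return result;
}
```

Called on /tmp/vp9_via_libva.nv12 and /tmp/vp9_ref_for_libva.nv12, a return value of 0 corresponds to the "match" result reported above.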
Phase 8.14 next: multi-frame stream + multi-codec via libva +
mpv/Firefox.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daedalus-v4l2
V4L2 stateless decoder for the Raspberry Pi 5 / CM5, backed by the
daedalus-fourier kernel library (VP9 + AV1 CDEF + H.264 video
decode kernels on VideoCore VII compute + ARM NEON).
Status: scaffold (2026-05-18). Architecture locked per daedalus-fourier session memory; implementation not yet begun.
What this is
Sibling repo to daedalus-fourier (the kernel library; cycles 1-9 closed).
A two-piece userspace + kernel-module stack that exposes a V4L2
stateless decoder interface (/dev/videoNN) so that
libva-v4l2-request-fourier → firefox-fourier /
chromium-fourier can drive it the same way they drive existing
hardware-decode pipelines on Pi 5 / RK3588.
+-----------------------------------------------------------+
| firefox-fourier / chromium-fourier (existing) |
+-----------------------------------------------------------+
| VA-API |
+-----------------------------------------------------------+
| libva-v4l2-request-fourier (existing, sibling project) |
+-----------------------------------------------------------+
| V4L2 stateless ioctl uAPI |
+-----------------------------------------------------------+
| daedalus-v4l2 kernel module (`kernel/`) |
| - registers /dev/videoNN |
| - parses V4L2 stateless ioctls (VP9/AV1/H.264 controls) |
| - forwards bitstream + controls to userspace daemon |
| via chardev or netlink |
+-----------------------------------------------------------+
| daedalus-v4l2 userspace daemon (`daemon/`) |
| - takes bitstream blobs + per-slice controls |
| - drives FFmpeg parsers via dlopen (Option γ) |
| - dispatches per-block ops via daedalus-fourier |
| public API (daedalus_dispatch_*) |
| - posts decoded frames back to kernel module |
+-----------------------------------------------------------+
| daedalus-fourier kernel library (sibling project) |
| - exports include/daedalus.h public API |
| - per-kernel CPU NEON + opportunistic V3D QPU dispatch |
| - 9 closed cycles across VP9, AV1 CDEF, H.264 |
+-----------------------------------------------------------+
| V3D 7.1 (Mesa userspace v3dv) + ARM NEON (BCM2712) |
+-----------------------------------------------------------+
Why this architecture (Option B + γ + sibling)
Locked by user 2026-05-18 from 3 options in
daedalus-fourier/docs/phase8_scoping.md:
- Option B over A (userspace v4l2loopback): real
/dev/videoNN, proper DRM PRIME / dmabuf for browser zero-copy.
- Option γ: dlopen FFmpeg as parser at runtime. No vendoring, fastest to v1.
- Sibling repo: per the project_consumer_target convention, V4L2-side work lives outside daedalus-fourier so the kernel library has a clean API boundary.
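As a rough illustration of what Option γ implies at runtime, the sketch below loads a shared library with dlopen and resolves one symbol from it. It is not the daemon's loader: probe_avcodec_version is a hypothetical helper, the soname is supplied by the caller because the installed libavcodec version varies between systems, and avcodec_version is used only because it is a simple, argument-free libavcodec entry point.

```c
#include <dlfcn.h>
#include <stdio.h>

typedef unsigned (*avcodec_version_fn)(void);

/* Loads the given shared library, resolves avcodec_version(), and
 * returns its result (major << 16 | minor << 8 | micro).
 * Returns 0 if the library or the symbol is unavailable. */
unsigned probe_avcodec_version(const char *soname)
{
    void *handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", soname, dlerror());
        return 0;
    }

    /* POSIX permits converting the dlsym() result to a function pointer. */
    avcodec_version_fn fn =
        (avcodec_version_fn)dlsym(handle, "avcodec_version");
    unsigned version = fn ? fn() : 0;

    dlclose(handle);
    return version;
}
```

A caller would pass whatever soname the target system ships and treat a return value of 0 as "FFmpeg parser unavailable". On older glibc versions the program must be linked with -ldl.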
Status
Initial scaffold only. See docs/architecture.md for the
deeper design and docs/roadmap.md for the
sub-phase breakdown.
Repo layout
- kernel/: Linux kernel module (V4L2 device registration + ioctl handling + userspace chardev bridge). Out-of-tree.
- daemon/: userspace decoder daemon (links libdaedalus_core.a from sibling daedalus-fourier; uses dlopen for FFmpeg parser).
- include/: shared headers between kernel and daemon.
- docs/: architecture + roadmap.
License
- Kernel module: GPLv2 (required for kernel-tree compatibility).
- Userspace daemon: BSD-2-Clause (matches daedalus-fourier).