Real H.264 access units routinely exceed the previous 64 KiB cap on the chardev wire protocol:

- 720p worst-case I-frame: ~200 KiB
- 1080p worst-case I-frame: ~500 KiB

libva-v4l2-request-fourier detects the under-sized OUTPUT_MPLANE buffer and tries to grow it via VIDIOC_S_FMT to 147456 B, but daedalus_fill_output_fmt unconditionally pins sizeimage to DAEDALUS_MAX_BITSTREAM (= 65484) regardless of userspace's request. Firefox loses the slice and falls back to libmozavcodec SW for the rest of the session.

Bumping the wire-protocol cap to 1 MiB lifts the kernel OUTPUT_MPLANE sizeimage with it (DAEDALUS_MAX_BITSTREAM is derived from the same #define). All allocations (kernel kmalloc / kmemdup, daemon read buffer, vb2 plane backing) are dynamic and sized per-payload at runtime, so the only growth is the daemon's startup read buffer (one ~1 MiB allocation per daemon process) and the V4L2 OUTPUT_MPLANE per-buffer size.

KMALLOC_MAX_SIZE on aarch64 SLUB is several MiB; 1 MiB is well within bounds. Other V4L2 stateless decoders (cedrus, rkvdec, hantro) report 1-4 MiB OUTPUT_MPLANE sizeimage, so this puts daedalus at the conservative end of normal.

## Compatibility

#define-only change; struct layout unchanged. But the effective cap is the smaller of (kernel cap, daemon cap), so:

- new daemon + stale kernel: still capped at 64 KiB until the kernel module rebuilds.
- new kernel + stale daemon: same.

Lock-step install of daedalus-v4l2 + daedalus-v4l2-dkms is therefore required for the fix to take effect; mirrors the PR-#7/#8 cadence.

## NOT changed in this commit

- daedalus_fill_output_fmt still hardcodes sizeimage = DAEDALUS_MAX_BITSTREAM regardless of userspace request. Acceptable: vb2 will allocate up to that, and libva's resize test now sees the kernel report a sizeimage at least as large as what it asked for (147456 < 1048524). A future cleanup could respect userspace's S_FMT.sizeimage clamped to the cap, to save memory on tiny streams.
- chardev kmalloc → kvmalloc swap (only matters above KMALLOC_MAX_SIZE, not here).

Refs #19.
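The cap arithmetic and the possible future clamp described in the commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions: every name below (WIRE_CAP_OLD, WIRE_CAP_NEW, WIRE_OVERHEAD, MAX_BITSTREAM, effective_cap, clamp_sizeimage) is hypothetical and not taken from the daedalus-v4l2 sources. The 52-byte overhead is inferred from the two figures quoted above (65536 - 65484 and 1048576 - 1048524), and treating a zero request as "use the cap" is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the numeric values come from the commit message. */
#define WIRE_CAP_OLD (64u * 1024u)     /* previous wire-protocol cap: 65536 */
#define WIRE_CAP_NEW (1024u * 1024u)   /* new wire-protocol cap: 1048576 */

/* Assumed per-message overhead, inferred from 65536 - 65484 = 52. */
#define WIRE_OVERHEAD 52u

/* Largest bitstream payload that fits under a given wire cap. */
#define MAX_BITSTREAM(cap) ((cap) - WIRE_OVERHEAD)

/* Compile-time check that the assumed overhead reproduces both quoted values. */
static_assert(MAX_BITSTREAM(WIRE_CAP_OLD) == 65484u, "old cap mismatch");
static_assert(MAX_BITSTREAM(WIRE_CAP_NEW) == 1048524u, "new cap mismatch");

/* The usable limit is the smaller of the kernel-side and daemon-side caps,
 * which is why a stale component on either side keeps the old limit. */
static inline uint32_t effective_cap(uint32_t kernel_cap, uint32_t daemon_cap)
{
    return kernel_cap < daemon_cap ? kernel_cap : daemon_cap;
}

/* Sketch of the "future cleanup": honour the S_FMT request, clamped to the
 * cap. A request of 0 falls back to the cap (assumption). */
static inline uint32_t clamp_sizeimage(uint32_t requested, uint32_t max_bitstream)
{
    if (requested == 0 || requested > max_bitstream)
        return max_bitstream;
    return requested;
}
```

With these definitions, libva's 147456 B request would be granted as-is under the new cap, while a mismatched kernel/daemon pair would still resolve to 65484 through effective_cap.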
daedalus-v4l2
V4L2 stateless decoder for the Raspberry Pi 5 / CM5, backed by the
daedalus-fourier kernel library (VP9 + AV1 CDEF + H.264 video
decode kernels on VideoCore VII compute + ARM NEON).
Status: scaffold (2026-05-18). Architecture locked per daedalus-fourier session memory; implementation not yet begun.
What this is
Sibling repo to daedalus-fourier (the kernel library; cycles 1-9 closed).
A two-piece userspace + kernel-module stack that exposes a V4L2
stateless decoder interface (/dev/videoNN) so that
libva-v4l2-request-fourier → firefox-fourier /
chromium-fourier can drive it the same way they drive existing
hardware-decode pipelines on Pi 5 / RK3588.
+-----------------------------------------------------------+
| firefox-fourier / chromium-fourier (existing)             |
+-----------------------------------------------------------+
| VA-API                                                    |
+-----------------------------------------------------------+
| libva-v4l2-request-fourier (existing, sibling project)    |
+-----------------------------------------------------------+
| V4L2 stateless ioctl uAPI                                 |
+-----------------------------------------------------------+
| daedalus-v4l2 kernel module (`kernel/`)                   |
|  - registers /dev/videoNN                                 |
|  - parses V4L2 stateless ioctls (VP9/AV1/H.264 controls)  |
|  - forwards bitstream + controls to userspace daemon      |
|    via chardev or netlink                                 |
+-----------------------------------------------------------+
| daedalus-v4l2 userspace daemon (`daemon/`)                |
|  - takes bitstream blobs + per-slice controls             |
|  - drives FFmpeg parsers via dlopen (Option γ)            |
|  - dispatches per-block ops via daedalus-fourier          |
|    public API (daedalus_dispatch_*)                       |
|  - posts decoded frames back to kernel module             |
+-----------------------------------------------------------+
| daedalus-fourier kernel library (sibling project)         |
|  - exports include/daedalus.h public API                  |
|  - per-kernel CPU NEON + opportunistic V3D QPU dispatch   |
|  - 9 closed cycles across VP9, AV1 CDEF, H.264            |
+-----------------------------------------------------------+
| V3D 7.1 (Mesa userspace v3dv) + ARM NEON (BCM2712)        |
+-----------------------------------------------------------+
Why this architecture (Option B + γ + sibling)
Locked by user 2026-05-18 from 3 options in
daedalus-fourier/docs/phase8_scoping.md:
- Option B over A (userspace v4l2loopback): real /dev/videoNN, proper DRM PRIME / dmabuf for browser zero-copy.
- Option γ: dlopen FFmpeg as parser at runtime. No vendoring, fastest to v1.
- Sibling repo: per project_consumer_target convention, V4L2-side work lives outside daedalus-fourier so the kernel-library has a clean API boundary.
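As a rough illustration of Option γ, the sketch below loads a shared library at runtime and resolves libavcodec's av_parser_init, degrading cleanly if the library or the symbol is missing. The struct and helper names (ffmpeg_parser_api, ffmpeg_parser_load, ffmpeg_parser_unload) are hypothetical, and the soname is supplied by the caller because the README does not say which one the daemon uses. Only dlopen, dlsym, dlclose and the av_parser_init symbol are real interfaces; on older glibc the program may need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Matches libavcodec's av_parser_init(int codec_id); the returned
 * AVCodecParserContext pointer is kept opaque so no FFmpeg header is needed. */
typedef void *(*parser_init_fn)(int codec_id);

/* Hypothetical container for the runtime-resolved entry points. */
struct ffmpeg_parser_api {
    void *handle;               /* dlopen handle, NULL when unavailable */
    parser_init_fn parser_init; /* resolved av_parser_init, or NULL */
};

/* Load `soname` and resolve av_parser_init.
 * Returns 0 on success, -1 if the library or the symbol is missing. */
static int ffmpeg_parser_load(struct ffmpeg_parser_api *api, const char *soname)
{
    api->parser_init = NULL;
    api->handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (!api->handle)
        return -1;

    /* Object-to-function pointer conversion idiom from the POSIX dlsym notes. */
    *(void **)&api->parser_init = dlsym(api->handle, "av_parser_init");
    if (!api->parser_init) {
        dlclose(api->handle);
        api->handle = NULL;
        return -1;
    }
    return 0;
}

/* Release the library; safe to call after a failed load. */
static void ffmpeg_parser_unload(struct ffmpeg_parser_api *api)
{
    if (api->handle)
        dlclose(api->handle);
    api->handle = NULL;
    api->parser_init = NULL;
}
```

Keeping the parser behind a table of function pointers like this is what makes "no vendoring" workable: the daemon builds without FFmpeg headers and can report a clear error at startup when the library is absent.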
Status
Initial scaffold only. See docs/architecture.md for the
deeper design and docs/roadmap.md for the
sub-phase breakdown.
Repo layout
- kernel/: Linux kernel module (V4L2 device registration + ioctl handling + userspace chardev bridge). Out-of-tree.
- daemon/: userspace decoder daemon (links libdaedalus_core.a from sibling daedalus-fourier; uses dlopen for FFmpeg parser).
- include/: shared headers between kernel and daemon.
- docs/: architecture + roadmap.
License
- Kernel module: GPLv2 (required for kernel-tree compatibility).
- Userspace daemon: BSD-2-Clause (matches daedalus-fourier).