First incremental step toward the H.264 daemon rewrite (daedalus-v4l2#11):
make the daedalus-fourier kernel library available to the daemon
process so subsequent patches can substitute its primitives
(IDCT 4×4, IDCT 8×8, luma vertical deblock, etc.) for libavcodec's
per-MB pixel math.
This patch does NOT yet dispatch any kernels. It only:
- Adds `pkg_check_modules(DAEDALUS_FOURIER REQUIRED daedalus-fourier)`
to the daemon's CMakeLists, with explicit link ordering
(libdaedalus_core.a must precede -lvulkan because the static
archive references vulkan symbols and the linker resolves
left-to-right). We bypass IMPORTED_TARGET because, with pkg-config's
Requires.private chain, CMake's dependency graph reorders the
archive after -lvulkan, which breaks the static link.
- Calls daedalus_ctx_create_no_qpu() at daemon startup, logs the
substrate-availability line, destroys the context at exit.
no_qpu mode skips the V3D Vulkan probe, so it proves linkage works
without depending on shader-path resolution (a separate piece of
work: v3d_runner currently loads .spv files from cwd-relative
paths, so a consumer would need a search-path override).
Sample journal line:
[2026-05-21 17:59:35.271 INFO] daedalus-fourier: linked, ctx alive
(no_qpu mode; has_qpu=0)
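The startup/exit lifecycle described above can be sketched in C roughly as follows. Only the name daedalus_ctx_create_no_qpu() comes from this patch; the context type, its has_qpu field, daedalus_ctx_destroy(), and the daemon_fourier_* helpers are hypothetical stand-ins defined locally so the sketch compiles without include/daedalus.h. The real signatures may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for the daedalus-fourier API. Only the name
 * daedalus_ctx_create_no_qpu() appears in the patch; the struct layout
 * and daedalus_ctx_destroy() are assumed for illustration. */
typedef struct daedalus_ctx { int has_qpu; } daedalus_ctx;

static daedalus_ctx *daedalus_ctx_create_no_qpu(void) {
    /* calloc zeroes has_qpu, standing in for "V3D probe skipped". */
    return calloc(1, sizeof(daedalus_ctx));
}

static void daedalus_ctx_destroy(daedalus_ctx *ctx) { free(ctx); }

/* Write the substrate-availability line into buf; returns snprintf's result. */
static int format_substrate_line(char *buf, size_t len, const daedalus_ctx *ctx) {
    assert(buf != NULL && len > 0 && ctx != NULL);
    return snprintf(buf, len,
                    "daedalus-fourier: linked, ctx alive (no_qpu mode; has_qpu=%d)",
                    ctx->has_qpu);
}

/* Daemon-lifetime context: created at startup, destroyed at exit. */
static daedalus_ctx *g_fourier_ctx;

/* Returns 0 and fills `line` on success, -1 if the context could not be created. */
static int daemon_fourier_startup(char *line, size_t len) {
    g_fourier_ctx = daedalus_ctx_create_no_qpu();
    if (g_fourier_ctx == NULL)
        return -1;
    format_substrate_line(line, len, g_fourier_ctx);
    return 0;
}

static void daemon_fourier_shutdown(void) {
    daedalus_ctx_destroy(g_fourier_ctx);
    g_fourier_ctx = NULL;
}
```

A caller would invoke daemon_fourier_startup() once before entering the decode loop, pass the formatted line to its logger, and call daemon_fourier_shutdown() on the exit path.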
Build-test verified on hertz (Pi 5 dev host) against an installed
copy of daedalus-fourier r35+gd87239d (from marfrit/daedalus-fourier
PR #1). The binary links cleanly, --help prints, and daemon mode
attempts to open the chardev (this fails predictably on hertz, which
has no daedalus_v4l2 kmod; on higgs this is the existing working path).
Follow-up patches per daedalus-v4l2#11:
1. Instrument the existing libavcodec decode path to count
per-frame IDCT blocks / deblock edges / MC tiles so we have
a baseline of what work the daemon dispatches for a typical
YouTube H.264 stream.
2. Substitute daedalus-fourier kernels one at a time, measuring
CPU saved per substitution.
3. Wire shader path resolution into daedalus_ctx_create() for
the QPU substrate (V3D opportunistic helper paths).
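
For follow-up 1, the baseline instrumentation might look something like the sketch below: a per-frame counter struct that the decode path bumps, folded into running totals so a per-frame mean can be logged. All type, field, and function names here are hypothetical, and where the increments would hook into the libavcodec decode path is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-frame work counters; names are illustrative only. */
typedef struct frame_work {
    uint32_t idct4x4_blocks;
    uint32_t idct8x8_blocks;
    uint32_t deblock_edges;
    uint32_t mc_tiles;
} frame_work;

/* Running totals across all frames seen so far. */
typedef struct stream_work {
    uint64_t frames;
    uint64_t idct4x4_blocks;
    uint64_t idct8x8_blocks;
    uint64_t deblock_edges;
    uint64_t mc_tiles;
} stream_work;

/* Fold one finished frame's counters into the running totals. */
static void stream_work_add_frame(stream_work *total, const frame_work *f) {
    assert(total != NULL && f != NULL);
    total->frames++;
    total->idct4x4_blocks += f->idct4x4_blocks;
    total->idct8x8_blocks += f->idct8x8_blocks;
    total->deblock_edges  += f->deblock_edges;
    total->mc_tiles       += f->mc_tiles;
}

/* Mean per frame; 0.0 when no frames have been counted yet. */
static double stream_work_mean(uint64_t sum, uint64_t frames) {
    return frames ? (double)sum / (double)frames : 0.0;
}

/* Format a one-line baseline summary; returns snprintf's result. */
static int stream_work_format(char *buf, size_t len, const stream_work *t) {
    assert(buf != NULL && len > 0 && t != NULL);
    return snprintf(buf, len,
                    "per-frame mean: idct4x4=%.1f idct8x8=%.1f deblock=%.1f mc=%.1f",
                    stream_work_mean(t->idct4x4_blocks, t->frames),
                    stream_work_mean(t->idct8x8_blocks, t->frames),
                    stream_work_mean(t->deblock_edges, t->frames),
                    stream_work_mean(t->mc_tiles, t->frames));
}
```

Keeping the per-frame struct separate from the totals means it can be zeroed at the start of each frame without losing the stream-level baseline.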
Wire protocol unchanged. DAEDALUS_PROTO_VERSION stays at 0.
daedalus-v4l2
V4L2 stateless decoder for the Raspberry Pi 5 / CM5, backed by the
daedalus-fourier kernel library (VP9 + AV1 CDEF + H.264 video
decode kernels on VideoCore VII compute + ARM NEON).
Status: scaffold (2026-05-18). Architecture locked per daedalus-fourier session memory; implementation not yet begun.
What this is
Sibling repo to daedalus-fourier (the kernel library; cycles 1-9 closed).
A two-piece userspace + kernel-module stack that exposes a V4L2
stateless decoder interface (/dev/videoNN) so that
libva-v4l2-request-fourier → firefox-fourier /
chromium-fourier can drive it the same way they drive existing
hardware-decode pipelines on Pi 5 / RK3588.
+-----------------------------------------------------------+
| firefox-fourier / chromium-fourier (existing) |
+-----------------------------------------------------------+
| VA-API |
+-----------------------------------------------------------+
| libva-v4l2-request-fourier (existing, sibling project) |
+-----------------------------------------------------------+
| V4L2 stateless ioctl uAPI |
+-----------------------------------------------------------+
| daedalus-v4l2 kernel module (`kernel/`) |
| - registers /dev/videoNN |
| - parses V4L2 stateless ioctls (VP9/AV1/H.264 controls) |
| - forwards bitstream + controls to userspace daemon |
| via chardev or netlink |
+-----------------------------------------------------------+
| daedalus-v4l2 userspace daemon (`daemon/`) |
| - takes bitstream blobs + per-slice controls |
| - drives FFmpeg parsers via dlopen (Option γ) |
| - dispatches per-block ops via daedalus-fourier |
| public API (daedalus_dispatch_*) |
| - posts decoded frames back to kernel module |
+-----------------------------------------------------------+
| daedalus-fourier kernel library (sibling project) |
| - exports include/daedalus.h public API |
| - per-kernel CPU NEON + opportunistic V3D QPU dispatch |
| - 9 closed cycles across VP9, AV1 CDEF, H.264 |
+-----------------------------------------------------------+
| V3D 7.1 (Mesa userspace v3dv) + ARM NEON (BCM2712) |
+-----------------------------------------------------------+
Why this architecture (Option B + γ + sibling)
Locked by user 2026-05-18 from 3 options in
daedalus-fourier/docs/phase8_scoping.md:
- Option B over A (userspace v4l2loopback): real
/dev/videoNN, proper DRM PRIME / dmabuf for browser zero-copy.
- Option γ: dlopen FFmpeg as parser at runtime. No vendoring, fastest to v1.
- Sibling repo: per the project_consumer_target convention, V4L2-side work lives outside daedalus-fourier so the kernel library has a clean API boundary.
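
As a rough illustration of Option γ, the sketch below loads libavcodec at runtime with dlopen() and resolves one symbol. dlopen/dlsym/dlclose and avcodec_version() are real; the ffmpeg_lib struct, the helper names, and the idea of trying a list of candidate sonames are assumptions made for this example, not the daemon's actual loader. On older glibc the program needs to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Matches libavcodec's `unsigned avcodec_version(void);`. */
typedef unsigned (*avcodec_version_fn)(void);

/* Hypothetical handle bundle for the runtime-loaded parser library. */
typedef struct ffmpeg_lib {
    void *handle;
    avcodec_version_fn avcodec_version;
} ffmpeg_lib;

/* Try each NULL-terminated candidate soname in order. Returns 0 on the
 * first one that loads and exports avcodec_version, -1 if none does. */
static int ffmpeg_lib_open(ffmpeg_lib *lib, const char *const *sonames) {
    assert(lib != NULL && sonames != NULL);
    lib->handle = NULL;
    lib->avcodec_version = NULL;
    for (size_t i = 0; sonames[i] != NULL; i++) {
        void *h = dlopen(sonames[i], RTLD_NOW | RTLD_LOCAL);
        if (h == NULL)
            continue;
        avcodec_version_fn fn;
        /* POSIX idiom for storing dlsym's void* into a function pointer. */
        *(void **)(&fn) = dlsym(h, "avcodec_version");
        if (fn != NULL) {
            lib->handle = h;
            lib->avcodec_version = fn;
            return 0;
        }
        dlclose(h);
    }
    return -1;
}

/* Major version is the top 16 bits of the packed version integer. */
static unsigned ffmpeg_lib_major(const ffmpeg_lib *lib) {
    return lib->avcodec_version() >> 16;
}

static void ffmpeg_lib_close(ffmpeg_lib *lib) {
    if (lib->handle != NULL)
        dlclose(lib->handle);
    lib->handle = NULL;
    lib->avcodec_version = NULL;
}
```

A caller might pass candidates such as "libavcodec.so.61", "libavcodec.so.60", "libavcodec.so.59" (assumed here; the right list depends on which FFmpeg releases the daemon supports) and refuse to start if ffmpeg_lib_open() returns -1.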
Status
Initial scaffold only. See docs/architecture.md for the
deeper design and docs/roadmap.md for the
sub-phase breakdown.
Repo layout
- kernel/: Linux kernel module (V4L2 device registration + ioctl handling + userspace chardev bridge). Out-of-tree.
- daemon/: userspace decoder daemon (links libdaedalus_core.a from sibling daedalus-fourier; uses dlopen for FFmpeg parser).
- include/: shared headers between kernel and daemon.
- docs/: architecture + roadmap.
License
- Kernel module: GPLv2 (required for kernel-tree compatibility).
- Userspace daemon: BSD-2-Clause (matches daedalus-fourier).