Force libavcodec's H.264 decoder to emit frames in DECODE order (one frame per send_packet, no internal display-order reorder queue). Single-line addition: ctx->flags |= AV_CODEC_FLAG_LOW_DELAY before avcodec_open2, gated on codec_id == DAEDALUS_CODEC_H264. Closes daedalus-v4l2#11 part (2). Background ---------- PR #7's "parking design" approach to the H.264 display-reorder problem broke libva-v4l2-request-fourier's 1:1 CAPTURE-completion contract (see #9 + #10). After the revert, the visible "2 1 4 3" pair-swap regressed and the only path forward was to align the daemon's output ordering with what V4L2 stateless clients expect: **decode order, one CAPTURE buffer per OUTPUT slice, with display reorder pushed upstream to ffmpeg-vaapi's per-VAAPI-surface POC logic** (which it already does correctly for every real H.264 hardware decoder via VAPictureParameterBufferH264). How LOW_DELAY does this ----------------------- Inside libavcodec/h264dec.c, the flag sets h->low_delay = 1. h264_select_output_frame (h264_picture.c) emits the just-decoded picture immediately instead of routing through the display-order DPB output queue. DPB management for reference frames (short_ref / long_ref) is unaffected — B-frame decoding correctness is preserved; only the output buffering is bypassed. Skipped for VP9 / AV1 — those codecs don't reorder internally, so the flag would be a no-op but adds no value. Verified -------- On higgs (Pi CM5, 6.18.29+rpt-rpi-2712), test daemon hot-swapped into /usr/bin/daedalus_v4l2_daemon, mpv --hwdec=vaapi-copy --frames=300 against bbb_720p_h264.mp4: 311 REQ_DECODEs received, 308 successful "decoder: OK" responses (99.04% steady-state delivery — 3 lost at GOP boundaries, no compounding drift). mpv plays to its --frames cap and exits cleanly with "End of file". No "Unable to dequeue buffer", no "Failed to end picture decode", no "AVHWFramesContext: Failed to sync surface" — all the failures from #9 are gone. Builds clean against ffmpeg-v4l2-request-fourier libavcodec.
daedalus-v4l2
V4L2 stateless decoder for the Raspberry Pi 5 / CM5, backed by the
daedalus-fourier kernel library (VP9 + AV1 CDEF + H.264 video
decode kernels on VideoCore VII compute + ARM NEON).
Status: scaffold (2026-05-18). Architecture locked per daedalus-fourier session memory; implementation not yet begun.
What this is
Sibling repo to daedalus-fourier (the kernel library; cycles 1-9 closed).
A two-piece userspace + kernel-module stack that exposes a V4L2
stateless decoder interface (/dev/videoNN) so that
libva-v4l2-request-fourier → firefox-fourier /
chromium-fourier can drive it the same way they drive existing
hardware-decode pipelines on Pi 5 / RK3588.
+-----------------------------------------------------------+
| firefox-fourier / chromium-fourier (existing) |
+-----------------------------------------------------------+
| VA-API |
+-----------------------------------------------------------+
| libva-v4l2-request-fourier (existing, sibling project) |
+-----------------------------------------------------------+
| V4L2 stateless ioctl uAPI |
+-----------------------------------------------------------+
| daedalus-v4l2 kernel module (`kernel/`) |
| - registers /dev/videoNN |
| - parses V4L2 stateless ioctls (VP9/AV1/H.264 controls) |
| - forwards bitstream + controls to userspace daemon |
| via chardev or netlink |
+-----------------------------------------------------------+
| daedalus-v4l2 userspace daemon (`daemon/`) |
| - takes bitstream blobs + per-slice controls |
| - drives FFmpeg parsers via dlopen (Option γ) |
| - dispatches per-block ops via daedalus-fourier |
| public API (daedalus_dispatch_*) |
| - posts decoded frames back to kernel module |
+-----------------------------------------------------------+
| daedalus-fourier kernel library (sibling project) |
| - exports include/daedalus.h public API |
| - per-kernel CPU NEON + opportunistic V3D QPU dispatch |
| - 9 closed cycles across VP9, AV1 CDEF, H.264 |
+-----------------------------------------------------------+
| V3D 7.1 (Mesa userspace v3dv) + ARM NEON (BCM2712) |
+-----------------------------------------------------------+
Why this architecture (Option B + γ + sibling)
Locked by user 2026-05-18 from 3 options in
daedalus-fourier/docs/phase8_scoping.md:
- Option B over A (userspace v4l2loopback): real
/dev/videoNN, proper DRM PRIME / dmabuf for browser zero-copy. - Option γ: dlopen FFmpeg as parser at runtime. No vendoring, fastest to v1.
- Sibling repo: per
project_consumer_targetconvention, V4L2-side work lives outside daedalus-fourier so the kernel-library has a clean API boundary.
Status
Initial scaffold only. See docs/architecture.md for the
deeper design and docs/roadmap.md for the
sub-phase breakdown.
Repo layout
kernel/— Linux kernel module (V4L2 device registration + ioctl handling + userspace chardev bridge). Out-of-tree.daemon/— userspace decoder daemon (linkslibdaedalus_core.afrom sibling daedalus-fourier; uses dlopen for FFmpeg parser).include/— shared headers between kernel and daemon.docs/— architecture + roadmap.
License
Kernel module: GPLv2 (required for kernel-tree compatibility). Userspace daemon: BSD-2-Clause (matches daedalus-fourier).