# daedalus-v4l2 — roadmap

## Sub-phases

### Phase 8.1 — kernel module skeleton

Out-of-tree kernel module that:

- Registers `/dev/videoNN` with `VFL_TYPE_VIDEO` + a no-op V4L2 stateless dispatch table.
- Accepts open/close, S_FMT, REQBUFS ioctls without doing anything (yet).
- Builds against `/lib/modules/$(uname -r)/build`.

Deliverable: `modprobe daedalus_v4l2` works, `v4l2-ctl --list-devices` shows the new device.

### Phase 8.2 — kernel ↔ daemon chardev bridge

- Kernel module creates `/dev/daedalus-v4l2` chardev.
- Defines a simple req/resp protocol in `include/daedalus_v4l2_proto.h`.
- Daemon connects, exchanges echo requests.

Deliverable: ping-pong test passes.

### Phase 8.3 — daemon FFmpeg dlopen + parse

- Daemon links `libdaedalus_core.a` from sibling.
- Daemon dlopens FFmpeg.
- Test program: feed a VP9 IVF file to FFmpeg parsers, extract block-level metadata, validate against expected.

Deliverable: daemon can parse a VP9 frame and walk the block-level info.

### Phase 8.4 — daemon ↔ kernel decode round-trip (closed 2026-05-18)

Shipped as a debugfs-triggered chardev round-trip rather than the original V4L2-ioctl plan (which moved to Phase 8.5).

- REQ_DECODE / RESP_FRAME wire protocol
- Daemon decodes VP9 via FFmpeg dlopen, returns FNV-1a digest
- Verified content-dependent + deterministic; structured error handling for bad bitstreams

See `docs/phase_8_4_closure.md`.

### Phase 8.5 — full V4L2 m2m driver (closed 2026-05-18)

Real V4L2 m2m driver — userspace clients drive `S_FMT`/`REQBUFS`/`QBUF`/`DQBUF` the standard way. Bitstream flows kernel→daemon as inline REQ_DECODE payload; decoded NV12 pixels flow daemon→kernel as inline RESP_FRAME payload. Works end-to-end for small frames (≤ ~64 KiB NV12).

Deliverable hit: kernel m2m driver passes most v4l2-compliance checks; `tools/test_m2m_decode` produces an NV12 frame that's byte-for-byte identical to `ffmpeg -pix_fmt nv12` reference.

See `docs/phase_8_5_closure.md`.

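
For readers unfamiliar with the FNV-1a digest that Phase 8.4 returns in place of pixel data, here is a minimal sketch of the hash. It assumes the 64-bit variant; the roadmap does not say which width the daemon uses, and the function name `fnv1a64` is hypothetical rather than taken from the codebase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard published parameters for 64-bit FNV-1a. */
#define FNV1A64_OFFSET_BASIS 0xcbf29ce484222325ULL
#define FNV1A64_PRIME        0x00000100000001b3ULL

/*
 * Hash `len` bytes starting at `data`.
 * Hypothetical helper: the name and the 64-bit width are assumptions,
 * not the daemon's actual digest routine.
 */
static uint64_t fnv1a64(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint64_t hash = FNV1A64_OFFSET_BASIS;

    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        hash ^= p[i];          /* FNV-1a: XOR the byte in first ... */
        hash *= FNV1A64_PRIME; /* ... then multiply, wrapping mod 2^64 */
    }
    return hash;
}
```

The same bytes always produce the same value, and the XOR-then-multiply step makes the result depend on both byte values and byte order, which is the "content-dependent + deterministic" property the Phase 8.4 check relies on. FNV-1a is not cryptographic, so it suits a round-trip sanity check rather than integrity protection.
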
### Phase 8.6 — dmabuf + AV1 + H.264 + stateless controls (closed 2026-05-18)

- CAPTURE (and OUTPUT) on `vb2_dma_contig_memops`.
- New `DAEDALUS_IOC_GET_DMABUF` chardev ioctl — daemon mmaps the in-flight CAPTURE buffer, decodes pixels in place, sends RESP_FRAME metadata-only.
- 64 KiB frame-size cap removed. 1080p VP9 + 128×96 AV1 + 128×96 H.264 all byte-exact against reference FFmpeg decode.
- V4L2 stateless controls registered for VP9 / AV1 / H.264 (11 controls visible to userspace).
- Colorspace round-trip fix (TRY_FMT preserve, S_FMT OUTPUT→CAPTURE propagation).
- Cookie unified across V4L2 + debugfs paths.
- v4l2-compliance: 47/48 (only DECODER_CMD remains, needs media controller — moved to 8.7).

See `docs/phase_8_6_closure.md`.

### Phase 8.7 — media controller + multi-frame streaming (closed 2026-05-18)

- Media controller bound via `v4l2_m2m_register_media_controller` + `media_device_register`; `/dev/mediaN` published.
- `tools/test_m2m_stream` parses IVF and pushes frames sequentially through a 4-deep buffer ring; daemon AVCodecContext preserves reference frames across calls.
- 30-frame VP9 320×240 stream byte-exact (3.46 MB across 1 keyframe + 29 P-frames).
- 10-frame VP9 1080p stream byte-exact (31 MB across 10 frames at full HD).
- v4l2-compliance: **49/49 passing** (was 47/48 in 8.6; media controller added a 49th test and closed DECODER_CMD).

See `docs/phase_8_7_closure.md`.

### Phase 8.8 — perf, QPU dispatch, AV1/H.264 streams, HDR

1. Profile daemon end-to-end on hertz; identify FFmpeg hot functions per codec.
2. dlopen daedalus-fourier's per-kernel entry points from the daemon; substitute `daedalus_dispatch_*` for FFmpeg's matching per-block calls (IDCT 4×4 / 8×8, MC, deblock, qpel — from cycles 1, 2, 4, 9).
3. Validate bit-exactness after each substitution.
4. Hit 30fps@1080p stable on VP9 — the `30fps-floor-is-fine` memory's user-facing criterion.
5. Multi-frame AV1 + H.264 round-trips (extend stream tests).
6. HDR / 10-bit (P010M CAPTURE, depth-aware `pack_nv12_to_planes`).

Deliverable: 30fps stable on real content across all three codecs.

## Effort estimate

Each phase: ~1 week of focused work (~40 hours). Total: 7 weeks for v1. Could be split across multiple sessions / contributors.
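
As a cross-check on the frame and stream sizes quoted in Phases 8.5 to 8.8, the sketch below computes NV12 buffer sizes from the frame dimensions. It assumes tightly packed 8-bit NV12 with no stride or height padding, which a real driver may not guarantee; the helper names `nv12_frame_bytes` and `nv12_stream_bytes` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes in one tightly packed 8-bit NV12 frame: a full-resolution Y plane
 * followed by one interleaved CbCr plane subsampled 2x2.
 * Assumption: no stride or height alignment padding.
 */
static uint64_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);

    uint64_t luma = (uint64_t)width * height;
    /* Round odd dimensions up so the last row/column still has chroma. */
    uint64_t chroma_w = ((uint64_t)width + 1) / 2;
    uint64_t chroma_h = ((uint64_t)height + 1) / 2;
    uint64_t chroma = 2 * chroma_w * chroma_h; /* one Cb + one Cr per sample */

    return luma + chroma;
}

/*
 * Total bytes for `frames` decoded frames. With frames == fps this is also
 * the per-second output bandwidth at that frame rate.
 */
static uint64_t nv12_stream_bytes(uint32_t width, uint32_t height,
                                  uint32_t frames)
{
    return nv12_frame_bytes(width, height) * frames;
}
```

Under these assumptions a 128×96 frame is 18,432 bytes, comfortably under the ~64 KiB inline limit of Phase 8.5, while a single 320×240 frame (115,200 bytes) already exceeds it. Thirty 320×240 frames come to 3,456,000 bytes and ten 1080p frames to 31,104,000 bytes, in line with the 3.46 MB and 31 MB figures in Phase 8.7. At the Phase 8.8 target of 30fps@1080p, the decoded output alone is 93,312,000 bytes per second.
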