# Phase 8.4 closure — daemon ↔ kernel decode round-trip (VP9) **Status:** closed 2026-05-18. Wires the Phase 8.3 FFmpeg loader through the Phase 8.2 chardev bridge: kernel injects `REQ_DECODE` carrying a raw VP9 access unit, daemon hands the bitstream to libavcodec via dlopen, sends `RESP_FRAME` back with a content-dependent FNV-1a digest of the decoded YUV planes. Pure CPU decode for now — Phase 8.5 swaps in dmabuf + QPU dispatch. ## What lands ### Protocol (`include/daedalus_v4l2_proto.h`) - New message types: `REQ_DECODE` (kernel→daemon) and `RESP_FRAME` (daemon→kernel). Converted the prior `enum daedalus_msg_type` to `#define`s — high-bit values exceed INT_MAX and tripped -Wpedantic on userspace builds; kernel uABI headers use the same idiom. - New payload structs `daedalus_req_decode`, `daedalus_resp_frame`. - New codec id enum (`DAEDALUS_CODEC_VP9 = 1`); wire-stable so Phase 8.6's AV1/H.264 additions don't move the existing values. - New status enum (`DAEDALUS_DECODE_OK`, `..._NO_FRAME`, `..._ERR_OPEN`, `..._ERR_SEND`, `..._ERR_RECV`, `..._ERR_CODEC`). ### Kernel (`kernel/daedalus_v4l2_chardev.c`) - New debugfs entry `/sys/kernel/debug/daedalus_v4l2/test_decode` — writing raw bitstream bytes wraps them in a `REQ_DECODE` (codec hard-wired to VP9 for Phase 8.4) and enqueues for the daemon. Auto-incrementing cookie per request. - `daedalus_chardev_write` learned `RESP_FRAME`: parses the fixed-size payload and emits a single `pr_info` line with the decode metadata. Keeps the existing PONG path on the default arm. ### Daemon (`daemon/src/...`) - `chardev_client.{c,h}` — opens `/dev/daedalus-v4l2`, blocking read loop dispatching on message type, writes responses via single contiguous `write()` (kernel chardev has only `.write`, no `.write_iter`, so `writev` lands as -EINVAL — discovered the hard way during first end-to-end run). - `decoder.{c,h}` — encapsulates the AVCodecContext (lazily opened on first request per codec), shared AVPacket/AVFrame pair, and an FNV-1a digest of the decoded planes. Plane walk is descriptor-driven (`av_pix_fmt_desc_get`) so the same code path covers YUV420P, YUV422P, YUV444P, GBRP and other 8-bit planar layouts. - `daemon` command in `main.c` opens the chardev and runs the loop until SIGINT / SIGTERM. - `ffmpeg_loader` gained `av_pix_fmt_desc_get` (23 resolved symbols total). ### Build - CMakeLists adds `chardev_client.c` and `decoder.c` to the executable; explicit `-I../include` for the shared protocol header. - Still `-Wall -Wextra -Wpedantic` clean. ## Verification Kernel module built clean against the in-tree headers (`linux-headers-6.12.75+rpt-rpi-2712`): ``` $ cd /home/mfritsche/src/daedalus-v4l2/kernel && make CC [M] daedalus_v4l2_chardev.o LD [M] daedalus_v4l2.ko ``` Daemon built clean: ``` $ cmake --build build/ [100%] Built target daedalus_v4l2_daemon ``` End-to-end: ``` $ ffmpeg -hide_banner -loglevel warning -f lavfi \ -i 'testsrc=duration=0.04:size=320x240:rate=25' \ -pix_fmt yuv420p -c:v libvpx-vp9 -frames:v 1 -y /tmp/vp9_test.ivf $ python3 -c "..." # strip IVF framing → /tmp/vp9_keyframe.bin extracted 3268 bytes raw VP9 $ sudo insmod kernel/daedalus_v4l2.ko $ /tmp/start_daemon.sh # daemon mode, blocks on read $ sudo dd if=/tmp/vp9_keyframe.bin \ of=/sys/kernel/debug/daedalus_v4l2/test_decode bs=8192 count=1 daemon log: [INFO] REQ_DECODE cookie=2 codec=1 bitstream=3268 bytes [INFO] decoder: opened vp9 context [INFO] decoder: OK 320x240 fmt=0 (yuv420p) fnv1a=0x6ef10d71 luma=76800 chroma=38400 kernel log: [16199.734667] daedalus_v4l2: REQ_DECODE enqueued cookie=2 codec=VP9 bitstream=3268 [16199.735951] daedalus_v4l2: RESP_FRAME cookie=2 status=0 codec=1 320x240 pixfmt=0 luma=76800 chroma=38400 fnv1a=0x6ef10d71 ^^^^^^^^^^ matches the daemon's ``` ### Hash properties (sanity) | Trigger | Bitstream | Hash | Notes | |---|---|---|---| | testsrc 320×240 | 3268 B | `0x6ef10d71` | first decode (codec open) | | color=red 320×240 | 44 B | `0x7f6e5dc5` | hash changes with content ✓ | | testsrc again | 3268 B | `0x6ef10d71` | deterministic ✓ | | 64 B `/dev/urandom` | 64 B | n/a | structured error, status=101 | The garbage-input case is the interesting one: FFmpeg's `avcodec_send_packet` returned -1094995529 ("Invalid sync code"), the daemon stayed alive, wrapped that into `DAEDALUS_DECODE_ERR_SEND`, sent `RESP_FRAME` with status=101 and zeroed metadata. Kernel logged the response. No daemon crash, no kernel oops, no stuck request in the FIFO. ### Cleanup ``` $ pkill -TERM -f daedalus_v4l2_daemon # daemon exits cleanly $ sudo rmmod daedalus_v4l2 # ok, all queued requests drained ``` ## Design decisions ### Why FNV-1a, why no pixel data on the wire? The chardev's wire protocol caps single messages at 64 KiB (`DAEDALUS_PROTO_MAX_PAYLOAD`). A single 1080p YUV420P frame is 3.1 MB — orders of magnitude larger. Forcing pixel data through the chardev would require either: 1. Fragmentation across multiple messages (re-assembly state in kernel, complexity tax for a temporary path). 2. Bumping the limit, which lifts the per-message kmalloc out of GFP_KERNEL territory. Neither is the right answer. Phase 8.5 wires dmabuf for actual frame transfer; the FNV-1a digest is just enough to prove "the right bytes came out of the decoder" without paying that cost yet. The digest also stays useful as a cross-host sanity check (reference vs target). ### Why `write()` not `writev()` in the daemon? The kernel chardev implements only `.write` in its fops — not `.write_iter`. Modern Linux does not auto-fallback `writev → write`; userspace `writev` returns `-EINVAL` directly. Options were: 1. Implement `.write_iter` in the kernel (slightly more code, buys nothing functionally for a one-or-two-iovec write). 2. Marshal into a single buffer in the daemon (one short-lived malloc per response, dead simple). Picked (2). Response payloads are ≤ 36 B (struct daedalus_resp_frame) for decode and ≤ 64 KiB for PONG; a malloc/free per response is invisible at the scale we're working. ### Why plane walk via `av_pix_fmt_desc_get`? First end-to-end run decoded `testsrc` (an RGB-native source) as `AV_PIX_FMT_GBRP` (71), not `YUV420P`. The original hand-rolled hash hard-coded YUV420P plane geometry and fell back to "plane 0 only" otherwise — fine for a one-time test, but it would silently miss two-thirds of the pixels on the very first real-world content variation. Using `AVPixFmtDescriptor` directly gives us a generic plane walker: how many planes, which components live in each, and the chroma subsampling shifts. Now the same hash path correctly covers planar YUV (any subsampling), GBRP, and similar 8-bit-per-sample layouts. 10/12-bit (P010, YUV420P10LE) needs a depth-aware variant — that lands when Phase 8.6 starts looking at HDR. ## What's NOT here (deferred) - **dmabuf / DRM PRIME**: Phase 8.5. RESP_FRAME today carries metadata + digest only; actual pixel data goes out-of-band via dmabuf in the next phase. - **V4L2 buffer-queue wiring**: REQ_DECODE today is debugfs- triggered. Phase 8.5+ has the V4L2 m2m queue submit requests from `vidioc_qbuf`. - **QPU dispatch**: the daemon decodes on CPU via FFmpeg. Substituting per-block dispatch into the sibling daedalus-fourier kernels (cycles 1, 2, 4, 9) lands once the daemon-side parser can extract block-level metadata — that's a Phase 8.5/8.6 follow-up. - **AV1 / H.264**: decoder rejects them with `DAEDALUS_DECODE_ERR_CODEC` today. Phase 8.6 adds the codec contexts. - **10-bit pixel formats**: hash path is 8-bit/sample/plane only. ## Phase 8.5 plan 1. Replace debugfs `test_decode` with V4L2 m2m queue submission: `vidioc_qbuf` on the OUTPUT queue extracts the bitstream from the userspace plane and calls `daedalus_chardev_enqueue_req`. 2. dmabuf import on the CAPTURE queue: daemon writes decoded pixels into a kernel-allocated dmabuf and `RESP_FRAME` references the buffer index, not raw bytes. 3. Drive a userspace V4L2 client (start with `v4l2-compliance --stream-options` then a tiny custom test) end-to-end. 4. Begin substituting `daedalus_dispatch_*` calls into the daemon's decode path for kernels where the QPU implementation matches the FFmpeg block format.