Commit Graph

4 Commits

Author SHA1 Message Date
marfrit 2a449632b9 Phase 8.4: daemon ↔ kernel decode round-trip (VP9 end-to-end)
Wires the Phase 8.3 FFmpeg loader through the Phase 8.2 chardev
bridge: kernel injects REQ_DECODE carrying a raw VP9 access unit,
daemon hands the bitstream to libavcodec via dlopen, sends
RESP_FRAME back with a content-dependent FNV-1a digest of the
decoded YUV planes. Pure CPU decode for now — Phase 8.5 swaps in
dmabuf + QPU dispatch.

Protocol (include/daedalus_v4l2_proto.h):
- New REQ_DECODE (kernel→daemon) and RESP_FRAME (daemon→kernel)
  message types, with fixed-size payload structs.
- New DAEDALUS_CODEC_VP9/AV1/H264 enum (wire-stable so 8.6's
  AV1+H.264 work doesn't move existing values).
- New DAEDALUS_DECODE_* status enum (OK / NO_FRAME / ERR_OPEN /
  ERR_SEND / ERR_RECV / ERR_CODEC).
- Converted the prior `enum daedalus_msg_type` to #defines —
  high-bit values exceed INT_MAX and tripped -Wpedantic on
  userspace; kernel uABI headers use the same idiom.

Kernel (kernel/daedalus_v4l2_chardev.c):
- New debugfs entry /sys/kernel/debug/daedalus_v4l2/test_decode:
  writing raw bitstream bytes wraps them in a REQ_DECODE
  (codec=VP9 for Phase 8.4) and enqueues with an
  auto-incrementing cookie.
- daedalus_chardev_write learned RESP_FRAME: parses the payload
  and emits a single pr_info line with decode metadata. Keeps
  existing PONG handling on the default arm.

Daemon (daemon/src/...):
- chardev_client.{c,h} — opens /dev/daedalus-v4l2, blocking read
  loop, single-buffer write() responses (kernel chardev has only
  .write, not .write_iter, so writev lands as -EINVAL —
  discovered the hard way during first run).
- decoder.{c,h} — lazily-opened AVCodecContext per codec, shared
  AVPacket/AVFrame pair, descriptor-driven plane walker
  (av_pix_fmt_desc_get) so the same hash path covers YUV420P,
  YUV422P, YUV444P, GBRP and other 8-bit planar layouts.
  Generalised after first run decoded testsrc as GBRP (71)
  rather than the assumed YUV420P.
- `daemon` command in main.c opens the chardev and runs the loop
  until SIGINT/SIGTERM. Cookie correlation handled end-to-end.
- ffmpeg_loader gained av_pix_fmt_desc_get (23 symbols total).

Build:
- CMakeLists adds chardev_client.c + decoder.c; explicit
  -I../include for the shared protocol header.
- Still -Wall -Wextra -Wpedantic clean.

Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):

  $ ffmpeg ... -pix_fmt yuv420p -c:v libvpx-vp9 -frames:v 1 \
           -y /tmp/vp9_test.ivf
  $ python3 ... strip IVF framing → vp9_keyframe.bin (3268 B)

  $ sudo insmod kernel/daedalus_v4l2.ko
  $ daedalus_v4l2_daemon -v daemon &
  $ sudo dd if=vp9_keyframe.bin \
         of=/sys/kernel/debug/daedalus_v4l2/test_decode

  daemon: REQ_DECODE cookie=2 → decoded yuv420p 320x240
          fnv1a=0x6ef10d71 luma=76800 chroma=38400
  kernel: RESP_FRAME cookie=2 status=0 320x240 pixfmt=0
          fnv1a=0x6ef10d71  ← matches daemon ✓

Hash properties verified:
  cookie=2  testsrc 3268 B → 0x6ef10d71  (first decode)
  cookie=3  red     44 B   → 0x7f6e5dc5  (content-dependent ✓)
  cookie=4  testsrc 3268 B → 0x6ef10d71  (deterministic ✓)
  cookie=5  64 B random    → status=101  (ERR_SEND, daemon alive)

Daemon survives bad input (FFmpeg "Invalid sync code" wrapped
into structured ERR_SEND response). Clean SIGTERM shutdown,
clean rmmod.

Phase 8.4 acceptance criteria met:
- ✓ end-to-end kernel→daemon→FFmpeg→kernel round-trip
- ✓ cookie correlation per request/response pair
- ✓ content-dependent + deterministic digest
- ✓ structured error responses (no daemon crash on bad input)
- ✓ clean teardown (SIGTERM + rmmod)
- ✓ builds clean on both kernel kbuild and daemon CMake

Per correctness-before-speed:
- Real chardev I/O (no shortcuts, no select-loop hacks)
- Real FFmpeg AVCodecContext lifecycle (lazily opened, properly
  freed on cleanup)
- Descriptor-driven plane walk (generalises across pix_fmts)
- Structured error path (not just log-and-continue)
- All resource paths cleaned up on every error branch
- Documented why FNV-1a digest, why write() not writev(), why
  pix_desc walk in docs/phase_8_4_closure.md

Phase 8.5 next: V4L2 m2m queue submits REQ_DECODE from
vidioc_qbuf; dmabuf carries actual pixel data so the chardev's
64 KiB cap doesn't gate frame size; begin substituting
daedalus_dispatch_* into the daemon's decode path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:22:16 +00:00
marfrit 895f57c63a Phase 8.2: kernel ↔ daemon chardev bridge with round-trip test
Adds /dev/daedalus-v4l2 misc chardev to the kernel module. The
chardev is the IPC channel for the future userspace decoder
daemon: kernel enqueues REQ_* messages, daemon read()s them,
processes, write()s RESP_* back.

Wire protocol (pre-1.0, header in include/daedalus_v4l2_proto.h):
- struct daedalus_msg_hdr: magic (D04V) + version + type +
  cookie + payload_len + reserved
- Request/response separated by high bit of type field
- Max 64 KiB payload per message
- Cookie correlates request with matching response

Kernel implementation (kernel/daedalus_v4l2_chardev.{c,h}):
- Single-instance chardev (-EBUSY on second open)
- In-kernel FIFO bounded at 64 messages
- Blocking + non-blocking read; poll() with EPOLLIN on queued
- write() parses + validates header, logs response at pr_debug
- Bad magic → -EBADMSG, bad version → -EPROTO, oversize → -EMSGSIZE
- All error paths free resources

Phase 8.2 test trigger via debugfs:
- /sys/kernel/debug/daedalus_v4l2/test_ping — any byte
  enqueues a PING with a fixed 24-byte payload. Removed in
  Phase 8.4 when real REQ_DECODE from V4L2 path takes over.

Userspace verification tool (tools/test_chardev_pingpong.c):
- Real C program, proper error reporting via strerror
- Validates the 6-step round-trip: open → empty-queue EAGAIN →
  trigger ping → read PING → verify all fields → write PONG → close
- Builds with -Wall -Wextra clean

Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):
  $ sudo insmod daedalus_v4l2.ko
  $ sudo tools/test_chardev_pingpong
  opening /dev/daedalus-v4l2...
    non-blocking read on empty queue: EAGAIN ✓
    injected PING via debugfs ✓
    read PING: magic ✓ version ✓ type=PING ✓ cookie=0x1234 ✓ payload=24 bytes
      payload: "DAEDALUS-V4L2-PING-PL"
    wrote PONG (cookie=0x1234) ✓
  ALL TESTS PASSED.
  $ sudo rmmod daedalus_v4l2      # clean

Per correctness-before-speed: full kerneldoc on structs, 8-tab
kernel style, SPDX headers, proper error paths, real test
program (not "I ran it once"), failure-mode coverage documented.

Phase 8.3 next: userspace daemon with dlopen'd FFmpeg parse path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:05:54 +00:00
marfrit 9415b7e0f7 Phase 8.1: kernel V4L2 device skeleton (out-of-tree module)
Out-of-tree Linux kernel module registering /dev/videoNN. Phase
8.1 scope: skeleton only — VIDIOC_QUERYCAP works, no codec
ioctls / no vb2_queue / no controls yet.

Real V4L2 plumbing throughout per "correctness before speed":
platform_device + v4l2_device + video_device, properly nested
with error paths and devm_kzalloc-managed lifetime. Per-cycle 9
discipline ports to kernel code: SPDX header, kernel coding
style (8-tab, static-by-default), kerneldoc on structs, no
shortcuts.

Files (~250 LOC total):
- kernel/Makefile — out-of-tree kbuild with checkpatch target
- kernel/daedalus_v4l2_main.c — module init/exit + probe/remove

Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):
- Builds clean with -Wall -Wextra. No warnings.
- modprobe / rmmod round-trip clean. No dmesg taints beyond
  the expected "out-of-tree taint" line.
- v4l2-ctl --list-devices shows: "daedalus-fourier V3D7+NEON
  (platform:daedalus_v4l2): /dev/video0"
- VIDIOC_QUERYCAP returns driver/card/bus/caps as specified.
- v4l2-compliance: 44/48 passing. The 4 failures are exactly
  the format/buffer ioctls Phase 8.2 will implement
  (ENUM_FMT, G_FMT, Scaling, REQBUFS) — not skeleton bugs,
  legitimately-absent features.

Documentation: docs/phase_8_1_closure.md captures full
verification output + Phase 8.2 plan.

Phase 8.1 acceptance criteria met:
- ✓ /dev/videoNN appears via v4l2-ctl --list-devices
- ✓ VIDIOC_QUERYCAP responds with sensible values
- ✓ rmmod is clean (no kref leaks)
- ✓ v4l2-compliance passes except for explicit Phase 8.2 work

Next: Phase 8.2 chardev bridge for kernel ↔ daemon IPC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:03:22 +00:00
marfrit c7d8050cc9 Initial scaffold: daedalus-v4l2 sibling repo
V4L2 stateless decoder for Pi 5, backed by sibling
daedalus-fourier kernel library (VP9 + AV1 CDEF + H.264 video
decode kernels on VideoCore VII compute + ARM NEON).

Architecture locked 2026-05-18 by mfritsche per
daedalus-fourier/docs/phase8_scoping.md:
- Option B: Linux kernel V4L2 shim + userspace daemon (not
  v4l2loopback). Real /dev/videoNN; proper DRM PRIME for
  browser zero-copy.
- Option γ: dlopen FFmpeg at runtime as parser. No vendoring;
  fastest to v1.
- Sibling repo (this repo): V4L2-side work outside of
  daedalus-fourier so kernel-library API stays clean.

Components:
  kernel/ - Linux out-of-tree kernel module (GPLv2; V4L2
    device + chardev bridge to userspace daemon)
  daemon/ - userspace decoder daemon (BSD-2-Clause; links
    libdaedalus_core.a from sibling; dlopens FFmpeg)
  docs/   - architecture + 7-phase roadmap (8.1..8.7)
  include/ - shared headers between kernel and daemon

Roadmap (7 sub-phases, ~1 week each):
  8.1 kernel skeleton (/dev/videoNN with no-op ioctls)
  8.2 chardev bridge (kernel ↔ daemon ping-pong)
  8.3 daemon FFmpeg dlopen + parse path
  8.4 VP9 end-to-end via daedalus_dispatch_*
  8.5 dmabuf / DRM PRIME for zero-copy
  8.6 AV1 + H.264 codec support
  8.7 performance: hit 30fps@1080p (project floor)

No code yet — only README + design docs + directory structure.
First implementation work starts in Phase 8.1 next session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:54:56 +00:00