daedalus-v4l2/docs/phase_8_12_closure.md
marfrit a7d585eee8 Phase 8.12: first VP9 frame decoded via libva
ffmpeg -hwaccel vaapi → libva-v4l2-request-fourier →
/dev/video0 → daedalus_v4l2 kernel → REQ_DECODE on the
chardev → daemon FFmpeg decode → byte-exact NV12 (FNV-1a
0x1eb34bfe, same hash the standalone test_m2m_stream
produces for the same 128x96 VP9 keyframe).

The pixel-correct decode through the libva path is the
milestone. What's NOT yet working: libva times out on the
media_request fd because buf_request_complete never fires
(vb->req_obj.req is NULL when buf_done runs — the
S_EXT_CTRLS EINVAL leaves the buffer un-bound to the
request even though the buffer queues anyway). Phase 8.13
fixes the EINVAL so the request bind takes and the
completion signal propagates.

Kernel V4L2 request API integration:
- media_device_ops.req_validate / req_queue =
  vb2_request_validate / v4l2_m2m_request_queue (Phase 8.11),
  so MEDIA_IOC_REQUEST_ALLOC succeeds.
- vb2_queue.supports_requests = true on OUTPUT queue —
  without this v4l2-core rejects S_EXT_CTRLS(REQUEST_VAL).
- vb2_ops.buf_request_complete = daedalus_buf_request_complete
  → v4l2_ctrl_request_complete(req, &ctx->hdl). Without
  this v4l2-core WARNs at videobuf2-v4l2.c:440.
- vb2_ops.buf_out_validate: sets field=V4L2_FIELD_NONE on
  OUTPUT buf. Required for the same WARN check.
- requires_requests intentionally NOT set: lets the
  existing test_m2m_stream (direct QBUF, no request) keep
  working alongside the libva path.

Stateless control re-registration:
- Switched from v4l2_ctrl_new_std_compound(NULL p_def) to
  v4l2_ctrl_new_custom(&cfg, NULL) — pattern rkvdec /
  cedrus / hantro use. v4l2-core auto-fills elem_size +
  type from std table (verified: VP9_FRAME elem_size=168,
  matches sizeof(struct v4l2_ctrl_vp9_frame)).
- No-op s_ctrl callback so SET requests don't crash —
  daemon ignores values, FFmpeg re-parses the bitstream.

Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):

  ffmpeg -hwaccel vaapi -i vp9_small.ivf …
  daemon: REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
  daemon: decoder: opened vp9 context
  daemon: decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe …

Same FNV-1a hash as the standalone test_m2m_stream produces
for the same VP9 keyframe. End-to-end through libva.

Remaining (Phase 8.13):
- S_EXT_CTRLS EINVAL on V4L2_CID_STATELESS_VP9_FRAME despite
  matching elem_size — needs deeper validate-path debugging.
- Once the request bind takes, buf_request_complete fires
  on buf_done, request fd signals completion, libva DQBUFs
  the decoded NV12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:01:26 +00:00


Phase 8.12 closure — first VP9 frame decoded via libva

Status: closed 2026-05-18.

The libva path now drives our daemon end-to-end: a 1566-byte VP9 keyframe submitted via ffmpeg -hwaccel vaapi → libva-v4l2-request-fourier → /dev/video0 → daedalus_v4l2 kernel → REQ_DECODE on the chardev → daemon FFmpeg decode → byte-exact NV12 output (FNV-1a 0x1eb34bfe, the same hash the standalone test_m2m_stream produces for the same input).

The pixel-correct decode is the milestone. What's not yet working is the libva-side request-completion signal: ffmpeg times out waiting for the media_request to complete and reports "Decoding error" even though the kernel-side decode succeeded. That's Phase 8.13 scope.

What lands

Kernel V4L2 request API integration (kernel/daedalus_v4l2_main.c)

  • media_device_ops.req_validate / req_queue wired in Phase 8.11 — exposed MEDIA_IOC_REQUEST_ALLOC to userspace.
  • vb2_queue.supports_requests = true on the OUTPUT queue — without this v4l2-core rejects VIDIOC_S_EXT_CTRLS(which=REQUEST_VAL) before reaching any driver code.
  • vb2_ops.buf_request_complete = daedalus_buf_request_complete — calls v4l2_ctrl_request_complete(req, &ctx->hdl) so the per-request control state gets unbound when the request completes. Without this v4l2-core WARNs at qbuf: videobuf2-v4l2.c:440: WARN_ON(!q->ops->buf_request_complete).
  • vb2_ops.buf_out_validate — sets the OUTPUT buf's field = V4L2_FIELD_NONE. Required on request-supporting OUTPUT queues for the same WARN check (one above the buf_request_complete one).

Stateless control re-registration

Switched from v4l2_ctrl_new_std_compound(NULL p_def) to v4l2_ctrl_new_custom(&cfg, NULL) with a no-op s_ctrl callback — the pattern rkvdec / cedrus / hantro use for known V4L2_CID_STATELESS_* ids. v4l2-core auto-fills elem_size and type from its internal std-control table (verified via debug prints: VP9_FRAME registers with elem_size=168, matching sizeof(struct v4l2_ctrl_vp9_frame)).

Verification

End-to-end decode through libva

$ LIBVA_DRIVERS_PATH=… LIBVA_DRIVER_NAME=v4l2_request \
  LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
  LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
  ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
         -hwaccel_output_format nv12 \
         -i /tmp/vp9_small.ivf -f rawvideo -y /tmp/out.nv12

daemon log:
  REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
  decoder: opened vp9 context
  decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe luma=12288 chroma=6144

The daemon's decode is byte-exact: the hash matches what the standalone test path produces for the same VP9 keyframe.
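For readers who want to reproduce the comparison, here is a minimal sketch of the hash and the plane-size arithmetic behind the `luma=12288 chroma=6144` figures in the log. It assumes the daemon uses the standard 32-bit FNV-1a (the logged value fits in 32 bits) over tightly packed 8-bit 4:2:0 planes, luma first; the helper names and the hashing order are illustrative assumptions, not taken from the daemon source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV1A32_OFFSET_BASIS 2166136261u
#define FNV1A32_PRIME        16777619u

/* Fold n bytes into a running 32-bit FNV-1a hash: XOR the byte, then multiply. */
static uint32_t fnv1a32_update(uint32_t hash, const uint8_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        hash ^= data[i];
        hash *= FNV1A32_PRIME;   /* wraps modulo 2^32 by design */
    }
    return hash;
}

/* Byte counts for a tightly packed 8-bit 4:2:0 frame (no stride padding). */
static size_t luma_bytes(unsigned w, unsigned h)
{
    return (size_t)w * h;
}

static size_t chroma_bytes(unsigned w, unsigned h)
{
    /* Two chroma components, each subsampled by 2 in both directions. */
    return 2 * (size_t)((w + 1) / 2) * ((h + 1) / 2);
}

/* Hypothetical helper: hash luma, then chroma, as one byte stream.
 * The plane order is an assumption about how the daemon hashes. */
static uint32_t hash_frame_420(const uint8_t *luma, const uint8_t *chroma,
                               unsigned w, unsigned h)
{
    uint32_t hash = FNV1A32_OFFSET_BASIS;
    hash = fnv1a32_update(hash, luma, luma_bytes(w, h));
    return fnv1a32_update(hash, chroma, chroma_bytes(w, h));
}
```

For the 128x96 keyframe, `luma_bytes` gives 12288 and `chroma_bytes` gives 6144, matching the daemon log. Whether `hash_frame_420` reproduces 0x1eb34bfe depends on the daemon hashing the planes in this order, which this sketch does not verify.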

libva side hangs on the request-completion poll

v4l2-request: Unable to set control(s): Invalid argument
v4l2-request: Timeout when waiting for media request
v4l2-request: Unable to reinit media request: Device or resource busy
[vp9] Failed to end picture decode issue: 1 (operation failed)
[vp9] Decoding error: Input/output error

ffmpeg sees nothing because libva times out before any DQBUF. /tmp/out.nv12 ends up zero bytes.

S_EXT_CTRLS still returns EINVAL on V4L2_CID_STATELESS_VP9_FRAME (size=168, struct shape matches the kernel's registered control), but the buffer apparently gets queued anyway, without proper request binding, which is why the decode runs. buf_request_complete then never fires because vb->req_obj.req is NULL when buf_done runs, so libva's wait on the media_request fd times out.
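The wait that times out can be pictured with a small userspace sketch. The media request API signals completion on the request fd as an exceptional condition (POLLPRI), so a client waits roughly as below. This is not libva-v4l2-request's actual code; `wait_for_request` and its return convention are hypothetical.

```c
#include <errno.h>
#include <poll.h>

/* Wait for a media request fd to signal completion.
 * Returns 0 on completion, -ETIMEDOUT if nothing arrived in time,
 * or another negative errno on failure. */
static int wait_for_request(int request_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = request_fd, .events = POLLPRI };
    int ret;

    /* Retry if a signal interrupts the wait (the timeout restarts,
     * which is acceptable for a sketch). */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -errno;
    if (ret == 0)
        return -ETIMEDOUT;      /* the case seen in the log above */
    if (pfd.revents & POLLPRI)
        return 0;               /* request completed */
    return -EIO;                /* POLLERR / POLLHUP / POLLNVAL */
}
```

With the buffer never bound to the request, nothing ever raises POLLPRI on that fd, so a wait like this can only end in the timeout branch. The follow-up "Unable to reinit media request: Device or resource busy" is consistent with that: the request is still queued and incomplete when the client tries to reuse it.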

Design decisions

Why ship even though libva still times out

The actual pixel-decode pipeline through the libva path works. That's a major milestone — Phase 8.10/8.11 stopped at "all setup ioctls succeed but no decode runs"; Phase 8.12 makes the decode run with byte-exact output.

What remains is plumbing on the V4L2 framework side: making the request fd see the buffer's completion. That needs a deeper investigation. The probable route is fixing the S_EXT_CTRLS EINVAL so the request actually binds the buffer, after which buf_request_complete should fire naturally on buf_done. It is a distinct workstream from "is the decode itself working."

Why the no-op s_ctrl is right for our daemon

FFmpeg in the daemon re-parses the bitstream itself; the per-frame stateless controls from libva are advisory. Our s_ctrl returns 0 unconditionally because we don't act on the values — we just need v4l2-core to accept the writes so the request validation passes.

If we ever wanted to skip FFmpeg's parse step and trust libva's parsed values, s_ctrl would need to forward them to the daemon over the chardev — that's a different architecture (Phase 8.14+ if motivated).

Why supports_requests without requires_requests

requires_requests=true would reject VIDIOC_QBUF without a bound media_request_fd. That'd break our existing test_m2m_stream which queues buffers directly (no request).

Keeping requires_requests=false lets both consumers coexist: the libva path uses requests, the test client uses direct QBUF. The cost is that libva queueing failures might fall back to direct QBUF (which is what we see — the buffer runs anyway after S_EXT_CTRLS fails), masking the actual issue.
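The trade-off can be written down as a tiny decision table. The sketch below is a simplified userspace model of how the two queue flags gate a QBUF, not kernel code; videobuf2 performs further checks that are omitted here, and all type and function names are hypothetical.

```c
#include <stdbool.h>

/* The two vb2_queue flags discussed above, modelled as plain booleans. */
struct queue_policy {
    bool supports_requests;
    bool requires_requests;
};

enum qbuf_verdict {
    QBUF_DIRECT,        /* accepted, queued without a request */
    QBUF_VIA_REQUEST,   /* accepted, bound to a media request */
    QBUF_REJECTED       /* refused by the framework */
};

/* Classify a QBUF by whether it carries a request fd. */
static enum qbuf_verdict classify_qbuf(struct queue_policy q, bool has_request_fd)
{
    if (has_request_fd)
        return q.supports_requests ? QBUF_VIA_REQUEST : QBUF_REJECTED;
    return q.requires_requests ? QBUF_REJECTED : QBUF_DIRECT;
}
```

With `supports_requests=true, requires_requests=false` both consumers are accepted, which is the coexistence described above. The same row of the table is also what lets a buffer that lost its request binding still run as a direct QBUF, so the model shows why the failure is masked rather than surfaced.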

What's NOT here (Phase 8.13 scope)

  • Fix the S_EXT_CTRLS EINVAL so the request bind actually takes. Likely sub-issue: the request's per-handler control state isn't initialised correctly when our controls are registered via v4l2_ctrl_new_custom; this may need an explicit v4l2_ctrl_request_setup call or a different registration approach.
  • Verify byte-exact CAPTURE buffer returned to ffmpeg after request fixes.
  • Multi-frame stream via libva (P-frame reference handling across requests).
  • AV1 + H.264 via libva (different control sets, may need adjustments).

Phase 8.13 plan

  1. Deep-dive S_EXT_CTRLS EINVAL: trace what v4l2-core validates beyond elem_size. Possible suspects:
    • validate_new_compound may require ctrl->p_def non-NULL for std compound controls.
    • The control's "request setup" callback may need explicit registration.
  2. Either resolve the EINVAL or document what cedrus/rkvdec do differently (their controls are also v4l2_ctrl_new_custom with bare .id — so why do theirs work?).
  3. Once S_EXT_CTRLS succeeds, buf_request_complete should fire naturally and the request poll should return.
  4. End-to-end byte-exact verification.