Phase 8.12: first VP9 frame decoded via libva
ffmpeg -hwaccel vaapi → libva-v4l2-request-fourier → /dev/video0 →
daedalus_v4l2 kernel → REQ_DECODE on the chardev → daemon FFmpeg decode →
byte-exact NV12 (FNV-1a 0x1eb34bfe, same hash the standalone
test_m2m_stream produces for the same 128x96 VP9 keyframe).

The pixel-correct decode through the libva path is the milestone.

What's NOT yet working: libva times out on the media_request fd because
buf_request_complete never fires (vb->req_obj.req is NULL when buf_done
runs; the S_EXT_CTRLS EINVAL leaves the buffer un-bound to the request
even though the buffer queues anyway). Phase 8.13 fixes the EINVAL so the
request bind takes and the completion signal propagates.

Kernel V4L2 request API integration:
- media_device_ops.req_validate / req_queue = vb2_request_validate /
  v4l2_m2m_request_queue (Phase 8.11): MEDIA_IOC_REQUEST_ALLOC succeeds.
- vb2_queue.supports_requests = true on OUTPUT queue: without this
  v4l2-core rejects S_EXT_CTRLS(REQUEST_VAL).
- vb2_ops.buf_request_complete = daedalus_buf_request_complete →
  v4l2_ctrl_request_complete(req, &ctx->hdl). Without this v4l2-core
  WARNs at videobuf2-v4l2.c:440.
- vb2_ops.buf_out_validate: sets field=V4L2_FIELD_NONE on OUTPUT buf.
  Required for the same WARN check.
- requires_requests intentionally NOT set: lets the existing
  test_m2m_stream (direct QBUF, no request) keep working alongside the
  libva path.

Stateless control re-registration:
- Switched from v4l2_ctrl_new_std_compound(NULL p_def) to
  v4l2_ctrl_new_custom(&cfg, NULL), the pattern rkvdec / cedrus / hantro
  use. v4l2-core auto-fills elem_size + type from std table (verified:
  VP9_FRAME elem_size=168, matches sizeof(struct v4l2_ctrl_vp9_frame)).
- No-op s_ctrl callback so SET requests don't crash: the daemon ignores
  the values, FFmpeg re-parses the bitstream.

Verification on hertz (Pi 5, 6.12.75+rpt-rpi-2712):

  ffmpeg -hwaccel vaapi -i vp9_small.ivf …
  daemon: REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
  daemon: decoder: opened vp9 context
  daemon: decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe …

Same FNV-1a hash as the standalone test_m2m_stream produces for the same
VP9 keyframe. End-to-end through libva.

Remaining (Phase 8.13):
- S_EXT_CTRLS EINVAL on V4L2_CID_STATELESS_VP9_FRAME despite matching
  elem_size: needs deeper validate-path debugging.
- Once the request bind takes, buf_request_complete fires on buf_done,
  request fd signals completion, libva DQBUFs the decoded NV12.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,161 @@
# Phase 8.12 closure: first VP9 frame decoded via libva

**Status:** closed 2026-05-18.

The libva path now drives our daemon end-to-end: a 1566-byte
VP9 keyframe submitted via `ffmpeg -hwaccel vaapi` →
`libva-v4l2-request-fourier` → `/dev/video0` → daedalus_v4l2
kernel → REQ_DECODE on the chardev → daemon FFmpeg decode →
**byte-exact NV12 output** (FNV-1a `0x1eb34bfe`, the same hash
the standalone `test_m2m_stream` produces for the same
input).

The pixel-correct decode is the milestone. What's *not* yet
working is the libva-side request-completion signal: `ffmpeg`
times out waiting for the media_request to complete and reports
"Decoding error" even though the kernel-side decode succeeded.
That's Phase 8.13 scope.

## What lands

### Kernel V4L2 request API integration (`kernel/daedalus_v4l2_main.c`)

- **`media_device_ops.req_validate / req_queue`**: wired to
  `vb2_request_validate` / `v4l2_m2m_request_queue` in Phase 8.11,
  which is what lets `MEDIA_IOC_REQUEST_ALLOC` succeed for userspace.
- **`vb2_queue.supports_requests = true`** on the OUTPUT
  queue: without this v4l2-core rejects
  `VIDIOC_S_EXT_CTRLS(which=REQUEST_VAL)` before reaching any
  driver code.
- **`vb2_ops.buf_request_complete = daedalus_buf_request_complete`**:
  calls `v4l2_ctrl_request_complete(req, &ctx->hdl)` so the
  per-request control state gets unbound when the request
  completes. Without this v4l2-core WARNs at qbuf:
  `videobuf2-v4l2.c:440: WARN_ON(!q->ops->buf_request_complete)`.
- **`vb2_ops.buf_out_validate`**: sets the OUTPUT buf's
  `field = V4L2_FIELD_NONE`. Required on request-supporting
  OUTPUT queues by the neighbouring WARN check (the one just
  above the `buf_request_complete` check).

### Stateless control re-registration

Switched from `v4l2_ctrl_new_std_compound(NULL p_def)` to
`v4l2_ctrl_new_custom(&cfg, NULL)` with a no-op `s_ctrl`
callback, the pattern rkvdec / cedrus / hantro use for known
`V4L2_CID_STATELESS_*` ids. v4l2-core auto-fills `elem_size`
and `type` from its internal std-control table (verified via
debug prints: VP9_FRAME registers with `elem_size=168`,
matching `sizeof(struct v4l2_ctrl_vp9_frame)`).

## Verification

### End-to-end decode through libva

```
$ LIBVA_DRIVERS_PATH=… LIBVA_DRIVER_NAME=v4l2_request \
  LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
  LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
  ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
  -hwaccel_output_format nv12 \
  -i /tmp/vp9_small.ivf -f rawvideo -y /tmp/out.nv12

daemon log:
  REQ_DECODE cookie=1 codec=1 bitstream=1566 bytes capture=128x96 1 planes
  decoder: opened vp9 context
  decoder: OK 128x96 fmt=0 (yuv420p) fnv1a=0x1eb34bfe luma=12288 chroma=6144
```

Daemon decoded byte-exact, same hash as the standalone test
path produces for the same VP9 keyframe.
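
The hash comparison and the `luma=` / `chroma=` figures in the log can be reproduced with a few lines of C. This is a minimal sketch under two assumptions that the log does not confirm: that the daemon uses the standard 32-bit FNV-1a variant, and that it hashes the whole decoded frame buffer. The function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a: XOR each byte in, then multiply by the prime. */
uint32_t fnv1a32(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;      /* FNV-1a offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;            /* FNV prime */
    }
    return h;
}

/* Bytes of luma in an 8-bit 4:2:0 frame: one byte per pixel. */
size_t luma_bytes(unsigned width, unsigned height)
{
    return (size_t)width * height;
}

/*
 * Bytes of chroma in an 8-bit 4:2:0 frame: U and V are each subsampled
 * 2x in both directions. The total is the same whether the samples are
 * interleaved (NV12) or stored as two planes (yuv420p).
 */
size_t chroma_bytes(unsigned width, unsigned height)
{
    return 2 * (size_t)((width + 1) / 2) * ((height + 1) / 2);
}
```

For the 128x96 keyframe these helpers give 12288 luma bytes and 6144 chroma bytes, matching the daemon log line.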

### libva side hangs on the request-completion poll

```
v4l2-request: Unable to set control(s): Invalid argument
v4l2-request: Timeout when waiting for media request
v4l2-request: Unable to reinit media request: Device or resource busy
[vp9] Failed to end picture decode issue: 1 (operation failed)
[vp9] Decoding error: Input/output error
```

ffmpeg receives no frame because libva times out before any DQBUF.
`/tmp/out.nv12` ends up zero bytes.

The S_EXT_CTRLS still returns EINVAL on
`V4L2_CID_STATELESS_VP9_FRAME` (size=168, struct shape matches the
kernel's registered control), but the buffer apparently gets queued
anyway, without proper request binding, which is why the
decode runs. `buf_request_complete` then never fires
because `vb->req_obj.req` is NULL when `buf_done` runs, and
libva's wait on the media_request fd times out.
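
The wait that times out here can be sketched as below. This is a hedged illustration of the generic Media Request API pattern, not the libva driver's actual code: `wait_for_request` is a hypothetical name, and the only kernel behaviour it relies on is that a completed request is reported as `POLLPRI` on the request fd.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait for a media request to complete.
 * Returns 1 when the request completed, 0 on timeout, -1 on error.
 */
int wait_for_request(int request_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = request_fd, .events = POLLPRI };
    int ret;

    /* Retry if a signal interrupts the wait. */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -1;                 /* poll itself failed */
    if (ret == 0)
        return 0;                  /* nothing signalled in time */
    return (pfd.revents & POLLPRI) ? 1 : -1;
}
```

If the buffer was never bound to the request, nothing ever raises `POLLPRI`, so a wait like this runs to its timeout. The follow-up `Device or resource busy` line is consistent with the request still counting as queued when libva tries to reinitialise it.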

## Design decisions

### Why ship even though libva still times out

The actual pixel-decode pipeline through the libva path
works. That's a major milestone: Phase 8.10/8.11 stopped
at "all setup ioctls succeed but no decode runs"; Phase 8.12
makes the decode run with byte-exact output.

What remains is plumbing on the V4L2 framework side: making
the request fd see the buffer's completion. That needs a deeper
investigation. The likely route is fixing the S_EXT_CTRLS
EINVAL so the request actually binds the buffer, after which
`buf_request_complete` should fire naturally on `buf_done`.
Either way, it's a distinct workstream from "is the decode
itself working."

### Why the no-op s_ctrl is right for our daemon

FFmpeg in the daemon re-parses the bitstream itself; the
per-frame stateless controls from libva are advisory. Our
`s_ctrl` returns 0 unconditionally because we don't act on
the values. We just need v4l2-core to accept the writes so
the request validation passes.

If we ever wanted to skip FFmpeg's parse step and trust
libva's parsed values, `s_ctrl` would need to forward them
to the daemon over the chardev. That's a different
architecture (Phase 8.14+ if motivated).

### Why `supports_requests` without `requires_requests`

`requires_requests=true` would reject any `VIDIOC_QBUF` that has
no bound media request fd. That would break our existing
`test_m2m_stream`, which queues buffers directly (no request).

Keeping `requires_requests=false` lets both consumers
coexist: the libva path uses requests, the test client uses
direct QBUF. The cost is that a libva queueing failure can
fall back to direct QBUF (which is what we see: the buffer
runs anyway after S_EXT_CTRLS fails), masking the actual
issue.
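
The trade-off can be made concrete with a toy userspace model. This is a deliberately simplified sketch, not the vb2 implementation: `struct queue_caps` and `qbuf_accepted` are hypothetical names, and the real check carries further state (for example, whether the queue is already in use in the other mode) that is omitted here.

```c
#include <stdbool.h>

/* Toy stand-in for the two vb2_queue flags discussed above. */
struct queue_caps {
    bool supports_requests;
    bool requires_requests;
};

/*
 * Would a QBUF be accepted, given whether it carries a request fd?
 * Simplified model only; see the caveats in the surrounding text.
 */
bool qbuf_accepted(const struct queue_caps *q, bool has_request_fd)
{
    if (!has_request_fd)
        return !q->requires_requests;  /* direct QBUF path */
    return q->supports_requests;       /* request-bound QBUF path */
}
```

With `supports_requests=true, requires_requests=false`, both a request-bound and a plain QBUF are accepted, which is the coexistence described above and also the reason an unbound buffer can slip through unnoticed.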

## What's NOT here (Phase 8.13 scope)

- **Fix the S_EXT_CTRLS EINVAL** so request bind actually
  takes. Likely sub-issue: the request's per-handler
  control state isn't initialised correctly when our
  controls were registered via `v4l2_ctrl_new_custom`;
  this may need an explicit `v4l2_ctrl_request_setup` or a
  different registration approach.
- **Verify byte-exact CAPTURE buffer returned to ffmpeg**
  after request fixes.
- **Multi-frame stream via libva** (P-frame reference handling
  across requests).
- **AV1 + H.264 via libva** (different control sets, may
  need adjustments).

## Phase 8.13 plan

1. Deep-dive S_EXT_CTRLS EINVAL: trace what v4l2-core
   validates beyond elem_size. Possible suspects:
   - `validate_new_compound` may require `ctrl->p_def` non-NULL
     for std compound controls.
   - The control's "request setup" callback may need
     explicit registration.
2. Either resolve the EINVAL or document what cedrus/rkvdec
   do differently (their controls are also `v4l2_ctrl_new_custom`
   with bare `.id`, so why do theirs work?).
3. Once S_EXT_CTRLS succeeds, buf_request_complete should
   fire naturally and the request poll should return.
4. End-to-end byte-exact verification.