0de0288dce
Brings daedalus_v4l2 from "standalone test client" to
"VAAPI-discoverable decoder" by adding the surface formats and
media-controller plumbing that libva-v4l2-request-fourier
(sibling repo) requires.

libva-v4l2-request-fourier patches (pushed separately):
  - b5b3acf: daedalus_v4l2 added to known_decoder_drivers
  - 2146341: meson option gate

This commit (daedalus-v4l2 side, 3 production changes):

1. V4L2_PIX_FMT_NV12 (single-plane) on CAPTURE
   - Added to daedalus_capture_formats[] alongside NV12M + P010
   - daedalus_fill_capture_fmt handles num_planes=1 case
     (sizeimage = W*H*3/2, bytesperline = W)
   - daemon pack_nv12_single_to_plane: Y at base+0,
     interleaved CbCr at base+(stride*H); same byte content
     as NV12M two-plane, different layout
   - Required because libva-v4l2-request-fourier's video.c
     only knows non-multi-plane NV12 (it advertises
     v4l2_mplane=true but uses the single-plane fourcc).
   - Verified byte-exact via test_m2m_stream against
     ffmpeg -pix_fmt nv12 reference (VP9 1080p 10 frames,
     31 MB).

2. V4L2 Request API media ops
   - daedalus_media_ops = { vb2_request_validate,
     v4l2_m2m_request_queue } assigned to mdev.ops before
     media_device_init.
   - Without this, MEDIA_IOC_REQUEST_ALLOC returned
     -ENOTTY and no VAAPI consumer could allocate a
     media_request.

3. Stateless control registration via v4l2_ctrl_new_custom
   - Switched from v4l2_ctrl_new_std_compound(NULL p_def)
     to v4l2_ctrl_new_custom, the pattern rkvdec/cedrus/
     hantro use. Adds a no-op s_ctrl callback.

Verification (hertz, Pi 5, 6.12.75+rpt-rpi-2712):

  LibVA trace through `ffmpeg -hwaccel vaapi`:
    vaInitialize / Profiles / Entrypoints / CreateConfig /
    QuerySurfaceAttributes / CreateSurfaces / CreateContext
    (cap_pool: 24 slots, 1 plane each) / CreateBuffer
    (slice + picture params) / MEDIA_IOC_REQUEST_ALLOC:
    all succeed.

  Standalone NV12 decode path:
    test_m2m_stream vp9_1080_stream.ivf out.nv12 1920 1080 vp9 nv12
    → 10/10 frames, byte-exact vs ffmpeg -pix_fmt nv12

  vainfo (via libva-v4l2-request-fourier with our driver):
    7 VAProfile entries with VAEntrypointVLD
    (H264 Main/High/CBaseline/MultiviewHigh/StereoHigh,
    VP9Profile0, AV1Profile0)

What's NOT here (Phase 8.12):

  The libva trace stops at VIDIOC_S_EXT_CTRLS returning
  EINVAL when populating V4L2_CID_STATELESS_VP9_FRAME on
  the request. The kernel's compound-control validation
  rejects the payload against the struct shape it expects.

  This isn't a "missing line" fix: it needs proper
  stateless control plumbing (the SPS/PPS/SliceParams
  get_dims, validate, default-value paths that in-tree
  rkvdec/cedrus/hantro implement to satisfy v4l2-core's
  std_validate). Documented as Phase 8.12 scope.

The shipped integration is itself a meaningful deliverable:
all the framework scaffolding is in place; the remaining
gap is well-characterised and bounded.

See docs/phase_8_10_11_closure.md for the full trace
analysis + next-phase plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Phase 8.10 + 8.11 closure: libva consumer integration scaffold

**Status:** closed 2026-05-18.

Two interleaved phases:

- **Phase 8.10**: wire daedalus_v4l2 into the existing
  `libva-v4l2-request-fourier` campaign fork (the sibling
  repo at `marfrit/libva-v4l2-request-fourier` already had
  VP9/AV1/H264/HEVC working on Rockchip/Allwinner).
- **Phase 8.11**: extend daedalus_v4l2 with the V4L2
  surface format and media-controller plumbing the libva
  fork needs.

Together they bring the daedalus stack from "standalone
test client" to "VAAPI-discoverable decoder with all the
ICD-side framework integration in place." Actual decode
through the libva path stops at the V4L2 stateless control
payload acceptance step, a deeper framework integration
that lands in Phase 8.12.

## What lands

### libva-v4l2-request-fourier (sibling fork, gitea `marfrit/`)

Two commits pushed to `master`:

- `b5b3acf`: daedalus_v4l2 added to known_decoder_drivers,
  plus a multi-device-probe slot. Same shape as iter40's
  rpi-hevc-dec wiring: array entry, driver_data fd slot,
  primary-driver detection branch, post-probe log line.
  34-line diff.
- `2146341`: daedalus_v4l2 meson option gate (default
  true). `meson setup -Ddaedalus_v4l2=false` builds a .so
  with no daedalus strings at all (verified via `strings`
  on both build dirs). Struct fields stay unconditional
  to avoid ODR risk across translation units.

### daedalus-v4l2 (this repo)

Three production changes in this commit:

**1. `V4L2_PIX_FMT_NV12` (single-plane) on CAPTURE**

The libva fork's `video_format` table only knows NV12
single-plane (`W*H` Y bytes followed by `W*H/2` interleaved
CbCr bytes in one buffer plane), not NV12M (two-plane).
Added NV12 alongside our existing NV12M + P010 in the
CAPTURE format list:

- `daedalus_capture_formats[]` grew a `V4L2_PIX_FMT_NV12`
  entry; `enum_fmt` now lists 3 CAPTURE formats.
- `daedalus_fill_capture_fmt` handles the new num_planes=1
  layout (`sizeimage = W*H*3/2`, `bytesperline = W`).
- daemon `pack_nv12_single_to_plane`: Y line-by-line into
  `base+0`, interleaved CbCr at `base+(stride*H)`. Mirrors
  the P010 pack structure minus the depth shift.
- `daedalus_decoder_run_request` dispatches on
  `req->capture_pix_fmt` to the right pack function.

**Verified**: VP9 1080p decoded into NV12 single-plane via
`tools/test_m2m_stream`, byte-for-byte match against
`ffmpeg -pix_fmt nv12` reference (10-frame 31 MB stream).

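The single-plane layout from change 1 can be sketched in plain C. This is a minimal illustration, not the daemon's actual `pack_nv12_single_to_plane`: the function names, the signature, and the assumption that the decoder hands back planar 4:2:0 (separate Cb and Cr planes) are all hypothetical. It shows the two facts the driver relies on: `sizeimage = W*H*3/2`, and chroma starting at `base + stride*H`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes for one tightly packed 8-bit NV12 frame: W*H luma + W*H/2 chroma. */
static size_t nv12_single_sizeimage(uint32_t width, uint32_t height)
{
    return (size_t)width * height * 3 / 2;
}

/*
 * Hypothetical packer (names and signature are illustrative only).
 * Copies luma rows to dst+0, then interleaves Cb/Cr rows starting at
 * dst + dst_stride * height. Assumes even width and height.
 */
static void pack_nv12_single(uint8_t *dst, size_t dst_stride,
                             const uint8_t *y, size_t y_stride,
                             const uint8_t *cb, size_t cb_stride,
                             const uint8_t *cr, size_t cr_stride,
                             uint32_t width, uint32_t height)
{
    uint8_t *dst_uv = dst + dst_stride * height;

    /* Luma: one memcpy per row, honouring source and destination strides. */
    for (uint32_t row = 0; row < height; row++)
        memcpy(dst + (size_t)row * dst_stride,
               y + (size_t)row * y_stride, width);

    /* Chroma: height/2 rows, each holding width/2 CbCr pairs (width bytes). */
    for (uint32_t row = 0; row < height / 2; row++) {
        uint8_t *out = dst_uv + (size_t)row * dst_stride;
        const uint8_t *cb_row = cb + (size_t)row * cb_stride;
        const uint8_t *cr_row = cr + (size_t)row * cr_stride;

        for (uint32_t col = 0; col < width / 2; col++) {
            out[2 * col] = cb_row[col];
            out[2 * col + 1] = cr_row[col];
        }
    }
}
```

For 1920x1080 the size helper gives 3110400 bytes, which is the single-plane `sizeimage` the standalone test client reports.
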
**2. V4L2 Request API media ops**

`daedalus_media_ops = { .req_validate = vb2_request_validate,
.req_queue = v4l2_m2m_request_queue }` is assigned to
`mdev.ops` before `media_device_init`. Before this,
`MEDIA_IOC_REQUEST_ALLOC` returned `-ENOTTY`, so no
VAAPI consumer could even allocate a media_request.

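From userspace, the effect of the media ops is visible through a single ioctl. The sketch below is a hypothetical helper (`alloc_media_request` is not a name from either repo) that wraps `MEDIA_IOC_REQUEST_ALLOC` and reports failures as negative errno values, so the `-ENOTTY` case described above is easy to tell apart from success.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/media.h>

/*
 * Hypothetical helper: ask the media device for a new request.
 * Returns the request fd (>= 0) on success, or -errno on failure.
 * A driver without request ops is expected to fail with -ENOTTY.
 */
static int alloc_media_request(int media_fd)
{
    int request_fd = -1;

    if (ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &request_fd) < 0)
        return -errno;

    return request_fd;
}
```

A caller would open the media node (for example the `/dev/media3` path used in the verification run), call the helper, and close the returned fd once the request has completed.
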
**3. Stateless control registration via `v4l2_ctrl_new_custom`**

Switched from `v4l2_ctrl_new_std_compound(NULL p_def)` to
`v4l2_ctrl_new_custom(&cfg, NULL)`, the pattern rkvdec /
cedrus / hantro use. Adds a no-op `s_ctrl` callback so
v4l2-core has somewhere to dispatch SET operations.

## Verification

### Probe + enumeration

```
$ LIBVA_DRIVER_NAME=v4l2_request \
  LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0 \
  LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media3 \
  vainfo --display drm --device /dev/dri/renderD128

v4l2-request: phase 8.10: opened daedalus_v4l2 at video_fd=N media_fd=M
vainfo: Driver version: v4l2-request
vainfo: Supported profile and entrypoints
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264MultiviewHigh      : VAEntrypointVLD
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
```

Seven VAAPI profiles enumerated through the libva path.

### LibVA trace through `ffmpeg -hwaccel vaapi`

| Step | Status |
|------|--------|
| `vaInitialize` | ✓ |
| `vaQueryConfigProfiles` | ✓ |
| `vaQueryConfigEntrypoints` (VLD) | ✓ |
| `vaCreateConfig` (VP9 + VLD + NV12) | ✓ |
| `vaQuerySurfaceAttributes` (NV12 fourcc reported) | ✓ |
| `vaCreateSurfaces` | ✓ |
| `vaCreateContext` (cap_pool: 24 slots, 1 plane each) | ✓ |
| `vaCreateBuffer` (slice + picture params) | ✓ |
| `MEDIA_IOC_REQUEST_ALLOC` | ✓ |
| `VIDIOC_S_EXT_CTRLS` (stateless ctrls) | ✗ EINVAL |
| `VIDIOC_QBUF` with request fd | ✗ "Invalid request descriptor" |
| `vaEndPicture` | ✗ OPERATION_FAILED |

Everything up to and including discovery / probe / context /
surface / buffer / request alloc works. The blocker is
`VIDIOC_S_EXT_CTRLS` returning EINVAL when libva tries to
populate `V4L2_CID_STATELESS_VP9_FRAME` on the request:
the kernel's payload validation rejects it against the
compound-control struct shape it expects.

This isn't a "missing line" fix. It needs proper
stateless control plumbing (the SPS/PPS/SliceParams/etc.
get_dims, validate, default-value paths that the in-tree
rkvdec / cedrus / hantro decoders implement to satisfy
v4l2-core's `std_validate` machinery). That's Phase 8.12
scope.

### Standalone NV12 verification

```
$ sudo ./tools/test_m2m_stream /tmp/vp9_1080_stream.ivf \
    /tmp/nv12_out.nv12 1920 1080 vp9 nv12
parsed 10 frames, 1920x1080
CAPTURE fmt=NV12 planes=1 sizeimage=[3110400,0]
decoded 10 / 10 frames
$ cmp /tmp/nv12_out.nv12 /tmp/vp9_1080_stream_ref.nv12
$ echo $?
0
```

Byte-exact through the daedalus-internal path with the
NV12 single-plane format. Confirms `pack_nv12_single_to_plane`
produces the same pixels as `pack_nv12_to_planes` (just
re-laid-out into one buffer).

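The claim that both layouts carry the same bytes can be checked mechanically. A minimal sketch, assuming tightly packed buffers (stride equal to width) and a hypothetical helper name: the single-plane NV12 buffer should equal the NV12M Y plane followed immediately by the NV12M CbCr plane.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical check: does a single-plane NV12 buffer hold the same
 * bytes as the two NV12M planes laid end to end?
 * Assumes stride == width for all three buffers.
 */
static bool nv12_single_matches_nv12m(const uint8_t *single,
                                      const uint8_t *plane_y,
                                      const uint8_t *plane_uv,
                                      uint32_t width, uint32_t height)
{
    size_t y_bytes = (size_t)width * height; /* luma plane size */
    size_t uv_bytes = y_bytes / 2;           /* interleaved CbCr size */

    return memcmp(single, plane_y, y_bytes) == 0 &&
           memcmp(single + y_bytes, plane_uv, uv_bytes) == 0;
}
```

This is the in-memory equivalent of the `cmp` run above; the file comparison remains the actual verification that was performed.
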
## Design decisions

### Why ship even though decode-via-libva is blocked

The framework integration up to MEDIA_IOC_REQUEST_ALLOC is
itself a significant deliverable:

- Other VAAPI consumers (testing tools, future patched
  ffmpeg paths, custom VA clients) get the same scaffolding
  for free.
- The remaining gap is well-characterised: it's the
  stateless control payload acceptance, a single named
  V4L2 framework integration. Phase 8.12's surface is
  clearly defined.
- All the new code is correct on its own merits (NV12
  single-plane decode is byte-exact via our own path;
  request API ops are the canonical helpers).

### Why register stateless controls if we don't act on them

libva-v4l2-request-fourier's per-codec dispatch requires
the standard V4L2 stateless controls to be visible; that's
how it validates that the kernel supports the right
profile. Without the registered controls vainfo would
not enumerate the profile.

Our daemon ignores the control values because FFmpeg
re-parses the bitstream on its own. The plumbing exists
to satisfy the V4L2 stateless contract; the actual
decode logic doesn't depend on it.

### Why `v4l2_ctrl_new_custom` over `_std_compound`

`v4l2_ctrl_new_std_compound` with a `NULL` default rejected
SET requests (the same EINVAL libva is hitting today;
removing the `NULL` default didn't fix it either).
`v4l2_ctrl_new_custom` is the pattern in-tree decoders
use; v4l2-core auto-fills the type/dims/size from the std
control table when given just a `.id`.

The remaining issue isn't the registration pattern but
the payload validation path. v4l2-core expects more
than just a registered control: it expects the driver to
have set up `min/max/step/def` for each compound field.
`v4l2_ctrl_new_std_compound` does that internally for
known CIDs, but our handler doesn't yet get it right.

## What's NOT here (Phase 8.12 scope)

- **Stateless control payload acceptance**: the S_EXT_CTRLS
  EINVAL. Needs proper v4l2-core validation hooks,
  likely meaning the daemon needs to actually consume the
  per-frame controls (not just ignore them), so the
  validation path has something to hand off to.
- **Per-codec control wiring**: even if S_EXT_CTRLS
  succeeds, the actual decode submission to the daemon
  needs to bundle the per-buffer controls (or document
  why they're ignored, and convince v4l2-core to allow
  the request to validate).
- **First end-to-end decoded frame via libva**: the
  payoff for Phase 8.12.

## Phase 8.12 plan

1. Study cedrus or rkvdec's stateless control validation
   to understand what `std_validate` expects beyond just
   registration.
2. Either:
   - (a) Add proper compound-control validation hooks so
     S_EXT_CTRLS succeeds without us doing real work
     (control values become "advisory" since daemon
     re-parses bitstream), OR
   - (b) Wire the daemon to actually use the per-frame
     control payload (skip FFmpeg's parse step, trust
     libva's parsed values). Bigger change but more
     correct.
3. Verify first frame decoded through the libva path.
4. Run the full `vainfo --display drm --decode` test if
   that exists, or a small VA decode snippet.