350 LoC across 6 files in libva backend. vainfo advertises AV1. av1_set_controls stub returns -1 cleanly; ffmpeg-vaapi falls back to SW; no crash, no IOMMU fault.

Three devices opened at backend init (rkvdec /dev/video1 + legacy hantro /dev/video2 + vpu981 /dev/video4) via capability-based probe. HEVC + H264 + VP8 + MPEG2 unaffected.

Remaining (Phase 2.1): fill_sequence (75 LoC), fill_frame (257 LoC), fill_film_grain (67 LoC), fill_tile_group_entries + 4-control batch submit (~50 LoC) = ~450 LoC of field-mapping work + Janet F1/F2/F3 silent-failure-mode handling. Fresh-focus session recommended.

Phase 3 test vectors locked: av1-1-b8-01-size-208x208.ivf (smoke), av1-1-b8-23-film_grain-50.ivf (F1 multi-tile + F3 film_grain stress). Byte-compare libva vs kdirect (Phase 0 verified bit-perfect).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 2 step 4 close — AV1 dispatch SCAFFOLDING wired and verified
Date: 2026-05-17 ~10:30. Phase 2 scaffolding complete. Stub av1_set_controls returns -1 cleanly; ffmpeg falls back to SW; no crash, no IOMMU fault.
What's done (3 commits on libva backend branch av1-iter1)
| Commit | Subject | Lines |
|---|---|---|
| bed75c0 | Phase 2 step 1: third-device fd scaffolding for vpu981 | +154 |
| 61db76e | Phase 2 step 2: advertise VAProfileAV1Profile0 via libva | +19 |
| 78a9978 | Phase 2 step 4: AV1 dispatch scaffolding compiles and wires | +177 |
Total: 350 LoC across 6 files. Module installed at /usr/lib/dri/v4l2_request_drv_video.so on ampere (md5 adab8fa005ec3187f5127d0dd6605690).
Verification
$ vainfo
VAProfileMPEG2Simple/Main, H264×5, HEVC, VP8, AV1 — all advertised
$ LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hwaccel vaapi -i /tmp/test_av1.ivf ...
v4l2-request: auto-selected codec device: /dev/video1 + /dev/media0
v4l2-request: iter38: also opened hantro-vpu decoder at /dev/video2 + /dev/media1
v4l2-request: ampere-av1: vpu981 AV1 decoder at /dev/video4 + /dev/media3
v4l2-request: ampere-av1: av1_set_controls stub — Phase 2.1 will implement ...
[av1] HW accel end frame fail.
[dec:av1] Error submitting packet to decoder: Input/output error
Three devices opened, vpu981 found via capability probe, AV1 routed to stub, graceful fall-back to SW. Other codecs unaffected.
Phase 2.1 remaining (the bulk — ~700 LoC)
Implement the body of av1_set_controls in ~/src/libva-v4l2-request-fourier/src/av1.c:
| Function | Reference (Kwiboo) | Notes |
|---|---|---|
| fill_sequence() | lines 55-128 | 75 LoC; AV1 sequence flags from VADecPictureParameterBufferAV1.seq_fields.bits.* |
| fill_frame() | lines 130-385 | 257 LoC; the heavy one; F1 (mi_col/row_starts sentinel), F2 (superres_denom AV1_SUPERRES_NUM=8), F3 (loop_restoration_size USES_LR gate) ALL apply here |
| fill_film_grain() | lines 387-452 | 67 LoC; only if context->has_film_grain (init-time probe) |
| fill_tile_group_entries() | lines 481-522 | tile_group_entries[] DYNAMIC_ARRAY; size = sizeof × MAX(N,1) |
| av1_set_controls() body | lines 524-560 | 4-control batch + request_fd submit |
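Two of the rules in the table are small enough to sketch in isolation. The block below is an illustration under stated assumptions, not the Kwiboo reference code: `struct tile_group_entry` is a stand-in for the kernel's tile-group-entry struct (four 32-bit fields assumed), both function names are hypothetical, and F2 is read here as "when superres is off, the denominator must equal SUPERRES_NUM (8)", which is how the AV1 bitstream spec derives SuperresDenom. Whether VA-API hands over the coded or the derived denominator is not stated in this document.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel uAPI tile-group-entry struct (layout assumed). */
struct tile_group_entry {
    uint32_t tile_offset;
    uint32_t tile_size;
    uint32_t tile_row;
    uint32_t tile_col;
};

/*
 * Payload size for the DYNAMIC_ARRAY control: sizeof × MAX(N, 1).
 * The size is clamped to at least one entry so it is never zero.
 */
size_t tile_group_payload_size(unsigned int num_tiles)
{
    unsigned int n = num_tiles > 0 ? num_tiles : 1;

    return sizeof(struct tile_group_entry) * n;
}

/* Constants from the AV1 bitstream specification. */
#define SUPERRES_NUM        8
#define SUPERRES_DENOM_MIN  9

/*
 * Spec derivation of SuperresDenom: coded_denom (3 bits) is only read when
 * use_superres is set; otherwise the denominator is SUPERRES_NUM (scale 8/8).
 */
uint8_t superres_denom(int use_superres, uint8_t coded_denom)
{
    if (!use_superres)
        return SUPERRES_NUM;
    return (uint8_t)(coded_denom + SUPERRES_DENOM_MIN);
}
```

Writing the sentinel cases as tiny pure functions makes them easy to unit-test before they are buried inside fill_frame().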
Plus context init for has_film_grain flag (VIDIOC_QUERY_EXT_CTRL probe at first AV1 context open).
Phase 3 testing plan (unchanged from v3 amendment)
- Test 208×208 (single-tile, smoke test): av1-1-b8-01-size-208x208.ivf
- Test 352×288 with film_grain + tiling: av1-1-b8-23-film_grain-50.ivf (F1 + F3 stress)
- Byte-compare HW (libva) vs HW (kdirect, verified Phase 0 bit-perfect); should be 100% identical
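The byte-compare step can be done with any standard tool; for completeness, here is a minimal C sketch of the same check that also reports the first differing offset, which helps locate the frame where two decodes diverge. The function name `compare_files` and its return convention are choices made for this sketch.

```c
#include <stdio.h>

/*
 * Compare two files byte by byte.
 * Returns 0 if identical, 1 if they differ (offset of the first mismatch is
 * stored in *first_diff when non-NULL), and -1 on open or read error.
 * A length mismatch counts as a difference at the shorter file's length.
 */
int compare_files(const char *path_a, const char *path_b, long long *first_diff)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int result = -1;

    if (fa && fb) {
        long long offset = 0;
        int ca, cb;

        do {
            ca = getc(fa);
            cb = getc(fb);
            if (ca != cb)
                break;
            offset++;
        } while (ca != EOF);

        if (ferror(fa) || ferror(fb)) {
            result = -1;
        } else if (ca == cb) {
            result = 0;             /* both reached EOF together */
        } else {
            result = 1;
            if (first_diff)
                *first_diff = offset;
        }
    }

    if (fa)
        fclose(fa);
    if (fb)
        fclose(fb);
    return result;
}
```

For raw decoded output with a known frame size, dividing the reported offset by the bytes per frame gives the index of the first frame that differs between the libva and kdirect paths.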
Pending follow-ups
- Push av1-iter1 branch to gitea (vp9 branch had ssh sideband disconnect; may recur)
- Janet code review of completed av1.c after Phase 2.1 implementation
- Real-world 1080p AV1 stress test post-Phase 3 bit-perfect verification
State at this checkpoint
- ampere: iter38b-vp9-iter1-base (sibling close) + Phase 2 scaffolding installed; HEVC bit-perfect retained, AV1 falls back cleanly to SW until Phase 2.1
- backend repo on ampere: branch av1-iter1 with 3 new commits stacked on iter2 step4
- campaign repo: 4 plan docs + Phase 0 + this step4 close
Honest assessment
Scaffolding work was 350 LoC of mostly-mechanical infrastructure (~3 hours). The remaining ~700 LoC of av1.c body IS the field-mapping work where Janet's F1/F2/F3 silent-failure modes lurk. That's fresh-focus territory, not "keep grinding" territory. Recommend Phase 2.1 in a separate focused session.