ampere-av1-enablement/phase2_step4_close.md
claude-noether 4ed18d6837 Phase 2 step 4 close: AV1 dispatch scaffolding verified
350 LoC across 6 files in libva backend. vainfo advertises AV1.
av1_set_controls stub returns -1 cleanly; ffmpeg-vaapi falls back to
SW; no crash, no IOMMU fault.

Three devices opened at backend init (rkvdec /dev/video1 + legacy
hantro /dev/video2 + vpu981 /dev/video4) via capability-based probe.
HEVC + H264 + VP8 + MPEG2 unaffected.

Remaining: Phase 2.1 — fill_sequence (75 LoC), fill_frame (257 LoC),
fill_film_grain (67 LoC), fill_tile_group_entries + 4-control batch
submit (~50 LoC) = ~450 LoC of field-mapping work + Janet F1/F2/F3
silent-failure-mode handling. Fresh-focus session recommended.

Phase 3 test vectors locked: av1-1-b8-01-size-208x208.ivf (smoke),
av1-1-b8-23-film_grain-50.ivf (F1 multi-tile + F3 film_grain stress).
Byte-compare libva vs kdirect (Phase 0 verified bit-perfect).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 07:55:56 +00:00


Phase 2 step 4 close — AV1 dispatch SCAFFOLDING wired and verified

Date: 2026-05-17 ~10:30. Phase 2 scaffolding complete. Stub av1_set_controls returns -1 cleanly; ffmpeg falls back to SW; no crash, no IOMMU fault.

What's done (3 commits on libva backend branch av1-iter1)

| Commit | Subject | Lines |
|---|---|---|
| bed75c0 | Phase 2 step 1: third-device fd scaffolding for vpu981 | +154 |
| 61db76e | Phase 2 step 2: advertise VAProfileAV1Profile0 via libva | +19 |
| 78a9978 | Phase 2 step 4: AV1 dispatch scaffolding compiles and wires | +177 |

Total: 350 LoC across 6 files. Module installed at /usr/lib/dri/v4l2_request_drv_video.so on ampere (md5 adab8fa005ec3187f5127d0dd6605690).

Verification

$ vainfo
VAProfileMPEG2Simple/Main, H264×5, HEVC, VP8, AV1   — all advertised

$ LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hwaccel vaapi -i /tmp/test_av1.ivf ...
v4l2-request: auto-selected codec device: /dev/video1 + /dev/media0
v4l2-request: iter38: also opened hantro-vpu decoder at /dev/video2 + /dev/media1
v4l2-request: ampere-av1: vpu981 AV1 decoder at /dev/video4 + /dev/media3
v4l2-request: ampere-av1: av1_set_controls stub — Phase 2.1 will implement ...
[av1] HW accel end frame fail.
[dec:av1] Error submitting packet to decoder: Input/output error

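All three devices open, the capability probe finds vpu981, and AV1 is routed to the stub. The two error lines at the end are the expected result of the stub returning -1; ffmpeg then falls back to SW decoding. Other codecs are unaffected.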
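The probe rule itself is not reproduced in this document, so the following is only a minimal sketch of what "capability-based" can mean: classify a device by the coded formats it enumerates instead of by its /dev/videoN number. `classify_decoder`, the `decoder_role` enum and the precedence rule are hypothetical, and the fourcc values are assumed to match the `V4L2_PIX_FMT_*` definitions in `<linux/videodev2.h>`.

```c
#include <stddef.h>
#include <stdint.h>

/* Local fourcc helper so the sketch builds without kernel headers. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Assumed to mirror V4L2_PIX_FMT_{HEVC_SLICE,VP8_FRAME,AV1_FRAME};
 * verify against <linux/videodev2.h> before reusing. */
#define FMT_HEVC_SLICE FOURCC('S', '2', '6', '5')
#define FMT_VP8_FRAME  FOURCC('V', 'P', '8', 'F')
#define FMT_AV1_FRAME  FOURCC('A', 'V', '1', 'F')

/* Hypothetical roles for the three device fds kept by the backend. */
enum decoder_role { ROLE_NONE, ROLE_PRIMARY, ROLE_LEGACY, ROLE_AV1 };

/*
 * Decide a device's role from the coded formats it enumerated
 * (for example via VIDIOC_ENUM_FMT on its OUTPUT queue).
 * Hypothetical precedence: AV1 > HEVC (primary) > VP8 (legacy).
 */
enum decoder_role classify_decoder(const uint32_t *fmts, size_t n)
{
    enum decoder_role role = ROLE_NONE;

    for (size_t i = 0; i < n; i++) {
        if (fmts[i] == FMT_AV1_FRAME)
            return ROLE_AV1;
        if (fmts[i] == FMT_HEVC_SLICE)
            role = ROLE_PRIMARY;
        else if (fmts[i] == FMT_VP8_FRAME && role == ROLE_NONE)
            role = ROLE_LEGACY;
    }
    return role;
}
```

Keying on formats rather than node numbers means the mapping survives a kernel that enumerates the video nodes in a different order.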

Phase 2.1 remaining (the bulk — ~700 LoC)

Implement the body of av1_set_controls in ~/src/libva-v4l2-request-fourier/src/av1.c:

| Function | Reference (Kwiboo) | Notes |
|---|---|---|
| fill_sequence() | lines 55-128 | 75 LoC; AV1 sequence flags from VADecPictureParameterBufferAV1.seq_fields.bits.* |
| fill_frame() | lines 130-385 | 257 LoC; the heavy one. F1 (mi_col/row_starts sentinel), F2 (superres_denom AV1_SUPERRES_NUM=8), F3 (loop_restoration_size USES_LR gate) ALL apply here |
| fill_film_grain() | lines 387-452 | 67 LoC; only if context->has_film_grain (init-time probe) |
| fill_tile_group_entries() | lines 481-522 | tile_group_entries[] DYNAMIC_ARRAY; size = sizeof × MAX(N,1) |
| av1_set_controls() body | lines 524-560 | 4-control batch + request_fd submit |
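Two of the notes in the table are plain arithmetic and can be sketched in isolation. The struct below is a local stand-in assumed to have the same layout as the kernel's `struct v4l2_ctrl_av1_tile_group_entry`; both function names are hypothetical. The superres helper is one reading of F2: when superres is off, the AV1 spec defines the denominator as SUPERRES_NUM (8), so a zero or stale value from the caller should not be passed through.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct v4l2_ctrl_av1_tile_group_entry (assumed layout). */
struct tile_group_entry {
    uint32_t tile_offset;
    uint32_t tile_size;
    uint32_t tile_row;
    uint32_t tile_col;
};

/*
 * Payload size for the tile_group_entries[] dynamic-array control:
 * sizeof(entry) * MAX(N, 1), so the payload is never zero-sized
 * even when no tile entries were collected.
 */
size_t tile_group_entries_size(size_t n_tiles)
{
    size_t count = n_tiles > 1 ? n_tiles : 1; /* MAX(N, 1) */
    return sizeof(struct tile_group_entry) * count;
}

#define AV1_SUPERRES_NUM 8

/*
 * With superres disabled the effective denominator equals
 * AV1_SUPERRES_NUM; otherwise the caller's value is used as-is.
 */
uint8_t effective_superres_denom(int use_superres, uint8_t denom)
{
    return use_superres ? denom : AV1_SUPERRES_NUM;
}
```

For example, `tile_group_entries_size(0)` and `tile_group_entries_size(1)` both return the size of one entry, and `effective_superres_denom(0, 0)` returns 8.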

Plus context init for has_film_grain flag (VIDIOC_QUERY_EXT_CTRL probe at first AV1 context open).

Phase 3 testing plan (unchanged from v3 amendment)

  1. Test 208×208 (single-tile, smoke test) — av1-1-b8-01-size-208x208.ivf
  2. Test 352×288 with film_grain + tiling — av1-1-b8-23-film_grain-50.ivf (F1 + F3 stress)
  3. Byte-compare HW (libva) vs HW (kdirect, verified Phase 0 bit-perfect) — should be 100% identical
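The byte-compare in step 3 can be done with `cmp` or checksums; as a sketch, a small C helper that also reports where two decoded dumps first diverge might look like this. `first_mismatch` and its return convention are assumptions for illustration, not part of the campaign tooling.

```c
#include <stdio.h>

/*
 * Compare two files byte by byte.
 * Returns -1 if identical, -2 if either file cannot be opened,
 * otherwise the zero-based offset of the first differing byte
 * (a length mismatch reports the length of the shorter file).
 */
long long first_mismatch(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    long long result = -1;
    long long offset = 0;

    if (!fa || !fb) {
        result = -2;
    } else {
        for (;;) {
            int ca = fgetc(fa);
            int cb = fgetc(fb);

            if (ca != cb) {      /* differing byte, or one file ended */
                result = offset;
                break;
            }
            if (ca == EOF)       /* both ended together: identical */
                break;
            offset++;
        }
    }

    if (fa)
        fclose(fa);
    if (fb)
        fclose(fb);
    return result;
}
```

Dividing a reported offset by the size of one decoded frame gives the index of the first frame that differs, which helps separate a first-frame setup bug from a later reference-frame problem.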

Pending follow-ups

  • Push av1-iter1 branch to gitea (vp9 branch had ssh sideband disconnect; may recur)
  • Janet code review of completed av1.c after Phase 2.1 implementation
  • Real-world 1080p AV1 stress test post-Phase 3 bit-perfect verification

State at this checkpoint

  • ampere: iter38b-vp9-iter1-base (sibling close) + Phase 2 scaffolding installed; HEVC bit-perfect retained, AV1 falls back cleanly to SW until Phase 2.1
  • backend repo on ampere: branch av1-iter1 with 3 new commits stacked on iter2 step4
  • campaign repo: 4 plan docs + Phase 0 + this step4 close

Honest assessment

Scaffolding work was 350 LoC of mostly-mechanical infrastructure (~3 hours). The remaining ~700 LoC of av1.c body IS the field-mapping work where Janet's F1/F2/F3 silent-failure modes lurk. That's fresh-focus territory, not "keep grinding" territory. Recommend Phase 2.1 in a separate focused session.