fresnel-fourier/phase0_findings_iter3.md
claude-noether ea2413e957 iter3 Phase 0 + Phase 1 lock: VP8 on hantro-vpu-dec
Opens iter3 of the fresnel-fourier campaign immediately after iter2
close (df787a6). Targets VP8 as the fourth codec to pass boolean-
correctness on fresnel via libva-v4l2-request-fourier.

Locked research question:
  mpv --hwdec=vaapi bbb_720p10s_vp8.webm engages backend cleanly and
  DMA-BUF GL import yields HW pixels byte-identical to SW reference.

Five Phase 1 boolean criteria:
  1. vainfo enumerates VAProfileVP8Version0_3 on hantro env binding
  2. vaCreateConfig(VAProfileVP8Version0_3, VLD) = SUCCESS
  3. ffmpeg -hwaccel vaapi VP8 decode exit 0
  4. mpv --hwdec=vaapi --vo=image @ +02s seek: HW=SW byte-identical
     for 2 distinct frames; frame1 != frame2
  5. THREE-codec regression block: iter1 MPEG-2 + iter2 HEVC + T4
     H.264 reference hashes all hold

Substrate carry-forward (re-verified):
  - fork master tip post-iter2-close (cca539d + 8d71e20)
  - /usr/lib/dri/v4l2_request_drv_video.so SHA256 9e27...6258
  - linux-eos-arm 6.19.9-99-eos-arm (post linux-7 headers-only upgrade)
  - bbb_720p10s_vp8.webm fixture on fresnel ~/fourier-test/ (2.4 MB)
  - hantro-vpu-dec OUTPUT_MPLANE VP8F + vp8_frame_parameters control
  - cross-validator anchor confirmed: ffmpeg-v4l2request VP8 = exit 0

Predicted scope (smaller than iter1+iter2):
  - config.c: ADD VP8 enumeration block + RequestCreateConfig case
    + RequestQueryConfigEntrypoints case (3 sites; iter1+iter2
    only had 1-2 existing-but-broken case labels)
  - src/vp8.c NEW file (~150-250 lines vs iter2's 588 h265.c)
  - src/vp8.h NEW file
  - src/meson.build add 'vp8.c' + 'vp8.h' entries
  - picture.c codec_set_controls VP8 dispatch + codec_store_buffer
    cases for 4 VAAPI VP8 buffer types (Picture, Slice, Probability,
    IQMatrix)
  - surface.h params union extend with vp8 member
  - context.c: NO changes (VP8 has no DECODE_MODE/START_CODE menus
    on hantro per Phase 0 v4l2_inventory)

VP8 contract surface: single V4L2_CID_STATELESS_VP8_FRAME control
per frame (no batch); no slice_params dynamic-array (frame-mode);
no SCALING_MATRIX (entropy + quant carried in v4l2_ctrl_vp8_frame
sub-structs).

Phase 2 source-read targets queued: config.c enumeration pattern,
picture.c dispatch + per-buffer-type cases, surface.h params union,
VAAPI <va/va_dec_vp8.h>, kernel UAPI <linux/v4l2-controls.h>
v4l2_ctrl_vp8_frame, kernel hantro_vp8.c driver, FFmpeg
v4l2_request_vp8.c.

Memory carry-forward (all five entries apply unchanged):
  feedback_gitea_as_claude_noether
  feedback_no_session_termination_attempts
  feedback_header_deletion_check
  feedback_review_empirical_over_theoretical (BOTH directions)
  feedback_rockchip_pixel_verify_path

Refs:
  phase0_findings_iter1.md (iter1 MPEG-2 lock template)
  phase0_findings_iter2.md (iter2 HEVC lock template)
  phase8_iteration2_close.md (immediate predecessor close)
  phase0_evidence/2026-05-07/v4l2_inventory_findings.md (hantro VP8
    capability)
  phase0_evidence/2026-05-07/cross_validator_traces.md (VP8 kernel
    decode path proven)
  phase0_evidence/2026-05-07/test_fixtures.md (bbb_720p10s_vp8.webm
    provenance)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:49:28 +00:00


Iteration 3 — Phase 0 (substrate / motivation / inventory) → Phase 1 lock

Opens 2026-05-08 immediately after iter2 close (phase8_iteration2_close.md, commit df787a6). Per feedback_dev_process.md Phase 0, this document captures iter3's locked research question + substrate + scope, ending with the Phase 1 measurable success criterion.

Locked research question (iteration 3)

"Make VP8 the fourth codec to pass boolean-correctness on fresnel via the libva-v4l2-request-fourier path — mpv --hwdec=vaapi bbb_720p10s_vp8.webm engages the backend cleanly and DMA-BUF GL import yields HW pixels byte-identical to a software-decoded reference for the same frames."

Pass/fail (boolean):

  1. Profile enumeration. vainfo with the hantro env binding (LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video<hantro-vpu-dec>, LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media<hantro-vpu> — re-verify per-boot device numbering) lists VAProfileVP8Version0_3. Currently NOT enumerated — config.c has no VP8 enumeration block at all (unlike iter1 MPEG-2 + iter2 HEVC where the case labels existed but fell through). iter3 must ADD the enumeration logic in RequestQueryConfigProfiles + the validation case in RequestCreateConfig.
  2. Config creation succeeds. vaCreateConfig(VAProfileVP8Version0_3, VAEntrypointVLD) returns VA_STATUS_SUCCESS.
  3. End-to-end ffmpeg-direct decode. ffmpeg -hwaccel vaapi -i ~/fourier-test/bbb_720p10s_vp8.webm -frames:v 5 -f null - (with hantro env binding) shows the libva chain in stderr, libva trace shows vaCreateConfig SUCCESS, no Failed to create decode configuration, no EINVAL from VIDIOC_S_EXT_CTRLS, exits 0. Phase 1 criterion #3 wording matches iter1/iter2 (ffmpeg-direct anchor; mpv --hwdec=vaapi-copy may filter VP8 out — fall-forward to mpv-vaapi-vo=image for criterion 4).
  4. Cache-safe pixel verification matches SW reference. mpv --hwdec=vaapi --vo=image --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm and the --hwdec=no SW run produce JPEGs whose sha256sum outputs match for both frame 1 and frame 2. Frames 1 and 2 must hash-differ (real motion content) AND hash-equal across HW vs SW. Per memory/feedback_rockchip_pixel_verify_path.md: DMA-BUF GL is the cache-coherency-safe verifier; do NOT use ffmpeg-vaapi+hwdownload.
  5. Three-codec regression. iter1 MPEG-2 + iter2 HEVC + T4 H.264 reference hashes all hold:
    • MPEG-2 +02s: 6e7873030dbf... and ccc7ce08810d...
    • HEVC +02s: 47a5f3850df5... and a467b3bc9d7b...
    • H.264 +30s: f623d5f7... and 7d7bc6f2...
     iter3 must not regress any prior codec.

A clean iter3 close has all five checks green. Phase 7 → Phase 4 loopback per feedback_dev_process.md if any fail.
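
Criterion 4 combines three comparisons, and it is easy to check only two of them. The following is a minimal sketch of the boolean, assuming the four sha256sum digests have already been collected as hex strings; criterion4_passes is a hypothetical helper name, not part of the backend or the existing harness.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: evaluates criterion 4 from four hex digests.
 * hw1/hw2 come from the --hwdec=vaapi run, sw1/sw2 from the --hwdec=no run. */
bool criterion4_passes(const char *hw1, const char *hw2,
                       const char *sw1, const char *sw2)
{
    assert(hw1 && hw2 && sw1 && sw2);

    bool frame1_matches = strcmp(hw1, sw1) == 0; /* HW == SW for frame 1 */
    bool frame2_matches = strcmp(hw2, sw2) == 0; /* HW == SW for frame 2 */
    bool frames_differ  = strcmp(hw1, hw2) != 0; /* real motion content  */

    return frame1_matches && frame2_matches && frames_differ;
}
```

The frames_differ term is what rejects a decoder that returns the same (for example all-zero) frame twice, which would otherwise pass if the SW side failed in the same way.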

Mechanism the question targets

Phase 0 cross-validator sweep (phase0_evidence/2026-05-07/cross_validator_traces.md) established that the kernel + hantro driver path works for VP8: ffmpeg -hwaccel v4l2request -i bbb_720p10s_vp8.webm -frames:v 2 -f null - = exit 0. The broken link is the libva backend at eight distinct sites:

  • src/config.c::RequestQueryConfigProfiles — no VP8 enumeration block. Existing blocks for MPEG-2 (lines 119-128), H.264 (lines 130-142), HEVC (lines 144-151) probe v4l2_find_format(...V4L2_PIX_FMT_X_SLICE). iter3 needs an analogous block probing V4L2_PIX_FMT_VP8_FRAME and adding VAProfileVP8Version0_3 to the enumerated list.
  • src/config.c::RequestCreateConfig: case VAProfileVP8Version0_3: doesn't exist. Need to add with a break; (mirroring iter1 + iter2 pattern).
  • src/config.c::RequestQueryConfigEntrypoints — switch case for VAProfileVP8Version0_3 missing; add to the entry-point case list (lines 164-171).
  • src/vp8.c — file doesn't exist in fork. Different from iter2 where src/h265.c already existed (just disabled). iter3 creates the file from scratch.
  • src/meson.build: 'vp8.c' + 'vp8.h' aren't in the sources/headers lists. iter3 adds them.
  • src/picture.c::codec_set_controls — no VAProfileVP8Version0_3 case. Need to add with dispatch to vp8_set_controls().
  • src/picture.c::codec_store_buffer — no VP8 cases for VAPictureParameterBufferType, VASliceParameterBufferType, VAProbabilityDataBufferType, VAIQMatrixBufferType. Need to add.
  • src/surface.h: the params.h264/params.mpeg2/params.h265 union has no VP8 member. Need to add params.vp8 with VAAPI VP8 buffer types.
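
The enumeration pattern described for RequestQueryConfigProfiles (probe a compressed format, then append the matching profile) can be sketched in isolation. This is a stand-alone illustration under stated assumptions: every sketch_/SKETCH_ name is hypothetical, the profile enum is a stand-in for the VAProfile values, and the format list is passed in rather than queried from the device as v4l2_find_format would do. Only the fourcc spellings MG2S and VP8F are taken from the Phase 0 inventory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian fourcc packing, the same shape as v4l2_fourcc(). */
#define SKETCH_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define SKETCH_FMT_MPEG2_SLICE SKETCH_FOURCC('M', 'G', '2', 'S')
#define SKETCH_FMT_VP8_FRAME   SKETCH_FOURCC('V', 'P', '8', 'F')

/* Stand-in for the VAProfile enum; values are arbitrary. */
enum sketch_profile { SKETCH_PROFILE_MPEG2_MAIN, SKETCH_PROFILE_VP8 };

/* Stand-in for the format probe: true if `wanted` is in the list. */
bool sketch_find_format(const uint32_t *formats, size_t count, uint32_t wanted)
{
    for (size_t i = 0; i < count; i++)
        if (formats[i] == wanted)
            return true;
    return false;
}

/* One probe-and-append block per codec; returns the number of profiles. */
size_t sketch_query_profiles(const uint32_t *formats, size_t count,
                             enum sketch_profile *out, size_t max_out)
{
    size_t n = 0;

    assert(out != NULL);

    if (n < max_out && sketch_find_format(formats, count, SKETCH_FMT_MPEG2_SLICE))
        out[n++] = SKETCH_PROFILE_MPEG2_MAIN;

    /* The block iter3 would add: same shape, VP8 format and profile. */
    if (n < max_out && sketch_find_format(formats, count, SKETCH_FMT_VP8_FRAME))
        out[n++] = SKETCH_PROFILE_VP8;

    return n;
}
```

Gating each profile on its own format probe means a node that exposes only MG2S keeps enumerating MPEG-2 and nothing else, so a VP8 block of this shape should not change what other decoders report.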

VP8 contract surface from Phase 0 V4L2 inventory:

hantro-vpu-dec /dev/video5 (boot-dependent path):
  OUTPUT_MPLANE codec_format VP8F (V4L2_PIX_FMT_VP8_FRAME, compressed)
  Stateless control: vp8_frame_parameters 0x00a409c8 (vp8-frame)
                     value=unsupported payload type flags=has-payload

Single control per frame. No DECODE_MODE/START_CODE menus (VP8 has no Annex B equivalent — VP8 frames have their own header format). No SCALING_MATRIX. No DECODE_PARAMS-equivalent (VP8 keeps DPB info in the frame struct itself). Slice_params is single-instance (VP8 is frame-mode, no multi-slice).
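
The control ID 0x00a409c8 in the inventory can be cross-checked against the UAPI arithmetic. The sketch below restates the relevant constants from <linux/v4l2-controls.h> and <linux/videodev2.h> under SKETCH_ names (to stay self-contained and avoid clashing with the real headers); the helper function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Restated UAPI values: stateless codec control class, its base, and the
 * VP8 frame control at base + 200. */
#define SKETCH_CTRL_CLASS_CODEC_STATELESS 0x00a40000u
#define SKETCH_CID_CODEC_STATELESS_BASE \
    (SKETCH_CTRL_CLASS_CODEC_STATELESS | 0x900u)
#define SKETCH_CID_STATELESS_VP8_FRAME \
    (SKETCH_CID_CODEC_STATELESS_BASE + 200u)

/* Compile-time check against the ID reported by the Phase 0 inventory. */
static_assert(SKETCH_CID_STATELESS_VP8_FRAME == 0x00a409c8u,
              "VP8 frame control ID does not match the inventory");

/* Hypothetical helper: packs four characters the way v4l2_fourcc() does. */
uint32_t sketch_fourcc(char a, char b, char c, char d)
{
    return (uint32_t)(unsigned char)a |
           ((uint32_t)(unsigned char)b << 8) |
           ((uint32_t)(unsigned char)c << 16) |
           ((uint32_t)(unsigned char)d << 24);
}
```

Here sketch_fourcc('V', 'P', '8', 'F') gives the numeric pixel format behind the VP8F label in the inventory.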

Predicted iter3 scope: smaller than iter1 + iter2 — single control struct, no dynamic-array, no conditional batching. Likely closer to iter1's mpeg2.c (~120 lines) or smaller.

Phase 4 plan must cite the contract before patching (feedback_dev_process.md Phase 6 contract-before-code): read kernel drivers/media/platform/verisilicon/hantro_vp8.c, read FFmpeg libavcodec/v4l2_request_vp8.c, read kernel UAPI <linux/v4l2-controls.h> for V4L2_CID_STATELESS_VP8_FRAME and struct v4l2_ctrl_vp8_frame, state the contract explicitly before any code lands.

Predecessor carry-over (iter2 → iter3)

State that carries forward (re-verified at iter3 open)

  • Hardware: fresnel RK3399, kernel linux-eos-arm 6.19.9-99-eos-arm. Custom OC kernel preserved across iter2's pacman -Syu headers-only userland upgrade.
  • hantro-vpu-dec node: device numbering shuffles per boot. iter1 Phase 7 had /dev/video5+/dev/media2; iter2 Phase 7 had /dev/video3+/dev/media1. iter3 binding cells re-verify via v4l2-ctl --info at session start.
  • Decoder formats (hantro-vpu-dec, from Phase 0 v4l2_inventory): OUTPUT_MPLANE = MG2S (MPEG-2) + VP8F (VP8). CAPTURE_MPLANE = NV12.
  • Kernel UAPI: V4L2_CID_STATELESS_VP8_FRAME + struct v4l2_ctrl_vp8_frame available in /usr/include/linux/v4l2-controls.h (verified Phase 0 build smoke; same kernel headers as iter1+iter2).
  • Backend build state: libva-v4l2-request-fourier master tip post-iter2-close. Includes:
    • iter1 commits e7dad7a..229d6d1 (config.c + mpeg2.c + delete include/mpeg2-ctrls.h)
    • iter2 commits cca539d (config.c HEVCMain break) + 8d71e20 (h265.c rewrite + picture.c HEVC dispatch + slice_params accumulation + surface.h slices[64] + context.c HEVC device-init + meson.build h265 re-enable)
    • SHA256 9e27043847998c197a46a1a26b2f77f22880bb7b3a62aa4d60d8fcaec0ae6258 of /usr/lib/dri/v4l2_request_drv_video.so on fresnel.
  • Test fixture: ~/fourier-test/bbb_720p10s_vp8.webm on fresnel (2.4 MB, VP8 Profile 0, 1280×720@24fps yuv420p, 10s). Provenance in phase0_evidence/2026-05-07/test_fixtures.md.
  • Reference fixtures for regression: bbb_1080p30_h264.mp4, bbb_720p10s_mpeg2.ts, bbb_720p10s_hevc.mp4 (all on fresnel).
  • Reference hashes for criterion 5:
    • H.264 (T4) at +30s: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9 + 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8
    • MPEG-2 (iter1) at +02s: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092 + ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
    • HEVC (iter2) at +02s: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5 + a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
  • Cross-validator anchor: ffmpeg-v4l2request VP8 trace from phase0_evidence/2026-05-07/cross_validator/vp8/. 10 S_EXT_CTRLS for 2 frames (single VP8 frame control per frame; ~5 init + 5 per-frame? Phase 3 will re-capture verbatim payloads). 4 MEDIA_IOC_REQUEST_ALLOC. 52 ftrace v4l2 events.
  • Cache-safe verify path: mpv --hwdec=vaapi --vo=image (DMA-BUF GL import). Confirmed for H.264 (T4) + MPEG-2 (iter1) + HEVC (iter2) on RK3399; iter3 expects same path to work for VP8.

Data that does NOT carry forward (re-acquire if needed)

  • ohm/RK3568 hantro VP8 behaviour — ohm hantro DOES decode VP8 (per phase0_findings.md ohm context: "hantro (H.264 + MPEG-2 + VP8)") but the libva-multiplanar campaign never tested VP8 end-to-end. Reference history only.
  • Pre-iter1 VP8 trace data — none exists in this fork.

Open questions inherited from iter2 close

  • iter1 B3 latent surface-reuse bug: picture.c:287 params.h264.matrix_set = false writes union byte 240. For VP8 surfaces, byte 240 lands inside the new params.vp8 struct (whatever layout we add). Phase 2 source-read action item: verify whether byte 240 lands in a meaningful VP8 field. If so, the same masking-by-RenderPicture-overwrite mechanism used by iter1+iter2 should hold for VP8 (ffmpeg-vaapi sends VAPictureParameterBufferType per frame for VP8 too, expected).
  • iter2 B6 SPS field-fidelity gap: HEVC-specific (sps_max_num_reorder_pics, etc.); doesn't apply to VP8. VP8 has its own per-frame fidelity surface.
  • Phase 4 cross-cutting backlog (vaDeriveImage cache-stale fix, V4L2 device-discovery, BeginPicture profile-aware reset, context.c log suppression, mpeg2 vbv_buffer_size polish, h265 SPS bitstream-parse fidelity): not iter3 scope; iter3 inherits the same workarounds.
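
For the B3 item above, the Phase 2 check "does byte 240 land in a meaningful VP8 field" can be done mechanically with offsetof once the params.vp8 layout exists. The sketch below shows the technique only: both struct layouts are hypothetical placeholders (the real ones in src/surface.h hold VAAPI buffer types and will have different offsets), and all sketch_ names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts for illustration only. */
struct sketch_h264_params {
    unsigned char buffers[240];
    bool matrix_set;               /* byte 240 in this sketch */
};

struct sketch_vp8_params {
    unsigned char picture[200];
    unsigned char probability[60]; /* bytes 200..259 in this sketch */
    bool iqmatrix_set;
};

union sketch_params {
    struct sketch_h264_params h264;
    struct sketch_vp8_params vp8;
};

static_assert(offsetof(struct sketch_h264_params, matrix_set) == 240,
              "sketch layout must reproduce the byte-240 write");

/* True if `offset` lies inside [field_offset, field_offset + field_size). */
bool sketch_offset_in_field(size_t offset, size_t field_offset, size_t field_size)
{
    return offset >= field_offset && offset - field_offset < field_size;
}

#define SKETCH_BYTE_IN(type, member, byte) \
    sketch_offset_in_field((byte), offsetof(type, member), \
                           sizeof(((type *)0)->member))

/* Shows the aliasing: the H.264-typed write clobbers one VP8 byte. */
unsigned char sketch_alias_demo(void)
{
    union sketch_params p;

    memset(&p, 0xff, sizeof p);
    p.h264.matrix_set = false;     /* the unconditional reset */
    return p.vp8.probability[240 - offsetof(struct sketch_vp8_params, probability)];
}
```

In this placeholder layout the write lands in probability[40]. Running the same SKETCH_BYTE_IN check over each real params.vp8 member would name the affected field directly.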

Tooling and measurement-instrument inventory (live verification)

Re-verified at iter3 open (carries forward from iter1+iter2; all proven working):

  • strace -ff -tt -y -v -e trace=ioctl,openat,close — V4L2 + media-request ioctl tracing.
  • LIBVA_TRACE environment variable — vaCreate/vaQuery/vaInitialize call traces.
  • sudo sh -c "echo 1 > /sys/kernel/tracing/events/v4l2/enable" — kernel v4l2 tracepoints.
  • mpv --hwdec=vaapi --vo=image — cache-coherency-safe pixel verifier.
  • ffmpeg -hwaccel v4l2request — independent V4L2 client cross-validator.
  • Backend build: ninja -C ~/src/libva-v4l2-request-fourier/build && sudo ninja -C build install.
  • Empirical VP8 decode of bbb_720p10s_vp8.webm via ffmpeg-v4l2request DRM_PRIME path — proven in cross-validator sweep (Phase 0).
  • gcc test-compile for VAAPI field-availability checks — per memory/feedback_review_empirical_over_theoretical.md Direction 2 protocol.

In-scope (LOCKED 2026-05-08 for iteration 3)

  • libva-v4l2-request-fourier backend VP8 path on hantro-vpu-dec.
  • src/config.c — ADD VP8 enumeration block in RequestQueryConfigProfiles; ADD case VAProfileVP8Version0_3: with break; in RequestCreateConfig; ADD case in RequestQueryConfigEntrypoints. Mirrors iter1+iter2 patterns + adds the enumerator block (different from iter1+iter2 which had existing-but-broken case labels).
  • src/picture.c::codec_set_controls — ADD VAProfileVP8Version0_3 dispatch to vp8_set_controls().
  • src/picture.c::codec_store_buffer — ADD VP8 cases for VAPictureParameterBufferType, VASliceParameterBufferType, VAProbabilityDataBufferType, VAIQMatrixBufferType (the 4 VP8-specific buffer types VAAPI consumers send).
  • src/vp8.c — NEW file. Implements vp8_set_controls() against V4L2_CID_STATELESS_VP8_FRAME + struct v4l2_ctrl_vp8_frame. Single control per frame; not batched (just one CID).
  • src/vp8.h — NEW file. Declares vp8_set_controls().
  • src/surface.h — extend params union with params.vp8 struct holding VAAPI VP8 buffer types + iqmatrix_set flag.
  • src/meson.build — add 'vp8.c' + 'vp8.h' to sources/headers lists.
  • iter3 binding-cell test harness: re-run iter1's Phase 7 5-criterion shape with VP8 fixture substituted + 3-codec regression block.
  • Cache-safe pixel verify uses DMA-BUF GL import (memory rule).

Out-of-scope (LOCKED 2026-05-08 for iteration 3)

  • VP9 (iter4 — separate iteration; rkvdec, NOT hantro).
  • VP8 decode_mode menu selection (kernel only exposes single-mode for VP8 per Phase 0; no choice).
  • Performance metrics.
  • Long-duration VP8 stress (>10s).
  • Phase 4 cross-cutting backlog items (B1 device-discovery, B3 BeginPicture profile-aware reset, B4 context.c log suppression, B5 vbv_buffer_size, B6 SPS fidelity, L3 vaDeriveImage cache-stale).
  • chromium-fourier 149 install on fresnel.
  • src/context.c — no changes (VP8 has no DECODE_MODE/START_CODE menus per Phase 0 inventory; nothing to add to device-init block).
  • WebRTC-specific VP8 features (temporal layers, simulcast) — out of campaign codec scope.
  • Upstream Linux engagement.

Phase 1 success criterion (LOCKED 2026-05-08)

The five Pass/fail bullets at the top of this document are the iter3 success criterion. Phase 3 baseline measurements feed Phase 4 plan; Phase 7 verification re-runs all five against the patched backend.

If Phase 3 baseline reveals the chosen criterion is the wrong target (Phase 3 → Phase 1 loopback per feedback_dev_process.md), the criterion will be rewritten and re-locked. Plausible reasons that would trigger the loopback:

  • VP8 fixture is malformed in a way that exposes a fixture-side bug rather than a backend-side bug. (Mitigation: ffmpeg-v4l2request decoded the fixture cleanly per Phase 0 cross-validator; unlikely.)
  • VAAPI's VP8 buffer types include something not exposed by mpv-vaapi consumer chain (e.g., probability buffer not sent, requiring iter3 to derive from bitstream). Phase 3 baseline LIBVA_TRACE will surface.
  • mpv --hwdec=vaapi (DMA-BUF) filters VP8 out (analogous to MPEG-2 vaapi-copy filter from iter1). Mitigation: fall-forward to ffmpeg-v4l2request DRM_PRIME path for criterion 4 verification.
  • Hantro VP8 quirk on RK3399: silicon supports VP8 but kernel's hantro_vp8 driver may have RK3399-specific assumptions different from RK3568. Phase 0 cross-validator decoded the fixture, so this is unlikely.

The other four Phase 1 criteria hold as locked.

Phase 2 source-read targets

For the upcoming Phase 2 situation analysis:

  • src/config.c: RequestQueryConfigProfiles (lines 112-156): pattern for adding VP8 enumeration block; RequestCreateConfig (lines 45-95): pattern for adding case VAProfileVP8Version0_3: with break;; RequestQueryConfigEntrypoints (lines 158-182): adding to entry-point case list.
  • src/picture.c::codec_set_controls (lines 178-213): pattern for VP8 dispatch (mirror MPEG-2 dispatch shape from iter1).
  • src/picture.c::codec_store_buffer (lines 54-176): patterns for adding VP8 cases per VAAPI buffer type. VAAPI VP8 sends 4 distinct buffer types per frame (not iter1's 2 or iter2's 4-with-array).
  • src/surface.h (lines 89-110): params union pattern for adding vp8 member.
  • VAAPI <va/va_dec_vp8.h> — VAAPI VP8 buffer struct definitions: VAPictureParameterBufferVP8, VASliceParameterBufferVP8, VAProbabilityDataBufferVP8, VAIQMatrixBufferVP8.
  • Kernel UAPI <linux/v4l2-controls.h>: V4L2_CID_STATELESS_VP8_FRAME + struct v4l2_ctrl_vp8_frame + sub-structs (segment, lf, quant, entropy, coder_state).
  • Linux mainline kernel drivers/media/platform/verisilicon/hantro_vp8.c — hantro-vpu-dec VP8 driver source (probably available; verify in Phase 2).
  • FFmpeg downstream libavcodec/v4l2_request_vp8.c — independent V4L2 client implementation. Submission shape, per-frame field mapping.

What "iteration 3 close" looks like

A clean iter3 close per feedback_dev_process.md Phase 8:

  • All 5 Phase 1 criteria green.
  • phase8_iteration3_close.md summarizing the bug, contract, fix, binding-cell numbers.
  • Fourth-codec passing on the campaign-level scoreboard: 3/5 → 4/5.
  • Memory entries distilled for any new lessons (likely none — iter3 expected to be a smaller iter1-shape, no novel constructs).
  • Debug-instrumentation sweep at close.
  • Phase 5 sonnet-architect review pass signed off.
  • Commits all authored as claude-noether per memory feedback_gitea_as_claude_noether.md.
  • src/vp8.c + src/vp8.h added to repo, enabled in meson.build.

Predicted iter3 difficulty vs iter1+iter2: smaller than both. iter1 was 3 controls + 1 file rewrite + 1 file delete; iter2 was 5 controls + 1 file rewrite + slice_params dynamic-array + surface.h extension. iter3 is 1 control + 1 new file + simpler structure, but does require ADD-rather-than-modify work: 3 sites in config.c (enumerator block, CreateConfig case, QueryConfigEntrypoints case) plus 4 buffer-type cases in picture.c.

Net file count similar to iter2 (6-7 modified files) but each modification is smaller. vp8.c, the counterpart of iter2's h265.c, is predicted at ~150-250 lines (vs h265.c's 588).

If Phase 7 misses a check, most likely culprits:

  1. Probability buffer mishandling: VAAPI sends VAProbabilityDataBufferVP8 separately from picture/slice; iter3 must wire it into the frame control. FFmpeg v4l2_request_vp8.c is the canonical pattern.
  2. Quantization buffer mishandling: VAIQMatrixBufferVP8 contains the quant indices; iter3 must populate v4l2_ctrl_vp8_frame.quant.* fields.
  3. Frame header parsing: VP8 has a 3-byte uncompressed frame header (frame_type, version, show_frame, first_part_size). Some fields may need to be parsed from the slice bitstream similar to HEVC NAL header.
  4. DPB references: VP8 has 3 reference frames (last, golden, altref). VAAPI exposes them in VAPictureParameterBufferVP8; iter3 maps to v4l2_ctrl_vp8_frame's last_frame_ts, golden_frame_ts, alt_frame_ts.
  5. mpv-vaapi filtering: mpv may silently filter VP8 out of vaapi/vaapi-copy candidates. Phase 3 baseline confirms. If so, criterion 4 cannot fall forward to ffmpeg-vaapi+hwdownload (that path returns cache-stale zeros on RK3399 per memory); it would need ffmpeg-v4l2request DRM_PRIME instead.
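
For culprit 3, the uncompressed header layout is fixed by the VP8 bitstream format (RFC 6386, section 9.1): a 3-byte little-endian frame tag, followed on key frames only by a 3-byte start code (0x9d 0x01 0x2a) and two 16-bit fields whose low 14 bits carry width and height. The parser below is a minimal sketch of that layout; the struct and function names are hypothetical, and whether the backend needs to parse this at all (rather than take the values from VAPictureParameterBufferVP8) is a Phase 2 question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the uncompressed header fields. */
struct sketch_vp8_frame_tag {
    bool     key_frame;        /* tag bit 0 is 0 for key frames */
    uint8_t  version;          /* 3 bits */
    bool     show_frame;       /* 1 bit */
    uint32_t first_part_size;  /* 19 bits */
    uint16_t width;            /* key frames only, else 0 */
    uint16_t height;           /* key frames only, else 0 */
};

/* Returns true on success, false on short input or a bad start code. */
bool sketch_parse_vp8_frame_tag(const uint8_t *buf, size_t len,
                                struct sketch_vp8_frame_tag *tag)
{
    assert(tag != NULL);

    if (buf == NULL || len < 3)
        return false;

    uint32_t raw = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
                   ((uint32_t)buf[2] << 16);

    tag->key_frame       = (raw & 0x1u) == 0;
    tag->version         = (uint8_t)((raw >> 1) & 0x7u);
    tag->show_frame      = ((raw >> 4) & 0x1u) != 0;
    tag->first_part_size = (raw >> 5) & 0x7ffffu;
    tag->width  = 0;
    tag->height = 0;

    if (tag->key_frame) {
        if (len < 10)
            return false;
        if (buf[3] != 0x9d || buf[4] != 0x01 || buf[5] != 0x2a)
            return false;
        /* Upper 2 bits of each field are the scaling factor; mask them off. */
        tag->width  = (uint16_t)(((uint32_t)buf[6] | ((uint32_t)buf[7] << 8)) & 0x3fffu);
        tag->height = (uint16_t)(((uint32_t)buf[8] | ((uint32_t)buf[9] << 8)) & 0x3fffu);
    }
    return true;
}
```

Note the inverted sense of bit 0: a cleared bit marks a key frame, which is an easy place to introduce a frame_type mix-up when filling the V4L2 frame control.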

Phase 5 review will catch most of these in advance per iter1+iter2 precedent.