fresnel-fourier/phase2_iter3_situation.md
claude-noether 656596aa6b iter3 Phase 5: sonnet review — 4 Critical findings, 4 amendments
Second-model review by sonnet-architect found 4 Critical bugs in
Phase 4 plan, all verified empirically by author before incorporation
per memory feedback_review_empirical_over_theoretical Direction 2.
Amendments applied in-place to phase4_iter3_plan.md +
phase2_iter3_situation.md.

Critical findings:

  C1 first_part_header_bits = 0 was claimed cosmetic; actually
     UNSAFE. hantro_g1_vp8_dec.c:260 + rockchip_vpu2_hw_vp8_dec.c:372
     both read this field unconditionally to compute the macroblock
     DMA offset. Setting 0 would place hardware at wrong DMA offset
     for ALL macroblock data → garbage decode.
     Fix: frame.first_part_header_bits = slice->macroblock_offset
     (verified by source identity — vaapi_vp8.c:204 and
     v4l2_request_vp8.c:83 use byte-identical formulas).

  C2 first_part_size = slice->partition_size[0] was wrong; VAAPI's
     partition_size[0] is the REMAINING bytes after parsing
     (vaapi_vp8.c:209 confirms; va_dec_vp8.h:193-196 spec confirms).
     Kernel needs the TOTAL control partition size.
     Fix: frame.first_part_size = slice->partition_size[0] +
                                  ((macroblock_offset + 7) / 8)
     Phase 3 keyframe numerics confirm: 21923 + 819 = 22742 ✓.

  C3 VAProbabilityDataBufferType does not exist as a buffer-type
     enum; it's the struct name. The actual enum constant is
     VAProbabilityBufferType (= 13 per va.h:2058). Switch case
     using the wrong identifier would have failed Phase 6 compile.
     Fix: replace globally in phase2 + phase4 docs.

  C4 (s8) cast undefined in userspace. Kernel has 's8' typedef in
     linux/types.h (kernel-internal). UAPI exposes '__s8' (double-
     underscore). Userspace portable cast is int8_t from <stdint.h>.
     Fix: replace (s8) with (int8_t) in Clauses 6+7.

Suggested:

  S3 Clause 8 comment was factually wrong: hantro_vp8.c::
     hantro_vp8_prob_update reads coeff_probs unconditionally;
     there is NO default-table fallback. If probability_set==false,
     decode produces garbage. Practical risk low (FFmpeg vaapi_vp8.c
     always sends VAProbabilityBufferType per frame), but corrected
     comment + added assert(probability_set) runtime guard for
     immediate Phase 6 surfacing.

Plus 5 minor S/Q items documented; non-blocking for iter3.

Author's 7 review questions all answered directly in the review:
  Q1 quantization derivation: correct for typical content
  Q2 first_part_header_bits=0 safety: UNSAFE → C1
  Q3 num_dct_parts off-by-one: confirmed correct
  Q4 field availability: 2 compile failures found (C3 + C4)
  Q5 quant_update[s] semantics: signed delta confirmed
  Q6 SHOW_FRAME unconditional: safe for BBB scope
  Q7 buffer order independence: confirmed

Estimated saving: 1 Phase 6 → Phase 4 loopback + 2 Phase 6 fix-
forward commits. Review pass is the right path forward per memory
rule "Reviews are never skippable" — empty-review value =
empirical-verification value, regardless of finding count.

Refs:
  phase4_iter3_plan.md (amended in-place; Phase 5 amendments
                         section appended)
  phase2_iter3_situation.md (amended C3 globally)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:27:53 +00:00


Iteration 3 — Phase 2 (situation analysis)

Source-read of every file the iter3 patch series will touch, plus the kernel UAPI + VAAPI + downstream FFmpeg + kernel hantro reference sources. Written immediately after iter3 Phase 1 lock (commit ea2413e). Conducted on noether against fork tip 8d71e20 (iter2 Phase 6 commit B); fresnel.vpn was unreachable at Phase 2 open, so the read is against the noether mirror — verified at commit hash level pre-read.

This is a contract-before-code analysis per feedback_dev_process.md Phase 2: enumerate the bugs, cite the contract verbatim, predict the patch shape, queue the Phase 3 baseline questions.

Bug enumeration (sites the iter3 patch series must touch)

B1 — src/config.c::RequestQueryConfigProfiles — VP8 enumeration block missing

Site: config.c:121-165.

Current state (lines 128-160): three enumeration blocks for MPEG-2 (lines 128-137), H.264 (139-151), HEVC (153-160). Each calls v4l2_find_format() on the OUTPUT-side pixfmt for both the single-plane and MPLANE buffer types, then conditionally appends profile constants to the output array under a count guard.

Bug: no analogous block for V4L2_PIX_FMT_VP8_FRAME → VAProfileVP8Version0_3. Without this, vainfo (and any consumer that calls vaQueryConfigProfiles) sees no VP8 profile in the enumeration → criterion 1 fails before vaCreateConfig is ever attempted.

Different from iter1+iter2: iter1 (MPEG-2) and iter2 (HEVC) had the enumeration block already in place pre-iter; only the case label fall-through in RequestCreateConfig was missing. iter3 has neither. Both ADDs.

B2 — src/config.c::RequestCreateConfig — VP8 case label missing entirely

Site: config.c:54-78.

Current state: switch over profile. iter1 added case VAProfileMPEG2Simple/Main: with explicit break; (lines 63-69). iter2 added case VAProfileHEVCMain: with break; (lines 70-75). H.264 always existed (lines 56-62, marked // FIXME from upstream). Default → VA_STATUS_ERROR_UNSUPPORTED_PROFILE.

Bug: no case VAProfileVP8Version0_3:. Hits default → consumer gets VA_STATUS_ERROR_UNSUPPORTED_PROFILE from vaCreateConfig → criterion 2 fails.

Patch shape: add 4-line case (label + comment + break;) directly after the iter2 HEVCMain block, mirroring iter1+iter2 style.

B3 — src/config.c::RequestQueryConfigEntrypoints — VP8 case missing

Site: config.c:167-191.

Current state: switch over profile; case list at lines 173-180 covers MPEG-2/H.264/HEVC and falls through to entrypoints[0] = VAEntrypointVLD; *entrypoints_count = 1;. Default sets count to 0.

Bug: no case VAProfileVP8Version0_3:. mpv-vaapi's profile probe queries entry points; without VLD, it skips VP8 → criterion 3 fails (mpv falls through to SW decode silently).

Patch shape: add case VAProfileVP8Version0_3: to the existing fall-through case list.

B4 — src/vp8.c — file does not exist; needs net-new implementation

Site: NEW FILE src/vp8.c.

Bug: there is no VP8 codec dispatcher in the fork. The fork's predecessor (libva-v4l2-request bootlin master) only implements MPEG-2 + H.264 + HEVC. VP8 was never added upstream.

Patch shape: NEW file, ~150-200 lines. Mirror the iter1 mpeg2.c template (src/mpeg2.c:53-249):

  • Includes block (mpeg2.h-equivalent + context + request + surface + v4l2-controls)
  • vp8_set_controls() function entry point matching the existing dispatcher signature (struct request_data *driver_data, struct object_context *context_object, struct object_surface *surface_object) -> int
  • Local v4l2_ctrl_vp8_frame struct populated from VAAPI buffers (Picture + IQMatrix + Probability + Slice param)
  • DPB-timestamp lookup for last_frame_ts/golden_frame_ts/alt_frame_ts from VASurfaceID references in VAPictureParameterBufferVP8
  • One-element v4l2_ext_control array, single V4L2_CID_STATELESS_VP8_FRAME control
  • Single v4l2_set_controls(driver_data->video_fd, surface_object->request_fd, ctrls, 1) call

B5 — src/vp8.h — header does not exist

Site: NEW FILE src/vp8.h.

Bug: companion header for vp8.c. Declare vp8_set_controls(). Mirror src/mpeg2.h (forward declarations of request_data, object_context, object_surface, function prototype). No struct definitions needed (no array dimensions to declare like HEVC's HEVC_MAX_SLICES_PER_FRAME).

B6 — src/picture.c::codec_set_controls — VP8 dispatch case missing

Site: picture.c:188-225 (function codec_set_controls).

Current state: switch over profile; MPEG-2 → mpeg2_set_controls (lines 196-201), H.264 → h264_set_controls (203-212), HEVCMain → h265_set_controls (214-218). Default → VA_STATUS_ERROR_UNSUPPORTED_PROFILE.

Bug: no VP8 case. Hits default after RequestEndPicture → vaEndPicture returns error → consumer aborts decode.

Patch shape: add case VAProfileVP8Version0_3: calling vp8_set_controls(driver_data, context_object, surface_object) with same if (rc < 0) return VA_STATUS_ERROR_OPERATION_FAILED; shape as MPEG-2 + HEVC.

Plus include directive update: add #include "vp8.h" near picture.c:34-36 (the existing h264.h/h265.h/mpeg2.h block).

B7 — src/picture.c::codec_store_buffer — 4 VAAPI buffer types unmapped

Site: picture.c:54-186 (function codec_store_buffer).

VAAPI VP8 sends four distinct per-frame parameter buffer types, plus the raw slice data (per va_dec_vp8.h:71-241):

| VAAPI buffer type | VAAPI struct | Per-frame |
|---|---|---|
| VAPictureParameterBufferType | VAPictureParameterBufferVP8 | once |
| VASliceParameterBufferType | VASliceParameterBufferVP8 | once (frame-mode) |
| VAProbabilityBufferType | VAProbabilityDataBufferVP8 | once |
| VAIQMatrixBufferType | VAIQMatrixBufferVP8 | once |
| VASliceDataBufferType | raw bitstream | once |

Current state:

  • VASliceDataBufferType (lines 61-83) — already universal, no per-profile branch. context->h264_start_code flag prepends 00 00 01 for H.264 only; VP8 does not need start-code prefix (VP8 has its own 3-byte uncompressed frame header). The slice-data path is fine for VP8 unmodified.
  • VAPictureParameterBufferType (lines 85-113) — switch over profile; MPEG-2/H.264/HEVC handled. Default → break (silent ignore). Bug: no VP8 case.
  • VASliceParameterBufferType (lines 115-146) — switch; H.264/HEVC handled. There is no MPEG-2 case either, but that is intentional (MPEG-2 has only Picture + Quant + Slice-data per VAAPI). Bug: no VP8 case.
  • VAIQMatrixBufferType (lines 148-179) — switch; MPEG-2/H.264/HEVC handled. Bug: no VP8 case.
  • VAProbabilityBufferType — NOT IN THE OUTER SWITCH. VAAPI defines this enum value for VP8, but the fork's codec_store_buffer outer switch doesn't list it. Currently falls through to default: break; at line 181. Bug: VAProbabilityBufferType case missing entirely.

Patch shape: 4 nested case adds + 1 outer-case add:

  • VAPictureParameterBufferType → add VP8 case → memcpy into surface_object->params.vp8.picture
  • VASliceParameterBufferType → add VP8 case → memcpy into surface_object->params.vp8.slice (single, no slices[] array — VP8 is frame-mode)
  • VAIQMatrixBufferType → add VP8 case → memcpy into surface_object->params.vp8.iqmatrix + set iqmatrix_set true
  • NEW outer case VAProbabilityBufferType → switch over profile → VP8 case → memcpy into surface_object->params.vp8.probability + set probability_set true

B8 — src/picture.c::RequestBeginPicture — per-frame VP8 flag resets missing

Site: picture.c:227-306.

iter1 added surface_object->params.h264.matrix_set = false; at line 299. iter2 added surface_object->params.h265.num_slices = 0; at line 300.

Bug analysis: VP8 has no slice-array (single per-frame). It does have a probability-data flag (probability_set) that needs reset per frame. AND iqmatrix_set needs per-frame reset.

Patch shape: add two lines:

  • surface_object->params.vp8.iqmatrix_set = false;
  • surface_object->params.vp8.probability_set = false;

This mirrors iter1's matrix_set = false reset pattern (one line each profile).
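
The set/reset lifecycle that B7 and B8 describe for the two flags can be sketched as below. This is a minimal illustration, not the fork's code: the `sketch_` structs are stand-ins for the VAAPI types and for the proposed vp8 union member, and `sketch_frame_ready()` is a hypothetical guard that the dispatcher could apply before building the control.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the VAAPI structs; the real ones live in <va/va_dec_vp8.h>. */
struct sketch_iqmatrix { uint16_t quantization_index[4][6]; };
struct sketch_probability { uint8_t dct_coeff_probs[4][8][3][11]; };

/* Hypothetical subset of the proposed params.vp8 member. */
struct sketch_vp8_params {
    struct sketch_iqmatrix iqmatrix;
    bool iqmatrix_set;
    struct sketch_probability probability;
    bool probability_set;
};

/* B8: clear both flags once per frame, from the BeginPicture path. */
static void sketch_begin_picture(struct sketch_vp8_params *p)
{
    p->iqmatrix_set = false;
    p->probability_set = false;
}

/* B7: copy the IQ matrix buffer and record that it arrived this frame. */
static void sketch_store_iqmatrix(struct sketch_vp8_params *p,
                                  const struct sketch_iqmatrix *src)
{
    memcpy(&p->iqmatrix, src, sizeof(p->iqmatrix));
    p->iqmatrix_set = true;
}

/* B7: copy the probability buffer and record that it arrived this frame. */
static void sketch_store_probability(struct sketch_vp8_params *p,
                                     const struct sketch_probability *src)
{
    memcpy(&p->probability, src, sizeof(p->probability));
    p->probability_set = true;
}

/* Hypothetical guard: true only if both tables were sent for this frame. */
static bool sketch_frame_ready(const struct sketch_vp8_params *p)
{
    return p->iqmatrix_set && p->probability_set;
}
```

Without the per-frame reset, a reused surface would keep the flags from the previous frame and the guard would pass on stale tables.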

B9 — src/surface.h::object_surface::params union — no vp8 member

Site: surface.h:92-113.

Current state: union of three structs: mpeg2, h264, h265. Each holds the buffer-type structs the dispatcher reads.

Bug: no vp8 member. iter1 B3 latent surface-reuse bug (per phase0_findings_iter3.md): picture.c:299 writes byte 240 of the union (h264.matrix_set offset). The iter2 union is dominated by h265 with its 64-slot slices[64] array; total union size ~17 KB. Adding a vp8 member doesn't grow the union (h265 is the dominant member by far).

Patch shape: add vp8 struct after h265:

struct {
    VAPictureParameterBufferVP8 picture;
    VASliceParameterBufferVP8 slice;
    VAIQMatrixBufferVP8 iqmatrix;
    bool iqmatrix_set;
    VAProbabilityDataBufferVP8 probability;
    bool probability_set;
} vp8;

B10 — src/meson.build — vp8.c + vp8.h not in sources/headers

Site: meson.build:30-74.

Current state: sources list has mpeg2.c/h264.c/h264_slice_header.c/h265.c (line 50, uncommented in iter2). headers list has mpeg2.h/h264.h/h264_slice_header.h/h265.h (line 73).

Bug: vp8.c + vp8.h are NEW files, must be ADDED.

Patch shape: insert 'vp8.c' after 'h265.c' in sources, insert 'vp8.h' after 'h265.h' in headers.

Non-bugs (intentionally NOT touched)

  • src/context.c — VP8 has no DECODE_MODE/START_CODE menus per Phase 0 V4L2 inventory. iter2's HEVC additions to context.c have no analog. No context.c changes.
  • src/video.c::formats[] — the format list is CAPTURE-side (NV12 + Sunxi NV12). VP8 is OUTPUT-side; OUTPUT format probing is v4l2_find_format() calls in config.c, NOT video.c. No video.c changes.
  • src/v4l2.c — v4l2_find_format() is fourcc-agnostic. No v4l2.c changes.
  • src/buffer.c — VAProbabilityBufferType is a standard VAAPI buffer type; the buffer registry is type-agnostic. No buffer.c changes.
  • include/hevc-ctrls.h — already a 9-line shim including <linux/v4l2-controls.h>. VP8's V4L2_CID_STATELESS_VP8_FRAME is in the same kernel UAPI header (line 1900). No header-shim work like iter1's mpeg2-ctrls.h deletion.

Contract surface (verbatim from kernel UAPI + VAAPI)

Kernel UAPI: V4L2_CID_STATELESS_VP8_FRAME

At <linux/v4l2-controls.h>:1900, V4L2_CID_STATELESS_VP8_FRAME = V4L2_CID_CODEC_STATELESS_BASE + 200 = 0x00a409c8. Matches the per-device control advertised by hantro-vpu-dec in Phase 0 V4L2 inventory (vp8_frame_parameters 0x00a409c8).
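
The arithmetic behind that control ID can be checked with a small sketch. The constants are redefined locally under `SKETCH_` names (an assumption made so the example does not depend on the installed kernel header version); their values follow the kernel's definitions of V4L2_CTRL_CLASS_CODEC_STATELESS and V4L2_CID_CODEC_STATELESS_BASE.

```c
#include <stdint.h>

/* Local copies of the kernel UAPI values, renamed to avoid clashing with
 * <linux/v4l2-controls.h> if that header is also included. */
#define SKETCH_CTRL_CLASS_CODEC_STATELESS 0x00a40000u
#define SKETCH_CID_CODEC_STATELESS_BASE \
    (SKETCH_CTRL_CLASS_CODEC_STATELESS | 0x900u)
#define SKETCH_CID_STATELESS_VP8_FRAME \
    (SKETCH_CID_CODEC_STATELESS_BASE + 200u)

/* Returns the numeric control ID: 0x00a40900 + 200 (0xc8). */
static uint32_t sketch_vp8_frame_cid(void)
{
    return SKETCH_CID_STATELESS_VP8_FRAME;
}
```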

Kernel UAPI: struct v4l2_ctrl_vp8_frame (<linux/v4l2-controls.h>:1929-1958)

struct v4l2_ctrl_vp8_frame {
    struct v4l2_vp8_segment segment;             /* offset 0 */
    struct v4l2_vp8_loop_filter lf;              /* loop filter parameters */
    struct v4l2_vp8_quantization quant;          /* base quant indices */
    struct v4l2_vp8_entropy entropy;             /* update probabilities */
    struct v4l2_vp8_entropy_coder_state coder_state;

    __u16 width;
    __u16 height;

    __u8 horizontal_scale;
    __u8 vertical_scale;

    __u8 version;
    __u8 prob_skip_false;
    __u8 prob_intra;
    __u8 prob_last;
    __u8 prob_gf;
    __u8 num_dct_parts;

    __u32 first_part_size;
    __u32 first_part_header_bits;
    __u32 dct_part_sizes[8];

    __u64 last_frame_ts;
    __u64 golden_frame_ts;
    __u64 alt_frame_ts;

    __u64 flags;
};

Sub-structs (<linux/v4l2-controls.h>:1785-1888):

  • v4l2_vp8_segment: __s8 quant_update[4]; __s8 lf_update[4]; __u8 segment_probs[3]; __u8 padding; __u32 flags; (segment-id probabilities, per-segment quant/lf overrides, flags V4L2_VP8_SEGMENT_FLAG_{ENABLED, UPDATE_MAP, UPDATE_FEATURE_DATA, DELTA_VALUE_MODE})
  • v4l2_vp8_loop_filter: __s8 ref_frm_delta[4]; __s8 mb_mode_delta[4]; __u8 sharpness_level; __u8 level; __u16 padding; __u32 flags; (flags V4L2_VP8_LF_{ADJ_ENABLE, DELTA_UPDATE, FILTER_TYPE_SIMPLE})
  • v4l2_vp8_quantization: __u8 y_ac_qi; __s8 y_dc_delta; __s8 y2_dc_delta; __s8 y2_ac_delta; __s8 uv_dc_delta; __s8 uv_ac_delta; __u16 padding; — base values; per-segment overrides come from segment.quant_update[]
  • v4l2_vp8_entropy: __u8 coeff_probs[4][8][3][11]; __u8 y_mode_probs[4]; __u8 uv_mode_probs[3]; __u8 mv_probs[2][19]; __u8 padding[3]; — probability update tables
  • v4l2_vp8_entropy_coder_state: __u8 range; __u8 value; __u8 bit_count; __u8 padding; — boolean coder state at end of header

Frame flags (<linux/v4l2-controls.h>:1890-1895):

  • V4L2_VP8_FRAME_FLAG_KEY_FRAME = 0x01
  • V4L2_VP8_FRAME_FLAG_EXPERIMENTAL = 0x02
  • V4L2_VP8_FRAME_FLAG_SHOW_FRAME = 0x04
  • V4L2_VP8_FRAME_FLAG_MB_NO_SKIP_COEFF = 0x08
  • V4L2_VP8_FRAME_FLAG_SIGN_BIAS_GOLDEN = 0x10
  • V4L2_VP8_FRAME_FLAG_SIGN_BIAS_ALT = 0x20

VAAPI buffer types (/home/mfritsche/src/ohm_gl_fix/phase6/step1/reference/libva/va/va_dec_vp8.h)

VAPictureParameterBufferVP8 (lines 71-160):

  • frame_width, frame_height (u32)
  • last_ref_frame, golden_ref_frame, alt_ref_frame, out_of_loop_frame (VASurfaceID)
  • pic_fields.bits.{key_frame, version, segmentation_enabled, update_mb_segmentation_map, update_segment_feature_data, filter_type, sharpness_level, loop_filter_adj_enable, mode_ref_lf_delta_update, sign_bias_golden, sign_bias_alternate, mb_no_coeff_skip, loop_filter_disable} (packed bitfield)
  • mb_segment_tree_probs[3] (u8)
  • loop_filter_level[4], loop_filter_deltas_ref_frame[4], loop_filter_deltas_mode[4] (per-segment / per-ref / per-mode)
  • prob_skip_false, prob_intra, prob_last, prob_gf (u8)
  • y_mode_probs[4], uv_mode_probs[3] (u8 — luma + chroma intra-prediction probs)
  • mv_probs[2][19] (u8)
  • bool_coder_ctx.{range, value, count} (u8 — same bytes as kernel v4l2_vp8_entropy_coder_state minus padding)

VASliceParameterBufferVP8 (lines 170-202):

  • slice_data_size, slice_data_offset, slice_data_flag, macroblock_offset (u32)
  • num_of_partitions (u8)
  • partition_size[9] (u32) — partition_size[0] is control-partition remaining bytes; partition_size[1..8] are DCT partition sizes (max 8 DCT partitions per VP8 spec)
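
A minimal sketch of how the partition fields might be carried over to the kernel side, assuming (as the mapping in this document does) that the kernel's num_dct_parts counts DCT partitions only while VAAPI's num_of_partitions also counts the control partition. Both structs are hypothetical stand-ins that keep only the fields involved.

```c
#include <stdint.h>

/* Stand-in for the VASliceParameterBufferVP8 fields used here. */
struct sketch_slice {
    uint8_t num_of_partitions;   /* control partition + DCT partitions */
    uint32_t partition_size[9];  /* [0] control, [1..8] DCT */
};

/* Stand-in for the matching v4l2_ctrl_vp8_frame fields. */
struct sketch_dct_parts {
    uint8_t num_dct_parts;
    uint32_t dct_part_sizes[8];
};

/* Returns 0 on success, -1 if the VAAPI partition count is out of range. */
static int sketch_map_dct_parts(const struct sketch_slice *s,
                                struct sketch_dct_parts *out)
{
    unsigned int i;

    /* One control partition plus 1..8 DCT partitions. */
    if (s->num_of_partitions < 2 || s->num_of_partitions > 9)
        return -1;

    out->num_dct_parts = (uint8_t)(s->num_of_partitions - 1);
    for (i = 0; i < 8; i++) {
        /* Shift by one to skip the control partition; zero unused slots. */
        out->dct_part_sizes[i] =
            i < out->num_dct_parts ? s->partition_size[i + 1] : 0;
    }
    return 0;
}
```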

VAProbabilityDataBufferVP8 (lines 218-223):

  • dct_coeff_probs[4][8][3][11] (u8) — direct match to kernel v4l2_vp8_entropy.coeff_probs

VAIQMatrixBufferVP8 (lines 232-241):

  • quantization_index[4][6] (u16) — per-segment, per-component effective Q index. Component order: yac(0), ydc(1), y2dc(2), y2ac(3), uvdc(4), uvac(5). Already includes per-segment effective values.

FFmpeg downstream reference (v4l2_request_vp8.c:31-187)

Submission shape: single batched S_EXT_CTRLS at end_frame, count=1, V4L2_CID_STATELESS_VP8_FRAME with full v4l2_ctrl_vp8_frame struct. No init-time device-wide menus (no DECODE_MODE/START_CODE for VP8 — confirmed by absence in FFmpeg ref + Phase 0 V4L2 inventory).

Bitstream is appended verbatim (v4l2_request_vp8_decode_slice calls ff_v4l2_request_append_output(buffer, size) once per frame with the WHOLE VP8 frame including 3-byte uncompressed header). NO Annex-B start codes, NO start-code emulation prevention. The kernel hantro driver re-parses the 3-byte (or 10-byte for keyframe) uncompressed header.

Kernel hantro driver reference (hantro_vp8.c:49-143)

hantro_vp8_prob_update() reads:

  • hdr->prob_skip_false, hdr->prob_intra, hdr->prob_last, hdr->prob_gf
  • hdr->segment.segment_probs[0..2]
  • hdr->entropy.{y_mode_probs[4], uv_mode_probs[3], mv_probs[2][19], coeff_probs[4][8][3][11]}

The kernel does NOT read hdr->coder_state.padding, quant.padding, or lf.padding. All padding fields must nevertheless be left zero in the libva backend; the FFmpeg ref achieves this with a C99 designated initializer, which defaults every unset field to zero.
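
The zeroing behaviour relied on here can be illustrated with a small sketch. `struct sketch_coder_state` is a local mirror of the kernel's v4l2_vp8_entropy_coder_state layout (an assumption made to keep the example header-independent); the point is only that a named member omitted from a designated initializer is zero-initialized.

```c
#include <stdint.h>

/* Local mirror of the four-byte kernel coder-state layout. */
struct sketch_coder_state {
    uint8_t range;
    uint8_t value;
    uint8_t bit_count;
    uint8_t padding;
};

static struct sketch_coder_state sketch_make_coder_state(uint8_t range,
                                                         uint8_t value,
                                                         uint8_t count)
{
    /* .padding is not named, so C99 initializes it to zero. */
    struct sketch_coder_state cs = {
        .range = range,
        .value = value,
        .bit_count = count,
    };
    return cs;
}
```

This guarantee covers explicit padding members such as the one above. Compiler-inserted padding bytes between members are a separate matter, and a memset before filling the struct is the usual way to cover those as well.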

Mapping table (VAAPI → V4L2 / kernel)

The libva backend's job: read VAAPI's per-frame buffers (Picture + Slice + Probability + IQMatrix) and write the kernel's v4l2_ctrl_vp8_frame. The VAAPI consumer (mpv/ffmpeg-vaapi) has already parsed the bitstream — the libva backend is field-shuffling only, no bitstream parsing.

| Kernel field | VAAPI source | Notes |
|---|---|---|
| width, height | picture->frame_width, frame_height | u32 → u16, both ≤65535 within campaign codec scope (1920 max) |
| version | picture->pic_fields.bits.version | 3-bit field |
| horizontal_scale, vertical_scale | 0, 0 | VAAPI doesn't expose; FFmpeg ref also hardcodes 0 |
| prob_skip_false | picture->prob_skip_false | direct |
| prob_intra | picture->prob_intra | direct |
| prob_last | picture->prob_last | direct |
| prob_gf | picture->prob_gf | direct |
| num_dct_parts | slice->num_of_partitions - 1 | VAAPI's count includes control partition; kernel's excludes (per-spec). Verify against Phase 3 trace. |
| first_part_size | slice->partition_size[0] + ((slice->macroblock_offset + 7) / 8) | total control-partition size; partition_size[0] alone is only the bytes remaining after header parsing (Phase 5 finding C2) |
| first_part_header_bits | derived, see below | not in VAAPI directly |
| dct_part_sizes[0..7] | slice->partition_size[1..8] | shift by 1 to skip control partition |
| last_frame_ts | DPB lookup picture->last_ref_frame | VASurfaceID → object_surface->timestamp → v4l2_timeval_to_ns() (mirror mpeg2.c::pic.forward_ref_ts pattern) |
| golden_frame_ts | DPB lookup picture->golden_ref_frame | same as above |
| alt_frame_ts | DPB lookup picture->alt_ref_frame | same as above |
| flags & KEY_FRAME | picture->pic_fields.bits.key_frame == 0 | inverted sense: VAAPI carries the bitstream bit, where key_frame=0 means key frame |
| flags & SHOW_FRAME | not in VAAPI | force 1 (mpv only renders shown frames; alt-ref invisible frames are also shown=1 to mpv consumer side; safe to force) |
| flags & MB_NO_SKIP_COEFF | picture->pic_fields.bits.mb_no_coeff_skip | direct |
| flags & SIGN_BIAS_GOLDEN | picture->pic_fields.bits.sign_bias_golden | direct |
| flags & SIGN_BIAS_ALT | picture->pic_fields.bits.sign_bias_alternate | direct |
| flags & EXPERIMENTAL | 0 | VAAPI doesn't expose; FFmpeg uses s->profile & 0x4 which has no VAAPI analog. Leave 0. |
| coder_state.range | picture->bool_coder_ctx.range | direct |
| coder_state.value | picture->bool_coder_ctx.value | direct |
| coder_state.bit_count | picture->bool_coder_ctx.count | VAAPI calls it count |
| lf.sharpness_level | picture->pic_fields.bits.sharpness_level | direct |
| lf.level | picture->loop_filter_level[0] | base level (segment 0); VAAPI exposes per-segment, kernel takes base only |
| lf.ref_frm_delta[0..3] | picture->loop_filter_deltas_ref_frame[0..3] | direct |
| lf.mb_mode_delta[0..3] | picture->loop_filter_deltas_mode[0..3] | direct |
| lf.flags & ADJ_ENABLE | picture->pic_fields.bits.loop_filter_adj_enable | direct |
| lf.flags & DELTA_UPDATE | picture->pic_fields.bits.mode_ref_lf_delta_update | direct |
| lf.flags & FILTER_TYPE_SIMPLE | picture->pic_fields.bits.filter_type | VAAPI: filter_type=0 normal, =1 simple |
| quant.y_ac_qi | iqmatrix->quantization_index[0][0] | segment 0, yac component |
| quant.y_dc_delta | iqmatrix->quantization_index[0][1] - iqmatrix->quantization_index[0][0] | u16 - u16 → s8 (clamp) |
| quant.y2_dc_delta | iqmatrix->quantization_index[0][2] - iqmatrix->quantization_index[0][0] | same |
| quant.y2_ac_delta | iqmatrix->quantization_index[0][3] - iqmatrix->quantization_index[0][0] | same |
| quant.uv_dc_delta | iqmatrix->quantization_index[0][4] - iqmatrix->quantization_index[0][0] | same |
| quant.uv_ac_delta | iqmatrix->quantization_index[0][5] - iqmatrix->quantization_index[0][0] | same |
| segment.quant_update[s] | for s∈[1..3]: iqmatrix->quantization_index[s][0] - iqmatrix->quantization_index[0][0] if segmentation enabled, else 0 | when segmentation_enabled=0 (BBB case), all quant_updates are 0, so the per-segment math is bypassed |
| segment.lf_update[s] | for s∈[1..3]: picture->loop_filter_level[s] - picture->loop_filter_level[0] if segmentation enabled, else 0 | same |
| segment.segment_probs[0..2] | picture->mb_segment_tree_probs[0..2] | direct |
| segment.flags & ENABLED | picture->pic_fields.bits.segmentation_enabled | direct |
| segment.flags & UPDATE_MAP | picture->pic_fields.bits.update_mb_segmentation_map | direct |
| segment.flags & UPDATE_FEATURE_DATA | picture->pic_fields.bits.update_segment_feature_data | direct |
| segment.flags & DELTA_VALUE_MODE | NOT in VAAPI directly | VAAPI doesn't expose abs_delta. Per VP8 spec default, segment values are deltas unless explicitly absolute; the FFmpeg ref sets DELTA_VALUE_MODE iff !s->segmentation.absolute_vals. For BBB (segmentation disabled), this flag's value is irrelevant. Leave 0; document the gap for Phase 5 review. |
| entropy.y_mode_probs[0..3] | picture->y_mode_probs[0..3] | direct |
| entropy.uv_mode_probs[0..2] | picture->uv_mode_probs[0..2] | direct |
| entropy.mv_probs[i][j] | picture->mv_probs[i][j] | direct, [2][19] both sides |
| entropy.coeff_probs[i][j][k][l] | probability->dct_coeff_probs[i][j][k][l] | DIFFERENT BUFFER: sourced from VAProbabilityDataBuffer, not Picture. Direct shape match [4][8][3][11]. |
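
The quantization rows are the least mechanical part of the mapping, so a sketch of the base-plus-delta derivation may help. It assumes the component order given for VAIQMatrixBufferVP8 (yac, ydc, y2dc, y2ac, uvdc, uvac) and takes one segment's row as input; `struct sketch_quant` is a stand-in for the kernel's v4l2_vp8_quantization without its padding, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for struct v4l2_vp8_quantization (padding omitted). */
struct sketch_quant {
    uint8_t y_ac_qi;
    int8_t y_dc_delta;
    int8_t y2_dc_delta;
    int8_t y2_ac_delta;
    int8_t uv_dc_delta;
    int8_t uv_ac_delta;
};

/* Clamp an int into the int8_t range before narrowing. */
static int8_t sketch_clamp_s8(int v)
{
    if (v < INT8_MIN)
        return INT8_MIN;
    if (v > INT8_MAX)
        return INT8_MAX;
    return (int8_t)v;
}

/* qi is one row of quantization_index[][]: yac, ydc, y2dc, y2ac, uvdc, uvac. */
static struct sketch_quant sketch_quant_from_row(const uint16_t qi[6])
{
    const int base = qi[0];
    struct sketch_quant q = {
        .y_ac_qi     = (uint8_t)qi[0],
        .y_dc_delta  = sketch_clamp_s8(qi[1] - base),
        .y2_dc_delta = sketch_clamp_s8(qi[2] - base),
        .y2_ac_delta = sketch_clamp_s8(qi[3] - base),
        .uv_dc_delta = sketch_clamp_s8(qi[4] - base),
        .uv_ac_delta = sketch_clamp_s8(qi[5] - base),
    };
    return q;
}
```

The subtraction only recovers the bitstream deltas when the consumer did not saturate any of the effective indices, which is consistent with the Phase 5 review rating the derivation as correct for typical content.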

first_part_header_bits derivation

This field is kernel-imposed metadata about the bitstream: the number of bits consumed by the uncompressed header partition before the boolean coder takes over. FFmpeg derives it from internal parser state:

.first_part_header_bits = (8 * (s->coder_state_at_header_end.input - data) -
                           s->coder_state_at_header_end.bit_count - 8),

VAAPI does not expose this directly. Open question for Phase 3 baseline: derive from slice->macroblock_offset (bit offset of MB layer from start of slice data) — likely equal or off by a known constant. Phase 3 captures the verbatim payload from ffmpeg-v4l2request and computes the relationship.

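Leaving this field at zero, or approximating it, is not safe on the target drivers: per Phase 5 finding C1, hantro_g1_vp8_dec.c:260 and rockchip_vpu2_hw_vp8_dec.c:372 both read first_part_header_bits unconditionally to compute the macroblock DMA offset, so a wrong value produces a garbage decode. The mapping adopted after review is frame.first_part_header_bits = slice->macroblock_offset.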
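
The two first-partition fields, as corrected by Phase 5 findings C1 and C2 recorded at the top of this file, can be sketched as follows. The function names are hypothetical; the formulas are the ones quoted in those findings.

```c
#include <stdint.h>

/* C1: the kernel field takes the VAAPI macroblock bit offset unchanged. */
static uint32_t sketch_first_part_header_bits(uint32_t macroblock_offset)
{
    return macroblock_offset;
}

/* C2: VAAPI's partition_size[0] holds only the bytes remaining after the
 * header was parsed, so the header bytes (bit offset rounded up to whole
 * bytes) are added back to get the total control-partition size. */
static uint32_t sketch_first_part_size(uint32_t partition_size0,
                                       uint32_t macroblock_offset)
{
    return partition_size0 + (macroblock_offset + 7) / 8;
}
```

With the Phase 3 keyframe numbers from finding C2, a remaining size of 21923 plus 819 header bytes gives 22742. The finding does not record the bit offset itself; any macroblock_offset from 6545 to 6552 rounds up to 819 bytes.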

Patch shape prediction

| Site | Action | LOC delta |
|---|---|---|
| src/config.c:121-160 | INSERT VP8 enumeration block (~10 lines) | +10 |
| src/config.c:54-78 | INSERT case label + break + comment (~5 lines) | +5 |
| src/config.c:167-191 | INSERT case label (~1 line) | +1 |
| src/vp8.c | NEW FILE | +160-220 |
| src/vp8.h | NEW FILE | +35-45 |
| src/picture.c:34-36 | INSERT #include "vp8.h" | +1 |
| src/picture.c:188-225 | INSERT VP8 dispatch case (~6 lines) | +6 |
| src/picture.c:54-186 | INSERT 4 nested cases + 1 outer case | +30-40 |
| src/picture.c:299-300 | INSERT 2 reset lines | +2 |
| src/surface.h:92-113 | INSERT vp8 struct (~8 lines) | +8 |
| src/meson.build:50,73 | INSERT 2 entries | +2 |

Total: ~260-340 LOC across 4 modified files (config.c, picture.c, surface.h, meson.build) + 2 new files. Compared to iter1 (~120 LOC, 4 modified + 0 new + 1 deleted) and iter2 (~470 LOC, 5 modified + 0 new + 0 deleted), iter3 is medium-sized, and the new file dominates. The dispatcher additions in picture.c + config.c are mechanical ports of iter1+iter2 patterns.

Open questions for Phase 3 baseline

The Phase 3 baseline run will capture verbatim S_EXT_CTRLS payloads from ffmpeg -hwaccel v4l2request bbb_720p10s_vp8.webm (cross-validator anchor). Questions to answer empirically before Phase 4 plan locks:

  1. first_part_header_bits exact value: capture for frame 1 (key) and frame 2 (inter). Compare against slice->macroblock_offset from a parallel vainfo --vbo-equivalent capture.
  2. num_dct_parts vs num_of_partitions: confirm off-by-one (kernel excludes, VAAPI includes control partition). Verify dct_part_sizes[] indexing.
  3. DPB timestamp lookup: confirm v4l2_timeval_to_ns(picture->last_ref_frame's surface_object->timestamp) matches what the kernel hantro driver reads. Any 0-sentinel for missing refs? (FFmpeg leaves zero for missing refs by C99 designated init.)
  4. show_frame handling: VAAPI doesn't expose. Force 1 vs derive — which matches the kernel's expectation? (BBB has no alt-ref invisible frames; both options should work for the binding cell, but verify.)
  5. lf.flags FILTER_TYPE_SIMPLE bit: VAAPI's filter_type=1 means simple. Confirm against bitstream baseline.
  6. First-frame DPB sentinel: when picture->last_ref_frame == VA_INVALID_SURFACE, what does FFmpeg ref's last_frame_ts end up as? (Likely 0; verify.)

These answers feed Phase 4 plan clauses. None are blocking — all have safe defaults that work for the BBB binding cell.
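
For questions 3 and 6, one safe default could look like the sketch below: resolve each reference surface to a nanosecond timestamp and fall back to 0 when the reference is invalid or unknown. The 0 sentinel is an assumption pending the Phase 3 capture. `struct sketch_surface` and the linear lookup are hypothetical stand-ins for the fork's surface objects, and `sketch_timeval_to_ns()` repeats the arithmetic of the kernel UAPI helper v4l2_timeval_to_ns().

```c
#include <stdint.h>
#include <sys/time.h>

/* Same numeric value as VA_INVALID_SURFACE in <va/va.h>. */
#define SKETCH_INVALID_SURFACE 0xffffffffu

/* Hypothetical stand-in for the fork's per-surface bookkeeping. */
struct sketch_surface {
    uint32_t id;
    struct timeval timestamp;
};

/* Seconds and microseconds to nanoseconds. */
static uint64_t sketch_timeval_to_ns(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/* Returns the reference timestamp, or 0 (assumed sentinel) if absent. */
static uint64_t sketch_ref_ts(const struct sketch_surface *dpb,
                              unsigned int count, uint32_t ref_id)
{
    unsigned int i;

    if (ref_id == SKETCH_INVALID_SURFACE)
        return 0;
    for (i = 0; i < count; i++) {
        if (dpb[i].id == ref_id)
            return sketch_timeval_to_ns(&dpb[i].timestamp);
    }
    return 0;
}
```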

Phase 3 baseline targets (work plan)

To answer the open questions above, Phase 3 will run on fresnel (when reachable):

  1. Cross-validator capture: strace -ff -tt -y -v -e trace=ioctl ffmpeg -hwaccel v4l2request -i ~/fourier-test/bbb_720p10s_vp8.webm -frames:v 5 -f null - 2>strace.log with hantro-vpu-dec env vars. Extract S_EXT_CTRLS payload bytes for VP8_FRAME control across frames 1 (key) and 2 (inter).
  2. VAAPI-side trace: LIBVA_TRACE=/tmp/vp8_libva.trace mpv --hwdec=no --vo=null --frames=2 ~/fourier-test/bbb_720p10s_vp8.webm to confirm VAAPI consumer chain (mpv's parser produces VAPictureParameterBufferVP8 + slice + iqmatrix + probability buffers).
  3. Cache-safe verify path baseline: mpv --hwdec=no --vo=image --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm and capture frame-0001.jpg + frame-0002.jpg SHA256s (SW reference for criterion 4 byte-compare in Phase 7).

Phase 4 plan structure (anticipated)

Following iter2's 10-clause plan template:

  • Clause 1: device-init batched submission contract (VP8 has none — clause is empty / N/A)
  • Clause 2: per-frame batched submission shape (count=1, VP8_FRAME control)
  • Clause 3: VAAPI → V4L2 mapping table (the table above, normalized to plan-prose form)
  • Clause 4: DPB timestamp resolution
  • Clause 5: quantization base+delta derivation from VAAPI's denormalized matrix
  • Clause 6: probability table mapping (separate buffer source)
  • Clause 7: BeginPicture per-frame reset (iqmatrix_set, probability_set)
  • Clause 8: surface union extension
  • Clause 9: enumeration + dispatch wiring (config.c + picture.c)
  • Clause 10: meson + new file integration

The plan will cite verbatim Phase 3 baseline payload bytes for fields where the mapping is non-obvious (quant deltas, first_part_header_bits) per feedback_dev_process.md Phase 6 contract-before-code.

Substrate state at Phase 2 close

  • iter3 Phase 1 commit ea2413e pushed to gitea (campaign repo).
  • Fork on noether at iter2 tip 8d71e20 (synced via git fetch origin && git merge --ff-only origin/master from previous commit 229d6d1).
  • Fresnel.vpn unreachable at Phase 2 read time; Phase 3 baseline + Phase 6 builds need the laptop online. Memory rule — don't offer pause prompts; will wait for fresnel to come back online OR the user to wake it before Phase 3.
  • All 5 memory entries still apply: gitea-as-claude-noether, no-session-termination-attempts, header-deletion-check, review-empirical-over-theoretical (BOTH directions), rockchip-pixel-verify-path.
  • Phase 3 baseline questions queued (6 items above).