Second-model review by sonnet-architect found 4 Critical bugs in
Phase 4 plan, all verified empirically by author before incorporation
per memory feedback_review_empirical_over_theoretical Direction 2.
Amendments applied in-place to phase4_iter3_plan.md +
phase2_iter3_situation.md.
Critical findings:
C1 first_part_header_bits = 0 was claimed cosmetic; actually
UNSAFE. hantro_g1_vp8_dec.c:260 + rockchip_vpu2_hw_vp8_dec.c:372
both read this field unconditionally to compute the macroblock
DMA offset. Setting 0 would place hardware at wrong DMA offset
for ALL macroblock data → garbage decode.
Fix: frame.first_part_header_bits = slice->macroblock_offset
(verified by source identity — vaapi_vp8.c:204 and
v4l2_request_vp8.c:83 use byte-identical formulas).
C2 first_part_size = slice->partition_size[0] was wrong; VAAPI's
partition_size[0] is the REMAINING bytes after parsing
(vaapi_vp8.c:209 confirms; va_dec_vp8.h:193-196 spec confirms).
Kernel needs the TOTAL control partition size.
Fix: frame.first_part_size = slice->partition_size[0] +
((macroblock_offset + 7) / 8)
Phase 3 keyframe numerics confirm: 21923 + 819 = 22742 ✓.
C3 VAProbabilityDataBufferType does not exist as a buffer-type
enum; it's the struct name. The actual enum constant is
VAProbabilityBufferType (= 13 per va.h:2058). Switch case
using the wrong identifier would have failed Phase 6 compile.
Fix: replace globally in phase2 + phase4 docs.
C4 (s8) cast undefined in userspace. Kernel has 's8' typedef in
linux/types.h (kernel-internal). UAPI exposes '__s8' (double-
underscore). Userspace portable cast is int8_t from <stdint.h>.
Fix: replace (s8) with (int8_t) in Clauses 6+7.
Suggested:
S3 Clause 8 comment was factually wrong: hantro_vp8.c::
hantro_vp8_prob_update reads coeff_probs unconditionally;
there is NO default-table fallback. If probability_set==false,
decode produces garbage. Practical risk low (FFmpeg vaapi_vp8.c
always sends VAProbabilityBufferType per frame), but corrected
comment + added assert(probability_set) runtime guard for
immediate Phase 6 surfacing.
Plus 5 minor S/Q items documented; non-blocking for iter3.
Author's 7 review questions all answered directly in the review:
Q1 quantization derivation: correct for typical content
Q2 first_part_header_bits=0 safety: UNSAFE → C1
Q3 num_dct_parts off-by-one: confirmed correct
Q4 field availability: 2 compile failures found (C3 + C4)
Q5 quant_update[s] semantics: signed delta confirmed
Q6 SHOW_FRAME unconditional: safe for BBB scope
Q7 buffer order independence: confirmed
Estimated saving: 1 Phase 6 → Phase 4 loopback + 2 Phase 6 fix-
forward commits. Review pass is the right path forward per memory
rule "Reviews are never skippable" — empty-review value =
empirical-verification value, regardless of finding count.
Refs:
phase4_iter3_plan.md (amended in-place; Phase 5 amendments
section appended)
phase2_iter3_situation.md (amended C3 globally)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 3 — Phase 2 (situation analysis)
Source-read of every file the iter3 patch series will touch, plus the kernel UAPI + VAAPI + downstream FFmpeg + kernel hantro reference sources. Written immediately after iter3 Phase 1 lock (commit ea2413e). Conducted on noether against fork tip 8d71e20 (iter2 Phase 6 commit B); fresnel.vpn was unreachable at Phase 2 open, so the read is against the noether mirror — verified at commit hash level pre-read.
This is a contract-before-code analysis per feedback_dev_process.md Phase 2: enumerate the bugs, cite the contract verbatim, predict the patch shape, queue the Phase 3 baseline questions.
Bug enumeration (sites the iter3 patch series must touch)
B1 — src/config.c::RequestQueryConfigProfiles — VP8 enumeration block missing
Site: config.c:121-165.
Current state (lines 128-160): three enumeration blocks for MPEG-2 (lines 128-137), H.264 (139-151), HEVC (153-160). Each calls v4l2_find_format() on the OUTPUT-side pixfmt against both single-plane and MPLANE buffer types, then conditionally appends profile constants to the output array under a count guard.
Bug: no analogous block for V4L2_PIX_FMT_VP8_FRAME → VAProfileVP8Version0_3. Without this, vainfo (and any consumer that calls vaQueryConfigProfiles) sees no VP8 profile in the enumeration → criterion 1 fails before vaCreateConfig is ever attempted.
Different from iter1+iter2: iter1 (MPEG-2) and iter2 (HEVC) had the enumeration block already in place pre-iter; only the case label fall-through in RequestCreateConfig was missing. iter3 has neither, so both must be added.
B2 — src/config.c::RequestCreateConfig — VP8 case label missing entirely
Site: config.c:54-78.
Current state: switch over profile. iter1 added case VAProfileMPEG2Simple/Main: with explicit break; (lines 63-69). iter2 added case VAProfileHEVCMain: with break; (lines 70-75). H.264 always existed (lines 56-62, marked // FIXME from upstream). Default → VA_STATUS_ERROR_UNSUPPORTED_PROFILE.
Bug: no case VAProfileVP8Version0_3:. Hits default → consumer gets VA_STATUS_ERROR_UNSUPPORTED_PROFILE from vaCreateConfig → criterion 2 fails.
Patch shape: add 4-line case (label + comment + break;) directly after the iter2 HEVCMain block, mirroring iter1+iter2 style.
B3 — src/config.c::RequestQueryConfigEntrypoints — VP8 case missing
Site: config.c:167-191.
Current state: switch over profile; case list at lines 173-180 covers MPEG-2/H.264/HEVC and falls through to entrypoints[0] = VAEntrypointVLD; *entrypoints_count = 1;. Default sets count to 0.
Bug: no case VAProfileVP8Version0_3:. mpv-vaapi's profile probe queries entry points; without VLD, it skips VP8 → criterion 3 fails (mpv falls through to SW decode silently).
Patch shape: add case VAProfileVP8Version0_3: to the existing fall-through case list.
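The fall-through shape described for B3 can be sketched as below. This is a minimal illustration, not the fork's code: the enums are local stand-ins so the fragment builds without <va/va.h>, and the name sk_query_entrypoints is hypothetical.

```c
/* Stand-in enums (assumption): the real constants come from <va/va.h>. */
enum sk_profile {
	SK_PROFILE_MPEG2_MAIN,
	SK_PROFILE_H264_MAIN,
	SK_PROFILE_HEVC_MAIN,
	SK_PROFILE_VP8_VERSION0_3,
	SK_PROFILE_OTHER
};

enum sk_entrypoint { SK_ENTRYPOINT_NONE = 0, SK_ENTRYPOINT_VLD = 1 };

/* Mirrors the shape of RequestQueryConfigEntrypoints: every supported
 * profile falls through to the single VLD entry point, anything else
 * reports a count of zero. */
static void sk_query_entrypoints(enum sk_profile profile,
				 enum sk_entrypoint *entrypoints,
				 int *entrypoints_count)
{
	switch (profile) {
	case SK_PROFILE_MPEG2_MAIN:
	case SK_PROFILE_H264_MAIN:
	case SK_PROFILE_HEVC_MAIN:
	case SK_PROFILE_VP8_VERSION0_3: /* the one-line iter3 addition */
		entrypoints[0] = SK_ENTRYPOINT_VLD;
		*entrypoints_count = 1;
		break;
	default:
		*entrypoints_count = 0;
		break;
	}
}
```

With the VP8 label in the list, a profile probe sees one entry point (VLD); without it, the count stays 0 and the consumer skips hardware decode.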
B4 — src/vp8.c — file does not exist; needs net-new implementation
Site: NEW FILE src/vp8.c.
Bug: there is no VP8 codec dispatcher in the fork. The fork's predecessor (libva-v4l2-request bootlin master) only implements MPEG-2 + H.264 + HEVC. VP8 was never added upstream.
Patch shape: NEW file, ~150-200 lines. Mirror the iter1 mpeg2.c template (src/mpeg2.c:53-249):
- Includes block (mpeg2.h-equivalent + context + request + surface + v4l2-controls)
- vp8_set_controls() function entry point matching the existing dispatcher signature (struct request_data *driver_data, struct object_context *context_object, struct object_surface *surface_object) -> int
- Local v4l2_ctrl_vp8_frame struct populated from VAAPI buffers (Picture + IQMatrix + Probability + Slice param)
- DPB-timestamp lookup for last_frame_ts/golden_frame_ts/alt_frame_ts from VASurfaceID references in VAPictureParameterBufferVP8
- One-element v4l2_ext_control array, single V4L2_CID_STATELESS_VP8_FRAME control
- Single v4l2_set_controls(driver_data->video_fd, surface_object->request_fd, ctrls, 1) call
B5 — src/vp8.h — header does not exist
Site: NEW FILE src/vp8.h.
Bug: companion header for vp8.c. Declare vp8_set_controls(). Mirror src/mpeg2.h (forward declarations of request_data, object_context, object_surface, function prototype). No struct definitions needed (no array dimensions to declare like HEVC's HEVC_MAX_SLICES_PER_FRAME).
B6 — src/picture.c::codec_set_controls — VP8 dispatch case missing
Site: picture.c:188-225 (function codec_set_controls).
Current state: switch over profile; MPEG-2 → mpeg2_set_controls (lines 196-201), H.264 → h264_set_controls (203-212), HEVCMain → h265_set_controls (214-218). Default → VA_STATUS_ERROR_UNSUPPORTED_PROFILE.
Bug: no VP8 case. Hits default after RequestEndPicture → vaEndPicture returns error → consumer aborts decode.
Patch shape: add case VAProfileVP8Version0_3: calling vp8_set_controls(driver_data, context_object, surface_object) with same if (rc < 0) return VA_STATUS_ERROR_OPERATION_FAILED; shape as MPEG-2 + HEVC.
Plus include directive update: add #include "vp8.h" near picture.c:34-36 (the existing h264.h/h265.h/mpeg2.h block).
B7 — src/picture.c::codec_store_buffer — 4 VAAPI buffer types unmapped
Site: picture.c:54-186 (function codec_store_buffer).
VAAPI VP8 sends FOUR distinct per-frame parameter buffer types, plus the raw slice data that the fork already handles (per va_dec_vp8.h:71-241):
| VAAPI buffer type | VAAPI struct | Per-frame |
|---|---|---|
| VAPictureParameterBufferType | VAPictureParameterBufferVP8 | once |
| VASliceParameterBufferType | VASliceParameterBufferVP8 | once (frame-mode) |
| VAProbabilityBufferType | VAProbabilityDataBufferVP8 | once |
| VAIQMatrixBufferType | VAIQMatrixBufferVP8 | once |
| VASliceDataBufferType | raw bitstream | once |
Current state:
- VASliceDataBufferType (lines 61-83): already universal, no per-profile branch. The context->h264_start_code flag prepends 00 00 01 for H.264 only; VP8 does not need a start-code prefix (VP8 has its own 3-byte uncompressed frame header). The slice-data path is fine for VP8 unmodified.
- VAPictureParameterBufferType (lines 85-113): switch over profile; MPEG-2/H.264/HEVC handled. Default → break (silent ignore). Bug: no VP8 case.
- VASliceParameterBufferType (lines 115-146): switch; H.264/HEVC handled. No MPEG-2 case (intentional: MPEG-2 has only Picture + Quant + Slice-data per VAAPI). Bug: no VP8 case.
- VAIQMatrixBufferType (lines 148-179): switch; MPEG-2/H.264/HEVC handled. Bug: no VP8 case.
- VAProbabilityBufferType: NOT IN THE OUTER SWITCH. VAAPI defines this enum value for VP8, but the fork's codec_store_buffer outer switch doesn't list it. Currently falls through to default: break; at line 181. Bug: VAProbabilityBufferType case missing entirely.
Patch shape: 4 nested case adds + 1 outer-case add:
- VAPictureParameterBufferType → add VP8 case → memcpy into surface_object->params.vp8.picture
- VASliceParameterBufferType → add VP8 case → memcpy into surface_object->params.vp8.slice (single, no slices[] array; VP8 is frame-mode)
- VAIQMatrixBufferType → add VP8 case → memcpy into surface_object->params.vp8.iqmatrix + set iqmatrix_set true
- NEW outer case VAProbabilityBufferType → switch over profile → VP8 case → memcpy into surface_object->params.vp8.probability + set probability_set true
B8 — src/picture.c::RequestBeginPicture — per-frame VP8 flag resets missing
Site: picture.c:227-306.
iter1 added surface_object->params.h264.matrix_set = false; at line 299. iter2 added surface_object->params.h265.num_slices = 0; at line 300.
Bug analysis: VP8 has no slice array (a single slice-parameter buffer per frame), so there is no slice counter to reset. It does have two presence flags, iqmatrix_set and probability_set, and both need a per-frame reset.
Patch shape: add two lines:
surface_object->params.vp8.iqmatrix_set = false;
surface_object->params.vp8.probability_set = false;
This mirrors iter1's matrix_set = false reset pattern (one line each profile).
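The store-then-reset pattern from B7 and B8 can be sketched together as follows. This is a sketch under stated assumptions: every type and function name here (sk_vp8_params, sk_store_buffer, sk_begin_picture) is a hypothetical stand-in for the fork's object_surface and the VAAPI structs, and the size check is an addition of this sketch rather than something the source describes. Only the two flagged buffers are shown; the Picture and Slice cases would follow the same memcpy shape without a flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types (assumption): the fork would use the VAAPI structs from
 * <va/va_dec_vp8.h> and its own object_surface instead. */
enum sk_buffer_type { SK_BUF_IQMATRIX, SK_BUF_PROBABILITY, SK_BUF_OTHER };

struct sk_iqmatrix { unsigned short quantization_index[4][6]; };
struct sk_probability { unsigned char dct_coeff_probs[4][8][3][11]; };

struct sk_vp8_params {
	struct sk_iqmatrix iqmatrix;
	bool iqmatrix_set;
	struct sk_probability probability;
	bool probability_set;
};

/* Per-frame reset, as RequestBeginPicture would do for the two flags. */
static void sk_begin_picture(struct sk_vp8_params *p)
{
	p->iqmatrix_set = false;
	p->probability_set = false;
}

/* Copy one buffer into per-surface storage and mark it present.
 * Returns 0 when stored, -1 when the type is ignored or the size is off. */
static int sk_store_buffer(struct sk_vp8_params *p, enum sk_buffer_type type,
			   const void *data, size_t size)
{
	switch (type) {
	case SK_BUF_IQMATRIX:
		if (size != sizeof(p->iqmatrix))
			return -1;
		memcpy(&p->iqmatrix, data, sizeof(p->iqmatrix));
		p->iqmatrix_set = true;
		return 0;
	case SK_BUF_PROBABILITY:
		if (size != sizeof(p->probability))
			return -1;
		memcpy(&p->probability, data, sizeof(p->probability));
		p->probability_set = true;
		return 0;
	default:
		return -1;
	}
}
```

Because the flags are cleared at the start of every frame, a dispatcher can tell a buffer that arrived for this frame from stale data left over from a reused surface.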
B9 — src/surface.h::object_surface::params union — no vp8 member
Site: surface.h:92-113.
Current state: union of three structs: mpeg2, h264, h265. Each holds the buffer-type structs the dispatcher reads.
Bug: no vp8 member. iter1 B3 latent surface-reuse bug (per phase0_findings_iter3.md): picture.c:299 writes byte 240 of the union (h264.matrix_set offset). The iter2 union is dominated by h265 with its 64-slot slices[64] array; total union size ~17 KB. Adding a vp8 member doesn't grow the union (h265 is the dominant member by far).
Patch shape: add vp8 struct after h265:
struct {
VAPictureParameterBufferVP8 picture;
VASliceParameterBufferVP8 slice;
VAIQMatrixBufferVP8 iqmatrix;
bool iqmatrix_set;
VAProbabilityDataBufferVP8 probability;
bool probability_set;
} vp8;
B10 — src/meson.build — vp8.c + vp8.h not in sources/headers
Site: meson.build:30-74.
Current state: sources list has mpeg2.c/h264.c/h264_slice_header.c/h265.c (line 50, uncommented in iter2). headers list has mpeg2.h/h264.h/h264_slice_header.h/h265.h (line 73).
Bug: vp8.c + vp8.h are NEW files, must be ADDED.
Patch shape: insert 'vp8.c' after 'h265.c' in sources, insert 'vp8.h' after 'h265.h' in headers.
Non-bugs (intentionally NOT touched)
- src/context.c: VP8 has no DECODE_MODE/START_CODE menus per Phase 0 V4L2 inventory. iter2's HEVC additions to context.c have no analog. No context.c changes.
- src/video.c::formats[]: the format list is CAPTURE-side (NV12 + Sunxi NV12). VP8 is OUTPUT-side; OUTPUT format probing is done by the v4l2_find_format() calls in config.c, NOT video.c. No video.c changes.
- src/v4l2.c: v4l2_find_format() is fourcc-agnostic. No v4l2.c changes.
- src/buffer.c: VAProbabilityBufferType is a standard VAAPI buffer type; the buffer registry is type-agnostic. No buffer.c changes.
- include/hevc-ctrls.h: already a 9-line shim including <linux/v4l2-controls.h>. VP8's V4L2_CID_STATELESS_VP8_FRAME is in the same kernel UAPI header (line 1900). No header-shim work like iter1's mpeg2-ctrls.h deletion.
Contract surface (verbatim from kernel UAPI + VAAPI)
Kernel UAPI: V4L2_CID_STATELESS_VP8_FRAME
<linux/v4l2-controls.h>:1900 — V4L2_CID_STATELESS_VP8_FRAME = V4L2_CID_CODEC_STATELESS_BASE + 200 = 0x00a409c8. Matches the per-device control advertised by hantro-vpu-dec in Phase 0 V4L2 inventory (vp8_frame_parameters 0x00a409c8).
Kernel UAPI: struct v4l2_ctrl_vp8_frame (<linux/v4l2-controls.h>:1929-1958)
struct v4l2_ctrl_vp8_frame {
struct v4l2_vp8_segment segment; /* offset 0 */
struct v4l2_vp8_loop_filter lf; /* loop filter parameters */
struct v4l2_vp8_quantization quant; /* base quant indices */
struct v4l2_vp8_entropy entropy; /* update probabilities */
struct v4l2_vp8_entropy_coder_state coder_state;
__u16 width;
__u16 height;
__u8 horizontal_scale;
__u8 vertical_scale;
__u8 version;
__u8 prob_skip_false;
__u8 prob_intra;
__u8 prob_last;
__u8 prob_gf;
__u8 num_dct_parts;
__u32 first_part_size;
__u32 first_part_header_bits;
__u32 dct_part_sizes[8];
__u64 last_frame_ts;
__u64 golden_frame_ts;
__u64 alt_frame_ts;
__u64 flags;
};
Sub-structs (<linux/v4l2-controls.h>:1785-1888):
- v4l2_vp8_segment: __s8 quant_update[4]; __s8 lf_update[4]; __u8 segment_probs[3]; __u8 padding; __u32 flags; (segment-id probabilities, per-segment quant/lf overrides, flags V4L2_VP8_SEGMENT_FLAG_{ENABLED, UPDATE_MAP, UPDATE_FEATURE_DATA, DELTA_VALUE_MODE})
- v4l2_vp8_loop_filter: __s8 ref_frm_delta[4]; __s8 mb_mode_delta[4]; __u8 sharpness_level; __u8 level; __u16 padding; __u32 flags; (flags V4L2_VP8_LF_{ADJ_ENABLE, DELTA_UPDATE, FILTER_TYPE_SIMPLE})
- v4l2_vp8_quantization: __u8 y_ac_qi; __s8 y_dc_delta; __s8 y2_dc_delta; __s8 y2_ac_delta; __s8 uv_dc_delta; __s8 uv_ac_delta; __u16 padding; (base values; per-segment overrides come from segment.quant_update[])
- v4l2_vp8_entropy: __u8 coeff_probs[4][8][3][11]; __u8 y_mode_probs[4]; __u8 uv_mode_probs[3]; __u8 mv_probs[2][19]; __u8 padding[3]; (probability update tables)
- v4l2_vp8_entropy_coder_state: __u8 range; __u8 value; __u8 bit_count; __u8 padding; (boolean coder state at end of header)
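As a cross-check of the field lists above, the following sketch mirrors the five sub-structs with fixed-width types and asserts their sizes at compile time. The sk_ names are hypothetical local mirrors (real code includes <linux/v4l2-controls.h>), and the expected sizes are simply the sums of the listed fields, assuming a common ABI where uint16_t and uint32_t have natural alignment.

```c
#include <stdint.h>

/* Local mirrors (assumption): field order and types copied from the list. */
struct sk_vp8_segment {
	int8_t quant_update[4];
	int8_t lf_update[4];
	uint8_t segment_probs[3];
	uint8_t padding;
	uint32_t flags;
};

struct sk_vp8_loop_filter {
	int8_t ref_frm_delta[4];
	int8_t mb_mode_delta[4];
	uint8_t sharpness_level;
	uint8_t level;
	uint16_t padding;
	uint32_t flags;
};

struct sk_vp8_quantization {
	uint8_t y_ac_qi;
	int8_t y_dc_delta;
	int8_t y2_dc_delta;
	int8_t y2_ac_delta;
	int8_t uv_dc_delta;
	int8_t uv_ac_delta;
	uint16_t padding;
};

struct sk_vp8_entropy {
	uint8_t coeff_probs[4][8][3][11]; /* 1056 bytes */
	uint8_t y_mode_probs[4];
	uint8_t uv_mode_probs[3];
	uint8_t mv_probs[2][19];          /* 38 bytes */
	uint8_t padding[3];
};

struct sk_vp8_entropy_coder_state {
	uint8_t range;
	uint8_t value;
	uint8_t bit_count;
	uint8_t padding;
};

/* The explicit padding members round each struct to a multiple of 4 bytes,
 * so the compiler has no hidden padding to add. */
_Static_assert(sizeof(struct sk_vp8_segment) == 16, "segment size");
_Static_assert(sizeof(struct sk_vp8_loop_filter) == 16, "loop filter size");
_Static_assert(sizeof(struct sk_vp8_quantization) == 8, "quantization size");
_Static_assert(sizeof(struct sk_vp8_entropy) == 1104, "entropy size");
_Static_assert(sizeof(struct sk_vp8_entropy_coder_state) == 4, "coder state size");
```

A size mismatch here would point at a transcription error in the field list before any payload comparison against the Phase 3 capture.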
Frame flags (<linux/v4l2-controls.h>:1890-1895):
- V4L2_VP8_FRAME_FLAG_KEY_FRAME = 0x01
- V4L2_VP8_FRAME_FLAG_EXPERIMENTAL = 0x02
- V4L2_VP8_FRAME_FLAG_SHOW_FRAME = 0x04
- V4L2_VP8_FRAME_FLAG_MB_NO_SKIP_COEFF = 0x08
- V4L2_VP8_FRAME_FLAG_SIGN_BIAS_GOLDEN = 0x10
- V4L2_VP8_FRAME_FLAG_SIGN_BIAS_ALT = 0x20
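A minimal sketch of how these bits could be assembled from the VAAPI picture bitfield, following the flag rows of this document's mapping table (key_frame inverted, SHOW_FRAME forced on, EXPERIMENTAL left clear). The struct sk_pic_bits and the function name are hypothetical stand-ins for pic_fields.bits in VAPictureParameterBufferVP8; the flag values are copied from the list above.

```c
#include <stdint.h>

/* Flag values copied from the list above, under local names. */
#define SK_FLAG_KEY_FRAME        0x01u
#define SK_FLAG_SHOW_FRAME       0x04u
#define SK_FLAG_MB_NO_SKIP_COEFF 0x08u
#define SK_FLAG_SIGN_BIAS_GOLDEN 0x10u
#define SK_FLAG_SIGN_BIAS_ALT    0x20u

/* Stand-in (assumption) for the subset of pic_fields.bits used here. */
struct sk_pic_bits {
	unsigned int key_frame : 1; /* VAAPI: 0 means key frame */
	unsigned int mb_no_coeff_skip : 1;
	unsigned int sign_bias_golden : 1;
	unsigned int sign_bias_alternate : 1;
};

static uint64_t sk_frame_flags(const struct sk_pic_bits *bits)
{
	/* SHOW_FRAME is forced because VAAPI has no field for it. */
	uint64_t flags = SK_FLAG_SHOW_FRAME;

	if (bits->key_frame == 0)
		flags |= SK_FLAG_KEY_FRAME;
	if (bits->mb_no_coeff_skip)
		flags |= SK_FLAG_MB_NO_SKIP_COEFF;
	if (bits->sign_bias_golden)
		flags |= SK_FLAG_SIGN_BIAS_GOLDEN;
	if (bits->sign_bias_alternate)
		flags |= SK_FLAG_SIGN_BIAS_ALT;
	return flags;
}
```

For a key frame with no other bits set this yields 0x05 (KEY_FRAME plus SHOW_FRAME), which is a convenient value to look for in a captured payload.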
VAAPI buffer types (/home/mfritsche/src/ohm_gl_fix/phase6/step1/reference/libva/va/va_dec_vp8.h)
VAPictureParameterBufferVP8 (lines 71-160):
- frame_width, frame_height (u32)
- last_ref_frame, golden_ref_frame, alt_ref_frame, out_of_loop_frame (VASurfaceID)
- pic_fields.bits.{key_frame, version, segmentation_enabled, update_mb_segmentation_map, update_segment_feature_data, filter_type, sharpness_level, loop_filter_adj_enable, mode_ref_lf_delta_update, sign_bias_golden, sign_bias_alternate, mb_no_coeff_skip, loop_filter_disable} (packed bitfield)
- mb_segment_tree_probs[3] (u8)
- loop_filter_level[4], loop_filter_deltas_ref_frame[4], loop_filter_deltas_mode[4] (per-segment / per-ref / per-mode)
- prob_skip_false, prob_intra, prob_last, prob_gf (u8)
- y_mode_probs[4], uv_mode_probs[3] (u8; luma + chroma intra-prediction probs)
- mv_probs[2][19] (u8)
- bool_coder_ctx.{range, value, count} (u8; same bytes as kernel v4l2_vp8_entropy_coder_state minus padding)
VASliceParameterBufferVP8 (lines 170-202):
- slice_data_size, slice_data_offset, slice_data_flag, macroblock_offset (u32)
- num_of_partitions (u8)
- partition_size[9] (u32): partition_size[0] is control-partition remaining bytes; partition_size[1..8] are DCT partition sizes (max 8 DCT partitions per VP8 spec)
VAProbabilityDataBufferVP8 (lines 218-223):
- dct_coeff_probs[4][8][3][11] (u8): direct match to kernel v4l2_vp8_entropy.coeff_probs
VAIQMatrixBufferVP8 (lines 232-241):
- quantization_index[4][6] (u16): per-segment, per-component effective Q index. Component order: yac(0), ydc(1), y2dc(2), y2ac(3), uvdc(4), uvac(5). Already includes per-segment effective values.
FFmpeg downstream reference (v4l2_request_vp8.c:31-187)
Submission shape: single batched S_EXT_CTRLS at end_frame, count=1, V4L2_CID_STATELESS_VP8_FRAME with full v4l2_ctrl_vp8_frame struct. No init-time device-wide menus (no DECODE_MODE/START_CODE for VP8 — confirmed by absence in FFmpeg ref + Phase 0 V4L2 inventory).
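The count=1 submission shape can be sketched as below. The structs are reduced stand-ins (assumption) for struct v4l2_ctrl_vp8_frame and struct v4l2_ext_control, and sk_build_vp8_controls is a hypothetical name. The control ID is rebuilt from the class and base values as I recall them from <linux/v4l2-controls.h>, and it reproduces the 0x00a409c8 quoted earlier in this document.

```c
#include <stdint.h>
#include <string.h>

/* Recalled UAPI values under local names so the sketch builds without
 * kernel headers; real code uses V4L2_CID_STATELESS_VP8_FRAME directly. */
#define SK_CTRL_CLASS_CODEC_STATELESS 0x00a40000u
#define SK_CID_CODEC_STATELESS_BASE   (SK_CTRL_CLASS_CODEC_STATELESS | 0x900u)
#define SK_CID_STATELESS_VP8_FRAME    (SK_CID_CODEC_STATELESS_BASE + 200u)

/* Reduced stand-ins (assumption) for the real UAPI structs. */
struct sk_vp8_frame {
	uint16_t width;
	uint16_t height;
	uint64_t flags;
};

struct sk_ext_control {
	uint32_t id;
	uint32_t size;
	void *ptr;
};

/* Fill the one-element control array and return the control count,
 * which is always 1 for VP8 (no device-wide menus to batch alongside). */
static unsigned int sk_build_vp8_controls(struct sk_vp8_frame *frame,
					  struct sk_ext_control ctrls[1])
{
	memset(&ctrls[0], 0, sizeof(ctrls[0]));
	ctrls[0].id = SK_CID_STATELESS_VP8_FRAME;
	ctrls[0].size = (uint32_t)sizeof(*frame);
	ctrls[0].ptr = frame;
	return 1;
}
```

In the fork, the resulting array and count would be what gets handed to v4l2_set_controls() together with the surface's request fd.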
Bitstream is appended verbatim (v4l2_request_vp8_decode_slice calls ff_v4l2_request_append_output(buffer, size) once per frame with the WHOLE VP8 frame including 3-byte uncompressed header). NO Annex-B start codes, NO start-code emulation prevention. The kernel hantro driver re-parses the 3-byte (or 10-byte for keyframe) uncompressed header.
Kernel hantro driver reference (hantro_vp8.c:49-143)
hantro_vp8_prob_update() reads:
- hdr->prob_skip_false, hdr->prob_intra, hdr->prob_last, hdr->prob_gf
- hdr->segment.segment_probs[0..2]
- hdr->entropy.{y_mode_probs[4], uv_mode_probs[3], mv_probs[2][19], coeff_probs[4][8][3][11]}
The kernel does NOT read hdr->coder_state.padding or quant.padding or lf.padding. All padding fields must still be left zero in the libva backend; this matches the FFmpeg ref, whose C99 designated initializer zeroes every field it does not set explicitly.
Mapping table (VAAPI → V4L2 / kernel)
The libva backend's job: read VAAPI's per-frame buffers (Picture + Slice + Probability + IQMatrix) and write the kernel's v4l2_ctrl_vp8_frame. The VAAPI consumer (mpv/ffmpeg-vaapi) has already parsed the bitstream — the libva backend is field-shuffling only, no bitstream parsing.
| Kernel field | VAAPI source | Notes |
|---|---|---|
| width, height | picture->frame_width, frame_height | u32 → u16, both ≤65535 within campaign codec scope (1920 max) |
| version | picture->pic_fields.bits.version | 3-bit field |
| horizontal_scale, vertical_scale | 0, 0 | VAAPI doesn't expose; FFmpeg ref also hardcodes 0 |
| prob_skip_false | picture->prob_skip_false | direct |
| prob_intra | picture->prob_intra | direct |
| prob_last | picture->prob_last | direct |
| prob_gf | picture->prob_gf | direct |
| num_dct_parts | slice->num_of_partitions - 1 | VAAPI's count includes control partition; kernel's excludes (per-spec). Verify against Phase 3 trace. |
| first_part_size | slice->partition_size[0] | control-partition size |
| first_part_header_bits | DERIVED (see below) | not in VAAPI directly |
| dct_part_sizes[0..7] | slice->partition_size[1..8] | shift by 1 to skip control partition |
| last_frame_ts | DPB lookup picture->last_ref_frame | VASurfaceID → object_surface->timestamp → v4l2_timeval_to_ns() (mirror mpeg2.c::pic.forward_ref_ts pattern) |
| golden_frame_ts | DPB lookup picture->golden_ref_frame | same as above |
| alt_frame_ts | DPB lookup picture->alt_ref_frame | same as above |
| flags & KEY_FRAME | picture->pic_fields.bits.key_frame == 0 | VAAPI inverts: VP8 spec says key_frame=0 means key-frame |
| flags & SHOW_FRAME | not in VAAPI | force 1 (mpv only renders shown frames; alt-ref invisible frames are also shown=1 to mpv consumer side; safe to force) |
| flags & MB_NO_SKIP_COEFF | picture->pic_fields.bits.mb_no_coeff_skip | direct |
| flags & SIGN_BIAS_GOLDEN | picture->pic_fields.bits.sign_bias_golden | direct |
| flags & SIGN_BIAS_ALT | picture->pic_fields.bits.sign_bias_alternate | direct |
| flags & EXPERIMENTAL | 0 | VAAPI doesn't expose; FFmpeg uses s->profile & 0x4 which has no VAAPI analog. Leave 0. |
| coder_state.range | picture->bool_coder_ctx.range | direct |
| coder_state.value | picture->bool_coder_ctx.value | direct |
| coder_state.bit_count | picture->bool_coder_ctx.count | VAAPI calls it count |
| lf.sharpness_level | picture->pic_fields.bits.sharpness_level | direct |
| lf.level | picture->loop_filter_level[0] | base level (segment 0); VAAPI exposes per-segment, kernel takes base only |
| lf.ref_frm_delta[0..3] | picture->loop_filter_deltas_ref_frame[0..3] | direct |
| lf.mb_mode_delta[0..3] | picture->loop_filter_deltas_mode[0..3] | direct |
| lf.flags & ADJ_ENABLE | picture->pic_fields.bits.loop_filter_adj_enable | direct |
| lf.flags & DELTA_UPDATE | picture->pic_fields.bits.mode_ref_lf_delta_update | direct |
| lf.flags & FILTER_TYPE_SIMPLE | picture->pic_fields.bits.filter_type | VAAPI: filter_type=0 normal, =1 simple |
| quant.y_ac_qi | iqmatrix->quantization_index[0][0] | segment 0, yac component |
| quant.y_dc_delta | iqmatrix->quantization_index[0][1] - iqmatrix->quantization_index[0][0] | u8 - u8 → s8 (clamp) |
| quant.y2_dc_delta | iqmatrix->quantization_index[0][2] - iqmatrix->quantization_index[0][0] | same |
| quant.y2_ac_delta | iqmatrix->quantization_index[0][3] - iqmatrix->quantization_index[0][0] | same |
| quant.uv_dc_delta | iqmatrix->quantization_index[0][4] - iqmatrix->quantization_index[0][0] | same |
| quant.uv_ac_delta | iqmatrix->quantization_index[0][5] - iqmatrix->quantization_index[0][0] | same |
| segment.quant_update[s] | for s∈[1..3]: iqmatrix->quantization_index[s][0] - iqmatrix->quantization_index[0][0] if segmentation enabled, else 0 | when segmentation_enabled=0 (BBB case), all quant_updates are 0; bypass the per-segment math |
| segment.lf_update[s] | for s∈[1..3]: picture->loop_filter_level[s] - picture->loop_filter_level[0] if segmentation enabled, else 0 | same |
| segment.segment_probs[0..2] | picture->mb_segment_tree_probs[0..2] | direct |
| segment.flags & ENABLED | picture->pic_fields.bits.segmentation_enabled | direct |
| segment.flags & UPDATE_MAP | picture->pic_fields.bits.update_mb_segmentation_map | direct |
| segment.flags & UPDATE_FEATURE_DATA | picture->pic_fields.bits.update_segment_feature_data | direct |
| segment.flags & DELTA_VALUE_MODE | NOT in VAAPI directly | VAAPI doesn't expose abs_delta. Per VP8 spec default, segment values are deltas unless explicitly absolute; the FFmpeg ref sets DELTA_VALUE_MODE iff !s->segmentation.absolute_vals. For BBB (segmentation disabled), this flag's value is irrelevant. Leave 0; document the gap for Phase 5 review. |
| entropy.y_mode_probs[0..3] | picture->y_mode_probs[0..3] | direct |
| entropy.uv_mode_probs[0..2] | picture->uv_mode_probs[0..2] | direct |
| entropy.mv_probs[i][j] | picture->mv_probs[i][j] | direct, [2][19] both sides |
| entropy.coeff_probs[i][j][k][l] | probability->dct_coeff_probs[i][j][k][l] | DIFFERENT BUFFER: sourced from VAProbabilityDataBuffer, not Picture. Direct shape match [4][8][3][11]. |
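
The quantization rows are the least mechanical part of the table, so here is a minimal sketch of the base+delta derivation. Assumptions: the struct and function names are hypothetical stand-ins, deltas are clamped to the int8_t range as the table notes, and quant_update[0] is set to 0 because segment 0 is the base (the table only specifies s in 1..3). Recovering deltas by subtraction is exact only while the effective indices were not clipped on the VAAPI side.

```c
#include <stdint.h>

/* Stand-ins (assumption) for VAIQMatrixBufferVP8 and v4l2_vp8_quantization. */
struct sk_iqmatrix { uint16_t quantization_index[4][6]; };

struct sk_quant {
	uint8_t y_ac_qi;
	int8_t y_dc_delta;
	int8_t y2_dc_delta;
	int8_t y2_ac_delta;
	int8_t uv_dc_delta;
	int8_t uv_ac_delta;
};

static int8_t sk_clamp_s8(int v)
{
	if (v < INT8_MIN)
		return INT8_MIN;
	if (v > INT8_MAX)
		return INT8_MAX;
	return (int8_t)v;
}

/* Segment 0's yac index is the base; every other value is a difference
 * against it. Component order: yac, ydc, y2dc, y2ac, uvdc, uvac. */
static void sk_derive_quant(const struct sk_iqmatrix *iq,
			    int segmentation_enabled,
			    struct sk_quant *quant, int8_t quant_update[4])
{
	const uint16_t *seg0 = iq->quantization_index[0];
	int base = seg0[0];
	int s;

	quant->y_ac_qi = (uint8_t)base;
	quant->y_dc_delta = sk_clamp_s8(seg0[1] - base);
	quant->y2_dc_delta = sk_clamp_s8(seg0[2] - base);
	quant->y2_ac_delta = sk_clamp_s8(seg0[3] - base);
	quant->uv_dc_delta = sk_clamp_s8(seg0[4] - base);
	quant->uv_ac_delta = sk_clamp_s8(seg0[5] - base);

	quant_update[0] = 0;
	for (s = 1; s < 4; s++)
		quant_update[s] = segmentation_enabled
			? sk_clamp_s8(iq->quantization_index[s][0] - base)
			: 0;
}
```

For example, a segment-0 row of {40, 44, 38, 40, 40, 40} gives y_ac_qi = 40, y_dc_delta = 4 and y2_dc_delta = -2; with segmentation disabled every quant_update entry stays 0.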
first_part_header_bits derivation
This field is a kernel-imposed metadata about the bitstream: number of bits consumed by the uncompressed header partition before the boolean coder takes over. FFmpeg derives it from internal parser state:
.first_part_header_bits = (8 * (s->coder_state_at_header_end.input - data) -
s->coder_state_at_header_end.bit_count - 8),
VAAPI does not expose this directly. Open question for Phase 3 baseline: derive from slice->macroblock_offset (bit offset of MB layer from start of slice data) — likely equal or off by a known constant. Phase 3 captures the verbatim payload from ffmpeg-v4l2request and computes the relationship.
If the kernel ignores first_part_header_bits (some drivers do — hantro re-parses), the field can be left zero or approximate. Phase 5 review will flag this as a known fidelity gap.
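For reference, the later second-model review (findings C1 and C2) settled this question differently from the hypothesis above: it reports first_part_header_bits = macroblock_offset and first_part_size = partition_size[0] + ((macroblock_offset + 7) / 8). The sketch below applies those two formulas together with the partition rows of the mapping table. The struct and function names are hypothetical stand-ins, and the range check on num_of_partitions is an addition of this sketch.

```c
#include <stdint.h>

/* Stand-in (assumption) for the VASliceParameterBufferVP8 fields used. */
struct sk_slice {
	uint32_t macroblock_offset;  /* bit offset of the MB layer */
	uint8_t num_of_partitions;   /* includes the control partition */
	uint32_t partition_size[9];  /* [0] = remaining control bytes */
};

/* Stand-in (assumption) for the matching v4l2_ctrl_vp8_frame fields. */
struct sk_parts {
	uint8_t num_dct_parts;
	uint32_t first_part_size;
	uint32_t first_part_header_bits;
	uint32_t dct_part_sizes[8];
};

/* Returns 0 on success, -1 if the partition count is out of range. */
static int sk_map_partitions(const struct sk_slice *slice,
			     struct sk_parts *out)
{
	unsigned int i;

	if (slice->num_of_partitions < 2 || slice->num_of_partitions > 9)
		return -1;

	out->num_dct_parts = (uint8_t)(slice->num_of_partitions - 1);
	out->first_part_header_bits = slice->macroblock_offset;
	/* Total control partition = bytes already parsed (header bits
	 * rounded up to whole bytes) + bytes remaining. */
	out->first_part_size = slice->partition_size[0] +
			       (slice->macroblock_offset + 7) / 8;

	for (i = 0; i < 8; i++)
		out->dct_part_sizes[i] = i < out->num_dct_parts
			? slice->partition_size[i + 1]
			: 0;
	return 0;
}
```

With partition_size[0] = 21923 and a macroblock_offset that rounds up to 819 bytes (6552 bits is used in the test as an illustrative value, not a captured one), first_part_size comes out as 22742, matching the keyframe figure quoted in the review.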
Patch shape prediction
| Site | Action | LOC delta |
|---|---|---|
| src/config.c:121-160 | INSERT VP8 enumeration block (~10 lines) | +10 |
| src/config.c:54-78 | INSERT case label + break + comment (~5 lines) | +5 |
| src/config.c:167-191 | INSERT case label (~1 line) | +1 |
| src/vp8.c | NEW FILE | +160-220 |
| src/vp8.h | NEW FILE | +35-45 |
| src/picture.c:34-36 | INSERT #include "vp8.h" | +1 |
| src/picture.c:188-225 | INSERT VP8 dispatch case (~6 lines) | +6 |
| src/picture.c:54-186 | INSERT 4 nested cases + 1 outer case | +30-40 |
| src/picture.c:299-300 | INSERT 2 reset lines | +2 |
| src/surface.h:92-113 | INSERT vp8 struct (~8 lines) | +8 |
| src/meson.build:50,73 | INSERT 2 entries | +2 |
Total: ~260-340 LOC across 6 modified files + 2 new files. Compared to iter1 (~120 LOC, 4 modified + 0 new + 1 deleted) and iter2 (~470 LOC, 5 modified + 0 new + 0 deleted), iter3 is medium-sized — the new file dominates. The dispatcher additions in picture.c + config.c are mechanical ports of iter1+iter2 patterns.
Open questions for Phase 3 baseline
The Phase 3 baseline run will capture verbatim S_EXT_CTRLS payloads from ffmpeg -hwaccel v4l2request bbb_720p10s_vp8.webm (cross-validator anchor). Questions to answer empirically before Phase 4 plan locks:
- first_part_header_bits exact value: capture for frame 1 (key) and frame 2 (inter). Compare against slice->macroblock_offset from a parallel vainfo --vbo-equivalent capture.
- num_dct_parts vs num_of_partitions: confirm off-by-one (kernel excludes, VAAPI includes control partition). Verify dct_part_sizes[] indexing.
- DPB timestamp lookup: confirm v4l2_timeval_to_ns(picture->last_ref_frame's surface_object->timestamp) matches what the kernel hantro driver reads. Any 0-sentinel for missing refs? (FFmpeg leaves zero for missing refs by C99 designated init.)
- show_frame handling: VAAPI doesn't expose. Force 1 vs derive: which matches the kernel's expectation? (BBB has no alt-ref invisible frames; both options should work for the binding cell, but verify.)
- lf.flags FILTER_TYPE_SIMPLE bit: VAAPI's filter_type=1 means simple. Confirm against bitstream baseline.
- First-frame DPB sentinel: when picture->last_ref_frame == VA_INVALID_SURFACE, what does FFmpeg ref's last_frame_ts end up as? (Likely 0; verify.)
These answers feed Phase 4 plan clauses. None are blocking — all have safe defaults that work for the BBB binding cell.
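The DPB timestamp and first-frame sentinel questions can be made concrete with a small sketch. Assumptions: sk_surface and sk_ref_ts are hypothetical names, the lookup is a linear scan rather than the fork's object heap, and returning 0 for an invalid or unknown surface is the "likely 0" default from the list above, not a verified behaviour. The conversion uses the same arithmetic as the UAPI helper v4l2_timeval_to_ns().

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Stand-in for VA_INVALID_SURFACE (0xffffffff in <va/va.h>). */
#define SK_INVALID_SURFACE 0xffffffffu

/* Stand-in (assumption) for the fork's object_surface. */
struct sk_surface {
	uint32_t id;
	struct timeval timestamp;
};

/* Seconds and microseconds to nanoseconds. */
static uint64_t sk_timeval_to_ns(const struct timeval *tv)
{
	return (uint64_t)tv->tv_sec * 1000000000ULL +
	       (uint64_t)tv->tv_usec * 1000ULL;
}

/* Resolve a reference surface ID to the timestamp the kernel matches
 * against its buffer queue; 0 is the assumed sentinel for "no ref". */
static uint64_t sk_ref_ts(const struct sk_surface *dpb, size_t count,
			  uint32_t ref_id)
{
	size_t i;

	if (ref_id == SK_INVALID_SURFACE)
		return 0;
	for (i = 0; i < count; i++) {
		if (dpb[i].id == ref_id)
			return sk_timeval_to_ns(&dpb[i].timestamp);
	}
	return 0;
}
```

The same helper would serve last_frame_ts, golden_frame_ts and alt_frame_ts, one call per reference ID in the picture parameters.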
Phase 3 baseline targets (work plan)
To answer the open questions above, Phase 3 will run on fresnel (when reachable):
- Cross-validator capture: strace -ff -tt -y -v -e trace=ioctl ffmpeg -hwaccel v4l2request -i ~/fourier-test/bbb_720p10s_vp8.webm -frames:v 5 -f null - 2>strace.log with hantro-vpu-dec env vars. Extract S_EXT_CTRLS payload bytes for VP8_FRAME control across frames 1 (key) and 2 (inter).
- VAAPI-side trace: LIBVA_TRACE=/tmp/vp8_libva.trace mpv --hwdec=no --vo=null --frames=2 ~/fourier-test/bbb_720p10s_vp8.webm to confirm VAAPI consumer chain (mpv's parser produces VAPictureParameterBufferVP8 + slice + iqmatrix + probability buffers).
- Cache-safe verify path baseline: mpv --hwdec=no --vo=image --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm and capture frame-0001.jpg + frame-0002.jpg SHA256s (SW reference for criterion 4 byte-compare in Phase 7).
Phase 4 plan structure (anticipated)
Following iter2's 10-clause plan template:
- Clause 1: device-init batched submission contract (VP8 has none — clause is empty / N/A)
- Clause 2: per-frame batched submission shape (count=1, VP8_FRAME control)
- Clause 3: VAAPI → V4L2 mapping table (the table above, normalized to plan-prose form)
- Clause 4: DPB timestamp resolution
- Clause 5: quantization base+delta derivation from VAAPI's denormalized matrix
- Clause 6: probability table mapping (separate buffer source)
- Clause 7: BeginPicture per-frame reset (iqmatrix_set, probability_set)
- Clause 8: surface union extension
- Clause 9: enumeration + dispatch wiring (config.c + picture.c)
- Clause 10: meson + new file integration
The plan will cite verbatim Phase 3 baseline payload bytes for fields where the mapping is non-obvious (quant deltas, first_part_header_bits) per feedback_dev_process.md Phase 6 contract-before-code.
Substrate state at Phase 2 close
- iter3 Phase 1 commit ea2413e pushed to gitea (campaign repo).
- Fork on noether at iter2 tip 8d71e20 (synced via git fetch origin && git merge --ff-only origin/master from previous commit 229d6d1).
- Fresnel.vpn unreachable at Phase 2 read time; Phase 3 baseline + Phase 6 builds need the laptop online. Memory rule: don't offer pause prompts; will wait for fresnel to come back online OR the user to wake it before Phase 3.
- All 5 memory entries still apply: gitea-as-claude-noether, no-session-termination-attempts, header-deletion-check, review-empirical-over-theoretical (BOTH directions), rockchip-pixel-verify-path.
- Phase 3 baseline questions queued (6 items above).