marfrit 56abe3d6a2 iter4 Phase 3: VP9 baseline + 4-codec regression on 7.0 substrate

Captured on linux-fresnel-fourier 7.0-1 (post 6.19 decommission).

VP9 baseline (kernel-direct via ffmpeg-v4l2request on rkvdec):
- 5-frame SW reference PNG SHA256 anchors (criterion-4)
- VIDIOC_S_EXT_CTRLS strace with full payload at -s 16384
- Empirical struct sizes 168 B (FRAME) / 2040 B (COMPRESSED_HDR)
  supersede Phase 2 estimates of 144 / 1947
- Probe pattern: count=1 (FRAME-only) then count=2 (FRAME + COMPRESSED_HDR)

Phase 2 doc fix: control IDs corrected 0xa40b2c/d -> 0xa40a2c/d.

4-codec regression (H.264, MPEG-2, HEVC, VP8): all fall back to SW on
default config because /dev/video0 is now rockchip-rga (RGB color
converter), not a codec device. Fork hardcodes /dev/video0 in
request.c:149. Env override LIBVA_V4L2_REQUEST_VIDEO_PATH /
_MEDIA_PATH restores per-driver profile enumeration; mitigation A/B/C
queued for user decision.

New contract clauses surfaced:
- Clause 11: uncompressed-header partial parse for lf_delta /
  base_q_idx (VAAPI doesn't expose these; keyframe ref_deltas non-zero
  for BBB so leave-at-zero is wrong)
- Clause 12: compile-time sizeof asserts on the two control structs
  so future UAPI shifts fail loudly

iter4_phase3.tgz: full Phase 3 artifact bundle (strace + PNG refs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-09 20:31:53 +00:00

Iteration 4 — Phase 2 (situation analysis)

Source-read of every file the iter4 patch series will touch, plus the kernel UAPI + VAAPI + downstream FFmpeg + kernel rkvdec reference sources. Conducted on noether against fork tip e1aca9c (iter3 close).

This is a contract-before-code analysis per feedback_dev_process.md Phase 2: enumerate the bugs, cite the contract verbatim, predict the patch shape, queue the Phase 3 baseline questions.

Critical finding: rkvdec requires VP9_COMPRESSED_HDR

The biggest scope-shaping discovery: on RK3399, rkvdec requires V4L2_CID_STATELESS_VP9_COMPRESSED_HDR; the control is not optional. From drivers/staging/media/rkvdec/rkvdec-vp9.c::rkvdec_vp9_run_preamble lines 740-754:

ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_FRAME);
if (WARN_ON(!ctrl))
    return -EINVAL;
dec_params = ctrl->p_cur.p;
...
ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
if (WARN_ON(!ctrl))
    return -EINVAL;       /* ← rkvdec WILL fail without compressed-header probs */
prob_updates = ctrl->p_cur.p;
vp9_ctx->cur.tx_mode = prob_updates->tx_mode;
...
v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, dec_params);

VAAPI does NOT expose compressed-header probability updates (per va_dec_vp9.h:50-192 — only frame parameters + segmentation, no probability deltas; vendor VAAPI drivers parse compressed header in firmware/GPU). So the libva backend must parse the compressed header itself via a VPX boolean decoder.

This makes iter4's scope significantly larger than iter3's VP8 work.

Bug enumeration (sites the iter4 patch series must touch)

B1 — `src/config.c::RequestQueryConfigProfiles` — VP9 enumeration block missing

Site: config.c:121-160.

Bug: no analogous block for V4L2_PIX_FMT_VP9_FRAME → VAProfileVP9Profile0. Same starting condition as iter3 VP8.

Patch shape: ADD enumeration block after iter3's VP8 block. ~10 LOC.

B2 — `src/config.c::RequestCreateConfig` — VP9 case label missing

Site: config.c:54-78.

Bug: no case VAProfileVP9Profile0:. Mirror iter3 VP8 pattern. ~5 LOC.

B3 — `src/config.c::RequestQueryConfigEntrypoints` — VP9 case missing

Site: config.c:167-191.

Bug: missing in fall-through case list. ~1 LOC.

B4 — `src/vp9.c` — file does not exist; needs net-new implementation

Site: NEW FILE src/vp9.c.

Patch shape: NEW file, ~500-600 LOC (substantially larger than iter3 vp8.c due to compressed-header parser):

Includes block
Static inv_map_table[255] — direct copy from FFmpeg v4l2_request_vp9.c:43-64
VPX range coder helpers (port from FFmpeg vp89_rac.h + boolean decoder primitives) — ~80 LOC
vp9_fill_frame() — fill v4l2_ctrl_vp9_frame from VAAPI VADecPictureParameterBufferVP9 + VASliceParameterBufferVP9 — ~150 LOC
vp9_fill_compressed_hdr() — parse compressed header bits from surface_object->source_data + uncompressed_header_size, populate v4l2_ctrl_vp9_compressed_hdr — ~180 LOC (port from FFmpeg fill_compressed_hdr lines 99-261)
vp9_set_controls() — entry point, allocates both structs, calls vp9_fill_frame + vp9_fill_compressed_hdr, batched 2-element v4l2_ext_control array, single v4l2_set_controls call

B5 — `src/vp9.h` — header does not exist

Site: NEW FILE src/vp9.h.

Patch shape: declare vp9_set_controls(). Mirror iter3 vp8.h.

B6 — Possibly `src/vp9_rac.h` — VPX range decoder helpers (decision point)

Site: NEW FILE candidate src/vp9_rac.h.

VP9 boolean decoder primitives (vpx_rac_get_prob_branchy, vp89_rac_get, vp89_rac_get_uint, init function) are needed by vp9_fill_compressed_hdr. Two design options:

Option A: inline the ~80 LOC of decoder helpers directly in vp9.c. Simpler; one file. Recommended for first cut.
Option B: separate vp9_rac.h/vp9_rac.c. Mirrors FFmpeg's vp89_rac.h upstream pattern. More files, easier reuse if AV1/VP10 work follows.

Phase 4 plan locks Option A unless Phase 5 review surfaces a reason for Option B.
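
To give a feel for what the inlined helpers in Option A amount to, here is a minimal sketch of a VP9 boolean decoder. It is written from the bit-level description in the VP9 bitstream specification, not ported from FFmpeg's optimized vp89_rac.h. The names demo_bool_dec, demo_bool_init, demo_bool_read and demo_bool_read_uint are hypothetical and are neither FFmpeg's nor the fork's API. A real port would also check the leading marker bit and keep FFmpeg's end-of-buffer behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state; not FFmpeg's VPXRangeCoder. */
struct demo_bool_dec {
    const uint8_t *buf;
    size_t size;
    size_t bitpos;   /* next unread bit, MSB first */
    uint32_t value;
    uint32_t range;
};

/* Read one raw bit; reads past the end of the buffer yield 0. */
static inline unsigned demo_next_bit(struct demo_bool_dec *d)
{
    size_t byte = d->bitpos >> 3;
    unsigned bit;

    if (byte >= d->size)
        return 0;
    bit = (d->buf[byte] >> (7 - (d->bitpos & 7))) & 1;
    d->bitpos++;
    return bit;
}

/* Prime the decoder with the first 8 bits and the full range. */
static inline void demo_bool_init(struct demo_bool_dec *d,
                                  const uint8_t *buf, size_t size)
{
    int i;

    d->buf = buf;
    d->size = size;
    d->bitpos = 0;
    d->value = 0;
    for (i = 0; i < 8; i++)
        d->value = (d->value << 1) | demo_next_bit(d);
    d->range = 255;
}

/* Decode one boolean whose probability of being 0 is prob/256. */
static inline unsigned demo_bool_read(struct demo_bool_dec *d, uint8_t prob)
{
    uint32_t split = 1 + (((d->range - 1) * prob) >> 8);
    unsigned bit;

    if (d->value < split) {
        d->range = split;
        bit = 0;
    } else {
        d->range -= split;
        d->value -= split;
        bit = 1;
    }
    /* Renormalize so that range stays in [128, 255]. */
    while (d->range < 128) {
        d->range <<= 1;
        d->value = (d->value << 1) | demo_next_bit(d);
    }
    return bit;
}

/* Read an n-bit literal, MSB first, as equiprobable booleans. */
static inline unsigned demo_bool_read_uint(struct demo_bool_dec *d, int bits)
{
    unsigned v = 0;

    while (bits-- > 0)
        v = (v << 1) | demo_bool_read(d, 128);
    return v;
}
```

The bit-at-a-time reader trades speed for readability; FFmpeg's helpers refill in larger chunks, which is one reason the real port is estimated at ~80 LOC.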

B7 — `src/picture.c::codec_set_controls` — VP9 dispatch case missing

Site: picture.c:188-225.

Patch shape: ADD case VAProfileVP9Profile0: calling vp9_set_controls. ~6 LOC.

B8 — `src/picture.c::codec_store_buffer` — 2 VAAPI buffer types unmapped

VAAPI VP9 sends only two parameter buffer types per frame, plus the slice data (per va_dec_vp9.h:58-303):

| VAAPI buffer type | VAAPI struct | Per-frame |
| --- | --- | --- |
| `VAPictureParameterBufferType` | `VADecPictureParameterBufferVP9` | once |
| `VASliceParameterBufferType` | `VASliceParameterBufferVP9` (with `seg_param[8]`) | once |
| `VASliceDataBufferType` | raw bitstream | once |

Different from iter3 VP8: no VAProbabilityBufferType (VP9 keeps probability state in the picture/slice params + parsed compressed header), no VAIQMatrixBufferType (VP9 keeps quantization in the slice's per-segment seg_param array). Just 2 parameter-buffer cases vs VP8's 4.

Patch shape: 2 nested case adds in codec_store_buffer outer switch + inner profile dispatch. ~14 LOC total.

B9 — `src/picture.c::RequestBeginPicture` — per-frame VP9 reset

Site: picture.c:299-302.

Bug: VP9 doesn't have an iqmatrix_set / probability_set flag pattern; the picture/slice params are unconditionally fully-populated by VAAPI consumer per frame. Possibly NO reset needed (analogous to MPEG-2's iqmatrix-only pattern but even simpler).

Patch shape: likely no edit. If Phase 5 review reveals a hidden state-leak risk (e.g., VAAPI reusing the surface for a new context with stale params), add reset for params.vp9.<some-flag>. Default plan: no reset added; revisit if Phase 7 byte-compare shows stale state.

B10 — `src/surface.h::object_surface::params` union — no `vp9` member

Site: surface.h:92-119.

Patch shape: ADD vp9 struct after vp8:

struct {
    VADecPictureParameterBufferVP9 picture;
    VASliceParameterBufferVP9 slice;
} vp9;

VASliceParameterBufferVP9 is large (~340 bytes — seg_param[8] × ~40 bytes each); VADecPictureParameterBufferVP9 ~80 bytes. Union grows by ~420 bytes from this; still dominated by params.h265 with its 64-slot slices[64] array (~17 KB).

B11 — `src/meson.build` — `vp9.c` + `vp9.h` not in lists

Site: meson.build:30-74.

Patch shape: insert 'vp9.c' after 'vp8.c' in sources, insert 'vp9.h' after 'vp8.h' in headers. +2 lines.

B12 — `src/buffer.c` — buffer-type allow-list (predicted no change needed)

Site: buffer.c:59-70.

VP9 uses VAPictureParameterBufferType + VASliceParameterBufferType + VASliceDataBufferType — all three already in the allow-list (used by H.264 + iter3 VP8). Predicted no change needed.

Per memory feedback_runtime_enumerates_allowlists.md: plan for fix-forward Commit D if a runtime miss surfaces (would be unexpected for VP9 given the buffer types are H.264-shape; but the iter3 lesson is "don't audit exhaustively — let runtime enumerate").

Non-bugs (intentionally NOT touched)

src/context.c — no DECODE_MODE/START_CODE menus for VP9 (per FFmpeg V4L2 ref v4l2_request_vp9.c:487-503: v4l2_request_vp9_init doesn't issue any device-wide menu sets; per-frame batch only). No context.c changes.
src/video.c::formats[] — CAPTURE-side format list (NV12); VP9 is OUTPUT-side fourcc, probed via v4l2_find_format() in config.c. No video.c changes.
src/v4l2.c — fourcc-agnostic helpers. No v4l2.c changes.
include/hevc-ctrls.h — already includes <linux/v4l2-controls.h> which holds VP9 control IDs.

Contract surface (verbatim)

Kernel UAPI: `V4L2_CID_STATELESS_VP9_FRAME` (`<linux/v4l2-controls.h>:2696`)

#define V4L2_CID_STATELESS_VP9_FRAME        (V4L2_CID_CODEC_STATELESS_BASE + 300)
                                            /* = 0xa40a2c */

struct v4l2_ctrl_vp9_frame {
    struct v4l2_vp9_loop_filter lf;        /* 16 bytes; ref_deltas[4] + mode_deltas[2]
                                              + level + sharpness + flags + reserved[7] */
    struct v4l2_vp9_quantization quant;    /* 8 bytes; base_q_idx + 3 deltas + reserved[4] */
    struct v4l2_vp9_segmentation seg;      /* 88 bytes; feature_data[8][4] (s16) + feature_enabled[8]
                                              + tree_probs[7] + pred_probs[3] + flags + reserved[5] */
    __u32 flags;                            /* 10 V4L2_VP9_FRAME_FLAG_* bits per
                                              <linux/v4l2-controls.h>:2665-2674 */
    __u16 compressed_header_size;
    __u16 uncompressed_header_size;
    __u16 frame_width_minus_1;
    __u16 frame_height_minus_1;
    __u16 render_width_minus_1;
    __u16 render_height_minus_1;
    __u64 last_frame_ts;                    /* per-VASurfaceID timestamp lookup */
    __u64 golden_frame_ts;
    __u64 alt_frame_ts;
    __u8 ref_frame_sign_bias;               /* OR of V4L2_VP9_SIGN_BIAS_{LAST,GOLDEN,ALT} */
    __u8 reset_frame_context;               /* V4L2_VP9_RESET_FRAME_CTX_* (0..2) */
    __u8 frame_context_idx;
    __u8 profile;
    __u8 bit_depth;
    __u8 interpolation_filter;
    __u8 tile_cols_log2;
    __u8 tile_rows_log2;
    __u8 reference_mode;
    __u8 reserved[7];
};

Total size: 168 bytes (16 + 8 + 88 + 4 + 12 + 24 + 9 + 7), vs iter3 VP8's 1232 bytes; much smaller because VP9_FRAME carries no entropy table (that's in COMPRESSED_HDR).
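
The layout arithmetic can be checked at compile time, in the spirit of the sizeof asserts mentioned in the commit header. The sketch below uses local mirror structs so that it builds without kernel headers. The mirror_* names are hypothetical, and the member types (s8 deltas, s16 feature_data) are my reading of the UAPI header, not something quoted in this document. A real build would assert on the kernel types from <linux/v4l2-controls.h> instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the UAPI layout; member types are assumptions. */
struct mirror_vp9_loop_filter {
    int8_t ref_deltas[4];
    int8_t mode_deltas[2];
    uint8_t level;
    uint8_t sharpness;
    uint8_t flags;
    uint8_t reserved[7];
};

struct mirror_vp9_quantization {
    uint8_t base_q_idx;
    int8_t delta_q_y_dc;
    int8_t delta_q_uv_dc;
    int8_t delta_q_uv_ac;
    uint8_t reserved[4];
};

struct mirror_vp9_segmentation {
    int16_t feature_data[8][4];   /* 64 bytes */
    uint8_t feature_enabled[8];
    uint8_t tree_probs[7];
    uint8_t pred_probs[3];
    uint8_t flags;
    uint8_t reserved[5];
};

struct mirror_vp9_frame {
    struct mirror_vp9_loop_filter lf;
    struct mirror_vp9_quantization quant;
    struct mirror_vp9_segmentation seg;
    uint32_t flags;
    uint16_t compressed_header_size;
    uint16_t uncompressed_header_size;
    uint16_t frame_width_minus_1;
    uint16_t frame_height_minus_1;
    uint16_t render_width_minus_1;
    uint16_t render_height_minus_1;
    uint64_t last_frame_ts;
    uint64_t golden_frame_ts;
    uint64_t alt_frame_ts;
    uint8_t ref_frame_sign_bias;
    uint8_t reset_frame_context;
    uint8_t frame_context_idx;
    uint8_t profile;
    uint8_t bit_depth;
    uint8_t interpolation_filter;
    uint8_t tile_cols_log2;
    uint8_t tile_rows_log2;
    uint8_t reference_mode;
    uint8_t reserved[7];
};

/* A layout shift in any member fails the build instead of the decode. */
_Static_assert(sizeof(struct mirror_vp9_loop_filter) == 16, "lf size");
_Static_assert(sizeof(struct mirror_vp9_quantization) == 8, "quant size");
_Static_assert(sizeof(struct mirror_vp9_segmentation) == 88, "seg size");
_Static_assert(offsetof(struct mirror_vp9_frame, last_frame_ts) == 128,
               "timestamps start on an 8-byte boundary");
_Static_assert(sizeof(struct mirror_vp9_frame) == 168, "frame size");
```

The 168-byte result matches the empirical FRAME payload size recorded in the commit header.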

Kernel UAPI: `V4L2_CID_STATELESS_VP9_COMPRESSED_HDR` (`<linux/v4l2-controls.h>:2797`)

#define V4L2_CID_STATELESS_VP9_COMPRESSED_HDR  (V4L2_CID_CODEC_STATELESS_BASE + 301)
                                              /* = 0xa40a2d */

struct v4l2_ctrl_vp9_compressed_hdr {
    __u8 tx_mode;                          /* V4L2_VP9_TX_MODE_* (0..4) */
    __u8 tx8[2][1];
    __u8 tx16[2][2];
    __u8 tx32[2][3];
    __u8 coef[4][2][2][6][6][3];           /* HUGE: 1728 bytes */
    __u8 skip[3];
    __u8 inter_mode[7][3];
    __u8 interp_filter[4][2];
    __u8 is_inter[4];
    __u8 comp_mode[5];
    __u8 single_ref[5][2];
    __u8 comp_ref[5];
    __u8 y_mode[4][9];
    __u8 uv_mode[10][9];
    __u8 partition[16][3];
    struct v4l2_vp9_mv_probs mv;           /* 69 bytes; joint/sign/classes/class0_bit/bits/etc */
};

Total size: 2040 bytes. Filled by parsing the compressed header bits via VPX boolean decoder + inv_map_table[] (per FFmpeg v4l2_request_vp9.c:99-261).
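
The same compile-time check works for the compressed-header control. Every member is a byte array, so the size is simply the sum of the array extents. The mirror_* names are hypothetical, and the dimensions of the mv sub-struct are my reading of the UAPI header, since the listing above only names its members.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror; the mv array dimensions are an assumption. */
struct mirror_vp9_mv_probs {
    uint8_t joint[3];
    uint8_t sign[2];
    uint8_t classes[2][10];
    uint8_t class0_bit[2];
    uint8_t bits[2][10];
    uint8_t class0_fr[2][2][3];
    uint8_t fr[2][3];
    uint8_t class0_hp[2];
    uint8_t hp[2];
};

struct mirror_vp9_compressed_hdr {
    uint8_t tx_mode;
    uint8_t tx8[2][1];
    uint8_t tx16[2][2];
    uint8_t tx32[2][3];
    uint8_t coef[4][2][2][6][6][3];   /* 1728 bytes */
    uint8_t skip[3];
    uint8_t inter_mode[7][3];
    uint8_t interp_filter[4][2];
    uint8_t is_inter[4];
    uint8_t comp_mode[5];
    uint8_t single_ref[5][2];
    uint8_t comp_ref[5];
    uint8_t y_mode[4][9];
    uint8_t uv_mode[10][9];
    uint8_t partition[16][3];
    struct mirror_vp9_mv_probs mv;
};

_Static_assert(sizeof(struct mirror_vp9_mv_probs) == 69, "mv probs size");
_Static_assert(sizeof(struct mirror_vp9_compressed_hdr) == 2040,
               "compressed hdr size");
```

The 2040-byte total agrees with the empirical COMPRESSED_HDR payload size recorded in the commit header.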

The kernel uses these as PROBABILITY UPDATES (not absolutes): a value of zero in any array element means "no update — keep prior probability." The kernel runs v4l2_vp9_fw_update_probs(&probability_tables, prob_updates, dec_params) to apply updates per rkvdec-vp9.c:796.
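
The "zero means keep" rule can be illustrated with the probability remap from the VP9 specification (inv_remap_prob, applied to a delta that has already been through inv_map_table[]). This is a sketch written from the specification; the demo_* names are hypothetical, and that the kernel helper matches it line for line is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Spec helper: fold a non-negative delta v around the centre m. */
static inline int demo_inv_recenter_nonneg(int v, int m)
{
    if (v > 2 * m)
        return v;
    if (v & 1)
        return m - ((v + 1) >> 1);
    return m + (v >> 1);
}

/*
 * Apply one update. delta == 0 means "no update": the prior
 * probability is returned unchanged.
 */
static inline uint8_t demo_apply_prob_update(uint8_t delta, uint8_t prob)
{
    if (!delta)
        return prob;
    if (prob <= 128)
        return (uint8_t)(1 + demo_inv_recenter_nonneg(delta, prob - 1));
    return (uint8_t)(255 - demo_inv_recenter_nonneg(delta, 255 - prob));
}
```

This is why the backend must zero-initialize the struct before parsing: any element the parser never writes has to read as "no update".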

VAAPI buffer types

VADecPictureParameterBufferVP9 (va_dec_vp9.h:58-192):

frame_width, frame_height (u16)
reference_frames[8] — 8-entry DPB (vs VP8's 3)
pic_fields.bits.{...} — 27 single-bit/multi-bit fields (subsampling_x/y, frame_type, show_frame, error_resilient_mode, intra_only, allow_high_precision_mv, mcomp_filter_type[3 bits], frame_parallel_decoding_mode, reset_frame_context[2 bits], refresh_frame_context, frame_context_idx[2 bits], segmentation_*, last/golden/alt_ref_frame[3 bits each, indexes into reference_frames[8]], *_sign_bias, lossless_flag)
filter_level, sharpness_level (u8)
log2_tile_rows, log2_tile_columns (u8)
frame_header_length_in_bytes — uncompressed_header_size (u8 — note 8-bit width may overflow for super-frames; typical < 256 for BBB)
first_partition_size — compressed_header_size (u16)
mb_segment_tree_probs[7], segment_pred_probs[3] (u8)
profile, bit_depth (u8)

VASliceParameterBufferVP9 (va_dec_vp9.h:279-303):

slice_data_size, slice_data_offset, slice_data_flag (u32)
seg_param[8] — array of VASegmentParameterVP9 (~40 bytes each):
- segment_flags.fields.{segment_reference_enabled, segment_reference[2 bits], segment_reference_skipped} (u16 packed)
- filter_level[4][2] (u8) — per-ref-frame × per-mode loop filter levels
- luma_ac_quant_scale, luma_dc_quant_scale, chroma_ac_quant_scale, chroma_dc_quant_scale (s16) — already-computed effective scale per segment

FFmpeg V4L2 reference (`v4l2_request_vp9.c`)

Submission shape: 2 batched controls per frame in single S_EXT_CTRLS:

control[0] = { .id = V4L2_CID_STATELESS_VP9_FRAME, ... };
control[1] = { .id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, ... };
v4l2_set_controls(..., control, 2);

The COMPRESSED_HDR control is conditionally-included based on a runtime probe (v4l2_request_vp9_post_frames_ctx queries the kernel; if the control isn't advertised, falls back to FRAME-only). For rkvdec on RK3399, the kernel advertises COMPRESSED_HDR — verified at rkvdec-vp9.c:752 (kernel WILL EINVAL if not provided).

Kernel rkvdec driver (`rkvdec-vp9.c`)

Key reads in rkvdec_vp9_run_preamble:

VP9_FRAME control → dec_params = ctrl->p_cur.p → drives register programming via config_registers().
VP9_COMPRESSED_HDR control → prob_updates = ctrl->p_cur.p → applied via v4l2_vp9_fw_update_probs().
Reference frames resolved from FRAME's last_frame_ts/golden_frame_ts/alt_frame_ts: only 3 references are active at a time, even though VAAPI exposes an 8-entry DPB. The last/golden/alt indexes select 3 of the picture's 8 reference_frames[] entries, and the kernel matches each one by timestamp.
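
The reference lookup can be sketched as below. struct demo_surface stands in for the fork's surface object after the VASurfaceID lookup, and both helper names are hypothetical. Returning 0 for an empty slot is an assumption about what the backend should do, not observed behaviour. The conversion mirrors what v4l2_timeval_to_ns() computes.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for the surface object; only the timestamp matters. */
struct demo_surface {
    struct timeval timestamp;
};

/* Seconds and microseconds to nanoseconds. */
static inline uint64_t demo_timeval_to_ns(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/*
 * Resolve one of last/golden/alt: idx is the 3-bit index from
 * pic_fields, dpb holds the 8 already-resolved surfaces.
 */
static inline uint64_t demo_ref_ts(const struct demo_surface *dpb[8],
                                   unsigned int idx)
{
    if (idx >= 8 || !dpb[idx])
        return 0;   /* assumption: missing reference maps to 0 */
    return demo_timeval_to_ns(&dpb[idx]->timestamp);
}
```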

Mapping table (VAAPI → V4L2 / kernel)

The libva backend's job: read VAAPI's per-frame buffers (Picture + Slice) AND parse the compressed header from the bitstream, write the kernel's two structs.

`v4l2_ctrl_vp9_frame` mapping

| Kernel field | VAAPI source | Notes |
| --- | --- | --- |
| `lf.ref_deltas[4]` | NOT in VAAPI | VAAPI doesn't expose loop-filter ref deltas separately; FFmpeg's V4L2 ref reads from VP9Context internal state. Open question Phase 3: are these zero in the BBB fixture? |
| `lf.mode_deltas[2]` | NOT in VAAPI | same |
| `lf.level` | `picture->filter_level` | direct |
| `lf.sharpness` | `picture->sharpness_level` | direct |
| `lf.flags` | NOT in VAAPI | DELTA_ENABLED + DELTA_UPDATE bits; same |
| `quant.base_q_idx` | DERIVED (no direct VAAPI exposure) | Open question Phase 3: VAAPI exposes per-segment `seg_param[s].luma_ac_quant_scale` but those are EFFECTIVE Q-scales, not the base index. Inverse-derive from `seg_param[0].luma_ac_quant_scale` via VP9 spec quantization table? Or leave zero and let kernel use default? |
| `quant.delta_q_y_dc/uv_dc/uv_ac` | NOT in VAAPI | same: VAAPI only exposes effective per-segment scales |
| `seg.feature_data[8][4]` | DERIVED from `slice->seg_param[s].filter_level[][]` + quant scales | mapping non-trivial |
| `seg.feature_enabled[8]` | derived from `slice->seg_param[s].segment_flags` + segmentation enabled bits | non-trivial |
| `seg.tree_probs[7]` | `picture->mb_segment_tree_probs[7]` | direct |
| `seg.pred_probs[3]` | `picture->segment_pred_probs[3]` | direct |
| `seg.flags` | from `pic_fields.bits.{segmentation_enabled, segmentation_update_map, segmentation_temporal_update}` + derived segmentation_update_data + absolute_or_delta | mostly direct |
| `flags & KEY_FRAME` | `!pic_fields.bits.frame_type` | VAAPI inverts: frame_type=0 means keyframe |
| `flags & SHOW_FRAME` | `pic_fields.bits.show_frame` | direct |
| `flags & ERROR_RESILIENT` | `pic_fields.bits.error_resilient_mode` | direct |
| `flags & INTRA_ONLY` | `pic_fields.bits.intra_only` | direct |
| `flags & ALLOW_HIGH_PREC_MV` | `pic_fields.bits.allow_high_precision_mv` | direct |
| `flags & REFRESH_FRAME_CTX` | `pic_fields.bits.refresh_frame_context` | direct |
| `flags & PARALLEL_DEC_MODE` | `pic_fields.bits.frame_parallel_decoding_mode` | direct |
| `flags & X/Y_SUBSAMPLING` | `pic_fields.bits.subsampling_x/y` | direct |
| `flags & COLOR_RANGE_FULL_SWING` | NOT in VAAPI | leave 0 for BT.709 limited (BBB) |
| `compressed_header_size` | `picture->first_partition_size` | direct (VAAPI mis-named per its own comment) |
| `uncompressed_header_size` | `picture->frame_header_length_in_bytes` | direct |
| `frame_width_minus_1` | `picture->frame_width - 1` | direct |
| `frame_height_minus_1` | `picture->frame_height - 1` | direct |
| `render_width_minus_1`, `render_height_minus_1` | NOT in VAAPI | leave equal to frame_width-1 / frame_height-1 (no scaling for BBB) |
| `last_frame_ts` | DPB lookup `picture->reference_frames[picture->pic_fields.bits.last_ref_frame]` → `surface_object->timestamp` → `v4l2_timeval_to_ns()` | uses `last_ref_frame` index into 8-entry DPB |
| `golden_frame_ts` | DPB lookup `picture->reference_frames[picture->pic_fields.bits.golden_ref_frame]` | same |
| `alt_frame_ts` | DPB lookup `picture->reference_frames[picture->pic_fields.bits.alt_ref_frame]` | same |
| `ref_frame_sign_bias` | OR of `pic_fields.bits.{last,golden,alt}_ref_frame_sign_bias` mapped to `V4L2_VP9_SIGN_BIAS_{LAST,GOLDEN,ALT}` | direct |
| `reset_frame_context` | `pic_fields.bits.reset_frame_context` (with FFmpeg's `x > 0 ? x - 1 : 0` adjustment per ref) | mapping needs inspection |
| `frame_context_idx` | `pic_fields.bits.frame_context_idx` | direct |
| `profile` | `picture->profile` | direct |
| `bit_depth` | `picture->bit_depth` | direct |
| `interpolation_filter` | `pic_fields.bits.mcomp_filter_type` (with FFmpeg's `^ (filtermode <= 1)` adjustment; see ref) | mapping needs inspection |
| `tile_cols_log2`, `tile_rows_log2` | `picture->log2_tile_columns`, `log2_tile_rows` | direct |
| `reference_mode` | NOT in VAAPI | derive from heuristic OR leave default `V4L2_VP9_REFERENCE_MODE_SELECT`; Phase 3 baseline answers |
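
The "direct" flag rows of the table reduce to a handful of bit tests. In the sketch below, struct demo_vp9_pic_bits is a hypothetical stand-in for the relevant pic_fields.bits members (the real VAAPI bitfield has more members and a different layout). The DEMO_* constants mirror the V4L2_VP9_FRAME_FLAG_* and V4L2_VP9_SIGN_BIAS_* values as I understand the UAPI header; a real build would use the kernel macros directly.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the UAPI flag values (assumed); use the kernel macros in real code. */
#define DEMO_FLAG_KEY_FRAME          0x001
#define DEMO_FLAG_SHOW_FRAME         0x002
#define DEMO_FLAG_ERROR_RESILIENT    0x004
#define DEMO_FLAG_INTRA_ONLY         0x008
#define DEMO_FLAG_ALLOW_HIGH_PREC_MV 0x010
#define DEMO_FLAG_REFRESH_FRAME_CTX  0x020
#define DEMO_FLAG_PARALLEL_DEC_MODE  0x040
#define DEMO_FLAG_X_SUBSAMPLING      0x080
#define DEMO_FLAG_Y_SUBSAMPLING      0x100
#define DEMO_SIGN_BIAS_LAST          0x1
#define DEMO_SIGN_BIAS_GOLDEN        0x2
#define DEMO_SIGN_BIAS_ALT           0x4

/* Hypothetical subset of VAAPI's pic_fields.bits. */
struct demo_vp9_pic_bits {
    unsigned frame_type : 1;   /* 0 = keyframe */
    unsigned show_frame : 1;
    unsigned error_resilient_mode : 1;
    unsigned intra_only : 1;
    unsigned allow_high_precision_mv : 1;
    unsigned refresh_frame_context : 1;
    unsigned frame_parallel_decoding_mode : 1;
    unsigned subsampling_x : 1;
    unsigned subsampling_y : 1;
    unsigned last_ref_frame_sign_bias : 1;
    unsigned golden_ref_frame_sign_bias : 1;
    unsigned alt_ref_frame_sign_bias : 1;
};

static inline uint32_t demo_vp9_frame_flags(const struct demo_vp9_pic_bits *b)
{
    uint32_t f = 0;

    if (!b->frame_type)                  f |= DEMO_FLAG_KEY_FRAME;
    if (b->show_frame)                   f |= DEMO_FLAG_SHOW_FRAME;
    if (b->error_resilient_mode)         f |= DEMO_FLAG_ERROR_RESILIENT;
    if (b->intra_only)                   f |= DEMO_FLAG_INTRA_ONLY;
    if (b->allow_high_precision_mv)      f |= DEMO_FLAG_ALLOW_HIGH_PREC_MV;
    if (b->refresh_frame_context)        f |= DEMO_FLAG_REFRESH_FRAME_CTX;
    if (b->frame_parallel_decoding_mode) f |= DEMO_FLAG_PARALLEL_DEC_MODE;
    if (b->subsampling_x)                f |= DEMO_FLAG_X_SUBSAMPLING;
    if (b->subsampling_y)                f |= DEMO_FLAG_Y_SUBSAMPLING;
    /* COLOR_RANGE_FULL_SWING is left clear: VAAPI does not expose it. */
    return f;
}

static inline uint8_t demo_vp9_sign_bias(const struct demo_vp9_pic_bits *b)
{
    uint8_t s = 0;

    if (b->last_ref_frame_sign_bias)   s |= DEMO_SIGN_BIAS_LAST;
    if (b->golden_ref_frame_sign_bias) s |= DEMO_SIGN_BIAS_GOLDEN;
    if (b->alt_ref_frame_sign_bias)    s |= DEMO_SIGN_BIAS_ALT;
    return s;
}
```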

`v4l2_ctrl_vp9_compressed_hdr` mapping

This struct is filled by PARSING the compressed header bitstream — NOT from VAAPI. The libva backend runs a VPX boolean decoder over surface_object->source_data + uncompressed_header_size for compressed_header_size bytes, follows the VP9 spec section 6.3, and applies inv_map_table[d] for each updated probability.

The parsing logic is direct port of FFmpeg fill_compressed_hdr (lines 99-261). Key syntax elements parsed:

tx_mode (2 bits, then conditional 1 bit)
TX 8x8/16x16/32x32 probability updates (only if tx_mode == SELECT)
Coef probability updates (4-level nested loop with branch probs)
Skip probability updates (parsed on every frame)
inter_mode / interp_filter / is_inter / comp_mode / single_ref / comp_ref / y_mode / partition probability updates (only on inter frames)
MV probability updates (joint / sign / classes / class0_bit / bits / class0_fr / fr / class0_hp / hp; only on inter frames)

Each updated value goes through inv_map_table[] (255-entry lookup). Each "no update" bit leaves zero in the kernel struct.

Patch shape prediction

| Site | Action | LOC delta |
| --- | --- | --- |
| `src/config.c:121-160` | INSERT VP9 enumeration block | +10 |
| `src/config.c:54-78` | INSERT VP9 case + break + comment | +5 |
| `src/config.c:167-191` | INSERT VP9 case in fall-through | +1 |
| `src/vp9.c` | NEW FILE | +500-600 |
| `src/vp9.h` | NEW FILE | +35-45 |
| `src/picture.c:34-37` | INSERT `#include "vp9.h"` | +1 |
| `src/picture.c:188-225` | INSERT VP9 dispatch case | +6 |
| `src/picture.c:54-186` | INSERT 2 buffer-type cases | +14 |
| `src/surface.h:92-119` | INSERT vp9 struct | +6 |
| `src/meson.build:50,73` | INSERT 2 entries | +2 |

Total: ~580-690 LOC, 4 modified (config.c, picture.c, surface.h, meson.build) + 2 new files. Larger than iter3 VP8 (370 LOC) and comparable to iter2 HEVC (470 LOC). Compressed-header parser is the dominant cost.

Predicted commits:

Commit A: src/config.c enumeration + dispatch + entrypoints (Criterion 1).
Commit B: NEW src/vp9.c + src/vp9.h + src/meson.build (10 contract clauses + VPX rac decoder + compressed-header parser).
Commit C: src/picture.c dispatcher + 2 buffer-type cases + src/surface.h union extension (Criteria 2-3).
Commit D: optional fix-forward placeholder.

Open questions for Phase 3 baseline

1. Loop filter ref/mode deltas: VAAPI doesn't expose lf_delta.ref/mode/enabled/updated. Are these always zero for BBB? Phase 3 strace of FFmpeg-v4l2request VP9 will reveal verbatim values.
2. Quantization base_q_idx + deltas: VAAPI exposes effective per-segment scales but not the base. Phase 3 baseline: capture verbatim FRAME control payload to see what FFmpeg-v4l2request writes; correlate against VAAPI's per-segment scale via VP9 spec quantization table.
3. Reference mode: VAAPI doesn't expose comppredmode. Phase 3 baseline: verify default V4L2_VP9_REFERENCE_MODE_SELECT works for BBB.
4. Interpolation filter mapping: FFmpeg uses filtermode ^ (filtermode <= 1) to remap; VAAPI's mcomp_filter_type may already be in V4L2 enum order (no remap needed) OR in a different order. Empirically check.
5. Reset frame context mapping: FFmpeg uses x > 0 ? x - 1 : 0. Either FFmpeg's source enum is offset by 1 from V4L2's, or there's an off-by-one. Empirically verify.
6. VAAPI per-segment field interpretation: slice->seg_param[s].filter_level[4][2] and quant scales are EFFECTIVE values (computed by mpv-VAAPI consumer). Mapping back to kernel's "ALT_Q delta" + "ALT_L delta" + "REF_FRAME" feature bits is non-trivial. Phase 3 verbatim payload + mapping-back-to-VAAPI cross-check.
7. Does mpv 0.41.0 engage HW for VP9?: Phase 3 capture mpv -v --hwdec=vaapi --vo=null --frames=2 ~/fourier-test/bbb_720p10s_vp9.webm and grep for Selected decoder: vp9 vs Using software decoding. iter3 VP8 fell back; iter4 VP9 may or may not.
8. Does rkvdec exhibit the same dma_resv kernel issue as hantro?: iter3 found hantro CAPTURE returns all-zero pages from libva readback. rkvdec is a different driver subsystem; iter1+iter2 successfully verified via mpv-DMA-BUF-GL on rkvdec. Predicted: rkvdec works fine for direct readback. Phase 3 baseline: re-test ffmpeg-vaapi-hwdownload on rkvdec for VP9 and check if output is non-zero.
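
For the interpolation-filter and reset-frame-context questions, it helps to see what the two FFmpeg expressions actually do to each possible input. The helpers below just evaluate them; the function names are hypothetical, and the full form x > 0 ? x - 1 : 0 is my reading of the abbreviated expression. Whether VAAPI's fields need either remap is exactly what Phase 3 has to establish.

```c
#include <assert.h>

/* filtermode ^ (filtermode <= 1): swaps 0 and 1, leaves 2 and 3 alone. */
static inline unsigned int demo_filter_remap(unsigned int filtermode)
{
    return filtermode ^ (filtermode <= 1);
}

/* x > 0 ? x - 1 : 0: folds the 2-bit field 0..3 onto 0, 0, 1, 2. */
static inline unsigned int demo_reset_ctx_remap(unsigned int resetctx)
{
    return resetctx > 0 ? resetctx - 1 : 0;
}
```

The second helper maps the four possible 2-bit values onto the 0..2 range of V4L2_VP9_RESET_FRAME_CTX_*, which suggests the source field is the raw bitstream value and not an off-by-one; the baseline capture should confirm or refute that.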

Phase 3 baseline targets (work plan)

Cross-validator capture: strace -ff -tt -y -v -e trace=ioctl ffmpeg -hwaccel v4l2request bbb_720p10s_vp9.webm -frames:v 5 -f null - 2>strace.log. Decode VP9_FRAME + COMPRESSED_HDR payloads via Phase 3 decoder (extend decode_vp8.py for VP9 layout).
VAAPI consumer trace: LIBVA_TRACE mpv-SW + mpv-vaapi runs to see what buffer types mpv produces.
Cache-safe verify reference: mpv --hwdec=no --vo=image --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp9.webm and capture frame-0001/0002 SHA256 (criterion-4 anchor).
rkvdec readback path test: re-run ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vf hwdownload bbb_720p10s_vp9.webm -frames:v 5 after install (would be Phase 6 actually; Phase 3 just baseline-captures the SW reference). Confirm whether rkvdec hits dma_resv issue or not (predicted: NO based on iter1+iter2 working there).
mpv-VP9-vaapi engagement check: per memory feedback_hw_decode_engagement_check.md, verify HW path engaged via mpv -v log BEFORE claiming criterion 4.

Phase 4 plan structure (anticipated)

Following iter2/iter3's clause template:

Clause 1: Submission shape — 2 controls batched per frame
Clause 2: Local struct alloc + zero-init (memset both)
Clause 3: Frame geometry + scalars + flags
Clause 4: DPB timestamp resolution (3 active refs from 8-slot DPB)
Clause 5: Loop filter mapping (with VAAPI gap notes per Q1)
Clause 6: Quantization mapping (with VAAPI gap notes per Q2)
Clause 7: Segmentation mapping (with VAAPI per-segment effective-vs-delta unpacking per Q6)
Clause 8: Compressed header parser — port FFmpeg fill_compressed_hdr + VPX rac decoder + inv_map_table
Clause 9: Final 2-control batched submission
Clause 10: Bitstream offsetting — surface_object->source_data + uncompressed_header_size is the start of compressed-header bytes; compressed_header_size is the byte length
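
Clause 10 can be sketched as a small bounds-checked helper. The function name and the source_size parameter (the number of valid bytes behind source_data) are hypothetical; the fork may track the slice size differently. The point is only that both header sizes come from the bitstream via VAAPI and should be validated before the boolean decoder touches the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Locate the compressed header inside the slice data.
 * Returns 0 and fills start/len on success, -1 if the region
 * would run past the end of the buffer.
 */
static inline int demo_vp9_compressed_hdr_span(const uint8_t *source_data,
                                               size_t source_size,
                                               uint16_t uncompressed_header_size,
                                               uint16_t compressed_header_size,
                                               const uint8_t **start,
                                               size_t *len)
{
    size_t end = (size_t)uncompressed_header_size +
                 (size_t)compressed_header_size;

    if (!source_data || end > source_size)
        return -1;

    *start = source_data + uncompressed_header_size;
    *len = compressed_header_size;
    return 0;
}
```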

The plan will cite verbatim Phase 3 baseline payload bytes for all fields where mapping is non-obvious (loop-filter deltas, quant base, segmentation feature mapping) per feedback_dev_process.md Phase 6 contract-before-code.

Substrate state at Phase 2 close

iter4 Phase 1 commit 9a71dbf pushed to gitea.
Fork on noether at iter3 tip e1aca9c (synced via git fetch && merge --ff-only).
All Phase 3 prerequisites identified.
Memory rules apply unchanged.
Phase 3 questions queued (8 items, mostly empirical). Phase 5 review will catch the field-availability + mapping questions analogous to iter3 (uniform_spacing_flag Direction 2 lesson).
