Captured on linux-fresnel-fourier 7.0-1 (post 6.19 decommission).

VP9 baseline (kernel-direct via ffmpeg-v4l2request on rkvdec):
- 5-frame SW reference PNG SHA256 anchors (criterion-4)
- VIDIOC_S_EXT_CTRLS strace with full payload at -s 16384
- Empirical struct sizes 168 B (FRAME) / 2040 B (COMPRESSED_HDR) supersede Phase 2 estimates of 144 / 1947
- Probe pattern: count=1 (FRAME-only) then count=2 (FRAME + COMPRESSED_HDR)

Phase 2 doc fix: control IDs corrected 0xa40b2c/d -> 0xa40a2c/d.

4-codec regression (H.264, MPEG-2, HEVC, VP8): all fall back to SW on default config because /dev/video0 is now rockchip-rga (RGB color converter), not a codec device. Fork hardcodes /dev/video0 in request.c:149. Env override LIBVA_V4L2_REQUEST_VIDEO_PATH / _MEDIA_PATH restores per-driver profile enumeration; mitigation A/B/C queued for user decision.

New contract clauses surfaced:
- Clause 11: uncompressed-header partial parse for lf_delta / base_q_idx (VAAPI doesn't expose these; keyframe ref_deltas non-zero for BBB so leave-at-zero is wrong)
- Clause 12: compile-time sizeof asserts on the two control structs so future UAPI shifts fail loudly

iter4_phase3.tgz: full Phase 3 artifact bundle (strace + PNG refs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 4 — Phase 2 (situation analysis)
Source-read of every file the iter4 patch series will touch, plus the kernel UAPI + VAAPI + downstream FFmpeg + kernel rkvdec reference sources. Conducted on noether against fork tip e1aca9c (iter3 close).
This is a contract-before-code analysis per feedback_dev_process.md Phase 2: enumerate the bugs, cite the contract verbatim, predict the patch shape, queue the Phase 3 baseline questions.
Critical finding: rkvdec requires VP9_COMPRESSED_HDR
The biggest scope-shaping discovery: on rkvdec (RK3399), V4L2_CID_STATELESS_VP9_COMPRESSED_HDR is mandatory, not optional. From drivers/staging/media/rkvdec/rkvdec-vp9.c::rkvdec_vp9_run_preamble lines 740-754:
    ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_FRAME);
    if (WARN_ON(!ctrl))
        return -EINVAL;
    dec_params = ctrl->p_cur.p;
    ...
    ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
    if (WARN_ON(!ctrl))
        return -EINVAL; /* ← rkvdec WILL fail without compressed-header probs */
    prob_updates = ctrl->p_cur.p;
    vp9_ctx->cur.tx_mode = prob_updates->tx_mode;
    ...
    v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, dec_params);
VAAPI does NOT expose compressed-header probability updates (per va_dec_vp9.h:50-192 — only frame parameters + segmentation, no probability deltas; vendor VAAPI drivers parse compressed header in firmware/GPU). So the libva backend must parse the compressed header itself via a VPX boolean decoder.
This makes iter4's scope significantly larger than iter3's VP8 work.
Bug enumeration (sites the iter4 patch series must touch)
B1 — src/config.c::RequestQueryConfigProfiles — VP9 enumeration block missing
Site: config.c:121-160.
Bug: no analogous block for V4L2_PIX_FMT_VP9_FRAME → VAProfileVP9Profile0. Same starting condition as iter3 VP8.
Patch shape: ADD enumeration block after iter3's VP8 block. ~10 LOC.
B2 — src/config.c::RequestCreateConfig — VP9 case label missing
Site: config.c:54-78.
Bug: no case VAProfileVP9Profile0:. Mirror iter3 VP8 pattern. ~5 LOC.
B3 — src/config.c::RequestQueryConfigEntrypoints — VP9 case missing
Site: config.c:167-191.
Bug: missing in fall-through case list. ~1 LOC.
B4 — src/vp9.c — file does not exist; needs net-new implementation
Site: NEW FILE src/vp9.c.
Patch shape: NEW file, ~500-600 LOC (substantially larger than iter3 vp8.c due to compressed-header parser):
- Includes block
- Static inv_map_table[255]: direct copy from FFmpeg v4l2_request_vp9.c:43-64
- VPX range coder helpers (port from FFmpeg vp89_rac.h + boolean decoder primitives): ~80 LOC
- vp9_fill_frame(): fill v4l2_ctrl_vp9_frame from VAAPI VADecPictureParameterBufferVP9 + VASliceParameterBufferVP9, ~150 LOC
- vp9_fill_compressed_hdr(): parse compressed header bits from surface_object->source_data + uncompressed_header_size, populate v4l2_ctrl_vp9_compressed_hdr, ~180 LOC (port from FFmpeg fill_compressed_hdr lines 99-261)
- vp9_set_controls(): entry point, allocates both structs, calls vp9_fill_frame + vp9_fill_compressed_hdr, batched 2-element v4l2_ext_control array, single v4l2_set_controls call
B5 — src/vp9.h — header does not exist
Site: NEW FILE src/vp9.h.
Patch shape: declare vp9_set_controls(). Mirror iter3 vp8.h.
B6 — Possibly src/vp9_rac.h — VPX range decoder helpers (decision point)
Site: NEW FILE candidate src/vp9_rac.h.
VP9 boolean decoder primitives (vpx_rac_get_prob_branchy, vp89_rac_get, vp89_rac_get_uint, init function) are needed by vp9_fill_compressed_hdr. Two design options:
- Option A: inline the ~80 LOC of decoder helpers directly in vp9.c. Simpler; one file. Recommended for first cut.
- Option B: separate vp9_rac.h / vp9_rac.c. Mirrors FFmpeg's vp89_rac.h upstream pattern. More files, easier reuse if AV1/VP10 work follows.
Phase 4 plan locks Option A unless Phase 5 review surfaces a reason for Option B.
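To give a feel for how small the Option A helpers can be, here is a minimal sketch of a VP9-style boolean decoder. It follows the bit-at-a-time description of the boolean decoding process in the VP9 bitstream specification rather than FFmpeg's byte-buffered vp89_rac.h, so it is not a port of that code. The struct and function names (vp9_bool_dec, vp9_bool_read, and so on) are hypothetical and do not exist in the fork.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state; field names are illustrative only. */
struct vp9_bool_dec {
    const uint8_t *data;  /* compressed-header bytes */
    size_t size;          /* number of bytes in data */
    size_t bit_pos;       /* index of the next unread bit */
    uint32_t value;       /* current arithmetic-decoder window */
    uint32_t range;       /* kept in [128, 255] between reads */
};

/* Return the next raw bit, MSB first; reads past the end yield 0. */
static int vp9_bool_next_bit(struct vp9_bool_dec *d)
{
    size_t byte = d->bit_pos >> 3;
    int bit = 0;

    if (byte < d->size)
        bit = (d->data[byte] >> (7 - (d->bit_pos & 7))) & 1;
    d->bit_pos++;
    return bit;
}

/* Decode one boolean whose probability of being 0 is prob/256. */
static int vp9_bool_read(struct vp9_bool_dec *d, int prob)
{
    uint32_t split;
    int bit;

    assert(prob >= 1 && prob <= 255);
    split = 1 + (((d->range - 1) * (uint32_t)prob) >> 8);
    if (d->value < split) {
        d->range = split;
        bit = 0;
    } else {
        d->range -= split;
        d->value -= split;
        bit = 1;
    }
    /* Renormalise: shift in fresh bits until range is >= 128 again. */
    while (d->range < 128) {
        d->range <<= 1;
        d->value = (d->value << 1) | (uint32_t)vp9_bool_next_bit(d);
    }
    return bit;
}

/* Read an n-bit unsigned literal, each bit coded with probability 128. */
static unsigned vp9_bool_read_literal(struct vp9_bool_dec *d, int n)
{
    unsigned v = 0;

    while (n-- > 0)
        v = (v << 1) | (unsigned)vp9_bool_read(d, 128);
    return v;
}

/* Initialise from a buffer; -1 on empty input or a non-zero marker bit. */
static int vp9_bool_init(struct vp9_bool_dec *d, const uint8_t *data, size_t size)
{
    if (size < 1)
        return -1;
    d->data = data;
    d->size = size;
    d->bit_pos = 8;          /* the first byte seeds value */
    d->value = data[0];
    d->range = 255;
    return vp9_bool_read(d, 128) ? -1 : 0;
}
```

A real port would likely keep FFmpeg's buffered refill for speed; the bit-at-a-time form is chosen here only because it is easy to check against the specification.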
B7 — src/picture.c::codec_set_controls — VP9 dispatch case missing
Site: picture.c:188-225.
Patch shape: ADD case VAProfileVP9Profile0: calling vp9_set_controls. ~6 LOC.
B8 — src/picture.c::codec_store_buffer — 2 VAAPI buffer types unmapped
VAAPI VP9 sends only TWO parameter buffer types per frame, plus the raw slice data (per va_dec_vp9.h:58-303):
| VAAPI buffer type | VAAPI struct | Per-frame |
|---|---|---|
| VAPictureParameterBufferType | VADecPictureParameterBufferVP9 | once |
| VASliceParameterBufferType | VASliceParameterBufferVP9 (with seg_param[8]) | once |
| VASliceDataBufferType | raw bitstream | once |
Different from iter3 VP8: no VAProbabilityBufferType (VP9 keeps probability state in the picture/slice params + parsed compressed header), no VAIQMatrixBufferType (VP9 keeps quantization in the slice's per-segment seg_param array). Just 2 cases vs VP8's 4.
Patch shape: 2 nested case adds in codec_store_buffer outer switch + inner profile dispatch. ~14 LOC total.
B9 — src/picture.c::RequestBeginPicture — per-frame VP9 reset
Site: picture.c:299-302.
Bug: VP9 doesn't have an iqmatrix_set / probability_set flag pattern; the picture/slice params are unconditionally fully-populated by VAAPI consumer per frame. Possibly NO reset needed (analogous to MPEG-2's iqmatrix-only pattern but even simpler).
Patch shape: likely no edit. If Phase 5 review reveals a hidden state-leak risk (e.g., VAAPI reusing the surface for a new context with stale params), add reset for params.vp9.<some-flag>. Default plan: no reset added; revisit if Phase 7 byte-compare shows stale state.
B10 — src/surface.h::object_surface::params union — no vp9 member
Site: surface.h:92-119.
Patch shape: ADD vp9 struct after vp8:
struct {
VADecPictureParameterBufferVP9 picture;
VASliceParameterBufferVP9 slice;
} vp9;
VASliceParameterBufferVP9 is large (~340 bytes — seg_param[8] × ~40 bytes each); VADecPictureParameterBufferVP9 ~80 bytes. Union grows by ~420 bytes from this; still dominated by params.h265 with its 64-slot slices[64] array (~17 KB).
B11 — src/meson.build — vp9.c + vp9.h not in lists
Site: meson.build:30-74.
Patch shape: insert 'vp9.c' after 'vp8.c' in sources, insert 'vp9.h' after 'vp8.h' in headers. +2 lines.
B12 — src/buffer.c — buffer-type allow-list (predicted no change needed)
Site: buffer.c:59-70.
VP9 uses VAPictureParameterBufferType + VASliceParameterBufferType + VASliceDataBufferType — all three already in the allow-list (used by H.264 + iter3 VP8). Predicted no change needed.
Per memory feedback_runtime_enumerates_allowlists.md: plan for fix-forward Commit D if a runtime miss surfaces (would be unexpected for VP9 given the buffer types are H.264-shape; but the iter3 lesson is "don't audit exhaustively — let runtime enumerate").
Non-bugs (intentionally NOT touched)
- src/context.c: no DECODE_MODE/START_CODE menus for VP9 (per FFmpeg V4L2 ref v4l2_request_vp9.c:487-503: v4l2_request_vp9_init doesn't issue any device-wide menu sets; per-frame batch only). No context.c changes.
- src/video.c::formats[]: CAPTURE-side format list (NV12); VP9 is OUTPUT-side fourcc, probed via v4l2_find_format() in config.c. No video.c changes.
- src/v4l2.c: fourcc-agnostic helpers. No v4l2.c changes.
- include/hevc-ctrls.h: already includes <linux/v4l2-controls.h> which holds VP9 control IDs.
Contract surface (verbatim)
Kernel UAPI: V4L2_CID_STATELESS_VP9_FRAME (<linux/v4l2-controls.h>:2696)
#define V4L2_CID_STATELESS_VP9_FRAME (V4L2_CID_CODEC_STATELESS_BASE + 300)
/* = 0xa40a2c */
struct v4l2_ctrl_vp9_frame {
    struct v4l2_vp9_loop_filter lf;      /* 16 bytes; ref_deltas[4] + mode_deltas[2]
                                            + level + sharpness + flags + reserved[7] */
    struct v4l2_vp9_quantization quant;  /* 8 bytes; base_q_idx + 3 deltas + reserved[4] */
    struct v4l2_vp9_segmentation seg;    /* 88 bytes; feature_data[8][4] (__s16, 64 bytes)
                                            + feature_enabled[8] + tree_probs[7]
                                            + pred_probs[3] + flags + reserved[5] */
    __u32 flags;                         /* 10 V4L2_VP9_FRAME_FLAG_* bits per
                                            <linux/v4l2-controls.h>:2665-2674 */
    __u16 compressed_header_size;
    __u16 uncompressed_header_size;
    __u16 frame_width_minus_1;
    __u16 frame_height_minus_1;
    __u16 render_width_minus_1;
    __u16 render_height_minus_1;
    __u64 last_frame_ts;                 /* per-VASurfaceID timestamp lookup */
    __u64 golden_frame_ts;
    __u64 alt_frame_ts;
    __u8 ref_frame_sign_bias;            /* OR of V4L2_VP9_SIGN_BIAS_{LAST,GOLDEN,ALT} */
    __u8 reset_frame_context;            /* V4L2_VP9_RESET_FRAME_CTX_* (0..2) */
    __u8 frame_context_idx;
    __u8 profile;
    __u8 bit_depth;
    __u8 interpolation_filter;
    __u8 tile_cols_log2;
    __u8 tile_rows_log2;
    __u8 reference_mode;
    __u8 reserved[7];
};
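Total size: ~144 bytes as estimated here; the Phase 3 capture measured 168 bytes, which supersedes the estimate. Either way it is much smaller than iter3 VP8's 1232 bytes, because VP9_FRAME carries no entropy table; that lives in COMPRESSED_HDR.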
Kernel UAPI: V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (<linux/v4l2-controls.h>:2797)
#define V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (V4L2_CID_CODEC_STATELESS_BASE + 301)
/* = 0xa40a2d */
struct v4l2_ctrl_vp9_compressed_hdr {
    __u8 tx_mode;                    /* V4L2_VP9_TX_MODE_* (0..4) */
    __u8 tx8[2][1];
    __u8 tx16[2][2];
    __u8 tx32[2][3];
    __u8 coef[4][2][2][6][6][3];     /* HUGE: 1728 bytes */
    __u8 skip[3];
    __u8 inter_mode[7][3];
    __u8 interp_filter[4][2];
    __u8 is_inter[4];
    __u8 comp_mode[5];
    __u8 single_ref[5][2];
    __u8 comp_ref[5];
    __u8 y_mode[4][9];
    __u8 uv_mode[10][9];
    __u8 partition[16][3];
    struct v4l2_vp9_mv_probs mv;     /* 69 bytes; joint/sign/classes/class0_bit/bits/etc */
};
Total size: ~1947 bytes as estimated here; the Phase 3 capture measured 2040 bytes, which supersedes the estimate. Filled by parsing the compressed header bits via VPX boolean decoder + inv_map_table[] (per FFmpeg v4l2_request_vp9.c:99-261).
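Because the size estimates in this document turned out to be off, a compile-time check is a cheap guard against layout drift. The sketch below uses local mirror structs so that it stands alone; the mirror_* names are hypothetical, the element types (notably 16-bit feature_data and the members of the MV probability struct) are my reading of the UAPI header and should be confirmed against it, and 168 / 2040 are the empirically captured sizes. In the real tree the asserts would target the <linux/v4l2-controls.h> types directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local mirrors of the UAPI layouts quoted above. */
struct mirror_vp9_loop_filter {
    int8_t  ref_deltas[4];
    int8_t  mode_deltas[2];
    uint8_t level, sharpness, flags, reserved[7];
};
struct mirror_vp9_quantization {
    uint8_t base_q_idx;
    int8_t  delta_q_y_dc, delta_q_uv_dc, delta_q_uv_ac;
    uint8_t reserved[4];
};
struct mirror_vp9_segmentation {
    int16_t feature_data[8][4];   /* assumed 16-bit signed: 64 bytes */
    uint8_t feature_enabled[8];
    uint8_t tree_probs[7];
    uint8_t pred_probs[3];
    uint8_t flags;
    uint8_t reserved[5];
};
struct mirror_vp9_frame {
    struct mirror_vp9_loop_filter  lf;
    struct mirror_vp9_quantization quant;
    struct mirror_vp9_segmentation seg;
    uint32_t flags;
    uint16_t compressed_header_size, uncompressed_header_size;
    uint16_t frame_width_minus_1, frame_height_minus_1;
    uint16_t render_width_minus_1, render_height_minus_1;
    uint64_t last_frame_ts, golden_frame_ts, alt_frame_ts;
    uint8_t  ref_frame_sign_bias, reset_frame_context, frame_context_idx;
    uint8_t  profile, bit_depth, interpolation_filter;
    uint8_t  tile_cols_log2, tile_rows_log2, reference_mode;
    uint8_t  reserved[7];
};
struct mirror_vp9_mv_probs {      /* assumed member shapes: 69 bytes */
    uint8_t joint[3], sign[2], classes[2][10], class0_bit[2];
    uint8_t bits[2][10], class0_fr[2][2][3], fr[2][3];
    uint8_t class0_hp[2], hp[2];
};
struct mirror_vp9_compressed_hdr {
    uint8_t tx_mode, tx8[2][1], tx16[2][2], tx32[2][3];
    uint8_t coef[4][2][2][6][6][3];
    uint8_t skip[3], inter_mode[7][3], interp_filter[4][2], is_inter[4];
    uint8_t comp_mode[5], single_ref[5][2], comp_ref[5];
    uint8_t y_mode[4][9], uv_mode[10][9], partition[16][3];
    struct mirror_vp9_mv_probs mv;
};

/* Sizes from the Phase 3 strace capture; a mismatch stops the build. */
_Static_assert(sizeof(struct mirror_vp9_frame) == 168,
               "VP9 FRAME control layout changed");
_Static_assert(sizeof(struct mirror_vp9_compressed_hdr) == 2040,
               "VP9 COMPRESSED_HDR control layout changed");

/* Control ID arithmetic, assuming class 0x00a40000 and base offset 0x900. */
_Static_assert((0x00a40000 | 0x900) + 300 == 0xa40a2c, "VP9_FRAME id");
_Static_assert((0x00a40000 | 0x900) + 301 == 0xa40a2d, "VP9_COMPRESSED_HDR id");
```

Summing the members by hand gives the same totals (112 + 4 + 12 + 24 + 9 + 7 = 168 for FRAME), which is a useful cross-check on the captured values.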
The kernel uses these as PROBABILITY UPDATES (not absolutes): a value of zero in any array element means "no update — keep prior probability." The kernel runs v4l2_vp9_fw_update_probs(&probability_tables, prob_updates, dec_params) to apply updates per rkvdec-vp9.c:796.
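The "zero means keep" convention can be illustrated with a sketch of the per-probability update, modelled on the inv_remap_prob and inv_recenter_nonneg processes in the VP9 bitstream specification, with the inv_map_table[] lookup assumed to have been applied by userspace already. Whether the kernel helper matches this line for line is an assumption, and the sketch_* names are hypothetical.

```c
#include <assert.h>

/* Re-centre a non-negative delta v around the pivot m. */
static int sketch_inv_recenter_nonneg(int v, int m)
{
    if (v > 2 * m)
        return v;
    if (v & 1)
        return m - ((v + 1) >> 1);
    return m + (v >> 1);
}

/*
 * Apply one update from the compressed-header control to a stored
 * probability. delta == 0 means "no update": the prior value is kept.
 * A non-zero delta is an inv_map_table[] output, not an absolute value.
 */
static int sketch_update_prob(int delta, int prob)
{
    assert(prob >= 1 && prob <= 255);
    if (delta == 0)
        return prob;
    return prob <= 128
        ? 1 + sketch_inv_recenter_nonneg(delta, prob - 1)
        : 255 - sketch_inv_recenter_nonneg(delta, 255 - prob);
}
```

This is why the parser must leave untouched entries at zero after the memset: any stray non-zero byte would be read as a real update.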
VAAPI buffer types
VADecPictureParameterBufferVP9 (va_dec_vp9.h:58-192):
- frame_width, frame_height (u16)
- reference_frames[8]: 8-entry DPB (vs VP8's 3)
- pic_fields.bits.{...}: 27 single-bit/multi-bit fields (subsampling_x/y, frame_type, show_frame, error_resilient_mode, intra_only, allow_high_precision_mv, mcomp_filter_type[3 bits], frame_parallel_decoding_mode, reset_frame_context[2 bits], refresh_frame_context, frame_context_idx[2 bits], segmentation_*, last/golden/alt_ref_frame[3 bits each, indexes into reference_frames[8]], *_sign_bias, lossless_flag)
- filter_level, sharpness_level (u8)
- log2_tile_rows, log2_tile_columns (u8)
- frame_header_length_in_bytes: uncompressed_header_size (u8; note the 8-bit width may overflow for super-frames; typical < 256 for BBB)
- first_partition_size: compressed_header_size (u16)
- mb_segment_tree_probs[7], segment_pred_probs[3] (u8)
- profile, bit_depth (u8)
VASliceParameterBufferVP9 (va_dec_vp9.h:279-303):
- slice_data_size, slice_data_offset, slice_data_flag (u32)
- seg_param[8]: array of VASegmentParameterVP9 (~40 bytes each):
  - segment_flags.fields.{segment_reference_enabled, segment_reference[2 bits], segment_reference_skipped} (u16 packed)
  - filter_level[4][2] (u8): per-ref-frame × per-mode loop filter levels
  - luma_ac_quant_scale, luma_dc_quant_scale, chroma_ac_quant_scale, chroma_dc_quant_scale (s16): already-computed effective scale per segment
FFmpeg V4L2 reference (v4l2_request_vp9.c)
Submission shape: 2 batched controls per frame in single S_EXT_CTRLS:
control[0] = { .id = V4L2_CID_STATELESS_VP9_FRAME, ... };
control[1] = { .id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, ... };
v4l2_set_controls(..., control, 2);
The COMPRESSED_HDR control is conditionally-included based on a runtime probe (v4l2_request_vp9_post_frames_ctx queries the kernel; if the control isn't advertised, falls back to FRAME-only). For rkvdec on RK3399, the kernel advertises COMPRESSED_HDR — verified at rkvdec-vp9.c:752 (kernel WILL EINVAL if not provided).
Kernel rkvdec driver (rkvdec-vp9.c)
Key reads in rkvdec_vp9_run_preamble:
- VP9_FRAME control → dec_params = ctrl->p_cur.p → drives register programming via config_registers().
- VP9_COMPRESSED_HDR control → prob_updates = ctrl->p_cur.p → applied via v4l2_vp9_fw_update_probs().
- Reference frames are resolved from FRAME's last_frame_ts/golden_frame_ts/alt_frame_ts. Only 3 references are active at a time, even though VAAPI exposes an 8-entry DPB; the last/golden/alt indexes select entries from the picture's 8-frame DPB.
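The timestamp resolution described in the last item can be sketched as follows. The sketch_surface type, the NULL-means-empty convention, and returning 0 for an empty slot are all assumptions made for illustration; in the fork the DPB holds VASurfaceIDs that would be looked up through the driver's object heap. The conversion mirrors what v4l2_timeval_to_ns() computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for the fork's surface object. */
struct sketch_surface {
    struct timeval timestamp;   /* timestamp used when the frame was queued */
};

/* Same arithmetic as v4l2_timeval_to_ns(): seconds and microseconds to ns. */
static uint64_t sketch_timeval_to_ns(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/*
 * Resolve one of the three active references. slot is the 3-bit
 * last/golden/alt index from pic_fields; dpb is the 8-entry DPB with
 * NULL marking an empty slot (assumption: 0 is sent for empty slots).
 */
static uint64_t sketch_ref_ts(struct sketch_surface *const dpb[8], unsigned slot)
{
    assert(slot < 8);
    if (dpb[slot] == NULL)
        return 0;
    return sketch_timeval_to_ns(&dpb[slot]->timestamp);
}
```

The same helper would be called three times per frame, once each for last_frame_ts, golden_frame_ts and alt_frame_ts.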
Mapping table (VAAPI → V4L2 / kernel)
The libva backend's job: read VAAPI's per-frame buffers (Picture + Slice) AND parse the compressed header from the bitstream, write the kernel's two structs.
v4l2_ctrl_vp9_frame mapping
| Kernel field | VAAPI source | Notes |
|---|---|---|
| lf.ref_deltas[4] | NOT in VAAPI | VAAPI doesn't expose loop-filter ref deltas separately; FFmpeg's V4L2 ref reads from VP9Context internal state. Open question Phase 3: are these zero in the BBB fixture? |
| lf.mode_deltas[2] | NOT in VAAPI | same |
| lf.level | picture->filter_level | direct |
| lf.sharpness | picture->sharpness_level | direct |
| lf.flags | NOT in VAAPI | DELTA_ENABLED + DELTA_UPDATE bits; ditto |
| quant.base_q_idx | DERIVED; no direct VAAPI exposure | Open question Phase 3: VAAPI exposes per-segment seg_param[s].luma_ac_quant_scale, but those are EFFECTIVE Q-scales, not the base index. Inverse-derive from seg_param[0].luma_ac_quant_scale via VP9 spec quantization table? Or leave zero and let kernel use default? |
| quant.delta_q_y_dc/uv_dc/uv_ac | NOT in VAAPI | same; VAAPI only exposes effective per-segment scales |
| seg.feature_data[8][4] | DERIVED from slice->seg_param[s].filter_level[][] + quant scales | mapping non-trivial |
| seg.feature_enabled[8] | derived from slice->seg_param[s].segment_flags + segmentation enabled bits | non-trivial |
| seg.tree_probs[7] | picture->mb_segment_tree_probs[7] | direct |
| seg.pred_probs[3] | picture->segment_pred_probs[3] | direct |
| seg.flags | from pic_fields.bits.{segmentation_enabled, segmentation_update_map, segmentation_temporal_update} + derived segmentation_update_data + absolute_or_delta | mostly direct |
| flags & KEY_FRAME | !pic_fields.bits.frame_type | inverted: frame_type=0 means keyframe |
| flags & SHOW_FRAME | pic_fields.bits.show_frame | direct |
| flags & ERROR_RESILIENT | pic_fields.bits.error_resilient_mode | direct |
| flags & INTRA_ONLY | pic_fields.bits.intra_only | direct |
| flags & ALLOW_HIGH_PREC_MV | pic_fields.bits.allow_high_precision_mv | direct |
| flags & REFRESH_FRAME_CTX | pic_fields.bits.refresh_frame_context | direct |
| flags & PARALLEL_DEC_MODE | pic_fields.bits.frame_parallel_decoding_mode | direct |
| flags & X/Y_SUBSAMPLING | pic_fields.bits.subsampling_x/y | direct |
| flags & COLOR_RANGE_FULL_SWING | NOT in VAAPI | leave 0 for BT.709 limited (BBB) |
| compressed_header_size | picture->first_partition_size | direct (VAAPI mis-named per its own comment) |
| uncompressed_header_size | picture->frame_header_length_in_bytes | direct |
| frame_width_minus_1 | picture->frame_width - 1 | direct |
| frame_height_minus_1 | picture->frame_height - 1 | direct |
| render_width_minus_1, render_height_minus_1 | NOT in VAAPI | leave equal to frame_width-1 / frame_height-1 (no scaling for BBB) |
| last_frame_ts | DPB lookup picture->reference_frames[picture->pic_fields.bits.last_ref_frame] → surface_object->timestamp → v4l2_timeval_to_ns() | uses last_ref_frame index into 8-entry DPB |
| golden_frame_ts | DPB lookup picture->reference_frames[picture->pic_fields.bits.golden_ref_frame] | same |
| alt_frame_ts | DPB lookup picture->reference_frames[picture->pic_fields.bits.alt_ref_frame] | same |
| ref_frame_sign_bias | OR of pic_fields.bits.{last,golden,alt}_ref_frame_sign_bias mapped to V4L2_VP9_SIGN_BIAS_{LAST,GOLDEN,ALT} | direct |
| reset_frame_context | pic_fields.bits.reset_frame_context (with FFmpeg's x > 0 ? x - 1 : 0 adjustment per ref) | mapping needs inspection |
| frame_context_idx | pic_fields.bits.frame_context_idx | direct |
| profile | picture->profile | direct |
| bit_depth | picture->bit_depth | direct |
| interpolation_filter | pic_fields.bits.mcomp_filter_type (with FFmpeg's ^ (filtermode <= 1) adjustment; see ref) | mapping needs inspection |
| tile_cols_log2, tile_rows_log2 | picture->log2_tile_columns, log2_tile_rows | direct |
| reference_mode | NOT in VAAPI | derive from heuristic OR leave default V4L2_VP9_REFERENCE_MODE_SELECT; Phase 3 baseline answers |
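The "direct" flag rows in the table reduce to a handful of conditional ORs. The sketch below uses local copies of the flag values (SK_ prefix) and a cut-down stand-in for pic_fields.bits so that it compiles without the kernel or libva headers; both are assumptions for illustration, and the numeric values should be confirmed against <linux/v4l2-controls.h>. COLOR_RANGE_FULL_SWING is deliberately left unset, as the table proposes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the V4L2_VP9_FRAME_FLAG_* values (assumed). */
#define SK_VP9_FLAG_KEY_FRAME          0x001u
#define SK_VP9_FLAG_SHOW_FRAME         0x002u
#define SK_VP9_FLAG_ERROR_RESILIENT    0x004u
#define SK_VP9_FLAG_INTRA_ONLY         0x008u
#define SK_VP9_FLAG_ALLOW_HIGH_PREC_MV 0x010u
#define SK_VP9_FLAG_REFRESH_FRAME_CTX  0x020u
#define SK_VP9_FLAG_PARALLEL_DEC_MODE  0x040u
#define SK_VP9_FLAG_X_SUBSAMPLING      0x080u
#define SK_VP9_FLAG_Y_SUBSAMPLING      0x100u

/* Hypothetical subset of VADecPictureParameterBufferVP9 pic_fields.bits. */
struct sketch_pic_bits {
    unsigned subsampling_x : 1;
    unsigned subsampling_y : 1;
    unsigned frame_type : 1;              /* 0 = keyframe */
    unsigned show_frame : 1;
    unsigned error_resilient_mode : 1;
    unsigned intra_only : 1;
    unsigned allow_high_precision_mv : 1;
    unsigned refresh_frame_context : 1;
    unsigned frame_parallel_decoding_mode : 1;
};

/* Build the v4l2_ctrl_vp9_frame.flags word from the VAAPI bit-fields. */
static uint32_t sketch_vp9_frame_flags(const struct sketch_pic_bits *b)
{
    uint32_t f = 0;

    assert(b != NULL);
    if (!b->frame_type)                  f |= SK_VP9_FLAG_KEY_FRAME;
    if (b->show_frame)                   f |= SK_VP9_FLAG_SHOW_FRAME;
    if (b->error_resilient_mode)         f |= SK_VP9_FLAG_ERROR_RESILIENT;
    if (b->intra_only)                   f |= SK_VP9_FLAG_INTRA_ONLY;
    if (b->allow_high_precision_mv)      f |= SK_VP9_FLAG_ALLOW_HIGH_PREC_MV;
    if (b->refresh_frame_context)        f |= SK_VP9_FLAG_REFRESH_FRAME_CTX;
    if (b->frame_parallel_decoding_mode) f |= SK_VP9_FLAG_PARALLEL_DEC_MODE;
    if (b->subsampling_x)                f |= SK_VP9_FLAG_X_SUBSAMPLING;
    if (b->subsampling_y)                f |= SK_VP9_FLAG_Y_SUBSAMPLING;
    return f;
}
```

For a shown 4:2:0 keyframe with no other bits set, this yields KEY_FRAME | SHOW_FRAME | X_SUBSAMPLING | Y_SUBSAMPLING, which is the kind of value to compare against the captured FRAME payload.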
v4l2_ctrl_vp9_compressed_hdr mapping
This struct is filled by PARSING the compressed header bitstream — NOT from VAAPI. The libva backend runs a VPX boolean decoder over surface_object->source_data + uncompressed_header_size for compressed_header_size bytes, follows the VP9 spec section 6.3, and applies inv_map_table[d] for each updated probability.
The parsing logic is direct port of FFmpeg fill_compressed_hdr (lines 99-261). Key syntax elements parsed:
- tx_mode (2 bits, then conditional 1 bit)
- TX 8x8/16x16/32x32 probability updates (only if tx_mode == SELECT)
- Coef probability updates (4-level nested loop with branch probs)
- Skip / inter_mode / interp_filter / is_inter / comp_mode / single_ref / comp_ref / y_mode / partition probability updates (only on inter frames)
- MV probability updates (joint / sign / classes / class0_bit / bits / class0_fr / fr / class0_hp / hp)
Each updated value goes through inv_map_table[] (the 255-entry lookup from B4). Each "no update" bit leaves zero in the kernel struct.
Patch shape prediction
| Site | Action | LOC delta |
|---|---|---|
| src/config.c:121-160 | INSERT VP9 enumeration block | +10 |
| src/config.c:54-78 | INSERT VP9 case + break + comment | +5 |
| src/config.c:167-191 | INSERT VP9 case in fall-through | +1 |
| src/vp9.c | NEW FILE | +500-600 |
| src/vp9.h | NEW FILE | +35-45 |
| src/picture.c:34-37 | INSERT #include "vp9.h" | +1 |
| src/picture.c:188-225 | INSERT VP9 dispatch case | +6 |
| src/picture.c:54-186 | INSERT 2 buffer-type cases | +14 |
| src/surface.h:92-119 | INSERT vp9 struct | +6 |
| src/meson.build:50,73 | INSERT 2 entries | +2 |
Total: ~580-690 LOC across 4 modified files (config.c, picture.c, surface.h, meson.build) + 2 new files. Larger than both iter3 VP8 (370 LOC) and iter2 HEVC (470 LOC). Compressed-header parser is the dominant cost.
Predicted commits:
- Commit A: src/config.c enumeration + dispatch + entrypoints (Criterion 1).
- Commit B: NEW src/vp9.c + src/vp9.h + src/meson.build (10 contract clauses + VPX rac decoder + compressed-header parser).
- Commit C: src/picture.c dispatcher + 2 buffer-type cases + src/surface.h union extension (Criteria 2-3).
- Commit D: optional fix-forward placeholder.
Open questions for Phase 3 baseline
1. Loop filter ref/mode deltas: VAAPI doesn't expose lf_delta.ref/mode/enabled/updated. Are these always zero for BBB? Phase 3 strace of FFmpeg-v4l2request VP9 will reveal verbatim values.
2. Quantization base_q_idx + deltas: VAAPI exposes effective per-segment scales but not the base. Phase 3 baseline: capture verbatim FRAME control payload to see what FFmpeg-v4l2request writes; correlate against VAAPI's per-segment scale via VP9 spec quantization table.
3. Reference mode: VAAPI doesn't expose comppredmode. Phase 3 baseline: verify default V4L2_VP9_REFERENCE_MODE_SELECT works for BBB.
4. Interpolation filter mapping: FFmpeg uses filtermode ^ (filtermode <= 1) to remap; VAAPI's mcomp_filter_type may already be in V4L2 enum order (no remap needed) OR in a different order. Empirically check.
5. Reset frame context mapping: FFmpeg uses x > 0 ? x - 1 : 0. Either FFmpeg's source enum is offset by 1 from V4L2's, or there's an off-by-one. Empirically verify.
6. VAAPI per-segment field interpretation: slice->seg_param[s].filter_level[4][2] and quant scales are EFFECTIVE values (computed by mpv-VAAPI consumer). Mapping back to kernel's "ALT_Q delta" + "ALT_L delta" + "REF_FRAME" feature bits is non-trivial. Phase 3 verbatim payload + mapping-back-to-VAAPI cross-check.
7. Does mpv 0.41.0 engage HW for VP9?: Phase 3 capture mpv -v --hwdec=vaapi --vo=null --frames=2 ~/fourier-test/bbb_720p10s_vp9.webm and grep for Selected decoder: vp9 vs Using software decoding. iter3 VP8 fell back; iter4 VP9 may or may not.
8. Does rkvdec exhibit the same dma_resv kernel issue as hantro?: iter3 found hantro CAPTURE returns all-zero pages from libva readback. rkvdec is a different driver subsystem; iter1+iter2 successfully verified via mpv-DMA-BUF-GL on rkvdec. Predicted: rkvdec works fine for direct readback. Phase 3 baseline: re-test ffmpeg-vaapi-hwdownload on rkvdec for VP9 and check if output is non-zero.
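For the interpolation-filter and reset-frame-context questions above, it helps to see what the two FFmpeg expressions actually do to each input value. The sketch below is only a truth-table aid: the function names are hypothetical, and reading the reset expression as x > 0 ? x - 1 : 0 is an assumption about the quoted shorthand. Whether VAAPI's fields need the same adjustment is exactly what the Phase 3 capture has to answer.

```c
#include <assert.h>

/* filtermode ^ (filtermode <= 1): swaps values 0 and 1, leaves 2..4 alone. */
static unsigned sketch_interp_filter_remap(unsigned filtermode)
{
    assert(filtermode <= 4);
    return filtermode ^ (unsigned)(filtermode <= 1);
}

/* x > 0 ? x - 1 : 0: folds the 2-bit range 0..3 onto 0..2 (0 and 1 both map to 0). */
static unsigned sketch_reset_frame_ctx_remap(unsigned resetctx)
{
    assert(resetctx <= 3);
    return resetctx > 0 ? resetctx - 1 : 0;
}
```

The second mapping is consistent with the 0..2 range noted for reset_frame_context in the FRAME struct, which suggests the source field is a raw 2-bit value rather than an off-by-one bug.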
Phase 3 baseline targets (work plan)
- Cross-validator capture: strace -ff -tt -y -v -e trace=ioctl ffmpeg -hwaccel v4l2request -i bbb_720p10s_vp9.webm -frames:v 5 -f null - 2>strace.log. Decode VP9_FRAME + COMPRESSED_HDR payloads via Phase 3 decoder (extend decode_vp8.py for VP9 layout).
- VAAPI consumer trace: LIBVA_TRACE on mpv-SW + mpv-vaapi runs to see what buffer types mpv produces.
- Cache-safe verify reference: mpv --hwdec=no --vo=image --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp9.webm and capture frame-0001/0002 SHA256 (criterion-4 anchor).
- rkvdec readback path test: re-run ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i bbb_720p10s_vp9.webm -vf hwdownload -frames:v 5 after install (would be Phase 6 actually; Phase 3 just baseline-captures the SW reference). Confirm whether rkvdec hits dma_resv issue or not (predicted: NO based on iter1+iter2 working there).
- mpv-VP9-vaapi engagement check: per memory feedback_hw_decode_engagement_check.md, verify HW path engaged via mpv -v log BEFORE claiming criterion 4.
Phase 4 plan structure (anticipated)
Following iter2/iter3's clause template:
- Clause 1: Submission shape — 2 controls batched per frame
- Clause 2: Local struct alloc + zero-init (memset both)
- Clause 3: Frame geometry + scalars + flags
- Clause 4: DPB timestamp resolution (3 active refs from 8-slot DPB)
- Clause 5: Loop filter mapping (with VAAPI gap notes per Q1)
- Clause 6: Quantization mapping (with VAAPI gap notes per Q2)
- Clause 7: Segmentation mapping (with VAAPI per-segment effective-vs-delta unpacking per Q6)
- Clause 8: Compressed header parser: port FFmpeg fill_compressed_hdr + VPX rac decoder + inv_map_table
- Clause 9: Final 2-control batched submission
- Clause 10: Bitstream offsetting: surface_object->source_data + uncompressed_header_size is the start of compressed-header bytes; compressed_header_size is the byte length
The plan will cite verbatim Phase 3 baseline payload bytes for all fields where mapping is non-obvious (loop-filter deltas, quant base, segmentation feature mapping) per feedback_dev_process.md Phase 6 contract-before-code.
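Clause 10's offsetting is simple pointer arithmetic, but it is worth bounds-checking because both sizes come from the VAAPI consumer. A minimal sketch, assuming the backend knows the total size of source_data; the sketch_span type, the function name and the source_size parameter are hypothetical and not part of the fork.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view onto the compressed-header bytes. */
struct sketch_span {
    const uint8_t *ptr;
    size_t len;
};

/*
 * Locate the compressed header inside the frame's bitstream buffer.
 * Returns 0 and fills *out on success, -1 if the two header sizes do
 * not fit inside the source_size bytes actually provided.
 */
static int sketch_compressed_hdr_span(const uint8_t *source_data,
                                      size_t source_size,
                                      uint16_t uncompressed_header_size,
                                      uint16_t compressed_header_size,
                                      struct sketch_span *out)
{
    size_t start = uncompressed_header_size;
    size_t len = compressed_header_size;

    assert(out != NULL);
    if (source_data == NULL)
        return -1;
    /* Written to avoid overflow: check start first, then the remainder. */
    if (start > source_size || len > source_size - start)
        return -1;

    out->ptr = source_data + start;
    out->len = len;
    return 0;
}
```

The resulting span is what a boolean decoder would be initialised on; rejecting out-of-range sizes up front keeps the parser from reading past the slice data buffer.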
Substrate state at Phase 2 close
- iter4 Phase 1 commit 9a71dbf pushed to gitea.
- Fork on noether at iter3 tip e1aca9c (synced via git fetch && git merge --ff-only).
- All Phase 3 prerequisites identified.
- Memory rules apply unchanged.
- Phase 3 questions queued (8 items, mostly empirical). Phase 5 review will catch the field-availability + mapping questions analogous to iter3 (uniform_spacing_flag Direction 2 lesson).