iter3 Phase 2: situation analysis, VP8 backend gaps + contract surface

Source-read of every file the iter3 patch series will touch, plus the
kernel UAPI + VAAPI + downstream FFmpeg + kernel hantro reference
sources. Conducted on noether against fork tip 8d71e20 (iter2 Phase 6
commit B); fresnel.vpn was unreachable so Phase 3 baseline empirical
capture defers until laptop reachable.

Bug enumeration (10 sites the patch series must touch):
  B1  config.c::RequestQueryConfigProfiles     enumeration block missing
  B2  config.c::RequestCreateConfig            VP8 case label missing
  B3  config.c::RequestQueryConfigEntrypoints  VP8 case missing
  B4  src/vp8.c                                new file ~160-220 LOC
  B5  src/vp8.h                                new file ~35-45 LOC
  B6  picture.c::codec_set_controls            VP8 dispatch missing
  B7  picture.c::codec_store_buffer            4 buffer-type cases +
                                               VAProbabilityDataBufferType
                                               outer case missing
  B8  picture.c::RequestBeginPicture           per-frame reset additions
  B9  surface.h::object_surface::params        union vp8 member missing
  B10 meson.build                              vp8.c/vp8.h not in lists

Non-bugs (intentionally untouched):
  - context.c (no DECODE_MODE/START_CODE menus for VP8)
  - video.c (CAPTURE-side format list; VP8 is OUTPUT-side)
  - v4l2.c (fourcc-agnostic helpers)
  - buffer.c (buffer registry is type-agnostic)
  - include/hevc-ctrls.h (already includes <linux/v4l2-controls.h>
    which holds V4L2_CID_STATELESS_VP8_FRAME)

Contract surface cited verbatim:
  - V4L2_CID_STATELESS_VP8_FRAME = V4L2_CID_CODEC_STATELESS_BASE+200
    = 0x00a409c8 (matches Phase 0 V4L2 inventory)
  - struct v4l2_ctrl_vp8_frame at <linux/v4l2-controls.h>:1929-1958
    + 5 sub-structs (segment, lf, quant, entropy, coder_state) at
    1785-1888
  - VAAPI VAPictureParameterBufferVP8 + VASliceParameterBufferVP8 +
    VAProbabilityDataBufferVP8 + VAIQMatrixBufferVP8 at
    references/libva/va/va_dec_vp8.h
  - FFmpeg v4l2_request_vp8.c reference: single batched S_EXT_CTRLS
    at end_frame, count=1, no init-time menus
  - Kernel hantro_vp8.c::hantro_vp8_prob_update reads 9 fields from
    hdr (skip/intra/last/gf probs, segment_probs, entropy.{y,uv,mv,
    coeff}_probs)

VAAPI → V4L2 mapping table: 30 fields enumerated. Open questions for
Phase 3 baseline (6 items: first_part_header_bits derivation,
num_dct_parts off-by-one, DPB timestamp 0-sentinel handling, show_frame
default, lf.flags FILTER_TYPE_SIMPLE bit, first-frame DPB sentinel).

Patch-shape prediction: ~260-340 LOC across 4 modified + 2 new
files. Medium-sized iter, between iter1's 120 LOC (3 modified +
1 deleted) and iter2's 470 LOC (5 modified). The new file dominates.

Phase 3 baseline targets queued: cross-validator strace verbatim
S_EXT_CTRLS payload capture, VAAPI consumer trace, mpv-SW reference
JPEG capture for criterion 4 byte-compare anchor.

Phase 4 plan structure anticipated: 10-clause template per iter2.

Refs:
  phase0_findings_iter3.md (Phase 1 lock)
  phase8_iteration2_close.md (predecessor close)
  src/mpeg2.c (iter1 single-codec template; iter3 will mirror shape)
  src/h265.c (iter2 dispatcher pattern; iter3 takes structure cues)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Iteration 3, Phase 2 (situation analysis)

Source-read of every file the iter3 patch series will touch, plus the kernel UAPI + VAAPI + downstream FFmpeg + kernel hantro reference sources. Written immediately after iter3 Phase 1 lock (commit `ea2413e`). Conducted on noether against fork tip `8d71e20` (iter2 Phase 6 commit B); fresnel.vpn was unreachable at Phase 2 open, so the read is against the noether mirror, verified at commit-hash level before the read.

This is a contract-before-code analysis per `feedback_dev_process.md` Phase 2: enumerate the bugs, cite the contract verbatim, predict the patch shape, queue the Phase 3 baseline questions.

## Bug enumeration (sites the iter3 patch series must touch)

### B1: `src/config.c::RequestQueryConfigProfiles` (VP8 enumeration block missing)

**Site**: `config.c:121-165`.

**Current state** (lines 128-160): three enumeration blocks for MPEG-2 (lines 128-137), H.264 (139-151), HEVC (153-160). Each block calls `v4l2_find_format()` on the OUTPUT-side pixfmt for both the single-plane and MPLANE buffer types, then conditionally appends profile constants to the output array under a count guard.

**Bug**: no analogous block for `V4L2_PIX_FMT_VP8_FRAME` → `VAProfileVP8Version0_3`. Without this, `vainfo` (and any consumer that calls `vaQueryConfigProfiles`) sees no VP8 profile in the enumeration → criterion 1 fails before `vaCreateConfig` is ever attempted.

**Different from iter1+iter2**: iter1 (MPEG-2) and iter2 (HEVC) had the enumeration block already in place pre-iter; only the case label in `RequestCreateConfig` was missing. iter3 has neither, so both are ADDs.

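
The append-under-count-guard pattern that B1 has to replicate can be sketched in isolation as follows. This is a minimal illustration with the V4L2 probe stubbed out: `sketch_has_format()`, `SKETCH_MAX_PROFILES`, and the enum values are hypothetical stand-ins, whereas the fork's real block calls `v4l2_find_format()` and appends libva `VAProfile` constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for libva's VAProfile values and the driver's cap. */
enum sketch_profile { SKETCH_PROFILE_HEVC_MAIN = 1, SKETCH_PROFILE_VP8 = 2 };
#define SKETCH_MAX_PROFILES 4

/* Stand-in for the OUTPUT-side pixel-format probe. */
static inline bool sketch_has_format(const bool *supported, size_t fmt_index)
{
	return supported[fmt_index];
}

/*
 * Append `profile` only if the format is advertised and the caller's array
 * still has room. Returns the (possibly unchanged) profile count.
 */
static inline int sketch_append_profile(int *profiles, int count,
					const bool *supported,
					size_t fmt_index, int profile)
{
	assert(profiles != NULL && supported != NULL);
	if (sketch_has_format(supported, fmt_index) &&
	    count < SKETCH_MAX_PROFILES)
		profiles[count++] = profile;
	return count;
}
```

The VP8 block would be one more call of this shape, placed after the HEVC one, so a device without the VP8 OUTPUT format simply leaves the count unchanged.
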
### B2: `src/config.c::RequestCreateConfig` (VP8 case label missing entirely)

**Site**: `config.c:54-78`.

**Current state**: switch over `profile`. iter1 added `case VAProfileMPEG2Simple/Main:` with explicit `break;` (lines 63-69). iter2 added `case VAProfileHEVCMain:` with `break;` (lines 70-75). H.264 always existed (lines 56-62, marked `// FIXME` from upstream). Default → `VA_STATUS_ERROR_UNSUPPORTED_PROFILE`.

**Bug**: no `case VAProfileVP8Version0_3:`. Hits default → consumer gets `VA_STATUS_ERROR_UNSUPPORTED_PROFILE` from `vaCreateConfig` → criterion 2 fails.

**Patch shape**: add 4-line case (label + comment + `break;`) directly after the iter2 HEVCMain block, mirroring iter1+iter2 style.

### B3: `src/config.c::RequestQueryConfigEntrypoints` (VP8 case missing)

**Site**: `config.c:167-191`.

**Current state**: switch over `profile`; case list at lines 173-180 covers MPEG-2/H.264/HEVC and falls through to `entrypoints[0] = VAEntrypointVLD; *entrypoints_count = 1;`. Default sets count to 0.

**Bug**: no `case VAProfileVP8Version0_3:`. mpv-vaapi's profile probe queries entry points; without VLD, it skips VP8 → criterion 3 fails (mpv falls through to SW decode silently).

**Patch shape**: add `case VAProfileVP8Version0_3:` to the existing fall-through case list.

### B4: `src/vp8.c` (file does not exist; needs net-new implementation)

**Site**: NEW FILE `src/vp8.c`.

**Bug**: there is no VP8 codec dispatcher in the fork. The fork's predecessor (libva-v4l2-request bootlin master) only implements MPEG-2 + H.264 + HEVC. VP8 was never added upstream.

**Patch shape**: NEW file, ~160-220 lines. Mirror the iter1 mpeg2.c template (`src/mpeg2.c:53-249`):

- Includes block (mpeg2.h-equivalent + context + request + surface + v4l2-controls)
- `vp8_set_controls()` function entry point matching the existing dispatcher signature `(struct request_data *driver_data, struct object_context *context_object, struct object_surface *surface_object) -> int`
- Local `v4l2_ctrl_vp8_frame` struct populated from VAAPI buffers (Picture + IQMatrix + Probability + Slice param)
- DPB-timestamp lookup for `last_frame_ts`/`golden_frame_ts`/`alt_frame_ts` from `VASurfaceID` references in `VAPictureParameterBufferVP8`
- One-element `v4l2_ext_control` array, single `V4L2_CID_STATELESS_VP8_FRAME` control
- Single `v4l2_set_controls(driver_data->video_fd, surface_object->request_fd, ctrls, 1)` call

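
The last two bullets (one control, one call) could look roughly like the sketch below. It is deliberately header-free: `struct sketch_vp8_frame` and `struct sketch_ext_control` are hypothetical stand-ins for the kernel's `struct v4l2_ctrl_vp8_frame` and `struct v4l2_ext_control`, and only the control ID value is taken from this analysis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value cited in this analysis for V4L2_CID_STATELESS_VP8_FRAME. */
#define SKETCH_CID_VP8_FRAME 0x00a409c8u

/* Hypothetical stand-ins for the kernel UAPI types (see lead-in). */
struct sketch_vp8_frame { uint16_t width, height; uint64_t flags; };
struct sketch_ext_control { uint32_t id; uint32_t size; void *ptr; };

/*
 * Fill the single control descriptor that would be handed to the fork's
 * v4l2_set_controls(video_fd, request_fd, ctrls, 1). Returns the count.
 */
static inline unsigned int
sketch_build_vp8_ctrls(struct sketch_ext_control *ctrl,
		       struct sketch_vp8_frame *frame)
{
	assert(ctrl != NULL && frame != NULL);
	memset(ctrl, 0, sizeof(*ctrl));
	ctrl->id = SKETCH_CID_VP8_FRAME;
	ctrl->size = sizeof(*frame);
	ctrl->ptr = frame;
	return 1; /* one control per frame, no init-time menus */
}
```

Keeping the count at exactly 1 is what makes the VP8 path simpler than the H.264/HEVC dispatchers, which batch several controls per request.
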
### B5: `src/vp8.h` (header does not exist)

**Site**: NEW FILE `src/vp8.h`.

**Bug**: companion header for vp8.c. Declare `vp8_set_controls()`. Mirror `src/mpeg2.h` (forward declarations of `request_data`, `object_context`, `object_surface`, function prototype). No struct definitions needed (no array dimensions to declare like HEVC's `HEVC_MAX_SLICES_PER_FRAME`).

### B6: `src/picture.c::codec_set_controls` (VP8 dispatch case missing)

**Site**: `picture.c:188-225` (function `codec_set_controls`).

**Current state**: switch over profile; MPEG-2 → `mpeg2_set_controls` (lines 196-201), H.264 → `h264_set_controls` (203-212), HEVCMain → `h265_set_controls` (214-218). Default → `VA_STATUS_ERROR_UNSUPPORTED_PROFILE`.

**Bug**: no VP8 case. Hits default after `RequestEndPicture` → `vaEndPicture` returns error → consumer aborts decode.

**Patch shape**: add `case VAProfileVP8Version0_3:` calling `vp8_set_controls(driver_data, context_object, surface_object)` with the same `if (rc < 0) return VA_STATUS_ERROR_OPERATION_FAILED;` shape as MPEG-2 + HEVC.

Plus include directive update: add `#include "vp8.h"` near `picture.c:34-36` (the existing `h264.h`/`h265.h`/`mpeg2.h` block).

### B7: `src/picture.c::codec_store_buffer` (4 VAAPI buffer types unmapped)

**Site**: `picture.c:54-186` (function `codec_store_buffer`).

VAAPI VP8 sends four distinct per-frame parameter buffer types plus the slice-data bitstream buffer (per `va_dec_vp8.h:71-241`):

| VAAPI buffer type | VAAPI struct | Per-frame |
|---|---|---|
| `VAPictureParameterBufferType` | `VAPictureParameterBufferVP8` | once |
| `VASliceParameterBufferType` | `VASliceParameterBufferVP8` | once (frame-mode) |
| `VAProbabilityDataBufferType` | `VAProbabilityDataBufferVP8` | once |
| `VAIQMatrixBufferType` | `VAIQMatrixBufferVP8` | once |
| `VASliceDataBufferType` | raw bitstream | once |

**Current state**:

- `VASliceDataBufferType` (lines 61-83): already universal, no per-profile branch. The `context->h264_start_code` flag prepends `00 00 01` for H.264 only; VP8 does not need a start-code prefix (VP8 has its own 3-byte uncompressed frame header). The slice-data path is fine for VP8 unmodified.
- `VAPictureParameterBufferType` (lines 85-113): switch over profile; MPEG-2/H.264/HEVC handled. Default → break (silent ignore). Bug: no VP8 case.
- `VASliceParameterBufferType` (lines 115-146): switch; H.264/HEVC handled. No MPEG-2 case (intentional: MPEG-2 has only Picture + Quant + Slice-data per VAAPI). Bug: no VP8 case.
- `VAIQMatrixBufferType` (lines 148-179): switch; MPEG-2/H.264/HEVC handled. Bug: no VP8 case.
- `VAProbabilityDataBufferType`: NOT IN THE OUTER SWITCH. VAAPI defines this enum value for VP8, but the fork's `codec_store_buffer` outer switch doesn't list it. Currently falls through to `default: break;` at line 181. Bug: `VAProbabilityDataBufferType` case missing entirely.

**Patch shape**: 3 nested case adds + 1 new outer case containing a fourth nested VP8 case:

- `VAPictureParameterBufferType` → add VP8 case → memcpy into `surface_object->params.vp8.picture`
- `VASliceParameterBufferType` → add VP8 case → memcpy into `surface_object->params.vp8.slice` (single, no `slices[]` array; VP8 is frame-mode)
- `VAIQMatrixBufferType` → add VP8 case → memcpy into `surface_object->params.vp8.iqmatrix` + set `iqmatrix_set` true
- NEW outer case `VAProbabilityDataBufferType` → switch over profile → VP8 case → memcpy into `surface_object->params.vp8.probability` + set `probability_set` true

### B8: `src/picture.c::RequestBeginPicture` (per-frame VP8 reset lines missing)

**Site**: `picture.c:227-306`.

iter1 added `surface_object->params.h264.matrix_set = false;` at line 299. iter2 added `surface_object->params.h265.num_slices = 0;` at line 300.

**Bug analysis**: VP8 has no slice array (single slice-parameter buffer per frame), so there is no counter to reset. It does have a probability-data flag (`probability_set`) that needs a per-frame reset, and `iqmatrix_set` needs the same reset.

**Patch shape**: add two lines:

- `surface_object->params.vp8.iqmatrix_set = false;`
- `surface_object->params.vp8.probability_set = false;`

This mirrors iter1's `matrix_set = false` reset pattern (one line per flag).

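
The store-then-reset flow from B7 and B8 can be sketched together. This is an illustration under assumptions, not the fork's code: the `sketch_` types are hypothetical stand-ins for `VAIQMatrixBufferVP8` and `VAProbabilityDataBufferVP8` (array shapes taken from the VAAPI section of this analysis), and the size check is an extra safeguard that the existing `codec_store_buffer` cases may or may not perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two VAAPI buffer structs. */
struct sketch_iqmatrix { uint16_t quantization_index[4][6]; };
struct sketch_probability { uint8_t dct_coeff_probs[4][8][3][11]; };

struct sketch_vp8_params {
	struct sketch_iqmatrix iqmatrix;
	bool iqmatrix_set;
	struct sketch_probability probability;
	bool probability_set;
};

enum sketch_buffer_type { SKETCH_BUF_IQMATRIX, SKETCH_BUF_PROBABILITY };

/* Copy one client buffer into the per-surface params; -1 on size mismatch. */
static inline int sketch_store_vp8_buffer(struct sketch_vp8_params *p,
					  enum sketch_buffer_type type,
					  const void *data, size_t size)
{
	assert(p != NULL && data != NULL);
	switch (type) {
	case SKETCH_BUF_IQMATRIX:
		if (size != sizeof(p->iqmatrix))
			return -1;
		memcpy(&p->iqmatrix, data, size);
		p->iqmatrix_set = true;
		return 0;
	case SKETCH_BUF_PROBABILITY:
		if (size != sizeof(p->probability))
			return -1;
		memcpy(&p->probability, data, size);
		p->probability_set = true;
		return 0;
	}
	return -1;
}

/* Per-frame reset, as BeginPicture would do before new buffers arrive. */
static inline void sketch_begin_picture(struct sketch_vp8_params *p)
{
	assert(p != NULL);
	p->iqmatrix_set = false;
	p->probability_set = false;
}
```

Without the reset, a surface reused for a later frame would still report the previous frame's buffers as present, which is why each `_set` flag gets its own reset line.
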
### B9: `src/surface.h::object_surface::params` union (no `vp8` member)

**Site**: `surface.h:92-113`.

**Current state**: union of three structs: `mpeg2`, `h264`, `h265`. Each holds the buffer-type structs the dispatcher reads.

**Bug**: no `vp8` member. Related context: the iter1 B3 latent surface-reuse bug (per phase0_findings_iter3.md), where `picture.c:299` writes byte 240 of the union (the `h264.matrix_set` offset). Since iter2 the union is dominated by `h265` with its `slices[64]` array; total union size is ~17 KB. Adding a `vp8` member does not grow the union, because `h265` remains the largest member by far.

**Patch shape**: add `vp8` struct after `h265`:

```c
struct {
	VAPictureParameterBufferVP8 picture;
	VASliceParameterBufferVP8 slice;
	VAIQMatrixBufferVP8 iqmatrix;
	bool iqmatrix_set;
	VAProbabilityDataBufferVP8 probability;
	bool probability_set;
} vp8;
```

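
The claim that the new member does not grow the union follows from how C sizes a union: it is as large as its largest member (rounded up for alignment). A small sketch with made-up sizes, where `sketch_big` plays the role of the `h265` member and `sketch_small` the proposed `vp8` member; both byte counts are hypothetical, chosen only to be roughly in proportion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes: ~17 KB for the dominant member, ~1.5 KB for vp8. */
struct sketch_big   { uint8_t bytes[17000]; };
struct sketch_small { uint8_t bytes[1500]; };

union sketch_params_before { struct sketch_big h265; };
union sketch_params_after  { struct sketch_big h265; struct sketch_small vp8; };

/* Compile-time check: the smaller member leaves the union size unchanged. */
_Static_assert(sizeof(union sketch_params_after) ==
	       sizeof(union sketch_params_before),
	       "adding a smaller member must not grow the union");

/* Number of bytes by which the union grew after adding the vp8 member. */
static inline size_t sketch_union_growth(void)
{
	return sizeof(union sketch_params_after) -
	       sizeof(union sketch_params_before);
}
```

A similar `_Static_assert` on the real `object_surface` would be one way to confirm the "no growth" prediction at build time in Phase 6.
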
### B10: `src/meson.build` (`vp8.c` + `vp8.h` not in sources/headers)

**Site**: `meson.build:30-74`.

**Current state**: `sources` list has `mpeg2.c`/`h264.c`/`h264_slice_header.c`/`h265.c` (line 50, uncommented in iter2). `headers` list has `mpeg2.h`/`h264.h`/`h264_slice_header.h`/`h265.h` (line 73).

**Bug**: vp8.c + vp8.h are NEW files, must be ADDED.

**Patch shape**: insert `'vp8.c'` after `'h265.c'` in sources, insert `'vp8.h'` after `'h265.h'` in headers.

### Non-bugs (intentionally NOT touched)

- `src/context.c`: VP8 has no DECODE_MODE/START_CODE menus per Phase 0 V4L2 inventory. iter2's HEVC additions to context.c have no analog. **No context.c changes.**
- `src/video.c::formats[]`: the format list is CAPTURE-side (NV12 + Sunxi NV12). VP8 is OUTPUT-side; OUTPUT format probing is done by the `v4l2_find_format()` calls in config.c, NOT video.c. **No video.c changes.**
- `src/v4l2.c`: `v4l2_find_format()` is fourcc-agnostic. **No v4l2.c changes.**
- `src/buffer.c`: `VAProbabilityDataBufferType` is a standard VAAPI buffer type; the buffer registry is type-agnostic. **No buffer.c changes.**
- `include/hevc-ctrls.h`: already a 9-line shim including `<linux/v4l2-controls.h>`. VP8's `V4L2_CID_STATELESS_VP8_FRAME` is in the same kernel UAPI header (line 1900). No header-shim work like iter1's `mpeg2-ctrls.h` deletion.

## Contract surface (verbatim from kernel UAPI + VAAPI)

### Kernel UAPI: `V4L2_CID_STATELESS_VP8_FRAME`

`<linux/v4l2-controls.h>:1900`: `V4L2_CID_STATELESS_VP8_FRAME = V4L2_CID_CODEC_STATELESS_BASE + 200 = 0x00a409c8`. Matches the per-device control advertised by hantro-vpu-dec in Phase 0 V4L2 inventory (`vp8_frame_parameters 0x00a409c8`).

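
The hex value can be re-derived by hand: the stateless codec class is `0x00a40000`, the base adds `0x900`, and 200 decimal is `0xc8`. The sketch below repeats that arithmetic with locally defined `SKETCH_` constants (copies of the UAPI values, redefined so the check builds without kernel headers).

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the UAPI constants, prefixed to avoid header clashes. */
#define SKETCH_CTRL_CLASS_CODEC_STATELESS 0x00a40000u
#define SKETCH_CID_CODEC_STATELESS_BASE \
	(SKETCH_CTRL_CLASS_CODEC_STATELESS | 0x900u)
#define SKETCH_CID_STATELESS_VP8_FRAME \
	(SKETCH_CID_CODEC_STATELESS_BASE + 200u)

/* Returns the control ID the backend is expected to submit per frame. */
static inline uint32_t sketch_vp8_frame_cid(void)
{
	return SKETCH_CID_STATELESS_VP8_FRAME;
}
```
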
### Kernel UAPI: `struct v4l2_ctrl_vp8_frame` (`<linux/v4l2-controls.h>:1929-1958`)

```c
struct v4l2_ctrl_vp8_frame {
	struct v4l2_vp8_segment segment;            /* offset 0 */
	struct v4l2_vp8_loop_filter lf;             /* loop filter parameters */
	struct v4l2_vp8_quantization quant;         /* base quant indices */
	struct v4l2_vp8_entropy entropy;            /* update probabilities */
	struct v4l2_vp8_entropy_coder_state coder_state;

	__u16 width;
	__u16 height;

	__u8 horizontal_scale;
	__u8 vertical_scale;

	__u8 version;
	__u8 prob_skip_false;
	__u8 prob_intra;
	__u8 prob_last;
	__u8 prob_gf;
	__u8 num_dct_parts;

	__u32 first_part_size;
	__u32 first_part_header_bits;
	__u32 dct_part_sizes[8];

	__u64 last_frame_ts;
	__u64 golden_frame_ts;
	__u64 alt_frame_ts;

	__u64 flags;
};
```

Sub-structs (`<linux/v4l2-controls.h>:1785-1888`):

- `v4l2_vp8_segment`: `__s8 quant_update[4]; __s8 lf_update[4]; __u8 segment_probs[3]; __u8 padding; __u32 flags;` (segment-id probabilities, per-segment quant/lf overrides, flags `V4L2_VP8_SEGMENT_FLAG_{ENABLED, UPDATE_MAP, UPDATE_FEATURE_DATA, DELTA_VALUE_MODE}`)
- `v4l2_vp8_loop_filter`: `__s8 ref_frm_delta[4]; __s8 mb_mode_delta[4]; __u8 sharpness_level; __u8 level; __u16 padding; __u32 flags;` (flags `V4L2_VP8_LF_{ADJ_ENABLE, DELTA_UPDATE, FILTER_TYPE_SIMPLE}`)
- `v4l2_vp8_quantization`: `__u8 y_ac_qi; __s8 y_dc_delta; __s8 y2_dc_delta; __s8 y2_ac_delta; __s8 uv_dc_delta; __s8 uv_ac_delta; __u16 padding;` (base values; per-segment overrides come from `segment.quant_update[]`)
- `v4l2_vp8_entropy`: `__u8 coeff_probs[4][8][3][11]; __u8 y_mode_probs[4]; __u8 uv_mode_probs[3]; __u8 mv_probs[2][19]; __u8 padding[3];` (probability update tables)
- `v4l2_vp8_entropy_coder_state`: `__u8 range; __u8 value; __u8 bit_count; __u8 padding;` (boolean coder state at end of header)

Frame flags (`<linux/v4l2-controls.h>:1890-1895`):

- `V4L2_VP8_FRAME_FLAG_KEY_FRAME = 0x01`
- `V4L2_VP8_FRAME_FLAG_EXPERIMENTAL = 0x02`
- `V4L2_VP8_FRAME_FLAG_SHOW_FRAME = 0x04`
- `V4L2_VP8_FRAME_FLAG_MB_NO_SKIP_COEFF = 0x08`
- `V4L2_VP8_FRAME_FLAG_SIGN_BIAS_GOLDEN = 0x10`
- `V4L2_VP8_FRAME_FLAG_SIGN_BIAS_ALT = 0x20`

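
As a cross-check on the field lists above, the layout can be mirrored with fixed-width types and its size asserted at compile time. The `sketch_` structs below are local copies written from the listing in this section, not the kernel header itself, and the expected sizes assume an ABI where fixed-width integers have their usual alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_vp8_segment {
	int8_t quant_update[4]; int8_t lf_update[4];
	uint8_t segment_probs[3]; uint8_t padding; uint32_t flags;
};
struct sketch_vp8_loop_filter {
	int8_t ref_frm_delta[4]; int8_t mb_mode_delta[4];
	uint8_t sharpness_level; uint8_t level; uint16_t padding; uint32_t flags;
};
struct sketch_vp8_quantization {
	uint8_t y_ac_qi;
	int8_t y_dc_delta, y2_dc_delta, y2_ac_delta, uv_dc_delta, uv_ac_delta;
	uint16_t padding;
};
struct sketch_vp8_entropy {
	uint8_t coeff_probs[4][8][3][11]; uint8_t y_mode_probs[4];
	uint8_t uv_mode_probs[3]; uint8_t mv_probs[2][19]; uint8_t padding[3];
};
struct sketch_vp8_coder_state { uint8_t range, value, bit_count, padding; };

struct sketch_vp8_frame {
	struct sketch_vp8_segment segment;
	struct sketch_vp8_loop_filter lf;
	struct sketch_vp8_quantization quant;
	struct sketch_vp8_entropy entropy;
	struct sketch_vp8_coder_state coder_state;
	uint16_t width, height;
	uint8_t horizontal_scale, vertical_scale;
	uint8_t version, prob_skip_false, prob_intra, prob_last, prob_gf;
	uint8_t num_dct_parts;
	uint32_t first_part_size, first_part_header_bits, dct_part_sizes[8];
	uint64_t last_frame_ts, golden_frame_ts, alt_frame_ts;
	uint64_t flags;
};

/* Sub-struct sizes: 16 + 16 + 8 + 1104 + 4 = 1148 bytes before `width`. */
_Static_assert(sizeof(struct sketch_vp8_segment) == 16, "segment");
_Static_assert(sizeof(struct sketch_vp8_loop_filter) == 16, "lf");
_Static_assert(sizeof(struct sketch_vp8_quantization) == 8, "quant");
_Static_assert(sizeof(struct sketch_vp8_entropy) == 1104, "entropy");
_Static_assert(sizeof(struct sketch_vp8_coder_state) == 4, "coder_state");

/* Total payload size the single VP8_FRAME control would carry. */
static inline size_t sketch_vp8_frame_size(void)
{
	return sizeof(struct sketch_vp8_frame);
}
```

Under those assumptions the mirrored frame struct comes to 1232 bytes with no implicit padding, which gives Phase 3 a concrete number to compare against the `size` field in the captured S_EXT_CTRLS payload.
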
### VAAPI buffer types (`/home/mfritsche/src/ohm_gl_fix/phase6/step1/reference/libva/va/va_dec_vp8.h`)

`VAPictureParameterBufferVP8` (lines 71-160):

- `frame_width`, `frame_height` (u32)
- `last_ref_frame`, `golden_ref_frame`, `alt_ref_frame`, `out_of_loop_frame` (VASurfaceID)
- `pic_fields.bits.{key_frame, version, segmentation_enabled, update_mb_segmentation_map, update_segment_feature_data, filter_type, sharpness_level, loop_filter_adj_enable, mode_ref_lf_delta_update, sign_bias_golden, sign_bias_alternate, mb_no_coeff_skip, loop_filter_disable}` (packed bitfield)
- `mb_segment_tree_probs[3]` (u8)
- `loop_filter_level[4]`, `loop_filter_deltas_ref_frame[4]`, `loop_filter_deltas_mode[4]` (per-segment / per-ref / per-mode)
- `prob_skip_false`, `prob_intra`, `prob_last`, `prob_gf` (u8)
- `y_mode_probs[4]`, `uv_mode_probs[3]` (u8; luma + chroma intra-prediction probs)
- `mv_probs[2][19]` (u8)
- `bool_coder_ctx.{range, value, count}` (u8; same bytes as kernel `v4l2_vp8_entropy_coder_state` minus `padding`)

`VASliceParameterBufferVP8` (lines 170-202):

- `slice_data_size`, `slice_data_offset`, `slice_data_flag`, `macroblock_offset` (u32)
- `num_of_partitions` (u8)
- `partition_size[9]` (u32): `partition_size[0]` is control-partition remaining bytes; `partition_size[1..8]` are DCT partition sizes (max 8 DCT partitions per VP8 spec)

`VAProbabilityDataBufferVP8` (lines 218-223):

- `dct_coeff_probs[4][8][3][11]` (u8): direct match to kernel `v4l2_vp8_entropy.coeff_probs`

`VAIQMatrixBufferVP8` (lines 232-241):

- `quantization_index[4][6]` (u16): per-segment, per-component effective Q index. Component order: yac(0), ydc(1), y2dc(2), y2ac(3), uvdc(4), uvac(5). Already includes per-segment effective values.

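
The partition fields are where the two APIs index differently, so a small sketch may help. It assumes the kernel's `num_dct_parts` counts DCT partitions only, while VAAPI's `num_of_partitions` also counts the control partition; that off-by-one is still to be confirmed empirically, and treating `partition_size[0]` as the first-partition size is likewise provisional. The `sketch_` structs are hypothetical stand-ins carrying only the relevant fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_DCT_PARTS 8

/* Hypothetical stand-ins for the VAAPI slice params and the kernel fields. */
struct sketch_va_slice {
	uint8_t num_of_partitions;
	uint32_t partition_size[SKETCH_MAX_DCT_PARTS + 1];
};
struct sketch_v4l2_parts {
	uint8_t num_dct_parts;
	uint32_t first_part_size;
	uint32_t dct_part_sizes[SKETCH_MAX_DCT_PARTS];
};

/* Returns 0 on success, -1 if the VAAPI partition count is out of range. */
static inline int sketch_map_partitions(const struct sketch_va_slice *slice,
					struct sketch_v4l2_parts *out)
{
	assert(slice != NULL && out != NULL);
	if (slice->num_of_partitions < 2 ||
	    slice->num_of_partitions > SKETCH_MAX_DCT_PARTS + 1)
		return -1;

	memset(out, 0, sizeof(*out));
	/* Drop the control partition from the count (assumed off-by-one). */
	out->num_dct_parts = (uint8_t)(slice->num_of_partitions - 1);
	out->first_part_size = slice->partition_size[0];
	/* Shift by one so DCT partition i comes from VAAPI slot i + 1. */
	for (unsigned int i = 0; i < out->num_dct_parts; i++)
		out->dct_part_sizes[i] = slice->partition_size[i + 1];
	return 0;
}
```

The range check keeps a malformed client count from indexing past `partition_size[8]`.
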
### FFmpeg downstream reference (`v4l2_request_vp8.c:31-187`)

Submission shape: single batched S_EXT_CTRLS at end_frame, count=1, `V4L2_CID_STATELESS_VP8_FRAME` with full `v4l2_ctrl_vp8_frame` struct. **No init-time device-wide menus** (no DECODE_MODE/START_CODE for VP8; confirmed by absence in FFmpeg ref + Phase 0 V4L2 inventory).

Bitstream is appended verbatim (`v4l2_request_vp8_decode_slice` calls `ff_v4l2_request_append_output(buffer, size)` once per frame with the WHOLE VP8 frame including 3-byte uncompressed header). NO Annex-B start codes, NO start-code emulation prevention. The kernel hantro driver re-parses the 3-byte (or 10-byte for keyframe) uncompressed header.

### Kernel hantro driver reference (`hantro_vp8.c:49-143`)

`hantro_vp8_prob_update()` reads:

- `hdr->prob_skip_false`, `hdr->prob_intra`, `hdr->prob_last`, `hdr->prob_gf`
- `hdr->segment.segment_probs[0..2]`
- `hdr->entropy.{y_mode_probs[4], uv_mode_probs[3], mv_probs[2][19], coeff_probs[4][8][3][11]}`

The kernel does NOT read `hdr->coder_state.padding`, `quant.padding`, or `lf.padding`. In the FFmpeg ref they end up zero because a C99 designated initializer zeroes every member it does not name. **All `padding` fields must be left zero in the libva backend** (matching the FFmpeg ref).

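
Two common ways to guarantee zeroed padding fields are shown below, using a hypothetical mirror of `struct v4l2_vp8_entropy_coder_state`. Which one the backend ends up using is a style choice for Phase 4; the sketch only shows that both leave `padding` at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct v4l2_vp8_entropy_coder_state. */
struct sketch_coder_state { uint8_t range, value, bit_count, padding; };

/* Option 1: designated initializer; unnamed members (padding) become 0. */
static inline struct sketch_coder_state
sketch_coder_state_init(uint8_t range, uint8_t value, uint8_t bit_count)
{
	struct sketch_coder_state cs = {
		.range = range,
		.value = value,
		.bit_count = bit_count,
	};
	return cs;
}

/* Option 2: clear an existing struct first, then fill field by field. */
static inline void sketch_coder_state_fill(struct sketch_coder_state *cs,
					   uint8_t range, uint8_t value,
					   uint8_t bit_count)
{
	assert(cs != NULL);
	memset(cs, 0, sizeof(*cs));
	cs->range = range;
	cs->value = value;
	cs->bit_count = bit_count;
}
```

Option 2 matters when the control struct lives in reused storage, where stale bytes from a previous frame would otherwise survive in the padding.
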
## Mapping table (VAAPI → V4L2 / kernel)

The libva backend's job: read VAAPI's per-frame buffers (Picture + Slice + Probability + IQMatrix) and write the kernel's `v4l2_ctrl_vp8_frame`. The VAAPI consumer (mpv/ffmpeg-vaapi) has already parsed the bitstream, so the libva backend is field-shuffling only, no bitstream parsing.

| Kernel field | VAAPI source | Notes |
|---|---|---|
| `width`, `height` | `picture->frame_width`, `frame_height` | u32 → u16, both ≤65535 within campaign codec scope (1920 max) |
| `version` | `picture->pic_fields.bits.version` | 3-bit field |
| `horizontal_scale`, `vertical_scale` | 0, 0 | VAAPI doesn't expose; FFmpeg ref also hardcodes 0 |
| `prob_skip_false` | `picture->prob_skip_false` | direct |
| `prob_intra` | `picture->prob_intra` | direct |
| `prob_last` | `picture->prob_last` | direct |
| `prob_gf` | `picture->prob_gf` | direct |
| `num_dct_parts` | `slice->num_of_partitions - 1` | VAAPI's count includes control partition; kernel's excludes (per-spec). Verify against Phase 3 trace. |
| `first_part_size` | `slice->partition_size[0]` | control-partition size; VAAPI describes this slot as the control partition's *remaining* bytes, so verify equality against Phase 3 trace |
| `first_part_header_bits` | DERIVED (see below) | not in VAAPI directly |
| `dct_part_sizes[0..7]` | `slice->partition_size[1..8]` | shift by 1 to skip control partition |
| `last_frame_ts` | DPB lookup `picture->last_ref_frame` | VASurfaceID → object_surface->timestamp → v4l2_timeval_to_ns() (mirror mpeg2.c::pic.forward_ref_ts pattern) |
| `golden_frame_ts` | DPB lookup `picture->golden_ref_frame` | same as above |
| `alt_frame_ts` | DPB lookup `picture->alt_ref_frame` | same as above |
| `flags & KEY_FRAME` | `picture->pic_fields.bits.key_frame == 0` | inverted sense: VAAPI carries the raw bitstream bit, where key_frame=0 means key frame; the kernel flag is set when the frame IS a key frame |
| `flags & SHOW_FRAME` | not in VAAPI | force 1 (mpv only renders shown frames; alt-ref invisible frames are also shown=1 on the mpv consumer side; safe to force) |
| `flags & MB_NO_SKIP_COEFF` | `picture->pic_fields.bits.mb_no_coeff_skip` | direct |
| `flags & SIGN_BIAS_GOLDEN` | `picture->pic_fields.bits.sign_bias_golden` | direct |
| `flags & SIGN_BIAS_ALT` | `picture->pic_fields.bits.sign_bias_alternate` | direct |
| `flags & EXPERIMENTAL` | 0 | VAAPI doesn't expose; FFmpeg uses `s->profile & 0x4` which has no VAAPI analog. Leave 0. |
| `coder_state.range` | `picture->bool_coder_ctx.range` | direct |
| `coder_state.value` | `picture->bool_coder_ctx.value` | direct |
| `coder_state.bit_count` | `picture->bool_coder_ctx.count` | VAAPI calls it `count` |
| `lf.sharpness_level` | `picture->pic_fields.bits.sharpness_level` | direct |
| `lf.level` | `picture->loop_filter_level[0]` | base level (segment 0); VAAPI exposes per-segment, kernel takes base only |
| `lf.ref_frm_delta[0..3]` | `picture->loop_filter_deltas_ref_frame[0..3]` | direct |
| `lf.mb_mode_delta[0..3]` | `picture->loop_filter_deltas_mode[0..3]` | direct |
| `lf.flags & ADJ_ENABLE` | `picture->pic_fields.bits.loop_filter_adj_enable` | direct |
| `lf.flags & DELTA_UPDATE` | `picture->pic_fields.bits.mode_ref_lf_delta_update` | direct |
| `lf.flags & FILTER_TYPE_SIMPLE` | `picture->pic_fields.bits.filter_type` | VAAPI: filter_type=0 normal, =1 simple |
| `quant.y_ac_qi` | `iqmatrix->quantization_index[0][0]` | segment 0, yac component |
| `quant.y_dc_delta` | `iqmatrix->quantization_index[0][1] - iqmatrix->quantization_index[0][0]` | u16 - u16 → s8 (clamp) |
| `quant.y2_dc_delta` | `iqmatrix->quantization_index[0][2] - iqmatrix->quantization_index[0][0]` | same |
| `quant.y2_ac_delta` | `iqmatrix->quantization_index[0][3] - iqmatrix->quantization_index[0][0]` | same |
| `quant.uv_dc_delta` | `iqmatrix->quantization_index[0][4] - iqmatrix->quantization_index[0][0]` | same |
| `quant.uv_ac_delta` | `iqmatrix->quantization_index[0][5] - iqmatrix->quantization_index[0][0]` | same |
| `segment.quant_update[s]` | for s∈[1..3]: `iqmatrix->quantization_index[s][0] - iqmatrix->quantization_index[0][0]` if segmentation enabled, else 0 | when segmentation_enabled=0 (BBB case), all quant_updates are 0, so the per-segment math is bypassed |
| `segment.lf_update[s]` | for s∈[1..3]: `picture->loop_filter_level[s] - picture->loop_filter_level[0]` if segmentation enabled, else 0 | same |
| `segment.segment_probs[0..2]` | `picture->mb_segment_tree_probs[0..2]` | direct |
| `segment.flags & ENABLED` | `picture->pic_fields.bits.segmentation_enabled` | direct |
| `segment.flags & UPDATE_MAP` | `picture->pic_fields.bits.update_mb_segmentation_map` | direct |
| `segment.flags & UPDATE_FEATURE_DATA` | `picture->pic_fields.bits.update_segment_feature_data` | direct |
| `segment.flags & DELTA_VALUE_MODE` | NOT in VAAPI directly | VAAPI doesn't expose abs_delta. Per VP8 spec default, segment values are deltas unless explicitly absolute; the FFmpeg ref sets DELTA_VALUE_MODE iff `!s->segmentation.absolute_vals`. For BBB (segmentation disabled), this flag's value is irrelevant. Leave 0; document the gap for Phase 5 review. |
| `entropy.y_mode_probs[0..3]` | `picture->y_mode_probs[0..3]` | direct |
| `entropy.uv_mode_probs[0..2]` | `picture->uv_mode_probs[0..2]` | direct |
| `entropy.mv_probs[i][j]` | `picture->mv_probs[i][j]` | direct, [2][19] both sides |
| `entropy.coeff_probs[i][j][k][l]` | `probability->dct_coeff_probs[i][j][k][l]` | DIFFERENT BUFFER: sourced from VAProbabilityDataBuffer, not Picture. Direct shape match [4][8][3][11]. |

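
Two of the less mechanical rows in the table, the clamped quant deltas and the frame flags with the inverted key-frame bit, might be implemented along these lines. The helper names are hypothetical; the flag values are the ones cited in the contract section, and forcing SHOW_FRAME follows the table's provisional choice rather than a confirmed kernel requirement.

```c
#include <assert.h>
#include <stdint.h>

/* Frame flag values as cited in the contract section. */
#define SKETCH_FLAG_KEY_FRAME        0x01u
#define SKETCH_FLAG_SHOW_FRAME       0x04u
#define SKETCH_FLAG_MB_NO_SKIP_COEFF 0x08u
#define SKETCH_FLAG_SIGN_BIAS_GOLDEN 0x10u
#define SKETCH_FLAG_SIGN_BIAS_ALT    0x20u

/* Difference of two effective quantizer indices, clamped to the s8 range. */
static inline int8_t sketch_quant_delta(uint16_t component_qi, uint16_t base_qi)
{
	int delta = (int)component_qi - (int)base_qi;

	if (delta > INT8_MAX)
		delta = INT8_MAX;
	if (delta < INT8_MIN)
		delta = INT8_MIN;
	return (int8_t)delta;
}

/*
 * Build the kernel frame flags from VAAPI pic_fields bits.
 * key_frame_bit is the raw VAAPI/bitstream bit: 0 means key frame.
 */
static inline uint64_t sketch_frame_flags(unsigned int key_frame_bit,
					  unsigned int mb_no_coeff_skip,
					  unsigned int sign_bias_golden,
					  unsigned int sign_bias_alternate)
{
	uint64_t flags = SKETCH_FLAG_SHOW_FRAME; /* forced, per the table */

	assert(key_frame_bit <= 1);
	if (key_frame_bit == 0)
		flags |= SKETCH_FLAG_KEY_FRAME;
	if (mb_no_coeff_skip)
		flags |= SKETCH_FLAG_MB_NO_SKIP_COEFF;
	if (sign_bias_golden)
		flags |= SKETCH_FLAG_SIGN_BIAS_GOLDEN;
	if (sign_bias_alternate)
		flags |= SKETCH_FLAG_SIGN_BIAS_ALT;
	return flags;
}
```

One caveat on the delta helper: VAAPI's indices are effective values, so if the consumer already clamped them to the valid index range, the recovered delta can differ from the one coded in the bitstream. That is one reason the table defers to the Phase 3 payload capture.
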
### `first_part_header_bits` derivation

This field is kernel-required metadata about the bitstream: the size in bits of the header portion of the first partition, i.e. how many bits are consumed from the start of the first partition before macroblock data begins. FFmpeg derives it from internal parser state:

```c
.first_part_header_bits = (8 * (s->coder_state_at_header_end.input - data) -
                           s->coder_state_at_header_end.bit_count - 8),
```

VAAPI does not expose this directly. **Open question for Phase 3 baseline**: derive from `slice->macroblock_offset` (bit offset of MB layer from start of slice data), which is likely equal or off by a known constant. Phase 3 captures the verbatim payload from ffmpeg-v4l2request and computes the relationship.

If the kernel ignores `first_part_header_bits`, the field can be left zero or approximate. Whether hantro does ignore it is not yet established by this read; a driver that uses the field to locate macroblock data would mis-decode with a zero value, so Phase 3 must confirm before zero is treated as a safe default. Phase 5 review will flag this as a known fidelity gap.

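
The quoted FFmpeg expression is plain pointer arithmetic, which the sketch below isolates so that Phase 3 numbers can be checked by hand. The parameter names are stand-ins for the parser state in the quote: `data` is assumed to be the first byte of the first partition, `input` the decoder's read pointer once header parsing finished, and `bit_count` the decoder's bit counter at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Same arithmetic as the quoted initializer:
 *   8 * (input - data) - bit_count - 8
 */
static inline uint32_t sketch_first_part_header_bits(const uint8_t *data,
						      const uint8_t *input,
						      int bit_count)
{
	ptrdiff_t bytes_consumed;

	assert(data != NULL && input != NULL && input >= data);
	bytes_consumed = input - data;
	return (uint32_t)(8 * bytes_consumed - bit_count - 8);
}
```

For example, with the read pointer 12 bytes into the partition and a bit counter of 3, the expression gives 8 * 12 - 3 - 8 = 85 bits. Comparing values like this against `slice->macroblock_offset` for the same frame is what the open question asks for.
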
## Patch shape prediction

| Site | Action | LOC delta |
|---|---|---|
| `src/config.c:121-165` | INSERT VP8 enumeration block (~10 lines) | +10 |
| `src/config.c:54-78` | INSERT case label + break + comment (~5 lines) | +5 |
| `src/config.c:167-191` | INSERT case label (~1 line) | +1 |
| `src/vp8.c` | NEW FILE | +160-220 |
| `src/vp8.h` | NEW FILE | +35-45 |
| `src/picture.c:34-36` | INSERT `#include "vp8.h"` | +1 |
| `src/picture.c:188-225` | INSERT VP8 dispatch case (~6 lines) | +6 |
| `src/picture.c:54-186` | INSERT 4 nested cases + 1 outer case | +30-40 |
| `src/picture.c:299-300` | INSERT 2 reset lines | +2 |
| `src/surface.h:92-113` | INSERT vp8 struct (~8 lines) | +8 |
| `src/meson.build:50,73` | INSERT 2 entries | +2 |

Total: ~260-340 LOC across 4 modified files (config.c, picture.c, surface.h, meson.build) + 2 new files. Compared to iter1 (~120 LOC, 4 modified + 0 new + 1 deleted) and iter2 (~470 LOC, 5 modified + 0 new + 0 deleted), iter3 is medium-sized, and the new file dominates. The dispatcher additions in picture.c + config.c are mechanical ports of iter1+iter2 patterns.

## Open questions for Phase 3 baseline

The Phase 3 baseline run will capture verbatim S_EXT_CTRLS payloads from `ffmpeg -hwaccel v4l2request bbb_720p10s_vp8.webm` (cross-validator anchor). Questions to answer empirically before Phase 4 plan locks:

1. **first_part_header_bits exact value**: capture for frame 1 (key) and frame 2 (inter). Compare against `slice->macroblock_offset` from a parallel `vainfo --vbo`-equivalent capture.
2. **num_dct_parts vs num_of_partitions**: confirm off-by-one (kernel excludes, VAAPI includes control partition). Verify `dct_part_sizes[]` indexing.
3. **DPB timestamp lookup**: confirm `v4l2_timeval_to_ns()` applied to the `surface_object->timestamp` of `picture->last_ref_frame` matches what the kernel hantro driver reads. Any 0-sentinel for missing refs? (FFmpeg leaves zero for missing refs by C99 designated init.)
4. **show_frame handling**: VAAPI doesn't expose. Force 1 vs derive: which matches the kernel's expectation? (BBB has no alt-ref invisible frames; both options should work for the binding cell, but verify.)
5. **lf.flags FILTER_TYPE_SIMPLE bit**: VAAPI's filter_type=1 means simple. Confirm against bitstream baseline.
6. **First-frame DPB sentinel**: when `picture->last_ref_frame == VA_INVALID_SURFACE`, what does FFmpeg ref's `last_frame_ts` end up as? (Likely 0; verify.)

These answers feed Phase 4 plan clauses. None are blocking; all have safe defaults that work for the BBB binding cell.

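
Questions 3 and 6 both concern the reference-timestamp lookup, which could be sketched as below. Everything here is a stand-in: `struct sketch_surface` and the linear search replace the fork's object heap, `struct sketch_timeval` replaces `struct timeval`, and returning 0 for a missing reference is the assumed default that Phase 3 is meant to confirm, not an established contract. The conversion repeats the arithmetic of the kernel's `v4l2_timeval_to_ns()` helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric value as libva's VA_INVALID_SURFACE. */
#define SKETCH_INVALID_SURFACE 0xffffffffu

/* Hypothetical stand-ins for struct timeval and the fork's surface object. */
struct sketch_timeval { int64_t tv_sec; int64_t tv_usec; };
struct sketch_surface { uint32_t id; struct sketch_timeval timestamp; };

/* Seconds and microseconds to nanoseconds. */
static inline uint64_t sketch_timeval_to_ns(const struct sketch_timeval *tv)
{
	return (uint64_t)tv->tv_sec * 1000000000ULL +
	       (uint64_t)tv->tv_usec * 1000ULL;
}

/* Resolve a reference surface ID to a timestamp; 0 for a missing reference. */
static inline uint64_t sketch_ref_ts(const struct sketch_surface *surfaces,
				     size_t count, uint32_t ref_id)
{
	assert(surfaces != NULL || count == 0);
	if (ref_id == SKETCH_INVALID_SURFACE)
		return 0; /* assumed sentinel, see open question 6 */
	for (size_t i = 0; i < count; i++)
		if (surfaces[i].id == ref_id)
			return sketch_timeval_to_ns(&surfaces[i].timestamp);
	return 0; /* unknown ID treated like a missing reference */
}
```

The same helper would be called three times per frame, once each for `last_ref_frame`, `golden_ref_frame`, and `alt_ref_frame`.
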
## Phase 3 baseline targets (work plan)

To answer the open questions above, Phase 3 will run on fresnel (when reachable):

1. **Cross-validator capture**: `strace -ff -tt -y -v -e trace=ioctl ffmpeg -hwaccel v4l2request -i ~/fourier-test/bbb_720p10s_vp8.webm -frames:v 5 -f null - 2>strace.log` with hantro-vpu-dec env vars. Extract S_EXT_CTRLS payload bytes for VP8_FRAME control across frames 1 (key) and 2 (inter).
2. **VAAPI-side trace**: `LIBVA_TRACE=/tmp/vp8_libva.trace mpv --hwdec=vaapi --vo=null --frames=2 ~/fourier-test/bbb_720p10s_vp8.webm` to confirm the VAAPI consumer chain (mpv's parser produces VAPictureParameterBufferVP8 + slice + iqmatrix + probability buffers). The trace only contains libva calls when mpv actually attempts VAAPI decode, so `--hwdec=no` would leave it empty.
3. **Cache-safe verify path baseline**: `mpv --hwdec=no --vo=image --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm` and capture `frame-0001.jpg` + `frame-0002.jpg` SHA256s (SW reference for criterion 4 byte-compare in Phase 7).

## Phase 4 plan structure (anticipated)

Following iter2's 10-clause plan template:

- Clause 1: device-init batched submission contract (VP8 has none; clause is empty / N/A)
- Clause 2: per-frame batched submission shape (count=1, VP8_FRAME control)
- Clause 3: VAAPI → V4L2 mapping table (the table above, normalized to plan-prose form)
- Clause 4: DPB timestamp resolution
- Clause 5: quantization base+delta derivation from VAAPI's denormalized matrix
- Clause 6: probability table mapping (separate buffer source)
- Clause 7: BeginPicture per-frame reset (iqmatrix_set, probability_set)
- Clause 8: surface union extension
- Clause 9: enumeration + dispatch wiring (config.c + picture.c)
- Clause 10: meson + new file integration

The plan will cite verbatim Phase 3 baseline payload bytes for fields where the mapping is non-obvious (quant deltas, first_part_header_bits) per `feedback_dev_process.md` Phase 6 contract-before-code.

## Substrate state at Phase 2 close

- iter3 Phase 1 commit `ea2413e` pushed to gitea (campaign repo).
- Fork on noether at iter2 tip `8d71e20` (synced via `git fetch origin && git merge --ff-only origin/master` from previous commit `229d6d1`).
- Fresnel.vpn unreachable at Phase 2 read time; Phase 3 baseline + Phase 6 builds need the laptop online. Memory rule: don't offer pause prompts; will wait for fresnel to come back online OR the user to wake it before Phase 3.
- All 5 memory entries still apply: gitea-as-claude-noether, no-session-termination-attempts, header-deletion-check, review-empirical-over-theoretical (BOTH directions), rockchip-pixel-verify-path.
- Phase 3 baseline questions queued (6 items above).