iter3 Phase 3: baselines — VP8 cross-validator + 3-codec regression
+ SW reference
Captured on fresnel 2026-05-08 across two suspend cycles (laptop
dropped twice mid-run, captures preserved on /tmp/iter3_phase3).
All Phase 3 deliverables green.
Substrate verification:
backend SHA256: 9e27...6258 (matches iter2 close)
3-codec regression block: ALL 6 reference hashes match byte-for-
byte vs iter1+iter2 (H.264 +30s, MPEG-2 +02s, HEVC +02s on rkvdec/
hantro). Substrate has not regressed; criterion-5 anchor solid.
Cross-validator anchor (ffmpeg-v4l2request VP8 strace):
- VIDIOC_S_EXT_CTRLS, count=1,
ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS, id=0xa409c8, size=1232 bytes
- struct size CORRECTED: v4l2_ctrl_vp8_frame = 1232 bytes (NOT
400 as one might assume; entropy.coeff_probs[4][8][3][11] alone
is 1056 bytes)
- keyframe (frame 1) verbatim payload captured: y_ac_qi=8,
last/golden/alt ts all 0, flags=0x0d (KEY|SHOW|NOSKIP),
y_mode_probs=[145,156,163,128] (matches FFmpeg keyframe const)
- inter frame verbatim payload captured: y_ac_qi=122, all DPB
timestamps non-zero, flags=0x66 (anomaly: bit 0x40 not in
mainline UAPI; vendor-patched ffmpeg-v4l2-request-git;
kernel hantro_vp8.c only inspects KEY_FRAME bit, ignores
bit 0x40)
VP8 SW pixel-verify reference (criterion-4 anchor):
vp8_sw_001.jpg:
e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984
vp8_sw_002.jpg:
a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a
Frame 1 != Frame 2 (real motion). These are the Phase 7 byte-
compare HW-vs-SW targets.
Open-question resolution (5 of 6 answered empirically):
Q1 first_part_header_bits — varies per frame (key=6550, inter
ranges 86..254); VAAPI doesn't expose. Phase 4 fallback:
leave 0 and check kernel behavior at Phase 7 byte-compare.
Phase 5 review will flag as known fidelity gap.
Q2 num_dct_parts vs VAAPI num_of_partitions — confirmed off-by-
one: kernel = VAAPI - 1 (BBB has VAAPI=2, kernel=1).
Q3 DPB timestamp 0-sentinel — confirmed: keyframe writes all
three timestamps as 0; iter3 mirrors iter1 mpeg2.c pattern.
Q4 SHOW_FRAME default — set on every captured frame (BBB has no
alt-ref invisible). Force unconditional in libva backend.
Q5 lf.flags FILTER_TYPE_SIMPLE — not set; BBB normal loop filter.
Direct mapping from VAAPI filter_type=0.
Q6 First-frame DPB sentinel — confirmed Q3; no self-reference
fallback needed (different from iter1 mpeg2.c).
V4L2 binding cells this boot:
rkvdec : /dev/video3 + /dev/media1
hantro-vpu-dec: /dev/video5 + /dev/media2
Capture artefacts on fresnel /tmp/iter3_phase3/ preserved for
Phase 7 re-run:
vp8_strace.* (19 files, multi-thread)
decode_vp8.py (payload decoder)
vp8_sw_00{1,2}.jpg (criterion-4)
{h264,mpeg2,hevc}_hw_00{1,2}.jpg (criterion-5)
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (Phase 2 contract surface)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,214 @@
# Iteration 3 — Phase 3 (baselines)

Captured 2026-05-08 on fresnel. The laptop suspended twice mid-capture; captures were preserved on `/tmp/iter3_phase3` between runs. Phase 3 deliverables per `feedback_dev_process.md`:

1. Substrate re-verification (criterion-5 anchor): ✅
2. Cross-validator anchor (verbatim VP8_FRAME control payload): ✅
3. VAAPI consumer trace: deferred to Phase 6 build-time check (see Step 3.4)
4. Cache-safe pixel-verify SW reference (criterion-4 anchor): ✅
5. Phase 2 open-question answers: ✅ (5 of 6 answered empirically; 1 deferred)
6. Three-codec regression block: ✅

## Pre-flight (verified)

```
hostname       : fresnel
kernel         : 6.19.9-99-eos-arm
mpv            : 1:0.41.0-3 (NOT mpv-git; L3 satisfied)
libva          : 2.23.0-1
ffmpeg         : ffmpeg-v4l2-request-git 2:8.1.r123329.b57fbbe-2
backend SHA256 : 9e27043847998c197a46a1a26b2f77f22880bb7b3a62aa4d60d8fcaec0ae6258 ← matches iter2 close
fixture        : ~/fourier-test/bbb_720p10s_vp8.webm (2419912 bytes)

V4L2 binding cells (this boot):
rkvdec        : /dev/video3 + /dev/media1
hantro-vpu-dec: /dev/video5 + /dev/media2
```

## Step 3.1 — Regression-block reference (criterion-5 anchor)

All 6 reference hashes match the iter1+iter2 close byte-for-byte; the substrate has not regressed.

| Codec | Site | Frame 1 SHA256 | Frame 2 SHA256 | Status |
|---|---|---|---|---|
| H.264 +30s | rkvdec | `f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9` | `7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8` | ✅ MATCH |
| MPEG-2 +02s | hantro | `6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092` | `ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de` | ✅ MATCH |
| HEVC +02s | rkvdec | `47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5` | `a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656` | ✅ MATCH |

JPEGs preserved at `/tmp/iter3_phase3/{h264,mpeg2,hevc}_hw_00{1,2}.jpg` for re-running the Phase 7 byte-compare.

## Step 3.2 — VP8 SW pixel-verify reference (criterion-4 anchor)

`mpv --hwdec=no --vo=image --vo-image-format=jpg --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm`:

| Frame | SHA256 | Size |
|---|---|---|
| 1 | `e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984` | 235990 bytes |
| 2 | `a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a` | 232549 bytes |

Frame 1 ≠ Frame 2 (real motion). These two hashes are the Phase 7 criterion-4 byte-equality target.

JPEGs preserved at `/tmp/iter3_phase3/vp8_sw_00{1,2}.jpg`.
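
A Phase 7 re-run needs to confirm that the preserved JPEGs still hash to the values recorded here. The following is a minimal sketch of such a check, not part of the captured tooling: the helper names `sha256_of` and `check_references` are hypothetical, and the expected hashes are copied from the table above.

```python
import hashlib
from pathlib import Path

# Expected hashes copied from the Step 3.2 table (criterion-4 anchor).
EXPECTED_SW = {
    "vp8_sw_001.jpg": "e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984",
    "vp8_sw_002.jpg": "a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a",
}


def sha256_of(path):
    """Hash a file in 1 MiB chunks and return the lowercase hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_references(directory, expected=EXPECTED_SW):
    """Return {filename: 'match' | 'MISMATCH' | 'missing'} (hypothetical helper)."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of(path) == want:
            results[name] = "match"
        else:
            results[name] = "MISMATCH"
    return results
```

Calling `check_references("/tmp/iter3_phase3")` on fresnel would report one status per file; the same function can be pointed at the regression-block JPEGs by passing a different `expected` mapping.
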
## Step 3.3 — Cross-validator strace + V4L2_CID_STATELESS_VP8_FRAME payload

Command: `strace -ff -tt -y -v -s 4096 -e trace=ioctl,openat,close ffmpeg -hwaccel v4l2request -hwaccel_device /dev/media2 -i bbb_720p10s_vp8.webm -frames:v 5 -f null -`. ffmpeg-v4l2-request-git decoded 5 frames; strace produced 7 worker-thread PID files plus helpers. Payloads were decoded with a custom Python decoder at `/tmp/iter3_phase3/decode_vp8.py` (kept on disk for re-run).

### Submission shape (confirmed)

| Property | Value | Source |
|---|---|---|
| ioctl | `VIDIOC_S_EXT_CTRLS` | strace verbatim |
| `ctrl_class` | `0xf010000` (`V4L2_CTRL_CLASS_CODEC_STATELESS`) | strace verbatim |
| `count` | `1` | strace verbatim; confirms the single-control-per-frame shape predicted in Phase 2 |
| `controls[0].id` | `0xa409c8` | matches `V4L2_CID_STATELESS_VP8_FRAME` from kernel UAPI |
| `controls[0].size` | `1232` | **CORRECTION vs Phase 2**: the original iter1 commit message said "400 bytes" for `v4l2_ctrl_vp8_frame`; the actual size is 1232 bytes. Computed: 16 (seg) + 16 (lf) + 8 (quant) + 1104 (entropy) + 4 (coder_state) + 84 (tail) = 1232. The largest component is `entropy.coeff_probs[4][8][3][11]` at 1056 bytes. |
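
The 1232-byte figure can be cross-checked without the kernel headers by mirroring the struct layout in `ctypes`. This is a sketch under assumptions: the field list (including the padding members and `mv_probs[2][19]`) is reproduced from memory of mainline `<linux/v4l2-controls.h>` and should be verified against the installed header; the Python class names are illustrative only.

```python
import ctypes

u8, s8 = ctypes.c_uint8, ctypes.c_int8
u16, u32, u64 = ctypes.c_uint16, ctypes.c_uint32, ctypes.c_uint64


class Segment(ctypes.Structure):        # struct v4l2_vp8_segment (assumed layout)
    _fields_ = [("quant_update", s8 * 4), ("lf_update", s8 * 4),
                ("segment_probs", u8 * 3), ("padding", u8), ("flags", u32)]


class LoopFilter(ctypes.Structure):     # struct v4l2_vp8_loop_filter
    _fields_ = [("ref_frm_delta", s8 * 4), ("mb_mode_delta", s8 * 4),
                ("sharpness_level", u8), ("level", u8),
                ("padding", u16), ("flags", u32)]


class Quantization(ctypes.Structure):   # struct v4l2_vp8_quantization
    _fields_ = [("y_ac_qi", u8), ("y_dc_delta", s8), ("y2_dc_delta", s8),
                ("y2_ac_delta", s8), ("uv_dc_delta", s8), ("uv_ac_delta", s8),
                ("padding", u16)]


class Entropy(ctypes.Structure):        # struct v4l2_vp8_entropy
    _fields_ = [("coeff_probs", u8 * (4 * 8 * 3 * 11)),
                ("y_mode_probs", u8 * 4), ("uv_mode_probs", u8 * 3),
                ("mv_probs", u8 * (2 * 19)), ("padding", u8 * 3)]


class CoderState(ctypes.Structure):     # struct v4l2_vp8_entropy_coder_state
    _fields_ = [("range", u8), ("value", u8), ("bit_count", u8), ("padding", u8)]


class Vp8Frame(ctypes.Structure):       # struct v4l2_ctrl_vp8_frame
    _fields_ = [("segment", Segment), ("lf", LoopFilter), ("quant", Quantization),
                ("entropy", Entropy), ("coder_state", CoderState),
                ("width", u16), ("height", u16),
                ("horizontal_scale", u8), ("vertical_scale", u8),
                ("version", u8), ("prob_skip_false", u8), ("prob_intra", u8),
                ("prob_last", u8), ("prob_gf", u8), ("num_dct_parts", u8),
                ("first_part_size", u32), ("first_part_header_bits", u32),
                ("dct_part_sizes", u32 * 8),
                ("last_frame_ts", u64), ("golden_frame_ts", u64),
                ("alt_frame_ts", u64), ("flags", u64)]


def size_breakdown():
    """Per-component sizes in bytes; 'tail' is everything from `width` onward."""
    total = ctypes.sizeof(Vp8Frame)
    return {"seg": ctypes.sizeof(Segment), "lf": ctypes.sizeof(LoopFilter),
            "quant": ctypes.sizeof(Quantization), "entropy": ctypes.sizeof(Entropy),
            "coder_state": ctypes.sizeof(CoderState),
            "tail": total - Vp8Frame.width.offset, "total": total}
```

If the assumed layout is right, `size_breakdown()` reproduces the component sizes listed in the table. The same classes could also be used with `Vp8Frame.from_buffer_copy(payload)` to read fields out of a captured 1232-byte payload.
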
### Verbatim payload — frame 1 (keyframe, PID 2860 first call)

```
struct v4l2_vp8_segment:
  quant_update[4]   = (0, 0, 0, 0)
  lf_update[4]      = (0, 0, 0, 0)
  segment_probs[3]  = (0, 0, 0)
  flags             = 0x08 (V4L2_VP8_SEGMENT_FLAG_DELTA_VALUE_MODE)

struct v4l2_vp8_loop_filter:
  ref_frm_delta[4]  = (2, 0, -2, -2)
  mb_mode_delta[4]  = (4, -2, 2, 4)
  sharpness_level   = 0
  level             = 1
  flags             = 0x03 (ADJ_ENABLE | DELTA_UPDATE)

struct v4l2_vp8_quantization:
  y_ac_qi     = 8
  y_dc_delta  = 0
  y2_dc_delta = 0
  y2_ac_delta = 0
  uv_dc_delta = 0
  uv_ac_delta = 0

struct v4l2_vp8_entropy:
  sha1(1104 bytes)  = 8b2fdae200eb193f...
  y_mode_probs[4]   = (145, 156, 163, 128)  ← FFmpeg's hardcoded keyframe_y_mode_probs
  uv_mode_probs[3]  = (142, 114, 183)       ← FFmpeg's hardcoded keyframe_uv_mode_probs

struct v4l2_vp8_entropy_coder_state:
  range     = 248
  value     = 133
  bit_count = 2

width × height         = 1280 × 720
horizontal_scale       = 0
vertical_scale         = 0
version                = 0
prob_skip_false        = 255
prob_intra             = 0  ← KEY frame: intra always-on; field unused; FFmpeg writes parser state which is 0
prob_last              = 0  ← same
prob_gf                = 0  ← same
num_dct_parts          = 1
first_part_size        = 22742
first_part_header_bits = 6550
dct_part_sizes[8]      = (277872, 0, 0, 0, 0, 0, 0, 0)
last_frame_ts          = 0  ← KEY frame: no prior reference
golden_frame_ts        = 0  ← same
alt_frame_ts           = 0  ← same
flags                  = 0x0d (KEY_FRAME | SHOW_FRAME | MB_NO_SKIP_COEFF)
```

### Verbatim payload — frame 2 (inter, PID 2860 second call)

```
segment:     same as frame 1 (BBB has segmentation disabled)
lf:          ref/mb deltas same; sharp=0; level=15; flags=0x01 (DELTA_UPDATE bit clears post-keyframe)
quant:       y_ac_qi=122; all deltas=0
entropy:     sha1=e5742b9050e8dc66 (CHANGED: BBB inter-frame entropy state)
  y_mode_probs  = (3, 1, 128, 1)    ← parser-derived inter probs
  uv_mode_probs = (162, 101, 204)   ← parser-derived
coder_state: range=150 value=69 bit_count=3  ← post-frame-1 boolean coder state
prob_skip_false=14 prob_intra=1 prob_last=251 prob_gf=255 num_dct_parts=1
first_part_size=1218 first_part_header_bits=133
dct_part_sizes=(122,0,0,0,0,0,0,0)
last_frame_ts=5000 golden_frame_ts=11000 alt_frame_ts=11000
flags = 0x66 (see flags-anomaly note below)
```

### Flags-anomaly note (informational; not blocking iter3)

The empirical inter-frame `flags=0x66` (`0x02 | 0x04 | 0x20 | 0x40`) is set by ffmpeg-v4l2-request-git. Bit `0x40` is **not defined** in mainline `<linux/v4l2-controls.h>`, which defines only bits 0x01..0x20. The keyframe produces the expected `flags=0x0d` (`0x01 | 0x04 | 0x08`).

The extra `0x40` bit appears to come from a vendor patch in the installed ffmpeg-v4l2-request-git (the kwiboo branch may carry downstream changes vs the in-tree reference). The kernel `hantro_vp8.c` driver only inspects `V4L2_VP8_FRAME_IS_KEY_FRAME(hdr)`, so bit 0x40 is silently ignored.

**Phase 4 plan implication**: the libva backend should set ONLY the 6 mainline-documented flag bits per the Phase 2 mapping table. We will NOT attempt to byte-match FFmpeg's `0x66` for inter frames during the Phase 7 cross-validator byte-compare; instead, that compare will be field-by-field with an explicit allow-list for `flags` (the KEY_FRAME bit plus the 5 booleans transcoded from VAAPI).

The EXPERIMENTAL bit (0x02) is set by FFmpeg per `if (s->profile & 0x4)`. BBB has profile=0, so `(s->profile & 0x4) == 0` and the bit should be clear. That 0x02 is nonetheless set on inter frames suggests ffmpeg-v4l2-request-git either (a) uses a different conditional or (b) sets it from a different field. Either way, the libva backend skips this bit because VAAPI doesn't expose it.
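
The bit arithmetic in this note can be checked mechanically. The sketch below decodes a `flags` value into the six mainline `V4L2_VP8_FRAME_FLAG_*` names and reports any leftover bits. The `flags_match` helper is a hypothetical shape for the Phase 7 allow-list compare; excluding EXPERIMENTAL from its default mask is an assumption made here for illustration, not a decision recorded in Phase 2.

```python
# Bit values of the V4L2_VP8_FRAME_FLAG_* macros in mainline UAPI.
FRAME_FLAGS = {
    0x01: "KEY_FRAME",
    0x02: "EXPERIMENTAL",
    0x04: "SHOW_FRAME",
    0x08: "MB_NO_SKIP_COEFF",
    0x10: "SIGN_BIAS_GOLDEN",
    0x20: "SIGN_BIAS_ALT",
}
KNOWN_MASK = 0x3F  # union of the six documented bits


def decode_flags(flags):
    """Return (list of set flag names, remaining undocumented bits)."""
    names = [name for bit, name in sorted(FRAME_FLAGS.items()) if flags & bit]
    return names, flags & ~KNOWN_MASK


def flags_match(ours, theirs, allow=KNOWN_MASK & ~0x02):
    """Compare two flag words on allow-listed bits only (hypothetical helper).

    The default mask drops EXPERIMENTAL (0x02) and every undocumented bit;
    that default is an assumption, not the locked Phase 4 plan.
    """
    return (ours & allow) == (theirs & allow)


if __name__ == "__main__":
    for value in (0x0D, 0x66):
        names, unknown = decode_flags(value)
        print(f"{value:#04x}: {' | '.join(names)}  unknown={unknown:#04x}")
```

With this mask, a backend value of `0x24` (SHOW_FRAME | SIGN_BIAS_ALT) compares equal to FFmpeg's `0x66`, because the 0x02 and 0x40 bits fall outside the allow-list.
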
## Step 3.4 — VAAPI consumer trace (DEFERRED)

A `LIBVA_TRACE` capture from `mpv --hwdec=no --vo=null` is uninformative because mpv with hwdec=no doesn't engage the libva decode path (it uses libva for color-conversion only). Capturing the libva-side decode-path trace requires HW-decode mode, which is exactly what iter3 exists to enable.

**Decision**: defer VAAPI buffer-type enumeration verification to Phase 6 build-time. The Phase 2 source-read of `va_dec_vp8.h` already enumerated the 4 buffer types (Picture, Slice, Probability, IQMatrix); the Phase 6 build will verify that the dispatcher accepts them. If a buffer type is missing or extra, Phase 6 compile/runtime will surface it.

## Step 3.5 — Open-question resolution

Six Phase 2 questions; empirical answers:

### Q1: `first_part_header_bits` exact value

**Frame 1 (key)**: 6550 bits. **Inter frames**: 133, 86, 140, 254, 86 (varies per frame). FFmpeg derives this from `s->coder_state_at_header_end.input - data` minus residual bits. VAAPI does NOT expose this directly.

**Phase 4 implication**: VAAPI's `slice->macroblock_offset` (bit offset of the MB layer from the start of slice data) is the closest analog. **However**, `slice->macroblock_offset` is the MB-data offset (after BOTH the uncompressed header AND the entropy header), whereas `first_part_header_bits` is just the entropy header portion. They differ by `first_part_size * 8 - first_part_header_bits` (the entropy-encoded part of the control partition).

**Fallback strategy**: leave `first_part_header_bits = 0` and check whether the kernel hantro driver actually uses it. If it doesn't (likely, since the driver re-parses the bitstream), zero is correct. If it does, the Phase 7 byte-compare will reveal divergence and Phase 4 will need to compute it bitstream-side. **Phase 5 review will flag this as a known fidelity gap**.

### Q2: `num_dct_parts` vs VAAPI `num_of_partitions`

**Empirical**: `num_dct_parts = 1` for every captured frame. BBB has 2 partitions total (1 control + 1 DCT), so VAAPI's `slice->num_of_partitions = 2`. This confirms the predicted off-by-one: `num_dct_parts = slice->num_of_partitions - 1`.

### Q3: DPB timestamp 0-sentinel handling

**Empirical**: Frame 1 (key) has `last_ts=0, golden_ts=0, alt_ts=0`, all three zero. Inter frames have all three non-zero (referencing prior visible frames). This confirms FFmpeg writes 0 for missing refs (matches the `forward_ref_ts=0` pattern from iter1 mpeg2.c::mpeg2_set_controls).

**Phase 4 implication**: in `vp8_set_controls`, look up the VASurfaceIDs `picture->{last,golden,alt}_ref_frame`; if `SURFACE()` returns NULL (i.e. `VA_INVALID_SURFACE` or a stale ID), leave the timestamp at 0. Mirror the iter1 mpeg2.c pattern (lines 146-156).

### Q4: `SHOW_FRAME` flag default

**Empirical**: `flags & SHOW_FRAME (0x04)` is set on every captured frame (key + inter). BBB has no alt-ref invisible frames in the +0..2s range. **Phase 4 decision**: force `flags |= SHOW_FRAME` unconditionally, since VAAPI doesn't expose the bit and BBB is all-visible. Document this as a known fidelity gap for streams with alt-ref invisible frames (iter3 out-of-scope per Phase 1 lock).

### Q5: `lf.flags & FILTER_TYPE_SIMPLE`

**Empirical**: not set on any captured frame. BBB uses the normal (not simple) loop filter. This confirms VAAPI's `pic_fields.bits.filter_type=0` for BBB; direct mapping per the Phase 2 table.

### Q6: First-frame DPB sentinel

**Empirical**: confirmed by Q3 above: `last_ts=golden_ts=alt_ts=0` for the keyframe. No self-reference fallback is needed. This differs from iter1's mpeg2.c, where I had to fix the self-reference to use the 0 sentinel; FFmpeg's VP8 path naturally writes 0 via C99 designated init.

## Phase 3 → Phase 4 transition (proceed condition)

All Phase 3 deliverables green:

- Substrate not regressed (3-codec hashes hold, criterion-5 anchor solid)
- Cross-validator strace captured (~13 S_EXT_CTRLS for 5 frames decoded; verbatim payload for keyframe + inter frames available)
- struct size CORRECTED to 1232 bytes (vs Phase 2 implicit assumption of ~400)
- 5 of 6 open questions answered empirically; Q1 (first_part_header_bits) deferred with safe-default fallback
- VP8 SW reference JPEGs captured (criterion-4 anchor)

Phase 4 plan can lock against:

1. Verbatim keyframe + inter-frame payload bytes (above) for byte-compare anchors.
2. Confirmed quantization deltas all zero for BBB → the libva backend computes `quant.y_dc_delta = quantization_index[0][1] - quantization_index[0][0]` and verifies it is all-zero for BBB; mapping is correct.
3. Confirmed segment fields all zero for BBB (segmentation disabled); `segment.flags |= DELTA_VALUE_MODE` per the FFmpeg pattern, but the kernel ignores it when the ENABLED bit is clear.
4. `lf.flags = ADJ_ENABLE` for BBB on inter frames; `ADJ_ENABLE | DELTA_UPDATE` on the keyframe (DELTA_UPDATE only when the keyframe initializes the loop-filter delta state).
5. `flags` byte-compare to use mainline-documented bits only (the libva backend will produce 0x0d for the keyframe, 0x04|... for inter frames; FFmpeg's bit 0x40 explicitly NOT replicated).
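
The field derivations the plan locks against can be modelled in a few lines before they are written in C. This is an illustrative sketch only: all function and parameter names are hypothetical, the six-entry index order (yac, ydc, y2dc, y2ac, uvdc, uvac) is assumed from `va_dec_vp8.h`, and the loop-filter inputs stand in for the VAAPI `pic_fields` bits `loop_filter_adj_enable`, `mode_ref_lf_delta_update`, and `filter_type` as recalled from that header.

```python
# V4L2_VP8_LF_* bit values from mainline UAPI.
LF_ADJ_ENABLE = 0x01
LF_DELTA_UPDATE = 0x02
LF_FILTER_TYPE_SIMPLE = 0x04

DELTA_NAMES = ("y_dc_delta", "y2_dc_delta", "y2_ac_delta",
               "uv_dc_delta", "uv_ac_delta")


def quant_fields(qi_row):
    """Map one VAAPI quantization_index row to V4L2 quant fields.

    qi_row is assumed to be ordered (yac, ydc, y2dc, y2ac, uvdc, uvac);
    each delta is taken relative to the yac entry.
    """
    y_ac_qi = qi_row[0]
    fields = {"y_ac_qi": y_ac_qi}
    for name, q in zip(DELTA_NAMES, qi_row[1:]):
        fields[name] = q - y_ac_qi
    return fields


def num_dct_parts(num_of_partitions):
    """VAAPI counts the control partition; the kernel field does not (Q2)."""
    return num_of_partitions - 1


def ref_timestamp(surface_ts):
    """Return 0 when the reference surface lookup failed (None), per Q3."""
    return 0 if surface_ts is None else surface_ts


def lf_flags(adj_enable, delta_update, filter_type):
    """Build lf.flags from the three assumed VAAPI loop-filter bits."""
    flags = 0
    if adj_enable:
        flags |= LF_ADJ_ENABLE
    if delta_update:
        flags |= LF_DELTA_UPDATE
    if filter_type:
        flags |= LF_FILTER_TYPE_SIMPLE
    return flags
```

Feeding in the captured BBB values gives a quick sanity check: a row of six equal indices yields all-zero deltas, two VAAPI partitions yield `num_dct_parts = 1`, and the loop-filter bits observed on the keyframe and inter frames yield `0x03` and `0x01` respectively.
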
## Substrate state at Phase 3 close

- iter3 Phase 1 + Phase 2 commits pushed to gitea (`ea2413e`, `898544a`).
- Fork on noether at iter2 tip `8d71e20`; Phase 6 patches will land here.
- fresnel went offline twice during Phase 3 capture (suspend mid-run); captures preserved on `/tmp/iter3_phase3` between runs.
- Memory rules carry forward unchanged (5 entries + new `feedback_fresnel_hostname`).
- Capture artefacts on fresnel `/tmp/iter3_phase3/`:
  - `vp8_strace.*`: 19 strace files (multi-thread)
  - `decode_vp8.py`: payload decoder (kept for Phase 7 re-run)
  - `vp8_sw_00{1,2}.jpg`: SW reference (criterion-4 anchor)
  - `{h264,mpeg2,hevc}_hw_00{1,2}.jpg`: regression block (criterion-5 anchors)