From fd3fce86a6d34d8473dc0cd9a537174f68d89753 Mon Sep 17 00:00:00 2001 From: "Claude (noether)" Date: Fri, 8 May 2026 20:14:46 +0000 Subject: [PATCH] =?UTF-8?q?iter3=20Phase=203:=20baselines=20=E2=80=94=20VP?= =?UTF-8?q?8=20cross-validator=20+=203-codec=20regression=20+=20SW=20refer?= =?UTF-8?q?ence?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Captured on fresnel 2026-05-08 across two suspend cycles (laptop dropped twice mid-run, captures preserved on /tmp/iter3_phase3). All Phase 3 deliverables green. Substrate verification: backend SHA256: 9e27...6258 (matches iter2 close) 3-codec regression block: ALL 6 reference hashes match byte-for-byte vs iter1+iter2 (H.264 +30s, MPEG-2 +02s, HEVC +02s on rkvdec/hantro). Substrate has not regressed; criterion-5 anchor solid. Cross-validator anchor (ffmpeg-v4l2request VP8 strace): - VIDIOC_S_EXT_CTRLS, count=1, ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS, id=0xa409c8, size=1232 bytes - struct size CORRECTED: v4l2_ctrl_vp8_frame = 1232 bytes (NOT 400 as one might assume; entropy.coeff_probs[4][8][3][11] alone is 1056 bytes) - keyframe (frame 1) verbatim payload captured: y_ac_qi=8, last/golden/alt ts all 0, flags=0x0d (KEY|SHOW|NOSKIP), y_mode_probs=[145,156,163,128] (matches FFmpeg keyframe const) - inter frame verbatim payload captured: y_ac_qi=122, all DPB timestamps non-zero, flags=0x66 (anomaly: bit 0x40 not in mainline UAPI; vendor-patched ffmpeg-v4l2-request-git; kernel hantro_vp8.c only inspects KEY_FRAME bit, ignores bit 0x40) VP8 SW pixel-verify reference (criterion-4 anchor): vp8_sw_001.jpg: e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984 vp8_sw_002.jpg: a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a Frame 1 != Frame 2 (real motion). These are the Phase 7 byte-compare HW-vs-SW targets. 
Open-question resolution (5 of 6 answered empirically): Q1 first_part_header_bits — varies per frame (key=6550, inter ranges 86..254); VAAPI doesn't expose. Phase 4 fallback: leave 0 and check kernel behavior at Phase 7 byte-compare. Phase 5 review will flag as known fidelity gap. Q2 num_dct_parts vs VAAPI num_of_partitions — confirmed off-by-one: kernel = VAAPI - 1 (BBB has VAAPI=2, kernel=1). Q3 DPB timestamp 0-sentinel — confirmed: keyframe writes all three timestamps as 0; iter3 mirrors iter1 mpeg2.c pattern. Q4 SHOW_FRAME default — set on every captured frame (BBB has no alt-ref invisible). Force unconditional in libva backend. Q5 lf.flags FILTER_TYPE_SIMPLE — not set; BBB normal loop filter. Direct mapping from VAAPI filter_type=0. Q6 First-frame DPB sentinel — confirmed Q3; no self-reference fallback needed (different from iter1 mpeg2.c). V4L2 binding cells this boot: rkvdec : /dev/video3 + /dev/media1 hantro-vpu-dec: /dev/video5 + /dev/media2 Capture artefacts on fresnel /tmp/iter3_phase3/ preserved for Phase 7 re-run: vp8_strace.* (19 files, multi-thread) decode_vp8.py (payload decoder) vp8_sw_00{1,2}.jpg (criterion-4) {h264,mpeg2,hevc}_hw_00{1,2}.jpg (criterion-5) Refs: phase0_findings_iter3.md (Phase 1 lock) phase2_iter3_situation.md (Phase 2 contract surface) Co-Authored-By: Claude Opus 4.7 (1M context) --- phase3_iter3_baseline.md | 214 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 214 insertions(+) create mode 100644 phase3_iter3_baseline.md diff --git a/phase3_iter3_baseline.md b/phase3_iter3_baseline.md new file mode 100644 index 0000000..930f6b4 --- /dev/null +++ b/phase3_iter3_baseline.md @@ -0,0 +1,214 @@ +# Iteration 3 — Phase 3 (baselines) + +Captured 2026-05-08 on fresnel after the laptop returned from suspend (twice — laptop dropped mid-capture, captures preserved on `/tmp/iter3_phase3` between runs). Phase 3 deliverables per `feedback_dev_process.md`: + +1. Substrate re-verification (criterion-5 anchor) — ✅ +2. 
Cross-validator anchor (verbatim VP8_FRAME control payload) — ✅ +3. VAAPI consumer trace — deferred to Phase 6 build-time check (see step 3.4) +4. Cache-safe pixel-verify SW reference (criterion-4 anchor) — ✅ +5. Phase 2 open-question answers — ✅ (5 of 6 answered empirically; 1 deferred) +6. Three-codec regression block — ✅ + +## Pre-flight (verified) + +``` +hostname : fresnel +kernel : 6.19.9-99-eos-arm +mpv : 1:0.41.0-3 (NOT mpv-git — L3 satisfied) +libva : 2.23.0-1 +ffmpeg : ffmpeg-v4l2-request-git 2:8.1.r123329.b57fbbe-2 +backend SHA256 : 9e27043847998c197a46a1a26b2f77f22880bb7b3a62aa4d60d8fcaec0ae6258 ← matches iter2 close +fixture : ~/fourier-test/bbb_720p10s_vp8.webm (2419912 bytes) + +V4L2 binding cells (this boot): + rkvdec : /dev/video3 + /dev/media1 + hantro-vpu-dec: /dev/video5 + /dev/media2 +``` + +## Step 3.1 — Regression-block reference (criterion-5 anchor) + +All 6 reference hashes match byte-for-byte vs iter1+iter2 close — substrate has not regressed. + +| Codec | Site | Frame 1 SHA256 | Frame 2 SHA256 | Status | +|---|---|---|---|---| +| H.264 +30s | rkvdec | `f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9` | `7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8` | ✅ MATCH | +| MPEG-2 +02s | hantro | `6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092` | `ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de` | ✅ MATCH | +| HEVC +02s | rkvdec | `47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5` | `a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656` | ✅ MATCH | + +JPEGs preserved at `/tmp/iter3_phase3/{h264,mpeg2,hevc}_hw_00{1,2}.jpg` for re-running Phase 7 byte-compare. 
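The byte-for-byte comparison in Step 3.1 can be re-run with a small script. This is a minimal sketch, not the tooling used for the capture: the helper names `sha256_file` and `check_regression` are hypothetical, and it assumes the preserved JPEGs follow the `{codec}_hw_00{1,2}.jpg` naming given above. The reference hashes are copied from the table.

```python
import hashlib
from pathlib import Path

# Reference hashes (frame 1, frame 2) copied from the Step 3.1 table.
REFERENCE = {
    "h264": (
        "f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9",
        "7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8",
    ),
    "mpeg2": (
        "6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092",
        "ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de",
    ),
    "hevc": (
        "47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5",
        "a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656",
    ),
}


def sha256_file(path):
    """Return the hex SHA256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_regression(capture_dir, reference=REFERENCE):
    """Compare captured JPEGs against the reference hashes.

    Returns a list of (file name, expected, actual) tuples for every
    mismatch; a missing file is reported with actual=None. An empty
    list means all hashes match.
    """
    mismatches = []
    for codec, hashes in reference.items():
        for index, expected in enumerate(hashes, start=1):
            path = Path(capture_dir) / f"{codec}_hw_{index:03d}.jpg"
            actual = sha256_file(path) if path.exists() else None
            if actual != expected:
                mismatches.append((path.name, expected, actual))
    return mismatches
```

Calling `check_regression("/tmp/iter3_phase3")` on fresnel should return an empty list as long as the substrate has not regressed; any entry in the result names the file whose hash diverged.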
+ +## Step 3.2 — VP8 SW pixel-verify reference (criterion-4 anchor) + +`mpv --hwdec=no --vo=image --vo-image-format=jpg --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm`: + +| Frame | SHA256 | Size | +|---|---|---| +| 1 | `e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984` | 235990 bytes | +| 2 | `a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a` | 232549 bytes | + +Frame 1 ≠ Frame 2 (real motion). These two hashes are the Phase 7 criterion-4 byte-equality target. + +JPEGs preserved at `/tmp/iter3_phase3/vp8_sw_00{1,2}.jpg`. + +## Step 3.3 — Cross-validator strace + V4L2_CID_STATELESS_VP8_FRAME payload + +`strace -ff -tt -y -v -s 4096 -e trace=ioctl,openat,close ffmpeg -hwaccel v4l2request -hwaccel_device /dev/media2 -i bbb_720p10s_vp8.webm -frames:v 5 -f null -`. ffmpeg-v4l2-request-git decoded 5 frames; strace produced 7 worker-thread PID files + helpers. Payloads were decoded with a custom Python decoder at `/tmp/iter3_phase3/decode_vp8.py` (kept on disk for re-run). + +### Submission shape (confirmed) + +| Property | Value | Source | +|---|---|---| +| ioctl | `VIDIOC_S_EXT_CTRLS` | strace verbatim | +| `ctrl_class` | `0xf010000` (`V4L2_CTRL_WHICH_REQUEST_VAL`; strace prints the `which` union member under its older name `ctrl_class`) | strace verbatim; the control ID itself belongs to `V4L2_CTRL_CLASS_CODEC_STATELESS` (`0xa40000`) | +| `count` | `1` | strace verbatim — confirms single-control-per-frame predicted in Phase 2 | +| `controls[0].id` | `0xa409c8` | matches `V4L2_CID_STATELESS_VP8_FRAME` from kernel UAPI | +| `controls[0].size` | `1232` | **CORRECTION vs Phase 2** — original commit message of iter1 said "400 bytes" for v4l2_ctrl_vp8_frame; actual is 1232 bytes. Computed: 16(seg)+16(lf)+8(quant)+1104(entropy)+4(coder_state)+84(tail) = 1232. The big component is `entropy.coeff_probs[4][8][3][11] = 1056 bytes`. 
| + +### Verbatim payload — frame 1 (keyframe, PID 2860 first call) + +``` +struct v4l2_vp8_segment: + quant_update[4] = (0, 0, 0, 0) + lf_update[4] = (0, 0, 0, 0) + segment_probs[3] = (0, 0, 0) + flags = 0x08 (V4L2_VP8_SEGMENT_FLAG_DELTA_VALUE_MODE) + +struct v4l2_vp8_loop_filter: + ref_frm_delta[4] = (2, 0, -2, -2) + mb_mode_delta[4] = (4, -2, 2, 4) + sharpness_level = 0 + level = 1 + flags = 0x03 (ADJ_ENABLE | DELTA_UPDATE) + +struct v4l2_vp8_quantization: + y_ac_qi = 8 + y_dc_delta = 0 + y2_dc_delta = 0 + y2_ac_delta = 0 + uv_dc_delta = 0 + uv_ac_delta = 0 + +struct v4l2_vp8_entropy: + sha1(1104 bytes) = 8b2fdae200eb193f... + y_mode_probs[4] = (145, 156, 163, 128) ← FFmpeg's hardcoded keyframe_y_mode_probs + uv_mode_probs[3] = (142, 114, 183) ← FFmpeg's hardcoded keyframe_uv_mode_probs + +struct v4l2_vp8_entropy_coder_state: + range = 248 + value = 133 + bit_count = 2 + +width × height = 1280 × 720 +horizontal_scale = 0 +vertical_scale = 0 +version = 0 +prob_skip_false = 255 +prob_intra = 0 ← KEY frame: intra always-on; field unused; FFmpeg writes parser state which is 0 +prob_last = 0 ← same +prob_gf = 0 ← same +num_dct_parts = 1 +first_part_size = 22742 +first_part_header_bits = 6550 +dct_part_sizes[8] = (277872, 0, 0, 0, 0, 0, 0, 0) +last_frame_ts = 0 ← KEY frame: no prior reference +golden_frame_ts = 0 ← same +alt_frame_ts = 0 ← same +flags = 0x0d (KEY_FRAME | SHOW_FRAME | MB_NO_SKIP_COEFF) +``` + +### Verbatim payload — frame 2 (inter, PID 2860 second call) + +``` +segment: same as frame 1 (BBB has segmentation disabled) +lf: ref/mb deltas same; sharp=0; level=15; flags=0x01 (DELTA_UPDATE bit clears post-keyframe) +quant: y_ac_qi=122; all deltas=0 +entropy: sha1=e5742b9050e8dc66 (CHANGED — BBB inter-frame entropy state) + y_mode_probs = (3, 1, 128, 1) ← parser-derived inter probs + uv_mode_probs = (162, 101, 204) ← parser-derived +coder_state: range=150 value=69 bit_count=3 ← post-frame-1 boolean coder state +prob_skip_false=14 prob_intra=1 prob_last=251 
prob_gf=255 num_dct_parts=1 +first_part_size=1218 first_part_header_bits=133 +dct_part_sizes=(122,0,0,0,0,0,0,0) +last_frame_ts=5000 golden_frame_ts=11000 alt_frame_ts=11000 +flags = 0x66 (see flags-anomaly note below) +``` + +### Flags-anomaly note (informational; not blocking iter3) + +The empirical inter-frame `flags=0x66 = bit 0x02 | 0x04 | 0x20 | 0x40` is set by ffmpeg-v4l2-request-git; bit `0x40` is **not defined** in the mainline kernel UAPI header (only bits 0x01..0x20 are). The keyframe correctly produces `flags=0x0d = 0x01|0x04|0x08`. + +The `0x40` extra bit is a vendor-patched additional flag in the installed ffmpeg-v4l2-request-git (kwiboo branch may have downstream changes vs the in-tree reference). The kernel `hantro_vp8.c` driver only inspects `V4L2_VP8_FRAME_IS_KEY_FRAME(hdr)` — bit 0x40 is silently ignored. + +**Phase 4 plan implication**: the libva backend should set ONLY the 6 mainline-documented flag bits per Phase 2 mapping table. We will NOT attempt to byte-match FFmpeg's `0x66` for inter frames during Phase 7 cross-validator byte-compare; instead, the Phase 7 byte-compare will be field-by-field with an explicit allow-list for `flags` (KEY_FRAME bit + the 5 booleans transcoded from VAAPI). + +EXPERIMENTAL bit (0x02) is set by FFmpeg per `if (s->profile & 0x4)`. BBB profile=0, so `(s->profile & 0x4) == 0`. The empirical 0x02 set on inter frames suggests ffmpeg-v4l2-request-git either (a) has a different conditional or (b) sets it from a different field. Either way, libva backend skips this — VAAPI doesn't expose it. + +## Step 3.4 — VAAPI consumer trace (DEFERRED) + +`LIBVA_TRACE` capture from `mpv --hwdec=no --vo=null` is uninformative because mpv with hwdec=no doesn't engage the libva decode path (uses libva for color-conversion only). Capturing the libva-side decode-path trace requires HW-decode mode, which iter3's whole point is to enable. + +**Decision**: defer VAAPI buffer-type enumeration verification to Phase 6 build-time. 
Phase 2 source-read of `va_dec_vp8.h` already enumerated the 4 buffer types (Picture, Slice, Probability, IQMatrix); Phase 6 build will verify the dispatcher accepts them. If a buffer type is missing or extra, Phase 6 compile/runtime will surface it. + +## Step 3.5 — Open-question resolution + +Six Phase 2 questions; empirical answers: + +### Q1: `first_part_header_bits` exact value + +**Frame 1 (key)**: 6550 bits. Inter frames, starting with frame 2: 133, 86, 140, 254, 86 (varies per frame). FFmpeg derives this from `s->coder_state_at_header_end.input - data` minus residual bits. VAAPI does NOT expose this directly. + +**Phase 4 implication**: VAAPI's `slice->macroblock_offset` (bit offset of MB layer from start of slice data) is the closest analog. **However**, `slice->macroblock_offset` is the MB-data offset (after BOTH the uncompressed header AND the entropy header), whereas `first_part_header_bits` is just the entropy header portion. They differ by `first_part_size * 8 - first_part_header_bits` (the entropy-encoded part of the control partition). + +**Fallback strategy**: leave `first_part_header_bits = 0` and check whether kernel hantro driver actually uses it. If it doesn't (likely — the driver re-parses the bitstream), zero is correct. If it does, Phase 7 byte-compare will reveal divergence and Phase 4 will need to compute it bitstream-side. **Phase 5 review will flag this as a known fidelity gap**. + +### Q2: `num_dct_parts` vs VAAPI `num_of_partitions` + +**Empirical**: `num_dct_parts = 1` for every captured frame. BBB has 2 partitions total (1 control + 1 DCT). VAAPI's `slice->num_of_partitions = 2`. Confirms predicted off-by-one: `num_dct_parts = slice->num_of_partitions - 1`. + +### Q3: DPB timestamp 0-sentinel handling + +**Empirical**: Frame 1 (key) has `last_ts=0, golden_ts=0, alt_ts=0` — all three zero. Inter frames have all three non-zero (referencing prior visible frames). 
Confirms FFmpeg writes 0 for missing refs (matches `forward_ref_ts=0` pattern from iter1 mpeg2.c::mpeg2_set_controls). + +**Phase 4 implication**: in vp8_set_controls, lookup VASurfaceID `picture->{last,golden,alt}_ref_frame`; if `SURFACE() == NULL` (i.e. `VA_INVALID_SURFACE` or stale ID), leave timestamp = 0. Mirror iter1 mpeg2.c pattern (lines 146-156). + +### Q4: `SHOW_FRAME` flag default + +**Empirical**: `flags & SHOW_FRAME (0x04)` set on every captured frame (key + inter). BBB has no alt-ref invisible frames in the +0..2s range. **Phase 4 decision**: force `flags |= SHOW_FRAME` unconditionally — VAAPI doesn't expose the bit, and BBB is all-visible. Document as known fidelity gap for streams with alt-ref invisible frames (iter3 out-of-scope per Phase 1 lock). + +### Q5: `lf.flags & FILTER_TYPE_SIMPLE` + +**Empirical**: not set on any captured frame. BBB uses normal (not simple) loop filter. Confirms VAAPI's `pic_fields.bits.filter_type=0` for BBB — direct mapping per Phase 2 table. + +### Q6: First-frame DPB sentinel + +**Empirical**: confirmed Q3 above — `last_ts=golden_ts=alt_ts=0` for the keyframe. No self-reference fallback (different from iter1's mpeg2.c where I had to fix self-reference to use 0 sentinel; FFmpeg's VP8 path naturally writes 0 via C99 designated init). + +## Phase 3 → Phase 4 transition (proceed condition) + +All Phase 3 deliverables green: + +- Substrate not regressed (3-codec hashes hold, criterion-5 anchor solid) +- Cross-validator strace captured (~13 S_EXT_CTRLS for 5 frames decoded; verbatim payload for keyframe + inter frames available) +- struct size CORRECTED to 1232 bytes (vs Phase 2 implicit assumption of ~400) +- 5 of 6 open questions answered empirically; Q1 (first_part_header_bits) deferred with safe-default fallback +- VP8 SW reference JPEGs captured (criterion-4 anchor) + +Phase 4 plan can lock against: + +1. Verbatim keyframe + inter-frame payload bytes (above) for byte-compare anchors. +2. 
Confirmed quantization deltas all zero for BBB → libva backend computes `quant.y_dc_delta = quantization_index[0][1] - quantization_index[0][0]` and verifies that the result is all-zero for BBB; mapping is correct. +3. Confirmed segment fields all zero for BBB (segmentation disabled); `segment.flags |= DELTA_VALUE_MODE` per FFmpeg pattern, but kernel ignores when ENABLED bit clear. +4. `lf.flags = ADJ_ENABLE` for BBB on inter frames; `ADJ_ENABLE | DELTA_UPDATE` on keyframe (DELTA_UPDATE only when keyframe initializes the loop-filter delta state). +5. `flags` byte-compare to use mainline-documented bits only (libva backend will produce 0x0d for keyframe, 0x04|... for inter frames; FFmpeg's bit 0x40 explicitly NOT replicated). + +## Substrate state at Phase 3 close + +- iter3 Phase 1 + Phase 2 commits pushed to gitea (`ea2413e`, `898544a`). +- Fork on noether at iter2 tip `8d71e20`; Phase 6 patches will land here. +- fresnel went offline twice during Phase 3 capture (suspend mid-run); captures were preserved on `/tmp/iter3_phase3` between runs. +- Memory rules carry forward unchanged (5 entries + new `feedback_fresnel_hostname`). +- Capture artefacts on fresnel `/tmp/iter3_phase3/`: + - `vp8_strace.*` — 19 strace files (multi-thread) + - `decode_vp8.py` — payload decoder (kept for Phase 7 re-run) + - `vp8_sw_00{1,2}.jpg` — SW reference (criterion-4 anchor) + - `{h264,mpeg2,hevc}_hw_00{1,2}.jpg` — regression block (criterion-5 anchors)
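
The 1232-byte control size from Step 3.3 and the `flags` values from the two verbatim payloads can be cross-checked without the kernel headers. The sketch below is not the `decode_vp8.py` used for the capture. The `ctypes` field layout is transcribed from memory of the mainline UAPI structs and should be treated as an assumption to verify against the installed header; the class names, `MAINLINE_FLAGS`, and `decode_flags` are hypothetical. The names for bits 0x10 and 0x20 are likewise assumed from mainline and do not come from this capture.

```python
import ctypes

U8, S8 = ctypes.c_uint8, ctypes.c_int8
U16, U32, U64 = ctypes.c_uint16, ctypes.c_uint32, ctypes.c_uint64


class Vp8Segment(ctypes.Structure):          # expected 16 bytes
    _fields_ = [("quant_update", S8 * 4), ("lf_update", S8 * 4),
                ("segment_probs", U8 * 3), ("padding", U8), ("flags", U32)]


class Vp8LoopFilter(ctypes.Structure):       # expected 16 bytes
    _fields_ = [("ref_frm_delta", S8 * 4), ("mb_mode_delta", S8 * 4),
                ("sharpness_level", U8), ("level", U8),
                ("padding", U16), ("flags", U32)]


class Vp8Quantization(ctypes.Structure):     # expected 8 bytes
    _fields_ = [("y_ac_qi", U8), ("y_dc_delta", S8), ("y2_dc_delta", S8),
                ("y2_ac_delta", S8), ("uv_dc_delta", S8), ("uv_ac_delta", S8),
                ("padding", U16)]


class Vp8Entropy(ctypes.Structure):          # expected 1104 bytes
    _fields_ = [("coeff_probs", (((U8 * 11) * 3) * 8) * 4),  # 1056 bytes
                ("y_mode_probs", U8 * 4), ("uv_mode_probs", U8 * 3),
                ("mv_probs", (U8 * 19) * 2), ("padding", U8 * 3)]


class Vp8CoderState(ctypes.Structure):       # expected 4 bytes
    _fields_ = [("range", U8), ("value", U8), ("bit_count", U8),
                ("padding", U8)]


class Vp8Frame(ctypes.Structure):            # expected 1232 bytes in total
    _fields_ = [("segment", Vp8Segment), ("lf", Vp8LoopFilter),
                ("quant", Vp8Quantization), ("entropy", Vp8Entropy),
                ("coder_state", Vp8CoderState),
                ("width", U16), ("height", U16),
                ("horizontal_scale", U8), ("vertical_scale", U8),
                ("version", U8), ("prob_skip_false", U8),
                ("prob_intra", U8), ("prob_last", U8), ("prob_gf", U8),
                ("num_dct_parts", U8),
                ("first_part_size", U32), ("first_part_header_bits", U32),
                ("dct_part_sizes", U32 * 8),
                ("last_frame_ts", U64), ("golden_frame_ts", U64),
                ("alt_frame_ts", U64), ("flags", U64)]


# Frame flag bits 0x01..0x20; names for 0x10 and 0x20 are assumptions.
MAINLINE_FLAGS = {0x01: "KEY_FRAME", 0x02: "EXPERIMENTAL",
                  0x04: "SHOW_FRAME", 0x08: "MB_NO_SKIP_COEFF",
                  0x10: "SIGN_BIAS_GOLDEN", 0x20: "SIGN_BIAS_ALT"}


def decode_flags(flags):
    """Split a flags word into known bit names and leftover unknown bits."""
    names = [name for bit, name in MAINLINE_FLAGS.items() if flags & bit]
    unknown = flags & ~sum(MAINLINE_FLAGS)   # sum of dict keys == 0x3f
    return names, unknown
```

Under these assumptions `ctypes.sizeof(Vp8Frame)` is 1232, and `decode_flags(0x66)` reports `0x40` as the only bit outside the mainline set. The second return value is what a Phase 7 field-by-field compare would mask off before comparing `flags`.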