fresnel-fourier/phase3_iter3_baseline.md
claude-noether fd3fce86a6 iter3 Phase 3: baselines — VP8 cross-validator + 3-codec regression
+ SW reference

Captured on fresnel 2026-05-08 across two suspend cycles (laptop
dropped twice mid-run, captures preserved on /tmp/iter3_phase3).
All Phase 3 deliverables green.

Substrate verification:
  backend SHA256: 9e27...6258 (matches iter2 close)
  3-codec regression block: ALL 6 reference hashes match byte-for-
  byte vs iter1+iter2 (H.264 +30s, MPEG-2 +02s, HEVC +02s on rkvdec/
  hantro). Substrate has not regressed; criterion-5 anchor solid.

Cross-validator anchor (ffmpeg-v4l2request VP8 strace):
  - VIDIOC_S_EXT_CTRLS, count=1,
    ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS, id=0xa409c8,
    size=1232 bytes
  - struct size CORRECTED: v4l2_ctrl_vp8_frame = 1232 bytes (NOT
    400 as one might assume; entropy.coeff_probs[4][8][3][11] alone
    is 1056 bytes)
  - keyframe (frame 1) verbatim payload captured: y_ac_qi=8,
    last/golden/alt ts all 0, flags=0x0d (KEY|SHOW|NOSKIP),
    y_mode_probs=[145,156,163,128] (matches FFmpeg keyframe const)
  - inter frame verbatim payload captured: y_ac_qi=122, all DPB
    timestamps non-zero, flags=0x66 (anomaly: bit 0x40 not in
    mainline UAPI; vendor-patched ffmpeg-v4l2-request-git;
    kernel hantro_vp8.c only inspects KEY_FRAME bit, ignores
    bit 0x40)

VP8 SW pixel-verify reference (criterion-4 anchor):
  vp8_sw_001.jpg: e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad
                  584d789db2c984
  vp8_sw_002.jpg: a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3
                  e6ea8c0c78e97a
  Frame 1 != Frame 2 (real motion). These are the Phase 7 byte-
  compare HW-vs-SW targets.

Open-question resolution (5 of 6 answered empirically):

  Q1 first_part_header_bits — varies per frame (key=6550, inter
     ranges 86..254); VAAPI doesn't expose. Phase 4 fallback:
     leave 0 and check kernel behavior at Phase 7 byte-compare.
     Phase 5 review will flag as known fidelity gap.

  Q2 num_dct_parts vs VAAPI num_of_partitions — confirmed off-by-
     one: kernel = VAAPI - 1 (BBB has VAAPI=2, kernel=1).

  Q3 DPB timestamp 0-sentinel — confirmed: keyframe writes all
     three timestamps as 0; iter3 mirrors iter1 mpeg2.c pattern.

  Q4 SHOW_FRAME default — set on every captured frame (BBB has no
     alt-ref invisible). Force unconditional in libva backend.

  Q5 lf.flags FILTER_TYPE_SIMPLE — not set; BBB normal loop filter.
     Direct mapping from VAAPI filter_type=0.

  Q6 First-frame DPB sentinel — confirmed Q3; no self-reference
     fallback needed (different from iter1 mpeg2.c).

V4L2 binding cells this boot:
  rkvdec        : /dev/video3 + /dev/media1
  hantro-vpu-dec: /dev/video5 + /dev/media2

Capture artefacts on fresnel /tmp/iter3_phase3/ preserved for
Phase 7 re-run:
  vp8_strace.* (19 files, multi-thread)
  decode_vp8.py (payload decoder)
  vp8_sw_00{1,2}.jpg (criterion-4)
  {h264,mpeg2,hevc}_hw_00{1,2}.jpg (criterion-5)

Refs:
  phase0_findings_iter3.md (Phase 1 lock)
  phase2_iter3_situation.md (Phase 2 contract surface)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:14:46 +00:00

# Iteration 3 — Phase 3 (baselines)
Captured 2026-05-08 on fresnel across two suspend cycles (the laptop dropped twice mid-capture; captures were preserved on `/tmp/iter3_phase3` between runs). Phase 3 deliverables per `feedback_dev_process.md`:
1. Substrate re-verification (criterion-5 anchor) — ✅
2. Cross-validator anchor (verbatim VP8_FRAME control payload) — ✅
3. VAAPI consumer trace — deferred to Phase 6 build-time check (see step 3.4)
4. Cache-safe pixel-verify SW reference (criterion-4 anchor) — ✅
5. Phase 2 open-question answers — ✅ (5 of 6 answered empirically; 1 deferred)
6. Three-codec regression block — ✅
## Pre-flight (verified)
```
hostname : fresnel
kernel : 6.19.9-99-eos-arm
mpv : 1:0.41.0-3 (NOT mpv-git — L3 satisfied)
libva : 2.23.0-1
ffmpeg : ffmpeg-v4l2-request-git 2:8.1.r123329.b57fbbe-2
backend SHA256 : 9e27043847998c197a46a1a26b2f77f22880bb7b3a62aa4d60d8fcaec0ae6258 ← matches iter2 close
fixture : ~/fourier-test/bbb_720p10s_vp8.webm (2419912 bytes)
V4L2 binding cells (this boot):
  rkvdec        : /dev/video3 + /dev/media1
  hantro-vpu-dec: /dev/video5 + /dev/media2
```
## Step 3.1 — Regression-block reference (criterion-5 anchor)
All 6 reference hashes match byte-for-byte vs iter1+iter2 close — substrate has not regressed.
| Codec | Site | Frame 1 SHA256 | Frame 2 SHA256 | Status |
|---|---|---|---|---|
| H.264 +30s | rkvdec | `f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9` | `7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8` | ✅ MATCH |
| MPEG-2 +02s | hantro | `6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092` | `ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de` | ✅ MATCH |
| HEVC +02s | rkvdec | `47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5` | `a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656` | ✅ MATCH |
JPEGs preserved at `/tmp/iter3_phase3/{h264,mpeg2,hevc}_hw_00{1,2}.jpg` for re-running Phase 7 byte-compare.
## Step 3.2 — VP8 SW pixel-verify reference (criterion-4 anchor)
`mpv --hwdec=no --vo=image --vo-image-format=jpg --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm`:
| Frame | SHA256 | Size |
|---|---|---|
| 1 | `e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984` | 235990 bytes |
| 2 | `a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a` | 232549 bytes |
Frame 1 ≠ Frame 2 (real motion). These two hashes are the Phase 7 criterion-4 byte-equality target.
JPEGs preserved at `/tmp/iter3_phase3/vp8_sw_00{1,2}.jpg`.
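
The Phase 7 byte-equality check against these two hashes can be sketched as below. This is a minimal illustration, not the project's actual verification script: the helper names `sha256_file` and `compare_to_reference` are hypothetical, and the only values taken from this document are the two reference hashes and the file names.

```python
import hashlib
from pathlib import Path

# Reference hashes copied from the table above (criterion-4 anchor).
VP8_SW_REFERENCE = {
    "vp8_sw_001.jpg": "e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984",
    "vp8_sw_002.jpg": "a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a",
}


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_reference(directory, reference=VP8_SW_REFERENCE):
    """Map each reference file name to True only if it exists and its hash matches."""
    results = {}
    for name, expected in reference.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_file(path) == expected
    return results
```

A missing file is reported as a mismatch rather than raising, so a partially populated capture directory still yields a complete report.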
## Step 3.3 — Cross-validator strace + V4L2_CID_STATELESS_VP8_FRAME payload
Command: `strace -ff -tt -y -v -s 4096 -e trace=ioctl,openat,close ffmpeg -hwaccel v4l2request -hwaccel_device /dev/media2 -i bbb_720p10s_vp8.webm -frames:v 5 -f null -`. ffmpeg-v4l2-request-git decoded 5 frames; strace produced 7 worker-thread PID files plus helper files. Payloads were decoded with a custom Python decoder at `/tmp/iter3_phase3/decode_vp8.py` (kept on disk for re-runs).
### Submission shape (confirmed)
| Property | Value | Source |
|---|---|---|
| ioctl | `VIDIOC_S_EXT_CTRLS` | strace verbatim |
| `ctrl_class` | `0xf010000` (`V4L2_CTRL_CLASS_CODEC_STATELESS`) | strace verbatim |
| `count` | `1` | strace verbatim — confirms single-control-per-frame predicted in Phase 2 |
| `controls[0].id` | `0xa409c8` | matches `V4L2_CID_STATELESS_VP8_FRAME` from kernel UAPI |
| `controls[0].size` | `1232` | **CORRECTION vs Phase 2** — original commit message of iter1 said "400 bytes" for v4l2_ctrl_vp8_frame; actual is 1232 bytes. Computed: 16(seg)+16(lf)+8(quant)+1104(entropy)+4(coder_state)+84(tail) = 1232. The big component is `entropy.coeff_probs[4][8][3][11] = 1056 bytes`. |
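
The 1232-byte figure can be cross-checked by mirroring the struct layout in `ctypes`. The sketch below is an assumption-laden transcription: the field order and types are written from a reading of the mainline UAPI header, not copied from the kernel tree used on fresnel, and the Python class names are hypothetical. Only the sub-struct sizes (16, 16, 8, 1104, 4, 84) and the total come from this document.

```python
import ctypes as C


class Vp8Segment(C.Structure):  # expected 16 bytes
    _fields_ = [
        ("quant_update", C.c_int8 * 4),
        ("lf_update", C.c_int8 * 4),
        ("segment_probs", C.c_uint8 * 3),
        ("padding", C.c_uint8),
        ("flags", C.c_uint32),
    ]


class Vp8LoopFilter(C.Structure):  # expected 16 bytes
    _fields_ = [
        ("ref_frm_delta", C.c_int8 * 4),
        ("mb_mode_delta", C.c_int8 * 4),
        ("sharpness_level", C.c_uint8),
        ("level", C.c_uint8),
        ("padding", C.c_uint16),
        ("flags", C.c_uint32),
    ]


class Vp8Quantization(C.Structure):  # expected 8 bytes
    _fields_ = [
        ("y_ac_qi", C.c_uint8),
        ("y_dc_delta", C.c_int8),
        ("y2_dc_delta", C.c_int8),
        ("y2_ac_delta", C.c_int8),
        ("uv_dc_delta", C.c_int8),
        ("uv_ac_delta", C.c_int8),
        ("padding", C.c_uint16),
    ]


class Vp8Entropy(C.Structure):  # expected 1104 bytes
    _fields_ = [
        ("coeff_probs", C.c_uint8 * 11 * 3 * 8 * 4),  # [4][8][3][11] = 1056 bytes
        ("y_mode_probs", C.c_uint8 * 4),
        ("uv_mode_probs", C.c_uint8 * 3),
        ("mv_probs", C.c_uint8 * 19 * 2),  # [2][19]
        ("padding", C.c_uint8 * 3),
    ]


class Vp8CoderState(C.Structure):  # expected 4 bytes
    _fields_ = [
        ("range", C.c_uint8),
        ("value", C.c_uint8),
        ("bit_count", C.c_uint8),
        ("padding", C.c_uint8),
    ]


class Vp8Frame(C.Structure):  # expected 1232 bytes in total
    _fields_ = [
        ("segment", Vp8Segment),
        ("lf", Vp8LoopFilter),
        ("quant", Vp8Quantization),
        ("entropy", Vp8Entropy),
        ("coder_state", Vp8CoderState),
        ("width", C.c_uint16),
        ("height", C.c_uint16),
        ("horizontal_scale", C.c_uint8),
        ("vertical_scale", C.c_uint8),
        ("version", C.c_uint8),
        ("prob_skip_false", C.c_uint8),
        ("prob_intra", C.c_uint8),
        ("prob_last", C.c_uint8),
        ("prob_gf", C.c_uint8),
        ("num_dct_parts", C.c_uint8),
        ("first_part_size", C.c_uint32),
        ("first_part_header_bits", C.c_uint32),
        ("dct_part_sizes", C.c_uint32 * 8),
        ("last_frame_ts", C.c_uint64),
        ("golden_frame_ts", C.c_uint64),
        ("alt_frame_ts", C.c_uint64),
        ("flags", C.c_uint64),
    ]


if __name__ == "__main__":
    for cls in (Vp8Segment, Vp8LoopFilter, Vp8Quantization,
                Vp8Entropy, Vp8CoderState, Vp8Frame):
        print(cls.__name__, C.sizeof(cls))
```

If the transcription is right, `C.sizeof(Vp8Frame)` agrees with the `size=1232` seen in strace, and `Vp8Frame.width.offset` gives the start of the 84-byte tail.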
### Verbatim payload — frame 1 (keyframe, PID 2860 first call)
```
struct v4l2_vp8_segment:
  quant_update[4] = (0, 0, 0, 0)
  lf_update[4] = (0, 0, 0, 0)
  segment_probs[3] = (0, 0, 0)
  flags = 0x08 (V4L2_VP8_SEGMENT_FLAG_DELTA_VALUE_MODE)
struct v4l2_vp8_loop_filter:
  ref_frm_delta[4] = (2, 0, -2, -2)
  mb_mode_delta[4] = (4, -2, 2, 4)
  sharpness_level = 0
  level = 1
  flags = 0x03 (ADJ_ENABLE | DELTA_UPDATE)
struct v4l2_vp8_quantization:
  y_ac_qi = 8
  y_dc_delta = 0
  y2_dc_delta = 0
  y2_ac_delta = 0
  uv_dc_delta = 0
  uv_ac_delta = 0
struct v4l2_vp8_entropy:
  sha1(1104 bytes) = 8b2fdae200eb193f...
  y_mode_probs[4] = (145, 156, 163, 128) ← FFmpeg's hardcoded keyframe_y_mode_probs
  uv_mode_probs[3] = (142, 114, 183) ← FFmpeg's hardcoded keyframe_uv_mode_probs
struct v4l2_vp8_entropy_coder_state:
  range = 248
  value = 133
  bit_count = 2
width × height = 1280 × 720
horizontal_scale = 0
vertical_scale = 0
version = 0
prob_skip_false = 255
prob_intra = 0 ← KEY frame: intra always-on; field unused; FFmpeg writes parser state which is 0
prob_last = 0 ← same
prob_gf = 0 ← same
num_dct_parts = 1
first_part_size = 22742
first_part_header_bits = 6550
dct_part_sizes[8] = (277872, 0, 0, 0, 0, 0, 0, 0)
last_frame_ts = 0 ← KEY frame: no prior reference
golden_frame_ts = 0 ← same
alt_frame_ts = 0 ← same
flags = 0x0d (KEY_FRAME | SHOW_FRAME | MB_NO_SKIP_COEFF)
```
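
The top-level fields at the end of the dump (from `width` through `flags`) occupy the 84-byte tail of the control. A minimal sketch of decoding that tail from a raw 1232-byte payload follows. It is not the `decode_vp8.py` kept on fresnel; the function name, the field names in the returned dict, the little-endian assumption, and the tail offset of 1148 (16+16+8+1104+4) are all assumptions made for illustration.

```python
import struct

PAYLOAD_SIZE = 1232
TAIL_OFFSET = 1148           # assumed: segment + lf + quant + entropy + coder_state
TAIL_FORMAT = "<HH8BII8I4Q"  # assumed little-endian, no implicit padding (84 bytes)

BYTE_FIELDS = (
    "horizontal_scale", "vertical_scale", "version", "prob_skip_false",
    "prob_intra", "prob_last", "prob_gf", "num_dct_parts",
)


def decode_tail(payload):
    """Decode the trailing frame-header fields of a VP8_FRAME control payload."""
    if len(payload) != PAYLOAD_SIZE:
        raise ValueError(f"expected {PAYLOAD_SIZE} bytes, got {len(payload)}")
    values = struct.unpack_from(TAIL_FORMAT, payload, TAIL_OFFSET)
    fields = {"width": values[0], "height": values[1]}
    fields.update(zip(BYTE_FIELDS, values[2:10]))
    fields["first_part_size"] = values[10]
    fields["first_part_header_bits"] = values[11]
    fields["dct_part_sizes"] = values[12:20]
    (fields["last_frame_ts"], fields["golden_frame_ts"],
     fields["alt_frame_ts"], fields["flags"]) = values[20:24]
    return fields
```

Decoding with an explicit `<` format avoids depending on the host's native alignment, which matters when the strace files are analysed on a different machine than the one that captured them.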
### Verbatim payload — frame 2 (inter, PID 2860 second call)
```
segment: same as frame 1 (BBB has segmentation disabled)
lf: ref/mb deltas same; sharp=0; level=15; flags=0x01 (DELTA_UPDATE bit clears post-keyframe)
quant: y_ac_qi=122; all deltas=0
entropy: sha1=e5742b9050e8dc66 (CHANGED — BBB inter-frame entropy state)
y_mode_probs = (3, 1, 128, 1) ← parser-derived inter probs
uv_mode_probs = (162, 101, 204) ← parser-derived
coder_state: range=150 value=69 bit_count=3 ← post-frame-1 boolean coder state
prob_skip_false=14 prob_intra=1 prob_last=251 prob_gf=255 num_dct_parts=1
first_part_size=1218 first_part_header_bits=133
dct_part_sizes=(122,0,0,0,0,0,0,0)
last_frame_ts=5000 golden_frame_ts=11000 alt_frame_ts=11000
flags = 0x66 (see flags-anomaly note below)
```
### Flags-anomaly note (informational; not blocking iter3)
The empirical inter-frame `flags=0x66 = bit 0x02 | 0x04 | 0x20 | 0x40` is set by ffmpeg-v4l2-request-git — bit `0x40` is **not defined** in mainline `<linux/v4l2-controls.h>` (only bits 0x01..0x20 are). The keyframe correctly produces `flags=0x0d = 0x01|0x04|0x08`.
The extra `0x40` bit is a vendor-patched flag in the installed ffmpeg-v4l2-request-git (the kwiboo branch may have downstream changes vs the in-tree reference). The kernel `hantro_vp8.c` driver only inspects `V4L2_VP8_FRAME_IS_KEY_FRAME(hdr)`, so bit 0x40 is silently ignored.
**Phase 4 plan implication**: the libva backend should set ONLY the 6 mainline-documented flag bits per the Phase 2 mapping table. We will NOT attempt to byte-match FFmpeg's `0x66` for inter frames during the Phase 7 cross-validator byte-compare; instead, the Phase 7 byte-compare will be field-by-field with an explicit allow-list for `flags` (KEY_FRAME bit + the 5 booleans transcoded from VAAPI).
The EXPERIMENTAL bit (0x02) is set by FFmpeg per `if (s->profile & 0x4)`. BBB has profile=0, so `(s->profile & 0x4) == 0` and the bit should be clear. The empirical 0x02 on inter frames suggests ffmpeg-v4l2-request-git either (a) uses a different conditional or (b) sets the bit from a different field. Either way, the libva backend skips it, because VAAPI doesn't expose it.
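
The bit arithmetic in this note can be reproduced with a small decoder. This is an illustrative sketch: the short flag names follow the mainline `V4L2_VP8_FRAME_FLAG_*` definitions as understood here, and the helper names `decode_flags` and `flags_equal_under_mask` are hypothetical.

```python
# Mainline-documented bits 0x01..0x20 (names shortened from V4L2_VP8_FRAME_FLAG_*).
VP8_FRAME_FLAGS = {
    0x01: "KEY_FRAME",
    0x02: "EXPERIMENTAL",
    0x04: "SHOW_FRAME",
    0x08: "MB_NO_SKIP_COEFF",
    0x10: "SIGN_BIAS_GOLDEN",
    0x20: "SIGN_BIAS_ALT",
}
MAINLINE_MASK = 0x3F


def decode_flags(flags):
    """Return (list of known flag names, residual bits outside the mainline mask)."""
    names = [name for bit, name in VP8_FRAME_FLAGS.items() if flags & bit]
    return names, flags & ~MAINLINE_MASK


def flags_equal_under_mask(ours, theirs, allow_mask):
    """Compare two flag words, looking only at the bits in allow_mask."""
    return (ours & allow_mask) == (theirs & allow_mask)


if __name__ == "__main__":
    print(decode_flags(0x0D))  # keyframe capture
    print(decode_flags(0x66))  # inter-frame capture
```

For the two captured values, `decode_flags(0x0d)` yields `KEY_FRAME`, `SHOW_FRAME`, `MB_NO_SKIP_COEFF` with no residual, while `decode_flags(0x66)` yields `EXPERIMENTAL`, `SHOW_FRAME`, `SIGN_BIAS_ALT` with residual `0x40`.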
## Step 3.4 — VAAPI consumer trace (DEFERRED)
`LIBVA_TRACE` capture from `mpv --hwdec=no --vo=null` is uninformative because mpv with hwdec=no doesn't engage the libva decode path (uses libva for color-conversion only). Capturing the libva-side decode-path trace requires HW-decode mode, which iter3's whole point is to enable.
**Decision**: defer VAAPI buffer-type enumeration verification to Phase 6 build-time. Phase 2 source-read of `va_dec_vp8.h` already enumerated the 4 buffer types (Picture, Slice, Probability, IQMatrix); Phase 6 build will verify the dispatcher accepts them. If a buffer type is missing or extra, Phase 6 compile/runtime will surface it.
## Step 3.5 — Open-question resolution
Six Phase 2 questions; empirical answers:
### Q1: `first_part_header_bits` exact value
**Frame 1 (key)**: 6550 bits. **Inter frames**: 133, 86, 140, 254, 86 (varies per frame). FFmpeg derives this from `s->coder_state_at_header_end.input - data` minus residual bits. VAAPI does NOT expose this directly.
**Phase 4 implication**: VAAPI's `slice->macroblock_offset` (bit offset of MB layer from start of slice data) is the closest analog. **However**, `slice->macroblock_offset` is the MB-data offset (after BOTH the uncompressed header AND the entropy header), whereas `first_part_header_bits` is just the entropy header portion. They differ by `first_part_size * 8 - first_part_header_bits` (the entropy-encoded part of the control partition).
**Fallback strategy**: leave `first_part_header_bits = 0` and check whether kernel hantro driver actually uses it. If it doesn't (likely — the driver re-parses the bitstream), zero is correct. If it does, Phase 7 byte-compare will reveal divergence and Phase 4 will need to compute it bitstream-side. **Phase 5 review will flag this as a known fidelity gap**.
### Q2: `num_dct_parts` vs VAAPI `num_of_partitions`
**Empirical**: `num_dct_parts = 1` for every captured frame. BBB has 2 partitions total (1 control + 1 DCT). VAAPI's `slice->num_of_partitions = 2`. Confirms predicted off-by-one: `num_dct_parts = slice->num_of_partitions - 1`.
### Q3: DPB timestamp 0-sentinel handling
**Empirical**: Frame 1 (key) has `last_ts=0, golden_ts=0, alt_ts=0` — all three zero. Inter frames have all three non-zero (referencing prior visible frames). Confirms FFmpeg writes 0 for missing refs (matches `forward_ref_ts=0` pattern from iter1 mpeg2.c::mpeg2_set_controls).
**Phase 4 implication**: in vp8_set_controls, lookup VASurfaceID `picture->{last,golden,alt}_ref_frame`; if `SURFACE() == NULL` (i.e. `VA_INVALID_SURFACE` or stale ID), leave timestamp = 0. Mirror iter1 mpeg2.c pattern (lines 146-156).
### Q4: `SHOW_FRAME` flag default
**Empirical**: `flags & SHOW_FRAME (0x04)` set on every captured frame (key + inter). BBB has no alt-ref invisible frames in the +0..2s range. **Phase 4 decision**: force `flags |= SHOW_FRAME` unconditionally — VAAPI doesn't expose the bit, and BBB is all-visible. Document as known fidelity gap for streams with alt-ref invisible frames (iter3 out-of-scope per Phase 1 lock).
### Q5: `lf.flags & FILTER_TYPE_SIMPLE`
**Empirical**: not set on any captured frame. BBB uses normal (not simple) loop filter. Confirms VAAPI's `pic_fields.bits.filter_type=0` for BBB — direct mapping per Phase 2 table.
### Q6: First-frame DPB sentinel
**Empirical**: confirmed Q3 above — `last_ts=golden_ts=alt_ts=0` for the keyframe. No self-reference fallback (different from iter1's mpeg2.c where I had to fix self-reference to use 0 sentinel; FFmpeg's VP8 path naturally writes 0 via C99 designated init).
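
The Q2, Q3/Q6 and Q4 decisions can be modelled together in a few lines. This is a Python model for reasoning only, not the C backend: `map_frame_fields`, its parameters, and the `surface_ts` lookup table are hypothetical stand-ins for the backend's surface lookup, and only the KEY_FRAME and SHOW_FRAME bits are modelled.

```python
VA_INVALID_SURFACE = 0xFFFFFFFF  # libva's invalid-ID sentinel
KEY_FRAME = 0x01
SHOW_FRAME = 0x04


def map_frame_fields(key_frame, num_of_partitions, refs, surface_ts):
    """Model the Q2/Q3/Q4 mapping decisions.

    refs       : dict with 'last', 'golden', 'alt' -> VASurfaceID
    surface_ts : dict VASurfaceID -> timestamp of a previously decoded surface
    """
    def timestamp(surface_id):
        # Q3/Q6: invalid or unknown (stale) surface IDs map to the 0 sentinel.
        if surface_id == VA_INVALID_SURFACE:
            return 0
        return surface_ts.get(surface_id, 0)

    flags = SHOW_FRAME  # Q4: forced unconditionally
    if key_frame:
        flags |= KEY_FRAME

    return {
        "num_dct_parts": num_of_partitions - 1,  # Q2: VAAPI counts the control partition
        "last_frame_ts": timestamp(refs["last"]),
        "golden_frame_ts": timestamp(refs["golden"]),
        "alt_frame_ts": timestamp(refs["alt"]),
        "flags": flags,
    }
```

With BBB's `num_of_partitions = 2` and all three references invalid, the model returns `num_dct_parts = 1` and three zero timestamps, matching the captured keyframe payload for those fields.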
## Phase 3 → Phase 4 transition (proceed condition)
All Phase 3 deliverables green:
- Substrate not regressed (3-codec hashes hold, criterion-5 anchor solid)
- Cross-validator strace captured (~13 S_EXT_CTRLS for 5 frames decoded; verbatim payload for keyframe + inter frames available)
- struct size CORRECTED to 1232 bytes (vs Phase 2 implicit assumption of ~400)
- 5 of 6 open questions answered empirically; Q1 (first_part_header_bits) deferred with safe-default fallback
- VP8 SW reference JPEGs captured (criterion-4 anchor)
Phase 4 plan can lock against:
1. Verbatim keyframe + inter-frame payload bytes (above) for byte-compare anchors.
2. Confirmed quantization deltas all zero for BBB → the libva backend computes `quant.y_dc_delta = quantization_index[0][1] - quantization_index[0][0]` and verifies that the result is all-zero for BBB, confirming the mapping.
3. Confirmed segment fields all zero for BBB (segmentation disabled); `segment.flags |= DELTA_VALUE_MODE` per FFmpeg pattern, but kernel ignores when ENABLED bit clear.
4. `lf.flags = ADJ_ENABLE` for BBB on inter frames; `ADJ_ENABLE | DELTA_UPDATE` on keyframe (DELTA_UPDATE only when keyframe initializes the loop-filter delta state).
5. `flags` byte-compare to use mainline-documented bits only (libva backend will produce 0x0d for keyframe, 0x04|... for inter frames; FFmpeg's bit 0x40 explicitly NOT replicated).
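
The quantization mapping in item 2 can be sketched as follows. Only the `y_dc_delta` formula comes from this document; extending the same subtraction to the other four deltas, and the column order (y_ac, y_dc, y2_dc, y2_ac, uv_dc, uv_ac) of a `quantization_index` row, are assumptions based on how FFmpeg appears to fill the VAAPI IQ matrix. The function name is hypothetical.

```python
def quant_from_vaapi(index_row):
    """Recover base index and deltas from one VAAPI quantization_index row.

    index_row is assumed to hold absolute indices in the order
    (y_ac, y_dc, y2_dc, y2_ac, uv_dc, uv_ac).
    """
    if len(index_row) != 6:
        raise ValueError("expected 6 quantizer indices")
    y_ac, y_dc, y2_dc, y2_ac, uv_dc, uv_ac = index_row
    return {
        "y_ac_qi": y_ac,
        "y_dc_delta": y_dc - y_ac,
        "y2_dc_delta": y2_dc - y_ac,
        "y2_ac_delta": y2_ac - y_ac,
        "uv_dc_delta": uv_dc - y_ac,
        "uv_ac_delta": uv_ac - y_ac,
    }
```

If the producer clamps each absolute index to the valid range before storing it, deltas recovered this way can be inexact near the ends of the range; for BBB, where all deltas are zero, that does not arise.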
## Substrate state at Phase 3 close
- iter3 Phase 1 + Phase 2 commits pushed to gitea (`ea2413e`, `898544a`).
- Fork on noether at iter2 tip `8d71e20`; Phase 6 patches will land here.
- fresnel went offline twice during Phase 3 capture (suspend mid-run), captures preserved on `/tmp/iter3_phase3` between runs.
- Memory rules carry forward unchanged (5 entries + new `feedback_fresnel_hostname`).
- Capture artefacts on fresnel `/tmp/iter3_phase3/`:
  - `vp8_strace.*`: 19 strace files (multi-thread)
  - `decode_vp8.py`: payload decoder (kept for Phase 7 re-run)
  - `vp8_sw_00{1,2}.jpg`: SW reference (criterion-4 anchor)
  - `{h264,mpeg2,hevc}_hw_00{1,2}.jpg`: regression block (criterion-5 anchors)