claude-noether fd3fce86a6 iter3 Phase 3: baselines — VP8 cross-validator + 3-codec regression

+ SW reference

Captured on fresnel 2026-05-08 across two suspend cycles (laptop
dropped twice mid-run, captures preserved on /tmp/iter3_phase3).
All Phase 3 deliverables green.

Substrate verification:
  backend SHA256: 9e27...6258 (matches iter2 close)
  3-codec regression block: ALL 6 reference hashes match byte-for-
  byte vs iter1+iter2 (H.264 +30s, MPEG-2 +02s, HEVC +02s on rkvdec/
  hantro). Substrate has not regressed; criterion-5 anchor solid.

Cross-validator anchor (ffmpeg-v4l2request VP8 strace):
  - VIDIOC_S_EXT_CTRLS, count=1, ctrl_class=V4L2_CTRL_CLASS_CODEC_
    STATELESS, id=0xa409c8, size=1232 bytes
  - struct size CORRECTED: v4l2_ctrl_vp8_frame = 1232 bytes (NOT
    400 as one might assume; entropy.coeff_probs[4][8][3][11] alone
    is 1056 bytes)
  - keyframe (frame 1) verbatim payload captured: y_ac_qi=8,
    last/golden/alt ts all 0, flags=0x0d (KEY|SHOW|NOSKIP),
    y_mode_probs=[145,156,163,128] (matches FFmpeg keyframe const)
  - inter frame verbatim payload captured: y_ac_qi=122, all DPB
    timestamps non-zero, flags=0x66 (anomaly: bit 0x40 not in
    mainline UAPI; vendor-patched ffmpeg-v4l2-request-git;
    kernel hantro_vp8.c only inspects KEY_FRAME bit, ignores
    bit 0x40)

VP8 SW pixel-verify reference (criterion-4 anchor):
  vp8_sw_001.jpg: e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad
                  584d789db2c984
  vp8_sw_002.jpg: a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3
                  e6ea8c0c78e97a
  Frame 1 != Frame 2 (real motion). These are the Phase 7 byte-
  compare HW-vs-SW targets.

Open-question resolution (5 of 6 answered empirically):

  Q1 first_part_header_bits — varies per frame (key=6550, inter
     ranges 86..254); VAAPI doesn't expose. Phase 4 fallback:
     leave 0 and check kernel behavior at Phase 7 byte-compare.
     Phase 5 review will flag as known fidelity gap.

  Q2 num_dct_parts vs VAAPI num_of_partitions — confirmed off-by-
     one: kernel = VAAPI - 1 (BBB has VAAPI=2, kernel=1).

  Q3 DPB timestamp 0-sentinel — confirmed: keyframe writes all
     three timestamps as 0; iter3 mirrors iter1 mpeg2.c pattern.

  Q4 SHOW_FRAME default — set on every captured frame (BBB has no
     alt-ref invisible). Force unconditional in libva backend.

  Q5 lf.flags FILTER_TYPE_SIMPLE — not set; BBB normal loop filter.
     Direct mapping from VAAPI filter_type=0.

  Q6 First-frame DPB sentinel — confirmed Q3; no self-reference
     fallback needed (different from iter1 mpeg2.c).

V4L2 binding cells this boot:
  rkvdec        : /dev/video3 + /dev/media1
  hantro-vpu-dec: /dev/video5 + /dev/media2

Capture artefacts on fresnel /tmp/iter3_phase3/ preserved for
Phase 7 re-run:
  vp8_strace.* (19 files, multi-thread)
  decode_vp8.py (payload decoder)
  vp8_sw_00{1,2}.jpg (criterion-4)
  {h264,mpeg2,hevc}_hw_00{1,2}.jpg (criterion-5)

Refs:
  phase0_findings_iter3.md (Phase 1 lock)
  phase2_iter3_situation.md (Phase 2 contract surface)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-08 20:14:46 +00:00


Iteration 3 — Phase 3 (baselines)

Captured 2026-05-08 on fresnel. The laptop suspended twice mid-capture; the captures were preserved on /tmp/iter3_phase3 between runs. Phase 3 deliverables per feedback_dev_process.md:

- Substrate re-verification (criterion-5 anchor): ✅
- Cross-validator anchor (verbatim VP8_FRAME control payload): ✅
- VAAPI consumer trace: deferred to Phase 6 build-time check (see step 3.4)
- Cache-safe pixel-verify SW reference (criterion-4 anchor): ✅
- Phase 2 open-question answers: ✅ (5 of 6 answered empirically; 1 deferred)
- Three-codec regression block: ✅

Pre-flight (verified)

hostname           : fresnel
kernel             : 6.19.9-99-eos-arm
mpv                : 1:0.41.0-3 (NOT mpv-git — L3 satisfied)
libva              : 2.23.0-1
ffmpeg             : ffmpeg-v4l2-request-git 2:8.1.r123329.b57fbbe-2
backend SHA256     : 9e27043847998c197a46a1a26b2f77f22880bb7b3a62aa4d60d8fcaec0ae6258  ← matches iter2 close
fixture            : ~/fourier-test/bbb_720p10s_vp8.webm (2419912 bytes)

V4L2 binding cells (this boot):
  rkvdec        : /dev/video3 + /dev/media1
  hantro-vpu-dec: /dev/video5 + /dev/media2

Step 3.1 — Regression-block reference (criterion-5 anchor)

All 6 reference hashes match byte-for-byte vs iter1+iter2 close — substrate has not regressed.

| Codec | Site | Frame 1 SHA256 | Frame 2 SHA256 | Status |
| --- | --- | --- | --- | --- |
| H.264 +30s | rkvdec | `f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9` | `7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8` | ✅ MATCH |
| MPEG-2 +02s | hantro | `6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092` | `ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de` | ✅ MATCH |
| HEVC +02s | rkvdec | `47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5` | `a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656` | ✅ MATCH |

JPEGs preserved at /tmp/iter3_phase3/{h264,mpeg2,hevc}_hw_00{1,2}.jpg for re-running Phase 7 byte-compare.

Step 3.2 — VP8 SW pixel-verify reference (criterion-4 anchor)

mpv --hwdec=no --vo=image --vo-image-format=jpg --frames=2 --start=00:00:02 ~/fourier-test/bbb_720p10s_vp8.webm:

| Frame | SHA256 | Size |
| --- | --- | --- |
| 1 | `e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984` | 235990 bytes |
| 2 | `a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a` | 232549 bytes |

Frame 1 ≠ Frame 2 (real motion). These two hashes are the Phase 7 criterion-4 byte-equality target.

JPEGs preserved at /tmp/iter3_phase3/vp8_sw_00{1,2}.jpg.
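
For the Phase 7 re-run, the byte-equality check against these two hashes could be scripted roughly as follows. This is a minimal sketch, not the project's tooling: the helper names `sha256_file` and `verify` are hypothetical, and the only inputs taken from this report are the two file names and their SHA256 values.

```python
import hashlib
from pathlib import Path

# Reference hashes copied from the Step 3.2 table above.
REFERENCE = {
    "vp8_sw_001.jpg": "e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984",
    "vp8_sw_002.jpg": "a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a",
}


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(directory, reference=REFERENCE):
    """Map each reference file name to True (hash matches) or False."""
    results = {}
    for name, expected in reference.items():
        path = Path(directory) / name
        # A missing file counts as a mismatch rather than raising.
        results[name] = path.is_file() and sha256_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify("/tmp/iter3_phase3").items():
        print(f"{name}: {'MATCH' if ok else 'MISMATCH'}")
```

The same `verify` call would work for the criterion-5 regression block if it is passed a dictionary built from the Step 3.1 hashes instead.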

Step 3.3 — Cross-validator strace + V4L2_CID_STATELESS_VP8_FRAME payload

Command: `strace -ff -tt -y -v -s 4096 -e trace=ioctl,openat,close ffmpeg -hwaccel v4l2request -hwaccel_device /dev/media2 -i bbb_720p10s_vp8.webm -frames:v 5 -f null -`. ffmpeg-v4l2-request-git decoded 5 frames; strace produced 7 worker-thread PID files plus helpers. Payloads were decoded with a custom Python decoder at /tmp/iter3_phase3/decode_vp8.py (kept on disk for re-run).

Submission shape (confirmed)

| Property | Value | Source |
| --- | --- | --- |
| ioctl | `VIDIOC_S_EXT_CTRLS` | strace verbatim |
| `ctrl_class` | `0xf010000` (`V4L2_CTRL_CLASS_CODEC_STATELESS`) | strace verbatim |
| `count` | `1` | strace verbatim; confirms the single-control-per-frame shape predicted in Phase 2 |
| `controls[0].id` | `0xa409c8` | matches `V4L2_CID_STATELESS_VP8_FRAME` from kernel UAPI |
| `controls[0].size` | `1232` | CORRECTION vs Phase 2: the original iter1 commit message said "400 bytes" for `v4l2_ctrl_vp8_frame`; the actual size is 1232 bytes. Computed: 16 (seg) + 16 (lf) + 8 (quant) + 1104 (entropy) + 4 (coder_state) + 84 (tail) = 1232. The largest component is `entropy.coeff_probs[4][8][3][11]` at 1056 bytes. |
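
The 1232-byte total can be cross-checked without a kernel build by mirroring the struct in `ctypes`. The sketch below is an assumption-laden transcription: field names and order follow the payload dump in this report and my reading of the mainline `<linux/v4l2-controls.h>`, including its explicit padding members. Check it against the installed header before relying on it.

```python
import ctypes as C


class Vp8Segment(C.Structure):
    _fields_ = [("quant_update", C.c_int8 * 4), ("lf_update", C.c_int8 * 4),
                ("segment_probs", C.c_uint8 * 3), ("padding", C.c_uint8),
                ("flags", C.c_uint32)]


class Vp8LoopFilter(C.Structure):
    _fields_ = [("ref_frm_delta", C.c_int8 * 4), ("mb_mode_delta", C.c_int8 * 4),
                ("sharpness_level", C.c_uint8), ("level", C.c_uint8),
                ("padding", C.c_uint16), ("flags", C.c_uint32)]


class Vp8Quantization(C.Structure):
    _fields_ = [("y_ac_qi", C.c_uint8), ("y_dc_delta", C.c_int8),
                ("y2_dc_delta", C.c_int8), ("y2_ac_delta", C.c_int8),
                ("uv_dc_delta", C.c_int8), ("uv_ac_delta", C.c_int8),
                ("padding", C.c_uint16)]


class Vp8Entropy(C.Structure):
    _fields_ = [("coeff_probs", C.c_uint8 * (4 * 8 * 3 * 11)),  # 1056 bytes
                ("y_mode_probs", C.c_uint8 * 4),
                ("uv_mode_probs", C.c_uint8 * 3),
                ("mv_probs", C.c_uint8 * (2 * 19)),
                ("padding", C.c_uint8 * 3)]


class Vp8CoderState(C.Structure):
    _fields_ = [("range", C.c_uint8), ("value", C.c_uint8),
                ("bit_count", C.c_uint8), ("padding", C.c_uint8)]


class Vp8Frame(C.Structure):
    _fields_ = [("segment", Vp8Segment), ("lf", Vp8LoopFilter),
                ("quant", Vp8Quantization), ("entropy", Vp8Entropy),
                ("coder_state", Vp8CoderState),
                ("width", C.c_uint16), ("height", C.c_uint16),
                ("horizontal_scale", C.c_uint8), ("vertical_scale", C.c_uint8),
                ("version", C.c_uint8), ("prob_skip_false", C.c_uint8),
                ("prob_intra", C.c_uint8), ("prob_last", C.c_uint8),
                ("prob_gf", C.c_uint8), ("num_dct_parts", C.c_uint8),
                ("first_part_size", C.c_uint32),
                ("first_part_header_bits", C.c_uint32),
                ("dct_part_sizes", C.c_uint32 * 8),
                ("last_frame_ts", C.c_uint64), ("golden_frame_ts", C.c_uint64),
                ("alt_frame_ts", C.c_uint64), ("flags", C.c_uint64)]


if __name__ == "__main__":
    for cls in (Vp8Segment, Vp8LoopFilter, Vp8Quantization,
                Vp8Entropy, Vp8CoderState, Vp8Frame):
        print(f"{cls.__name__:16s} {C.sizeof(cls)} bytes")
```

With native alignment, the five sub-structs occupy 1148 bytes and the remaining scalar fields form the 84-byte tail, which is how the breakdown in the table arrives at 1232.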

Verbatim payload — frame 1 (keyframe, PID 2860 first call)

struct v4l2_vp8_segment:
  quant_update[4]   = (0, 0, 0, 0)
  lf_update[4]      = (0, 0, 0, 0)
  segment_probs[3]  = (0, 0, 0)
  flags             = 0x08  (V4L2_VP8_SEGMENT_FLAG_DELTA_VALUE_MODE)

struct v4l2_vp8_loop_filter:
  ref_frm_delta[4]  = (2, 0, -2, -2)
  mb_mode_delta[4]  = (4, -2, 2, 4)
  sharpness_level   = 0
  level             = 1
  flags             = 0x03  (ADJ_ENABLE | DELTA_UPDATE)

struct v4l2_vp8_quantization:
  y_ac_qi           = 8
  y_dc_delta        = 0
  y2_dc_delta       = 0
  y2_ac_delta       = 0
  uv_dc_delta       = 0
  uv_ac_delta       = 0

struct v4l2_vp8_entropy:
  sha1(1104 bytes)  = 8b2fdae200eb193f...
  y_mode_probs[4]   = (145, 156, 163, 128)  ← FFmpeg's hardcoded keyframe_y_mode_probs
  uv_mode_probs[3]  = (142, 114, 183)        ← FFmpeg's hardcoded keyframe_uv_mode_probs

struct v4l2_vp8_entropy_coder_state:
  range             = 248
  value             = 133
  bit_count         = 2

width × height      = 1280 × 720
horizontal_scale    = 0
vertical_scale      = 0
version             = 0
prob_skip_false     = 255
prob_intra          = 0      ← KEY frame: intra always-on; field unused; FFmpeg writes parser state which is 0
prob_last           = 0      ← same
prob_gf             = 0      ← same
num_dct_parts       = 1
first_part_size     = 22742
first_part_header_bits = 6550
dct_part_sizes[8]   = (277872, 0, 0, 0, 0, 0, 0, 0)
last_frame_ts       = 0      ← KEY frame: no prior reference
golden_frame_ts     = 0      ← same
alt_frame_ts        = 0      ← same
flags               = 0x0d   (KEY_FRAME | SHOW_FRAME | MB_NO_SKIP_COEFF)

Verbatim payload — frame 2 (inter, PID 2860 second call)

segment: same as frame 1 (BBB has segmentation disabled)
lf: ref/mb deltas same; sharp=0; level=15; flags=0x01  (DELTA_UPDATE bit clears post-keyframe)
quant: y_ac_qi=122; all deltas=0
entropy: sha1=e5742b9050e8dc66 (CHANGED — BBB inter-frame entropy state)
  y_mode_probs   = (3, 1, 128, 1)             ← parser-derived inter probs
  uv_mode_probs  = (162, 101, 204)            ← parser-derived
coder_state: range=150 value=69 bit_count=3   ← post-frame-1 boolean coder state
prob_skip_false=14 prob_intra=1 prob_last=251 prob_gf=255 num_dct_parts=1
first_part_size=1218 first_part_header_bits=133
dct_part_sizes=(122,0,0,0,0,0,0,0)
last_frame_ts=5000  golden_frame_ts=11000  alt_frame_ts=11000
flags = 0x66  (see flags-anomaly note below)

Flags-anomaly note (informational; not blocking iter3)

The empirical inter-frame value flags=0x66 decomposes as 0x02 | 0x04 | 0x20 | 0x40, as set by ffmpeg-v4l2-request-git. Bit 0x40 is not defined in mainline <linux/v4l2-controls.h> (only bits 0x01..0x20 are). The keyframe correctly produces flags=0x0d = 0x01 | 0x04 | 0x08.

The 0x40 extra bit is a vendor-patched additional flag in the installed ffmpeg-v4l2-request-git (kwiboo branch may have downstream changes vs the in-tree reference). The kernel hantro_vp8.c driver only inspects V4L2_VP8_FRAME_IS_KEY_FRAME(hdr) — bit 0x40 is silently ignored.

Phase 4 plan implication: the libva backend should set ONLY the 6 mainline-documented flag bits per the Phase 2 mapping table. We will NOT attempt to byte-match FFmpeg's 0x66 for inter frames during the Phase 7 cross-validator byte-compare; instead, that byte-compare will be field-by-field with an explicit allow-list for flags (the KEY_FRAME bit plus the 5 booleans transcoded from VAAPI).

EXPERIMENTAL bit (0x02) is set by FFmpeg per if (s->profile & 0x4). BBB profile=0, so s->profile & 0x4 == 0. The empirical 0x02 set on inter frames suggests ffmpeg-v4l2-request-git either (a) has a different conditional or (b) sets it from a different field. Either way, libva backend skips this — VAAPI doesn't expose it.
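
The bit decomposition and the masked comparison described in this note can be sketched as follows. The bit-to-name table reflects the six mainline `V4L2_VP8_FRAME_FLAG_*` values as I understand them; `decode_flags` and `flags_match` are hypothetical helper names, and the default mask is an assumption about what the Phase 7 allow-list will contain.

```python
# Mainline V4L2_VP8_FRAME_FLAG_* bits (prefix dropped for brevity).
VP8_FRAME_FLAGS = {
    0x01: "KEY_FRAME",
    0x02: "EXPERIMENTAL",
    0x04: "SHOW_FRAME",
    0x08: "MB_NO_SKIP_COEFF",
    0x10: "SIGN_BIAS_GOLDEN",
    0x20: "SIGN_BIAS_ALT",
}
MAINLINE_MASK = 0x3F  # union of the six documented bits


def decode_flags(flags):
    """Return (names of known bits set, leftover bits outside the mainline set)."""
    names = [name for bit, name in VP8_FRAME_FLAGS.items() if flags & bit]
    unknown = flags & ~MAINLINE_MASK
    return names, unknown


def flags_match(ours, reference, mask=MAINLINE_MASK):
    """Compare two flag words on the allow-listed bits only."""
    return (ours & mask) == (reference & mask)


if __name__ == "__main__":
    for value in (0x0D, 0x66):
        names, unknown = decode_flags(value)
        print(f"{value:#04x}: {' | '.join(names)}  unknown={unknown:#x}")
```

Passing a narrower `mask` (for example one that excludes 0x02) would let the comparison also tolerate the EXPERIMENTAL bit that the libva backend does not set.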

Step 3.4 — VAAPI consumer trace (DEFERRED)

LIBVA_TRACE capture from mpv --hwdec=no --vo=null is uninformative because mpv with hwdec=no doesn't engage the libva decode path (uses libva for color-conversion only). Capturing the libva-side decode-path trace requires HW-decode mode, which iter3's whole point is to enable.

Decision: defer VAAPI buffer-type enumeration verification to Phase 6 build-time. Phase 2 source-read of va_dec_vp8.h already enumerated the 4 buffer types (Picture, Slice, Probability, IQMatrix); Phase 6 build will verify the dispatcher accepts them. If a buffer type is missing or extra, Phase 6 compile/runtime will surface it.

Step 3.5 — Open-question resolution

Six Phase 2 questions; empirical answers:

Q1: `first_part_header_bits` exact value

Frame 1 (key): 6550 bits. Inter frames, starting with frame 2: 133, 86, 140, 254, 86 (varies per frame). FFmpeg derives this from s->coder_state_at_header_end.input - data minus residual bits. VAAPI does NOT expose this directly.

Phase 4 implication: VAAPI's slice->macroblock_offset (bit offset of MB layer from start of slice data) is the closest analog. However, slice->macroblock_offset is the MB-data offset (after BOTH the uncompressed header AND the entropy header), whereas first_part_header_bits is just the entropy header portion. They differ by first_part_size * 8 - first_part_header_bits (the entropy-encoded part of the control partition).

Fallback strategy: leave first_part_header_bits = 0 and check whether kernel hantro driver actually uses it. If it doesn't (likely — the driver re-parses the bitstream), zero is correct. If it does, Phase 7 byte-compare will reveal divergence and Phase 4 will need to compute it bitstream-side. Phase 5 review will flag this as a known fidelity gap.

Q2: `num_dct_parts` vs VAAPI `num_of_partitions`

Empirical: num_dct_parts = 1 for every captured frame. BBB has 2 partitions total (1 control + 1 DCT). VAAPI's slice->num_of_partitions = 2. Confirms predicted off-by-one: num_dct_parts = slice->num_of_partitions - 1.
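
A minimal sketch of that off-by-one mapping, with a sanity check added: VP8 codes the DCT partition count as a power of two (1, 2, 4 or 8), so anything else after the subtraction indicates a bad input. The function name `num_dct_parts_from_va` is hypothetical.

```python
VALID_DCT_PARTS = (1, 2, 4, 8)  # VP8 allows 1, 2, 4 or 8 DCT partitions


def num_dct_parts_from_va(num_of_partitions):
    """Convert VAAPI's partition count (control + DCT) to the kernel's DCT-only count."""
    dct_parts = num_of_partitions - 1
    if dct_parts not in VALID_DCT_PARTS:
        raise ValueError(f"unexpected num_of_partitions={num_of_partitions}")
    return dct_parts


if __name__ == "__main__":
    # BBB: VAAPI reports 2 partitions, the kernel control carries 1.
    print(num_dct_parts_from_va(2))
```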

Q3: DPB timestamp 0-sentinel handling

Empirical: Frame 1 (key) has last_ts=0, golden_ts=0, alt_ts=0 — all three zero. Inter frames have all three non-zero (referencing prior visible frames). Confirms FFmpeg writes 0 for missing refs (matches forward_ref_ts=0 pattern from iter1 mpeg2.c::mpeg2_set_controls).

Phase 4 implication: in vp8_set_controls, look up the VASurfaceID in picture->{last,golden,alt}_ref_frame; if SURFACE() == NULL (i.e. VA_INVALID_SURFACE or a stale ID), leave the timestamp at 0. This mirrors the iter1 mpeg2.c pattern (lines 146-156).
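
The lookup rule can be modelled in a few lines. This is an illustrative sketch only: the real backend is C, the dictionary stands in for its surface table, and the surface IDs in the usage example are made up. `VA_INVALID_SURFACE` is 0xFFFFFFFF in libva.

```python
VA_INVALID_SURFACE = 0xFFFFFFFF  # libva's VA_INVALID_ID


def ref_timestamp(surface_id, surface_timestamps):
    """Return the reference timestamp for a surface, or the 0 sentinel.

    surface_timestamps maps live surface IDs to the timestamp their
    decoded buffer was queued with; a stale ID is simply absent.
    """
    if surface_id == VA_INVALID_SURFACE:
        return 0
    return surface_timestamps.get(surface_id, 0)


if __name__ == "__main__":
    table = {7: 5000, 3: 11000}  # hypothetical surface IDs
    refs = {"last": 7, "golden": 3, "alt": VA_INVALID_SURFACE}
    print({name: ref_timestamp(sid, table) for name, sid in refs.items()})
```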

Q4: `SHOW_FRAME` flag default

Empirical: flags & SHOW_FRAME (0x04) set on every captured frame (key + inter). BBB has no alt-ref invisible frames in the +0..2s range. Phase 4 decision: force flags |= SHOW_FRAME unconditionally — VAAPI doesn't expose the bit, and BBB is all-visible. Document as known fidelity gap for streams with alt-ref invisible frames (iter3 out-of-scope per Phase 1 lock).

Q5: `lf.flags & FILTER_TYPE_SIMPLE`

Empirical: not set on any captured frame. BBB uses normal (not simple) loop filter. Confirms VAAPI's pic_fields.bits.filter_type=0 for BBB — direct mapping per Phase 2 table.

Q6: First-frame DPB sentinel

Empirical: confirmed Q3 above — last_ts=golden_ts=alt_ts=0 for the keyframe. No self-reference fallback (different from iter1's mpeg2.c where I had to fix self-reference to use 0 sentinel; FFmpeg's VP8 path naturally writes 0 via C99 designated init).

Phase 3 → Phase 4 transition (proceed condition)

All Phase 3 deliverables green:

- Substrate not regressed (3-codec hashes hold, criterion-5 anchor solid)
- Cross-validator strace captured (~13 S_EXT_CTRLS for 5 frames decoded; verbatim payload for keyframe + inter frames available)
- struct size CORRECTED to 1232 bytes (vs Phase 2 implicit assumption of ~400)
- 5 of 6 open questions answered empirically; Q1 (first_part_header_bits) deferred with safe-default fallback
- VP8 SW reference JPEGs captured (criterion-4 anchor)

Phase 4 plan can lock against:

- Verbatim keyframe + inter-frame payload bytes (above) for byte-compare anchors.
- Confirmed quantization deltas are all zero for BBB: the libva backend computes quant.y_dc_delta = quantization_index[0][1] - quantization_index[0][0] and verifies that the result is all-zero for BBB; mapping is correct.
- Confirmed segment fields are all zero for BBB (segmentation disabled); segment.flags |= DELTA_VALUE_MODE per FFmpeg pattern, but the kernel ignores it when the ENABLED bit is clear.
- lf.flags = ADJ_ENABLE for BBB on inter frames; ADJ_ENABLE | DELTA_UPDATE on keyframe (DELTA_UPDATE only when the keyframe initializes the loop-filter delta state).
- flags byte-compare uses mainline-documented bits only (libva backend will produce 0x0d for keyframe, 0x04|... for inter frames; FFmpeg's bit 0x40 explicitly NOT replicated).
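
The quantization mapping named in the list above could look like the sketch below. It assumes that one row of VAAPI's `quantization_index[segment][6]` holds absolute indices in the order y_ac, y_dc, y2_dc, y2_ac, uv_dc, uv_ac; that column order is my assumption and should be confirmed against the Phase 2 mapping table. Because VAAPI indices may already be clamped, the recovered deltas can differ from the bitstream deltas at the edges of the index range. `quant_from_va` is a hypothetical name.

```python
DELTA_FIELDS = ("y_dc_delta", "y2_dc_delta", "y2_ac_delta",
                "uv_dc_delta", "uv_ac_delta")


def quant_from_va(row):
    """Derive V4L2-style quantization fields from one VAAPI index row.

    row: six absolute indices, assumed order
         (y_ac, y_dc, y2_dc, y2_ac, uv_dc, uv_ac).
    """
    if len(row) != 6:
        raise ValueError("expected 6 quantization indices")
    base = row[0]
    quant = {"y_ac_qi": base}
    # Each delta is the column's absolute index minus the y_ac base index.
    for name, value in zip(DELTA_FIELDS, row[1:]):
        quant[name] = value - base
    return quant


if __name__ == "__main__":
    # BBB keyframe (y_ac_qi=8) and inter frame (y_ac_qi=122), all deltas zero.
    print(quant_from_va([8] * 6))
    print(quant_from_va([122] * 6))
```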

Substrate state at Phase 3 close

- iter3 Phase 1 + Phase 2 commits pushed to gitea (ea2413e, 898544a).
- Fork on noether at iter2 tip 8d71e20; Phase 6 patches will land here.
- fresnel went offline twice during Phase 3 capture (suspend mid-run); captures preserved on /tmp/iter3_phase3 between runs.
- Memory rules carry forward unchanged (5 entries + new feedback_fresnel_hostname).
- Capture artefacts on fresnel /tmp/iter3_phase3/:
  - vp8_strace.*: 19 strace files (multi-thread)
  - decode_vp8.py: payload decoder (kept for Phase 7 re-run)
  - vp8_sw_00{1,2}.jpg: SW reference (criterion-4 anchor)
  - {h264,mpeg2,hevc}_hw_00{1,2}.jpg: regression block (criterion-5 anchors)
