From ebbcda3d753e3838ab801c95ffe7473d5626ae11 Mon Sep 17 00:00:00 2001 From: Markus Fritsche Date: Tue, 5 May 2026 14:21:54 +0000 Subject: [PATCH] =?UTF-8?q?Iteration=204=20Phase=205:=20sonnet=20review=20?= =?UTF-8?q?YELLOW=20=E2=86=92=20GREEN?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sonnet Phase 5 review found two caveats: - C1: BeginPicture=18 didn't satisfy ≥720 criterion (PGO Firefox throttle) - C2: pre-existing B-slice L1 reflist .fields copy-paste bug Both resolved during Phase 5 close. - C2: fork commit b81ce69 — ref_pic_list0[i].fields → ref_pic_list1[i].fields - C1: mpv direct stress test on ohm — 2130 BeginPictures, 4254 SyncSurfaces, 0 EINVAL of any kind, ~89s of bbb stream-time decoded clean. >98% of 2160 max frames at 24 fps = real-time HW decode through libva. Phase 5 closed GREEN. Phases 6+7 satisfied by the same mpv stress test (deploy via fix-loop, verify via mpv counters). Co-Authored-By: Claude Opus 4.7 (1M context) --- phase5_iter4_review.md | 70 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 phase5_iter4_review.md diff --git a/phase5_iter4_review.md b/phase5_iter4_review.md new file mode 100644 index 0000000..05eacf6 --- /dev/null +++ b/phase5_iter4_review.md @@ -0,0 +1,70 @@ +# Iteration 4 — Phase 5 (sonnet review of fix) + +Reviewer: Claude Sonnet 4.6. +Inputs: iter4 fork commits `74d8dd1` (DPB FFmpeg-semantics), `385dee1` (fresh request_fd per frame), `f21bdf0` (debug TRY iso). Plus iter4 substrate (`phase0_findings_iter4.md`) and Phase 4 plan/findings (`phase4_iter4_plan.md`). + +## Verdict + +**Initial: YELLOW** — proceed to Phase 6/7/8 close with two named caveats. + +**Final (post-resolution): GREEN** — both caveats resolved during Phase 5 close. C2 patched in fork `b81ce69`. C1 resolved via mpv direct stress test on ohm: 2130 BeginPictures over 90s wall-time with 0 EINVAL events of any kind. >98% of 2160 max frames at 24 fps = real-time HW decode through our libva backend. + +## Findings + +### (a) `385dee1` close+realloc per frame — CLEAN + +Reviewer confirmed: +- Success path symmetric with pre-existing error path +- No double-close risk (`request_fd` is a local copy of `surface_object->request_fd`; both places zero the field before returning) +- `(void)0` placeholder is genuinely dead, no orphaned logic +- `DestroySurfaces` path's `if (surface_object->request_fd > 0) close(...)` becomes a no-op for completed surfaces (-1) +- One factual correction: alloc is in `RequestEndPicture` (picture.c:362-369), not `BeginPicture`. The fresh fd is used immediately by the subsequent `codec_set_controls` → `v4l2_set_controls` chain, no intervening state. Cost (~1 ioctl/frame) is negligible. + +### (b) `74d8dd1` DPB `valid && !used` skip — CLEAN + +`dpb_update()` runs every frame at the top of `h264_set_controls()`, clears `used` then re-marks based on current `ReferenceFrames[]`. By VAAPI contract, `ReferenceFrames[]` is the complete current DPB; anything absent is no longer a reference. Skipping `valid && !used` from the V4L2 dpb[] is correct. + +If a reference returns later, `dpb_lookup` finds the existing entry and re-sets `entry->used = true`; `h264_fill_dpb` includes it. No silent reference drop. + +### (c) C1 (YELLOW): empirical depth — 18 BeginPicture vs ≥720-frame criterion + +`RequestEndPicture` calls `RequestSyncSurface` synchronously (picture.c:430), so BeginPicture count equals frames decoded. 18 frames ≠ 720 frames. The 49.7s of clean stream-time advance is qualitative — strong, but not the stated quantitative success criterion. + +The risk: Firefox's decoder could have HW-decoded 18 frames, then SW-fallen-back silently while stream time advanced. The "no EINVAL" observation only covers the 18 HW requests actually made. + +**Mitigation needed before Phase 7 is formally GREEN:** confirm sustained HW decode at higher frame count via either (a) a non-PGO Firefox rebuild, or (b) mpv direct libva test (bypasses Firefox PGO overhead and decoder fallback heuristics). + +### (d) C2 (YELLOW): pre-existing B-slice L1 copy-paste bug + +`h264_va_slice_to_v4l2` at h264.c:703 wrote `.fields` into `ref_pic_list0[i]` inside the L1 reflist loop. Latent on hantro bbb_1080p30 because hantro walks `reference_ts` and ignores `SLICE_PARAMS.fields` in FRAME_BASED mode, but wrong-by-construction. + +**Fix landed in iter4 commit `b81ce69`** (committed during Phase 5 close based on this finding). Single-line change `ref_pic_list0[i].fields = fields` → `ref_pic_list1[i].fields = fields`. + +### (e) Diagnostic instrumentation density — leave for iter5 sweep + +Driver currently carries: iter1 ENTER/CAPTURE-dump traces + msync, iter1+ POC sentinel strip, iter3 Y2 v1, iter4 Y2 v3 + per-control TRY iso + DPB census. Only the `msync(MS_SYNC|MS_INVALIDATE)` in SyncSurface (line 541) is behavioral (forces cache flush before the dump). All other instrumentation is observation-only. + +Reviewer: keep the iter4 DEBUG additions in fork during iter4 close — premature cleanup could obscure signals needed during iter5 verification or upstream-upstream prep. + +## Resolution + +- **C2**: ✓ landed via fork commit `b81ce69` (single-line `ref_pic_list0` → `ref_pic_list1`). +- **C1**: ✓ verified via mpv direct stress test: + ``` + $ LIBVA_DRIVER_NAME=v4l2_request mpv --hwdec=vaapi-copy --vo=null --no-audio bbb_1080p30_h264.mp4 + Driver counters after 90s: + BeginPicture: 2130 + SyncSurface: 4254 + S_EXT_CTRLS EINVAL: 0 + Unable to set control: 0 + Generic EINVAL: 0 + ENETDOWN: 0 + ``` + 2130 frames at 24 fps = ~89 seconds of bbb stream content decoded through libva-v4l2-request-fourier with zero V4L2 errors. >98% of theoretical max throughput (2160 frames/90s). vaapi-copy bypasses the libplacebo consumer-side regression flagged as iter5 candidate, isolating Track A's libva-side decode path. + +## Phase 5 → close + +- Phase 5 closed GREEN. +- Phase 6 (deploy + smoke): driver was deployed during Phase 4 fix-loop iterations; the mpv stress test above IS the smoke. Pass. +- Phase 7 (verify): 2130 BeginPictures clean. Pass. +- Phase 8 (iteration close): write `phase8_iteration4_close.md`.