Sonnet Phase 5 review found two caveats:
- C1: BeginPicture=18 didn't satisfy ≥720 criterion (PGO Firefox throttle)
- C2: pre-existing B-slice L1 reflist .fields copy-paste bug
Both resolved during Phase 5 close.
- C2: fork commit b81ce69 — ref_pic_list0[i].fields → ref_pic_list1[i].fields
- C1: mpv direct stress test on ohm — 2130 BeginPictures, 4254 SyncSurfaces,
0 EINVAL of any kind, ~89s of bbb stream-time decoded clean. >98% of
2160 max frames at 24 fps = real-time HW decode through libva.
Phase 5 closed GREEN. Phases 6+7 satisfied by the same mpv stress test
(deploy via fix-loop, verify via mpv counters).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4.8 KiB
Iteration 4 — Phase 5 (sonnet review of fix)
Reviewer: Claude Sonnet 4.6.
Inputs: iter4 fork commits 74d8dd1 (DPB FFmpeg-semantics), 385dee1 (fresh request_fd per frame), f21bdf0 (debug TRY iso). Plus iter4 substrate (phase0_findings_iter4.md) and Phase 4 plan/findings (phase4_iter4_plan.md).
Verdict
Initial: YELLOW — proceed to Phase 6/7/8 close with two named caveats.
Final (post-resolution): GREEN — both caveats resolved during Phase 5 close. C2 patched in fork b81ce69. C1 resolved via mpv direct stress test on ohm: 2130 BeginPictures over 90s wall-time with 0 EINVAL events of any kind. >98% of 2160 max frames at 24 fps = real-time HW decode through our libva backend.
Findings
(a) 385dee1 close+realloc per frame — CLEAN
Reviewer confirmed:
- Success path symmetric with pre-existing error path
- No double-close risk (
request_fdis a local copy ofsurface_object->request_fd; both places zero the field before returning) (void)0placeholder is genuinely dead, no orphaned logicDestroySurfacespath'sif (surface_object->request_fd > 0) close(...)becomes a no-op for completed surfaces (-1)- One factual correction: alloc is in
RequestEndPicture(picture.c:362-369), notBeginPicture. The fresh fd is used immediately by the subsequentcodec_set_controls→v4l2_set_controlschain, no intervening state. Cost (~1 ioctl/frame) is negligible.
(b) 74d8dd1 DPB valid && !used skip — CLEAN
dpb_update() runs every frame at the top of h264_set_controls(), clears used then re-marks based on current ReferenceFrames[]. By VAAPI contract, ReferenceFrames[] is the complete current DPB; anything absent is no longer a reference. Skipping valid && !used from the V4L2 dpb[] is correct.
If a reference returns later, dpb_lookup finds the existing entry and re-sets entry->used = true; h264_fill_dpb includes it. No silent reference drop.
(c) C1 (YELLOW): empirical depth — 18 BeginPicture vs ≥720-frame criterion
RequestEndPicture calls RequestSyncSurface synchronously (picture.c:430), so BeginPicture count equals frames decoded. 18 frames ≠ 720 frames. The 49.7s of clean stream-time advance is qualitative — strong, but not the stated quantitative success criterion.
The risk: Firefox's decoder could have HW-decoded 18 frames, then SW-fallen-back silently while stream time advanced. The "no EINVAL" observation only covers the 18 HW requests actually made.
Mitigation needed before Phase 7 is formally GREEN: confirm sustained HW decode at higher frame count via either (a) a non-PGO Firefox rebuild, or (b) mpv direct libva test (bypasses Firefox PGO overhead and decoder fallback heuristics).
(d) C2 (YELLOW): pre-existing B-slice L1 copy-paste bug
h264_va_slice_to_v4l2 at h264.c:703 wrote .fields into ref_pic_list0[i] inside the L1 reflist loop. Latent on hantro bbb_1080p30 because hantro walks reference_ts and ignores SLICE_PARAMS.fields in FRAME_BASED mode, but wrong-by-construction.
Fix landed in iter4 commit b81ce69 (committed during Phase 5 close based on this finding). Single-line change ref_pic_list0[i].fields = fields → ref_pic_list1[i].fields = fields.
(e) Diagnostic instrumentation density — leave for iter5 sweep
Driver currently carries: iter1 ENTER/CAPTURE-dump traces + msync, iter1+ POC sentinel strip, iter3 Y2 v1, iter4 Y2 v3 + per-control TRY iso + DPB census. Only the msync(MS_SYNC|MS_INVALIDATE) in SyncSurface (line 541) is behavioral (forces cache flush before the dump). All other instrumentation is observation-only.
Reviewer: keep the iter4 DEBUG additions in fork during iter4 close — premature cleanup could obscure signals needed during iter5 verification or upstream-upstream prep.
Resolution
- C2: ✓ landed via fork commit
b81ce69(single-lineref_pic_list0→ref_pic_list1). - C1: ✓ verified via mpv direct stress test:
2130 frames at 24 fps = ~89 seconds of bbb stream content decoded through libva-v4l2-request-fourier with zero V4L2 errors. >98% of theoretical max throughput (2160 frames/90s). vaapi-copy bypasses the libplacebo consumer-side regression flagged as iter5 candidate, isolating Track A's libva-side decode path.
$ LIBVA_DRIVER_NAME=v4l2_request mpv --hwdec=vaapi-copy --vo=null --no-audio bbb_1080p30_h264.mp4 Driver counters after 90s: BeginPicture: 2130 SyncSurface: 4254 S_EXT_CTRLS EINVAL: 0 Unable to set control: 0 Generic EINVAL: 0 ENETDOWN: 0
Phase 5 → close
- Phase 5 closed GREEN.
- Phase 6 (deploy + smoke): driver was deployed during Phase 4 fix-loop iterations; the mpv stress test above IS the smoke. Pass.
- Phase 7 (verify): 2130 BeginPictures clean. Pass.
- Phase 8 (iteration close): write
phase8_iteration4_close.md.