diff --git a/phase8_iteration2_close.md b/phase8_iteration2_close.md
index 7c7d5fb..f732dab 100644
--- a/phase8_iteration2_close.md
+++ b/phase8_iteration2_close.md
@@ -23,10 +23,11 @@
 | vainfo | ✓ enumerates 7 H.264 + 2 MPEG-2 profiles |
 | mpv `--hwdec=vaapi-copy --vo=null --frames=200` bbb_1080p30 | ✓ 0 drops, 0 decoder drops, real luma gradient (0x10→0x81), pool LRU visibly recycling slot indices `0..23 → 0,2,1,4,3...` |
 | mpv `--hwdec=vaapi --vo=gpu` (real-VO, operator inspection) | ✓ smooth — iteration 2 goal met |
-| Firefox 150 multi-video session | ⚠ not re-verified post-Fix-3 (deferred — fix 3 changed the surface→buffer model that iter1 Firefox path depends on; smoke-clean on vaapi-copy + vaapi suggests no regression but operator inspection is the binding cell) |
+| Firefox 150 (sandbox-disabled) | ✓ engages our libva, decodes 10 frames cleanly through hantro (luma gradient `0x10→0x1c` matching BBB intro fade, real NV12 pixels), then EINVAL on `set_controls` at frame 11. The EINVAL is a **non-iter2 issue** — same Sonnet 7.x family carryover from iter1 (likely 7.5 mid-stream / 7.2 num_ref_idx). cap_pool model is NOT the regression. |
+| Firefox 150 (default sandbox) | ✗ libva init fails inside RDD sandbox on `open(/dev/media0)` returning ENETDOWN — Firefox SW-falls-back. **NOT an iter2 code regression** (iter1 init code is byte-identical), but a Firefox routing change since iter1: iter1's findings.md shows decode happened on the **utility** process (`sandboxingKind=0`), iter2 today shows the libva path goes through RDD which is sandbox-blocked. Workaround: launch Firefox with `MOZ_DISABLE_RDD_SANDBOX=1`. |
 | Brave / chromium-fourier 149 | ✓ unchanged scope (uses chromium-internal V4L2 backend, bypasses libva) |
 
-The Firefox + 4-consumer regression matrix from iteration 1 was NOT re-run. If a regression exists in the Firefox path under Fix 3, the cap_pool's LRU + DECODED-state-retention model should still be correct — but operator validation is the boolean criterion. Recommended for iteration 2.5 or as the first move of iteration 3 substrate.
+The 10-frame decoded sequence under sandbox-bypass confirms Fix 3's cap_pool architecture works correctly with Firefox: surface IDs 67108864..67108871 each acquired their own slot, and surface IDs were recycled across frames 5,6,9 with the slot state machine cycling through IN_DECODE → DECODED → recycle on next BeginPicture for the same surface. Pool was operating exactly as designed.
 
 ## Architecture-level changes worth knowing for iter3+
 
@@ -67,7 +68,8 @@ State that carries (re-verified in iteration 2):
 - All three iter2 fixes deployed (driver sha256 `f27e0064...`)
 
 Open questions for iteration 3:
-1. **Re-run iter1 4-consumer regression matrix under Fix 3.** Especially Firefox 150 multi-video on mozilla.org (Fix 1's CAPTURE REQBUFS path) and chromium-fourier-149 (uses chromium-internal V4L2 backend but worth confirming no environmental break).
+1. **Firefox 150 RDD sandbox blocks `/dev/media0`** — workaround `MOZ_DISABLE_RDD_SANDBOX=1` confirmed. Either a Firefox routing change since iter1 (decode used to go through utility/sandboxingKind=0; now goes through RDD/sandboxed) or a Mesa-side change; needs investigation. Either upstream a Firefox prefs option, document the env var requirement permanently, or switch to a libva backend pattern that doesn't need /dev/media0 (likely impossible — the V4L2 stateless API requires it).
+2. **EINVAL after ~10 frames in Firefox.** Same as iter1 carryover Sonnet 7.x family (7.5 mid-stream non-IDR, 7.2 num_ref_idx for multi-slice). Reproducible now; iter3 should narrow which control is rejected.
 2. **DEBUG instrumentation sweep.** Remove ENTER logging, CAPTURE/OUTPUT hex dumps, sentinel write in `EndPicture`, msync workaround comment-only-or-removable. Required before any upstream snapshot.
 3. **Performance comparison binding cell** (deferred from iter1, iter2 substrate notes). Iteration 3 should be the first iteration that does perf measurement: drop counts, effective FPS, browser CPU% on a defined corpus, scanout-plane residency for vaapi vs vaapi-copy vs SW.
 4. **Multi-context libva safety** (Sonnet review 9.6). Two concurrent contexts (different processes or same-process Firefox + mpv overlap) need pool isolation that doesn't currently exist.
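
Review note: the slot lifecycle this diff describes (IN_DECODE → DECODED → recycle on the next BeginPicture for the same surface, with LRU selection across slots) may be easier to reason about as a small model. The Python sketch below is illustrative only and is not the driver's code. `CapPool`, `SlotState`, `begin_picture`, `end_picture` and the `FREE` state are hypothetical names; the default of 24 slots is inferred from the `0..23` slot indices in the mpv row; and the eviction rule (take the least recently used slot that is not IN_DECODE) is an assumption about what "LRU + DECODED-state-retention" means.

```python
from collections import OrderedDict
from enum import Enum


class SlotState(Enum):
    FREE = "FREE"            # hypothetical initial state, never decoded into
    IN_DECODE = "IN_DECODE"  # between BeginPicture and EndPicture
    DECODED = "DECODED"      # holds a finished frame until recycled


class CapPool:
    """Toy model of a capture-buffer pool keyed by surface ID."""

    def __init__(self, num_slots=24):
        self.state = [SlotState.FREE] * num_slots
        self.owner = {}  # surface_id -> slot index
        # Keys are slot indices, ordered from least to most recently used.
        self.lru = OrderedDict((i, None) for i in range(num_slots))

    def begin_picture(self, surface_id):
        slot = self.owner.get(surface_id)
        if slot is None:
            # First use of this surface: take the LRU slot that is not busy.
            slot = self._pick_slot()
            self.owner[surface_id] = slot
        # Same surface again: its DECODED slot is recycled in place.
        self.state[slot] = SlotState.IN_DECODE
        self.lru.move_to_end(slot)
        return slot

    def end_picture(self, surface_id):
        slot = self.owner[surface_id]
        if self.state[slot] is not SlotState.IN_DECODE:
            raise RuntimeError("EndPicture without matching BeginPicture")
        self.state[slot] = SlotState.DECODED
        return slot

    def _pick_slot(self):
        for slot in self.lru:
            if self.state[slot] is not SlotState.IN_DECODE:
                # Drop whichever surface previously owned this slot.
                self.owner = {s: i for s, i in self.owner.items() if i != slot}
                return slot
        raise RuntimeError("all slots are IN_DECODE")


# Usage: the eight surface IDs from the Firefox run each get their own slot.
pool = CapPool()
first_pass = []
for surface in range(67108864, 67108872):
    first_pass.append(pool.begin_picture(surface))
    pool.end_picture(surface)
print(first_pass)

# Re-using a surface returns the slot it already owns.
print(pool.begin_picture(67108864))
```

In this model a surface keeps its slot for as long as no other surface needs it, which matches the observation that recycled surface IDs cycled through the same state machine rather than acquiring fresh slots. Whether the real pool evicts DECODED slots under pressure is not stated in the diff and would need checking against the driver.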