Phase 8 close: append Firefox 150 regression re-verify result

Operator-driven Firefox 150 video session under iter2 driver:

- With MOZ_DISABLE_RDD_SANDBOX=1: our libva backend engages, decodes 10 frames cleanly through hantro (luma gradient 0x10..0x1c matching BBB intro fade), surface ID recycle + cap_pool slot cycling work as designed. EINVAL on frame 11 is iter1-carryover (Sonnet 7.x family), not an iter2 code regression.
- Without sandbox bypass: Firefox 150 RDD sandbox blocks open of /dev/media0 with ENETDOWN, libva init fails, Firefox SW-falls-back. iter1 evidence shows libva ran in the utility process (sandboxingKind=0) at that time; Firefox routing has since changed to RDD. NOT iter2-side; flag for iter3 substrate.

Cap_pool architecture confirmed working with Firefox: 8 surfaces, slots recycled IN_DECODE -> DECODED -> reacquired across multiple BeginPicture cycles for the same surface ID. Decoded NV12 content is real (matches mpv vaapi-copy luma signature on the same clip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -23,10 +23,11 @@
 | vainfo | ✓ enumerates 7 H.264 + 2 MPEG-2 profiles |
 | mpv `--hwdec=vaapi-copy --vo=null --frames=200` bbb_1080p30 | ✓ 0 drops, 0 decoder drops, real luma gradient (0x10→0x81), pool LRU visibly recycling slot indices `0..23 → 0,2,1,4,3...` |
 | mpv `--hwdec=vaapi --vo=gpu` (real-VO, operator inspection) | ✓ smooth — iteration 2 goal met |
-| Firefox 150 multi-video session | ⚠ not re-verified post-Fix-3 (deferred — fix 3 changed the surface→buffer model that iter1 Firefox path depends on; smoke-clean on vaapi-copy + vaapi suggests no regression but operator inspection is the binding cell) |
+| Firefox 150 (sandbox-disabled) | ✓ engages our libva, decodes 10 frames cleanly through hantro (luma gradient `0x10→0x1c` matching BBB intro fade, real NV12 pixels), then EINVAL on `set_controls` at frame 11. The EINVAL is a **non-iter2 issue** — same Sonnet 7.x family carryover from iter1 (likely 7.5 mid-stream / 7.2 num_ref_idx). cap_pool model is NOT the regression. |
+| Firefox 150 (default sandbox) | ✗ libva init fails inside RDD sandbox on `open(/dev/media0)` returning ENETDOWN — Firefox SW-falls-back. **NOT an iter2 code regression** (iter1 init code is byte-identical), but a Firefox routing change since iter1: iter1's findings.md shows decode happened on the **utility** process (`sandboxingKind=0`), iter2 today shows the libva path goes through RDD which is sandbox-blocked. Workaround: launch Firefox with `MOZ_DISABLE_RDD_SANDBOX=1`. |
 | Brave / chromium-fourier 149 | ✓ unchanged scope (uses chromium-internal V4L2 backend, bypasses libva) |
 
-The Firefox + 4-consumer regression matrix from iteration 1 was NOT re-run. If a regression exists in the Firefox path under Fix 3, the cap_pool's LRU + DECODED-state-retention model should still be correct — but operator validation is the boolean criterion. Recommended for iteration 2.5 or as the first move of iteration 3 substrate.
+The 10-frame decoded sequence under sandbox-bypass confirms Fix 3's cap_pool architecture works correctly with Firefox: surface IDs 67108864..67108871 each acquired their own slot, and surface IDs were recycled across frames 5,6,9 with the slot state machine cycling through IN_DECODE → DECODED → recycle on next BeginPicture for the same surface. Pool was operating exactly as designed.
 
 ## Architecture-level changes worth knowing for iter3+
 
@@ -67,7 +68,8 @@ State that carries (re-verified in iteration 2):
 - All three iter2 fixes deployed (driver sha256 `f27e0064...`)
 
 Open questions for iteration 3:
-1. **Re-run iter1 4-consumer regression matrix under Fix 3.** Especially Firefox 150 multi-video on mozilla.org (Fix 1's CAPTURE REQBUFS path) and chromium-fourier-149 (uses chromium-internal V4L2 backend but worth confirming no environmental break).
+1. **Firefox 150 RDD sandbox blocks `/dev/media0`** — workaround `MOZ_DISABLE_RDD_SANDBOX=1` confirmed. Either a Firefox routing change since iter1 (decode used to go through utility/sandboxingKind=0; now goes through RDD/sandboxed) or a Mesa-side change; needs investigation. Either upstream a Firefox prefs option, document the env var requirement permanently, or switch to a libva backend pattern that doesn't need /dev/media0 (likely impossible — the V4L2 stateless API requires it).
+2. **EINVAL after ~10 frames in Firefox.** Same as iter1 carryover Sonnet 7.x family (7.5 mid-stream non-IDR, 7.2 num_ref_idx for multi-slice). Reproducible now; iter3 should narrow which control is rejected.
 2. **DEBUG instrumentation sweep.** Remove ENTER logging, CAPTURE/OUTPUT hex dumps, sentinel write in `EndPicture`, msync workaround comment-only-or-removable. Required before any upstream snapshot.
 3. **Performance comparison binding cell** (deferred from iter1, iter2 substrate notes). Iteration 3 should be the first iteration that does perf measurement: drop counts, effective FPS, browser CPU% on a defined corpus, scanout-plane residency for vaapi vs vaapi-copy vs SW.
 4. **Multi-context libva safety** (Sonnet review 9.6). Two concurrent contexts (different processes or same-process Firefox + mpv overlap) need pool isolation that doesn't currently exist.
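
For readers who have not seen the driver source, the slot lifecycle that this commit reports (IN_DECODE → DECODED → reacquired on the next BeginPicture for the same surface ID, with LRU recycling otherwise) can be sketched roughly as below. This is an illustrative sketch only, not the driver's code: every identifier (`cap_pool`, `cap_slot`, `pool_begin_picture`, `pool_end_picture`, `POOL_SLOTS`) is hypothetical, the 8-slot size is simply borrowed from the 8 surfaces observed in the Firefox run, and the selection order (same surface first, then a free slot, then the least recently used DECODED slot) is an assumption about how such a pool might behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; the real driver's types may differ. */
enum slot_state { SLOT_FREE, SLOT_IN_DECODE, SLOT_DECODED };

#define POOL_SLOTS 8            /* assumption: one slot per observed surface */
#define NO_SURFACE UINT32_MAX   /* sentinel: slot has no owning surface */

struct cap_slot {
    enum slot_state state;
    uint32_t surface_id;        /* owning surface, or NO_SURFACE */
    uint64_t last_used;         /* LRU tick of the last state change */
};

struct cap_pool {
    struct cap_slot slots[POOL_SLOTS];
    uint64_t tick;
};

static void pool_init(struct cap_pool *p)
{
    p->tick = 0;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        p->slots[i].state = SLOT_FREE;
        p->slots[i].surface_id = NO_SURFACE;
        p->slots[i].last_used = 0;
    }
}

/* BeginPicture step: pick a slot for surface_id and mark it IN_DECODE.
 * Returns the slot index, or -1 if no slot can be used right now. */
static int pool_begin_picture(struct cap_pool *p, uint32_t surface_id)
{
    int pick = -1;

    /* 1. The surface already owns a slot: reacquire it unless a decode
     *    on that slot is still in flight. */
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (p->slots[i].surface_id == surface_id) {
            if (p->slots[i].state == SLOT_IN_DECODE)
                return -1;
            pick = i;
            break;
        }
    }

    /* 2. Otherwise take the first free slot. */
    for (int i = 0; pick < 0 && i < POOL_SLOTS; i++) {
        if (p->slots[i].state == SLOT_FREE)
            pick = i;
    }

    /* 3. Otherwise recycle the least recently used DECODED slot. */
    if (pick < 0) {
        for (int i = 0; i < POOL_SLOTS; i++) {
            if (p->slots[i].state != SLOT_DECODED)
                continue;
            if (pick < 0 || p->slots[i].last_used < p->slots[pick].last_used)
                pick = i;
        }
    }
    if (pick < 0)
        return -1;              /* every slot is busy decoding */

    p->slots[pick].state = SLOT_IN_DECODE;
    p->slots[pick].surface_id = surface_id;
    p->slots[pick].last_used = ++p->tick;
    return pick;
}

/* EndPicture step: the decode finished; keep the content as DECODED so
 * the slot stays readable until it is reacquired or recycled. */
static void pool_end_picture(struct cap_pool *p, int slot)
{
    assert(slot >= 0 && slot < POOL_SLOTS);
    assert(p->slots[slot].state == SLOT_IN_DECODE);
    p->slots[slot].state = SLOT_DECODED;
    p->slots[slot].last_used = ++p->tick;
}
```

In this sketch a caller would run `pool_init` once, then pair each `pool_begin_picture` with a `pool_end_picture` on the returned index. Keeping DECODED slots bound to their surface until they are evicted is what lets a repeated BeginPicture on the same surface ID land on the same slot, which is the behaviour the Firefox session log is described as showing.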