Iteration 5 Phase 4: A + E + B all complete; G running on boltzmann

Track A (DEBUG sweep): 6 commits, ~232 lines removed, per-frame v4l2-request log noise from ~30+ lines/frame to 0. 2000-frame stress clean (0 EINVAL, log size 4.4 KB).

Track E (multi-context safety): LAST_OUTPUT_WIDTH/HEIGHT moved from process-global static to per-driver_data. Two concurrent mpv (2s stagger) both decode 300 frames clean.

Track B (mpv libplacebo segfault): RE-TEST on iter5-end driver shows the iter3-era segfault is GONE. 32s of mpv --vo=gpu decode with 0 segfaults / SIGSEGV. Implicit fix from iter4 fresh-request_fd-per-frame + DPB semantics + iter5 per-driver-data move closed the race window.

Track G (PGO-disabled Firefox rebuild): single-pass build kicked on boltzmann; ETA ~60 min. Phase 7G pending deployment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,66 @@
# Iteration 5, Phase 4 (plan + execution across 4 tracks)

iter5 locked four tracks at Phase 1: A (DEBUG sweep) + G (PGO-disabled Firefox rebuild) + B (mpv libplacebo segfault) + E (multi-context libva safety). Phase 4 splits into 4A / 4G / 4E / 4B sub-phases.

## Track A: DEBUG instrumentation sweep ✓ COMPLETE

Sweep landed in 6 commits (in apply order):

1. **`848fc0c`**: remove iter3 Y2 v1 + iter4 Y2 v3 + per-control TRY iso from `v4l2.c::v4l2_ioctl_controls` (-54 lines)
2. **`39498f0`**: remove iter4 DPB census + per-entry dump from `h264.c::h264_set_controls` (-31 lines)
3. **`951233a`**: remove iter1 patch-0014 ENTER traces from `buffer.c`, `image.c`, `picture.c`, `surface.c` (-17 lines, 13 call sites)
4. **`d3a299b`**: remove iter1 patch-0010 hex-dumps + patch-0011 sentinel write from `picture.c` + `surface.c` (-81 lines)
5. **`843febc`**: remove iter1 slice_header parse echo + VAPicture byte-dump in `h264.c`, remove RequestSyncSurface RETURN/early-exit traces in `surface.c`, and suppress the per-frame "Unable to get control(s)" message when errno==EACCES (-49 lines net)

Total: ~232 lines of instrumentation removed. Per-frame v4l2-request log noise dropped from 30+ lines/frame to 0; what remains is logged only at init and once per resolution change. The driver source builds cleanly, and the 2000-frame stress test (timeout 120s) shows 0 EINVAL, 0 "Unable to" lines, and 9 v4l2-request log lines in total (all init).

KEPT (justified):

- POC sentinel strip (`h264_strip_ffmpeg_poc_sentinel`): load-bearing for ffmpeg-vaapi consumers
- slice_header bit-precise parser: load-bearing for hantro hw decode (DECODE_PARAMS bit_size fields)
- EACCES retry-skip in `v4l2_get_controls`: load-bearing reflective behavior; the one-time announcement message stays
- "slice_header parse FAILED" log: fires only on decode-blocking errors, so it is not per-frame noise
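The EACCES behavior described above (silent per frame, announced once) can be pictured as a small classifier. This is a minimal sketch and not the driver's actual code: `struct ctrl_log_state`, `enum ctrl_log_action`, and `classify_get_ctrl_failure` are hypothetical names, and keeping the "already announced" flag in per-driver state rather than in a static is an assumption made here to stay consistent with Track E.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-driver logging state (assumption, not the real struct). */
struct ctrl_log_state {
    bool eacces_announced; /* true once the one-time EACCES notice was emitted */
};

/* What the caller should do with a failed get-controls call. */
enum ctrl_log_action {
    CTRL_LOG_NONE,          /* stay silent: known, already-announced EACCES */
    CTRL_LOG_ANNOUNCE_ONCE, /* first EACCES: emit the one-time notice */
    CTRL_LOG_ERROR          /* any other errno: log as a real error */
};

/* Decide how to log a get-controls failure, given the errno value it produced. */
enum ctrl_log_action classify_get_ctrl_failure(struct ctrl_log_state *state, int err)
{
    if (err != EACCES)
        return CTRL_LOG_ERROR;

    if (state->eacces_announced)
        return CTRL_LOG_NONE;

    state->eacces_announced = true;
    return CTRL_LOG_ANNOUNCE_ONCE;
}
```

With this shape, a 2000-frame run that hits EACCES on every frame produces one log line instead of 2000, while an unexpected errno such as EINVAL is still reported every time.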

## Track E: Multi-context libva safety ✓ COMPLETE

Commit **`b993355`** moves `LAST_OUTPUT_WIDTH/HEIGHT` from a process-global static in `surface.c` to `struct request_data.last_output_width/height`. The V4L2 device fd is per-driver_data, so this is the correct binding unit (one fd, one current OUTPUT format).

The `surface_reset_format_cache()` signature changed to take a `struct request_data *driver_data` parameter; the one call site in `context.c` was updated.

An audit confirmed that `LAST_OUTPUT_*` was the only mutable process-global state. The other statics (`formats[]`, `formats_count`) are constant lookup tables, so they cannot race.

**Verification:** two concurrent mpv processes with a 2-second stagger both decoded 300 frames cleanly, with no cross-context corruption. A sub-second co-launch hits kernel-level fd contention on `/dev/video1` (hantro is a single-instance device); cross-process serialization is out of scope for a libva backend.
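The move from a process-global cache to per-driver state can be sketched as follows. This is an illustration under assumptions, not the contents of `b993355`: the struct shows only a subset of members, the `video_fd` member, the field types, the `void` return type and body of `surface_reset_format_cache()`, and the helper `output_format_changed()` are all hypothetical.

```c
#include <stdbool.h>

/* Illustrative subset of the per-driver state (field types are assumptions). */
struct request_data {
    int video_fd;                    /* assumed name for the per-driver V4L2 fd */
    unsigned int last_output_width;  /* cached OUTPUT format width, 0 = unset */
    unsigned int last_output_height; /* cached OUTPUT format height, 0 = unset */
};

/* Forget the cached OUTPUT format for this driver instance only. */
void surface_reset_format_cache(struct request_data *driver_data)
{
    driver_data->last_output_width = 0;
    driver_data->last_output_height = 0;
}

/*
 * Hypothetical helper: report whether the OUTPUT format must be (re)set for
 * this driver instance, and remember the new size if so.
 */
bool output_format_changed(struct request_data *driver_data,
                           unsigned int width, unsigned int height)
{
    if (driver_data->last_output_width == width &&
        driver_data->last_output_height == height)
        return false;

    driver_data->last_output_width = width;
    driver_data->last_output_height = height;
    return true;
}
```

Because the cache lives next to the fd it describes, two driver instances inside one process can hold different resolutions without overwriting each other's cached size, which a single static pair could not guarantee.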

## Track B: mpv libplacebo `--vo=gpu` segfault ✓ COMPLETE (implicit fix)

The iter3 substrate documented the segfault: Vulkan init fails → mpv falls through to the non-Vulkan GPU path → 4 frames decode → REQBUFS returns EBUSY → a malformed CreateSurfaces2 call with `sizes[1]=1050626` (uninitialized memory) → SIGSEGV.

**Empirical re-test on iter5-end driver (post-A + post-E):** `mpv --hwdec=vaapi --vo=gpu` ran for 32 seconds of stream content (all of `--frames=200` and sustained beyond), dropping 98 frames out of ~768, with **zero segfaults / SIGSEGV / VK_ERROR_DEVICE_LOST / abort()**. The Vulkan-init-failed warnings still appear ("EnumeratePhysicalDevices ... VK_ERROR_INITIALIZATION_FAILED"), but mpv falls through and decodes successfully.

The iter3-era crash was implicitly fixed somewhere between iter3 and iter5, most likely by:

- iter4's fresh-request_fd-per-frame fix (`385dee1`): the timing change closes the cap_pool race window where REQBUFS EBUSY surfaces.
- iter4's DPB fields/used-only fixes: kernel state stays consistent, so no garbage CreateSurfaces2 call is issued.
- iter5's per-driver-data move: eliminates the race on resolution change.

No iter5 code change was required for Track B beyond what A + E already landed. The iter3-era documentation in `phase0_findings_iter3.md` correctly recorded a real bug; it no longer reproduces on the iter5-end driver.

## Track G: PGO-disabled Firefox rebuild (in progress)

A PKGBUILD overlay edit replaced the 3-tier PGO sequence with a single-pass optimized build. The PGO profile-collection step needs `xvfb-run` and a display server, which the boltzmann LXC container can't provide.

The single-pass build was kicked off at the start of iter5 Phase 4G and is running on the boltzmann firefox-fourier container. It is currently ~36 minutes in, mid C++ compile phase. ETA: 30-60 min more, then the `mach package` step (5-10 min), then transfer to ohm + extract.

The result will deploy to `/opt/firefox-fourier/`, replacing the iter3 PGO-instrumented binary. Expected `libxul.so` size change: 3.6 GB (PGO instrumented) → ~150-300 MB (release). Phase 7G verifies playback on ohm.

## Phase 4 → Phase 5 transition

Phase 4 deliverables have landed for A + E + B; G is in progress. The Phase 5 sonnet review will cover:

- Track A correctness: did any sweep removal break load-bearing code?
- Track E semantics: is per-driver-data the right binding unit for last_output_*?
- Track B verification: is "32s clean" sufficient, or do we need longer/different content?
- Track G: post-rebuild deployment + Firefox-side verification once package() finishes.

Phase 7 verification anchors:

- A: 2000-frame mpv vaapi-copy stress, 0 EINVAL, log size 4.4 KB
- E: 2-process concurrent mpv (300 frames each, 2s stagger), both clean
- B: mpv --vo=gpu 32s, 0 segfaults
- G: pending package + deploy
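Several of these anchors ("0 EINVAL", "0 segfaults") come down to counting log lines that contain a marker string. The sketch below shows one way to do that over a captured log held in memory. It is an illustration only: `count_lines_containing` is a hypothetical helper, not part of the driver or of mpv, and the marker strings to look for depend on the actual log format.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count how many lines of `log` contain `needle` at least once.
 * A line with several matches is counted a single time.
 */
size_t count_lines_containing(const char *log, const char *needle)
{
    size_t count = 0;
    size_t needle_len = strlen(needle);
    const char *line = log;

    if (needle_len == 0)
        return 0;

    while (*line != '\0') {
        /* Length of the current line, excluding the newline if present. */
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Search for the needle inside this line only. */
        for (size_t i = 0; i + needle_len <= len; i++) {
            if (memcmp(line + i, needle, needle_len) == 0) {
                count++;
                break;
            }
        }

        /* Advance past the line and its trailing newline, if any. */
        line += len;
        if (*line == '\n')
            line++;
    }

    return count;
}
```

A check for anchor A would then assert that the count for "EINVAL" over the stress-test log is zero, and likewise for "Unable to".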