f91469abe3
Phase 1 locked F (Firefox RDD sandbox verify-by-patch) and A (frame-11
EINVAL diagnose) running in parallel on a single firefox-fourier build.
Track F: GREEN. Patched Firefox 150.0.1 (firefox-fourier, pkgrel=1.1)
launches on ohm WITHOUT MOZ_DISABLE_RDD_SANDBOX=1 and engages our
libva-v4l2-request backend end-to-end. Three patches needed (Phase 2
identified one and deferred two):
- Broker policy (SandboxBrokerPolicyFactory.cpp): allow /dev/media*,
extend cap-filter to admit stateless decoders that lack M2M caps.
- Seccomp policy (SandboxFilter.cpp): allow ioctl magic byte '|'
for <linux/media.h> request-API ioctls.
- Driver (media.c): replace select() with poll() — Mozilla's RDD
seccomp common policy admits poll/ppoll/epoll_* but not
select/pselect6. Driver-side fix preferred; smaller surface,
portable across sandbox policies, and poll() is the modern API.
Track A: REPRODUCES + DIAGNOSED. Frame-11 EINVAL fires deterministically
on a single-slice P-frame (slice_type=0, frame_num=5, post-IDR) — the
exact iter1/iter2 carryover signature, confirming it isn't environmental.
Y2 instrumentation (in v4l2_ioctl_controls) now logs num_controls /
error_idx / per-control id+size on EINVAL. Sizes match kernel UAPI;
error_idx == num_controls is the kernel's "all bad / no specific control"
sentinel — it's a request-level rejection, not a single-field violation.
Fix is iter4's lock; rig + Y2 in place for fast iter4 turnaround.
Build infrastructure introduced: firefox-fourier LXD container on
boltzmann (RK3588 aarch64, persistent, ssh -J boltzmann
builder@firefox-fourier). Upstream Arch x86_64 wasi packages installed
to work around 4-year-stale ALARM versions. PGO generation crashes at
exit (LXC has no display); obj/dist/ tarball used as the deployable
artifact instead of the pacman package.
Phase 6 surprises captured in phase6_iter3_findings.md: malformed
first-cut patch (descriptive vs numeric hunk headers), --enable-v4l2
isn't a Mozilla 150 flag (auto-set on aarch64+GTK), Mozilla 2025 PGP
key rotation, ALARM-stale wasi, onnxruntime missing in ALARM, and the
"no tricks" lesson (revert workarounds first when redirected).
Carries to iter4 substrate: Track A fix is the natural lock; mpv
libplacebo --vo=gpu segfault stays as separate iter4 candidate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
91 lines
6.9 KiB
Markdown
91 lines
6.9 KiB
Markdown
# Iteration 3 — Phase 5 (sonnet review of Phase 4 deliverables)
|
|
|
|
Reviewer: Claude Sonnet 4.6, in-conversation subagent.
|
|
Date: 2026-05-04.
|
|
Inputs reviewed: `phase0_findings_iter3.md`, `phase2_iter3_situation.md`, `phase3_iter3_baseline.md`, `phase4_iter3_plan.md`, `firefox-fourier/0001-rdd-allow-stateless-v4l2-request-api.patch`, `firefox-fourier/PKGBUILD-overlay.md`. Reviewer additionally read the actual Mozilla source via fetch (not relying solely on Phase 2's quoted excerpts) and the `libva-v4l2-request-fourier/src/` nested fork.
|
|
|
|
## Verdict
|
|
|
|
**YELLOW** — proceed to Phase 6 with two named required fixes.
|
|
|
|
## Findings
|
|
|
|
### Y1 (BLOCKER for Phase 6) — string idiom mismatch in new function
|
|
|
|
The patch's `AddV4l2RequestApiDependencies` constructs the path as:
|
|
|
|
```cpp
|
|
nsAutoCString path("/dev/");
|
|
path.Append(dir_entry->d_name);
|
|
```
|
|
|
|
The existing Mozilla codebase in the same translation unit uses:
|
|
|
|
```cpp
|
|
nsCString path = "/dev/"_ns;
|
|
path += nsDependentCString(dir_entry->d_name);
|
|
```
|
|
|
|
`nsAutoCString` is a stack-buffered subclass of `nsCString` and the `(const char*)` constructor + `.Append()` exist, so the patch likely compiles, but it diverges from the file's own idiom and would be flagged by any Mozilla reviewer. Match the existing style.
|
|
|
|
**Fix applied to `firefox-fourier/0001-rdd-allow-stateless-v4l2-request-api.patch`** before build kickoff.
|
|
|
|
### Y2 (BLOCKER for Phase 7 capture, not Phase 6) — driver does not log `error_idx`
|
|
|
|
`v4l2_set_controls()` in libva-v4l2-request-fourier currently logs `"Unable to set control(s): %s"` on `VIDIOC_S_EXT_CTRLS` failure, but does not surface `controls.error_idx`. When `errno == EINVAL`, that field names exactly which control in the array was rejected. Without it, Phase 7 capture is no more diagnostic than iter2's existing log — Track A's plan to identify "which control fails on frame 11" cannot succeed.
|
|
|
|
**Fix:** in `v4l2_set_controls()` (or whichever wrapper actually calls `VIDIOC_S_EXT_CTRLS`), after the ioctl returns -1 with EINVAL, log `ext_controls.error_idx`, the offending control's `id` (with V4L2_CID_* symbolic name lookup), and its `size`/`value` content. One-line change. Apply at Phase 6 alongside the firefox-fourier build (driver build is independent and fast).
|
|
|
|
### Bonus (not Phase 4 induced; potential Track A fix candidate) — B-slice ref-list-1 copy-paste bug
|
|
|
|
In `libva-v4l2-request-fourier/src/h264.c`, the `h264_va_slice_to_v4l2()` function around line 663 has the B-slice ref-list-1 loop writing `slice->ref_pic_list0[i].fields = fields` instead of `slice->ref_pic_list1[i].fields = fields`. L1 entries `.fields` member is being written into L0 slot.
|
|
|
|
For bbb_1080p30 (mostly I+P frames in the BBB SFX intro segment), this bug may not fire. If frame 11 happens to be a B-frame in this stream, this could be the EINVAL cause — or could contribute to silent reference-list corruption with a downstream EINVAL signature.
|
|
|
|
**Disposition:** do NOT speculative-fix at Phase 6. We don't yet know whether frame 11 is a B-frame. Y2's `error_idx` logging will reveal whether the failing control is a SLICE_PARAMS field touching `ref_pic_list1` — if yes, the copy-paste fix becomes the obvious patch. Save the candidate fix for Phase 7's analysis stage.
|
|
|
|
### Minor — `--skipinteg` vs `--skipchecksums` in PKGBUILD overlay doc
|
|
|
|
The overlay doc references `makepkg -ef --skipinteg`. On modern Arch makepkg (7.1.0 inside the firefox-fourier container) the flag is `--skipchecksums`. Both work via `--skipinteg` aliasing in some pacman branches but `--skipchecksums` is canonical. Cosmetic; fix later.
|
|
|
|
### Phase 6 finding (overrides Sonnet) — `--enable-v4l2` is NOT a Mozilla 150 configure flag
|
|
|
|
Sonnet's review noted an `--enable-v4l2` mozconfig "verify present" gate. Empirical Phase 6 ground-truth (2026-05-05): Mozilla 150 has no `--enable-v4l2` flag at all. Adding it crashes configure with `mozbuild.configure.options.InvalidOptionError: Unknown option: --enable-v4l2`. The actual gate is in `toolkit/moz.configure:643-651`:
|
|
|
|
```python
|
|
@depends(target, toolkit_gtk)
|
|
def v4l2(target, toolkit_gtk):
|
|
# V4L2 decode is only used in GTK/Linux and generally only appears on
|
|
# embedded SOCs.
|
|
if target.cpu in ("arm", "aarch64", "riscv64") and toolkit_gtk:
|
|
return True
|
|
|
|
set_config("MOZ_ENABLE_V4L2", True, when=v4l2)
|
|
set_define("MOZ_ENABLE_V4L2", True, when=v4l2)
|
|
```
|
|
|
|
So MOZ_ENABLE_V4L2 is automatically set whenever the target is arm/aarch64/riscv64 and the toolkit is GTK. boltzmann's container is aarch64+GTK → MOZ_ENABLE_V4L2 is implicitly true; our patch's `#ifdef MOZ_ENABLE_V4L2` blocks compile in normally.
|
|
|
|
This is a tighter binding than --enable-v4l2 would have been: x86_64 desktop builds will NOT compile our patch. Acceptable for ohm; the ARM-only auto-enable in moz.configure also explains why upstream Mozilla doesn't ship `--enable-v4l2` as a user-facing option — it's a target-architecture decision, not a per-build choice. Filing-day implication: any upstream submission of this patch should not add a configure-flag toggle, but live inside the existing MOZ_ENABLE_V4L2 ifdef.
|
|
|
|
### Minor — mozconfig linker flag check at Phase 6 start
|
|
|
|
The upstream Arch PKGBUILD targets `arch=(x86_64)` and may not include aarch64-specific linker hints (`--enable-linker=lld` or equivalent). Low probability of build break given boltzmann's rust 1.95 + clang 22, but check `grep -E 'lld|linker' mozconfig` before kickoff. ALARM-style PKGBUILDs sometimes patch this; upstream Arch may not.
|
|
|
|
## Cap-filter and security review (NOT findings — green-lit)
|
|
|
|
The reviewer confirms:
|
|
|
|
- The `(CAPTURE_MPLANE & OUTPUT_MPLANE & STREAMING)` triple-AND for stateless decoders is the correct guard — camera-capture-only nodes lack OUTPUT_MPLANE; display-output-only nodes lack CAPTURE_MPLANE; the union with M2M arms is idempotent in `AddPath`.
|
|
- `/dev/media*` rdwr enumeration on the embedded ARM target is in the same security domain as `/dev/video*` already-rdwr — not a campaign-blocking attack-surface increase. For upstream Mozilla submission, a reviewer would prefer filtering by `MEDIA_IOC_DEVICE_INFO` + `MEDIA_ENT_F_PROC_VIDEO_DECODER`, but the campaign goal (verify on ohm) is well-served by the blunt enumeration. Note for an eventual Mozilla bug filing.
|
|
- Seccomp deferral is sound: ENETDOWN at `open()` time is broker-policy evidence; SIGSYS at ioctl time is unmistakable and different. Deferring `SandboxFilter.cpp` to Phase 7 empirical is correct.
|
|
- PKGBUILD pattern (rename + conflicts + provides) is valid and standard. `makepkg -e` semantics in the doc match makepkg actual behavior.
|
|
|
|
## Phase 5 → Phase 6 transition gates
|
|
|
|
- [x] Y1 patch fix applied (this Phase 5 close).
|
|
- [x] Y2 driver instrumentation applied (this Phase 5 close, in libva-v4l2-request-fourier).
|
|
- [ ] Phase 6 build kicked off in firefox-fourier container.
|
|
- [ ] Phase 6 first action: `grep -E 'lld|linker' mozconfig` after PKGBUILD fetch.
|
|
- [ ] Phase 7 includes the B-slice bug as a candidate Track A fix; trigger only if Y2's `error_idx` log names a `ref_pic_list1` field.
|