Phase 1 locked F (Firefox RDD sandbox verify-by-patch) and A (frame-11
EINVAL diagnose) running in parallel on a single firefox-fourier build.
Track F: GREEN. Patched Firefox 150.0.1 (firefox-fourier, pkgrel=1.1)
launches on ohm WITHOUT MOZ_DISABLE_RDD_SANDBOX=1 and engages our
libva-v4l2-request backend end-to-end. Three patches needed (Phase 2
identified one and deferred two):
- Broker policy (SandboxBrokerPolicyFactory.cpp): allow /dev/media*,
extend cap-filter to admit stateless decoders that lack M2M caps.
- Seccomp policy (SandboxFilter.cpp): allow ioctl magic byte '|'
for <linux/media.h> request-API ioctls.
- Driver (media.c): replace select() with poll() — Mozilla's RDD
seccomp common policy admits poll/ppoll/epoll_* but not
select/pselect6. Driver-side fix preferred; smaller surface,
portable across sandbox policies, and poll() is the modern API.
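A minimal sketch of the select() to poll() swap described above, not
the actual media.c change. It assumes the driver waits on a media
request fd, where completion is signalled as an exceptional condition
(POLLPRI, the counterpart of select()'s exceptfds). The function name
request_wait_completion and the errno choices are illustrative.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait for a media request fd to complete, using poll() instead of select().
 * Returns 0 on completion, -1 on error or timeout (errno = ETIMEDOUT on
 * timeout, EIO if the fd reports an error/hangup/invalid condition).
 */
static int request_wait_completion(int request_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = request_fd, .events = POLLPRI, .revents = 0 };
    int rc;

    /* Retry if a signal interrupts the wait (the timeout restarts). */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;              /* errno already set by poll() */
    if (rc == 0) {
        errno = ETIMEDOUT;      /* nothing signalled within timeout_ms */
        return -1;
    }
    if (pfd.revents & POLLPRI)
        return 0;               /* request completed */

    errno = EIO;                /* POLLERR, POLLHUP or POLLNVAL */
    return -1;
}
```

Only poll() is called, so the wait stays inside the poll/ppoll/epoll_*
family that the RDD seccomp policy admits.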
Track A: REPRODUCES + DIAGNOSED. Frame-11 EINVAL fires deterministically
on a single-slice P-frame (slice_type=0, frame_num=5, post-IDR) — the
exact iter1/iter2 carryover signature, confirming it isn't environmental.
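Y2 instrumentation (in v4l2_ioctl_controls) now logs num_controls /
error_idx / per-control id+size on EINVAL. Sizes match kernel UAPI;
error_idx == num_controls is the kernel's "no specific control"
sentinel, so the rejection is not attributed to a single field. Note
that VIDIOC_S_EXT_CTRLS also reports this value when its pre-apply
validation step fails, so a single-control violation is not ruled out;
VIDIOC_TRY_EXT_CTRLS reports the exact index in that case.
Fix is iter4's lock; rig + Y2 in place for fast iter4 turnaround.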
Build infrastructure introduced: firefox-fourier LXD container on
boltzmann (RK3588 aarch64, persistent, ssh -J boltzmann
builder@firefox-fourier). Upstream Arch x86_64 wasi packages installed
to work around 4-year-stale ALARM versions. PGO generation crashes at
exit (LXC has no display); obj/dist/ tarball used as the deployable
artifact instead of the pacman package.
Phase 6 surprises captured in phase6_iter3_findings.md: malformed
first-cut patch (descriptive vs numeric hunk headers), --enable-v4l2
isn't a Mozilla 150 flag (auto-set on aarch64+GTK), Mozilla 2025 PGP
key rotation, ALARM-stale wasi, onnxruntime missing in ALARM, and the
"no tricks" lesson (revert workarounds first when redirected).
Carries to iter4 substrate: Track A fix is the natural lock; mpv
libplacebo --vo=gpu segfault stays as separate iter4 candidate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iteration 3 — Phase 4 (plan + inputs)
Track F (sandbox patch) and Track A (frame-11 EINVAL) plans, ready for Phase 5 sonnet review.
Track F — firefox-fourier RDD sandbox patch
Deliverable authored at firefox-fourier/0001-rdd-allow-stateless-v4l2-request-api.patch.
What it changes (single source file, two hunks + one new function):
- AddV4l2Dependencies() cap-filter widened to also admit nodes that advertise all of V4L2_CAP_VIDEO_CAPTURE_MPLANE, V4L2_CAP_VIDEO_OUTPUT_MPLANE, and V4L2_CAP_STREAMING. This catches stateless decoders that don't advertise M2M.
- New static AddV4l2RequestApiDependencies() function that enumerates /dev/media* and adds each node rdwr to the RDD broker policy. It mirrors the structure of AddV4l2Dependencies() for symmetry and reviewer-friendliness.
- GetRDDPolicy() calls the new function under MOZ_ENABLE_V4L2.
What it does NOT change: the seccomp policy in SandboxFilter.cpp. iter3 Phase 2 deferred this to empirical Phase 7 verification. Rationale: the iter2 failure signature was ENETDOWN at open(/dev/media0), which is broker-policy-denial, not seccomp. If MEDIA_REQUEST_IOC_QUEUE turns out to be seccomp-blocked once the open succeeds (would manifest as SIGSYS abort with seccomp_unotify in stderr), Phase 7 amends the patch with a SandboxFilter.cpp hunk allowing ioctl with magic byte '|' (or specifically the MEDIA_IOC_* range). This is a known-feasible amendment, not architectural; the cost of guess-and-check vs source-fetch-through-WebFetch favored guess-and-check.
Patch-application risk: the hunks use text-context anchors (verbatim Mozilla source from Phase 2), not line numbers. Minor whitespace drift in firefox-150.0.1.source.tar.xz vs the searchfox mozilla-release snapshot is the failure mode. Mitigation: dry-run patch -p1 --dry-run against an unpacked tarball BEFORE first makepkg. If hunks fail, re-anchor.
Track F — AUR PKGBUILD overlay
Deliverable authored at firefox-fourier/PKGBUILD-overlay.md.
Strategy: use upstream Arch firefox PKGBUILD (gitlab.archlinux.org) as basis, layer 5 hunks: rename → add aarch64 → add patch source → updpkgsums → apply in prepare(). NO mach-build or mozilla-central. The boltzmann LXD container has rust 1.95 / clang 22 / cbindgen 0.29 pre-staged and the upstream PKGBUILD's --enable-v4l2 mozconfig option is verified active.
Rebuild contract: makepkg -e (--noextract) skips re-extracting the firefox tarball and re-applying the patch, dramatically faster on iteration. For full clean rebuild (e.g. patch text changed): makepkg -C (--cleanbuild). Acknowledged user guidance: "if an aur package is the basis, remember to skip re-extraction and patching (makepkg -e) on rebuilds".
Fallback if rust-on-aarch64 fails: documented in iter3 Phase 1 lock. Power on data (x86), prevent sleep, set up x86 host with cross-compile target aarch64. Same .patch and same PKGBUILD overlay carry over; only arch= and the build host change. NOT expected to be needed since boltzmann's rust 1.95 toolchain already exists and Mozilla certifies aarch64 builds in CI.
Track A — libva-v4l2-request-fourier frame-11 EINVAL
No code fix in Phase 4. The fix requires knowing WHICH V4L2 control field returns EINVAL on frame 11, which we don't yet know. Phase 4 instead delivers the diagnostic-loaded driver build that surfaces the failing field name when run under the patched Firefox.
Plan:
- Diagnostic instrumentation in libva-v4l2-request-fourier/src/:
  - In surface.c::EndPicture (or wherever per-request controls are submitted via VIDIOC_S_EXT_CTRLS), wrap the ioctl with a request_log() call that, on EINVAL, dumps every control struct member: id, size, value (or for compound controls, the compound struct contents). Use V4L2_CID_* symbolic name lookup (a switch on id → string), or fall through to numeric id.
  - Also log the slice index, picture index, surface ID, and POC (Picture Order Count) so we can correlate with the 11th-frame timing.
  - This is purely add-only logging; revert in iter4's DEBUG sweep.
- Build + deploy: rebuild driver via meson setup --buildtype=release && ninja on ohm at /tmp/libva-src/..., deploy to /usr/lib/dri/v4l2_request_drv_video.so. Driver sha256 changes.
- Phase 7 capture: with patched Firefox + instrumented driver, run bbb_1080p30. Capture stderr; the EINVAL frame-11 line will name the control. Then we know whether it's:
  - DECODE_PARAMS (Sonnet 7.5 mid-stream non-IDR territory)
  - SLICE_PARAMS (num_ref_idx_l0/l1, Sonnet 7.2)
  - SCALING_MATRIX (less likely; usually constant)
  - SPS/PPS (even less likely; usually constant or per-IDR-only)
- Fix authoring happens AFTER Phase 7 capture, in what becomes Phase 7.5 / Phase 8 territory rather than Phase 4. This is the natural shape of "Track A informed by Track F's rig".
Reading reference for control validation rules: drivers/staging/media/hantro/hantro_g1_h264_dec.c in the kernel tree on ohm. Check on which control fields the driver returns -EINVAL in the validate path. (This ALSO is doable on rpi if we have a copy of the kernel source nearby; ohm being offline doesn't block this preliminary read.)
Phase 5 review checklist (what sonnet should look at)
- Patch correctness: does the .patch text apply cleanly to firefox-150.0.1? Are the hunks anchored on stable text? Is nsAutoCString path("/dev/") the right string-builder type for this codebase (vs std::string, nsCString, or others)? Are the cap-filter conditions logically equivalent to the substrate's claim "stateless decoders need CAPTURE_MPLANE+OUTPUT_MPLANE+STREAMING"?
- Patch security: does adding /dev/media* rdwr to RDD increase the attack surface in a way the existing /dev/video* rdwr policy doesn't already? Is there a media-controller node on common Linux desktops that exposes more than V4L2 (e.g. ISP / camera control nodes)? Should we filter /dev/media* by some capability check analogous to AddV4l2Dependencies's M2M check, or is enumeration sufficient?
- PKGBUILD safety: is renaming to firefox-fourier with conflicts=(firefox) the right pacman pattern, or should we use a provides=() pin without the conflict? Does the makepkg -e contract documented in the overlay actually hold for this PKGBUILD's prepare() shape?
- Track A diagnostic plan: is the EndPicture wrapping going to fire on the failing path, or could there be a different ioctl call site (S_EXT_CTRLS in submit_request, in queue.c, etc.) that hits EINVAL first? Should the instrumentation be at a lower layer (libva ioctl wrapper, or strace-derived signature) instead?
- Deferred-seccomp risk: Phase 2 deferred SandboxFilter.cpp to empirical Phase 7 test. Does sonnet have a fast path to fetch that source we missed? Is the deferral acceptable?
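To make the "filter /dev/media* by some capability check" question above concrete, one possible shape is to query MEDIA_IOC_DEVICE_INFO and compare the reported driver name against an allowlist. This is a sketch for discussion only: the helper names are hypothetical, and the driver names in the list are illustrative assumptions that would need checking against the target kernels.

```c
#include <linux/media.h>
#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>

/* Illustrative allowlist of stateless-decoder drivers (assumption). */
static const char *const decoder_drivers[] = { "hantro-vpu", "rkvdec", "cedrus" };

static bool driver_is_allowed(const char *driver)
{
    size_t n = sizeof(decoder_drivers) / sizeof(decoder_drivers[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(driver, decoder_drivers[i]) == 0)
            return true;
    return false;
}

/*
 * Returns 1 if the media node's driver is on the allowlist, 0 if it is not,
 * and -1 if MEDIA_IOC_DEVICE_INFO fails (errno set by ioctl).
 */
static int media_node_is_decoder(int fd)
{
    struct media_device_info info;

    memset(&info, 0, sizeof(info));
    if (ioctl(fd, MEDIA_IOC_DEVICE_INFO, &info) < 0)
        return -1;

    info.driver[sizeof(info.driver) - 1] = '\0';  /* defensive termination */
    return driver_is_allowed(info.driver) ? 1 : 0;
}
```

An allowlist keeps ISP and camera-control media nodes out of the broker policy but has to be maintained per platform; plain enumeration avoids that maintenance at the cost of a wider surface.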
Stop point
Phase 4 deliverables landed: patch text, PKGBUILD overlay strategy, Track A diagnostic plan, Phase 5 review checklist. Proceeding to Phase 5: sonnet review of the above. After Phase 5 passes (or the issues from review are resolved), Phase 6 builds firefox-fourier in the container and Phase 7 verifies on ohm.