Phase 0: browser X11-overlay inventory + mpv reference cell + tooling installs

Source-level verdict: no browser in the matrix has a code path
to hand NV12 to the X server for plane scanout. Chromium ozone-x11
wires StubOverlayManager (ozone_platform_x11.cc:262); Brave 147 +
chromium-fourier 149 inherit unchanged. Firefox WindowSurfaceX11
is RGB-only. The campaign's load-bearing hypothesis is structurally
weakened — what the X11 cells will measure for browsers is
KWin's per-frame RGB-recomposite cost, not the original
"force-GL-composite of NV12" framing. mpv --vo=xv becomes the
matrix's only direct test of the operator-supplied mechanism.

Matrix updated: 12 cells -> 16 cells (+8 mpv VO sub-points).
revert.log entries 1-5 capture all package + per-user state
mutations from this turn (measurement tools, openbox, XFCE +
xfwm4-no-comp pre-seed, firefox + AUR xtrace, XFCE rotation
+ touchscreen mapping); single SSH-driveable revert chain
returns ohm to its pre-campaign 1169-package state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 11:14:46 +00:00
parent 5d34a957ee
commit d2e11be430
30 changed files with 13176 additions and 35 deletions
@@ -0,0 +1,507 @@
# Phase 0 — Browser X11-overlay-path inventory
**Date:** 2026-05-03
**Scope:** answer worklist.md item *"Browser X11-overlay-path
inventory"* — for each of Brave 147, chromium-fourier 149,
Firefox 150, and the mpv 0.41 reference client, determine
**whether a code path exists under X11 to request
hardware-overlay scanout of NV12 video buffers**, vs always
GL-compositing internally to RGB before X11 presentation.
**Method:** source-level inspection. Local Firefox 149 source
tree (acquired by the predecessor as `firefox-fourier-work/`)
read directly. Chromium 147 source acquired from the
`chromium-builder` LXD CT on boltzmann via the his subagent —
13 files, 180 KB, copied into
`phase0_evidence/chromium_ozone_x11_2026-05-03/` with full
provenance in that subtree's `README.txt`. mpv inspected via
upstream docs + its installed binary's `--vo=help`. **Runtime
engagement (does the path actually fire at playback time?) is
NOT covered here** — that's a separate, attended-launch
follow-up; design sketch at the end of this doc.
## TL;DR
| Client | X11 overlay path exists in code? | Runtime-engaged today? | Notes |
|---|---|---|---|
| Brave 147 | **No.** ozone-x11 uses `StubOverlayManager`. | N/A | upstream Chromium 147 + Brave UI patches; Ozone unchanged |
| chromium-fourier 149 | **No.** Same as upstream Chromium 149. | N/A | local patches target decode side only; Ozone unchanged |
| Firefox 150 | **No.** `WindowSurfaceX11{,Image,SHM}` are RGB-only. | N/A | gfx/webrender_bindings has DComp (Windows), no Linux equivalent |
| mpv 0.41 `--vo=xv` | **Yes (XVideo legacy path).** | unknown | only path in this matrix that *could* engage a hardware plane via X server's Xv adapter |
| mpv 0.41 `--vo=gpu --gpu-context=x11` | **No.** GL composite, identical plumbing to browsers. | N/A | DRI3 + XPresent of an RGB GL-composited framebuffer, same as Chromium/Firefox under X11 |
**Headline finding:** neither the Blink-based browsers (Brave,
chromium-fourier) under stock ozone-x11 nor Firefox under X11 can
hand an NV12 dmabuf to the X server for hardware-overlay scanout;
they always GL-composite NV12 → RGB in their own GPU process and
then present an RGB framebuffer to the X server.
This **structurally weakens** the campaign's load-bearing
hypothesis (`README.md` § 1: *"the campaign's load-bearing
hypothesis is that this plane-allocation freedom translates into
measurable browser-video speedup"*). The freedom *exists* at the
DRM/Xorg layer — Plane 39 NV12 LINEAR is reachable to X11
clients in principle — but **no shipping browser ever tries to
exercise it** under X11. What the matrix's X11 cells will
actually measure for browsers is the cost of the GL composite
running in the *browser's* GPU process vs the GL composite
running in *kwin_wayland*'s process — an interesting question,
but a different one from the stated mechanism.
The mpv `--vo=xv` cell becomes the matrix's most important
single data point: it's the only client in the campaign that
can *attempt* to engage an X11 hardware-overlay plane directly.
A clean `--vo=xv` engagement establishes that the X11
hardware-overlay path *works on this hardware*; absence
establishes it doesn't even exist on rockchip-drm + Panfrost +
modesetting Xorg.
---
## Chromium ozone-x11: `StubOverlayManager` (decisive)
Source provenance:
`phase0_evidence/chromium_ozone_x11_2026-05-03/`
(Chromium 147.0.7727.116 from `chromium-builder` LXD CT on
boltzmann; tarball mtime 2026-04-25).
### The wiring
`ui/ozone/platform/x11/ozone_platform_x11.cc:262`:
```cpp
overlay_manager_ = std::make_unique<StubOverlayManager>();
```
`StubOverlayManager` (in `ui/ozone/common/stub_overlay_manager.h`)
implements the `OverlayManagerOzone` interface as a no-op —
its `CreateOverlayCandidates()` returns no candidates. Whatever
Chromium's `OverlayProcessor` asks the X11 backend, the answer
is "no overlay possible here" and the regular GL-composite path
runs.
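
To make the consequence concrete, the sketch below models that decision in Python. It is an illustration only, not Chromium code: `StubManager`, `PlaneBackedManager`, `PlaneBackedCandidates` and `present_path` are hypothetical names, and the stub is modelled here as offering no candidate checker at all.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class PlaneBackedCandidates:
    """Hypothetical checker: knows which FOURCCs a hardware plane could scan out."""
    supported: FrozenSet[str]

    def can_promote(self, fourcc: str) -> bool:
        return fourcc in self.supported


class StubManager:
    """Models a backend without overlay support: it never hands out a checker."""

    def create_overlay_candidates(self) -> Optional[PlaneBackedCandidates]:
        return None


class PlaneBackedManager:
    """Models a backend that can actually evaluate overlay candidates."""

    def __init__(self, supported: Iterable[str]) -> None:
        self._supported = frozenset(supported)

    def create_overlay_candidates(self) -> Optional[PlaneBackedCandidates]:
        return PlaneBackedCandidates(self._supported)


def present_path(manager, fourcc: str) -> str:
    """Return which presentation path a frame with this FOURCC would take."""
    candidates = manager.create_overlay_candidates()
    if candidates is not None and candidates.can_promote(fourcc):
        return "overlay"
    # No checker, or the checker rejected the buffer: fall back to GL composite.
    return "gl-composite"
```

With the stub in place, the buffer format is irrelevant: `present_path(StubManager(), "NV12")` takes the GL-composite branch exactly as an RGB buffer would.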
**Contrast with Wayland**
(`ui/ozone/platform/wayland/ozone_platform_wayland.cc:330`):
```cpp
std::make_unique<WaylandOverlayManager>(buffer_manager_.get());
```
A real `WaylandOverlayManager` backed by the
`WaylandBufferManagerHost`. This is the path the predecessor
campaign `kwin_overlay_subsurface` was investigating — the route
where Chromium emits `wp_subsurface` overlay candidates that
KWin Wayland may or may not promote to scanout planes (and
historically didn't, hence the predecessor's Phase 8 closure).
### What the X11 directory does NOT contain
Directory listing of `ui/ozone/platform/x11/` (from `README.txt`)
filtered for relevance:
```
$ ls ui/ozone/platform/x11/ | grep -iE 'overlay|buffer.*manager|dmabuf|present'
x11_window_manager.cc
x11_window_manager.h
```
`x11_window_manager` is the WM-events bookkeeping class
(focus/raise/restack), unrelated to overlays. **There is no
`x11_overlay_manager.cc`, no `x11_buffer_manager_host.cc`, no
`x11_dmabuf_*.cc`, and no `x11_present_*.cc`** — files that
exist on the Wayland side. The X11 backend has no machinery
to track overlay candidates, no buffer-manager IPC, no
dmabuf-feedback handling, and no Present-extension protocol
support.
### What the X11 directory DOES contain — and why it's not enough
`x11_surface_factory.cc:40-43`:
```cpp
// Native pixmaps are first imported as X11 pixmaps using DRI3
// and then into EGL.
```
DRI3 IS used by ozone-x11. But the path is:
1. GPU process gets an NV12 dmabuf from the video decoder
(VAAPI / V4L2VideoDecoder).
2. `X11SurfaceFactory::CreateNativePixmap()` wraps it as a
`gfx::NativePixmapDmaBuf`.
3. The dmabuf is imported either directly via
`EGL_EXT_image_dma_buf_import` or, as a fallback, by
creating an X11 Pixmap via DRI3 and binding that as an
EGL image.
4. **Either way, the resulting EGL image is sampled by
   Chromium's GL compositor** to render the composited page output.
5. The browser's final RGB framebuffer is presented to the X
server via `glXSwapBuffers` (or equivalent EGL swap).
This is the GL-composite path — the same path browsers use on
every X11 + GL-context combo since the late 2000s. The dmabuf
is consumed by the GPU process, not handed to the X server for
plane scanout. Plane 39 (the only NV12-LINEAR-capable plane on
rockchip-drm) is programmed by the X server at scanout time
with the **RGB** browser framebuffer, not the NV12 video buffer.
### What `chrome://gpu` would show for video acceleration
Independent of decode-side acceleration, the
*Hardware-accelerated video decode* line in `chrome://gpu` only
reports decoder status, not overlay status. `chrome://gpu`
doesn't have a "Hardware video overlay" line for Linux/X11 —
that line is Windows-only (`DCOMPSurface` on the DirectComposition
swap chain). The absence of such a line in chrome://gpu under
Linux/X11 is itself a sign: there's no overlay surface concept
in this backend.
### Brave 147 verdict
Brave is a Chromium fork. Its tree adds ~50 patches on top of
upstream Chromium covering UI/Shields/Wallet/IPFS/Tor/etc. None
of Brave's diff touches `ui/ozone/platform/x11/`, the
`OverlayProcessor`, or the GPU buffer pipeline. **Brave 147
under X11 has the identical "StubOverlayManager → GL composite"
behavior as upstream Chromium 147.**
### chromium-fourier 149 verdict
`marfrit-packages/arch/chromium-fourier/STUDY.md` documents the
patch set:
- Patch 1: bypass `media/gpu/chromeos/video_decoder_pipeline.cc`
on Linux non-ChromeOS so VaapiVideoDecoder gets used.
- Patch 2: V4L2VideoDecoder factory un-gating for non-ChromeOS.
- Patch 3 (cosmetic): default `LIBVA_DRIVER_NAME=v4l2_request`.
- `nv12-external-oes-on-modifier-external-only.patch`: EGL
import quirk for NV12 modifier-external dmabufs.
- `wayland-allow-direct-egl-gles2.patch`: Wayland-specific.
**All decode-side. Zero changes to `ui/ozone/platform/x11/`,
zero changes to the OverlayProcessor.** chromium-fourier 149's
X11 behavior is identical to upstream Chromium 149's, which is
in turn identical in Ozone shape to 147 (the
`StubOverlayManager` wiring dates to ~2019 and hasn't moved).
---
## Firefox 150 widget/gtk: RGB-only X11 surfaces (decisive)
Source: `firefox-fourier-work/firefox-149.0/`. Firefox 149 ≈ 150
on the X11-surface code path; this layer hasn't changed
materially in 5+ years.
### The X11 surface implementations
Firefox's X11 backend lives at `widget/gtk/WindowSurfaceX11*`:
- `WindowSurfaceX11.cpp` — abstract base.
- `WindowSurfaceX11Image.cpp` — XImage-based (slow legacy path).
- `WindowSurfaceX11SHM.cpp` — XShm-based (shared-memory pixmap).
`WindowSurfaceX11.cpp::GetVisualFormat()` enumerates exactly
three pixel formats:
```cpp
case 32: return gfx::SurfaceFormat::B8G8R8A8;
case 24: return gfx::SurfaceFormat::B8G8R8X8;
case 16: return gfx::SurfaceFormat::R5G6B5_UINT16;
```
All RGB. No NV12. No YUV. No dmabuf. Final pixel buffer is RGB
before reaching X.
A grep across all
`widget/gtk/WindowSurfaceX11*.{cpp,h}` files for any of
`dmabuf|DMABuf|NV12|YUV|hardware.*overlay|Plane|XPresent|DRI3`
returns **zero matches**. Firefox's X11 presentation path is
strictly software-RGB-pixmap → X.
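
For reproducibility, the same check can be sketched in Python. This is a minimal sketch, assuming the pattern is meant to be matched case-sensitively line by line, as `grep -E` would do; `scan_x11_surfaces` is a hypothetical helper name, and the directory argument is whatever local path holds `widget/gtk/`.

```python
import re
from pathlib import Path

# Same alternation as the grep described above, matched case-sensitively.
PATTERN = re.compile(
    r"dmabuf|DMABuf|NV12|YUV|hardware.*overlay|Plane|XPresent|DRI3"
)


def scan_x11_surfaces(widget_gtk_dir):
    """Return (file name, line number, line) for every match in WindowSurfaceX11*.{cpp,h}."""
    hits = []
    for path in sorted(Path(widget_gtk_dir).glob("WindowSurfaceX11*")):
        if path.suffix not in {".cpp", ".h"}:
            continue  # ignore anything that is not a C++ source or header
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits
```

An empty list from `scan_x11_surfaces("firefox-fourier-work/firefox-149.0/widget/gtk")` would correspond to the zero-match result reported above.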
### Why the dmabuf code in widget/gtk/ doesn't apply
Files like `DMABufSurface.cpp`, `DMABufBuffer.cpp`,
`WaylandSurface.cpp` exist and are reachable on Linux, but
their use is:
- Wayland-side: produce/consume `wl_buffer` dmabufs for the
client→compositor handoff via `zwp_linux_dmabuf_v1`.
- VAAPI-decode-side: import a hardware-decoded NV12 dmabuf as
an EGL image so Firefox's WebRender can sample it as a
texture.
In the second case (VAAPI under X11), the dmabuf is consumed by
Firefox's own GPU process for compositing. The composited
output is then handed to `WindowSurfaceX11SHM` as RGB —
identical situation to Chromium.
### Mozilla's `gfx.x11-egl.force-enabled` pref
`modules/libpref/init/StaticPrefList.yaml`:
```yaml
- name: gfx.x11-egl.force-enabled
type: bool
value: false
mirror: once
# Whether to force using EGL over GLX.
```
This forces Firefox to use **EGL over GLX** for its GL context
under X11 (better dmabuf import support since EGL has
`EGL_EXT_image_dma_buf_import`, GLX doesn't). It changes the
GPU-process composite-input plumbing, **not** the
WindowSurfaceX11 presentation plumbing. The output to X is
still RGB.
### Mozilla's hardware-overlay code is Windows-only
The only files in Firefox 149 source matching
`hardware.*overlay`:
- `gfx/webrender_bindings/DCLayerTree.cpp`
DirectComposition layer tree (Windows).
- `gfx/webrender_bindings/RenderCompositorANGLE.cpp`
ANGLE composition (Windows).
- `gfx/config/gfxFeature.h` — feature-flag enum.
- `gfx/thebes/gfxPlatform.cpp` — feature-flag plumbing.
There is no Linux/X11 hardware-overlay equivalent. Mozilla's
`MOZ_X11_EGL` env var (sometimes mentioned in forum threads as
a "force X11 hardware overlay" toggle) is just a synonym for
`gfx.x11-egl.force-enabled` — same EGL-vs-GLX scope, no
overlay-scanout effect.
### Firefox 150 verdict
**No X11 hardware-overlay path. Firefox under X11 always
GL-composites NV12 → RGB internally and presents RGB to the X
server.** Same architectural shape as Chromium ozone-x11.
---
## mpv 0.41: the only client with a path that *could* engage X11 hardware overlay
mpv installed: `mpv 1:0.41.0-3` with `libplacebo v7.360.1`
(per `02_x11_paths.txt`).
### `--vo=xv` — the legacy XVideo overlay path
XVideo (`Xv`) is an X protocol extension dating to the late
1990s, designed precisely for hardware video overlays: the
client hands the X server a YUV image; the X server programs a
hardware video plane (where available) to scan out that image,
hardware-blended with the rest of the desktop. With the modesetting
Xorg driver and a DRM driver that exposes a YUV-capable plane,
the expectation is that the X server's XVideo adapter wires
through DRI2/DRI3 to the DRM plane allocator and the YUV image
goes onto a hardware plane.
`xdpyinfo` on ohm confirms the XVideo extension is initialized on
the running X server (`03_xprotocol_extensions.txt:50`,
`Xorg.0.log` lines 96-97 in `01_live_session.txt`).
**Whether modesetting + rockchip-drm actually wires XVideo to
Plane 39 is the empirical question.** Possible outcomes:
- mpv `--vo=xv` programs Plane 39 NV12 → X11 hardware-overlay
path is reachable on this hardware, and the campaign has its
reference baseline.
- mpv `--vo=xv` falls back to software YUV→RGB conversion in
the X server (the modesetting "shadow Xv adapter" path) →
no hardware-overlay path on this hardware, regardless of
client.
A 5-second mpv `--vo=xv` run under a non-compositing WM, with
`drm_info` snapshots taken before / during / after, will
unambiguously answer this. *Design at the end of this doc.*
### `--vo=gpu --gpu-context=x11` — modern Mesa GL path
mpv's modern GL VO uses the same plumbing as the browsers:
DRI3 + XPresent + Mesa GL. `--hwdec=auto` or
`--hwdec=v4l2request-copy` decodes via libva → NV12 dmabuf →
imports as EGL image → mpv samples it in libplacebo's GL
compositor → glXSwapBuffers RGB to X.
This is **not** an X11 hardware-overlay path. It's the same
client-side GL composite that browsers do. Useful as a
control: if `--vo=gpu` is markedly slower than `--vo=xv` on
the same machine + same workload, the delta is the
hardware-overlay-vs-GL-composite gap — which is exactly the
quantity the campaign wants to measure.
### `--vo=drm` — bypass X entirely (NOT in scope)
mpv has a `--vo=drm` VO that talks directly to KMS, bypassing
X. The KWIN_PIVOT.md from chromium-fourier already reports
0.7 % drops with mpv `--vo=drm --hwdec=v4l2request` under no
compositor — strong evidence the underlying hardware path
(decode + DRM scanout) is healthy on this stack. But `--vo=drm`
isn't an X11 path; it doesn't tell us anything about whether
the X server can do the same plane assignment for a windowed
client. **Not added to the matrix.** Useful as a "this is the
hardware ceiling" reference outside the matrix.
---
## Implications for the matrix
### What the X11 cells of the matrix actually measure for the three browsers
Given the StubOverlayManager / WindowSurfaceX11-RGB-only
findings, the X11-cell value chain for browsers is:
1. Browser GPU process decodes video → NV12 dmabuf.
2. NV12 dmabuf imported as EGL image.
3. **Browser GPU process GL-composites NV12 → RGB** in its own
   GL context (Chromium's viz compositor or Firefox's WebRender,
   the counterparts of mpv's libplacebo).
4. RGB framebuffer presented via DRI3 + XPresent to the X
server.
5. **X server's plane allocator schedules the RGB pixmap on
   Plane 39** (on rockchip-drm, Plane 39 is the only one that
   can scan out the browser-window-sized buffer; Plane 45 does
   accept RGB LINEAR but not RGB AFBC at the resolutions
   involved; to be confirmed).
Compare to Wayland-with-KWin cell:
1. Browser GPU process decodes → NV12 dmabuf.
2. NV12 dmabuf imported as EGL image (or sent via wp_subsurface
if the browser engages that route — in practice, per the
predecessor campaign, browsers don't engage wp_subsurface for
the test page).
3. **Browser GPU process GL-composites NV12 → RGB** in its own
GL context (same as X11 cell).
4. RGB framebuffer dispatched to KWin via wl_surface attach.
5. **KWin GL-composites the browser's RGB surface** with the
rest of the desktop (the wallpaper, panels, etc.).
6. KWin's merged RGB framebuffer goes to Plane 39.
The structural delta the matrix's X11 vs Wayland cells will
actually measure for browsers is therefore **the cost of step
5 in the Wayland flow** — KWin's per-frame compositing of
already-RGB browser surfaces onto its merged framebuffer. The
NV12-GL-composite (step 3) happens in *both* sessions, in the
*browser's* GPU process. The campaign's hypothesised "force-GL-
composite of every NV12 video buffer" is performed by the
browser regardless of session, not by KWin.
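
The comparison can be written down as a small Python sketch. The step labels are a simplification assumed for illustration: the two presentation hand-offs (DRI3 + XPresent vs `wl_surface` attach) are collapsed into one shared label, and `extra_steps` is a hypothetical helper.

```python
# Simplified value chains for a browser; labels are illustrative, not measured.
X11_WITHOUT_KWIN = [
    "decode to NV12 dmabuf",
    "import NV12 as EGL image",
    "browser GL-composites NV12 to RGB",
    "present RGB framebuffer to display server",
    "scan out on Plane 39",
]

WAYLAND_WITH_KWIN = [
    "decode to NV12 dmabuf",
    "import NV12 as EGL image",
    "browser GL-composites NV12 to RGB",
    "present RGB framebuffer to display server",
    "KWin GL-composites RGB surface with the desktop",
    "scan out on Plane 39",
]


def extra_steps(baseline, other):
    """Steps that appear in `other` but not in `baseline`, in order."""
    return [step for step in other if step not in baseline]
```

`extra_steps(X11_WITHOUT_KWIN, WAYLAND_WITH_KWIN)` leaves only the KWin composite step, while the reverse direction is empty: the browser-side NV12 composite is common to both chains.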
This **does not invalidate the campaign**. KWin's per-frame RGB
composite is still measurable and the predecessor's
`kwin_wayland %CPU at steady-state ~36 %` is suggestive that
this overhead is real. But the magnitude and the mechanism are
different from the original framing.
### Where the original "plane allocation freedom" hypothesis still applies
For the **mpv `--vo=xv`** cell — and ONLY that cell — the
client genuinely tries to hand NV12 to the X server for
scanout. In the without-KWin session (X11 + non-compositing WM), the
X server is free to put that NV12 buffer on Plane 39 and
the desktop on Plane 45 — exactly the campaign's mechanism,
realised. Under Wayland-with-KWin there is no equivalent path
at all (XVideo isn't a Wayland concept; XWayland would route
through KWin and lose any scanout opportunity). So the mpv-xv
row of the matrix is the **direct test** of the
operator-supplied mechanism — and likely the only one that
can produce a positive answer.
### Recommended Phase 1 framing
The campaign's research question stays valid but its
sub-questions sharpen:
- **Q1 (mpv `--vo=xv`):** does X11+non-compositing WM on
rockchip-drm-PineTab2 actually engage hardware-overlay
scanout for an NV12 client? (matrix cells `C-X-mpv-sw` and
`C-X-mpv-hw` with `--vo=xv`)
- **Q2 (browsers):** given that browsers always GL-composite
to RGB before presentation, does removing KWin from the
display path reduce the per-frame RGB-composite cost enough
to matter for video fps / drops? (the existing browser
cells, just with their interpretation tightened)
- **Q3 (mpv `--vo=gpu`):** is the GL-composite cost in mpv's
GPU process comparable to the browsers' GL-composite cost?
(i.e., is the browsers' overhead the GL-composite itself, or
something extra browsers do around it?)
A clean Q1 positive (mpv-xv hits Plane 39) plus a marginal Q2
(browsers don't speed up much without KWin) would be the
"X11 path is fast, but browsers leave it on the table"
verdict the campaign already named as a possible outcome
shape.
A Q1 negative (mpv-xv falls back to RGB anyway) would mean
the campaign's mechanism is structurally absent on this
hardware, regardless of session. In that case the matrix
collapses to "what's the per-frame KWin overhead?", a
question the predecessor `kwin_overlay_subsurface` already
asked at the protocol level and answered.
---
## Open: runtime engagement check (deferred, attended-launch)
The source-level inventory above answers "does the path
*exist* in the code". To answer "is the path *engaged at
runtime* for THIS hardware × THIS Mesa × THIS X server", a
per-client probe is needed. This is the right thing to do at
**Phase 1 binding-cell-design time**, not at Phase 0
inventory time.
### Probe sketch (for later)
For each client × X11-session combination, around a
short (10-30 s) test-video playback:
1. Pre-state: `drm_info | grep -A 20 'Plane 39'` and
`drm_info | grep -A 20 'Plane 45'` — capture plane state
before launch.
2. Launch the client (operator action: open the browser to
`brave_drops_test.html` or run mpv on a known short video).
3. Capture for 10 s mid-playback:
- `x11trace -k -d :0 -o /tmp/x11trace.<client>.log` for
~3 s (the AUR `x11trace` binary at `/usr/bin/x11trace`
installed in `revert.log` entry 3) — to see what X
protocol requests the client is issuing per frame.
- `drm_info` snapshot — to see which planes are now
programmed and with what FOURCC/modifier.
- `chrome://gpu` page text via DevTools / about:support
copy — for the browser's own self-report.
4. Post-state `drm_info`.
5. Diff: a positive engagement looks like **Plane 39
programmed with `NV12` FOURCC during step 3**; a negative
engagement is **Plane 39 programmed with `XR24` / `AR24`
RGB FOURCC for the entire window**.
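
The step-5 decision can be sketched as a small classifier. This is a minimal sketch under stated assumptions: each snapshot is a hand-extracted mapping from plane id to FOURCC string (parsing `drm_info` output is deliberately left out, since its layout is not reproduced here), and `classify_engagement` and `fourcc_to_str` are hypothetical helper names. The FOURCC decoding follows the DRM convention of four ASCII characters packed little-endian into a 32-bit value.

```python
RGB_FOURCCS = {"XR24", "AR24"}


def fourcc_to_str(code: int) -> str:
    """Decode a 32-bit DRM FOURCC value into its four-character name."""
    return code.to_bytes(4, "little").decode("ascii")


def classify_engagement(during_snapshots, plane_id=39):
    """Classify mid-playback snapshots ({plane id: FOURCC string}) for one plane.

    "positive"     : the plane carried NV12 in at least one snapshot.
    "negative"     : the plane carried only XR24/AR24 in every snapshot.
    "inconclusive" : anything else (no snapshots, plane idle, other format).
    """
    seen = [snap.get(plane_id) for snap in during_snapshots]
    if any(fourcc == "NV12" for fourcc in seen):
        return "positive"
    if seen and all(fourcc in RGB_FOURCCS for fourcc in seen):
        return "negative"
    return "inconclusive"
```

The third outcome is an addition of this sketch, not of the probe design: it keeps an idle plane or an unexpected format from being silently counted as a negative.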
The probe is shaped like a Phase 1 measurement protocol but
uses a much shorter window and isn't recording per-frame
metrics — it's just "does NV12 ever reach a hardware plane?"
binary outcome per cell.
### Why deferred
Running the probe productively requires:
- A non-compositing-WM session active (entry-1 openbox or
entry-2 XFCE-no-comp) — operator session switch needed.
- An operator at the keyboard to start each client.
- A short test video file on ohm at a known path — minor
prep, not done in this campaign yet.
- The matrix cells finalised with their browser flags
(`--enable-features=...`, `MOZ_X11_EGL=1`, etc.) so the
probe gives the right answer for the conditions Phase 1
will measure.
None of those is a blocker — but folding the probe into
Phase 1's first measurement rep (where the operator is
already at the keyboard launching browsers and a test video
is being played for fps-counting purposes) is more efficient
than running it standalone in Phase 0.
---
## Worklist update
`worklist.md` item *"NEW: Browser X11-overlay-path inventory"*
should flip to `[x]` with a pointer to this file.