# Phase 0: Browser X11-overlay-path inventory

**Date:** 2026-05-03

**Scope:** answer worklist.md item *"Browser X11-overlay-path inventory"*: for each of Brave 147, chromium-fourier 149, Firefox 150, and the mpv 0.41 reference client, determine **whether a code path exists under X11 to request hardware-overlay scanout of NV12 video buffers**, vs always GL-compositing internally to RGB before X11 presentation.

**Method:** source-level inspection. Local Firefox 149 source tree (acquired by the predecessor as `firefox-fourier-work/`) read directly. Chromium 147 source acquired from the `chromium-builder` LXD CT on boltzmann via the his subagent: 13 files, 180 KB, copied into `phase0_evidence/chromium_ozone_x11_2026-05-03/` with full provenance in that subtree's `README.txt`. mpv inspected via upstream docs + its installed binary's `--vo=help`.

**Runtime engagement (does the path actually fire at playback time?) is NOT covered here.** That is a separate, attended-launch follow-up; a design sketch is at the end of this doc.

## TL;DR

| Client | X11 overlay path exists in code? | Runtime-engaged today? | Notes |
|---|---|---|---|
| Brave 147 | **No.** ozone-x11 uses `StubOverlayManager`. | N/A | upstream Chromium 147 + Brave UI patches; Ozone unchanged |
| chromium-fourier 149 | **No.** Same as upstream Chromium 149. | N/A | local patches target decode side only; Ozone unchanged |
| Firefox 150 | **No.** `WindowSurfaceX11{,Image,SHM}` are RGB-only. | N/A | gfx/webrender_bindings has DComp (Windows), no Linux equivalent |
| mpv 0.41 `--vo=xv` | **Yes (XVideo legacy path).** | unknown | only path in this matrix that *could* engage a hardware plane via X server's Xv adapter |
| mpv 0.41 `--vo=gpu --gpu-context=x11` | **No.** GL composite, identical plumbing to browsers. | N/A | DRI3 + XPresent of an RGB GL-composited framebuffer, same as Chromium/Firefox under X11 |

**Headline finding:** neither the Blink-based browsers (Brave, chromium-fourier) under stock ozone-x11 nor Firefox under X11 *can* hand an NV12 dmabuf to the X server for hardware-overlay scanout; they always GL-composite NV12 → RGB in their own GPU process and then present an RGB framebuffer to the X server.

This **structurally weakens** the campaign's load-bearing hypothesis (`README.md` § 1: *"the campaign's load-bearing hypothesis is that this plane-allocation freedom translates into measurable browser-video speedup"*). The freedom *exists* at the DRM/Xorg layer (Plane 39 NV12 LINEAR is reachable to X11 clients in principle), but **no shipping browser ever tries to exercise it** under X11. What the matrix's X11 cells will actually measure for browsers is the cost of the GL composite running in the *browser's* GPU process vs the GL composite running in *kwin_wayland*'s process: an interesting question, but a different one from the stated mechanism.

The mpv `--vo=xv` cell becomes the matrix's most important single data point: it's the only client in the campaign that can *attempt* to engage an X11 hardware-overlay plane directly. A clean `--vo=xv` engagement establishes that the X11 hardware-overlay path *works on this hardware*; absence establishes it doesn't even exist on rockchip-drm + Panfrost + modesetting Xorg.

---

## Chromium ozone-x11: `StubOverlayManager` (decisive)

Source provenance: `phase0_evidence/chromium_ozone_x11_2026-05-03/` (Chromium 147.0.7727.116 from `chromium-builder` LXD CT on boltzmann; tarball mtime 2026-04-25).

### The wiring

`ui/ozone/platform/x11/ozone_platform_x11.cc:262`:

```cpp
overlay_manager_ = std::make_unique<StubOverlayManager>();
```

`StubOverlayManager` (in `ui/ozone/common/stub_overlay_manager.h`) implements the `OverlayManagerOzone` interface as a no-op: its `CreateOverlayCandidates()` returns no candidates.
Whatever Chromium's `OverlayProcessor` asks the X11 backend, the answer is "no overlay possible here" and the regular GL-composite path runs.

**Contrast with Wayland** (`ui/ozone/platform/wayland/ozone_platform_wayland.cc:330`):

```cpp
std::make_unique<WaylandOverlayManager>(buffer_manager_.get());
```

A real `WaylandOverlayManager` backed by the `WaylandBufferManagerHost`. This is the path the predecessor campaign `kwin_overlay_subsurface` was investigating: the route where Chromium emits `wl_subsurface` overlay candidates that KWin Wayland may or may not promote to scanout planes (and historically didn't, hence the predecessor's Phase 8 closure).

### What the X11 directory does NOT contain

Directory listing of `ui/ozone/platform/x11/` (from `README.txt`) filtered for relevance:

```
$ ls ui/ozone/platform/x11/ | grep -iE 'overlay|buffer.*manager|dmabuf|present'
x11_window_manager.cc
x11_window_manager.h
```

`x11_window_manager` is the WM-events bookkeeping class (focus/raise/restack), unrelated to overlays. **There is no `x11_overlay_manager.cc`, no `x11_buffer_manager_host.cc`, no `x11_dmabuf_*.cc`, and no `x11_present_*.cc`**, all of which have counterparts on the Wayland side. The X11 backend has no machinery to track overlay candidates, no buffer-manager IPC, no dmabuf-feedback handling, and no Present-extension protocol support.

### What the X11 directory DOES contain, and why it's not enough

`x11_surface_factory.cc:40-43`:

```cpp
// Native pixmaps are first imported as X11 pixmaps using DRI3
// and then into EGL.
```

DRI3 IS used by ozone-x11. But the path is:

1. GPU process gets an NV12 dmabuf from the video decoder (VAAPI / V4L2VideoDecoder).
2. `X11SurfaceFactory::CreateNativePixmap()` wraps it as a `gfx::NativePixmapDmaBuf`.
3. The dmabuf is imported either directly via `EGL_EXT_image_dma_buf_import` or, as a fallback, by creating an X11 Pixmap via DRI3 and binding that as an EGL image.
4. **Either way, the resulting EGL image is sampled by Chromium's GL compositor** to render the browser's composited output.
5. The browser's final RGB framebuffer is presented to the X server via `glXSwapBuffers` (or equivalent EGL swap).

This is the GL-composite path: the same path browsers have used on every X11 + GL-context combo since the late 2000s. The dmabuf is consumed by the GPU process, not handed to the X server for plane scanout. Plane 39 (the only NV12-LINEAR-capable plane on rockchip-drm) is programmed by the X server at scanout time with the **RGB** browser framebuffer, not the NV12 video buffer.

### What `chrome://gpu` would show for video acceleration

The *Hardware-accelerated video decode* line in `chrome://gpu` reports only decoder status, not overlay status. `chrome://gpu` doesn't have a "Hardware video overlay" line for Linux/X11; that line is Windows-only (`DCOMPSurface` on the DirectComposition swap chain). The absence of such a line in `chrome://gpu` under Linux/X11 is itself a sign: there's no overlay surface concept in this backend.

### Brave 147 verdict

Brave is a Chromium fork. Its tree adds ~50 patches on top of upstream Chromium covering UI/Shields/Wallet/IPFS/Tor/etc. None of Brave's diff touches `ui/ozone/platform/x11/`, the `OverlayProcessor`, or the GPU buffer pipeline. **Brave 147 under X11 has the identical "StubOverlayManager → GL composite" behavior as upstream Chromium 147.**

### chromium-fourier 149 verdict

`marfrit-packages/arch/chromium-fourier/STUDY.md` documents the patch set:

- Patch 1: bypass `media/gpu/chromeos/video_decoder_pipeline.cc` on Linux non-ChromeOS so VaapiVideoDecoder gets used.
- Patch 2: V4L2VideoDecoder factory un-gating for non-ChromeOS.
- Patch 3 (cosmetic): default `LIBVA_DRIVER_NAME=v4l2_request`.
- `nv12-external-oes-on-modifier-external-only.patch`: EGL import quirk for NV12 modifier-external dmabufs.
- `wayland-allow-direct-egl-gles2.patch`: Wayland-specific.

**All decode-side. Zero changes to `ui/ozone/platform/x11/`, zero changes to the OverlayProcessor.** chromium-fourier 149's X11 behavior is identical to upstream Chromium 149's, which is in turn identical in Ozone shape to 147 (the `StubOverlayManager` wiring dates to ~2019 and hasn't moved).

---

## Firefox 150 widget/gtk: RGB-only X11 surfaces (decisive)

Source: `firefox-fourier-work/firefox-149.0/`. Firefox 149 ≈ 150 on the X11-surface code path; this layer hasn't changed materially in 5+ years.

### The X11 surface implementations

Firefox's X11 backend lives at `widget/gtk/WindowSurfaceX11*`:

- `WindowSurfaceX11.cpp`: abstract base.
- `WindowSurfaceX11Image.cpp`: XImage-based (slow legacy path).
- `WindowSurfaceX11SHM.cpp`: XShm-based (shared-memory pixmap).

`WindowSurfaceX11.cpp::GetVisualFormat()` enumerates exactly three pixel formats:

```cpp
case 32: return gfx::SurfaceFormat::B8G8R8A8;
case 24: return gfx::SurfaceFormat::B8G8R8X8;
case 16: return gfx::SurfaceFormat::R5G6B5_UINT16;
```

All RGB. No NV12. No YUV. No dmabuf. The final pixel buffer is RGB before reaching X.

A grep across all `widget/gtk/WindowSurfaceX11*.{cpp,h}` files for any of `dmabuf|DMABuf|NV12|YUV|hardware.*overlay|Plane|XPresent|DRI3` returns **zero matches**. Firefox's X11 presentation path is strictly software-RGB-pixmap → X.

### Why the dmabuf code in widget/gtk/ doesn't apply

Files like `DMABufSurface.cpp`, `DMABufBuffer.cpp`, `WaylandSurface.cpp` exist and are reachable on Linux, but their use is:

- Wayland-side: produce/consume `wl_buffer` dmabufs for the client→compositor handoff via `zwp_linux_dmabuf_v1`.
- VAAPI-decode-side: import a hardware-decoded NV12 dmabuf as an EGL image so Firefox's WebRender can sample it as a texture.

In the second case (VAAPI under X11), the dmabuf is consumed by Firefox's own GPU process for compositing. The composited output is then handed to `WindowSurfaceX11SHM` as RGB, the identical situation to Chromium.
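
The zero-match grep over `WindowSurfaceX11*.{cpp,h}` described above can be re-run against any checkout with a few lines of Python. This is a minimal sketch, not the command that was actually used: the function name `scan_x11_surfaces` is hypothetical, and the directory argument is assumed to point at a `widget/gtk/` directory of a Firefox source tree. The pattern is copied from the text and is case-sensitive, as plain `grep` would be.

```python
import re
from pathlib import Path

# Pattern copied verbatim from the grep described in the text (case-sensitive).
PATTERN = re.compile(
    r"dmabuf|DMABuf|NV12|YUV|hardware.*overlay|Plane|XPresent|DRI3"
)


def scan_x11_surfaces(gtk_dir):
    """Return (file name, line number, stripped line) for every pattern match
    in WindowSurfaceX11*.cpp / WindowSurfaceX11*.h directly under gtk_dir.

    An empty list corresponds to the "zero matches" result.
    """
    hits = []
    for path in sorted(Path(gtk_dir).glob("WindowSurfaceX11*")):
        if path.suffix not in (".cpp", ".h"):
            continue  # ignore anything that is not a C++ source or header
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits


# Usage (path is an assumption about where the tree is unpacked):
# hits = scan_x11_surfaces("firefox-fourier-work/firefox-149.0/widget/gtk")
# print(len(hits), "matches")
```

Only files whose names start with `WindowSurfaceX11` are scanned, so the dmabuf files that live in the same directory do not contribute matches.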

### Mozilla's `gfx.x11-egl.force-enabled` pref

`modules/libpref/init/StaticPrefList.yaml`:

```yaml
- name: gfx.x11-egl.force-enabled
  type: bool
  value: false
  mirror: once
# Whether to force using EGL over GLX.
```

This forces Firefox to use **EGL over GLX** for its GL context under X11 (better dmabuf import support since EGL has `EGL_EXT_image_dma_buf_import`, GLX doesn't). It changes the GPU-process composite-input plumbing, **not** the WindowSurfaceX11 presentation plumbing. The output to X is still RGB.

### Mozilla's hardware-overlay code is Windows-only

The only files in Firefox 149 source matching `hardware.*overlay`:

- `gfx/webrender_bindings/DCLayerTree.cpp`: DirectComposition layer tree (Windows).
- `gfx/webrender_bindings/RenderCompositorANGLE.cpp`: ANGLE composition (Windows).
- `gfx/config/gfxFeature.h`: feature-flag enum.
- `gfx/thebes/gfxPlatform.cpp`: feature-flag plumbing.

There is no Linux/X11 hardware-overlay equivalent. Mozilla's `MOZ_X11_EGL` env var (sometimes mentioned in forum threads as a "force X11 hardware overlay" toggle) is just a synonym for `gfx.x11-egl.force-enabled`: same EGL-vs-GLX scope, no overlay-scanout effect.

### Firefox 150 verdict

**No X11 hardware-overlay path. Firefox under X11 always GL-composites NV12 → RGB internally and presents RGB to the X server.** Same architectural shape as Chromium ozone-x11.

---

## mpv 0.41: the only client with a path that *could* engage X11 hardware overlay

mpv installed: `mpv 1:0.41.0-3` with `libplacebo v7.360.1` (per `02_x11_paths.txt`).

### `--vo=xv`: the legacy XVideo overlay path

XVideo (`Xv`) is an X protocol extension dating to the late 1990s, designed precisely for hardware video overlays: the client hands the X server a YUV image; the X server programs a hardware video plane (where available) to scan out that image, hardware-blended with the rest of the desktop.
On the modesetting Xorg driver plus a DRM driver that exposes a YUV-capable plane, the X server's XVideo adapter in principle wires through DRI2/DRI3 to the DRM plane allocator, and the YUV image goes onto a hardware plane.

`xdpyinfo` on ohm confirms the XVideo extension is initialized on the running X server (`03_xprotocol_extensions.txt:50`, `Xorg.0.log` lines 96-97 in `01_live_session.txt`).

**Whether modesetting + rockchip-drm actually wires XVideo to Plane 39 is the empirical question.** Possible outcomes:

- mpv `--vo=xv` programs Plane 39 NV12 → X11 hardware-overlay path is reachable on this hardware, and the campaign has its reference baseline.
- mpv `--vo=xv` falls back to software YUV→RGB conversion in the X server (the modesetting "shadow Xv adapter" path) → no hardware-overlay path on this hardware, regardless of client.

A 5-second mpv `--vo=xv` run under a non-compositing WM, with `drm_info` snapshots taken before / during / after, will unambiguously answer this. *Design at the end of this doc.*

### `--vo=gpu --gpu-context=x11`: modern Mesa GL path

mpv's modern GL VO uses the same plumbing as the browsers: DRI3 + XPresent + Mesa GL. `--hwdec=auto` or `--hwdec=v4l2request-copy` decodes via libva → NV12 dmabuf → imports as EGL image → mpv samples it in libplacebo's GL compositor → glXSwapBuffers RGB to X.

This is **not** an X11 hardware-overlay path. It's the same client-side GL composite that browsers do. Useful as a control: if `--vo=gpu` is markedly slower than `--vo=xv` on the same machine + same workload, the delta is the hardware-overlay-vs-GL-composite gap, which is exactly the quantity the campaign wants to measure.

### `--vo=drm`: bypass X entirely (NOT in scope)

mpv has a `--vo=drm` VO that talks directly to KMS, bypassing X. The KWIN_PIVOT.md from chromium-fourier already reports 0.7 % drops with mpv `--vo=drm --hwdec=v4l2request` under no compositor, which is strong evidence that the underlying hardware path (decode + DRM scanout) is healthy on this stack.
But `--vo=drm` isn't an X11 path; it doesn't tell us anything about whether the X server can do the same plane assignment for a windowed client. **Not added to the matrix.** Useful as a "this is the hardware ceiling" reference outside the matrix.

---

## Implications for the matrix

### What the X11 cells of the matrix actually measure for the three browsers

Given the StubOverlayManager / WindowSurfaceX11-RGB-only findings, the X11-cell value chain for browsers is:

1. Browser GPU process decodes video → NV12 dmabuf.
2. NV12 dmabuf imported as EGL image.
3. **Browser GPU process GL-composites NV12 → RGB** in its own GL context (via the browsers' libplacebo equivalents: Chromium's Viz/Skia compositor or Firefox's WebRender).
4. RGB framebuffer presented via DRI3 + XPresent to the X server.
5. **X server's plane allocator schedules the RGB pixmap on Plane 39** (since on rockchip-drm Plane 39 is the only one that can scan out the browser-window-sized buffer; Plane 45 doesn't accept RGB AFBC at the resolutions involved either, though it does accept RGB LINEAR; to be confirmed).

Compare to the Wayland-with-KWin cell:

1. Browser GPU process decodes → NV12 dmabuf.
2. NV12 dmabuf imported as EGL image (or sent via `wl_subsurface` if the browser engages that route; in practice, per the predecessor campaign, browsers don't engage it for the test page).
3. **Browser GPU process GL-composites NV12 → RGB** in its own GL context (same as X11 cell).
4. RGB framebuffer dispatched to KWin via wl_surface attach.
5. **KWin GL-composites the browser's RGB surface** with the rest of the desktop (the wallpaper, panels, etc.).
6. KWin's merged RGB framebuffer goes to Plane 39.

The structural delta the matrix's X11 vs Wayland cells will actually measure for browsers is therefore **the cost of step 5 in the Wayland flow**: KWin's per-frame compositing of already-RGB browser surfaces onto its merged framebuffer. The NV12-GL-composite (step 3) happens in *both* sessions, in the *browser's* GPU process.
The campaign's hypothesised "force-GL-composite of every NV12 video buffer" is performed by the browser regardless of session, not by KWin.

This **does not invalidate the campaign**. KWin's per-frame RGB composite is still measurable and the predecessor's `kwin_wayland %CPU at steady-state ~36 %` is suggestive that this overhead is real. But the magnitude and the mechanism are different from the original framing.

### Where the original "plane allocation freedom" hypothesis still applies

For the **mpv `--vo=xv`** cell, and ONLY that cell, the client genuinely tries to hand NV12 to the X server for scanout. In the without-KWin configuration (X11 + non-compositing WM), the X server is free to put that NV12 buffer on Plane 39 and the desktop on Plane 45: exactly the campaign's mechanism, realised. Under Wayland-with-KWin there is no equivalent path at all (XVideo isn't a Wayland concept; XWayland would route through KWin and lose any scanout opportunity).

So the mpv-xv row of the matrix is the **direct test** of the operator-supplied mechanism, and likely the only one that can produce a positive answer.

### Recommended Phase 1 framing

The campaign's research question stays valid but its sub-questions sharpen:

- **Q1 (mpv `--vo=xv`):** does X11+non-compositing WM on rockchip-drm-PineTab2 actually engage hardware-overlay scanout for an NV12 client? (matrix cells `C-X-mpv-sw` and `C-X-mpv-hw` with `--vo=xv`)
- **Q2 (browsers):** given that browsers always GL-composite to RGB before presentation, does removing KWin from the display path reduce the per-frame RGB-composite cost enough to matter for video fps / drops? (the existing browser cells, just with their interpretation tightened)
- **Q3 (mpv `--vo=gpu`):** is the GL-composite cost in mpv's GPU process comparable to the browsers' GL-composite cost? (i.e., is the browsers' overhead the GL-composite itself, or something extra browsers do around it?)

A clean Q1 positive (mpv-xv hits Plane 39) plus a marginal Q2 (browsers don't speed up much without KWin) would be the "X11 path is fast, but browsers leave it on the table" verdict the campaign already named as a possible outcome shape.

A Q1 negative (mpv-xv falls back to RGB anyway) would mean the campaign's mechanism is structurally absent on this hardware, regardless of session. In that case the matrix collapses to "what's the per-frame KWin overhead?", a scoping the predecessor `kwin_overlay_subsurface` already asked at the protocol level and got an answer to.

---

## Open: runtime engagement check (deferred, attended-launch)

The source-level inventory above answers "does the path *exist* in the code". To answer "is the path *engaged at runtime* for THIS hardware × THIS Mesa × THIS X server", a per-client probe is needed. This is the right thing to do at **Phase 1 binding-cell-design time**, not at Phase 0 inventory time.

### Probe sketch (for later)

For each client × X11-session combination, around a short (10-30 s) test-video playback:

1. Pre-state: `drm_info | grep -A 20 'Plane 39'` and `drm_info | grep -A 20 'Plane 45'` to capture plane state before launch.
2. Launch the client (operator action: open the browser to `brave_drops_test.html` or run mpv on a known short video).
3. Capture for 10 s mid-playback:
   - `x11trace -k -d :0 -o /tmp/x11trace..log` for ~3 s (the AUR `x11trace` binary at `/usr/bin/x11trace` installed in `revert.log` entry 3), to see what X protocol requests the client is issuing per frame.
   - `drm_info` snapshot, to see which planes are now programmed and with what FOURCC/modifier.
   - `chrome://gpu` page text via DevTools / about:support copy, for the browser's own self-report.
4. Post-state `drm_info`.
5. Diff: a positive engagement looks like **Plane 39 programmed with `NV12` FOURCC during step 3**; a negative engagement is **Plane 39 programmed with `XR24` / `AR24` RGB FOURCC for the entire window**.
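
The decision rule in step 5 is small enough to write down directly. The following is a minimal sketch, assuming the FOURCC codes seen on Plane 39 have already been pulled out of each mid-playback `drm_info` snapshot; that extraction depends on `drm_info`'s output layout and is deliberately not shown. The function name `classify_engagement` and the third outcome, `"inconclusive"`, are assumptions added here for snapshots that match neither description in the step.

```python
# FOURCC codes that the probe sketch treats as the RGB (negative) case.
RGB_FOURCCS = {"XR24", "AR24"}


def classify_engagement(plane39_fourccs):
    """Classify one cell from the FOURCC codes observed on Plane 39 during
    the mid-playback capture window (one entry per snapshot).

    "positive"     : NV12 reached the plane in at least one snapshot
    "negative"     : every snapshot showed an RGB FOURCC (XR24 / AR24)
    "inconclusive" : no snapshots, or a FOURCC the probe sketch does not name
                     (this outcome is an assumption, not part of the sketch)
    """
    seen = set(plane39_fourccs)
    if "NV12" in seen:
        return "positive"
    if seen and seen <= RGB_FOURCCS:
        return "negative"
    return "inconclusive"


# Usage with hand-written example values (not measured data):
# classify_engagement(["XR24", "NV12", "NV12"])
# classify_engagement(["XR24", "XR24"])
```

A single `NV12` observation is treated as sufficient for a positive result, which matches the binary "does NV12 ever reach a hardware plane?" framing; a stricter rule would need more snapshots per window.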

The probe is shaped like a Phase 1 measurement protocol but uses a much shorter window and isn't recording per-frame metrics; it's just a binary "does NV12 ever reach a hardware plane?" outcome per cell.

### Why deferred

Running the probe productively requires:

- A non-compositing-WM session active (entry-1 openbox or entry-2 XFCE-no-comp); an operator session switch is needed.
- An operator at the keyboard to start each client.
- A short test video file on ohm at a known path; minor prep, not done in this campaign yet.
- The matrix cells finalised with their browser flags (`--enable-features=...`, `MOZ_X11_EGL=1`, etc.) so the probe gives the right answer for the conditions Phase 1 will measure.

None of those is a blocker, but folding the probe into Phase 1's first measurement rep (where the operator is already at the keyboard launching browsers and a test video is being played for fps-counting purposes) is more efficient than running it standalone in Phase 0.

---

## Worklist update

`worklist.md` item *"NEW: Browser X11-overlay-path inventory"* should flip to `[x]` with a pointer to this file.