x11-session-research/phase0_evidence/browser_overlay_inventory_2026-05-03.md
marfrit d2e11be430 Phase 0: browser X11-overlay inventory + mpv reference cell + tooling installs
Source-level verdict: no browser in the matrix has a code path
to hand NV12 to the X server for plane scanout. Chromium ozone-x11
wires StubOverlayManager (ozone_platform_x11.cc:262); Brave 147 +
chromium-fourier 149 inherit unchanged. Firefox WindowSurfaceX11
is RGB-only. The campaign's load-bearing hypothesis is structurally
weakened — what the X11 cells will measure for browsers is
KWin's per-frame RGB-recomposite cost, not the original
"force-GL-composite of NV12" framing. mpv --vo=xv becomes the
matrix's only direct test of the operator-supplied mechanism.

Matrix updated: 12 cells -> 16 cells (+8 mpv VO sub-points).
revert.log entries 1-5 capture all package + per-user state
mutations from this turn (measurement tools, openbox, XFCE +
xfwm4-no-comp pre-seed, firefox + AUR xtrace, XFCE rotation
+ touchscreen mapping); single SSH-driveable revert chain
returns ohm to its pre-campaign 1169-package state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 11:14:46 +00:00


Phase 0 — Browser X11-overlay-path inventory

Date: 2026-05-03

Scope: answer worklist.md item "Browser X11-overlay-path inventory": for each of Brave 147, chromium-fourier 149, Firefox 150, and the mpv 0.41 reference client, determine whether a code path exists under X11 to request hardware-overlay scanout of NV12 video buffers, vs always GL-compositing internally to RGB before X11 presentation.

Method: source-level inspection. Local Firefox 149 source tree (acquired by the predecessor as firefox-fourier-work/) read directly. Chromium 147 source acquired from the chromium-builder LXD CT on boltzmann via the his subagent — 13 files, 180 KB, copied into phase0_evidence/chromium_ozone_x11_2026-05-03/ with full provenance in that subtree's README.txt. mpv inspected via upstream docs + its installed binary's --vo=help. Runtime engagement (does the path actually fire at playback time?) is NOT covered here — that's a separate, attended-launch follow-up; design sketch at the end of this doc.

TL;DR

| Client | X11 overlay path exists in code? | Runtime-engaged today? | Notes |
|---|---|---|---|
| Brave 147 | No. ozone-x11 uses StubOverlayManager. | N/A | upstream Chromium 147 + Brave UI patches; Ozone unchanged |
| chromium-fourier 149 | No. Same as upstream Chromium 149. | N/A | local patches target decode side only; Ozone unchanged |
| Firefox 150 | No. WindowSurfaceX11{,Image,SHM} are RGB-only. | N/A | gfx/webrender_bindings has DComp (Windows), no Linux equivalent |
| mpv 0.41 --vo=xv | Yes (XVideo legacy path). | unknown | only path in this matrix that could engage a hardware plane via X server's Xv adapter |
| mpv 0.41 --vo=gpu --gpu-context=x11 | No. GL composite, identical plumbing to browsers. | N/A | DRI3 + XPresent of an RGB GL-composited framebuffer, same as Chromium/Firefox under X11 |

Headline finding: neither the Blink-based browsers (Brave, chromium-fourier) under stock ozone-x11 nor Firefox under X11 can hand an NV12 dmabuf to the X server for hardware-overlay scanout; they always GL-composite NV12 → RGB in their own GPU process and then present an RGB framebuffer to the X server.

This structurally weakens the campaign's load-bearing hypothesis (README.md § 1: "the campaign's load-bearing hypothesis is that this plane-allocation freedom translates into measurable browser-video speedup"). The freedom exists at the DRM/Xorg layer — Plane 39 NV12 LINEAR is reachable to X11 clients in principle — but no shipping browser ever tries to exercise it under X11. What the matrix's X11 cells will actually measure for browsers is the cost of the GL composite running in the browser's GPU process vs the GL composite running in kwin_wayland's process — an interesting question, but a different one from the stated mechanism.

The mpv --vo=xv cell becomes the matrix's most important single data point: it's the only client in the campaign that can attempt to engage an X11 hardware-overlay plane directly. A clean --vo=xv engagement establishes that the X11 hardware-overlay path works on this hardware; absence establishes it doesn't even exist on rockchip-drm + Panfrost + modesetting Xorg.


Chromium ozone-x11: StubOverlayManager (decisive)

Source provenance: phase0_evidence/chromium_ozone_x11_2026-05-03/ (Chromium 147.0.7727.116 from chromium-builder LXD CT on boltzmann; tarball mtime 2026-04-25).

The wiring

ui/ozone/platform/x11/ozone_platform_x11.cc:262:

overlay_manager_ = std::make_unique<StubOverlayManager>();

StubOverlayManager (in ui/ozone/common/stub_overlay_manager.h) implements the OverlayManagerOzone interface as a no-op: its CreateOverlayCandidates() returns no candidates. Whatever Chromium's OverlayProcessor asks the X11 backend, the answer is "no overlay possible here" and the regular GL-composite path runs.
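
The effect of this wiring can be illustrated with a toy model. This is a Python sketch, not Chromium code: the names `VideoQuad`, `StubOverlayManagerModel` and `choose_presentation_path` are hypothetical, and only the behavior described above (a manager that never offers candidates, so the GL-composite path runs) is taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoQuad:
    """A video frame the compositor would like to promote to an overlay."""
    fourcc: str  # e.g. "NV12"


class StubOverlayManagerModel:
    """Hypothetical stand-in for a no-op overlay manager (not Chromium's class)."""

    def create_overlay_candidates(self, quads):
        # A stub manager never offers a candidate, whatever it is asked.
        return []


def choose_presentation_path(manager, quads):
    """Return the path a compositor would take for the given quads."""
    candidates = manager.create_overlay_candidates(quads)
    return "hardware-overlay" if candidates else "gl-composite"


if __name__ == "__main__":
    path = choose_presentation_path(StubOverlayManagerModel(), [VideoQuad("NV12")])
    print(path)
```

With the stub in place the decision never depends on the video format, which is why the inventory treats this single line of wiring as decisive.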

Contrast with Wayland (ui/ozone/platform/wayland/ozone_platform_wayland.cc:330):

std::make_unique<WaylandOverlayManager>(buffer_manager_.get());

A real WaylandOverlayManager backed by the WaylandBufferManagerHost. This is the path the predecessor campaign kwin_overlay_subsurface was investigating — the route where Chromium emits wp_subsurface overlay candidates that KWin Wayland may or may not promote to scanout planes (and historically didn't, hence the predecessor's Phase 8 closure).

What the X11 directory does NOT contain

Directory listing of ui/ozone/platform/x11/ (from README.txt) filtered for relevance:

$ ls ui/ozone/platform/x11/ | grep -iE 'overlay|buffer.*manager|dmabuf|present'
x11_window_manager.cc
x11_window_manager.h

x11_window_manager is the WM-events bookkeeping class (focus/raise/restack), unrelated to overlays. There is no x11_overlay_manager.cc, no x11_buffer_manager_host.cc, no x11_dmabuf_*.cc, and no x11_present_*.cc, although equivalent files exist on the Wayland side. The X11 backend directory therefore has no machinery to track overlay candidates, no buffer-manager IPC, no dmabuf-feedback handling, and no Present-extension protocol code of its own.

What the X11 directory DOES contain — and why it's not enough

x11_surface_factory.cc:40-43:

// Native pixmaps are first imported as X11 pixmaps using DRI3
// and then into EGL.

DRI3 IS used by ozone-x11. But the path is:

  1. GPU process gets an NV12 dmabuf from the video decoder (VAAPI / V4L2VideoDecoder).
  2. X11SurfaceFactory::CreateNativePixmap() wraps it as a gfx::NativePixmapDmaBuf.
  3. The dmabuf is imported either directly via EGL_EXT_image_dma_buf_import or, as a fallback, by creating an X11 Pixmap via DRI3 and binding that as an EGL image.
  4. Either way, the resulting EGL image is sampled by Chromium's GL compositor to render the composited page output.
  5. The browser's final RGB framebuffer is presented to the X server via glXSwapBuffers (or equivalent EGL swap).

This is the GL-composite path — the same path browsers use on every X11 + GL-context combo since the late 2000s. The dmabuf is consumed by the GPU process, not handed to the X server for plane scanout. Plane 39 (the only NV12-LINEAR-capable plane on rockchip-drm) is programmed by the X server at scanout time with the RGB browser framebuffer, not the NV12 video buffer.

What chrome://gpu would show for video acceleration

The Hardware-accelerated video decode line in chrome://gpu reports decoder status only, not overlay status. chrome://gpu has no "Hardware video overlay" line for Linux/X11; that line is Windows-only (DCOMPSurface on the DirectComposition swap chain). Its absence under Linux/X11 is consistent with the source-level finding: there is no overlay surface concept in this backend.

Brave 147 verdict

Brave is a Chromium fork. Its tree adds ~50 patches on top of upstream Chromium covering UI/Shields/Wallet/IPFS/Tor/etc. None of Brave's diff touches ui/ozone/platform/x11/, the OverlayProcessor, or the GPU buffer pipeline. Brave 147 under X11 has the identical "StubOverlayManager → GL composite" behavior as upstream Chromium 147.

chromium-fourier 149 verdict

marfrit-packages/arch/chromium-fourier/STUDY.md documents the patch set:

  • Patch 1: bypass media/gpu/chromeos/video_decoder_pipeline.cc on Linux non-ChromeOS so VaapiVideoDecoder gets used.
  • Patch 2: V4L2VideoDecoder factory un-gating for non-ChromeOS.
  • Patch 3 (cosmetic): default LIBVA_DRIVER_NAME=v4l2_request.
  • nv12-external-oes-on-modifier-external-only.patch: EGL import quirk for NV12 modifier-external dmabufs.
  • wayland-allow-direct-egl-gles2.patch: Wayland-specific.

All decode-side. Zero changes to ui/ozone/platform/x11/, zero changes to the OverlayProcessor. chromium-fourier 149's X11 behavior is identical to upstream Chromium 149's, which is in turn identical in Ozone shape to 147 (the StubOverlayManager wiring dates to ~2019 and hasn't moved).
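
The "zero changes" claim is the kind of statement that can be re-checked mechanically against a patch set. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the patches are assumed to be unified diffs with `+++ b/<path>` headers, and the `overlay_processor` substring is an assumed marker for OverlayProcessor sources rather than a path confirmed in this document.

```python
def files_in_patch(patch_text):
    """Collect target paths from the '+++ b/<path>' headers of a unified diff."""
    prefix = "+++ b/"
    return [
        line[len(prefix):].strip()
        for line in patch_text.splitlines()
        if line.startswith(prefix)
    ]


def touched_sensitive_paths(
    changed_files,
    # First marker is from this document; the second is an assumption.
    markers=("ui/ozone/platform/x11/", "overlay_processor"),
):
    """Return the changed files that fall under any overlay-relevant marker."""
    return sorted(p for p in changed_files if any(m in p for m in markers))


if __name__ == "__main__":
    sample = (
        "--- a/media/gpu/chromeos/video_decoder_pipeline.cc\n"
        "+++ b/media/gpu/chromeos/video_decoder_pipeline.cc\n"
        "@@ -1 +1 @@\n-old\n+new\n"
    )
    print(touched_sensitive_paths(files_in_patch(sample)))
```

An empty result for every patch in the set would support the decode-side-only reading; any hit would need manual review.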


Firefox 150 widget/gtk: RGB-only X11 surfaces (decisive)

Source: firefox-fourier-work/firefox-149.0/. Firefox 149 ≈ 150 on the X11-surface code path; this layer hasn't changed materially in 5+ years.

The X11 surface implementations

Firefox's X11 backend lives at widget/gtk/WindowSurfaceX11*:

  • WindowSurfaceX11.cpp — abstract base.
  • WindowSurfaceX11Image.cpp — XImage-based (slow legacy path).
  • WindowSurfaceX11SHM.cpp — XShm-based (shared-memory pixmap).

WindowSurfaceX11.cpp::GetVisualFormat() enumerates exactly three pixel formats:

case 32: return gfx::SurfaceFormat::B8G8R8A8;
case 24: return gfx::SurfaceFormat::B8G8R8X8;
case 16: return gfx::SurfaceFormat::R5G6B5_UINT16;

All RGB. No NV12. No YUV. No dmabuf. Final pixel buffer is RGB before reaching X.

A grep across all widget/gtk/WindowSurfaceX11*.{cpp,h} files for any of dmabuf|DMABuf|NV12|YUV|hardware.*overlay|Plane|XPresent|DRI3 returns zero matches. Firefox's X11 presentation path is strictly software-RGB-pixmap → X.
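
For repeatability, the same search can be expressed as a small script. This is a sketch of an equivalent check in Python, not the command that was actually run: the function name `scan_sources` and its arguments are hypothetical, and the pattern is copied from the paragraph above and matched case-sensitively.

```python
import re
from pathlib import Path

# Pattern taken from the text above; case-sensitive, as with plain grep -E.
PATTERN = re.compile(r"dmabuf|DMABuf|NV12|YUV|hardware.*overlay|Plane|XPresent|DRI3")


def scan_sources(root, glob="WindowSurfaceX11*"):
    """Return (file name, line number, line) for every match in .cpp/.h files."""
    hits = []
    for path in sorted(Path(root).glob(glob)):
        if path.suffix not in {".cpp", ".h"}:
            continue
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    import sys
    # Usage: python scan.py <path to widget/gtk>
    for hit in scan_sources(sys.argv[1]):
        print(hit)
```

An empty list corresponds to the "zero matches" result reported above.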

Why the dmabuf code in widget/gtk/ doesn't apply

Files like DMABufSurface.cpp, DMABufBuffer.cpp, WaylandSurface.cpp exist and are reachable on Linux, but their use is:

  • Wayland-side: produce/consume wl_buffer dmabufs for the client→compositor handoff via zwp_linux_dmabuf_v1.
  • VAAPI-decode-side: import a hardware-decoded NV12 dmabuf as an EGL image so Firefox's WebRender can sample it as a texture.

In the second case (VAAPI under X11), the dmabuf is consumed by Firefox's own GPU process for compositing. The composited output is then handed to WindowSurfaceX11SHM as RGB — identical situation to Chromium.

Mozilla's gfx.x11-egl.force-enabled pref

modules/libpref/init/StaticPrefList.yaml:

- name: gfx.x11-egl.force-enabled
  type: bool
  value: false
  mirror: once
  # Whether to force using EGL over GLX.

This forces Firefox to use EGL over GLX for its GL context under X11 (better dmabuf import support since EGL has EGL_EXT_image_dma_buf_import, GLX doesn't). It changes the GPU-process composite-input plumbing, not the WindowSurfaceX11 presentation plumbing. The output to X is still RGB.

Mozilla's hardware-overlay code is Windows-only

The only files in Firefox 149 source matching hardware.*overlay:

  • gfx/webrender_bindings/DCLayerTree.cpp — DirectComposition layer tree (Windows).
  • gfx/webrender_bindings/RenderCompositorANGLE.cpp — ANGLE composition (Windows).
  • gfx/config/gfxFeature.h — feature-flag enum.
  • gfx/thebes/gfxPlatform.cpp — feature-flag plumbing.

There is no Linux/X11 hardware-overlay equivalent. Mozilla's MOZ_X11_EGL env var (sometimes mentioned in forum threads as a "force X11 hardware overlay" toggle) is just a synonym for gfx.x11-egl.force-enabled — same EGL-vs-GLX scope, no overlay-scanout effect.

Firefox 150 verdict

No X11 hardware-overlay path. Firefox under X11 always GL-composites NV12 → RGB internally and presents RGB to the X server. Same architectural shape as Chromium ozone-x11.


mpv 0.41: the only client with a path that could engage X11 hardware overlay

mpv installed: mpv 1:0.41.0-3 with libplacebo v7.360.1 (per 02_x11_paths.txt).

--vo=xv — the legacy XVideo overlay path

XVideo (Xv) is an X protocol extension dating to the late 1990s, designed precisely for hardware video overlays: the client hands the X server a YUV image, and the X server programs a hardware video plane (where available) to scan out that image, hardware-blended with the rest of the desktop. With the modesetting Xorg driver on a DRM driver that exposes a YUV-capable plane, the X server's XVideo adapter can in principle wire through DRI2/DRI3 to the DRM plane allocator, so that the YUV image goes onto a hardware plane.

xdpyinfo on ohm confirms that the XVideo extension is initialized on the running X server (03_xprotocol_extensions.txt:50, Xorg.0.log lines 96-97 in 01_live_session.txt).

Whether modesetting + rockchip-drm actually wires XVideo to Plane 39 is the empirical question. Possible outcomes:

  • mpv --vo=xv programs Plane 39 NV12 → X11 hardware-overlay path is reachable on this hardware, and the campaign has its reference baseline.
  • mpv --vo=xv falls back to software YUV→RGB conversion in the X server (the modesetting "shadow Xv adapter" path) → no hardware-overlay path on this hardware, regardless of client.

A 5-second mpv --vo=xv run under a non-compositing WM, with drm_info snapshots taken before / during / after, will unambiguously answer this. Design at the end of this doc.

--vo=gpu --gpu-context=x11 — modern Mesa GL path

mpv's modern GL VO uses the same plumbing as the browsers: DRI3 + XPresent + Mesa GL. --hwdec=auto or --hwdec=v4l2request-copy decodes via libva → NV12 dmabuf → imports as EGL image → mpv samples it in libplacebo's GL compositor → glXSwapBuffers RGB to X.

This is not an X11 hardware-overlay path. It's the same client-side GL composite that browsers do. Useful as a control: if --vo=gpu is markedly slower than --vo=xv on the same machine + same workload, the delta is the hardware-overlay-vs-GL-composite gap — which is exactly the quantity the campaign wants to measure.
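
The comparison described above reduces to a difference of two drop rates. The sketch below shows that arithmetic in Python; the function names are hypothetical, and the frame counts in the usage line are placeholders chosen for illustration, not measurements from this campaign.

```python
def drop_rate(dropped_frames, total_frames):
    """Fraction of frames dropped during one playback run."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if not 0 <= dropped_frames <= total_frames:
        raise ValueError("dropped_frames must lie between 0 and total_frames")
    return dropped_frames / total_frames


def overlay_vs_composite_gap(gpu_run, xv_run):
    """Drop-rate difference between a --vo=gpu run and a --vo=xv run.

    Each argument is a (dropped_frames, total_frames) pair. A positive
    result means the GL-composite run dropped a larger share of frames.
    """
    return drop_rate(*gpu_run) - drop_rate(*xv_run)


if __name__ == "__main__":
    # Placeholder counts only; real inputs come from the matrix cells.
    print(overlay_vs_composite_gap((30, 600), (6, 600)))
```

The gap is only meaningful when both runs use the same machine, video and window manager, as the paragraph above requires.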

--vo=drm — bypass X entirely (NOT in scope)

mpv has a --vo=drm VO that talks directly to KMS, bypassing X. The KWIN_PIVOT.md from chromium-fourier already reports 0.7 % drops with mpv --vo=drm --hwdec=v4l2request under no compositor — strong evidence the underlying hardware path (decode + DRM scanout) is healthy on this stack. But --vo=drm isn't an X11 path; it doesn't tell us anything about whether the X server can do the same plane assignment for a windowed client. Not added to the matrix. Useful as a "this is the hardware ceiling" reference outside the matrix.


Implications for the matrix

What the X11 cells of the matrix actually measure for the three browsers

Given the StubOverlayManager / WindowSurfaceX11-RGB-only findings, the X11-cell value chain for browsers is:

  1. Browser GPU process decodes video → NV12 dmabuf.
  2. NV12 dmabuf imported as EGL image.
  3. Browser GPU process GL-composites NV12 → RGB in its own GL context (via the libplacebo equivalents: Chromium's viz compositor or Firefox's WebRender).
  4. RGB framebuffer presented via DRI3 + XPresent to the X server.
  5. X server's plane allocator schedules the RGB pixmap on Plane 39 (since on rockchip-drm Plane 39 is the only one that can scan out the browser-window-sized buffer; Plane 45 doesn't accept RGB AFBC at the resolutions involved either, though it does accept RGB LINEAR — to be confirmed).

Compare to Wayland-with-KWin cell:

  1. Browser GPU process decodes → NV12 dmabuf.
  2. NV12 dmabuf imported as EGL image (or sent via wp_subsurface if the browser engages that route — in practice, per the predecessor campaign, browsers don't engage wp_subsurface for the test page).
  3. Browser GPU process GL-composites NV12 → RGB in its own GL context (same as X11 cell).
  4. RGB framebuffer dispatched to KWin via wl_surface attach.
  5. KWin GL-composites the browser's RGB surface with the rest of the desktop (the wallpaper, panels, etc.).
  6. KWin's merged RGB framebuffer goes to Plane 39.

The structural delta the matrix's X11 vs Wayland cells will actually measure for browsers is therefore the cost of step 5 in the Wayland flow: KWin's per-frame compositing of already-RGB browser surfaces onto its merged framebuffer. The NV12 GL-composite (step 3) happens in both sessions, in the browser's GPU process. The campaign's hypothesised "force-GL-composite of every NV12 video buffer" is performed by the browser regardless of session, not by KWin.
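
The comparison of the two value chains can be written down as data and diffed. This Python sketch paraphrases the steps listed above; the step strings and the helper `extra_steps` are illustrative, and the two presentation mechanisms (DRI3 + XPresent vs wl_surface attach) are deliberately collapsed into one generic "present" step, which is a simplification.

```python
# Paraphrased from the X11-cell list above.
X11_STEPS = [
    "decode video to NV12 dmabuf",
    "import NV12 dmabuf as EGL image",
    "browser GL-composites NV12 to RGB",
    "present RGB framebuffer to display server",
    "scan out RGB on Plane 39",
]

# Paraphrased from the Wayland-with-KWin list above.
WAYLAND_KWIN_STEPS = [
    "decode video to NV12 dmabuf",
    "import NV12 dmabuf as EGL image",
    "browser GL-composites NV12 to RGB",
    "present RGB framebuffer to display server",
    "KWin GL-composites RGB surface with desktop",
    "scan out RGB on Plane 39",
]


def extra_steps(baseline, other):
    """Steps present in `other` but not in `baseline`, in pipeline order."""
    return [step for step in other if step not in baseline]


if __name__ == "__main__":
    print(extra_steps(X11_STEPS, WAYLAND_KWIN_STEPS))
```

The only step unique to the Wayland chain is KWin's composite, while the browser's NV12-to-RGB composite appears in both lists.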

This does not invalidate the campaign. KWin's per-frame RGB composite is still measurable, and the predecessor's steady-state kwin_wayland CPU usage of ~36 % suggests that this overhead is real. But the magnitude and the mechanism are different from the original framing.

Where the original "plane allocation freedom" hypothesis still applies

For the mpv --vo=xv cell, and ONLY that cell, the client genuinely tries to hand NV12 to the X server for scanout. In the without-KWin configuration (X11 + non-compositing WM), the X server is free to put that NV12 buffer on Plane 39 and the desktop on Plane 45, which is exactly the campaign's mechanism, realised. Under Wayland-with-KWin there is no equivalent path at all (XVideo isn't a Wayland concept; XWayland would route through KWin and lose any scanout opportunity). So the mpv-xv row of the matrix is the direct test of the operator-supplied mechanism, and likely the only one that can produce a positive answer.

The campaign's research question stays valid but its sub-questions sharpen:

  • Q1 (mpv --vo=xv): does X11+non-compositing WM on rockchip-drm-PineTab2 actually engage hardware-overlay scanout for an NV12 client? (matrix cells C-X-mpv-sw and C-X-mpv-hw with --vo=xv)
  • Q2 (browsers): given that browsers always GL-composite to RGB before presentation, does removing KWin from the display path reduce the per-frame RGB-composite cost enough to matter for video fps / drops? (the existing browser cells, just with their interpretation tightened)
  • Q3 (mpv --vo=gpu): is the GL-composite cost in mpv's GPU process comparable to the browsers' GL-composite cost? (i.e., is the browsers' overhead the GL-composite itself, or something extra browsers do around it?)

A clean Q1 positive (mpv-xv hits Plane 39) plus a marginal Q2 (browsers don't speed up much without KWin) would be the "X11 path is fast, but browsers leave it on the table" verdict the campaign already named as a possible outcome shape.

A Q1 negative (mpv-xv falls back to RGB anyway) would mean the campaign's mechanism is structurally absent on this hardware, regardless of session. In that case the matrix collapses to "what's the per-frame KWin overhead?" — a scoping the predecessor kwin_overlay_subsurface already asked at the protocol level and got an answer to.
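
The outcome shapes named in the two paragraphs above can be summarised as a small decision function. This is a Python sketch for orientation only: the function name, argument names and the "marginal" label are hypothetical encodings, and any combination this document does not name is reported as such rather than guessed.

```python
from typing import Optional


def campaign_verdict(q1_nv12_on_plane: bool,
                     q2_browser_speedup: Optional[str] = None) -> str:
    """Map Q1/Q2 outcomes to the verdicts named in this document.

    q1_nv12_on_plane: True if mpv --vo=xv puts NV12 on Plane 39.
    q2_browser_speedup: "marginal" if browsers gain little without KWin;
        any other value is treated as an outcome not named here.
    """
    if not q1_nv12_on_plane:
        # Q1 negative: Q2 no longer changes the structural conclusion.
        return ("mechanism structurally absent on this hardware; "
                "matrix collapses to per-frame KWin overhead")
    if q2_browser_speedup == "marginal":
        return "X11 path is fast, but browsers leave it on the table"
    return "outcome shape not named in this document"


if __name__ == "__main__":
    print(campaign_verdict(True, "marginal"))
```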


Open: runtime engagement check (deferred, attended-launch)

The source-level inventory above answers "does the path exist in the code". To answer "is the path engaged at runtime for THIS hardware × THIS Mesa × THIS X server", a per-client probe is needed. This is the right thing to do at Phase 1 binding-cell-design time, not at Phase 0 inventory time.

Probe sketch (for later)

For each client × X11-session combination, around a short (10-30 s) test-video playback:

  1. Pre-state: drm_info | grep -A 20 'Plane 39' and drm_info | grep -A 20 'Plane 45' — capture plane state before launch.
  2. Launch the client (operator action: open the browser to brave_drops_test.html or run mpv on a known short video).
  3. Capture for 10 s mid-playback:
    • x11trace -k -d :0 -o /tmp/x11trace.<client>.log for ~3 s (the AUR x11trace binary at /usr/bin/x11trace installed in revert.log entry 3) — to see what X protocol requests the client is issuing per frame.
    • drm_info snapshot — to see which planes are now programmed and with what FOURCC/modifier.
    • chrome://gpu page text via DevTools / about:support copy — for the browser's own self-report.
  4. Post-state drm_info.
  5. Diff: a positive engagement looks like Plane 39 programmed with NV12 FOURCC during step 3; a negative engagement is Plane 39 programmed with XR24 / AR24 RGB FOURCC for the entire window.
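
The diff in step 5 comes down to classifying the pixel formats seen on Plane 39 during playback. A minimal Python sketch of that classification follows. It assumes the FOURCC of each snapshot has already been extracted from the drm_info output by hand or by a separate parser (the output format is not reproduced in this document), and the function names are hypothetical. The only external fact relied on is the standard DRM FOURCC packing of four ASCII characters, least significant byte first.

```python
# RGB formats named in step 5 as the negative-engagement signature.
RGB_FOURCCS = {"XR24", "AR24"}


def fourcc_to_str(code: int) -> str:
    """Decode a 32-bit DRM FOURCC value into its four-character name."""
    return "".join(chr((code >> shift) & 0xFF) for shift in (0, 8, 16, 24))


def classify_plane_samples(fourccs):
    """Classify the FOURCC names observed on Plane 39 across snapshots.

    Returns "positive" if NV12 was ever seen, "negative" if only the RGB
    formats were seen, and "inconclusive" otherwise (no samples, or a
    format this sketch does not know about).
    """
    seen = set(fourccs)
    if "NV12" in seen:
        return "positive"
    if seen and seen <= RGB_FOURCCS:
        return "negative"
    return "inconclusive"


if __name__ == "__main__":
    samples = [fourcc_to_str(0x34325258), fourcc_to_str(0x3231564E)]
    print(samples, classify_plane_samples(samples))
```

Treating unknown formats as inconclusive keeps the probe's binary outcome honest: anything other than NV12 or the two named RGB formats should be inspected manually.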

The probe is shaped like a Phase 1 measurement protocol but uses a much shorter window and isn't recording per-frame metrics — it's just "does NV12 ever reach a hardware plane?" binary outcome per cell.

Why deferred

Running the probe productively requires:

  • A non-compositing-WM session active (entry-1 openbox or entry-2 XFCE-no-comp) — operator session switch needed.
  • An operator at the keyboard to start each client.
  • A short test video file on ohm at a known path — minor prep, not done in this campaign yet.
  • The matrix cells finalised with their browser flags (--enable-features=..., MOZ_X11_EGL=1, etc.) so the probe gives the right answer for the conditions Phase 1 will measure.

None of those is a blocker — but folding the probe into Phase 1's first measurement rep (where the operator is already at the keyboard launching browsers and a test video is being played for fps-counting purposes) is more efficient than running it standalone in Phase 0.


Worklist update

worklist.md item "NEW: Browser X11-overlay-path inventory" should flip to [x] with a pointer to this file.