From 8756ce38be8227b104589a3cb80d5f4efcc059d4 Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Tue, 28 Apr 2026 12:02:18 +0000
Subject: [PATCH] chromium-fourier r2 + firefox-fourier 150.0.1 + KWIN_PIVOT.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

chromium-fourier:
- patch 3/3 nv12-external-oes-on-modifier-external-only.patch — adds
  NativePixmapEGLBinding::ModifierRequiresExternalOES helper, extends
  OzoneImageGLTexturesHolder::GetBinding to honor EGL external_only flag
  for NV12 dmabufs on panfrost / panthor. Validated on ohm (RK3566
  hantro mainline 6.19.10): bbb_1080p30_h264.mp4 plays at 34.7 %
  combined CPU vs ~131 % pre-patch baseline (~3.8x).
- PKGBUILD pkgrel 1->2, source array + sha256sums + prepare() hook for
  patch 4, patch numbering 1/2,2/2 -> 1/3,2/3,3/3.
- NEXT.md appended with 2026-04-28 section: patch 4 design, validation
  log, KWin GL_ALPHA bug pinpoint (preexisting since 2026-03-06, affects
  every wayland video client; unrelated to chromium-fourier),
  device-renumbering note (/dev/video1 = encoder post-reboot).
- KWIN_PIVOT.md: 4-phase plan to identify and patch KWin's
  glTexImage2D(internalFormat=GL_ALPHA) site, ohm-only test plan, scope
  discipline.
- patches/ now tracked (compiler-rt-adjust-paths, enable-v4l2,
  wayland-allow-direct-egl-gles2, nv12-external-oes); the dead-end
  chromeos-pipeline-bypass.patch removed.

firefox-fourier:
- 4 patches (gfxinfo v4l2 stateless fourccs, libwrapper hwdevice ctx,
  ffmpegvideo v4l2-request route, prefs v4l2-request default).
- PKGBUILD bumped to firefox 150.0.1, Arch toolchain glue patches
  layered in, mozconfig with --without-wasm-sandboxed-libraries for
  ALARM, package() launcher fix (rm -f symlink before cat > to avoid
  ENOENT through the dangling /usr/local symlink mach install drops).
- 150.0.1-1-aarch64.pkg.tar.zst built on boltzmann (95 MB), pending
  fresnel power-on for V4L2 stateless validation on RK3399.
---
 arch/chromium-fourier/KWIN_PIVOT.md           | 181 ++++++++++++
 arch/chromium-fourier/NEXT.md                 | 278 ++++++++++++++++++
 arch/chromium-fourier/PKGBUILD                | 175 ++++++++---
 .../patches/chromeos-pipeline-bypass.patch    |  22 --
 .../patches/compiler-rt-adjust-paths.patch    |  38 +++
 .../patches/enable-v4l2-decoder-default.patch |  55 ++++
 ...ternal-oes-on-modifier-external-only.patch | 187 ++++++++++++
 .../wayland-allow-direct-egl-gles2.patch      |  57 ++++
 arch/firefox-fourier/PKGBUILD                 | 163 ++++++++++
 arch/firefox-fourier/PLAN.md                  | 210 +++++++++++++
 arch/firefox-fourier/mozconfig                |  36 +++
 .../0001-gfxinfo-v4l2-stateless-fourccs.patch |  75 +++++
 .../0002-libwrapper-hwdevice-ctx-create.patch |  52 ++++
 .../0003-ffmpegvideo-v4l2-request-route.patch | 208 +++++++++++++
 .../patches/0004-prefs-v4l2-request.patch     |  34 +++
 15 files changed, 1711 insertions(+), 60 deletions(-)
 create mode 100644 arch/chromium-fourier/KWIN_PIVOT.md
 delete mode 100644 arch/chromium-fourier/patches/chromeos-pipeline-bypass.patch
 create mode 100644 arch/chromium-fourier/patches/compiler-rt-adjust-paths.patch
 create mode 100644 arch/chromium-fourier/patches/enable-v4l2-decoder-default.patch
 create mode 100644 arch/chromium-fourier/patches/nv12-external-oes-on-modifier-external-only.patch
 create mode 100644 arch/chromium-fourier/patches/wayland-allow-direct-egl-gles2.patch
 create mode 100644 arch/firefox-fourier/PKGBUILD
 create mode 100644 arch/firefox-fourier/PLAN.md
 create mode 100644 arch/firefox-fourier/mozconfig
 create mode 100644 arch/firefox-fourier/patches/0001-gfxinfo-v4l2-stateless-fourccs.patch
 create mode 100644 arch/firefox-fourier/patches/0002-libwrapper-hwdevice-ctx-create.patch
 create mode 100644 arch/firefox-fourier/patches/0003-ffmpegvideo-v4l2-request-route.patch
 create mode 100644 arch/firefox-fourier/patches/0004-prefs-v4l2-request.patch

diff --git a/arch/chromium-fourier/KWIN_PIVOT.md b/arch/chromium-fourier/KWIN_PIVOT.md
new file mode 100644
index 000000000..668442fef
--- /dev/null
+++ b/arch/chromium-fourier/KWIN_PIVOT.md
@@ -0,0 +1,181 @@
+# KWin pivot — fix the `glTexImage2D(GL_ALPHA)` stall
+
+## What we know
+
+KWin 6.6.4-1 on Arch Linux ARM (Plasma 6.6.4-1, mesa 26.0.5-1, libdrm
+2.4.131-1) on ohm (PineTab2 / RK3566 / panfrost) silently corrupts its
+GL command queue mid-frame whenever a wayland client posts a video
+buffer. The journal carries a rolling stream of:
+
+```
+kwin_wayland: 0x4: GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA)
+kwin_wayland: 0x4: GL_INVALID_OPERATION in glTexSubImage2D(invalid texture level 0) × N
+```
+
+`GL_ALPHA` is a legacy unsized single-channel format, and Mesa rejects it
+here as an `internalFormat` for `glTexImage2D`. (OpenGL ES 3.x still lists
+it as an unsized format; desktop core profiles drop it. The sized
+replacements are `GL_R8` and `GL_RG8`.) Once the allocation fails, the
+`glTexSubImage2D` calls that should populate the texture all error at
+level 0. KWin retries the same broken upload every frame, never recovers,
+and the present-callback path that depends on that texture stops acking
+client frames. Every wayland video client deadlocks on the missing ack.
+
+First occurrence in this box's journal: **2026-03-06** — the bug
+predates any chromium-fourier work by roughly seven weeks.
+
+## Triangulation already in hand
+
+| Client | Outcome |
+|---|---|
+| chromium-fourier 149-r2 (with patch 3/3) | plays ~3 s @ 34.7 % CPU then renderer/GPU park in `futex_do_wait` |
+| chromium-fourier 149-r2 (without patch 3/3) | plays ~10 s (slower path delays surfacing) then identical deadlock |
+| VLC | `cannot convert decoder/filter output to any format supported by the output` → `could not initialize video chain` |
+| mpv `--vo=null --hwdec=v4l2request` | `Could not create device.` (mpv-side bug, separate, unrelated) |
+| ffmpeg `-hwaccel v4l2request -i bbb -f null -` | plays through clean at 36 fps; hardware path is healthy |
+
+Decode path is healthy on this hardware. The wall is exclusively the
+compositor's GL backend.
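
The journal signature quoted under "What we know" is regular enough to count mechanically, which helps when comparing runs across clients. The following is a minimal sketch, not part of the patch: the regex is derived only from the two sample lines quoted above, and `summarize_gl_errors` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Assumption: journal lines have the shape quoted under "What we know", e.g.
#   kwin_wayland: 0x4: GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA)
GL_ERROR_RE = re.compile(
    r"kwin_wayland: .*?(GL_INVALID_[A-Z_]+) in (\w+)\((.*)\)"
)


def summarize_gl_errors(lines):
    """Count (error, GL function) pairs in an iterable of journal lines."""
    counts = Counter()
    for line in lines:
        match = GL_ERROR_RE.search(line)
        if match:
            error, function, _detail = match.groups()
            counts[(error, function)] += 1
    return counts


# Hypothetical sample input built from the two lines quoted in this document.
SAMPLE = [
    "kwin_wayland: 0x4: GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA)",
    "kwin_wayland: 0x4: GL_INVALID_OPERATION in glTexSubImage2D(invalid texture level 0)",
    "kwin_wayland: unrelated message",
]

if __name__ == "__main__":
    for (error, function), n in summarize_gl_errors(SAMPLE).items():
        print(f"{n:4d}  {error} in {function}")
```

To use it on a real capture, replace `SAMPLE` with the lines of a saved journal excerpt; a run with zero `glTexImage2D` entries is what the success criterion later in this file asks for.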
+
+## Constraint: ohm is the only test box on hand
+
+ampere (RK3588 / panthor) is in the boxes-from-Shenzhen pile, currently
+DOWN. fresnel (RK3399 / Pinebook Pro) is offline. boltzmann (Rock 5
+ITX+ build host) doesn't run KWin. We do every step on ohm; we accept
+the wifi flakiness and the occasional reboot.
+
+## Phase 1 — Reproduce outside chrome and bound the trigger (1 evening)
+
+Goal: a deterministic, headless-or-near-headless reproduction that
+doesn't require launching an 800-MB browser.
+
+1. **Smallest-possible client.** Build a 50-line C wayland client that
+   creates a `wp_linux_dmabuf_v1` buffer, pumps frames at 30 fps, and
+   exits when KWin first errors. Use `weston-simple-dmabuf-egl` from
+   the `weston` package as a starting template; it already does exactly
+   this but without our specific format/modifier matrix.
+2. **Vary the format/modifier matrix.** Run the smallest-possible
+   client with each of: NV12 + LINEAR, NV12 + AFBC, NV12 + AFRC,
+   AR24 + LINEAR, XR24 + LINEAR. We already know NV12 paths trigger;
+   confirming AR24/XR24 do *not* trigger localizes the bug to KWin's
+   YUV import path (vs a generic dmabuf import bug).
+3. **Vary the buffer dimensions.** Some KWin texture-cache paths
+   allocate fixed-size internal scratch textures; non-power-of-two,
+   non-multiple-of-16, or specifically odd-aspect cases sometimes
+   trigger paths that healthy aspect ratios skip. Test 1920×1080,
+   1280×720, 854×480, 640×360 and a deliberately weird 1366×768.
+4. **Vary KWin scene type.** Switch
+   `kwin_wayland --scene-type=opengl` vs `--scene-type=opengl-es`
+   (current default on this hardware). If the bug only fires under
+   GLES, that's a strong signal — the offending site is in a
+   GLES-only fallback.
+
+By the end of Phase 1 we should have a one-line `weston-simple-dmabuf-egl
+-format=NV12 -modifier=…` that triggers the GL_ALPHA error within
+seconds, plus a yes/no answer to "does AR24 also trigger".
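
Steps 2 and 3 of Phase 1 amount to a 5 × 5 test matrix. A short sketch of enumerating it follows, so each run can be logged against a case; it is illustrative only. The names `FORMAT_MODIFIERS`, `SIZES`, `fourcc` and `build_matrix` are hypothetical, and the only external fact assumed is the usual DRM fourcc packing (four ASCII bytes, little-endian).

```python
from itertools import product

# Format/modifier pairs and buffer sizes exactly as listed in Phase 1,
# steps 2 and 3. Modifier names are kept as labels, not numeric values.
FORMAT_MODIFIERS = [
    ("NV12", "LINEAR"),
    ("NV12", "AFBC"),
    ("NV12", "AFRC"),
    ("AR24", "LINEAR"),
    ("XR24", "LINEAR"),
]
SIZES = [(1920, 1080), (1280, 720), (854, 480), (640, 360), (1366, 768)]


def fourcc(code):
    """Pack a four-character code into the little-endian DRM fourcc integer."""
    a, b, c, d = (ord(ch) for ch in code)
    return a | (b << 8) | (c << 16) | (d << 24)


def build_matrix():
    """Return one dict per (format, modifier, size) combination."""
    cases = []
    for (fmt, modifier), (width, height) in product(FORMAT_MODIFIERS, SIZES):
        cases.append({
            "format": fmt,
            "fourcc": fourcc(fmt),
            "modifier": modifier,
            "width": width,
            "height": height,
            # Step 3 singles out sizes that are not a multiple of 16.
            "aligned16": width % 16 == 0 and height % 16 == 0,
        })
    return cases


if __name__ == "__main__":
    for case in build_matrix():
        print(f"{case['format']} (0x{case['fourcc']:08x}) {case['modifier']:6s} "
              f"{case['width']}x{case['height']} aligned16={case['aligned16']}")
```

Of the five sizes, only 1280×720 is a multiple of 16 in both dimensions, so the `aligned16` column separates that case from the other four.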
+
+## Phase 2 — Identify the call site (1–2 evenings)
+
+The crime scene is somewhere in `kwin/src/scene/*` or
+`kwin/src/effects/*`. Suspects, ranked:
+
+- **`SurfaceItemWayland::createPixmapTexture` → `GLTexture::create`
+  with `GL_ALPHA`.** This is the most likely path: KWin allocates a
+  fallback per-plane texture when the dmabuf import path can't take
+  the buffer whole. NV12 has a Y plane (single-channel) and a CbCr
+  plane (two-channel); historically the Y plane has been allocated as
+  `GL_ALPHA` in software fallbacks. If the EGL dmabuf import returned
+  `EGL_BAD_ATTRIBUTE` for `external_only` modifiers and KWin fell
+  through to per-plane, this is exactly where it would land.
+- **`BlurEffect::initBlurTexture` / `BackgroundContrastEffect::*`.**
+  Single-channel noise textures for blur dither. Less likely (these
+  fire on every frame regardless of video clients) but listed for
+  completeness.
+- **Window-decoration text glyph cache.** Qt's QGLTexture historically
+  requested `GL_ALPHA` for monochrome glyph atlases. Plasma 6 should
+  have moved to `GL_RED` long ago, but a stale code path in a
+  third-party theme or systray icon could still hit it.
+- **Cursor texture upload via `wl_shm_pool` + ARGB8888.** KWin's
+  cursor scene sometimes uploads via glTexImage2D — but the format
+  there is `GL_RGBA`, not `GL_ALPHA`. Probably not the suspect.
+
+Tooling to identify *which*:
+
+1. **`apitrace trace --api egl kwin_wayland …`** then
+   `apitrace dump trace.trace | grep -B5 GL_ALPHA`. Apitrace gives
+   us the C++ call stack at the offending site if KWin was built with
+   debug symbols.
+2. **`MESA_GL_DEBUG=context KWIN_GL_DEBUG=1 kwin_wayland --replace`**
+   plus `glDebugMessageCallback` already installed in KWin's
+   `OpenGLBackend` will print the source/type/severity for each
+   `GL_INVALID_VALUE`. Whether the file/line in the message includes
+   the user-space caller depends on Mesa's debug-extension support;
+   on panfrost it usually does include the GL function name and an
+   ID, but not the C++ source — that is what apitrace adds.
+3. **Build kwin from source** (`extra/kwin` PKGBUILD on Arch ARM,
+   patch in `-DDEBUG=ON`, `-DCMAKE_BUILD_TYPE=Debug`) so the call
+   stacks resolve to file:line.
+
+## Phase 3 — Write the patch (½ evening once Phase 2 is done)
+
+If the offender is a plain `GL_ALPHA` allocation that the context
+rejects, the fix is mechanical:
+
+```diff
+- glTexImage2D(GL_TEXTURE_2D, 0, GL_ALPHA, width, height, 0,
+-              GL_ALPHA, GL_UNSIGNED_BYTE, data);
++ glTexImage2D(GL_TEXTURE_2D, 0, GL_R8, width, height, 0,
++              GL_RED, GL_UNSIGNED_BYTE, data);
+```
+
+…and adjust the consuming shader's swizzle:
+
+```diff
+- gl_FragColor = vec4(texture2D(s, uv).a, …);
++ gl_FragColor = vec4(texture2D(s, uv).r, …);
+```
+
+If the offender is a per-plane fallback in the dmabuf import path
+(suspect #1 above), the patch is larger because the right fix is to
+*not fall through to the broken path* — handle the `external_only`
+case by binding `GL_TEXTURE_EXTERNAL_OES` instead. That mirrors the
+chromium-fourier patch 3/3 done at the chromium layer; symmetry says
+KWin should do the same in its `glTexImage` consumer.
+
+## Phase 4 — Ship and upstream (1 evening)
+
+1. **Local Arch package** as `kwin-fourier` under
+   `marfrit-packages/arch/kwin-fourier/`, sibling to chromium-fourier
+   and firefox-fourier. PKGBUILD inherits from `extra/kwin`, drops
+   in our patch, bumps `pkgrel`. Same `provides=kwin conflicts=kwin`
+   pattern.
+2. **Validate on ohm** by running the chromium-fourier 149-r2 build
+   + the bbb sample for a minute uninterrupted. Success = no GL_ALPHA
+   in the journal, no stall, smooth playback at the 34.7 % CPU
+   number from the chromium validation.
+3. **Upstream** via:
+   - File a `kwin` bug on bugs.kde.org with: apitrace fragment, our
+     hardware (Mali-G52 panfrost on RK3566 mainline), exact mesa
+     version, repro steps via `weston-simple-dmabuf-egl` if Phase 1
+     produced one.
+   - Push an MR to invent.kde.org/plasma/kwin against `master`.
+4. **Document** the fix in `chromium-fourier/docs/dmabuf-zero-copy.md`
+   so the next person who lands on the same wall finds the breadcrumb
+   trail.
+
+## What success looks like
+
+`chromium-fourier-149-r2` on ohm under KWin Wayland plays
+`bbb_1080p30_h264.mp4` end-to-end at the 34.7 % CPU figure already
+recorded by the architectural validation, with zero `GL_INVALID_VALUE`
+in the journal during playback. That number is the goal of the entire
+chromium-fourier campaign for RK3566 — it is currently blocked on a
+bug that has nothing to do with chromium.
+
+## Scope discipline
+
+We do not turn this into "audit the entire KWin GLES backend." If
+Phase 2 surfaces additional latent GL_INVALID_* errors that don't
+matter for video playback, we note them in the bug report and move
+on. The pivot is explicitly "remove this single wall so the
+chromium-fourier patch series can ship a working stack on RK3566."
diff --git a/arch/chromium-fourier/NEXT.md b/arch/chromium-fourier/NEXT.md
index 39d5a1520..44d8a4b03 100644
--- a/arch/chromium-fourier/NEXT.md
+++ b/arch/chromium-fourier/NEXT.md
@@ -148,3 +148,281 @@ to install chromium's bundled clang (x86_64 host, arm64 sysroot), then
 The boltzmann `chromium-builder` LXD container is preserved as fallback
 but no longer the active build host. If cross-compile pans out, that
 container can be torn down.
+
+## First runtime validation on ohm — 2026-04-26 22:26 UTC
+
+Cross-compile produced a working aarch64 binary (chrome 647 MB ELF +
+chrome_crashpad_handler 4.3 MB + .pak + locales). Tarball
+`chromium-fourier-147.tar.gz` (226 MB) transferred CT 220 → hertz → ohm.
+Launched in mfritsche's KWin Wayland session (tty2, panfrost render
+node) playing `bbb_1080p30_h264.mp4` from file:// with
+`LIBVA_DRIVER_NAME=v4l2_request`,
+`LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0`,
+`--use-gl=egl --ozone-platform=wayland
+--enable-features=VaapiVideoDecodeLinuxGL,AcceleratedVideoDecodeLinuxGL
+--disable-features=UseChromeOSDirectVideoDecoder
+--autoplay-policy=no-user-gesture-required`.
+
+**Result: V4L2 path NOT engaged.** Chrome 147 routes the H.264 stream
+through `MojoVideoDecoderService` → `media/filters/ffmpeg_video_decoder.cc`
+(software FFmpeg). Renderer pegs at ~92 % CPU, `/dev/video0` is never
+opened (`fuser` returns empty), no `V4L2VideoDecoder` /
+`VaapiVideoDecoder` log lines appear at `--v=1
+--vmodule="*/vaapi/*=2,*/v4l2/*=2,*video_decoder*=2,*media/gpu/*=2"`.
+Compositor also fell back to software (`Switching to software
+compositing.` even though panfrost render node was picked) — secondary
+issue, separate from the codec wall.
+
+**Conclusion**: 7Ji-style gn args (`use_v4l2_codec=true
+use_v4lplugin=true use_linux_v4l2_only=true`) alone are insufficient
+on chromium 147. The V4L2VideoDecoder factory is still gated behind
+`BUILDFLAG(IS_CHROMEOS)` — `media/mojo/services/gpu_mojo_media_client_*.cc`
+and `media/gpu/gpu_video_decode_accelerator_factory.cc` only register
+the V4L2 path on ChromeOS targets.
+
+## Validation pass 2 — 2026-04-26 22:38 UTC — V4L2VDA proven engaged
+
+Two distinct issues were diagnosed and the codec one was fully resolved
+without source surgery beyond a 2-line patch:
+
+### Issue 1 — runtime master gate
+`media::kAcceleratedVideoDecodeLinux` (user-visible feature name
+"AcceleratedVideoDecoder") is hard-coded in
+`media/base/media_switches.cc:750` to `FEATURE_ENABLED_BY_DEFAULT` only
+when `BUILDFLAG(USE_VAAPI)` is set. On a USE_V4L2_CODEC-only build it
+defaults DISABLED, the linux gpu_mojo_media_client returns
+`VideoDecoderType::kUnknown`, and chrome silently falls back to
+`media/filters/ffmpeg_video_decoder.cc`.
+
+**Fix**: 2-line patch (now `patches/enable-v4l2-decoder-default.patch`):
+```
+-#if BUILDFLAG(USE_VAAPI)
++#if BUILDFLAG(USE_VAAPI) || BUILDFLAG(USE_V4L2_CODEC)
+```
+
+The placeholder `chromeos-pipeline-bypass.patch` was deleted; PKGBUILD
+now references the real patch. **Verified to apply cleanly on the CT 220
+tree** (chromium 149 main).
+
+### Issue 2 — bundled GL libs missing from tarball
+The first runtime tarball shipped only `chrome` + `.pak` + locales +
+`chrome_crashpad_handler`. It omitted `libEGL.so` / `libGLESv2.so`
+(ANGLE) plus `libvk_swiftshader.so` and `libvulkan.so.1`. Without these,
+the GPU process logs `gl::init::InitializeStaticGLBindingsOneOff failed`
+and chrome falls into "Switching to software compositing." mode — which
+*also* gates the V4L2 path off because the gpu_mojo_media_client never
+gets a chance to dispatch.
+
+Additionally, `--use-gl=egl` is rejected ("Requested GL implementation
+gl=egl-gles2,angle=none not found in allowed implementations:
+[(gl=egl-angle,angle=opengl|opengles|vulkan)]"): the build only allows
+ANGLE-mediated paths. Right launcher invocation:
+`--use-gl=angle --use-angle=gles`.
+
+**Fix**: package the four libs alongside `chrome` and update the
+launcher flag set. Both will be encoded in the next iteration of the
+PKGBUILD's `package()` and a `chromium-fourier` launcher script.
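
The Issue 2 fix has two halves: the four libraries must sit next to `chrome`, and the launcher must pass the ANGLE flags. A minimal sketch of a pre-launch check follows, assuming the libraries live in the same directory as the `chrome` binary as the fix describes. It is not the actual launcher script; `missing_libs` and `launcher_command` are hypothetical helper names.

```python
from pathlib import Path

# The four libraries named under Issue 2 and the flag pair that replaced
# the rejected --use-gl=egl.
REQUIRED_LIBS = (
    "libEGL.so",
    "libGLESv2.so",
    "libvk_swiftshader.so",
    "libvulkan.so.1",
)
LAUNCHER_FLAGS = ("--use-gl=angle", "--use-angle=gles")


def missing_libs(bundle_dir):
    """Return the required library names that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in REQUIRED_LIBS if not (root / name).is_file()]


def launcher_command(bundle_dir, extra_args=()):
    """Build the chrome argv, refusing to proceed if a library is missing."""
    missing = missing_libs(bundle_dir)
    if missing:
        raise FileNotFoundError(f"bundle is missing: {', '.join(missing)}")
    return [str(Path(bundle_dir) / "chrome"), *LAUNCHER_FLAGS, *extra_args]
```

Failing early here is a design choice: without the check, a missing library only shows up later as the "Switching to software compositing." fallback, which also hides the V4L2 path.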
+
+### What we observed once both fixes were in place
+With patch + bundled libs + `--enable-features=AcceleratedVideoDecoder`
++ `--use-gl=angle --use-angle=gles`, chrome on RK3566 hantro logs:
+
+```
+[gpu]: V4L2VideoDecoder()
+[gpu]: Open(): No devices supporting H264 for type: 0   <- type=0 is single-planar; chrome retries multi-planar
+[gpu]: InitializeBackend(): Using a stateless API for profile: h264 main and fourcc: S264
+[gpu]: SetupInputFormat(): Input (OUTPUT queue) Fourcc: S264
+[gpu]: AllocateInputBuffers(): Requesting: 17 OUTPUT buffers of type V4L2_MEMORY_MMAP
+[gpu]: SetExtCtrlsInit(): Setting EXT_CTRLS for H264
+[gpu]: SetupOutputFormat(): Output (CAPTURE queue) candidate: NV12
+[gpu]: ContinueChangeResolution(): Requesting: 6 CAPTURE buffers of type V4L2_MEMORY_MMAP
+[renderer]: OnDecoderSelected