# KWin pivot — fix the `glTexImage2D(GL_ALPHA)` stall

> **2026-04-28 update — Phase 2 collapsed onto Phase 1: it's not KWin.**
> Source-grep nailed the offender on the first pass. Real culprit:
> Qt 6's `QOpenGLTextureGlyphCache` (`src/opengl/qopengltextureglyphcache.cpp:111-117`)
> and `QRhiGles2::toGlTextureFormat` (`src/gui/rhi/qrhigles2.cpp:1373-1378`).
> KWin's own GL paths use `GL_R8` correctly (`src/opengl/gltexture.cpp:61`,
> `src/scene/shadowitem.cpp:494`). The pivot becomes a **Qt-fourier**
> patch, not a kwin-fourier one. Plan rewritten below; the pre-rewrite
> reproduction/triangulation phases are kept verbatim because they
> still apply to whatever lives downstream of the Qt fix.
>
> Qt's broken logic, in plain English: *"If qtbase was built with
> opengles2, just always use `GL_ALPHA`."* That's correct for an
> OpenGL ES 2.x context. It's wrong for OpenGL ES 3.x, where
> `GL_ALPHA` is no longer a valid `glTexImage2D` internalFormat
> (only sized formats — `GL_R8`, etc.). Mali / panfrost on RK3566
> exposes ES 3.2; KWin requests an ES 3.2 context; Qt picks
> `GL_ALPHA`; mesa returns `GL_INVALID_VALUE`; the texture is
> permanently broken; every dependent draw errors at level 0; the
> compositor's frame-callback path stalls. Affects every Qt 6
> application on Mali-class hardware that ends up rendering text
> through `QOpenGLTextureGlyphCache` (KDE's window decorations,
> Plasma overlays, Qt Quick scenegraph via RHI, ad nauseam) — KWin
> just happens to be the most visible victim because it's the
> compositor and its stall takes everyone else down with it.

## What we know

KWin 6.6.4-1 on Arch Linux ARM (Plasma 6.6.4-1, mesa 26.0.5-1,
libdrm 2.4.131-1) on ohm (PineTab2 / RK3566 / panfrost) silently
corrupts its GL command queue mid-frame whenever a wayland client
posts a video buffer.
The journal carries a rolling stream of:

```
kwin_wayland: 0x4: GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA)
kwin_wayland: 0x4: GL_INVALID_OPERATION in glTexSubImage2D(invalid texture level 0) × N
```

`GL_ALPHA` is not a valid `internalFormat` for `glTexImage2D` under
**OpenGL ES 3.x** (it was the single-channel alpha format of GLES 1.x
and 2.x; GLES3 deprecates it in favor of sized formats — `GL_R8`,
`GL_LUMINANCE8_ALPHA8`, etc.). Once the texture allocation fails, the
`glTexSubImage2D` calls that should populate it all error at level 0.
KWin keeps retrying the same broken upload every frame, never
recovers, and the present-callback path that depends on that texture
stops acking client frames. Every wayland video client deadlocks on
the missing ack.

First occurrence in this box's journal: **2026-03-06** — the bug
predates any chromium-fourier work by roughly seven weeks.

## Triangulation already in hand

| Client | Outcome |
|---|---|
| chromium-fourier 149-r2 (with patch 3/3) | plays ~3 s @ 34.7 % CPU then renderer/GPU park in `futex_do_wait` |
| chromium-fourier 149-r2 (without patch 3/3) | plays ~10 s (slower path delays surfacing) then identical deadlock |
| VLC | `cannot convert decoder/filter output to any format supported by the output` → `could not initialize video chain` |
| mpv `--vo=null --hwdec=v4l2request` | `Could not create device.` (mpv-side bug, separate, unrelated) |
| ffmpeg `-hwaccel v4l2request -i bbb -f null -` | plays through clean at 36 fps; hardware path is healthy |

Decode path is healthy on this hardware. The wall is exclusively the
compositor's GL backend.

## Constraint: ohm is the only test box on hand

ampere (RK3588 / panthor) is in the boxes-from-Shenzhen pile,
currently DOWN. fresnel (RK3399 / Pinebook Pro) is offline. boltzmann
(Rock 5 ITX+ build host) doesn't run KWin. We do every step on ohm;
we accept the wifi flakiness and the occasional reboot.
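For the reproduction work, it helps to recognise the failure signature mechanically instead of eyeballing the journal. The following is a minimal sketch, assuming C++17 and assuming the journal lines keep the exact wording quoted under "What we know". The names `GlJournalEvent`, `classifyJournalLine` and `StallSignature` are hypothetical and belong to no existing tool.

```cpp
#include <string_view>

// What a single journal line tells us about the stall signature.
enum class GlJournalEvent {
    AllocFailed,   // glTexImage2D rejected GL_ALPHA
    UploadFailed,  // glTexSubImage2D hit the never-allocated level 0
    Other
};

// Substrings copied from the journal excerpt; assumed stable across runs.
constexpr std::string_view kAllocNeedle =
    "GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA)";
constexpr std::string_view kUploadNeedle =
    "GL_INVALID_OPERATION in glTexSubImage2D(invalid texture level 0)";

constexpr GlJournalEvent classifyJournalLine(std::string_view line) {
    if (line.find(kAllocNeedle) != std::string_view::npos)
        return GlJournalEvent::AllocFailed;
    if (line.find(kUploadNeedle) != std::string_view::npos)
        return GlJournalEvent::UploadFailed;
    return GlJournalEvent::Other;
}

// Tracks whether a failed allocation has been followed by at least one
// failed upload, i.e. the cascade rather than an isolated error.
struct StallSignature {
    bool allocFailed = false;
    bool cascadeSeen = false;

    constexpr void feed(std::string_view line) {
        switch (classifyJournalLine(line)) {
        case GlJournalEvent::AllocFailed:
            allocFailed = true;
            break;
        case GlJournalEvent::UploadFailed:
            if (allocFailed)
                cascadeSeen = true;
            break;
        case GlJournalEvent::Other:
            break;
        }
    }
};
```

A test client or wrapper script could feed each journal line to `StallSignature::feed` and stop as soon as `cascadeSeen` becomes true, which gives a yes/no answer per format, modifier and size combination.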
## Phase 1 — Reproduce outside chrome and bound the trigger (1 evening)

Goal: a deterministic, headless-or-near-headless reproduction that
doesn't require launching an 800-MB browser.

1. **Smallest-possible client.** Build a 50-line C wayland client that
   creates a `wp_linux_dmabuf_v1` buffer, pumps frames at 30 fps, and
   exits when KWin first errors. Use `weston-simple-dmabuf-egl` from
   the `weston` package as a starting template — already does exactly
   this but without our specific format/modifier matrix.
2. **Vary the format/modifier matrix.** Run the smallest-possible
   client with each of: NV12 + LINEAR, NV12 + AFBC, NV12 + AFRC,
   AR24 + LINEAR, XR24 + LINEAR. We already know NV12 paths trigger;
   confirming AR24/XR24 do *not* trigger localizes the bug to KWin's
   YUV import path (vs a generic dmabuf import bug).
3. **Vary the buffer dimensions.** Some KWin texture-cache paths
   allocate fixed-size internal scratch textures; non-power-of-two,
   non-multiple-of-16, or specifically odd-aspect cases sometimes
   trigger paths that healthy aspect ratios skip. Test 1920×1080,
   1280×720, 854×480, 640×360 and a deliberately weird 1366×768.
4. **Vary KWin scene type.** Switch `kwin_wayland --scene-type=opengl`
   vs `--scene-type=opengl-es` (current default on this hardware). If
   the bug only fires under GLES, that's a strong signal — the
   offending site is in a GLES-only fallback.

By the end of Phase 1 we should have a one-line
`weston-simple-dmabuf-egl -format=NV12 -modifier=…` that triggers the
GL_ALPHA error within seconds, plus a yes/no answer to "does AR24 also
trigger".

## Phase 2 — Identify the call site (1–2 evenings)

The crime scene is somewhere in `kwin/src/scene/*` or
`kwin/src/effects/*`. Suspects, ranked:

- **`SurfaceItemWayland::createPixmapTexture` → `GLTexture::create` with `GL_ALPHA`.**
  This is the most likely path: KWin allocates a fallback per-plane
  texture when the dmabuf import path can't take the buffer whole.
  NV12 has a Y plane (single-channel) and a CbCr plane (two-channel);
  historically the Y plane has been allocated as `GL_ALPHA` in
  software fallbacks. If the EGL dmabuf import returned
  `EGL_BAD_ATTRIBUTE` for `external_only` modifiers and KWin fell
  through to per-plane, this is exactly where it would land.
- **`BlurEffect::initBlurTexture` / `BackgroundContrastEffect::*`.**
  Single-channel noise textures for blur dither. Less likely (these
  fire on every frame regardless of video clients) but listed for
  completeness.
- **Window-decoration text glyph cache.** Qt's QGLTexture historically
  requested `GL_ALPHA` for monochrome glyph atlases. Plasma 6 should
  have moved to `GL_RED` long ago, but a stale code path in a
  third-party theme or systray icon could still hit it.
- **Cursor texture upload via `wl_shm_pool` + ARGB8888.** KWin's
  cursor scene sometimes uploads via `glTexImage2D` — but the format
  there is `GL_RGBA`, not `GL_ALPHA`. Probably not the suspect.

Tooling to identify *which*:

1. **`apitrace trace --api egl kwin_wayland …`** then
   `apitrace dump trace.trace | grep -B5 GL_ALPHA`. Apitrace gives us
   the C++ call stack at the offending site if KWin was built with
   debug symbols.
2. **`MESA_GL_DEBUG=context KWIN_GL_DEBUG=1 kwin_wayland --replace`**
   plus the `glDebugMessageCallback` already installed in KWin's
   `OpenGLBackend` will print the source/type/severity for each
   `GL_INVALID_VALUE`. Whether the file/line in the message includes
   the user-space caller depends on Mesa's debug-extension support; on
   panfrost it usually does include the GL function name and an ID,
   but not the C++ source — that is what apitrace adds.
3. **Build kwin from source** (`extra/kwin` PKGBUILD on Arch ARM,
   patch in `-DDEBUG=ON`, `-DCMAKE_BUILD_TYPE=Debug`) so the call
   stacks resolve to file:line.

## Phase 3 — Write the patch (½ evening once Phase 2 is done)

The Qt 6 fix is three ~3-line changes (Fix #1 to Fix #3), runtime-safe,
no new dependency.
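The Qt call sites all make the same decision: whether an 8-bit single-channel texture is created as `GL_R8`/`GL_RED` or as `GL_ALPHA`. A minimal sketch of that decision in isolation, so the intended truth table is easy to check before reading the patches. `CtxCaps`, `Alpha8Format`, `useR8ForAlpha8` and `alpha8Format` are hypothetical names for illustration, not Qt API. The enum values are copied from the Khronos headers so the block compiles without a GL SDK.

```cpp
#include <cstdint>

// GL enum values as defined in the Khronos headers.
constexpr std::uint32_t kGlRed   = 0x1903; // GL_RED
constexpr std::uint32_t kGlAlpha = 0x1906; // GL_ALPHA
constexpr std::uint32_t kGlR8    = 0x8229; // GL_R8

// Hypothetical stand-in for the context facts each call site consults.
struct CtxCaps {
    bool coreProfile;  // desktop GL core profile
    bool gles;         // context is OpenGL ES
    int  majorVersion; // context major version
};

struct Alpha8Format {
    std::uint32_t internalFormat; // passed as glTexImage2D internalFormat
    std::uint32_t format;         // passed as glTexImage2D format
};

// GL_R8/GL_RED on a core profile and on ES 3+; GL_ALPHA everywhere else
// (ES 2 and desktop compatibility profile), matching the patched logic.
constexpr bool useR8ForAlpha8(const CtxCaps &c) {
    return c.coreProfile || (c.gles && c.majorVersion >= 3);
}

constexpr Alpha8Format alpha8Format(const CtxCaps &c) {
    return useR8ForAlpha8(c) ? Alpha8Format{kGlR8, kGlRed}
                             : Alpha8Format{kGlAlpha, kGlAlpha};
}

// The case from this report: an ES 3.2 context must no longer get GL_ALPHA.
static_assert(alpha8Format(CtxCaps{false, true, 3}).internalFormat == kGlR8,
              "ES 3+ contexts should select GL_R8");
// A true ES 2 context keeps the old behavior.
static_assert(alpha8Format(CtxCaps{false, true, 2}).internalFormat == kGlAlpha,
              "ES 2 contexts should keep GL_ALPHA");
```

The only row of the truth table that changes relative to current Qt behavior is the ES 3+ one; desktop core, desktop compatibility and ES 2 stay exactly as they are.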
**Fix #1 — `src/opengl/qopengltextureglyphcache.cpp` lines 111-117:**

```diff
 #if !QT_CONFIG(opengles2)
     const GLint internalFormat = isCoreProfile() ? GL_R8 : GL_ALPHA;
     const GLenum format = isCoreProfile() ? GL_RED : GL_ALPHA;
 #else
-    const GLint internalFormat = GL_ALPHA;
-    const GLenum format = GL_ALPHA;
+    // OpenGL ES 3.x deprecated GL_ALPHA as a glTexImage2D
+    // internalFormat; only true ES 2 contexts retain it. Use GL_R8
+    // + the matching swizzle (handled in the fragment shader's .r
+    // sample below) on ES 3+ hardware so Mali / panfrost / panthor
+    // GLES3 contexts stop emitting GL_INVALID_VALUE every frame.
+    const bool useR8 = ctx->format().majorVersion() >= 3;
+    const GLint internalFormat = useR8 ? GL_R8 : GL_ALPHA;
+    const GLenum format = useR8 ? GL_RED : GL_ALPHA;
 #endif
```

The downstream fragment shader path that samples this texture must
read `.r` instead of `.a` when `GL_R8` is used. Qt's text-rendering
fragment program already has both code paths conditioned on context
core-profile; the ES 3+ branch needs the same treatment. Lines 214-216
of the same file (the resize / re-upload path) need the identical
change.

**Fix #2 — `src/gui/rhi/qrhigles2.cpp` lines 1373-1378:**

```diff
     case QRhiTexture::RED_OR_ALPHA8:
-        *glintformat = caps.coreProfile ? GL_R8 : GL_ALPHA;
+        *glintformat = (caps.coreProfile || (caps.gles && caps.ctxMajor >= 3))
+                       ? GL_R8 : GL_ALPHA;
         *glsizedintformat = *glintformat;
-        *glformat = caps.coreProfile ? GL_RED : GL_ALPHA;
+        *glformat = (caps.coreProfile || (caps.gles && caps.ctxMajor >= 3))
+                    ? GL_RED : GL_ALPHA;
         *gltype = GL_UNSIGNED_BYTE;
         break;
```

`caps.gles` and `caps.ctxMajor` are populated at context creation
(qrhigles2.cpp:804 + :855); the disjunct is free.

**Fix #3 — `src/opengl/qopengltextureuploader.cpp` lines 253-257:**

This is the QImage→GL upload path (used by `QOpenGLPaintEngineEx` and
its descendants).
Same pattern, same fix shape: extend the "core profile or GLES2
fallback" branching to also consider GLES3+ as needing `GL_R8`.

If we want to be aggressive, we can collapse all three sites onto a
single `qt_gl_use_r8_for_alpha8(ctx)` helper in `qopenglhelper_p.h` so
future Qt versions don't drift apart again — but a minimal patch
should keep the three sites independent so each is reviewable in
isolation by the relevant Qt module owner.

## Phase 4 — Ship and upstream (1 evening)

1. **Local Arch package** as `qt6-base-fourier` under
   `marfrit-packages/arch/qt6-base-fourier/`, sibling to
   chromium-fourier and firefox-fourier. PKGBUILD inherits from
   `extra/qt6-base`, drops in the three patches above, bumps `pkgrel`.
   Same `provides=qt6-base conflicts=qt6-base` pattern. Rebuild is
   heavy (qtbase compile is ~30 minutes on boltzmann; ohm rebuild is
   sustained-fan-territory and probably better avoided — boltzmann
   builds the aarch64 .pkg.tar.zst, then we rsync it to ohm and
   `pacman -U` there).
2. **Validate on ohm** by:
   - `pacman -U` the patched qt6-base.
   - Restart Plasma session (logout / login) so the patched Qt
     libraries are mapped into the fresh kwin_wayland.
   - Re-run `journalctl -u plasma-kwin_wayland.service -f` while
     opening any Qt 6 application that triggers text caching (a
     terminal, kate, the system tray) — the GL_INVALID_VALUE spam
     should be **gone**.
   - Then run chromium-fourier 149-r2 + the bbb sample for a full
     minute uninterrupted. Success = smooth playback through to EOF at
     the 34.7 % CPU number, no stall, no audio static, no KWin-side
     errors in the journal.
3. **Upstream** via:
   - File on `bugreports.qt.io` against `QtBase: OpenGL`, with: the
     three diff hunks above, the exact behavior on Mali-G52 panfrost
     RK3566 mainline 6.19, an excerpt of the journal noise, and mesa
     26.0.5 / qt 6.11.0 / kwin 6.6.4 versions.
   - Push a Gerrit change against `qtbase` `dev` branch
     (`codereview.qt-project.org`). Qt won't accept a GitHub PR — they
     live on Gerrit.
     Create a Qt account, configure `git-review`, push.
   - Reference the chromium-fourier project as the discovery site so
     the next Mali-on-Linux Qt 6 user finds the breadcrumb.
4. **Document** the fix in `chromium-fourier/docs/dmabuf-zero-copy.md`
   "Caveat — KWin 6.6.4 GLES backend on this hardware" subsection:
   replace the "to be investigated" wording with "fixed by
   qt6-base-fourier; see `marfrit-packages/arch/qt6-base-fourier/`.
   Upstream Qt change pending review at ``."

## Reflection — corporate IT spec leakage, as predicted

The user's Phase-1 hypothesis was that this was the result of code
written by people who never read the spec they were claiming to
implement. They were correct, with one nuance: the Qt code did read
the spec — *the OpenGL ES 2.x spec*, where `GL_ALPHA` is genuinely the
canonical single-channel format for `glTexImage2D`. What it never went
back and re-read is the OpenGL ES 3.0 spec (section 3.8.3, "Texture
Image Specification"), where `GL_ALPHA` is moved to the deprecated
list and only sized formats are retained. The bug is: *Qt 6 was
written assuming "OpenGL ES" is one thing, and never updated the
assumption when ES 3 dropped the unsized formats.*

That's a corporate-IT-style architectural shortcut: codify the world
in two boxes (desktop vs ES), call it done, ship. The fact that a
category had a sub-category which moved in 2012 is not the framework's
job to track. Until the bug report arrives and someone has to extend
the boolean to a triple.

## What success looks like

`chromium-fourier-149-r2` on ohm under KWin Wayland plays
`bbb_1080p30_h264.mp4` end-to-end at the 34.7 % CPU figure already
recorded by the architectural validation, with zero `GL_INVALID_VALUE`
in the journal during playback. That number is the goal of the entire
chromium-fourier campaign for RK3566 — it is currently blocked on a
bug that has nothing to do with chromium.

## Scope discipline

We do not turn this into "audit the entire KWin GLES backend."
If Phase 2 surfaces additional latent GL_INVALID_* errors that don't matter for video playback, we note them in the bug report and move on. The pivot is explicitly "remove this single wall so the chromium-fourier patch series can ship a working stack on RK3566."