forked from marfrit/marfrit-packages
KWIN_PIVOT: Phase-4 done (qt6 patches landed) + weston A/B vindicates KWin theory
Today's deltas:
- qt6-base-fourier built, installed, validated on ohm. Static-idle
journal shows zero GL_INVALID_VALUE post-relogin; the Qt 6
GL_ALPHA bug is genuinely fixed.
- chrome v4 under KWin still stalls — at ~6s vs ~3s pre-Qt-fix, so
the GL_ALPHA churn was contributing some load but wasn't the
primary cause.
- Clean A/B with weston: same chrome v4 binary, same panfrost,
same V4L2, same hardware → swapping KWin → weston turns the
stall off entirely. Chrome plays through with elevated CPU
(~96 % vs KWin's ~50 % when it isn't stalled) because weston
falls back to LINEAR composite vs KWin's fast-tile path.
- mpv triangulation:
--vo=null --hwdec=v4l2request: clean (decode only)
--vo=drm --hwdec=v4l2request: 0.7 % drops in 19 s (KMS scanout)
--vo=gpu-next --hwdec=v4l2request under KWin: 76 % drops, slideshow
Decode + display hardware path is fully capable. The wall is
specifically KWin's compositor scheduling/presentation pipeline on
this stack — panfrost ES 3.2 + V4L2 stateless NV12 dmabuf clients.
KWIN_PIVOT.md rewritten:
- Phase 4 (qt6 patch, ship, upstream) marked done.
- New Phase 5 (find the KWin culprit): WAYLAND_DEBUG on chrome +
KWin to capture the missing wl_buffer.release / wp_presentation
exchange around the 6 s stall, plus strace-on-kwin and
effects-disable bisection.
- New Phase 6 (fix and ship): kwin-fourier package pattern, ohm
validation, bugs.kde.org filing.
@@ -1,29 +1,137 @@
# KWin pivot: fix the `glTexImage2D(GL_ALPHA)` stall
# KWin pivot: fix the chrome-on-KWin video stall

> **2026-04-28 update. Phase 2 collapsed onto Phase 1: it's not KWin.**
> Source-grep nailed the offender on the first pass. Real culprit:
> Qt 6's `QOpenGLTextureGlyphCache` (`src/opengl/qopengltextureglyphcache.cpp:111-117`)
> and `QRhiGles2::toGlTextureFormat` (`src/gui/rhi/qrhigles2.cpp:1373-1378`).
> **2026-04-28 update part 2: qt6-base-fourier landed, validated, did
> not fix the chrome stall.** The Qt 6 GL_ALPHA bug (qopengltextureglyphcache.cpp,
> qrhigles2.cpp, qopengltextureuploader.cpp) is real, the patch is
> correct, and the journal noise is gone, but chromium-fourier 149-r4
> playback under KWin still stalls at ~6 seconds (vs ~3 seconds
> pre-patch, so the GL_ALPHA churn was contributing some overhead,
> just not the primary cause). Vindication came from a clean weston
> A/B: same chrome v4 binary, same panfrost mesa, same V4L2 driver,
> same hardware → swapping KWin for weston turns the stall off. Chrome
> plays through under weston (with elevated CPU because weston falls
> back to LINEAR composite). KWin has a *second* bug, structurally
> deeper than the Qt one. **Phase 4 below (write/ship/upstream the Qt
> fix) was completed today.** This document now pivots again to the
> remaining KWin investigation.

> **2026-04-28 update part 1: Phase 2 collapsed onto Phase 1; not KWin
> for the GL_ALPHA part.** Source-grep nailed the offender on the
> first pass. Real culprit: Qt 6's `QOpenGLTextureGlyphCache`
> (`src/opengl/qopengltextureglyphcache.cpp:111-117`) and
> `QRhiGles2::toGlTextureFormat` (`src/gui/rhi/qrhigles2.cpp:1373-1378`).
> KWin's own GL paths use `GL_R8` correctly (`src/opengl/gltexture.cpp:61`,
> `src/scene/shadowitem.cpp:494`). The pivot becomes a **Qt-fourier**
> patch, not a kwin-fourier one. Plan rewritten below; the pre-rewrite
> reproduction/triangulation phases are kept verbatim because they
> still apply to whatever lives downstream of the Qt fix.
>
> Qt's broken logic, in plain English: *"If qtbase was built with
> opengles2, just always use `GL_ALPHA`."* That's correct for an
> OpenGL ES 2.x context. It's wrong for OpenGL ES 3.x, where
> `GL_ALPHA` is no longer a valid `glTexImage2D` internalFormat
> (only sized formats such as `GL_R8`). Mali / panfrost on RK3566
> exposes ES 3.2; KWin requests an ES 3.2 context; Qt picks
> `GL_ALPHA`; mesa returns `GL_INVALID_VALUE`; the texture is
> permanently broken; every dependent draw errors at level 0; the
> compositor's frame-callback path stalls. Affects every Qt 6
> application on Mali-class hardware that ends up rendering text
> through `QOpenGLTextureGlyphCache` (KDE's window decorations,
> Plasma overlays, Qt Quick scenegraph via RHI, ad nauseam). KWin
> just happens to be the most visible victim because it's the
> compositor and its stall takes everyone else down with it.
> `src/scene/shadowitem.cpp:494`). The Qt-fourier patch series shipped
> as `marfrit-packages/arch/qt6-base-fourier/` and was validated on
> ohm: zero `GL_INVALID_VALUE` in a fresh-session journal.

## Triangulation table after today's work

| Path | Result |
|---|---|
| `ffmpeg -hwaccel v4l2request -f null` | ✓ 36 fps, clean |
| `mpv --vo=null --hwdec=v4l2request` | ✓ decode-only, clean |
| `mpv --vo=drm --hwdec=v4l2request` (KMS scanout, no compositor) | ✓ 0.7 % drops in 19 s |
| **chrome v4 under weston** | **✓ plays through; ~96 % CPU** |
| chrome v4 under KWin (post-Qt-fix) | ✗ stall @ ~6 s, ⏸ icon, audio clock advances |
| `mpv --vo=gpu-next --hwdec=v4l2request` under KWin | ✗ ~76 % drops, slideshow |
| **chrome v4 under KWin pre-Qt-fix** | **✗ stall @ ~3 s** (GL_ALPHA spam adds load) |

Decode + display hardware path is fully capable. Wayland *as a
protocol* is fine (weston works). The wall is **specifically KWin's
compositor scheduling and presentation pipeline on this stack**:
panfrost ES 3.2 + V4L2 stateless NV12 dmabuf clients.

The chromium-fourier patch series is correct. The qt6-base-fourier
patch series is correct. The KWin bug is the third independent
problem on this hardware, exposed by the prior two fixes.

## What we know and what we don't

**Known:**
- The stall is *not* in chrome's V4L2VideoDecoder (works under weston).
- The stall is *not* in panfrost's dmabuf import (works under weston).
- The stall is *not* a GL error (no `GL_INVALID_VALUE` after qt6 fix).
- The stall is *not* a thread parked in a `vb2`/`v4l2`/`dma_fence` wchan
  (kwin/chrome/audio threads all sit in `futex_do_wait` /
  `poll_schedule_timeout` / `unix_stream_read_generic`).
- The stall *does* idle the audio output socket (renderer audio
  thread blocks reading from the audio service unix socket → audio
  drains the last ALSA buffer, then static, then silence).
- The stall *does* leave the `<video>` element's currentTime
  advancing (audio clock keeps running) while no new frames present.
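
The wchan observation in the list above can be re-checked during a stall with a small script. This is a minimal sketch, not the procedure actually used: `thread_wchans` and `suspicious` are hypothetical helper names, and it assumes the kernel exposes `/proc/<pid>/task/<tid>/wchan` and that the caller is allowed to read it for the target process.

```python
from pathlib import Path


def thread_wchans(pid, proc_root="/proc"):
    """Return {tid: wchan} for every thread of `pid`.

    Reads <proc_root>/<pid>/task/<tid>/wchan. A value of "0" means the
    thread was not sleeping in the kernel when it was sampled.
    """
    result = {}
    task_dir = Path(proc_root) / str(pid) / "task"
    for entry in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        try:
            result[int(entry.name)] = (entry / "wchan").read_text().strip() or "0"
        except OSError:
            continue  # thread exited between listing and reading
    return result


def suspicious(wchans, needles=("vb2", "v4l2", "dma_fence")):
    """Keep only threads whose wait channel name contains one of `needles`."""
    return {tid: w for tid, w in wchans.items() if any(n in w for n in needles)}
```

Calling `suspicious(thread_wchans(kwin_pid))` for the kwin, chrome, and audio-service PIDs while the video is frozen should come back empty if the observation above still holds.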

**Unknown (where the next investigation should bisect):**
- Is KWin failing to deliver `wl_buffer.release` events promptly?
  Symptom would be: chrome holds dmabuf references → V4L2's 6-buffer
  CAPTURE pool exhausts → decoder blocks. Test by tracing
  `wl_buffer.release` events on the chrome wl_display via a wayland
  proxy / debugger.
- Is KWin failing to deliver `wp_presentation_feedback`? Chrome's
  viz compositor uses presentation feedback to pace; if it never
  fires, the renderer's frame submission backs up.
- Is KWin's render loop blocking on a vsync wait that never
  releases? The 6-second figure roughly matches multiples of
  panfrost's idle-power-state wakeup latency, but that's a
  speculative correlation.
- Is KWin doing something specific with `wl_subsurface` that chrome's
  multi-process IPC tickles in a way mpv's single-process flow does
  not? Chrome attaches video to a subsurface inside its main
  surface; mpv attaches video to its main surface directly.
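
The first hypothesis (late or missing `wl_buffer.release` draining the 6-buffer CAPTURE pool) can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a model of chrome or KWin: one buffer is consumed per decoded frame, every buffer is released a fixed number of frames after it is handed over, and `first_blocked_frame` is a hypothetical name.

```python
def first_blocked_frame(pool_size, release_lag, total_frames=1000):
    """Return the first frame index at which the decoder has no free buffer.

    pool_size    -- number of decoder output buffers (6 in the text above)
    release_lag  -- frames between handing a buffer to the compositor and
                    getting it back; None means it is never released
    Returns None if the decoder never has to wait within total_frames.
    """
    free = pool_size
    in_flight = []  # frame index at which each held buffer comes back
    for frame in range(total_frames):
        # Return every buffer whose release is due by this frame.
        due = [t for t in in_flight if t is not None and t <= frame]
        free += len(due)
        in_flight = [t for t in in_flight if t is None or t > frame]
        if free == 0:
            return frame  # nothing to decode into
        free -= 1
        in_flight.append(None if release_lag is None else frame + release_lag)
    return None
```

In this model `first_blocked_frame(6, 2)` never blocks, while `first_blocked_frame(6, None)` blocks at frame 6. The model only says how quickly the pool drains once releases stop; it says nothing about why they would stop around the 6 s mark.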

## Phase 5: Find the KWin culprit

In order of cheapness:

1. **Strace KWin during a chrome stall.** What is KWin actually
   doing at the moment of stall? (`strace -p <kwin_pid> -f -e trace=poll,read,write,recvmsg,sendmsg -o /tmp/kwin.strace` for ~10 s during the chrome run, then chase a flatlined fd.)
2. **WAYLAND_DEBUG=client + WAYLAND_DEBUG=server.** Run chrome with
   `WAYLAND_DEBUG=client`, capture all wayland traffic, look for
   the last `wl_surface.attach` chrome sent before the stall and
   whether KWin ever emitted the corresponding `wl_buffer.release`.
   This single experiment probably localizes the bug.
3. **Disable KWin effects, scene type, blur, contrast, animation
   speed.** `kcmshell6 kwincompositing` toggles. If turning off
   *all* effects unsticks chrome, the bug is in an effect plugin.
4. **Bisect KWin git.** v6.6.4 vs v6.5.x or earlier: does the
   stall reproduce on v6.5? If not, bisect the offending commit.
   (Heavy: KWin builds for ~1 h on boltzmann.)
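
For step 1, "chase a flatlined fd" can be partly automated. The sketch below is an assumption-laden helper, not part of the plan above: `last_activity` and `quiet_fds` are hypothetical names, and the regex assumes the line shape written by `strace -f -o FILE` without timestamp options (an optional PID column, then `syscall(fd, ...`). It ignores `poll` lines and resumed calls, and it cannot tell when an fd number has been closed and reused.

```python
import re

# Matches e.g. "1234 recvmsg(45, {...}, 0) = 64". The PID column is optional
# so that single-process traces parse as well.
_CALL = re.compile(r"^(?:(\d+)\s+)?(read|write|recvmsg|sendmsg)\((\d+),")


def last_activity(lines):
    """Map fd -> (last line number seen, number of calls) for fd-based I/O."""
    seen = {}
    for lineno, line in enumerate(lines, start=1):
        m = _CALL.match(line)
        if not m:
            continue  # poll(), resumed calls, signals, exit lines
        fd = int(m.group(3))
        _, count = seen.get(fd, (0, 0))
        seen[fd] = (lineno, count + 1)
    return seen


def quiet_fds(lines, tail_fraction=0.5):
    """Fds that saw traffic early in the trace but none in the final part."""
    lines = list(lines)
    cutoff = len(lines) * (1 - tail_fraction)
    return sorted(fd for fd, (last, _) in last_activity(lines).items() if last <= cutoff)
```

Running `quiet_fds(open("/tmp/kwin.strace"))` on a trace that spans the stall gives a shortlist of fd numbers; each still has to be mapped back to a socket or device by hand, for example through `/proc/<kwin_pid>/fd`.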

Step (2) is the headliner. WAYLAND_DEBUG yields a complete
client-server transcript; the missing event after the stall is
usually the bug, and the exchange around it is the call site.
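
Looking for the missing event can be sketched as a small transcript filter. This is a hedged example rather than a finished tool: `unreleased_buffers` is a hypothetical name, and the regexes assume the usual libwayland trace shape (a `[timestamp]` prefix, `->` for client requests, objects printed as `interface@id` or, in newer libwayland, `interface#id`). It does not model `wl_surface.commit`, object-ID reuse, or the fact that the most recently attached buffer is legitimately still held by the compositor.

```python
import re

_TS = re.compile(r"^\[\s*(\d+\.\d+)\]")
_ATTACH = re.compile(r"->\s+wl_surface[@#](\d+)\.attach\(wl_buffer[@#](\d+)")
_RELEASE = re.compile(r"wl_buffer[@#](\d+)\.release\(")


def unreleased_buffers(lines):
    """Return {buffer_id: (surface_id, attach_time)} for buffers never released.

    attach_time is the trace timestamp of the last attach, or None if the
    line carried no timestamp.
    """
    pending = {}
    for line in lines:
        ts = _TS.match(line)
        when = float(ts.group(1)) if ts else None
        m = _ATTACH.search(line)
        if m:
            pending[int(m.group(2))] = (int(m.group(1)), when)
            continue
        m = _RELEASE.search(line)
        if m:
            pending.pop(int(m.group(1)), None)
    return pending
```

libwayland writes the trace to stderr, so the input would be chrome's captured stderr from a `WAYLAND_DEBUG=client` run. If the result contains several buffers on the video surface whose attach times all sit just before the stall, that would point at the `wl_buffer.release` hypothesis; a single leftover buffer is expected and proves nothing.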

## Phase 6: Fix and ship

Once we know the call site:

- Write the patch against `kwin/master`, smallest possible diff.
- Package it locally for Arch as `kwin-fourier` under
  `marfrit-packages/arch/kwin-fourier/`, following the
  `qt6-base-fourier` pattern. Bump epoch to dominate Arch's version.
- Validate on ohm: chrome v4 + bbb sample plays through to EOF at
  the 34.7 % CPU number (KWin's fast-tile path is more efficient
  than weston's LINEAR fallback, so once unstuck, KWin should beat
  weston on CPU).
- File on bugs.kde.org with the WAYLAND_DEBUG transcript, the
  weston-vs-KWin A/B table from this document, and the patch.
- Push MR to invent.kde.org/plasma/kwin against `master`.

## Reflection: the spec-shaped void

The user's original Phase-1 hypothesis was "leaked corporate
short-cuts." Today vindicates it twice over:

- **Qt 6**: codified "OpenGL ES" as one thing in 2012, never
  re-read the ES 3 spec when GL_ALPHA was deprecated. We patched
  it. Three sites, ~9 lines of net change.
- **KWin**: still TBD which spec-or-spec-shaped-omission it tripped
  on. The data so far points at a compositor scheduling issue,
  most likely a missing or late `wl_buffer.release` /
  `wp_presentation_feedback` for the specific case of multiple
  IPC-fragmented client surfaces driving a fast video stream.
  That's exactly the kind of scenario that gets "we never tested
  this combination" treatment in QA.

## What we know