kwin-fourier: bypass watchDmaBuf implicit-sync fence wait (experiment)
build and publish packages / distcc-avahi-aarch64 (push) Successful in 35s
build and publish packages / lmcp-any (push) Successful in 7s
build and publish packages / lmcp-debian (push) Successful in 6s
build and publish packages / claude-his-any (push) Successful in 7s
build and publish packages / ffmpeg-v4l2-request-aarch64 (push) Successful in 12m2s
build and publish packages / claude-his-debian (push) Successful in 9s
Hypothesis under test: KWin's Transaction::watchDmaBuf calls DMA_BUF_IOCTL_EXPORT_SYNC_FILE on every plane of every imported dmabuf and parks the transaction on a QSocketNotifier(POLLIN) until that sync_file becomes readable. On V4L2 hantro CAPTURE buffers (RK3566 mainline 6.19, panfrost mesa 26.0.5) the resulting fence either never signals or signals so late that chrome's 6-buffer V4L2 capture pool exhausts at ~6s, hard-stalling the decoder. mpv with gpu-next slideshows at a 76% drop rate. A weston A/B run with the same chrome v4 binary plays through clean, which makes KWin's watchDmaBuf the suspect.

This experiment patches watchDmaBuf into a no-op. Wayland clients are required by spec to ensure buffer contents are complete before wl_surface.attach+commit, so the fence-wait is a defensive optimization for misbehaving clients, not a correctness primitive.

If chrome plays through end-to-end at the recorded 34.7% combined CPU number with this patched KWin, the bug is confirmed and the upstream fix can be refined (timeout, V4L2-source skip, or use the dmabuf fd directly in the QSocketNotifier instead of an extra exported sync_file).

KWIN_PIVOT.md (in chromium-fourier/) carries the discovery thread.
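As an illustration of the "timeout" refinement mentioned above, here is a minimal sketch of a bounded wait on a fence fd. It is not KWin code: `waitForFence` and `FenceWait` are hypothetical names, and the blocking poll(2) call stands in for KWin's event-loop based QSocketNotifier. It relies only on the fact that a sync_file fd polls readable (POLLIN) once its fence has signaled.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <unistd.h>

// Hypothetical result type for this sketch; not part of KWin.
enum class FenceWait { Signaled, TimedOut, Error };

// Wait until fenceFd becomes readable or timeoutMs elapses.
// Assumption: the caller treats TimedOut as "apply the transaction
// anyway" rather than parking it forever.
FenceWait waitForFence(int fenceFd, int timeoutMs)
{
    assert(timeoutMs >= 0);

    pollfd pfd{};
    pfd.fd = fenceFd;
    pfd.events = POLLIN;

    int ret;
    do {
        // Note: a retry after EINTR restarts the full timeout.
        ret = poll(&pfd, 1, timeoutMs);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0) {
        return FenceWait::Error;
    }
    if (ret == 0) {
        return FenceWait::TimedOut;
    }
    // POLLERR, POLLHUP or POLLNVAL without POLLIN is reported as an error.
    return (pfd.revents & POLLIN) ? FenceWait::Signaled : FenceWait::Error;
}
```

Inside a compositor the same idea would more plausibly be a timer armed next to the existing notifier, so the main loop never blocks; the blocking form is used here only to keep the example self-contained.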
@@ -0,0 +1,89 @@
From: Markus Fritsche <mfritsche@reauktion.de>
Subject: [PATCH] transaction: bypass watchDmaBuf implicit-sync fence wait
Date: 2026-04-28

Background
----------
KWin's `Transaction::watchDmaBuf` (src/wayland/transaction.cpp) calls
`DMA_BUF_IOCTL_EXPORT_SYNC_FILE` on every plane of every imported
dmabuf and parks the transaction on a `QSocketNotifier(POLLIN)`
waiting for the resulting sync_file fd to become readable. The intent
is correct in principle (wait for the producer to finish writing
before sampling), but on V4L2 hantro CAPTURE buffers (RK3566 mainline
6.19, panfrost mesa 26.0.5) the resulting fence either signals so
late that chrome's 6-buffer V4L2 capture pool exhausts, or never
signals at all. Symptom (per chromium-fourier KWIN_PIVOT.md):

- chrome v4 attaches a video frame to a wp_subsurface, commits
- KWin's Transaction::commit calls watchDmaBuf, exports a sync_file,
  parks on QSocketNotifier
- Sync_file never becomes readable
- Transaction never applies; old surface state never replaced
- wl_buffer.release for the previous video buffer never sent
- chrome's V4L2 capture pool starves at ~6 seconds, decoder blocks,
  audio drains, hard stall

mpv with `--vo=gpu-next` on the same KWin session slideshows at 76%
drop rate but does not deadlock: its single-surface attach pattern
hits a different transaction shape than chrome's subsurface flow.

A clean weston A/B with the same chrome v4 binary plays through
end-to-end: the bug is specifically KWin's transaction fence-wait
path on this stack, not Wayland-as-a-protocol.

Fix
---
This experimental patch no-ops `watchDmaBuf` to test the hypothesis.
Implicit-sync correctness in this case is not lost: the V4L2
producer guarantees the buffer's contents are complete before
chrome sends `wl_surface.attach + commit`, and the wp_linux_dmabuf
client is required to do so by spec. The fence-wait was a defensive
optimization for misbehaving clients, not a correctness primitive.

If chrome plays through end-to-end at the recorded 34.7% combined
CPU number under KWin with this patch, the bug is confirmed and the
upstream fix can be refined (timeout, V4L2-source skip, or use the
dmabuf fd directly in the QSocketNotifier instead of an extra
exported sync_file).

diff --git a/src/wayland/transaction.cpp b/src/wayland/transaction.cpp
index 967b22b..e3fbc06 100644
--- a/src/wayland/transaction.cpp
+++ b/src/wayland/transaction.cpp
@@ -263,27 +263,18 @@ static FileDescriptor exportWaitSyncFile(const FileDescriptor &fileDescriptor)
     return FileDescriptor{};
 }
 #endif
 
 void Transaction::watchDmaBuf(TransactionEntry *entry)
 {
-#if defined(Q_OS_LINUX)
-    const DmaBufAttributes *attributes = entry->buffer->dmabufAttributes();
-    if (!attributes) {
-        return;
-    }
-
-    for (int i = 0; i < attributes->planeCount; ++i) {
-        const FileDescriptor &fileDescriptor = attributes->fd[i];
-        if (fileDescriptor.isReadable()) {
-            continue;
-        }
-
-        auto syncFile = exportWaitSyncFile(fileDescriptor);
-        if (syncFile.isValid()) {
-            entry->fences.emplace_back(std::make_unique<TransactionFence>(this, std::move(syncFile)));
-        }
-    }
-#endif
+    // kwin-fourier: no-op the implicit-sync fence wait. On V4L2
+    // hantro CAPTURE buffers (RK3566 mainline 6.19, panfrost mesa
+    // 26.0.5) the DMA_BUF_IOCTL_EXPORT_SYNC_FILE fence either never
+    // signals or signals so late that chrome's V4L2 capture pool
+    // exhausts at ~6s, hard-stalling the decoder. Wayland clients
+    // are required by spec to ensure the buffer's contents are
+    // complete before wl_surface.attach+commit, so this fence-wait
+    // is a belt-and-braces optimization, not a correctness primitive.
+    Q_UNUSED(entry);
 }
 
 } // namespace KWin
@@ -0,0 +1,125 @@
# Maintainer: Markus Fritsche <mfritsche@reauktion.de>
# Upstream maintainers: Felix Yan, Antonio Rojas
# Contributor: Andrea Scarpino <andrea@archlinux.org>
#
# kwin-fourier: KWin 6.6.4 with the V4L2-stateless implicit-sync
# transaction wait bypass. Hypothesis: KWin's
# `Transaction::watchDmaBuf` calls DMA_BUF_IOCTL_EXPORT_SYNC_FILE on
# every plane of every imported dmabuf and parks the transaction on a
# QSocketNotifier waiting for the resulting sync_file fd to become
# readable. For V4L2 hantro CAPTURE buffers on RK3566 mainline 6.19,
# that fence either never signals or signals so late that chrome's
# 6-buffer V4L2 capture pool exhausts at ~6 seconds, blocking the
# decoder. mpv (single-surface attach pattern) merely slideshows
# under KWin (76% drop rate); chrome (subsurface attach) deadlocks.
#
# This experimental build no-ops `watchDmaBuf` to test the
# hypothesis. If chrome plays through end-to-end at the recorded
# 34.7% CPU number, the bug is confirmed and the upstream fix can be
# refined (e.g., short timeout, skip-on-V4L2, or use the dmabuf fd
# directly without exporting an extra sync_file). See
# ../chromium-fourier/KWIN_PIVOT.md for the full diagnosis thread.

pkgname=kwin
pkgver=6.6.4
_dirver=$(echo $pkgver | cut -d. -f1-3)
pkgrel=1
epoch=1
arch=(aarch64 x86_64)
url='https://kde.org/plasma-desktop/'
license=(LGPL-2.0-or-later)
depends=(aurorae
         breeze
         gcc-libs
         glibc
         iio-sensor-proxy
         plasma-activities
         kauth
         kcmutils
         kcolorscheme
         kconfig
         kcoreaddons
         kcrash
         kdbusaddons
         kdeclarative
         kdecoration
         kglobalaccel
         kglobalacceld
         kguiaddons
         ki18n
         kidletime
         kirigami
         kitemmodels
         knewstuff
         knighttime
         knotifications
         kpackage
         kquickcharts
         kscreenlocker
         kservice
         ksvg
         kwayland
         kwidgetsaddons
         kwindowsystem
         kxmlgui
         lcms2
         libcanberra
         libdisplay-info
         libdrm
         libei
         libepoxy
         libevdev
         libinput
         libpipewire
         libqaccessibilityclient-qt6
         libxcb
         libxcvt
         libxkbcommon
         mesa
         milou
         pipewire-session-manager
         libplasma
         qt6-5compat
         qt6-base
         qt6-declarative
         qt6-svg
         qt6-tools
         systemd-libs
         wayland
         xcb-util-keysyms
         xcb-util-wm)
makedepends=(extra-cmake-modules
             kdoctools
             krunner
             plasma-wayland-protocols
             python
             wayland-protocols
             xorg-xwayland)
optdepends=('plasma-keyboard: virtual keyboard')
groups=(plasma)
source=(https://download.kde.org/stable/plasma/$_dirver/$pkgname-$pkgver.tar.xz{,.sig}
        0001-transaction-bypass-watchDmaBuf-fence-wait.patch)
sha256sums=('3f9439760580a977d018daf4b35b62e5a1700def7b21c8dfbfc789d21378d7ad'
            'SKIP'
            'SKIP')
validpgpkeys=('E0A3EB202F8E57528E13E72FD7574483BB57B18D' # Jonathan Esk-Riddell <jr@jriddell.org>
              '0AAC775BB6437A8D9AF7A3ACFE0784117FBCE11D' # Bhushan Shah <bshah@kde.org>
              'D07BD8662C56CB291B316EB2F5675605C74E02CF' # David Edmundson <davidedmundson@kde.org>
              '90A968ACA84537CC27B99EAF2C8DF587A6D4AAC1' # Nicolas Fella <nicolas.fella@kde.org>
              '1FA881591C26B276D7A5518EEAAF29B42A678C20') # Marco Martin <notmart@gmail.com>

prepare() {
  patch -d $pkgname-$pkgver -p1 < 0001-transaction-bypass-watchDmaBuf-fence-wait.patch
}

build() {
  cmake -B build -S $pkgname-$pkgver \
    -DCMAKE_INSTALL_LIBEXECDIR=lib \
    -DBUILD_TESTING=OFF
  cmake --build build
}

package() {
  DESTDIR="$pkgdir" cmake --install build
  setcap CAP_SYS_NICE=+ep "$pkgdir"/usr/bin/kwin_wayland
}