kwin-fourier: bypass watchDmaBuf implicit-sync fence wait (experiment)
Hypothesis under test: KWin's Transaction::watchDmaBuf calls DMA_BUF_IOCTL_EXPORT_SYNC_FILE on every plane of every imported dmabuf and parks the transaction on a QSocketNotifier(POLLIN) waiting for that sync_file. On V4L2 hantro CAPTURE buffers (RK3566 mainline 6.19, panfrost mesa 26.0.5) the resulting fence either never signals or signals so late that chrome's 6-buffer V4L2 capture pool is exhausted at ~6s, hard-stalling the decoder. mpv with gpu-next degrades to a slideshow at a 76% drop rate. A weston A/B run with the same chrome v4 binary plays through cleanly, so KWin's watchDmaBuf is the suspect.

This experiment patches watchDmaBuf to a no-op. Wayland clients are required by spec to ensure buffer contents are complete before wl_surface.attach+commit, so the fence wait is a defensive optimization for misbehaving clients, not a correctness primitive.

If chrome plays through end-to-end at the recorded 34.7% combined CPU number with this patched KWin, the hypothesis is confirmed and the upstream fix can be refined (timeout, V4L2-source skip, or use the dmabuf fd directly in the QSocketNotifier instead of an extra exported sync_file). KWIN_PIVOT.md (in chromium-fourier/) carries the discovery thread.
@@ -0,0 +1,89 @@
From: Markus Fritsche <mfritsche@reauktion.de>
Subject: [PATCH] transaction: bypass watchDmaBuf implicit-sync fence wait
Date: 2026-04-28

Background
----------
KWin's `Transaction::watchDmaBuf` (src/wayland/transaction.cpp) calls
`DMA_BUF_IOCTL_EXPORT_SYNC_FILE` on every plane of every imported
dmabuf whose fd is not already readable and parks the transaction on
a `QSocketNotifier(POLLIN)` waiting for the resulting sync_file fd to
become readable. The intent is correct in principle (wait for the
producer to finish writing before sampling), but on V4L2 hantro
CAPTURE buffers (RK3566 mainline 6.19, panfrost mesa 26.0.5) the
resulting fence either signals so late that chrome's 6-buffer V4L2
capture pool is exhausted, or never signals at all. Symptom (per
chromium-fourier KWIN_PIVOT.md):

- chrome v4 attaches a video frame to a wl_subsurface, commits
- KWin's Transaction::commit calls watchDmaBuf, exports a sync_file,
  parks on QSocketNotifier
- sync_file never becomes readable
- Transaction never applies; old surface state never replaced
- wl_buffer.release for the previous video buffer never sent
- chrome's V4L2 capture pool starves at ~6 seconds, decoder blocks,
  audio drains, hard stall

mpv with `--vo=gpu-next` on the same KWin session degrades to a
slideshow at a 76% drop rate but does not deadlock: its single-surface
attach pattern hits a different transaction shape than chrome's
subsurface flow.

A clean weston A/B with the same chrome v4 binary plays through
end-to-end: the bug is specifically KWin's transaction fence-wait
path on this stack, not Wayland-as-a-protocol.

Fix
---
This experimental patch no-ops `watchDmaBuf` to test the hypothesis.
Implicit-sync correctness in this case is not lost: the V4L2
producer guarantees the buffer's contents are complete before
chrome sends `wl_surface.attach + commit`, and the
zwp_linux_dmabuf_v1 client is required to do so by spec. The
fence-wait was a defensive optimization for misbehaving clients,
not a correctness primitive.

If chrome plays through end-to-end at the recorded 34.7% combined
CPU number under KWin with this patch, the hypothesis is confirmed
and the upstream fix can be refined (timeout, V4L2-source skip, or
use the dmabuf fd directly in the QSocketNotifier instead of an
extra exported sync_file).

diff --git a/src/wayland/transaction.cpp b/src/wayland/transaction.cpp
index 967b22b..e3fbc06 100644
--- a/src/wayland/transaction.cpp
+++ b/src/wayland/transaction.cpp
@@ -263,27 +263,18 @@ static FileDescriptor exportWaitSyncFile(const FileDescriptor &fileDescriptor)
     return FileDescriptor{};
 }
 #endif
 
 void Transaction::watchDmaBuf(TransactionEntry *entry)
 {
-#if defined(Q_OS_LINUX)
-    const DmaBufAttributes *attributes = entry->buffer->dmabufAttributes();
-    if (!attributes) {
-        return;
-    }
-
-    for (int i = 0; i < attributes->planeCount; ++i) {
-        const FileDescriptor &fileDescriptor = attributes->fd[i];
-        if (fileDescriptor.isReadable()) {
-            continue;
-        }
-
-        auto syncFile = exportWaitSyncFile(fileDescriptor);
-        if (syncFile.isValid()) {
-            entry->fences.emplace_back(std::make_unique<TransactionFence>(this, std::move(syncFile)));
-        }
-    }
-#endif
+    // kwin-fourier: no-op the implicit-sync fence wait. On V4L2
+    // hantro CAPTURE buffers (RK3566 mainline 6.19, panfrost mesa
+    // 26.0.5) the DMA_BUF_IOCTL_EXPORT_SYNC_FILE fence either never
+    // signals or signals so late that chrome's V4L2 capture pool
+    // exhausts at ~6s, hard-stalling the decoder. Wayland clients
+    // are required by spec to ensure the buffer's contents are
+    // complete before wl_surface.attach+commit, so this fence-wait
+    // is a belt-and-braces optimization, not a correctness primitive.
+    Q_UNUSED(entry);
 }
 
 } // namespace KWin
@@ -0,0 +1,125 @@
# Maintainer: Markus Fritsche <mfritsche@reauktion.de>
# Upstream maintainers: Felix Yan, Antonio Rojas
# Contributor: Andrea Scarpino <andrea@archlinux.org>
#
# kwin-fourier: KWin 6.6.4 with the V4L2-stateless implicit-sync
# transaction wait bypass. Hypothesis: KWin's
# `Transaction::watchDmaBuf` calls DMA_BUF_IOCTL_EXPORT_SYNC_FILE on
# every plane of every imported dmabuf and parks the transaction on a
# QSocketNotifier waiting for the resulting sync_file fd to become
# readable. For V4L2 hantro CAPTURE buffers on RK3566 mainline 6.19,
# that fence either never signals or signals so late that chrome's
# 6-buffer V4L2 capture pool exhausts at ~6 seconds, blocking the
# decoder. mpv (single-surface attach pattern) merely slideshows
# under KWin (76% drop rate); chrome (subsurface attach) deadlocks.
#
# This experimental build no-ops `watchDmaBuf` to test the
# hypothesis. If chrome plays through end-to-end at the recorded
# 34.7% CPU number, the bug is confirmed and the upstream fix can be
# refined (e.g., short timeout, skip-on-V4L2, or use the dmabuf fd
# directly without exporting an extra sync_file). See
# ../chromium-fourier/KWIN_PIVOT.md for the full diagnosis thread.

pkgname=kwin
pkgver=6.6.4
_dirver=$(echo $pkgver | cut -d. -f1-3)
pkgrel=1
epoch=1
arch=(aarch64 x86_64)
url='https://kde.org/plasma-desktop/'
license=(LGPL-2.0-or-later)
depends=(aurorae
         breeze
         gcc-libs
         glibc
         iio-sensor-proxy
         plasma-activities
         kauth
         kcmutils
         kcolorscheme
         kconfig
         kcoreaddons
         kcrash
         kdbusaddons
         kdeclarative
         kdecoration
         kglobalaccel
         kglobalacceld
         kguiaddons
         ki18n
         kidletime
         kirigami
         kitemmodels
         knewstuff
         knighttime
         knotifications
         kpackage
         kquickcharts
         kscreenlocker
         kservice
         ksvg
         kwayland
         kwidgetsaddons
         kwindowsystem
         kxmlgui
         lcms2
         libcanberra
         libdisplay-info
         libdrm
         libei
         libepoxy
         libevdev
         libinput
         libpipewire
         libqaccessibilityclient-qt6
         libxcb
         libxcvt
         libxkbcommon
         mesa
         milou
         pipewire-session-manager
         libplasma
         qt6-5compat
         qt6-base
         qt6-declarative
         qt6-svg
         qt6-tools
         systemd-libs
         wayland
         xcb-util-keysyms
         xcb-util-wm)
makedepends=(extra-cmake-modules
             kdoctools
             krunner
             plasma-wayland-protocols
             python
             wayland-protocols
             xorg-xwayland)
optdepends=('plasma-keyboard: virtual keyboard')
groups=(plasma)
source=(https://download.kde.org/stable/plasma/$_dirver/$pkgname-$pkgver.tar.xz{,.sig}
        0001-transaction-bypass-watchDmaBuf-fence-wait.patch)
sha256sums=('3f9439760580a977d018daf4b35b62e5a1700def7b21c8dfbfc789d21378d7ad'
            'SKIP'
            'SKIP')
validpgpkeys=('E0A3EB202F8E57528E13E72FD7574483BB57B18D'  # Jonathan Esk-Riddell <jr@jriddell.org>
              '0AAC775BB6437A8D9AF7A3ACFE0784117FBCE11D'  # Bhushan Shah <bshah@kde.org>
              'D07BD8662C56CB291B316EB2F5675605C74E02CF'  # David Edmundson <davidedmundson@kde.org>
              '90A968ACA84537CC27B99EAF2C8DF587A6D4AAC1'  # Nicolas Fella <nicolas.fella@kde.org>
              '1FA881591C26B276D7A5518EEAAF29B42A678C20') # Marco Martin <notmart@gmail.com>

prepare() {
  patch -d $pkgname-$pkgver -p1 < 0001-transaction-bypass-watchDmaBuf-fence-wait.patch
}

build() {
  cmake -B build -S $pkgname-$pkgver \
    -DCMAKE_INSTALL_LIBEXECDIR=lib \
    -DBUILD_TESTING=OFF
  cmake --build build
}

package() {
  DESTDIR="$pkgdir" cmake --install build
  setcap CAP_SYS_NICE=+ep "$pkgdir"/usr/bin/kwin_wayland
}