Files
kernel-agent/patches/subsystem/media/videobuf2/dma-resv-release-fence/0004-media-videobuf2-add-opt-in-dma_resv-producer-fence-h.patch
T
claude-noether 18da673ccc phase 1: promote vb2_dma_resv RFC v2 + add ka-status + ampere as 2nd aarch64 host
Three changes that together flip kernel-agent from spec'd to operational
in the manual-orchestrated form. Real ka-* CLI verbs come in later phases;
this commit gets a first iteration through the pipeline and proves the
flow at the artifact level.

1. Promote vb2_dma_resv RFC v2 series into the scope-tagged tree

Markus iterated v2 locally on boltzmann (kernel-agent-bootstrap dir,
reaching linux-fresnel-fourier pkgrel=14). v2 attaches the producer
fence at device_run in slept-OK context per Dufresne's v1 review on
linux-media. The three patches land under
patches/subsystem/media/videobuf2/dma-resv-release-fence/:

  - 0004 (helper) — opt-in vb2 dma_resv producer-fence helper
  - 0005 (driver opt-in) — hantro device_run attach
  - 0006 (driver opt-in) — rockchip-rga device_run attach

Numbered 4/5/6 because the fresnel build PKGBUILD applies them after
the three 0001/0002/0003 PBP DTS patches; this directory's numbering
follows that apply-order, not the upstream lore series numbering.

README at the scope dir documents fleet eligibility, decision history,
and the v1 → v2 design pivot.

2. Update fleet/fresnel.yaml to include the v2 series

Pre-v2 manifest had a comment block 'Explicitly NOT included … vb2
dma-resv-release-fence … defer until v2 lands'. v2 has landed. Move
those three lines from 'excluded' to 'includes', annotate the decision
inline.

3. README updates

- Build hosts table: add ampere (CoolPi GenBook, RK3588 32GB) as
  secondary aarch64 host. Same uarch as boltzmann, on-demand wake via
  His. Gives the fleet a second native build target for when boltzmann
  is busy (e.g. carrying a firefox-fourier 4h build).
- 'Out of scope this round' bootstrap section: mark vb2_dma_resv as
  resolved 2026-05-15, keep panfrost IOMMU_CACHE deferred.

4. First ka-* CLI verb implemented: bin/ka-status

bash, ~120 lines. Reads fleet/*.yaml manifests, queries Gitea for
open [ka:*] issues, probes each reachable host for the installed
kernel-package version. Read-only — no sudo, no host writes. Picks
GITEA_TOKEN from /opt/herding/etc/claude-identities/<host>.creds or
env override.

Proves the agent's Gitea-API + manifest-parsing skeleton works
end-to-end without committing to a full ka-promote/build/install
implementation. Smoke-tested locally:

  $ bin/ka-status
  kernel-agent status (repo: marfrit/kernel-agent)
  open [ka:*] issues total: 1
  ══ fresnel ══
    manifest: arch=arm64 soc=rockchip/rk3399 board=pinebook-pro
    package:  linux-fresnel-fourier
    installed: host-down            # (fresnel is currently powered off)
    open ka-issues: (none for this host)

No PKGBUILD update in this PR — that lives in marfrit-packages and
ships as a sibling PR (the actual linux-fresnel-fourier-7.0-14 publish).
2026-05-15 15:32:00 +00:00

357 lines
13 KiB
Diff

From a202de1646d4c8f8ee2ebc2e4c100b621975754a Mon Sep 17 00:00:00 2001
In-Reply-To: <20260429195306.239666-1-mfritsche@reauktion.de>
References: <20260429195306.239666-1-mfritsche@reauktion.de>
From: Markus Fritsche <mfritsche@reauktion.de>
Date: Sat, 9 May 2026 16:16:07 +0200
Subject: [PATCH RFC v2] media: videobuf2: add opt-in dma_resv producer fence
helper
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
V4L2 producers historically don't propagate buffer-state-done into
the dmabuf's dma_resv exclusive fence. Userspace consumers that
import V4L2-produced dmabufs and wait on the dmabuf's implicit-sync
fence (poll(POLLIN), DMA_BUF_IOCTL_EXPORT_SYNC_FILE,
EGL_LINUX_DMA_BUF_EXT) currently see either zero fences or a stub
fence from dma_fence_get_stub(). This is correct by accident for the
common DQBUF-then-import case but represents a contract gap that
breaks Wayland compositors importing CAPTURE buffers from a stateless
H.264 decoder under continuous playback on implicit-sync GPU stacks
(observed on RK3566 + hantro VPU + Mali-G52 panfrost; manifests as
green frames -- BT.709 limited-range YUV(0,0,0) -> RGB(0,77,0) -- when
the GPU samples the dmabuf before the producer's decode completes).
Add an opt-in API gated by both a per-driver runtime flag
(vb2_queue::supports_release_fences) and a Kconfig
(CONFIG_VIDEOBUF2_RELEASE_FENCES, default n) that lets producers
populate a real dma_resv exclusive write fence on the dmabufs they
export. Drivers call vb2_buffer_attach_release_fence(vb) at a
finite-time-fenced point in their pipeline (typically m2m
device_run, just before the HW kick); vb2_buffer_done() signals and
puts the fence as part of its state transition.
The publish and signal paths are wrapped in
dma_fence_begin_signalling() / dma_fence_end_signalling() so
PROVE_LOCKING can validate that nothing taken in those critical
sections deadlocks against the signal path. dma_resv_lock is
sleepable but not taken on the signal path, so taking it inside the
publish critical section is safe under lockdep.
Skips planes whose vb2_plane.dbuf is NULL -- buffers never exported
via VIDIOC_EXPBUF (or imported via V4L2_MEMORY_DMABUF) have no
dmabuf for userspace to wait on.
Drivers that don't opt in pay nothing: the helper is a no-op stub
when CONFIG_VIDEOBUF2_RELEASE_FENCES=n, and an early-return check
of supports_release_fences when =y but the flag is unset.
Validated on RK3566 PineTab2 with PROVE_LOCKING enabled: 30s of
bbb_1080p30 H.264 stateless decode + zero-copy panfrost EGL import
via dmabuf-wayland (mpv 0.41 + KWin 6.6.4 + Mesa panfrost 26.0.5)
produces 31,816 dma_fence init/signal pairs across 5,724 vb2 buffer
cycles with zero lockdep splats from videobuf2 / dma_resv code paths.
Subsequent patches in this series opt the hantro and rockchip-rga
drivers in.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Christian König <christian.koenig@amd.com>
Cc: Nicolas Dufresne <nicolas@ndufresne.ca>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Markus Fritsche <mfritsche@reauktion.de>
---
drivers/media/common/videobuf2/Kconfig | 29 ++++
.../media/common/videobuf2/videobuf2-core.c | 135 ++++++++++++++++++
include/media/videobuf2-core.h | 51 +++++++
3 files changed, 215 insertions(+)
diff --git a/drivers/media/common/videobuf2/Kconfig b/drivers/media/common/videobuf2/Kconfig
index d2223a12c..bbfa26984 100644
--- a/drivers/media/common/videobuf2/Kconfig
+++ b/drivers/media/common/videobuf2/Kconfig
@@ -30,3 +30,32 @@ config VIDEOBUF2_DMA_SG
config VIDEOBUF2_DVB
tristate
select VIDEOBUF2_CORE
+
+config VIDEOBUF2_RELEASE_FENCES
+ bool "videobuf2: opt-in dma_resv producer fences for V4L2 dmabuf exports"
+ depends on VIDEOBUF2_CORE
+ depends on DMA_SHARED_BUFFER
+ default n
+ help
+ Enables an opt-in API that lets vb2 producers populate a dma_resv
+ exclusive write fence on the dmabufs they export to userspace.
+ The fence is signalled when the buffer transitions to
+ VB2_BUF_STATE_DONE.
+
+ This gives userspace consumers that import V4L2-produced dmabufs
+ and wait on the dmabuf's implicit-sync fence (poll(POLLIN),
+ DMA_BUF_IOCTL_EXPORT_SYNC_FILE, EGL_LINUX_DMA_BUF_EXT) a real
+ producer fence to wait on, instead of a stub fence from
+ dma_fence_get_stub() that the dma_buf core substitutes when
+ dma_resv is empty.
+
+ Drivers individually opt in by setting
+ vb2_queue::supports_release_fences = true and calling
+ vb2_buffer_attach_release_fence() at the right point in their
+ pipeline (typically m2m device_run, just before HW kick).
+
+ Distributors leave this off unless targeting Wayland/EGL
+ consumers of V4L2 stateless decoder output on
+ implicit-sync-only GPU stacks (e.g. mainline panfrost).
+
+ If unsure, say N.
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index adf668b21..85d7fddbd 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -26,6 +26,12 @@
#include <linux/freezer.h>
#include <linux/kthread.h>
+#ifdef CONFIG_VIDEOBUF2_RELEASE_FENCES
+#include <linux/dma-fence.h>
+#include <linux/dma-resv.h>
+#include <linux/dma-buf.h>
+#endif
+
#include <media/videobuf2-core.h>
#include <media/v4l2-mc.h>
@@ -1173,6 +1179,120 @@ void *vb2_plane_cookie(struct vb2_buffer *vb, unsigned int plane_no)
}
EXPORT_SYMBOL_GPL(vb2_plane_cookie);
+#ifdef CONFIG_VIDEOBUF2_RELEASE_FENCES
+/*
+ * dma_resv release-fence integration.
+ *
+ * Optional, opt-in path that lets producers publish a real
+ * dma_fence on their CAPTURE-side dmabufs so userspace consumers
+ * (compositors, EGL importers) get spec-clean implicit-sync
+ * semantics instead of the dma_buf core's stub fence. Drivers
+ * call vb2_buffer_attach_release_fence() at a finite-time-fenced
+ * point (typically m2m device_run) and the fence is signalled by
+ * vb2_buffer_done(). Gated at runtime by
+ * vb2_queue::supports_release_fences and at compile time by
+ * CONFIG_VIDEOBUF2_RELEASE_FENCES.
+ */
+
+static const char *vb2_dma_resv_get_driver_name(struct dma_fence *fence)
+{
+ return "videobuf2";
+}
+
+static const char *vb2_dma_resv_get_timeline_name(struct dma_fence *fence)
+{
+ return "vb2-release-fence";
+}
+
+static const struct dma_fence_ops vb2_dma_resv_fence_ops = {
+ .get_driver_name = vb2_dma_resv_get_driver_name,
+ .get_timeline_name = vb2_dma_resv_get_timeline_name,
+};
+
+int vb2_buffer_attach_release_fence(struct vb2_buffer *vb)
+{
+ struct vb2_queue *q = vb->vb2_queue;
+ struct dma_fence *fence;
+ unsigned int plane;
+ bool cookie;
+
+ if (!q->supports_release_fences)
+ return 0;
+
+ if (WARN_ON(vb->release_fence))
+ return -EINVAL;
+
+ fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+ if (!fence)
+ return -ENOMEM;
+
+ dma_fence_init(fence, &vb2_dma_resv_fence_ops, &q->dma_resv_fence_lock,
+ q->dma_resv_fence_context,
+ atomic64_inc_return(&q->dma_resv_fence_seqno));
+
+ /*
+ * Annotate the publish-side critical section. Per
+ * Documentation/driver-api/dma-buf.rst, lockdep validates
+ * that nothing taken in this region can deadlock against
+ * the signal path in vb2_buffer_signal_release_fence().
+ * dma_resv_lock is sleepable but is not taken on the signal
+ * path, so taking it inside the critical section is safe.
+ */
+ cookie = dma_fence_begin_signalling();
+ for (plane = 0; plane < vb->num_planes; plane++) {
+ struct dma_buf *dbuf = vb->planes[plane].dbuf;
+
+ if (!dbuf)
+ continue;
+
+ dma_resv_lock(dbuf->resv, NULL);
+ dma_resv_add_fence(dbuf->resv, fence, DMA_RESV_USAGE_WRITE);
+ dma_resv_unlock(dbuf->resv);
+ }
+ dma_fence_end_signalling(cookie);
+
+ /* One reference for the eventual signal in vb2_buffer_done. */
+ vb->release_fence = dma_fence_get(fence);
+
+ /* The dma_resv held its own reference per plane. Drop ours. */
+ dma_fence_put(fence);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_buffer_attach_release_fence);
+
+static void vb2_buffer_signal_release_fence(struct vb2_buffer *vb,
+ enum vb2_buffer_state state)
+{
+ struct dma_fence *fence = vb->release_fence;
+ bool cookie;
+
+ if (!fence)
+ return;
+
+ cookie = dma_fence_begin_signalling();
+ if (state == VB2_BUF_STATE_ERROR)
+ dma_fence_set_error(fence, -EIO);
+ dma_fence_signal(fence);
+ dma_fence_end_signalling(cookie);
+
+ dma_fence_put(fence);
+ vb->release_fence = NULL;
+}
+#else /* !CONFIG_VIDEOBUF2_RELEASE_FENCES */
+
+int vb2_buffer_attach_release_fence(struct vb2_buffer *vb)
+{
+ return 0;
+}
+EXPORT_SYMBOL_GPL(vb2_buffer_attach_release_fence);
+
+static inline void vb2_buffer_signal_release_fence(struct vb2_buffer *vb,
+ enum vb2_buffer_state state)
+{
+}
+#endif /* CONFIG_VIDEOBUF2_RELEASE_FENCES */
+
void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
{
struct vb2_queue *q = vb->vb2_queue;
@@ -1199,6 +1319,9 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
if (state != VB2_BUF_STATE_QUEUED)
__vb2_buf_mem_finish(vb);
+ if (state != VB2_BUF_STATE_QUEUED)
+ vb2_buffer_signal_release_fence(vb, state);
+
spin_lock_irqsave(&q->done_lock, flags);
if (state == VB2_BUF_STATE_QUEUED) {
vb->state = VB2_BUF_STATE_QUEUED;
@@ -2651,6 +2774,18 @@ int vb2_core_queue_init(struct vb2_queue *q)
mutex_init(&q->mmap_lock);
init_waitqueue_head(&q->done_wq);
+#ifdef CONFIG_VIDEOBUF2_RELEASE_FENCES
+ /*
+ * Per-queue dma_resv release-fence context. Drivers that
+ * opt in via supports_release_fences and call
+ * vb2_buffer_attach_release_fence() use these to allocate
+ * fences on a single per-queue timeline.
+ */
+ q->dma_resv_fence_context = dma_fence_context_alloc(1);
+ atomic64_set(&q->dma_resv_fence_seqno, 0);
+ spin_lock_init(&q->dma_resv_fence_lock);
+#endif
+
q->memory = VB2_MEMORY_UNKNOWN;
if (q->buf_struct_size == 0)
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 4424d481d..766ff2194 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -288,6 +288,16 @@ struct vb2_buffer {
unsigned int skip_cache_sync_on_finish:1;
struct vb2_plane planes[VB2_MAX_PLANES];
+#ifdef CONFIG_VIDEOBUF2_RELEASE_FENCES
+ /*
+ * Producer release fence published on each plane's
+ * dmabuf->resv when the driver opts in via
+ * vb2_buffer_attach_release_fence(). Signalled and put by
+ * vb2_buffer_done() on transition to DONE/ERROR. NULL when
+ * the driver did not opt in for this buffer.
+ */
+ struct dma_fence *release_fence;
+#endif
struct list_head queued_entry;
struct list_head done_entry;
#ifdef CONFIG_VIDEO_ADV_DEBUG
@@ -648,6 +658,19 @@ struct vb2_queue {
spinlock_t done_lock;
wait_queue_head_t done_wq;
+#ifdef CONFIG_VIDEOBUF2_RELEASE_FENCES
+ /*
+ * dma_resv release-fence context. Drivers that set
+ * supports_release_fences and call
+ * vb2_buffer_attach_release_fence() use these to allocate
+ * fences on a per-queue timeline.
+ */
+ u64 dma_resv_fence_context;
+ atomic64_t dma_resv_fence_seqno;
+ spinlock_t dma_resv_fence_lock;
+#endif
+
+ unsigned int supports_release_fences:1;
unsigned int streaming:1;
unsigned int start_streaming_called:1;
unsigned int error:1;
@@ -735,6 +758,34 @@ void *vb2_plane_cookie(struct vb2_buffer *vb, unsigned int plane_no);
*/
void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state);
+/**
+ * vb2_buffer_attach_release_fence() - opt-in dma_resv release fence.
+ * @vb: the buffer being committed to the producer.
+ *
+ * Drivers that have set vb2_queue::supports_release_fences may call
+ * this from any sleepable context where they have committed to
+ * running the operation in finite time -- typically m2m
+ * device_run(), just before the HW kick. The helper allocates a
+ * dma_fence on the queue's per-queue timeline, attaches it as
+ * DMA_RESV_USAGE_WRITE on each plane's dmabuf->resv, and stashes
+ * it in vb->release_fence. vb2_buffer_done() signals and puts the
+ * fence as part of the buffer's state transition.
+ *
+ * Skips planes whose vb2_plane.dbuf is NULL -- buffers never
+ * exported via VIDIOC_EXPBUF (or imported via V4L2_MEMORY_DMABUF)
+ * have no dmabuf for userspace to wait on.
+ *
+ * No-op when vb2_queue::supports_release_fences is not set
+ * (regardless of CONFIG_VIDEOBUF2_RELEASE_FENCES). When
+ * CONFIG_VIDEOBUF2_RELEASE_FENCES=n, this is a stub that returns 0.
+ *
+ * Returns 0 on success or when the no-op stub is in effect,
+ * negative errno on allocation failure when fence publishing was
+ * attempted. Best-effort: drivers should ignore the return value
+ * unless they want diagnostics.
+ */
+int vb2_buffer_attach_release_fence(struct vb2_buffer *vb);
+
/**
* vb2_discard_done() - discard all buffers marked as DONE.
* @q: pointer to &struct vb2_queue with videobuf2 queue.
--
2.53.0