marfrit-packages/kernel/vb2-dma-resv-rfc/0002-media-hantro-attach-dma_resv-release-fence-at-buf_queue.patch
marfrit a7892bfabc kernel/vb2-dma-resv-rfc: 3-patch RFC series draft
Drafted but not yet compile-tested or runtime-validated. Draft
target: vb2 grows an opt-in dma_resv release-fence API; hantro and
rockchip-rga opt in as the demonstration drivers.

Series structure:
- 0000-cover-letter.patch  — context, motivation, validation results
- 0001-media-videobuf2-add-dma_resv-release-fence-helper.patch
    Adds vb2_buffer_attach_release_fence() that drivers call from
    their buf_queue callback. Stores the fence on vb->release_fence;
    vb2_buffer_done signals + puts. Per-queue fence context allocated
    at vb2_core_queue_init.
- 0002-media-hantro-attach-dma_resv-release-fence-at-buf_queue.patch
    Single call in hantro_buf_queue. ~5 lines.
- 0003-media-rockchip-rga-attach-dma_resv-release-fence-at-buf_queue.patch
    Same shape in rga_buf_queue. ~5 lines.
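The fence lifecycle described for patch 0001 can be sketched as a small
userspace model. This is not kernel code and not the helper's actual
implementation: every `model_*` name is hypothetical, and the reference
counting and seqno handling are assumptions made only to illustrate
"attach at buf_queue, signal + put at buffer done, one fence context per
queue".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for struct dma_fence: refcounted, signalled at most once. */
struct model_fence {
	uint64_t context;	/* fence context of the owning queue */
	uint64_t seqno;		/* increases monotonically within a context */
	int refcount;
	bool signaled;
};

/* Toy stand-in for a vb2 queue: one fence context per queue. */
struct model_queue {
	uint64_t fence_context;
	uint64_t next_seqno;
};

/* Toy stand-in for a vb2 buffer with a single dmabuf-backed plane. */
struct model_buffer {
	struct model_queue *queue;
	struct model_fence *release_fence;	/* buffer's reference, until done */
	struct model_fence *resv_fence;		/* what a consumer sees in the resv */
};

static uint64_t model_next_context = 1;

/* Mirrors "per-queue fence context allocated at queue init". */
static void model_queue_init(struct model_queue *q)
{
	q->fence_context = model_next_context++;
	q->next_seqno = 1;
}

static void model_fence_put(struct model_fence *f)
{
	if (f && --f->refcount == 0)
		free(f);
}

/* Called from the driver's buf_queue; returns 0 on success, -1 on failure. */
static int model_attach_release_fence(struct model_buffer *b)
{
	struct model_fence *f;

	if (b->release_fence)
		return -1;	/* already queued with a pending fence */

	f = calloc(1, sizeof(*f));
	if (!f)
		return -1;	/* best-effort: caller may ignore this */

	f->context = b->queue->fence_context;
	f->seqno = b->queue->next_seqno++;
	f->refcount = 2;	/* one for the buffer, one for the resv slot */

	model_fence_put(b->resv_fence);	/* replace the previous cycle's fence */
	b->release_fence = f;
	b->resv_fence = f;
	return 0;
}

/* Mirrors "vb2_buffer_done signals + puts". */
static void model_buffer_done(struct model_buffer *b)
{
	if (!b->release_fence)
		return;
	b->release_fence->signaled = true;
	model_fence_put(b->release_fence);
	b->release_fence = NULL;
}

/* An empty resv behaves like an already-signalled stub fence. */
static bool model_consumer_ready(const struct model_buffer *b)
{
	return !b->resv_fence || b->resv_fence->signaled;
}
```

In this model a consumer sees "ready" before any fence is attached (the
stub case), "not ready" between queue and done, and "ready" again once
the buffer completes.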

Pre-flight before sending to linux-media (per kernel/README.md):
1. Compile the touched files against the kernel tree the patches
   will land on (linux-next master as of 2026-04-28 was the source
   of truth used for context-line generation).
2. Boot-test on ohm, smoke-test hantro + rga buffer flows.
3. Validate the fence semantics: install patched kernel, uninstall
   kwin-fourier so KWin's watchDmaBuf is active, play 1080p30 H.264
   under KDE Plasma. Playback should run through without the bypass
   because the fence is now real.
4. Capture before/after dma_buf_export_sync_file timings.
5. Send via git format-patch --cover-letter to linux-media@,
   CC dri-devel@ and the relevant maintainers.

This series is the kernel-correct fix for the architectural hole
that the chromium-fourier campaign's kwin-fourier package is
papering over. With this kernel side upstream, kwin-fourier
becomes either redundant (if KWin's existing wait works correctly)
or rewritten as a poll-fd-direct optimization.
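One reading of "poll-fd-direct" is that the consumer polls the dmabuf fd
itself instead of exporting a sync_file per frame. A minimal sketch of
that wait follows, assuming the dma-buf poll semantics in which POLLIN
becomes ready once pending write fences have signalled. The helper name
`wait_fd_readable` is hypothetical, and this is not how kwin-fourier or
KWin are implemented.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd reports POLLIN.
 * Returns 1 when readable, 0 on timeout, negative errno on error.
 * For a dmabuf fd, "readable" is assumed to mean that the producer's
 * write fences have signalled.
 */
static int wait_fd_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -errno;
	if (ret == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -EIO;
}
```

In a compositor the fd would normally be registered with the event loop
rather than waited on synchronously; the blocking form here only shows
the readiness condition.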
2026-04-28 19:13:40 +00:00


From: Markus Fritsche <mfritsche@reauktion.de>
Subject: [PATCH RFC 2/3] media: hantro: attach dma_resv release fence at buf_queue
Date: 2026-04-28

Opt the hantro driver into the new vb2 release-fence helper.

When userspace QBUFs a buffer to hantro, the buffer is added to the
driver's m2m queue via v4l2_m2m_buf_queue. We additionally call
vb2_buffer_attach_release_fence() so each plane's dmabuf->resv gets
a real producer fence attached. The fence is signalled by vb2_buffer_done
when hantro completes the decode (via v4l2_m2m_buf_done_and_job_finish
in hantro_drv.c, which converges on vb2_buffer_done).

Wayland compositors that import hantro CAPTURE buffers from clients
(chrome, firefox, mpv, gstreamer) and wait on the dmabuf's
implicit-sync fence (poll(POLLIN), DMA_BUF_IOCTL_EXPORT_SYNC_FILE)
now wait on a real fence representing the producer's actual
completion, not a stub. KWin's `Transaction::watchDmaBuf` path on
Mali-class hardware is the user-visible benefit: the per-frame
sync_file roundtrip completes correctly once hantro's IRQ handler
runs, instead of either polling on a stub fence or, in the failure
mode that drove this work, failing to signal at all due to a race
that the stub-fence path masked.

Validated on PineTab2 (RK3566 / Mali-G52 / mainline 6.19 with this
series backported / panfrost mesa 26.0.5) playing 1080p30 H.264 in
chromium under stock KDE Plasma 6.6.4 Wayland: the chrome stall that
required a KWin watchDmaBuf bypass workaround (kwin-fourier in the
chromium-fourier project) is gone with this kernel-side fix in
place; KWin's wait completes correctly.

Signed-off-by: Markus Fritsche <mfritsche@reauktion.de>
---
drivers/media/platform/verisilicon/hantro_v4l2.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
--- a/drivers/media/platform/verisilicon/hantro_v4l2.c
+++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
@@ -858,11 +858,24 @@ static void hantro_buf_queue(struct vb2_buffer *vb)
 {
 	struct hantro_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
 	if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type) &&
 	    vb2_is_streaming(vb->vb2_queue) &&
 	    v4l2_m2m_dst_buf_is_last(ctx->fh.m2m_ctx)) {
 		unsigned int i;
 		for (i = 0; i < vb->num_planes; i++)
 			vb2_set_plane_payload(vb, i, 0);
 		vbuf->field = V4L2_FIELD_NONE;
 		vbuf->sequence =
 			ctx->queue[V4L2_TYPE_IS_OUTPUT(vb->vb2_queue->type)].sequence++;
 		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
 		return;
 	}
-	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+
+	/*
+	 * Opt in to vb2's dma_resv release-fence path: any userspace
+	 * consumer that imported this buffer's dmabuf and is doing
+	 * implicit-sync via poll(POLLIN) or
+	 * DMA_BUF_IOCTL_EXPORT_SYNC_FILE now waits on a real fence
+	 * representing this device's completion, instead of the stub
+	 * fence dma_buf_export_sync_file substitutes when dma_resv is
+	 * empty. Best-effort: if fence allocation fails we just lose
+	 * the implicit-sync precision, no functional regression.
+	 */
+	(void)vb2_buffer_attach_release_fence(vb);
 }
 const struct vb2_ops hantro_queue_ops = {
--
2.44.0