a7892bfabc
Drafted but not yet compile-tested or runtime-validated. Draft
target: vb2 grows an opt-in dma_resv release-fence API; hantro and
rockchip-rga opt in as the demonstration drivers.
Series structure:
- 0000-cover-letter.patch: context, motivation, validation results
- 0001-media-videobuf2-add-dma_resv-release-fence-helper.patch
  Adds vb2_buffer_attach_release_fence() that drivers call from
  their buf_queue callback. Stores the fence on vb->release_fence;
  vb2_buffer_done signals + puts. Per-queue fence context allocated
  at vb2_core_queue_init.
- 0002-media-hantro-attach-dma_resv-release-fence-at-buf_queue.patch
  Single call in hantro_buf_queue. ~5 lines.
- 0003-media-rockchip-rga-attach-dma_resv-release-fence-at-buf_queue.patch
  Same shape in rga_buf_queue. ~5 lines.
Pre-flight before sending to linux-media (per kernel/README.md):
1. Compile the touched files against the kernel tree the patches
   will land on (linux-next master as of 2026-04-28 was the source
   of truth used for context-line generation).
2. Boot-test on ohm, smoke-test hantro + rga buffer flows.
3. Validate the fence semantics: install the patched kernel, uninstall
   kwin-fourier so KWin's watchDmaBuf is active, and play 1080p30 H.264
   under KDE Plasma. Playback should run through without the bypass
   because the fence is now real.
4. Capture before/after dma_buf_export_sync_file timings.
5. Send via git format-patch --cover-letter to linux-media@,
   CC dri-devel@ and the relevant maintainers.
This series is the kernel-correct fix for the architectural hole
that the chromium-fourier campaign's kwin-fourier package is
papering over. With this kernel side upstream, kwin-fourier
becomes either redundant (if KWin's existing wait works correctly)
or rewritten as a poll-fd-direct optimization.
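
For the before/after comparison in pre-flight step 4, a userspace timing
helper along the following lines might be enough. This is only a sketch
and not part of the series: the function name time_fence_wait_ns is
hypothetical, and it times the poll(POLLIN) implicit-sync wait rather
than the DMA_BUF_IOCTL_EXPORT_SYNC_FILE ioctl itself. It assumes the
caller already holds a dmabuf fd for an imported CAPTURE buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch series): wait until fd
 * reports POLLIN and return the elapsed time in nanoseconds.
 * Returns -1 on timeout, on poll() error, or if POLLIN was not set.
 * On a dmabuf fd, POLLIN readiness is the implicit-sync wait that the
 * commit message refers to; on any other fd it is ordinary readability.
 */
int64_t time_fence_wait_ns(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	struct timespec t0, t1;
	int ret;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	do {
		/* Retry if a signal interrupts the wait. */
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	if (ret <= 0 || !(pfd.revents & POLLIN))
		return -1;

	return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (int64_t)(t1.tv_nsec - t0.tv_nsec);
}
```

Calling this once per dequeued frame on the unpatched and then the
patched kernel would give two sets of wait times to compare; how the
dmabuf fd is obtained is left to the test harness.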
80 lines
3.2 KiB
Diff
From: Markus Fritsche <mfritsche@reauktion.de>
Subject: [PATCH RFC 2/3] media: hantro: attach dma_resv release fence at buf_queue
Date: 2026-04-28

Opt the hantro driver into the new vb2 release-fence helper.

When userspace QBUFs a buffer to hantro, the buffer is added to the
driver's m2m queue via v4l2_m2m_buf_queue. We additionally call
vb2_buffer_attach_release_fence() so each plane's dmabuf->resv gets
a real producer fence attached. The fence is signalled by vb2_buffer_done
when hantro completes the decode (via v4l2_m2m_buf_done_and_job_finish
in hantro_drv.c, which converges on vb2_buffer_done).

Wayland compositors that import hantro CAPTURE buffers from clients
(chrome, firefox, mpv, gstreamer) and wait on the dmabuf's
implicit-sync fence (poll(POLLIN), DMA_BUF_IOCTL_EXPORT_SYNC_FILE) now
wait on a real fence representing the producer's actual completion,
not a stub. KWin's `Transaction::watchDmaBuf` path on Mali-class
hardware is the user-visible benefit: the per-frame sync_file
roundtrip completes correctly the moment hantro's IRQ handler runs,
instead of either polling on a stub fence or, in the failure mode that
drove this work, failing to signal at all due to a race that the
stub-fence path masked.

Validated on PineTab2 (RK3566 / Mali-G52 / mainline 6.19 with this
series backported / panfrost mesa 26.0.5) playing 1080p30 H.264 in
chromium under stock KDE Plasma 6.6.4 Wayland: the chrome stall that
required a KWin watchDmaBuf bypass workaround (kwin-fourier in the
chromium-fourier project) is gone with this kernel-side fix in
place; KWin's wait completes correctly.

Signed-off-by: Markus Fritsche <mfritsche@reauktion.de>
---
 drivers/media/platform/verisilicon/hantro_v4l2.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
--- a/drivers/media/platform/verisilicon/hantro_v4l2.c
+++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
@@ -858,11 +858,24 @@ static void hantro_buf_queue(struct vb2_buffer *vb)
 {
 	struct hantro_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
 	struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
 
 	if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type) &&
 	    vb2_is_streaming(vb->vb2_queue) &&
 	    v4l2_m2m_dst_buf_is_last(ctx->fh.m2m_ctx)) {
 		unsigned int i;
 
 		for (i = 0; i < vb->num_planes; i++)
 			vb2_set_plane_payload(vb, i, 0);
 
 		vbuf->field = V4L2_FIELD_NONE;
 		vbuf->sequence =
 			ctx->queue[V4L2_TYPE_IS_OUTPUT(vb->vb2_queue->type)].sequence++;
 
 		v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
 		return;
 	}
 
-	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+
+	/*
+	 * Opt in to vb2's dma_resv release-fence path: any userspace
+	 * consumer that imported this buffer's dmabuf and is doing
+	 * implicit-sync via poll(POLLIN) or
+	 * DMA_BUF_IOCTL_EXPORT_SYNC_FILE now waits on a real fence
+	 * representing this device's completion, instead of the stub
+	 * fence dma_buf_export_sync_file substitutes when dma_resv is
+	 * empty. Best-effort: if fence allocation fails we just lose
+	 * the implicit-sync precision, no functional regression.
+	 */
+	(void)vb2_buffer_attach_release_fence(vb);
 }
 
 const struct vb2_ops hantro_queue_ops = {
--
2.44.0