From 1844c263bde8dd244d7db46f8c508e7c70da459c Mon Sep 17 00:00:00 2001
In-Reply-To: <20260429195306.239666-1-mfritsche@reauktion.de>
References: <20260429195306.239666-1-mfritsche@reauktion.de>
From: Markus Fritsche
Date: Sat, 9 May 2026 16:24:01 +0200
Subject: [PATCH RFC v2] media: hantro: attach dma_resv release fence at device_run

Opt the hantro driver into the new vb2 release-fence helper so its
CAPTURE-side dmabufs carry a real producer fence that Wayland
compositors and other implicit-sync consumers can wait on, instead of
the dma_buf core's stub fence.

Attach point is m2m device_run, immediately after
v4l2_m2m_buf_copy_metadata() and before ctx->codec_ops->run(). Per
Nicolas Dufresne's v1 review
(lore.kernel.org/linux-media/3d8deeb15581b754e4c061d4c4a13657aa08bc3c.camel@ndufresne.ca/),
this satisfies the dma_fence finite-time contract: the m2m core has
committed to running the job by this point, and codec_ops->run either
kicks the HW (decode-complete signals the fence via vb2_buffer_done) or
fails immediately (job_finish with VB2_BUF_STATE_ERROR signals with
-EIO). PM and clocks are already up by this point, so there are no
allocation context restrictions.

The CAPTURE queue is opted in with supports_release_fences=true at
queue_init.

Userspace consumers that import hantro CAPTURE dmabufs and wait on
their implicit-sync fence (Wayland zwp_linux_dmabuf_v1 + panfrost
EGL_LINUX_DMA_BUF_EXT) now wait on a real fence representing the
producer's actual completion, fixing green-frame corruption observed
on RK3566 PineTab2 + Mali-G52 panfrost (the GPU was sampling zero
pages because the dmabuf's implicit fence was the dma_buf core's
pre-signalled stub).

Validated end-to-end on PineTab2 (RK3566 / hantro G1 / Mali-G52
mainline panfrost): 30s of bbb_1080p30 H.264 stateless decode +
zero-copy panfrost EGL import via dmabuf-wayland (mpv 0.41 + KWin
6.6.4 + Mesa panfrost 26.0.5) renders correctly with no green-frame
corruption and no PROVE_LOCKING splats.

Cc: Ezequiel Garcia
Cc: Philipp Zabel
Cc: Nicolas Dufresne
Cc: linux-media@vger.kernel.org
Cc: linux-rockchip@lists.infradead.org
Signed-off-by: Markus Fritsche
---
 .../media/platform/verisilicon/hantro_drv.c   | 23 +++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
index 2e81877f6..6a66c47ed 100644
--- a/drivers/media/platform/verisilicon/hantro_drv.c
+++ b/drivers/media/platform/verisilicon/hantro_drv.c
@@ -186,6 +186,22 @@ static void device_run(void *priv)
 
 	v4l2_m2m_buf_copy_metadata(src, dst);
 
+	/*
+	 * Attach a producer fence on the CAPTURE-side dmabuf so userspace
+	 * importers (e.g. Wayland compositors) get spec-clean implicit-sync
+	 * semantics. Called from device_run rather than buf_queue: the
+	 * dma_fence finite-time contract requires that once a fence is
+	 * published, the producer must signal it in finite time. By the
+	 * time we reach device_run, the m2m core has committed to running
+	 * this job, and the next hop (codec_ops->run) either kicks the HW
+	 * (decode-complete signals the fence via vb2_buffer_done) or
+	 * fails immediately (job_finish with VB2_BUF_STATE_ERROR signals
+	 * the fence with -EIO). Either path resolves the fence in finite
+	 * time. Best-effort: a NOMEM here means we lose implicit-sync
+	 * precision for this frame, no functional regression.
+	 */
+	(void)vb2_buffer_attach_release_fence(&dst->vb2_buf);
+
 	if (ctx->codec_ops->run(ctx))
 		goto err_cancel_job;
 
@@ -249,6 +265,13 @@ queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq)
 	dst_vq->lock = &ctx->dev->vpu_mutex;
 	dst_vq->dev = ctx->dev->v4l2_dev.dev;
 
+	/*
+	 * Opt the CAPTURE queue into vb2 release-fence publishing.
+	 * No-op unless CONFIG_VIDEOBUF2_RELEASE_FENCES=y; runtime cost
+	 * is one extra fence allocation + dma_resv update per device_run.
+	 */
+	dst_vq->supports_release_fences = true;
+
 	return vb2_queue_init(dst_vq);
 }
 
-- 
2.53.0