From 234a103084b6a80b4294e76752abdc2b0db83fc8 Mon Sep 17 00:00:00 2001
From: claude-noether
Date: Thu, 21 May 2026 17:14:33 +0200
Subject: [PATCH] =?UTF-8?q?daemon:=20AV=5FCODEC=5FFLAG=5FLOW=5FDELAY=20for?=
 =?UTF-8?q?=20H.264=20=E2=80=94=20fix=20display-reorder=20breaking=20V4L2?=
 =?UTF-8?q?=201:1?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Force libavcodec's H.264 decoder to emit frames in DECODE order (one
frame per send_packet, no internal display-order reorder queue).
Single-line addition: ctx->flags |= AV_CODEC_FLAG_LOW_DELAY before
avcodec_open2, gated on codec_id == DAEDALUS_CODEC_H264.

Closes daedalus-v4l2#11 part (2).

Background
----------
PR #7's "parking design" approach to the H.264 display-reorder problem
broke libva-v4l2-request-fourier's 1:1 CAPTURE-completion contract
(see #9 + #10). After the revert, the visible "2 1 4 3" pair-swap
returned, and the only path forward was to align the daemon's output
ordering with what V4L2 stateless clients expect: **decode order, one
CAPTURE buffer per OUTPUT slice, with display reorder pushed upstream
to ffmpeg-vaapi's per-VAAPI-surface POC logic** (which it already does
correctly for every real H.264 hardware decoder via
VAPictureParameterBufferH264).

How LOW_DELAY does this
-----------------------
Inside libavcodec/h264dec.c, the flag sets h->low_delay = 1.
h264_select_output_frame (h264_slice.c) emits the just-decoded picture
immediately instead of routing it through the display-order DPB output
queue. DPB management for reference frames (short_ref / long_ref) is
unaffected: B-frame decoding correctness is preserved; only the output
buffering is bypassed.

Skipped for VP9 / AV1: those codecs don't reorder internally, so the
flag would be a no-op there.

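Illustration (not part of the patch): a toy model of why decode-order
output is safe when the consumer reorders by POC. Everything named
demo_* is hypothetical and exists only for this sketch; it is not a
daemon, libavcodec, or ffmpeg-vaapi type, and a real client orders
pictures incrementally rather than sorting a whole GOP at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one decoded picture. */
struct demo_frame {
	int decode_idx;	/* position in decode (bitstream) order */
	int poc;	/* picture order count: position in display order */
};

/* qsort comparator: ascending POC. */
static int demo_cmp_poc(const void *a, const void *b)
{
	const struct demo_frame *fa = a;
	const struct demo_frame *fb = b;

	return (fa->poc > fb->poc) - (fa->poc < fb->poc);
}

/*
 * Fill `out` (room for at least 5 entries) with an I P B P B sequence
 * in DECODE order, i.e. the order a LOW_DELAY decoder would emit it.
 * Each B-frame is decoded after the P-frame it references but is
 * displayed before it, so its POC is lower.
 */
static size_t demo_fill_gop(struct demo_frame *out)
{
	static const int pocs[5] = { 0, 4, 2, 8, 6 };
	size_t i;

	for (i = 0; i < 5; i++) {
		out[i].decode_idx = (int)i;
		out[i].poc = pocs[i];
	}
	return 5;
}

/* Consumer-side reorder: put the frames of one closed GOP in display order. */
static void demo_display_order(struct demo_frame *frames, size_t n)
{
	assert(frames != NULL);
	qsort(frames, n, sizeof(*frames), demo_cmp_poc);
}
```

Calling demo_fill_gop() and then demo_display_order() on the result
leaves the frames in decode_idx order 0 2 1 4 3: each B-frame moves
ahead of the P-frame that was decoded before it, using nothing but the
POC carried with each picture.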
Verified
--------
On higgs (Pi CM5, 6.18.29+rpt-rpi-2712), test daemon hot-swapped into
/usr/bin/daedalus_v4l2_daemon, mpv --hwdec=vaapi-copy --frames=300
against bbb_720p_h264.mp4: 311 REQ_DECODEs received, 308 successful
"decoder: OK" responses (99.04% steady-state delivery; 3 lost at GOP
boundaries, no compounding drift).

mpv plays to its --frames cap and exits cleanly with "End of file".
No "Unable to dequeue buffer", no "Failed to end picture decode", no
"AVHWFramesContext: Failed to sync surface": all the failures from #9
are gone.

Builds clean against ffmpeg-v4l2-request-fourier libavcodec.
---
 daemon/src/decoder.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/daemon/src/decoder.c b/daemon/src/decoder.c
index e91eb35..a43d6ab 100644
--- a/daemon/src/decoder.c
+++ b/daemon/src/decoder.c
@@ -132,6 +132,32 @@ static int decoder_open_codec(struct daedalus_decoder *dec, uint32_t codec_id,
 	ctx = fm->avcodec_alloc_context3(codec);
 	if (!ctx)
 		return -ENOMEM;
+
+	/*
+	 * H.264-only: force libavcodec to emit frames in DECODE order
+	 * (one frame per send_packet, no internal display-order reorder
+	 * queue). V4L2 stateless decoder protocol expects each OUTPUT
+	 * bitstream packet to produce one CAPTURE buffer with that
+	 * packet's slice-decoded pixels, regardless of display order.
+	 * ffmpeg-vaapi's H.264 decoder (which is what consumes our
+	 * CAPTURE buffers via libva-v4l2-request-fourier) does its own
+	 * POC-based display reorder upstream, so producing decode-order
+	 * output is correct.
+	 *
+	 * AV_CODEC_FLAG_LOW_DELAY forces `low_delay = 1` inside
+	 * libavcodec's H.264 decoder; `h264_select_output_frame` emits
+	 * the just-decoded picture immediately instead of holding it
+	 * for the display-order DPB output queue. DPB management for
+	 * reference frames (short_ref / long_ref) is unaffected; B-frame
+	 * decoding correctness is preserved.
+	 *
+	 * Closes daedalus-v4l2#11 part (2).
+	 * Skipped for VP9 / AV1: those formats don't reorder internally,
+	 * so the flag would be a no-op there.
+	 */
+	if (codec_id == DAEDALUS_CODEC_H264)
+		ctx->flags |= AV_CODEC_FLAG_LOW_DELAY;
+
 	rc = fm->avcodec_open2(ctx, codec, NULL);
 	if (rc < 0) {
 		log_err("decoder: avcodec_open2 failed: %d", rc);