From c672f19f44ec77ce1cf60acb50921259300e6083 Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Sat, 2 May 2026 12:00:00 +0000
Subject: [PATCH] h264: hardcode SPS level_idc = 51 (intentional over-allocation)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

fourier's h264_va_picture_to_v4l2 never assigns sps->level_idc; the
field stays at zero-init. level_idc=0 is invalid per the H.264 spec
(lowest legal value is 10, Level 1.0). Hantro and other stateless H.264
decoders use level_idc to pre-allocate decoder resources (DPB size,
motion-vector buffers); when fed an invalid level the hantro kernel
driver silently skips the decode-hardware dispatch: the V4L2 request
completes with no error, DQBUF returns the CAPTURE buffer reporting
bytesused=3655712 and no V4L2_BUF_FLAG_ERROR, but the buffer is never
written.

VAAPI's decode-side VAPictureParameterBufferH264 structurally does NOT
include level_idc: `grep level_idc va/va.h` returns only hits inside
VAEncSequenceParameterBufferH264 (the encode path). The H.264 SPS NAL
is also not included in VASliceDataBuffer because ffmpeg-vaapi parses
it client-side and forwards only slice data (verified empirically via
patch 0010's hex-dump of the OUTPUT buffer: it contains
"00 00 01 65 ...", i.e. an Annex B start code followed by the IDR slice
NAL byte, with no SPS NAL). An SPS-NAL byte extractor is therefore not
viable from the bitstream libva-v4l2-request receives.

Workaround: hardcode level_idc = 51 (= Level 5.1, max for 1080p and
4K@30 mainstream consumer profiles). This INTENTIONALLY OVER-ALLOCATES
decoder resources but is sufficient for any stream up to 4K@30. It is
corpus-correct, not contract-correct: a stream above Level 5.1, for
example 4K@60 (Level 5.2) or 8K (Level 6.x), would be under-declared
and could under-allocate.

This patch is a known-incomplete intermediate, not a final fix. The
proper upstreamable answer is a level-from-resolution derivation per
H.264 Annex A.3 (max MB rate / max frame size thresholds). That also
requires the consumer-side frame rate, which VAAPI does not expose, so
the lookup table is non-trivial. The TODO is captured inline. This
patch's goal is to unblock decode-hardware engagement on the ohm_gl_fix
corpus while the full level-derivation work proceeds.

Cross-reference: kernel doc ext-ctrls-codec-stateless.rst
V4L2_CID_STATELESS_H264_SPS lists level_idc as a required field with no
"kernel-derives" annotation, i.e. userspace must supply it.

Signed-off-by: Markus Fritsche
---
 src/h264.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/src/h264.c b/src/h264.c
index b22beb4..ba29c5d 100644
--- a/src/h264.c
+++ b/src/h264.c
@@ -552,6 +552,35 @@ int h264_set_controls(struct request_data *driver_data,
 
 	sps.profile_idc = h264_profile_to_idc(profile);
 
+	/*
+	 * VAAPI's decode-side VAPictureParameterBufferH264 does not carry
+	 * level_idc; see va.h, where the field exists only in
+	 * VAEncSequenceParameterBufferH264 on the encode path. The H.264
+	 * SPS NAL is also not included in VASliceDataBuffer (ffmpeg-vaapi
+	 * parses it client-side and forwards only slice data), so an
+	 * SPS-NAL byte extractor is not viable from the bitstream we
+	 * receive.
+	 *
+	 * Hantro and other stateless H.264 decoders use level_idc to
+	 * pre-allocate decoder resources (DPB, motion-vector buffers); a
+	 * zero-init level_idc=0 is invalid (lowest legal is 10 = Level
+	 * 1.0) and causes hantro to silently skip the decode hardware
+	 * dispatch.
+	 *
+	 * Hardcode level_idc = 51 (Level 5.1, max for 1080p/4K@30) as a
+	 * known-incomplete intermediate. This INTENTIONALLY OVER-ALLOCATES
+	 * decoder resources and is sufficient for any stream up to 4K@30.
+	 * It is corpus-correct, not contract-correct.
+	 *
+	 * TODO: derive level_idc from (VAProfile, picture_width_in_mbs,
+	 * picture_height_in_mbs) per H.264 Annex A.3 max-MB-per-second
+	 * thresholds. That is a small lookup table but also requires
+	 * the consumer's framerate, which VAAPI doesn't provide
+	 * directly. For now the over-allocation is the interim
+	 * compromise.
+	 */
+	sps.level_idc = 51;
+
 	/*
 	 * Build the per-request control list incrementally:
 	 * - SPS, PPS, DECODE_PARAMS: always required (in either decode