7e0848d7d2
ROOT CAUSE FIX for VP8 libva decode garbage output.

ffmpeg-vaapi's vaapi_vp8.c:191-192 STRIPS the VP8 uncompressed header
(3 bytes for interframe, 10 bytes for keyframe) before submitting the
slice data via VAAPI. ffmpeg-v4l2request (kdirect) KEEPS the header in
its OUTPUT buffer.

Hantro's rockchip_vpu2_vp8_dec_run (rockchip_vpu2_hw_vp8_dec.c:349)
hard-codes 'first_part_offset = V4L2_VP8_FRAME_IS_KEY_FRAME(hdr) ? 10 : 3'
as the byte offset into OUTPUT where the first compressed partition
starts. It uses this offset for:

- mb_offset_bits = first_part_offset * 8 + first_part_header_bits + 8
- dct_part_offset = first_part_offset + first_part_size

Without the header, every offset is wrong, the entropy decoder spins on
the wrong bytes, and every frame decodes to garbage.

Fix: in codec_store_buffer for VAProfileVP8Version0_3, prepend
header_size bytes (10 keyframe / 3 interframe) of zeros to OUTPUT before
the slice data memcpy. Hantro skips these bytes for actual parsing (uses
ctrl-struct values instead), so zero-fill is fine.

Empirical: iter33 kernel printk in vpu2_vp8_dec_run dumped the
v4l2_ctrl_vp8_frame struct for libva vs kdirect and confirmed
byte-identical control fields. Only the OUTPUT buffer bytes differed,
traced to ffmpeg-vaapi's header stripping.
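
Illustration only, not the code in this commit: a minimal sketch of the
zero-fill prepend and of the two offsets the driver derives from
first_part_offset. The helper names (vp8_header_size,
vp8_store_slice_with_padding, vp8_dct_part_offset, vp8_mb_offset_bits)
and the flat dst/slice buffer interface are assumptions made for the
example; the real codec_store_buffer signature is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Uncompressed VP8 header sizes, as stated in this commit message. */
#define VP8_KEYFRAME_HDR_SIZE   10
#define VP8_INTERFRAME_HDR_SIZE 3

/* Hypothetical helper: bytes the driver expects before the first partition. */
static size_t vp8_header_size(int is_key_frame)
{
    return is_key_frame ? VP8_KEYFRAME_HDR_SIZE : VP8_INTERFRAME_HDR_SIZE;
}

/*
 * Hypothetical helper: write header_size zero bytes into dst, then the
 * header-less slice data handed over by the VAAPI client.
 * Returns the total number of bytes written, or 0 if dst is too small.
 */
static size_t vp8_store_slice_with_padding(uint8_t *dst, size_t dst_size,
                                           const uint8_t *slice,
                                           size_t slice_size,
                                           int is_key_frame)
{
    size_t hdr = vp8_header_size(is_key_frame);

    assert(dst != NULL && slice != NULL);

    /* Overflow-safe check that hdr + slice_size fits in dst. */
    if (slice_size > dst_size || hdr > dst_size - slice_size)
        return 0;

    memset(dst, 0, hdr);                  /* placeholder for stripped header */
    memcpy(dst + hdr, slice, slice_size); /* first partition starts at hdr */
    return hdr + slice_size;
}

/* The two offsets from the commit message, restated for a given frame type. */
static size_t vp8_dct_part_offset(int is_key_frame, size_t first_part_size)
{
    return vp8_header_size(is_key_frame) + first_part_size;
}

static size_t vp8_mb_offset_bits(int is_key_frame,
                                 size_t first_part_header_bits)
{
    return vp8_header_size(is_key_frame) * 8 + first_part_header_bits + 8;
}
```

With the padding in place, a 4-byte interframe slice occupies bytes 3..6
of OUTPUT and a keyframe slice starts at byte 10, which matches the
hard-coded first_part_offset. Without it, the same driver arithmetic
points 3 or 10 bytes past the real partition start.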