libva-v4l2-request-fourier

Author	SHA1	Message	Date
test0r	b0a93e4683	h264: fill dpb[].pic_num as PicNum/LongTermPicNum, not VAAPI surface id fourier's h264_fill_dpb assigned `dpb->pic_num = entry->pic.picture_id` — the VAAPI surface id. Per ext-ctrls-codec-stateless.rst:651-655, v4l2_h264_dpb_entry.pic_num must equal the H.264 spec PicNum (equation 8-28) for short-term references or LongTermPicNum (equation 8-29) for long-term references. The surface id has no relationship to either. Kernel-side consumers of pic_num: - mediatek/decoder/vdec/vdec_h264_req_common.c (line 210): dst_entry->pic_num = src_entry->pic_num. Used for field-coded short-term reference disambiguation. - hantro / rkvdec / cedrus / qcom-iris-stateless: do NOT read pic_num. They resolve refs via reference_ts (timestamp) and POC. This is why fourier's wrong value never surfaced on RK3568 hantro. This patch makes pic_num spec-correct so the libva-v4l2-request fork is upstreamable across drivers without depending on each target's tolerance for non-spec fills. Computation, derived from H.264 spec section 8.2.4.1: For frames (not field-coded), PicNum = FrameNumWrap. FrameNumWrap = (frame_num > cur_frame_num) ? frame_num - max_frame_num : frame_num max_frame_num = 1 << (sps.log2_max_frame_num_minus4 + 4) cur_frame_num = current picture's frame_num For long-term references: LongTermPicNum = long_term_frame_idx (when not field-coded). VAAPI convention (libavcodec/vaapi_h264.c::fill_vaapi_pic line 64): VAPictureH264.frame_idx = long_ref ? pic_id : frame_num So long-term refs already carry long_term_frame_idx in frame_idx; we copy it through. Field-coded streams require an extra factor-of-2 plus a parity adjustment per spec equations 8-28/8-29; this patch does not handle field-coded content. ohm corpus is all frame-coded so this is a follow-up for later. Implementation: add VAPicture parameter to h264_fill_dpb so the function has access to seq_fields.log2_max_frame_num_minus4 and the current picture's frame_num. 
Update the single caller in h264_va_picture_to_v4l2. Cross-reference: kernel doc ext-ctrls-codec-stateless.rst dpb_entry table (line 651-655) and mediatek/vdec/vdec_h264_req_common.c line 210. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	05ffd02ff2	h264: derive PFRAME / BFRAME flags from VASlice slice_type v4l2_ctrl_h264_decode_params.flags has PFRAME and BFRAME bits per ext-ctrls-codec-stateless.rst. fourier never set them; libva-v4l2-request relied on each backing driver tolerating frame-class ambiguity. Kernel survey (linux 6.19.x): - tegra-vde/h264.c (lines 783-799) consumes both flags to select the inter-frame decode kernel. Without them the I-frame kernel runs on P/B content. - visl-trace-h264.h uses them for decode tracing. - hantro / rkvdec / cedrus / mediatek / qcom-iris-stateless do not consume the flags. Hantro on ohm decoded bbb cleanly without these flags set (see phase6/step1/ohm_smoke_2026-05-02T060255Z_post_0015/), so this is an upstreamability fix for cross-driver portability rather than a correctness fix for hantro. VAAPI's VASliceParameterBufferH264.slice_type maps directly to the H.264 slice_header() slice_type field. Per spec 7.4.3: 0=P 1=B 2=I 3=SP 4=SI; 5..9 = "all slices in the picture have this slice_type." `slice_type % 5` recovers the underlying type in either encoding form. In FRAME_BASED mode we only see surface->params.h264.slice from the most-recent VASliceParameterBuffer — that's fine: a single coded picture has a uniform slice_type for the purposes of the PFRAME / BFRAME flag (multi-slice frames may mix slice types in some streams, but the flag's semantic is "this is an inter-coded frame," which holds if any slice is P or B; using the last-seen slice's type is a reasonable approximation). Cross-reference: ext-ctrls-codec-stateless.rst Decode Parameters Flags table. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	fdb0b728d7	h264: strip ffmpeg-vaapi POC sentinel before passing to V4L2 ROOT CAUSE for "kernel decodes successfully but produces zeroed CAPTURE buffers despite no V4L2_BUF_FLAG_ERROR": ffmpeg's H264POCContext initialises prev_poc_msb to (1 << 16) = 0x10000 as a sentinel for "uninitialised": libavcodec/h264dec.c:301 — global init in ff_h264_decode_init libavcodec/h264dec.c:444 — IDR reset in idr() helper ff_h264_init_poc (libavcodec/h264_parse.c:296-305) then computes pc->poc_msb = pc->prev_poc_msb whenever the slice header's pic_order_cnt_lsb hasn't wrapped relative to prev_poc_lsb (which is the typical case for any normal H.264 content with sane POC ordering). The sentinel leaks into field_poc[] (line 305) and from there into VAPictureH264.TopFieldOrderCnt / BottomFieldOrderCnt at libavcodec/vaapi_h264.c::fill_vaapi_pic (lines 73-78). Empirical confirmation via meitner 2026-05-02 ground-truth test: ran an LD_PRELOAD shim around vaCreateBuffer against an i965 VAAPI backend decoding a 60-frame H.264 Main clip. Every frame showed TopFieldOrderCnt = (POC | 0x10000): Frame 1 IDR: raw bytes "00 00 01 00" at offset 12 → TopFOC=65536 Frame 2: raw bytes "06 00 01 00" → TopFOC=65542 Frame 3: "02 00 01 00" → TopFOC=65538 i965 successfully decodes regardless. V4L2 stateless drivers (hantro_h264.c::prepare_table feeds the value direct to tbl->poc[i*2]/[32], the kernel reflist builder uses it directly for cur_pic_order_count comparison) cannot tolerate the high word — the kernel's resource sizing math sees POC=65536 for an IDR and breaks. This patch adds h264_strip_ffmpeg_poc_sentinel() as a small static inline in src/h264.c. It detects bit 16 set rather than blindly subtracting, so a future ffmpeg version that fixes the leak degrades gracefully. The helper is applied at all four POC sites: 1. h264_fill_dpb: dpb->top_field_order_cnt 2. h264_fill_dpb: dpb->bottom_field_order_cnt 3. h264_va_picture_to_v4l2: decode->top_field_order_cnt 4. 
h264_va_picture_to_v4l2: decode->bottom_field_order_cnt VA_PICTURE_H264_INVALID DPB slots are short-circuited to POC=0 because libavcodec/vaapi_h264.c::init_vaapi_pic (line 43) already sets POC=0 there; the sentinel never applies. Zeroing them explicitly removes a class of "stale POC value in invalidated slot" foot-guns. Non-trivial follow-ups identified during the meitner experiment that are NOT addressed by this patch: - PFRAME / BFRAME flags in v4l2_ctrl_h264_decode_params.flags are not yet derived from VASliceParameterBufferH264.slice_type. The bbb corpus is I-only at the start so this hasn't been a blocker, but a clip with B-frames will need the slice-type routing patch. - h264_fill_dpb's pic_num assignment (entry->pic.picture_id) is almost certainly wrong per the kernel doc — pic_num must equal the H.264 spec's PicNum / FrameNumWrap, not the VAAPI surface id. Out of scope here; will surface as a defect on streams that have multi-frame DPB lookups. Cross-references: audit_0008_decode_params_2026-05-01.md — kernel-side consumer audit confirming POC fields are userspace-required. api_contract_findings_2026-05-01.md — VAAPI doc gap on POC semantics; H.264 spec section 8.2.1 is the binding contract. meitner_2026-05-02_vaapi_idr_groundtruth/ — full empirical capture of the sentinel pattern across 60 frames. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	affb4bd12a	DEBUG: dump VAPictureH264 raw bytes + decoded fields Diagnostic-only. Investigating the observed anomaly: - V4L2 strace shows decode_params.top_field_order_cnt = 65536 on the first IDR frame submitted by mpv+ffmpeg+libva-v4l2-request - GStreamer's reference path writes 0 (spec-correct: PicOrderCnt=0 for IDR with pic_order_cnt_type=0 / pic_order_cnt_lsb=0) - Reading FFmpeg source (libavcodec/vaapi_h264.c::fill_vaapi_pic): va_pic->TopFieldOrderCnt = 0; if (pic->field_poc[0] != INT_MAX) va_pic->TopFieldOrderCnt = pic->field_poc[0]; For IDR: ff_h264_init_poc sets field_poc[0] = poc_msb + poc_lsb = 0 + 0 = 0. So FFmpeg should write 0. If FFmpeg writes 0 but fourier reads 65536, the mismatch is in the libva ABI between ffmpeg's writer and our reader. Most likely suspect: VA_PADDING_LOW size in VAPictureH264 differs between the libva headers ffmpeg+libva were built against and the headers fourier was built against, shifting struct field offsets. This patch dumps: 1. sizeof(VAPictureH264) at our reader's view 2. First 32 raw bytes of VAPicture->CurrPic 3. Field-decoded values via the .picture_id, .frame_idx, .flags, .TopFieldOrderCnt, .BottomFieldOrderCnt accessors If the raw bytes show 00 00 01 00 at offset 12 (= 65536 LE), the field offset is correct and FFmpeg actually wrote 65536 — meaning either FFmpeg has a bug, or our test scenario triggers a non-spec code path. If the raw bytes show 00 00 00 00 at offset 12 but TopFieldOrderCnt accessor returns 65536, the struct ABI is mismatched and we need to reconcile libva versions. If sizeof(VAPictureH264) prints as something other than 36 (= 4*5 + 4*VA_PADDING_LOW assuming VA_PADDING_LOW=4), the struct layout on this build differs from the documented libva-2.x layout. Removed once the source of the 65536 is identified. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	c672f19f44	h264: hardcode SPS level_idc = 51 (intentional over-allocation) fourier's h264_va_picture_to_v4l2 never assigns sps->level_idc; the field stays at zero-init. level_idc=0 is invalid per the H.264 spec (lowest legal value is 10, Level 1.0). Hantro and other stateless H.264 decoders use level_idc to pre-allocate decoder resources (DPB size, motion-vector buffers); when fed an invalid level the hantro kernel driver silently skips the decode-hardware dispatch — the V4L2 request completes with no error, DQBUF returns the CAPTURE buffer reporting bytesused=3655712 and no V4L2_BUF_FLAG_ERROR, but the buffer is never written. VAAPI's decode-side VAPictureParameterBufferH264 structurally does NOT include level_idc — `grep level_idc va/va.h` returns only hits inside VAEncSequenceParameterBufferH264 (the encode path). The H.264 SPS NAL is also not included in VASliceDataBuffer because ffmpeg-vaapi parses it client-side and forwards only slice data (verified empirically via patch 0010's hex-dump of the OUTPUT buffer: it contains "00 00 01 65 ..." — i.e. ANNEX_B start code + IDR slice NAL byte, no SPS NAL). A SPS-NAL byte extractor is therefore not viable from the bitstream libva-v4l2-request receives. Workaround: hardcode level_idc = 51 (= Level 5.1, max for 1080p and 4K@30 mainstream consumer profiles). This INTENTIONALLY OVER-ALLOCATES decoder resources but is sufficient for any stream up to 4K@30. It is corpus-correct, not contract-correct: a 4K@60 stream (Level 5.2) would under-allocate. This patch is a known-incomplete intermediate, not a final fix. The proper upstreamable answer is a level-from-resolution derivation per H.264 Annex A.3 (max MB rate / max frame size thresholds). That requires mapping consumer-side framerate which VAAPI does not expose, so the lookup table is non-trivial. The TODO is captured inline. This patch's goal is unblocking decode-hardware engagement on the ohm_gl_fix corpus while the full level-derivation work proceeds. 
Cross-reference: kernel doc ext-ctrls-codec-stateless.rst V4L2_CID_STATELESS_H264_SPS lists level_idc as a required field with no "kernel-derives" annotation — i.e., userspace-required. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	841f616e74	h264: gate SCALING_MATRIX submission on VAIQMatrixBuffer presence VAAPI signals "explicit scaling lists are present in the bitstream" implicitly: the consumer (ffmpeg-vaapi, mpv, etc.) sends a VAIQMatrixBufferH264 alongside RenderPicture iff sps_scaling_matrix_present_flag || pps_scaling_matrix_present_flag. When the bitstream uses default (flat) scaling, no IQMatrixBuffer arrives and the in-tree h264.matrix struct stays zero-initialised. fourier's existing codec_store_buffer for MPEG2 and HEVC tracks this via a per-surface iqmatrix_set boolean (surface.h::mpeg2.iqmatrix_set, h265.iqmatrix_set) — the H.264 path was missing the equivalent flag, so set_controls always submitted the scaling matrix, including the zero-initialised case. Symptom on hantro-vpu RK3568: when TRANSFORM_8X8_MODE is enabled in PPS, the kernel multiplies all 8x8 DCT coefficients by the zeroed scaling_list_8x8, producing a zeroed CAPTURE buffer despite a successful decode round-trip (no V4L2_BUF_FLAG_ERROR, bytesused=3655712 reported). Earlier draft of this patch unconditionally omitted SCALING_MATRIX in FRAME_BASED. That's corpus-correct (bbb has no explicit scaling lists) but the wrong predicate: the kernel-side gating is by "matrix-supplied vs. not," not by decode mode. Streams that signal explicit scaling lists must submit SCALING_MATRIX in either mode. Contract verification (audit_0008_decode_params_2026-05-01.md + hantro_h264.c::assemble_scaling_list): the kernel uses the supplied matrix when SCALING_MATRIX is in the control batch and falls back to spec-defined defaults when absent. Mode-independent. This patch: - surface.h: adds bool matrix_set to params.h264, mirroring mpeg2.iqmatrix_set / h265.iqmatrix_set. - picture.c codec_store_buffer (H.264 VAIQMatrixBufferType case): sets matrix_set = true when the buffer arrives. - picture.c RequestBeginPicture: resets matrix_set = false at the start of each Begin/Render/End cycle. 
- h264.c h264_set_controls: builds the controls[] array incrementally; SPS/PPS/DECODE_PARAMS always; SCALING_MATRIX iff matrix_set; SLICE_PARAMS only in SLICE_BASED; PRED_WEIGHTS only when both SLICE_BASED and V4L2_H264_CTRL_PRED_WEIGHTS_REQUIRED. The pre-existing FRAME_BASED-omits-SLICE_PARAMS rule is preserved — kernel doc ext-ctrls-codec-stateless.rst:752: "When this mode is selected, the V4L2_CID_STATELESS_H264_SLICE_PARAMS control shall not be set." Cross-reference: kernel UAPI section ext-ctrls-codec-stateless.rst V4L2_CID_STATELESS_H264_SCALING_MATRIX (matrix supplied iff explicit scaling lists in bitstream) and hantro_h264.c::assemble_scaling_list (consumes supplied matrix or falls back to defaults). Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	86a8545146	h264: fill DECODE_PARAMS frame_num + field flags from VAAPI Fourier's h264_va_picture_to_v4l2 only populated four fields of the struct v4l2_ctrl_h264_decode_params: dpb (via h264_fill_dpb), nal_ref_idc, top_field_order_cnt, bottom_field_order_cnt, and the IDR_PIC flag. Many other required-by-spec fields were left at zero-init (frame_num, idr_pic_id, pic_order_cnt_lsb, delta_pic_order_cnt_*, dec_ref_pic_marking_bit_size, pic_order_cnt_bit_size, slice_group_change_cycle, FIELD_PIC and BOTTOM_FIELD flags). For an IDR (first frame) on hantro-vpu RK3568, the kernel parses the bitstream from the OUTPUT buffer and uses these fields to drive its bitstream-element offset tracking. Empirically the kernel returned a successfully-decoded but ZEROED CAPTURE buffer — flat dark-green frames in mpv output, no errors logged. This patch fills every field VAAPI exposes: - frame_num: from VAPicture->frame_num. - FIELD_PIC flag: from VAPicture->pic_fields.bits.field_pic_flag. - BOTTOM_FIELD flag: from VAPicture->CurrPic.flags & VA_PICTURE_H264_BOTTOM_FIELD. Also corrects the IDR_PIC flag to use |= instead of = so the new field flags don't clobber it. Fields NOT derivable from VAAPI's pre-parsed structures — idr_pic_id, pic_order_cnt_lsb, delta_pic_order_cnt_*, dec_ref_pic_marking_bit_size, pic_order_cnt_bit_size, slice_group_change_cycle — require a slice_header() bit-level parse. libva-v4l2-request does not currently do this. They remain at zero-init. Empirical question this patch answers: does hantro tolerate the bit_size fields being zero for IDR frames, or does it strictly require them? If post-patch CAPTURE is still zeroed, a slice-header parser is required. If CAPTURE shows real picture data, hantro fills in the bit-positions itself when no hint is supplied. Cross-reference: gstv4l2codech264dec.c::gst_v4l2_codec_h264_dec_fill_decoder_params (commit 9e3e775, lines 632-678). Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	4246d5d537	h264: omit per-slice controls in FRAME_BASED mode Identified by cross-reference against GStreamer's gst-plugins-bad/sys/v4l2codecs/gstv4l2codech264dec.c (upstream commit 9e3e775). At lines 1263-1304, GStreamer gates SLICE_PARAMS and PRED_WEIGHTS submission on is_slice_based(self): if (is_slice_based (self)) { control[num_controls].id = V4L2_CID_STATELESS_H264_SLICE_PARAMS; ... control[num_controls].id = V4L2_CID_STATELESS_H264_PRED_WEIGHTS; ... } In V4L2_STATELESS_H264_DECODE_MODE_FRAME_BASED, the kernel parses the bitstream itself from the OUTPUT-queue payload; per-slice controls in the request trigger cluster-validation EINVAL at error_idx=count (observed on RK3568 hantro-vpu, kernel 6.19.10). This patch: - Reorders controls[] so FRAME_BASED-required entries come first (SPS, PPS, SCALING_MATRIX, DECODE_PARAMS at indices 0..3) and the SLICE_BASED-only entries come last (SLICE_PARAMS, PRED_WEIGHTS at indices 4..5). - Defaults num_controls=4 (FRAME_BASED), expanding to 5 for SLICE_BASED and 6 when V4L2_H264_CTRL_PRED_WEIGHTS_REQUIRED. - Hardcodes slice_based=false for now since patch 0002 sets the device to FRAME_BASED unconditionally. A TODO marks the spot for the planned probe-then-set commit, which will populate context->decode_mode at CreateContext via VIDIOC_QUERYCTRL/G_EXT_CTRLS and replace the hardcoded false with a runtime check. Diagnosis chain: - patch 0005 reduced one EINVAL per frame on PRED_WEIGHTS submission, but cluster-level rejection persisted at error_idx=5 (count) — meaning kernel walked all 5 controls cleanly but rejected the request as a whole. - dmesg silent → rejection in V4L2 core (v4l2-ctrls-request.c / v4l2-h264.c), not in hantro driver where it could log. - GStreamer reference confirmed FRAME_BASED contract: only 4 sequence-and-frame-level controls go in the per-request batch. After this patch the kernel should accept the per-request controls and actually decode the bitstream into the CAPTURE buffer. 
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	e382c63e20	h264: submit PRED_WEIGHTS only when WEIGHTED_PRED applies Per kernel UAPI (include/uapi/linux/v4l2-controls.h), V4L2_CID_STATELESS_H264_PRED_WEIGHTS is a conditional control: V4L2_H264_CTRL_PRED_WEIGHTS_REQUIRED(pps, slice) := ((pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED) && (slice_type == P || slice_type == SP)) || (pps->weighted_bipred_idc == 1 && slice_type == B) Submitting PRED_WEIGHTS on a frame where the macro evaluates false triggers VIDIOC_S_EXT_CTRLS to return EINVAL at error_idx=5 (the 6th, last control in the per-request batch) on hantro-vpu and any other driver that strictly enforces the spec. Smoke trace from RK3568 hantro on bbb_1080p30 (Main profile, no weighted prediction): every per-frame batch fails identically, 13 EINVALs over a 10-frame run. Without this fix, ffmpeg's vaapi-copy falls back to software decode for every frame. Fix: narrow num_controls to 5 (excluding PRED_WEIGHTS at index 5) when the macro returns false; keep at 6 when it returns true. Defect found and fixed via Phase 6 Step 1 ohm smoke testing. Not part of Sonnet's six-commit upstreamable plan; slotted in as patch 0005 ahead of the planned probe-then-set / FRAME_BASED commits because it unblocks per-frame submission on every backing driver, not just hantro. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>	2026-05-04 09:45:05 +00:00
test0r	c45fea96e3	fourier-local: stateless control modernization + HEVC strip Compound patch carrying the fork's pre-Step-1 substrate, originally authored by Jernej Škrabec / fourier on top of bootlin's `a3c2476`: - src/h264.c + src/picture.c: V4L2_CID_MPEG_VIDEO_H264_* renamed to V4L2_CID_STATELESS_H264_*, struct shapes tracked to mainline (V4L2_CID_STATELESS_H264_DECODE_MODE/_START_CODE added to the passthrough shim). - include/hevc-ctrls.h: redirect shim to <linux/v4l2-controls.h> (kernel-side HEVC controls now live in the canonical UAPI header). - src/meson.build: src/h265.c / src/h265.h commented out — HEVC build path is excluded from this fork (RK3568 hantro G1/G2 has no HEVC, and the kernel-side HEVC controls have a separate rework in flight upstream). - src/tiled_yuv.S: aarch64 stub for tiled_to_planar (assembly source was sunxi-cedrus armv7-only; aarch64 needs a stub to keep the build linking). - include/h264-ctrls.h: removed (dead post-fourier — no source includes it; the passthrough shim's CID aliases live in the kernel header now). Functionally equivalent to the prior fork master commits: `c1f5108` V4L2_PIX_FMT_H264_SLICE rename `4ccbfe9` Strip HEVC build path `da9f2a5` include/h264-ctrls.h passthrough + CID aliases `fc4bb10` src/h264.c track upstream UAPI shape `13e9b64` src/h264.c drop num_slices field `4d14ffb` src/tiled_yuv.S aarch64 stub `1b02c9b` src/h264.c include utils.h Folded into one commit during 2026-05-04 Step 1 reconciliation (see ../phase0_evidence/2026-05-04/findings.md). Per-patch history of the early fork commits preserved on the pre-step1 branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 09:40:14 +00:00
Paul Kocialkowski	b5cee9f480	include: Update headers to latest series Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>	2019-05-16 16:14:55 +02:00
Paul Kocialkowski	0c611c6b7a	Implement proper timestamping for references Reference frames are now identified using their timestamp: set the timestamp when queuing the output buffer and use it to identify the frame later on. Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>	2019-03-07 11:41:56 +01:00
Paul Kocialkowski	3176adf69c	Include local copies of DRM and V4L2 codec definitions Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>	2019-03-07 11:37:12 +01:00
Paul Kocialkowski	518d7a0c59	Update and harmonize heading author lists Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>	2019-03-07 11:37:12 +01:00
Maxime Ripard	111f5b209a	tree: Rename cedrus_data to request_data The cedrus_data structure carries the old name. In order to migrate to the new name, let's rename it to request_data. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 17:02:23 +02:00
Maxime Ripard	4ad990e087	tree: Rename the header and defines The sunxi_cedrus.h header contains a bunch of defines prefixed with SUNXI_CEDRUS. As part of the ongoing migration to a more generic name, change that prefix to V4L2_REQUEST, and the header file to request.h Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 17:02:23 +02:00
Maxime Ripard	2d1bce38c2	h264: Don't set num_slices anymore The num_slices parameter was improperly set to the number of reference frames, which is incorrect. Add a counter for the number of slices per surface, and set num_slices to that value. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 15:28:55 +02:00
Maxime Ripard	38d38134c7	h264: Set PPS pic_init_qp_minus26 field The pic_init_qp_minus26 must be set but was not until now. Fix this. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 15:28:55 +02:00
Maxime Ripard	1fca951c05	h264: Fix prediction weight table The current code sets the prediction weight table by doing a memcpy of the libva structure to the v4l2's structure. However, for the offset and weight parameters, libva's structure uses 16-bit integers, while v4l2 uses 8-bit ones, which obviously doesn't work well with memcpy. Create a function to copy those arrays and matrices instead that follows the algorithm defined in the H264 spec, and use it so that it works properly. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 15:28:55 +02:00
Maxime Ripard	e7c09a336f	h264: Implement local cache of the latest decoded pictures The libva only provides the reference images needed to decode the current picture, but not the full DPB. However, some codecs need that whole DPB in order to decode a picture. For example, the Allwinner hardware codec has an internal SRAM, with each picture getting a slot in that SRAM, and during each decoding process, some metadata will then be generated from that SRAM content to a separate buffer. Therefore, each frame must be located at the same SRAM position each time so that the metadata are then re-used properly. However, since libva will only pass a few reference images, we can end up in a situation where multiple, subsequent, frames will have the same reference images set, but might all be used as reference later on and cannot therefore be located at the same position. And from a more theoretical point of view, Linux expects a full blown DPB in its H264 control. In order to work around this, we can create a shadow of the DPB by simply maintaining a list of 16 decoded images, each associated with their VAPictureH264 and an age. This age is the last time we used that frame as reference. When a new picture is decoded, either we assign it to a free slot, or we reuse the slot from the frame that hasn't been used as a reference for the longest time. This is a much simpler approach than the one documented in the H264 spec, but this shouldn't really be a problem since we don't handle the reference frames ourselves, but just re-use the one from the libva, and taken from the bitstream before. As such, frames that are not supposed to be used for reference will not be anymore, their age will not increase, and therefore after a while we will garbage-collect their slot to store a much newer frame. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 15:30:33 +02:00
Maxime Ripard	dadb3d344f	h264: Pass the context to the sub-control functions Some functions setting the controls in the H264 code will need the context in order to access the DPB. Make sure that we pass it as an argument. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 15:29:28 +02:00
Maxime Ripard	acc0cf3475	codecs: pass the context to the controls function as well Some functions setting the controls will need the context in the future. Make sure that we provide it as an argument. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 15:28:55 +02:00
Maxime Ripard	5aeb07f8bf	tree: Run clang-format to conform to the kernel coding style The coding style has been a bit erratic. Enforce the linux kernel coding style by reusing their .clang-format file, running clang-format on the source, and ignoring the few shortcomings that clang-format has at the moment (especially on aligning the define values). Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 10:12:15 +02:00
Maxime Ripard	b938824c48	tree: Shorten struct sunxi_cedrus_driver_data name This long structure name makes it quite difficult to fit within the 80 characters limit. Shorten it. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 09:34:15 +02:00
Maxime Ripard	2208d57b8f	h264: shorten the surface_object parameter name Using the same words but not in the same order for both the type and the variable name isn't particularly helpful, and prevents to stay within 80 characters. Shorten the name a bit. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-17 09:31:17 +02:00
Maxime Ripard	6194f1e7da	h264: Adjust for the latest h264 API changes Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-13 16:10:21 +02:00
Maxime Ripard	22b51f5ced	h264: Fix build failure introduced by previous commit Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>	2018-07-13 16:10:02 +02:00
Maxime Ripard	1efa9d877e	Add support for H264 decoding Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>	2018-07-11 17:07:15 +02:00
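
Note on b0a93e4683 (pic_num fill): the frame-coded computation described in that message can be sketched as below. This is a minimal illustration, not the fork's code. The function names and the `is_long_term` parameter are hypothetical; only the arithmetic (FrameNumWrap for short-term references, frame_idx pass-through for long-term references) comes from the commit message. Field-coded content is not handled, matching the patch's stated scope.

```c
#include <stdint.h>

/* Hypothetical helper: PicNum of a frame-coded short-term reference.
 * For frames, PicNum = FrameNumWrap (H.264 spec section 8.2.4.1). */
static int32_t sketch_frame_num_wrap(uint32_t frame_num,
                                     uint32_t cur_frame_num,
                                     uint8_t log2_max_frame_num_minus4)
{
    int32_t max_frame_num = 1 << (log2_max_frame_num_minus4 + 4);

    /* A reference whose frame_num exceeds the current picture's
     * frame_num was decoded before frame_num wrapped around. */
    if (frame_num > cur_frame_num)
        return (int32_t)frame_num - max_frame_num;
    return (int32_t)frame_num;
}

/* Hypothetical helper: value for v4l2_h264_dpb_entry.pic_num.
 * Assumes the VAAPI convention quoted in the commit message, where
 * frame_idx already carries long_term_frame_idx for long-term refs
 * and frame_num for short-term refs. */
static int32_t sketch_dpb_pic_num(int is_long_term, uint32_t frame_idx,
                                  uint32_t cur_frame_num,
                                  uint8_t log2_max_frame_num_minus4)
{
    if (is_long_term)
        return (int32_t)frame_idx; /* LongTermPicNum, frame-coded only */
    return sketch_frame_num_wrap(frame_idx, cur_frame_num,
                                 log2_max_frame_num_minus4);
}
```

With log2_max_frame_num_minus4 = 0 (max_frame_num = 16), a reference with frame_num 15 seen from a current picture with frame_num 2 wraps to -1, while a reference with frame_num 1 stays at 1.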

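
Note on 05ffd02ff2 (PFRAME / BFRAME flags): a minimal sketch of the `slice_type % 5` mapping described in that message. The function name and the SKETCH_* constants are hypothetical stand-ins so the block compiles without kernel headers; real code would use V4L2_H264_DECODE_PARAM_FLAG_PFRAME and V4L2_H264_DECODE_PARAM_FLAG_BFRAME from <linux/v4l2-controls.h>. Treating SP and SI slices as "no flag" is an assumption of this sketch; the commit message does not say how they are handled.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; take the real ones
 * from the kernel UAPI header. */
#define SKETCH_FLAG_PFRAME 0x08u
#define SKETCH_FLAG_BFRAME 0x10u

/* Hypothetical helper: map a slice_header() slice_type (0..9) to the
 * frame-class flag bits. Values 5..9 repeat 0..4, so the modulo
 * recovers the underlying type in either encoding form. */
static uint32_t sketch_slice_type_to_flags(uint8_t slice_type)
{
    switch (slice_type % 5) {
    case 0: /* P */
        return SKETCH_FLAG_PFRAME;
    case 1: /* B */
        return SKETCH_FLAG_BFRAME;
    default: /* I, SP, SI: no flag set in this sketch (assumption) */
        return 0;
    }
}
```

The result would be OR-ed into decode_params.flags, so that bits set earlier (IDR_PIC, FIELD_PIC, BOTTOM_FIELD) are preserved.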
28 Commits
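
Note on fdb0b728d7 (POC sentinel): a minimal sketch of "detect bit 16, then strip" as described in that message. It is not the body of h264_strip_ffmpeg_poc_sentinel(), which is not shown in the log; the name below is hypothetical. The non-negative guard is an assumption added here, because a negative two's-complement value also has bit 16 set and would otherwise be altered.

```c
#include <stdint.h>

/* ffmpeg initialises prev_poc_msb to (1 << 16) per the commit message. */
#define SKETCH_POC_SENTINEL (1 << 16)

/* Hypothetical helper: remove the leaked sentinel from a POC value.
 * Values without bit 16 set pass through untouched, so an ffmpeg that
 * no longer leaks the sentinel is unaffected. */
static int32_t sketch_strip_poc_sentinel(int32_t poc)
{
    /* Assumption: only non-negative values can carry the sentinel. */
    if (poc >= 0 && (poc & SKETCH_POC_SENTINEL))
        return poc - SKETCH_POC_SENTINEL;
    return poc;
}
```

Applied to the three captured frames in the commit message, 65536, 65542 and 65538 become 0, 6 and 2.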