15 Commits

Author SHA1 Message Date
claude-noether 902d6c17ba ampere-av1 Phase 5 review: stale linked_decode_surface_id clear; remap fix REVERTED
Two of three Phase 5 sonnet-architect review amendments addressed.

Amendment 4 (kept): clear surface_object->linked_decode_surface_id at
BeginPicture after the iter2 Fix 3 release. Prevents stale-link
borrows in copy_surface_to_image when ffmpeg-vaapi recycles a former
display surface as a decode target. No-op for non-AV1 codecs (link
field stays VA_INVALID_SURFACE for them throughout).
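
A minimal sketch of the Amendment 4 reset, assuming a reduced surface struct. `sketch_surface`, `sketch_begin_picture_reset`, and `SKETCH_INVALID_SURFACE` (standing in for VA_INVALID_SURFACE) are illustrative names, not the backend's actual definitions.

```c
#include <stdint.h>

/* Stand-in for VA_INVALID_SURFACE; the value is an assumption of this sketch. */
#define SKETCH_INVALID_SURFACE 0xffffffffu

/* Hypothetical, reduced view of the per-surface state. */
struct sketch_surface {
    uint32_t id;
    uint32_t linked_decode_surface_id;
};

/* When a surface becomes a decode target again, drop any link left over
 * from its earlier use as a display surface. */
static void sketch_begin_picture_reset(struct sketch_surface *s)
{
    s->linked_decode_surface_id = SKETCH_INVALID_SURFACE;
}
```

For codecs that never set the link, the field already holds the invalid value, so the reset changes nothing.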

Amendment 1 (reverted): reviewer proposed remap_lr_type table
{NONE, SWITCHABLE, WIENER, SGRPROJ} per Kwiboo's permutation,
arguing AV1 spec FrameRestoreType wire encoding differs from
V4L2_AV1_FRAME_RESTORE_* enum order. Applied the proposed table
empirically → regressed ALL tests (allintra 10/10 → 0/10, test_av1
bit-exact → DIFF). Reverted to identity mapping. Either VAAPI's
yframe_restoration_type is already in V4L2-enum order, or vpu981
interprets the V4L2 enum values via a mapping that differs from the
uAPI header documentation. Per [[feedback_review_empirical_over_theoretical]]
empirical PASS wins; updated the code comment to capture the
investigation outcome so the next session has the context.
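
To make the two candidate mappings concrete, here is a small sketch. The enum values assume the order NONE, WIENER, SGRPROJ, SWITCHABLE for the V4L2 side; all names are hypothetical and the real constants live in the kernel uAPI header.

```c
#include <stdint.h>

/* Assumed enum order for the sketch: NONE, WIENER, SGRPROJ, SWITCHABLE. */
enum sketch_restore {
    SK_NONE = 0,
    SK_WIENER = 1,
    SK_SGRPROJ = 2,
    SK_SWITCHABLE = 3
};

/* The reviewer-proposed permutation {NONE, SWITCHABLE, WIENER, SGRPROJ}. */
static uint8_t sketch_remap_lr_type(uint8_t wire)
{
    static const uint8_t table[4] = {
        SK_NONE, SK_SWITCHABLE, SK_WIENER, SK_SGRPROJ
    };
    return table[wire & 3];
}

/* The identity mapping that was kept after the regression. */
static uint8_t sketch_identity_lr_type(uint8_t value)
{
    return value & 3;
}
```

The two functions agree only for input 0. If the incoming value is already in enum order, the permutation mislabels every non-NONE type, which would be consistent with the observed regression.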

Amendment 5 (SEPARATE_UV_DELTA_Q sequence flag missing): noted but
not actionable — VAAPI doesn't expose color_config.separate_uv_delta_q.
Will need bitstream-side info to surface. Not blocking current tests.

Verification on ampere:
  test_av1.ivf:             bit-exact PASS sha 029ee72c214b37c1
  av1-1-b8-02-allintra.ivf: 10/10 PASS (no regression)
  av1_larger.ivf:           3/10 PASS (no regression)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 12:19:19 +00:00
claude-noether c839b9456e ampere-av1 Phase 3 finding: iter2 Fix 3 release is NOT the divergence cause
Investigated whether picture.c::BeginPicture's iter2 Fix 3 release-on-
rebind was causing AV1 inter-frame divergence on av1_larger.ivf
(film_grain stress vector). Added env-gated LIBVA_SKIP_REBIND=1
experiment (leak old slot instead of release); A/B run showed identical
3/10 PASS count with and without the release. Hypothesis disproved.
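
An env-gated experiment of this kind can be sketched as below. The helper names are hypothetical, and treating only the exact string "1" as enabled is an assumption; the commit shows only the variable name and the `=1` usage.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Pure helper: only the exact string "1" enables the experiment. */
static bool sketch_flag_value_enabled(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/* Reads the gate from the environment; an unset variable means disabled. */
static bool sketch_env_flag(const char *name)
{
    return sketch_flag_value_enabled(getenv(name));
}
```

A caller would check something like `sketch_env_flag("LIBVA_SKIP_REBIND")` once and branch between the release path and the leak path.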

Where the divergence actually lives:
  - patched ffmpeg-v4l2-request-fourier libavcodec.so with a fwrite
    diag in ff_v4l2_request_append_output → 7 dump files for the
    -frames:v 5 kdirect run, sizes [15133, 3670, 1970, 1323, 812,
    886, 1310] BYTE-IDENTICAL to our LIBVA_V4L2_DUMP_OUTPUT first 7
    submissions for the same input
  - our backend has 2 EXTRA EndPicture calls (t8 size 824, t9 size
    487) on RE-USED surfaces (0x4000008 and 0x4000006)
  - the extras happen because ffmpeg-vaapi's AV1 hwaccel issues
    redecode requests onto surfaces that already hold frames the
    consumer hasn't downloaded yet
  - SKIP_REBIND should let those redecodes' slots stay around but
    doesn't help, because surface_object->current_slot can only
    point at ONE slot at a time and bind_slot overwrites it

True root cause: ffmpeg-vaapi AV1 hwaccel's surface accounting is
incompatible with the iter2 Fix 3 1:1 surface↔slot invariant when
the stream has show_existing_frame frames. Fix would need either
(a) cap_pool tracking N surfaces per slot, or (b) backend reading
ffmpeg-vaapi's display-order mapping and remapping slots accordingly.
Both are non-trivial Phase 4 work — outside this iteration's scope.

Reverted the LIBVA_SKIP_REBIND env-gate to clean shape. Comment
updated with the investigation outcome so the next session has the
context without rediscovering.

State: 3/10 av1_larger frames bit-exact (frames 0/2/4, the
apply_grain=1 IDR-derived ones). test_av1.ivf 208x208 still bit-exact
PASS (no regression). diagnostic logs in BeginPicture +
surface_unbind_slot + v4l2_ioctl_controls retained for future
investigation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 12:12:23 +00:00
claude-noether d7ef0f6cd9 ampere-av1 Phase 3: SEQUENCE byte-equal kdirect; 3/10 frames PASS bit-exact
Three more fixes after strace-diff localization vs kdirect.

Fix 6 — fill_sequence ENABLE_SUPERRES: gate on
picture->pic_info_fields.bits.use_superres instead of unconditional
set-true. VAAPI doesn't expose enable_superres at sequence level; per
strace diff kdirect clears the flag for streams not using superres
(byte 1 of flags was the only SEQUENCE diff). After this fix,
SEQUENCE ctrl byte-equal kdirect on every call.
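
A sketch of the Fix 6 gating, under the assumption that the flag is a single bit in a 32-bit flags word. `SKETCH_SEQ_FLAG_ENABLE_SUPERRES` is a placeholder; the real bit is V4L2_AV1_SEQUENCE_FLAG_ENABLE_SUPERRES from the kernel uAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for the sketch; not taken from the uAPI header. */
#define SKETCH_SEQ_FLAG_ENABLE_SUPERRES (1u << 11)

/* Set the sequence-level flag only when the current frame uses superres,
 * instead of setting it unconditionally. */
static uint32_t sketch_seq_flags(uint32_t flags, bool use_superres)
{
    if (use_superres)
        flags |= SKETCH_SEQ_FLAG_ENABLE_SUPERRES;
    return flags;
}
```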

Fix 7 — refresh_frame_flags = 0xff (was 0): VAAPI doesn't expose
refresh_frame_flags. Default 0xff = "refresh all DPB slots" matches
kdirect's submission and AV1 spec default for KEY/SWITCH frames; for
inter frames simple P-frame chains naturally tolerate this.

Fix 8 — surface_object->av1_order_hint per-surface tracking. Set in
av1_set_controls from picture->order_hint of the current frame. Also
propagated to the linked display surface (when apply_grain=1 →
cur_frame != cur_display) so future frames referencing the display
surface find the order_hint via the linked_decode_surface_id.

Tried + reverted: ref-name iteration of reference_frame_ts / order_hints
via picture->ref_frame_idx[i-1] → DPB slot (Kwiboo's convention via
FFmpeg's s->ref[i]). Empirically regressed 3/10 → 1/10. V4L2 uAPI's
indexing here looks DPB-slot-direct despite the AV1 spec lexicon —
needs kernel-side disambiguation to settle.

Verification on ampere (av1_larger.ivf 352x288, 10 frames):
  Frames 0, 2, 4: PASS bit-exact (apply_grain=1, grain HW path)
  Frames 1, 3, 5-9: DIFF (apply_grain=0)
  3/10 PASS (was 1/10 after iter checkpoint).
  test_av1.ivf 208x208: unchanged bit-exact PASS sha 029ee72c214b37c1

Remaining open: frame 1 (apply_grain=0, first inter) submits IDENTICAL
FRAME ctrl bytes to kdirect (verified strace-diff post-fix), yet
decoded output diverges. That means the divergence is no longer in
control submission — points at OUTPUT-side bitstream differences
between ffmpeg-vaapi and ffmpeg-v4l2request, or at DPB CAPTURE buffer
state (grain-applied data being used as reference vs pre-grain).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:55:07 +00:00
claude-noether 5803cbcf6c ampere-av1 Phase 3 progress: film_grain link + UPDATE_GRAIN; frame 0 bit-exact
Five fixes for AV1 with film_grain on vpu981 (RK3588): Fixes 1-3 are
structural (surface linking), Fixes 4-5 adjust control contents. Output
is no longer empty / crashed; frame 0 (IDR with apply_grain=1) is
bit-exact vs kdirect. Inter frames still diverge.

Fix 1 — surface.h + surface.c: linked_decode_surface_id field on
object_surface, initialized to VA_INVALID_SURFACE. When AV1 picture has
apply_grain=1, VAAPI's VADecPictureParameterBufferAV1 carries a
current_display_picture distinct from current_frame. ffmpeg-vaapi calls
vaBeginPicture on current_frame (decode surface, slot gets bound) but
vaGetImage on current_display_picture (display surface, no slot) → NULL
deref in copy_surface_to_image.

Fix 2 — av1.c: in av1_set_controls, when cur_frame != cur_display, set
display_surface->linked_decode_surface_id = current_frame. Establishes
the back-link so display surface can borrow decode surface's data.

Fix 3 — image.c copy_surface_to_image: when slot is NULL and the
surface has linked_decode_surface_id, lookup the decode surface and
mirror its destination_data[] + destination_sizes[] +
destination_planes_count. NULL guard with diagnostic log retained.
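
The link-following fallback from Fixes 1-3 can be sketched as below. The struct, the linear lookup (standing in for the backend's object heap), and all `sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_SURFACE 0xffffffffu

struct sketch_surface {
    uint32_t id;
    const void *slot;                  /* NULL when no CAPTURE slot is bound */
    uint32_t linked_decode_surface_id; /* SKETCH_INVALID_SURFACE when unset */
};

/* Linear lookup standing in for the object heap. */
static struct sketch_surface *sketch_find(struct sketch_surface *all,
                                          size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++)
        if (all[i].id == id)
            return &all[i];
    return NULL;
}

/* Returns the surface whose slot should be read, or NULL if none is usable. */
static struct sketch_surface *sketch_resolve_source(struct sketch_surface *all,
                                                    size_t n,
                                                    struct sketch_surface *s)
{
    struct sketch_surface *dec;

    if (s->slot != NULL)
        return s;
    if (s->linked_decode_surface_id == SKETCH_INVALID_SURFACE)
        return NULL;
    dec = sketch_find(all, n, s->linked_decode_surface_id);
    return (dec != NULL && dec->slot != NULL) ? dec : NULL;
}
```

A NULL result corresponds to the guarded path that logs a diagnostic instead of dereferencing a missing slot.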

Fix 4 — av1.c fill_film_grain: when apply_grain=1, also set
V4L2_AV1_FILM_GRAIN_FLAG_UPDATE_GRAIN. Confirmed by strace-diff: kdirect
sends flags=0x0B (APPLY|UPDATE|...), libva was sending 0x09 (APPLY but
no UPDATE). Without UPDATE the kernel tries to reuse from
film_grain_params_ref_idx=0, which is never populated. Earlier reverted
because UPDATE seemed to trigger a SEGV — but that SEGV was the
unmasked NULL-slot deref; with fix 1+2+3 in place UPDATE is safe.
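
The flag arithmetic behind Fix 4 is small enough to show directly. The UPDATE bit follows from the strace diff above (0x0B ^ 0x09 = 0x02); treating 0x01 as the APPLY bit is an assumption, and the macro names are placeholders for the V4L2_AV1_FILM_GRAIN_FLAG_* constants.

```c
#include <stdint.h>

/* Placeholder bit values for the sketch. */
#define SKETCH_FG_APPLY  0x01u
#define SKETCH_FG_UPDATE 0x02u

/* Whenever grain is applied, also request that the parameters sent with
 * this frame are used rather than a stored reference set. */
static uint8_t sketch_film_grain_flags(uint8_t flags)
{
    if (flags & SKETCH_FG_APPLY)
        flags |= SKETCH_FG_UPDATE;
    return flags;
}
```

With these values, an input of 0x09 becomes 0x0B, matching the byte kdirect was observed to send.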

Fix 5 — av1.c reference_frame_ts plumbing: when a referenced surface
has timestamp=0 AND linked_decode_surface_id set, follow the link to
find the decode surface that carries the real timestamp. Display
surfaces don't get OUTPUT QBUF'd by us, so their own timestamp stays
zero.

Also: BeginPicture diagnostic log + surface_unbind_slot diagnostic log
+ v4l2.c error_idx diagnostic (kept from earlier — useful for ongoing
investigation).

Verification on ampere:
  test_av1.ivf (208x208, 2 frames, no grain): bit-exact PASS sha
    029ee72c214b37c1 (unchanged, no regression)
  av1_larger.ivf (352x288, 10 frames, film_grain alternates):
    frame 0 (key, apply_grain=1): PASS bit-exact vs kdirect
    frame 4: PASS bit-exact
    frames 1,2,3,5,6,7,8,9: DIFFER

Frame 0 PASS proves: SEQUENCE + FRAME + TILE_GROUP_ENTRY + FILM_GRAIN
mapping is correct for IDR. Frame 4 PASS is unexplained but encouraging.
Inter-frame divergence (frame 1+) points at: reference handling for
inter prediction is still off — either order_hints[] (still zero,
VAAPI doesn't expose per-ref), or grain-applied vs pre-grain DPB
semantics, or ref_frame_idx pointing into the wrong surface space.

Next investigation: per-frame strace diff between libva and kdirect
controls payload to spot remaining field mis-mappings on inter frames.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:45:31 +00:00
claude-noether ab79ed5e4d ampere-av1 Phase 3 in-progress notes: UPDATE_GRAIN segfault; 352x288 still 0-byte
Phase 3 iteration on the av1_larger.ivf (352x288 film_grain-50) fixture.
208x208 test_av1.ivf remains bit-exact PASS at sha 029ee72c214b37c1
(libva == kdirect, post-reference_frame_ts plumbing).

Negative result this commit: setting V4L2_AV1_FILM_GRAIN_FLAG_UPDATE_GRAIN
unconditionally when apply_grain=1 triggered a userspace SIGSEGV in
ffmpeg on the av1_larger fixture (consistent across runs). Reverted to
implicit update_grain=0 — same behavior as before the experiment
(silent 0-byte output, no segfault).

Hypothesis ruled out: the 352x288 silent-decode-failure is NOT solved
by always-update_grain. A/B test earlier also confirmed that omitting
the FILM_GRAIN control entirely (AV1_NO_FG=1) still produces 0 bytes,
so film_grain is not the trigger.

Remaining Phase 3 investigation candidates:
  - tile_info field shape — single-tile av1_larger may stress mi_col/
    row_starts sentinel differently than single-tile test_av1
  - segmentation / quantization fields — different streams use
    different combinations
  - order_hints[] still zero (VAAPI doesn't expose per-ref)
  - kernel-side dev_dbg in vpu981 driver would expose which
    control field validation rejects
  - strace -e ioctl on the failing decode reveals MEDIA_REQUEST_IOC_QUEUE
    return value

Sibling-iteration parallel: ampere-kernel-decoders iter2-5 took
multiple iterations to localize the HEVC OOPS to kernel-side
ext_sps NULL init + slice_params; AV1 likely needs the same depth
of kernel-side instrumentation for the 352x288 case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:31:24 +00:00
claude-noether 5fb7e36955 ampere-av1 Phase 3 fix: wire reference_frame_ts[] from VAAPI ref_frame_map[]
Phase 2.1 first hardware test on ampere passed frame 1 (IDR) bit-exact
vs kdirect but frame 2 (inter) diverged starting at byte 64897. Root
cause: reference_frame_ts[] left at zero — kernel can't cross-reference
prior CAPTURE buffers without timestamps.

Fix: in av1_set_controls (which has driver_data), iterate VAAPI's
ref_frame_map[8] (VASurfaceIDs), look up each via SURFACE(driver_data,
ref_id), and pull v4l2_timeval_to_ns(&ref_surface->timestamp) into the
V4L2 ctrl. VA_INVALID_SURFACE entries stay at calloc-zero. Mirrors the
vp9.c:614-628 pattern scaled to AV1's 8 ref slots.
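
A self-contained sketch of the timestamp plumbing, assuming a flat surface array instead of the SURFACE() lookup. `sketch_timeval_to_ns` repeats the arithmetic of the uAPI helper v4l2_timeval_to_ns so the example builds without kernel headers; all other names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

#define SKETCH_INVALID_SURFACE 0xffffffffu
#define SKETCH_NUM_REF_FRAMES  8

struct sketch_surface {
    uint32_t id;
    struct timeval timestamp; /* set when the OUTPUT buffer was queued */
};

/* Seconds and microseconds folded into nanoseconds. */
static uint64_t sketch_timeval_to_ns(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

static void sketch_fill_ref_ts(const uint32_t ref_map[SKETCH_NUM_REF_FRAMES],
                               const struct sketch_surface *all, size_t n,
                               uint64_t out_ts[SKETCH_NUM_REF_FRAMES])
{
    for (int i = 0; i < SKETCH_NUM_REF_FRAMES; i++) {
        out_ts[i] = 0; /* invalid or unknown references stay zero */
        if (ref_map[i] == SKETCH_INVALID_SURFACE)
            continue;
        for (size_t j = 0; j < n; j++) {
            if (all[j].id == ref_map[i]) {
                out_ts[i] = sketch_timeval_to_ns(&all[j].timestamp);
                break;
            }
        }
    }
}
```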

surface_object->timestamp itself is populated in picture.c::EndPicture
from context_object->timestamp_counter at QBUF time on the OUTPUT
buffer — already in place from iter1 baseline.

Verification on ampere (/tmp/test_av1.ivf 208x208, 2 frames):
  Frame 1 + 2 libva sha 029ee72c214b37c1 == kdirect 029ee72c214b37c1
  → 100% byte-identical, kdirect was Phase 0-verified bit-perfect

order_hints[] still zero — VAAPI doesn't expose per-ref POC; observed
not load-bearing on the 208x208 smoke vector. Multi-tile + film_grain
stress vectors are next (av1-1-b8-23-film_grain-50.ivf).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:28:32 +00:00
claude-noether 85bcddb5ad v4l2: surface error_idx + errno on VIDIOC_S_EXT_CTRLS failure
ampere-av1 Phase 2.1 + 3 diagnostic: log which control failed validation
on S_EXT_CTRLS rejection so debug iterations can identify the offending
CID without strace. Pre-validation failures (error_idx >= count) log as
"<pre-validation>" with the syscall errno surfacing the root reason.

Already informative on ampere — surfaces the pre-existing benign H264 +
HEVC device-init failures on the vpu981 AV1 fd as count=2 / failed_cid=0
(those go through (void)cast at context.c:450/473 by design).
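
The classification this diagnostic performs can be sketched as a pure function. The function name, buffer handling, and output format are hypothetical; only the `error_idx >= count` rule and the "<pre-validation>" label come from the message above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Names the control that failed. error_idx >= count means the ioctl was
 * rejected before any individual control was validated. */
static const char *sketch_failed_ctrl(uint32_t error_idx, uint32_t count,
                                      const uint32_t *cids,
                                      char *buf, size_t len)
{
    if (error_idx >= count)
        return "<pre-validation>";
    snprintf(buf, len, "0x%08x", (unsigned)cids[error_idx]);
    return buf;
}
```

In the real log line, errno from the failed ioctl would be printed alongside this string.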

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:20:31 +00:00
claude-noether 9c30eccd52 ampere-av1 Phase 2.1: implement av1_set_controls body (~500 LoC)
Replaces stub av1_set_controls with full VAAPI → V4L2 stateless AV1
control translation. Four V4L2 controls batched per-frame:
  V4L2_CID_STATELESS_AV1_SEQUENCE       (sequence-level flags)
  V4L2_CID_STATELESS_AV1_FRAME          (heavy — quant, lf, cdef, lr, gm,
                                          tile_info, refs, frame flags)
  V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY[] (DYNAMIC_ARRAY, size=MAX(N,1))
  V4L2_CID_STATELESS_AV1_FILM_GRAIN     (gated on driver_data->has_av1_film_grain)

Reference: Kwiboo/FFmpeg v4l2-request-n8.1:libavcodec/v4l2_request_av1.c
(636 LoC); same V4L2 output schema, sourced from VAAPI's
VADecPictureParameterBufferAV1 instead of FFmpeg's AV1RawSequenceHeader.

VAAPI gap notes (fields the spec needs but VAAPI doesn't expose):
  - sequence max_frame_{width,height}_minus_1 — use current frame size
  - enable_warped_motion / enable_ref_frame_mvs / enable_superres /
    enable_restoration sequence-level — conservative set-true (per-frame
    flags gate actual behavior)
  - order_hints[], reference_frame_ts[] — zero (kernel cross-refs by
    OUTPUT timestamp / surface id)
  - tile_start_col_sb[] / tile_start_row_sb[] — reconstruct via
    prefix-sum on VAAPI's width/height_in_sbs_minus_1[]
  - tile_size_bytes — set to 4 for multi-tile frames (max value), 0
    for single-tile (matches Kwiboo's conditional)
  - render_width/height — fall back to coded dimensions
  - current_frame_id / refresh_frame_flags / skip_mode_frame_idx /
    buffer_removal_time / frame_refs_short_signaling — zero
  - film_grain_params_ref_idx / update_grain — zero (only consulted in
    reuse paths; apply_grain=1 + populated arrays drive decode directly)
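
The prefix-sum reconstruction of tile start positions mentioned in the list above might look like this. The function name and the `count + 1` output convention are assumptions of the sketch; the input mirrors the shape of VAAPI's width/height_in_sbs_minus_1[] arrays.

```c
#include <stdint.h>

/* starts[i] is the superblock column (or row) where tile i begins;
 * starts[count] is one past the last superblock.
 * The caller provides count + 1 output slots. */
static void sketch_tile_starts_sb(const uint16_t *size_in_sbs_minus_1,
                                  unsigned count, uint32_t *starts)
{
    uint32_t pos = 0;

    for (unsigned i = 0; i < count; i++) {
        starts[i] = pos;
        pos += (uint32_t)size_in_sbs_minus_1[i] + 1;
    }
    starts[count] = pos;
}
```

Three tiles of widths 4, 4, and 2 superblocks (stored as 3, 3, 1) give starts 0, 4, 8 and an end position of 10.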

F1/F2/F3 risk mitigations per phase1_plan_v2:
  F1: mi_col/row_starts sentinel = 2 * ((frame_width + 7) >> 3) at
      index [tile_cols]/[tile_rows] — mirrors Kwiboo lines 238/244
  F2: superres_denom direct from VAAPI's superres_scale_denominator
      (VAAPI's encoding is the final value; no AV1_SUPERRES_DENOM_MIN
      math). Fallback to AV1_SUPERRES_NUM=8 if zero.
  F3: loop_restoration_size[] gated on USES_LR flag derived from
      y_t != 0 || cb_t != 0 || cr_t != 0 — mirrors Kwiboo lines 281-287

Plus:
  - request.h: has_av1_film_grain bool on driver_data
  - request.c: probe VIDIOC_QUERY_EXT_CTRL for FILM_GRAIN on vpu981 fd
    at VA_DRIVER_INIT (Janet v3 amendment A: init-time, not lazy)

Compile-tested on boltzmann (aarch64 native, gcc 15.2.1): clean .so,
0 errors, pre-existing GStreamer #warnings only.

Phase 3 verification on ampere is next: 208x208 smoke + film_grain
stress vector (av1-1-b8-23-film_grain-50.ivf) byte-compare libva vs
kdirect (Phase 0 proved kdirect bit-perfect).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:18:46 +00:00
claude-noether 78a9978b02 ampere-av1 Phase 2 step 4: AV1 dispatch scaffolding compiles and wires
surface.h: av1 substruct (picture + tile_group_entries[AV1_MAX_TILES=128]
  + num_tile_group_entries counter)
picture.c: dispatch VADecPictureParameterBufferAV1 + VASliceParameterBufferAV1
  into surface->params.av1.*; call av1_set_controls in EndPicture path
av1.h: minimal interface (av1_set_controls signature)
av1.c: stub set_controls returning -1 with diagnostic; _Static_assert on
  v4l2_ctrl_av1_tile_group_entry size = 16 (Janet hygiene)
meson.build: av1.c + av1.h in source list

Verified on ampere with /tmp/test_av1.ivf via LIBVA_DRIVER_NAME=v4l2_request:

  v4l2-request: ampere-av1: vpu981 AV1 decoder at /dev/video4 + /dev/media3
  v4l2-request: ampere-av1: av1_set_controls stub — Phase 2.1 will implement ...
  [av1] Failed to end picture decode issue: 1 (operation failed).
  [av1] HW accel end frame fail.
  [dec:av1] Error submitting packet to decoder: Input/output error

Clean graceful failure — vpu981 probe works, dispatch reaches av1.c,
stub returns ERROR, ffmpeg falls back to SW. No crash, no IOMMU fault,
no kernel taint.

Next: Phase 2.1 implementation of fill_sequence + fill_frame +
fill_film_grain + fill_tile_group_entries (~700 LoC mirror of Kwiboo
v4l2_request_av1.c, applying F1/F2/F3 implementation-time corrections
from Janet review v2).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:55:39 +02:00
claude-noether 61db76ebcf ampere-av1 Phase 2 step 2: advertise VAProfileAV1Profile0 via libva
Extended any_fd_supports_output_format() with vpu981 fd as 4th probe
target. Added V4L2_PIX_FMT_AV1_FRAME advertisement in
RequestQueryConfigProfiles. VAProfileAV1Profile0 in entrypoints +
GetConfigAttributes switches.

V4L2_REQUEST_MAX_PROFILES=11 now exactly full; comment added warning
about future profile additions needing the constant bumped.

Verified via vainfo:
  VAProfileMPEG2Simple/Main, H264×5, HEVC, VP8, AV1   — all advertised
  (VP9 absent because rkvdec module is on sibling-campaign-close
   state, not the broken vp9-iter1; restoring VP9 needs the
   ampere-vp9-enablement campaign reopened or the fail-state module
   reloaded.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:54:12 +02:00
claude-noether bed75c0cef ampere-av1 Phase 2 step 1: third-device fd scaffolding for vpu981
RK3588 has THREE hantro-vpu instances (legacy MPEG2/VP8 at /dev/video2,
encoder at /dev/video3, vpu981 AV1 at /dev/video4). The existing
2-device probe is "RK3399-shaped knowledge" — silently picks the first
hantro-vpu and never finds vpu981.

This commit adds:
- video_fd_vpu981 + media_fd_vpu981 slots to request_data
- video_node_supports_output_fmt(): capability probe via VIDIOC_ENUM_FMT
  on OUTPUT/OUTPUT_MPLANE queues
- find_decoder_device_by_driver_with_fmt(): walks /dev/media* matching
  driver name AND capability filter (V4L2_PIX_FMT_AV1_FRAME for vpu981)
- 'a' kind in request_device_kind_for_profile (VAProfileAV1Profile0)
- 'a' branch in request_switch_device_for_profile
- vpu981 probe at backend init, alongside existing rkvdec + hantro
- vpu981 fd cleanup in RequestTerminate
- VAProfileAV1Profile0 → V4L2_PIX_FMT_AV1_FRAME in codec.c

Verified on ampere:

  $ LIBVA_DRIVER_NAME=v4l2_request ffmpeg ... 2>&1 | grep iter38
  v4l2-request: auto-selected codec device: /dev/video1 + /dev/media0
  v4l2-request: iter38: also opened hantro-vpu decoder at /dev/video2 + /dev/media1
  v4l2-request: ampere-av1: vpu981 AV1 decoder at /dev/video4 + /dev/media3

Three devices opened. HEVC still works (iter2 EXT_SPS_RPS probe still
triggers on rkvdec, sibling-campaign bit-perfect behaviour preserved).

Next steps: config.c advertise VAProfileAV1Profile0, surface.h add
av1 substruct, picture.c dispatch, av1.{c,h} for the codec dispatch
(~700 LoC mirroring Kwiboo v4l2_request_av1.c).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:53:37 +02:00
claude-noether 1a2c958ab3 iter2 step4: wire h265_set_controls to populate EXT_SPS_*_RPS controls
Per Phase 4 plan + Phase 5 review amendments (SPS parse-and-cache,
per-fd gating).

src/h265.c additions:
  - #include <errno.h>, the v4l2-hevc-ext-controls.h, and the
    vendored gst/codecparsers/gsth265parser.h
  - new static helper h265_populate_ext_sps_rps_cache(): walks
    surface_object->source_data for an SPS NAL (nal_unit_type == 33)
    using gst_h265_parser_identify_nalu; if found, calls
    gst_h265_parser_parse_sps_ext (NOT gst_h265_parser_parse_sps —
    the latter discards the per-RPS-entry EXT data we need); maps
    GstH265ShortTermRefPicSet (base) + GstH265ShortTermRefPicSetExt
    (carrying use_delta_flag[16], used_by_curr_pic_flag[16],
    delta_poc_s0_minus1[16], delta_poc_s1_minus1[16]) into the V4L2
    struct arrays; stores on driver_data->hevc_rps_cache_*
  - non-IDR-frame handling: cache holds across frames, so frames
    whose source_data lacks an SPS NAL reuse the previously-parsed
    cached arrays (Phase 5 review item #3)
  - controls[] grows from [5] to [7]; the 2 new entries are appended
    after the standard 5 (SPS/PPS/SLICE_PARAMS/SCALING_MATRIX/
    DECODE_PARAMS), gated by driver_data->has_hevc_ext_sps_rps_rkvdec
    (per-fd probe result from Step 3) + the cache being valid
  - field-by-field mapping mirrors GStreamer's
    gst_v4l2_codec_h265_dec_fill_ext_sps_rps verbatim (the upstream
    reference identified in Phase 0 prior-art survey)
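
The gating of the two extra controls can be reduced to a count computation, sketched here with hypothetical names. It assumes both conditions named above (per-fd probe result and a valid cache) must hold before the EXT controls are appended.

```c
#include <stdbool.h>

#define SKETCH_BASE_HEVC_CTRLS 5 /* SPS, PPS, SLICE_PARAMS, SCALING_MATRIX, DECODE_PARAMS */
#define SKETCH_EXT_HEVC_CTRLS  2 /* the two EXT_SPS_*_RPS controls */

/* Number of controls to submit for one frame. */
static unsigned sketch_hevc_ctrl_count(bool fd_has_ext_rps, bool cache_valid)
{
    unsigned n = SKETCH_BASE_HEVC_CTRLS;

    if (fd_has_ext_rps && cache_valid)
        n += SKETCH_EXT_HEVC_CTRLS;
    return n;
}
```

Because the cache persists across frames, a frame without an SPS NAL still sees `cache_valid` as true once an earlier frame has been parsed.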

src/request.h additions:
  - struct request_data carries hevc_rps_cache_st (array pointer),
    _st_count, hevc_rps_cache_lt, _lt_count, hevc_rps_cache_valid.
    Single-slot cache (sps_id 0 only; multi-SPS streams would need
    expanding). Stores POST-MAPPED V4L2 structs so request.h doesn't
    need to know GstH265SPS / GstH265SPSEXT types.

Critical interpretation correction (Phase 5 review followup):
GstH265SPS has short_term_ref_pic_set[65] (base) but NOT
short_term_ref_pic_set_ext[]. The EXT array lives on a SEPARATE
GstH265SPSEXT struct accessed via gst_h265_parser_parse_sps_ext.
The 'plain' gst_h265_parser_parse_sps internally calls _ext with a
LOCAL discarded SPSEXT (see gsth265parser.c:2050). Our call must
use the _ext variant directly to keep the EXT data. Caught during
Step 4 first-build error.

Build verified: ninja -C build clean. .so is 759 KB (up from 485 KB
original, 682 KB after Step 2 vendor — the +80 KB is the new helper
+ extension).

iter2 Phase 6 Step 5 (install + reboot + smoke-test) is the F1
falsifier moment: if HEVC stops OOPSing, mechanism confirmed; if it
still OOPSes, loop back to Phase 0 with kernel-agent#11 re-opened.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 11:09:58 +02:00
claude-noether 4f6ba6c0e3 iter2 step3: HEVC EXT_SPS_*_RPS UAPI header + runtime probe
src/hevc-ctrls/v4l2-hevc-ext-controls.h (NEW, MIT, ~95 LOC):
  Verbatim mirror of Linux 7.0 V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS
  and _LT_RPS control IDs + struct definitions + flag macros. Each
  symbol is ifndef-guarded so when ampere's linux-api-headers
  eventually bumps to 7.0+, the kernel header takes precedence and
  this shim silently no-ops. Citation block links the upstream
  Casanova v8 series.

  Per LGPL section 3.b, kernel UAPI struct definitions are excepted
  from GPL inheritance, so copying them into MIT userspace is fine.

src/request.h: added has_hevc_ext_sps_rps_rkvdec + _hantro bool
  fields on struct request_data — pair-of-flags layout mirrors
  video_fd_rkvdec / video_fd_hantro (iter38 multi-device-probe
  pattern, per feedback_multi_device_probe_design). Phase 5 review
  identified single-scalar storage as a silent-misbehavior risk
  across device-switch boundaries.

src/request.c:
  - new probe_hevc_ext_sps_rps_controls(fd) helper: queries the two
    new CIDs via VIDIOC_QUERYCTRL; returns true iff both register.
    RK3399 rkvdec (linux 6.x or 7.x without VDPU381/383 bindings)
    returns false; RK3588 rkvdec (VDPU381/383) returns true.
  - probe each driver_data->video_fd_rkvdec / _hantro after the
    iter38 multi-device-probe block at VA_DRIVER_INIT time
  - log-line if rkvdec supports it - diagnostic for Phase 7

src/meson.build: added the new UAPI header to the headers list.

Build verified: ninja -C build clean, .so produced. The new probe
runs at driver init and stores the result, but nothing CONSUMES the
result yet — that's Step 4 (h265_set_controls wiring).

Per ampere-kernel-decoders campaign iter2 Phase 4 step 3 (amended
by Phase 5 review item 'per-fd storage').

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 11:08:10 +02:00
claude-noether c5fbc5bf04 iter2 step2: GLib/GStreamer compat shim, build succeeds
Vendored gsth265parser + nalutils + gstbitreader + gstbytereader (the
Step 1 commit) compile cleanly against libc + libv4l2 only after
adding 1 compat translation unit + 5 stub headers, no edits to the
vendored .c/.h files themselves.

src/h265_parser/gst_compat.{h,c} — new files (MIT, original work):
  - GLib type aliases (gboolean, gchar, gint*, guint*, gsize, gpointer)
  - Memory helpers (g_malloc/g_free as #define free, g_memdup2 inline)
  - Asserts as no-op + parser-return-code-propagation
  - All GST_DEBUG/INFO/WARNING/ERROR/LOG/FIXME as no-ops (the parser
    is heavy on debug logging; we compile it all out)
  - GArray implementation (~100 LOC, just enough for gsth265parser.c's
    24 call sites)
  - GList full struct with .data/.next/.prev so callers compile;
    list-manipulation functions abort() — dead code paths only
  - Byte-order read/write macros (GST_READ_UINT8/16/24/32/64_LE/BE,
    GST_WRITE_UINT8/16/24/32_BE) — aarch64 LE inlines
  - g_once_init_enter/leave as simple gate
  - G_MAXUINT*, G_MAXINT*, G_MINxxx, G_GNUC_* attribute macros, etc.
  - Opaque GstBuffer/GstMemory/GstMapInfo + abort-stub functions for
    the encoder-side SEI-insertion paths the libva backend never invokes
  - gst_util_ceil_log2 real impl (used by slice-header parser; dead
    for our SPS-only call path but cheaper to implement than stub)
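
To illustrate the style of shim listed above, here is a reduced sketch. It is not the contents of gst_compat.h: the selection of aliases is arbitrary and `sketch_read_uint16_be` is a hypothetical stand-in for the byte-order read macros.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* GLib-style type aliases, reduced to what this sketch uses. */
typedef int gboolean;
typedef uint8_t guint8;
typedef uint16_t guint16;
typedef size_t gsize;

/* Memory helpers mapped straight onto libc. */
#define g_malloc(n) malloc(n)
#define g_free(p)   free(p)

/* Debug logging compiled out entirely. */
#define GST_DEBUG(...)   do { } while (0)
#define GST_WARNING(...) do { } while (0)

/* Byte-wise big-endian read; independent of host endianness. */
static inline guint16 sketch_read_uint16_be(const guint8 *p)
{
    return (guint16)((p[0] << 8) | p[1]);
}
```

Keeping all of this in one header is what allows the vendored parser sources to stay byte-identical to upstream.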

src/h265_parser/gst/{gst.h,base/base-prelude.h,base/gstbitwriter.h,
codecparsers/codecparsers-prelude.h,glib-compat-private.h} — 5 new
stub headers (MIT). All include gst_compat.h. gstbitwriter.h adds
abort-stub functions for the bit-writer API (used by nalutils.c's NAL
emulation-prevention encoder path — dead code for the parse-only
libva backend).

src/meson.build — added the 5 new .c source files and 10 new .h
headers; added include_directories('h265_parser') to the include path
so the vendored files' '#include <gst/base/...>' style references
resolve to the stub headers + actual vendored files in the local
tree.

Build verified: ninja -C build produces v4l2_request_drv_video.so
(682 KB, up from 485 KB pre-vendor — the +200 KB is the vendored
parser code). nm shows gst_h265_parse_sps, gst_h265_parse_sps_ext,
gst_h265_parser_identify_nalu, and the other functions we need for
Step 4 are present in the binary.

Two #warning messages from gsth265parser.h about API stability are
upstream-intentional and harmless ('The H.265 parsing library is
unstable API and may change in future').

This commit completes Step 2 of ampere-kernel-decoders iter2 Phase 6.
Backend remains functionally identical to pre-iter2 — the new code
compiles + links but is not yet called from h265_set_controls (that's
Step 4). Existing 5 codecs continue to work as before.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 11:06:30 +02:00
claude-noether f91c3f53c5 iter2 step1: vendor GStreamer 1.28.2 H.265 parser unchanged
Source: gitlab.freedesktop.org/gstreamer/gstreamer @ commit 43421c2a5b8a
(refs/tags/1.28.2). All 8 vendored files copied verbatim into
src/h265_parser/:

  gst-plugins-bad/gst-libs/gst/codecparsers/gsth265parser.c (168 KB)
  gst-plugins-bad/gst-libs/gst/codecparsers/gsth265parser.h ( 92 KB)
  gst-plugins-bad/gst-libs/gst/codecparsers/nalutils.c       (13 KB)
  gst-plugins-bad/gst-libs/gst/codecparsers/nalutils.h       (  8 KB)
  gstreamer/libs/gst/base/gstbitreader.c                     (  8 KB)
  gstreamer/libs/gst/base/gstbitreader.h                     ( 10 KB)
  gstreamer/libs/gst/base/gstbytereader.c                    ( 39 KB)
  gstreamer/libs/gst/base/gstbytereader.h                    ( 25 KB)

Total ~11 KLOC, LGPL v2.1+ per original headers (Intel + Sreerenj
Balachandran + others). LGPL headers preserved verbatim. Backend's
existing COPYING.LGPL covers redistribution.

** Build is INTENTIONALLY BROKEN at this commit. ** GLib dependencies
(GArray, g_malloc, gboolean, GST_DEBUG, etc.) are not yet satisfied;
src/Makefile.am is not yet updated to include these files. Step 2
performs the GLib-to-libc mechanical adaptation; Step 3 wires the
header + Makefile.

This vendor-unchanged commit is the upstream-tracking baseline. When
GStreamer ships a parser bug fix, the future-sync workflow is:
  git diff <this commit>..HEAD -- src/h265_parser/
to surface our adaptations, then rebase those over the upstream fix.

Per ampere-kernel-decoders campaign iter2 Phase 4 §Step 1
(/home/mfritsche/src/ampere-kernel-decoders/phase4_plan_iter2.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 11:02:12 +02:00
29 changed files with 13319 additions and 8 deletions
@@ -0,0 +1,689 @@
/*
* Copyright (C) 2026 claude-noether <claude-noether@reauktion.de>
*
* ampere-av1-enablement Phase 2.1: AV1 codec dispatcher for libva-v4l2-
* request-fourier. Translates VAAPI AV1 picture/slice parameter buffers
* into V4L2 stateless AV1 controls (V4L2_CID_STATELESS_AV1_*) for the
* Rockchip vpu981 hardware on RK3588.
*
* Reference: Kwiboo/FFmpeg v4l2-request-n8.1:libavcodec/v4l2_request_av1.c
* (636 LoC; reads from FFmpeg's AV1RawSequenceHeader + AV1RawFrameHeader).
* VAAPI exposes the same AV1 spec semantics through different struct
* shapes: sequence-level fields are folded into VADecPictureParameterBufferAV1
* (no separate sequence buffer); per-frame fields live in the same struct.
*
* F1/F2/F3 risk mitigations per phase1_plan_v2 §"General fill_frame
* implementation risks":
* F1 tile_info.mi_col/row_starts sentinel = 2 * ((frame_width + 7) >> 3)
* mirrors Kwiboo lines 238/244 exactly.
* F2 superres_denom: VAAPI exposes superres_scale_denominator directly
* and per spec it's already 8 when use_superres=0. No offset math
* needed (Kwiboo does it because FFmpeg stores raw coded_denom).
* F3 loop_restoration_size[] gated on USES_LR flag mirrors Kwiboo
* lines 281-287 exactly.
*
* V4L2 controls (4 per frame, batched in one VIDIOC_S_EXT_CTRLS):
* 1. V4L2_CID_STATELESS_AV1_SEQUENCE
* 2. V4L2_CID_STATELESS_AV1_FRAME
* 3. V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY[] (DYNAMIC_ARRAY)
* 4. V4L2_CID_STATELESS_AV1_FILM_GRAIN (conditional on driver_data->
* has_av1_film_grain probe)
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
* THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
#include "av1.h"
#include "context.h"
#include "object_heap.h"
#include "request.h"
#include "surface.h"
#include "utils.h"
#include "v4l2.h"
#include <va/va.h>
#include <linux/videodev2.h>
#include <linux/v4l2-controls.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
/* Sanity asserts to catch kernel uAPI drift. If these fire, the kernel
* headers on the build machine are out of sync with what the running
* driver expects — silent register-misalignment bugs result. Cross-compile
* hazard per Janet v3 amendment: native-arm64 builds only (boltzmann +
* ampere); no cross from x86 against ARM kernel headers. */
_Static_assert(sizeof(struct v4l2_ctrl_av1_tile_group_entry) == 16,
"v4l2_ctrl_av1_tile_group_entry size drift — recheck uAPI");
/* Per AV1 spec, when use_superres=0 the superres denominator is 8.
* VAAPI's superres_scale_denominator already encodes this directly
* (per va_dec_av1.h: "When use_superres=0, superres_scale_denominator
* must be 8"). Kwiboo's AV1_SUPERRES_DENOM_MIN+coded_denom math is
* not needed when reading from VAAPI. */
#define AV1_SUPERRES_NUM 8
/* AV1 spec maxima used for V4L2 array sizing. */
#define BACKEND_AV1_MAX_SEGMENTS 8
#define BACKEND_AV1_SEG_LVL_MAX 8
#define BACKEND_AV1_SEG_LVL_REF_FRAME 5
#define BACKEND_AV1_NUM_REF_FRAMES 8
#define BACKEND_AV1_TOTAL_REFS_PER_FRAME 8
#define BACKEND_AV1_REFS_PER_FRAME 7
/* ===== fill_sequence ===== */
static void av1_fill_sequence(VADecPictureParameterBufferAV1 *picture,
struct v4l2_ctrl_av1_sequence *ctrl)
{
uint8_t bit_depth;
memset(ctrl, 0, sizeof(*ctrl));
switch (picture->bit_depth_idx) {
case 0: bit_depth = 8; break;
case 1: bit_depth = 10; break;
case 2: bit_depth = 12; break;
default: bit_depth = 8; break;
}
ctrl->seq_profile = picture->profile;
ctrl->order_hint_bits = picture->seq_info_fields.fields.enable_order_hint ?
(picture->order_hint_bits_minus_1 + 1) : 0;
ctrl->bit_depth = bit_depth;
/* VAAPI does NOT separately expose max_frame_{width,height}_minus_1
* (sequence-level). Use the current frame size as a proxy. Correct
* for fixed-size sequences (the 208/352/1080p test vectors). */
ctrl->max_frame_width_minus_1 = picture->frame_width_minus1;
ctrl->max_frame_height_minus_1 = picture->frame_height_minus1;
if (picture->seq_info_fields.fields.still_picture)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_STILL_PICTURE;
if (picture->seq_info_fields.fields.use_128x128_superblock)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_USE_128X128_SUPERBLOCK;
if (picture->seq_info_fields.fields.enable_filter_intra)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_FILTER_INTRA;
if (picture->seq_info_fields.fields.enable_intra_edge_filter)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_INTRA_EDGE_FILTER;
if (picture->seq_info_fields.fields.enable_interintra_compound)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_INTERINTRA_COMPOUND;
if (picture->seq_info_fields.fields.enable_masked_compound)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_MASKED_COMPOUND;
/* VAAPI doesn't expose enable_warped_motion as a sequence flag;
* per-frame allow_warped_motion gates it. Conservative: set true so
* per-frame flag is honored. */
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_WARPED_MOTION;
if (picture->seq_info_fields.fields.enable_dual_filter)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_DUAL_FILTER;
if (picture->seq_info_fields.fields.enable_order_hint)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_ORDER_HINT;
if (picture->seq_info_fields.fields.enable_jnt_comp)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_JNT_COMP;
/* enable_ref_frame_mvs / enable_restoration not exposed at sequence
* level — conservative set-true (kdirect also sets these for the
* test streams; gating doesn't matter because per-frame flags
* govern actual use). */
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_REF_FRAME_MVS;
/* enable_superres: gate on the current frame's use_superres so the
* SEQUENCE flag matches the bitstream-derived value. Empirical
* strace diff vs kdirect: kdirect clears this for streams that
* never use superres; we were unconditionally setting it true. */
if (picture->pic_info_fields.bits.use_superres)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_SUPERRES;
if (picture->seq_info_fields.fields.enable_cdef)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_CDEF;
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_ENABLE_RESTORATION;
if (picture->seq_info_fields.fields.mono_chrome)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_MONO_CHROME;
if (picture->seq_info_fields.fields.color_range)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_COLOR_RANGE;
if (picture->seq_info_fields.fields.subsampling_x)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_SUBSAMPLING_X;
if (picture->seq_info_fields.fields.subsampling_y)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_SUBSAMPLING_Y;
if (picture->seq_info_fields.fields.film_grain_params_present)
ctrl->flags |= V4L2_AV1_SEQUENCE_FLAG_FILM_GRAIN_PARAMS_PRESENT;
}
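The two derived sequence fields above, bit depth and order-hint width, can be sketched in isolation. This is a minimal illustration and not the driver's code: `sketch_bit_depth` and `sketch_order_hint_bits` are hypothetical helper names, and the fallback to 8 bits for an unknown index simply mirrors the `default` case in av1_fill_sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map VAAPI's bit_depth_idx (0, 1, 2) to a bit depth.
 * Unknown indices fall back to 8, as the switch in av1_fill_sequence does. */
static uint8_t sketch_bit_depth(uint8_t bit_depth_idx)
{
    switch (bit_depth_idx) {
    case 1:  return 10;
    case 2:  return 12;
    default: return 8;
    }
}

/* Hypothetical helper: the control carries the real bit count, while VAAPI
 * stores it minus one and it only applies when order hints are enabled. */
static uint8_t sketch_order_hint_bits(int enable_order_hint, uint8_t bits_minus_1)
{
    assert(bits_minus_1 < 8); /* AV1 codes this field in 3 bits */
    return enable_order_hint ? (uint8_t)(bits_minus_1 + 1) : 0;
}
```

Keeping the derivation in small pure functions like these would make it testable without a VAAPI picture buffer; whether that split is worth it for two fields is a judgment call.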
/* ===== fill_frame ===== */
static void av1_fill_frame(VADecPictureParameterBufferAV1 *picture,
struct v4l2_ctrl_av1_frame *ctrl)
{
unsigned int i, j;
memset(ctrl, 0, sizeof(*ctrl));
/* ---- tile_info ---- */
ctrl->tile_info.context_update_tile_id = picture->context_update_tile_id;
ctrl->tile_info.tile_cols = picture->tile_cols;
ctrl->tile_info.tile_rows = picture->tile_rows;
if (picture->tile_cols > 1 || picture->tile_rows > 1)
ctrl->tile_info.tile_size_bytes = 4;
else
ctrl->tile_info.tile_size_bytes = 0;
if (picture->pic_info_fields.bits.uniform_tile_spacing_flag)
ctrl->tile_info.flags |= V4L2_AV1_TILE_INFO_FLAG_UNIFORM_TILE_SPACING;
/* F1: mi_col/row_starts[]: prefix-sum from width_in_sbs_minus_1[]+1
* (Kwiboo reads tile_start_col_sb[] directly; VAAPI doesn't expose
* starts, only widths — reconstruct via accumulation). Plus the
* sentinel at index tile_cols/tile_rows. */
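{
/* tile_cols is a uint8_t in VAAPI, but mi_col_starts[] holds only
* V4L2_AV1_MAX_TILE_COLS + 1 entries: clamp the sentinel index. */
unsigned int cols = picture->tile_cols <= V4L2_AV1_MAX_TILE_COLS ?
picture->tile_cols : V4L2_AV1_MAX_TILE_COLS;
uint16_t cum = 0;
for (i = 0; i < picture->tile_cols && i < 63; i++) {
ctrl->tile_info.mi_col_starts[i] = cum;
ctrl->tile_info.width_in_sbs_minus_1[i] =
picture->width_in_sbs_minus_1[i];
cum = (uint16_t)(cum + picture->width_in_sbs_minus_1[i] + 1);
}
ctrl->tile_info.mi_col_starts[cols] =
2 * ((picture->frame_width_minus1 + 1 + 7) >> 3);
}
{
/* Same clamp for rows: mi_row_starts[] holds
* V4L2_AV1_MAX_TILE_ROWS + 1 entries. */
unsigned int rows = picture->tile_rows <= V4L2_AV1_MAX_TILE_ROWS ?
picture->tile_rows : V4L2_AV1_MAX_TILE_ROWS;
uint16_t cum = 0;
for (i = 0; i < picture->tile_rows && i < 63; i++) {
ctrl->tile_info.mi_row_starts[i] = cum;
ctrl->tile_info.height_in_sbs_minus_1[i] =
picture->height_in_sbs_minus_1[i];
cum = (uint16_t)(cum + picture->height_in_sbs_minus_1[i] + 1);
}
ctrl->tile_info.mi_row_starts[rows] =
2 * ((picture->frame_height_minus1 + 1 + 7) >> 3);
}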
/* ---- quantization ---- */
ctrl->quantization.base_q_idx = picture->base_qindex;
ctrl->quantization.delta_q_y_dc = picture->y_dc_delta_q;
ctrl->quantization.delta_q_u_dc = picture->u_dc_delta_q;
ctrl->quantization.delta_q_u_ac = picture->u_ac_delta_q;
ctrl->quantization.delta_q_v_dc = picture->v_dc_delta_q;
ctrl->quantization.delta_q_v_ac = picture->v_ac_delta_q;
ctrl->quantization.qm_y = picture->qmatrix_fields.bits.qm_y;
ctrl->quantization.qm_u = picture->qmatrix_fields.bits.qm_u;
ctrl->quantization.qm_v = picture->qmatrix_fields.bits.qm_v;
ctrl->quantization.delta_q_res =
picture->mode_control_fields.bits.log2_delta_q_res;
if (picture->u_dc_delta_q != picture->v_dc_delta_q ||
picture->u_ac_delta_q != picture->v_ac_delta_q)
ctrl->quantization.flags |= V4L2_AV1_QUANTIZATION_FLAG_DIFF_UV_DELTA;
if (picture->qmatrix_fields.bits.using_qmatrix)
ctrl->quantization.flags |= V4L2_AV1_QUANTIZATION_FLAG_USING_QMATRIX;
if (picture->mode_control_fields.bits.delta_q_present_flag)
ctrl->quantization.flags |= V4L2_AV1_QUANTIZATION_FLAG_DELTA_Q_PRESENT;
/* ---- segmentation ---- */
if (picture->seg_info.segment_info_fields.bits.enabled)
ctrl->segmentation.flags |= V4L2_AV1_SEGMENTATION_FLAG_ENABLED;
if (picture->seg_info.segment_info_fields.bits.update_map)
ctrl->segmentation.flags |= V4L2_AV1_SEGMENTATION_FLAG_UPDATE_MAP;
if (picture->seg_info.segment_info_fields.bits.temporal_update)
ctrl->segmentation.flags |= V4L2_AV1_SEGMENTATION_FLAG_TEMPORAL_UPDATE;
if (picture->seg_info.segment_info_fields.bits.update_data)
ctrl->segmentation.flags |= V4L2_AV1_SEGMENTATION_FLAG_UPDATE_DATA;
for (i = 0; i < BACKEND_AV1_MAX_SEGMENTS; i++) {
for (j = 0; j < BACKEND_AV1_SEG_LVL_MAX; j++) {
if (picture->seg_info.feature_mask[i] & (1 << j)) {
ctrl->segmentation.feature_enabled[i] |=
V4L2_AV1_SEGMENT_FEATURE_ENABLED(j);
ctrl->segmentation.last_active_seg_id = i;
if (j >= BACKEND_AV1_SEG_LVL_REF_FRAME)
ctrl->segmentation.flags |=
V4L2_AV1_SEGMENTATION_FLAG_SEG_ID_PRE_SKIP;
}
ctrl->segmentation.feature_data[i][j] =
picture->seg_info.feature_data[i][j];
}
}
/* ---- loop_filter ---- */
ctrl->loop_filter.level[0] = picture->filter_level[0];
ctrl->loop_filter.level[1] = picture->filter_level[1];
ctrl->loop_filter.level[2] = picture->filter_level_u;
ctrl->loop_filter.level[3] = picture->filter_level_v;
ctrl->loop_filter.sharpness =
picture->loop_filter_info_fields.bits.sharpness_level;
ctrl->loop_filter.mode_deltas[0] = picture->mode_deltas[0];
ctrl->loop_filter.mode_deltas[1] = picture->mode_deltas[1];
ctrl->loop_filter.delta_lf_res =
picture->mode_control_fields.bits.log2_delta_lf_res;
for (i = 0; i < BACKEND_AV1_NUM_REF_FRAMES; i++)
ctrl->loop_filter.ref_deltas[i] = picture->ref_deltas[i];
if (picture->loop_filter_info_fields.bits.mode_ref_delta_enabled)
ctrl->loop_filter.flags |= V4L2_AV1_LOOP_FILTER_FLAG_DELTA_ENABLED;
if (picture->loop_filter_info_fields.bits.mode_ref_delta_update)
ctrl->loop_filter.flags |= V4L2_AV1_LOOP_FILTER_FLAG_DELTA_UPDATE;
if (picture->mode_control_fields.bits.delta_lf_present_flag)
ctrl->loop_filter.flags |= V4L2_AV1_LOOP_FILTER_FLAG_DELTA_LF_PRESENT;
if (picture->mode_control_fields.bits.delta_lf_multi)
ctrl->loop_filter.flags |= V4L2_AV1_LOOP_FILTER_FLAG_DELTA_LF_MULTI;
/* ---- cdef ---- */
ctrl->cdef.damping_minus_3 = picture->cdef_damping_minus_3;
ctrl->cdef.bits = picture->cdef_bits;
for (i = 0; i < (unsigned)(1 << picture->cdef_bits) && i < 8; i++) {
uint8_t y = picture->cdef_y_strengths[i];
uint8_t uv = picture->cdef_uv_strengths[i];
ctrl->cdef.y_pri_strength[i] = (y >> 2) & 0x0F;
ctrl->cdef.y_sec_strength[i] = y & 0x03;
ctrl->cdef.uv_pri_strength[i] = (uv >> 2) & 0x0F;
ctrl->cdef.uv_sec_strength[i] = uv & 0x03;
}
/* ---- loop_restoration ---- (F3)
* Phase 5 review Amendment 1 was REVERTED. The reviewer proposed
* remap = {NONE, SWITCHABLE, WIENER, SGRPROJ} (Kwiboo's table)
* based on AV1 spec FrameRestoreType wire encoding
* {NONE=0, SWITCHABLE=1, WIENER=2, SGRPROJ=3} differing from V4L2's
* {NONE=0, WIENER=1, SGRPROJ=2, SWITCHABLE=3}. Empirically applying
* that permutation regressed ALL tests (allintra 10/10 → 0/10) —
* so either VAAPI's yframe_restoration_type is NOT the raw spec
* value (already-remapped to V4L2 enum semantics?), or vpu981
* interprets the V4L2 enum values via a different mapping than
* the V4L2 uAPI header documents. Per
* [[feedback_review_empirical_over_theoretical]] keep the
* identity mapping that empirically works; revisit if a
* restoration-using fixture surfaces a real decode bug.
*/
{
uint8_t remap[4] = {
V4L2_AV1_FRAME_RESTORE_NONE,
V4L2_AV1_FRAME_RESTORE_WIENER,
V4L2_AV1_FRAME_RESTORE_SGRPROJ,
V4L2_AV1_FRAME_RESTORE_SWITCHABLE,
};
uint8_t y_t = picture->loop_restoration_fields.bits.yframe_restoration_type & 3;
uint8_t cb_t = picture->loop_restoration_fields.bits.cbframe_restoration_type & 3;
uint8_t cr_t = picture->loop_restoration_fields.bits.crframe_restoration_type & 3;
bool uses_lr = false;
ctrl->loop_restoration.frame_restoration_type[0] = remap[y_t];
ctrl->loop_restoration.frame_restoration_type[1] = remap[cb_t];
ctrl->loop_restoration.frame_restoration_type[2] = remap[cr_t];
if (y_t != 0)
uses_lr = true;
if (cb_t != 0 || cr_t != 0) {
uses_lr = true;
ctrl->loop_restoration.flags |=
V4L2_AV1_LOOP_RESTORATION_FLAG_USES_CHROMA_LR;
}
ctrl->loop_restoration.lr_unit_shift =
picture->loop_restoration_fields.bits.lr_unit_shift;
ctrl->loop_restoration.lr_uv_shift =
picture->loop_restoration_fields.bits.lr_uv_shift;
if (uses_lr) {
uint8_t shift = picture->loop_restoration_fields.bits.lr_unit_shift;
uint8_t uv_shift = picture->loop_restoration_fields.bits.lr_uv_shift;
ctrl->loop_restoration.flags |=
V4L2_AV1_LOOP_RESTORATION_FLAG_USES_LR;
ctrl->loop_restoration.loop_restoration_size[0] =
1 << (6 + shift);
ctrl->loop_restoration.loop_restoration_size[1] =
1 << (6 + shift - uv_shift);
ctrl->loop_restoration.loop_restoration_size[2] =
1 << (6 + shift - uv_shift);
}
}
/* ---- global_motion ---- */
for (i = 0; i < BACKEND_AV1_TOTAL_REFS_PER_FRAME; i++) {
if (i == 0)
continue; /* INTRA_FRAME slot — no warp */
ctrl->global_motion.type[i] = picture->wm[i - 1].wmtype;
for (j = 0; j < 6; j++)
ctrl->global_motion.params[i][j] = picture->wm[i - 1].wmmat[j];
if (picture->wm[i - 1].invalid)
ctrl->global_motion.invalid |=
V4L2_AV1_GLOBAL_MOTION_IS_INVALID(i);
switch (picture->wm[i - 1].wmtype) {
case 1:
ctrl->global_motion.flags[i] |=
V4L2_AV1_GLOBAL_MOTION_FLAG_IS_TRANSLATION;
ctrl->global_motion.flags[i] |=
V4L2_AV1_GLOBAL_MOTION_FLAG_IS_GLOBAL;
break;
case 2:
ctrl->global_motion.flags[i] |=
V4L2_AV1_GLOBAL_MOTION_FLAG_IS_ROT_ZOOM;
ctrl->global_motion.flags[i] |=
V4L2_AV1_GLOBAL_MOTION_FLAG_IS_GLOBAL;
break;
case 3:
ctrl->global_motion.flags[i] |=
V4L2_AV1_GLOBAL_MOTION_FLAG_IS_GLOBAL;
break;
default:
break;
}
}
/* ---- reference frames + order hints ---- */
/* reference_frame_ts[] is filled by the orchestrator (av1_set_controls)
* which has driver_data for the SURFACE() lookup. order_hints[] is not
* exposed per-ref by VAAPI, so it is zeroed here; the orchestrator
* overwrites it from each reference surface's av1_order_hint.
* ref_frame_idx[7] is the index map from spec-defined ref slots
* (LAST..ALTREF) into ref_frame_map[8] (the surface IDs). */
for (i = 0; i < BACKEND_AV1_TOTAL_REFS_PER_FRAME; i++)
ctrl->order_hints[i] = 0;
for (i = 0; i < BACKEND_AV1_REFS_PER_FRAME; i++)
ctrl->ref_frame_idx[i] = picture->ref_frame_idx[i];
/* F2: superres_denom direct from VAAPI; fallback to AV1_SUPERRES_NUM
* if zero (spec violation but defensive). */
ctrl->superres_denom = picture->superres_scale_denominator
? picture->superres_scale_denominator : AV1_SUPERRES_NUM;
ctrl->skip_mode_frame[0] = 0;
ctrl->skip_mode_frame[1] = 0;
ctrl->primary_ref_frame = picture->primary_ref_frame;
ctrl->frame_type = picture->pic_info_fields.bits.frame_type;
ctrl->order_hint = picture->order_hint;
ctrl->upscaled_width = picture->frame_width_minus1 + 1;
ctrl->interpolation_filter = picture->interp_filter;
ctrl->tx_mode = picture->mode_control_fields.bits.tx_mode;
ctrl->frame_width_minus_1 = picture->frame_width_minus1;
ctrl->frame_height_minus_1 = picture->frame_height_minus1;
ctrl->render_width_minus_1 = picture->frame_width_minus1;
ctrl->render_height_minus_1 = picture->frame_height_minus1;
ctrl->current_frame_id = 0;
/* Phase 3: VAAPI doesn't expose refresh_frame_flags. For shown KEY
* frames and SWITCH frames the AV1 spec mandates 0xff (refresh all
* DPB slots). For other frames we default to 0xff too; simple P-frame
* chains will naturally rotate through slots without a precise
* per-slot value. If the stream needs precise control, this needs the
* value parsed from the frame header in the bitstream.
* Empirical diff vs kdirect shows kdirect always sends 0xff here. */
ctrl->refresh_frame_flags = 0xff;
/* ---- frame flags ---- */
if (picture->pic_info_fields.bits.show_frame)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_SHOW_FRAME;
if (picture->pic_info_fields.bits.showable_frame)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_SHOWABLE_FRAME;
if (picture->pic_info_fields.bits.error_resilient_mode)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_ERROR_RESILIENT_MODE;
if (picture->pic_info_fields.bits.disable_cdf_update)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_DISABLE_CDF_UPDATE;
if (picture->pic_info_fields.bits.allow_screen_content_tools)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_ALLOW_SCREEN_CONTENT_TOOLS;
if (picture->pic_info_fields.bits.force_integer_mv)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_FORCE_INTEGER_MV;
if (picture->pic_info_fields.bits.allow_intrabc)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_ALLOW_INTRABC;
if (picture->pic_info_fields.bits.use_superres)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_USE_SUPERRES;
if (picture->pic_info_fields.bits.allow_high_precision_mv)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_ALLOW_HIGH_PRECISION_MV;
if (picture->pic_info_fields.bits.is_motion_mode_switchable)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_IS_MOTION_MODE_SWITCHABLE;
if (picture->pic_info_fields.bits.use_ref_frame_mvs)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_USE_REF_FRAME_MVS;
if (picture->pic_info_fields.bits.disable_frame_end_update_cdf)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_DISABLE_FRAME_END_UPDATE_CDF;
if (picture->pic_info_fields.bits.allow_warped_motion)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_ALLOW_WARPED_MOTION;
if (picture->mode_control_fields.bits.reference_select)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_REFERENCE_SELECT;
if (picture->mode_control_fields.bits.reduced_tx_set_used)
ctrl->flags |= V4L2_AV1_FRAME_FLAG_REDUCED_TX_SET;
if (picture->mode_control_fields.bits.skip_mode_present) {
ctrl->flags |= V4L2_AV1_FRAME_FLAG_SKIP_MODE_ALLOWED;
ctrl->flags |= V4L2_AV1_FRAME_FLAG_SKIP_MODE_PRESENT;
}
}
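The F1 reconstruction in the tile_info block of av1_fill_frame is a running sum over per-tile sizes. The sketch below isolates that step under stated assumptions: `sketch_tile_starts` is a hypothetical name, the values stay in superblock units exactly as the loops accumulate them, and the final sentinel entry (which the driver code computes from the frame size) is deliberately not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: turn per-tile sizes (stored minus one, in
 * superblocks) into start positions by running sum. Writes at most `cap`
 * entries to `starts` and returns the summed size of the tiles visited. */
static uint32_t sketch_tile_starts(const uint16_t *size_minus_1, size_t count,
                                   uint32_t *starts, size_t cap)
{
    uint32_t cum = 0;
    size_t i;

    assert(size_minus_1 != NULL && starts != NULL);
    for (i = 0; i < count && i < cap; i++) {
        starts[i] = cum;                      /* tile i begins where i-1 ended */
        cum += (uint32_t)size_minus_1[i] + 1; /* undo the minus-one encoding */
    }
    return cum;
}
```

Passing the destination capacity explicitly keeps the bound next to the write, which is the property the `i < 63` guards in the loops are protecting.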
/* ===== fill_film_grain ===== */
static void av1_fill_film_grain(VADecPictureParameterBufferAV1 *picture,
struct v4l2_ctrl_av1_film_grain *ctrl)
{
VAFilmGrainStructAV1 *fg = &picture->film_grain_info;
unsigned int i;
memset(ctrl, 0, sizeof(*ctrl));
ctrl->cr_mult = fg->cr_mult;
ctrl->grain_seed = fg->grain_seed;
/* VAAPI doesn't expose film_grain_params_ref_idx (the reuse-from-
* previous-frame index). Leave zero — only consulted when
* update_grain=0, which VAAPI also doesn't expose. */
ctrl->film_grain_params_ref_idx = 0;
ctrl->num_y_points = fg->num_y_points;
ctrl->num_cb_points = fg->num_cb_points;
ctrl->num_cr_points = fg->num_cr_points;
ctrl->grain_scaling_minus_8 =
fg->film_grain_info_fields.bits.grain_scaling_minus_8;
ctrl->ar_coeff_lag = fg->film_grain_info_fields.bits.ar_coeff_lag;
ctrl->ar_coeff_shift_minus_6 =
fg->film_grain_info_fields.bits.ar_coeff_shift_minus_6;
ctrl->grain_scale_shift =
fg->film_grain_info_fields.bits.grain_scale_shift;
ctrl->cb_mult = fg->cb_mult;
ctrl->cb_luma_mult = fg->cb_luma_mult;
ctrl->cr_luma_mult = fg->cr_luma_mult;
ctrl->cb_offset = fg->cb_offset;
ctrl->cr_offset = fg->cr_offset;
if (fg->film_grain_info_fields.bits.apply_grain) {
ctrl->flags |= V4L2_AV1_FILM_GRAIN_FLAG_APPLY_GRAIN;
/* kdirect strace diff confirmed: V4L2_AV1_FILM_GRAIN_FLAG_
* UPDATE_GRAIN must be set when apply_grain=1 (kdirect's
* flags byte is 0x0B = APPLY|UPDATE|...). VAAPI's
* VAFilmGrainStructAV1 doesn't expose update_grain
* separately. Default to UPDATE=1 (use submitted params,
* not reuse from non-existent prior film_grain ref). The
* earlier segfault seen with this flag came from the
* link-NULL deref (now fixed via linked_decode_surface),
* not from UPDATE_GRAIN itself. */
ctrl->flags |= V4L2_AV1_FILM_GRAIN_FLAG_UPDATE_GRAIN;
}
if (fg->film_grain_info_fields.bits.chroma_scaling_from_luma)
ctrl->flags |= V4L2_AV1_FILM_GRAIN_FLAG_CHROMA_SCALING_FROM_LUMA;
if (fg->film_grain_info_fields.bits.overlap_flag)
ctrl->flags |= V4L2_AV1_FILM_GRAIN_FLAG_OVERLAP;
if (fg->film_grain_info_fields.bits.clip_to_restricted_range)
ctrl->flags |= V4L2_AV1_FILM_GRAIN_FLAG_CLIP_TO_RESTRICTED_RANGE;
if (!fg->film_grain_info_fields.bits.apply_grain)
return;
for (i = 0; i < fg->num_y_points && i < 14; i++) {
ctrl->point_y_value[i] = fg->point_y_value[i];
ctrl->point_y_scaling[i] = fg->point_y_scaling[i];
}
for (i = 0; i < fg->num_cb_points && i < 10; i++) {
ctrl->point_cb_value[i] = fg->point_cb_value[i];
ctrl->point_cb_scaling[i] = fg->point_cb_scaling[i];
}
for (i = 0; i < fg->num_cr_points && i < 10; i++) {
ctrl->point_cr_value[i] = fg->point_cr_value[i];
ctrl->point_cr_scaling[i] = fg->point_cr_scaling[i];
}
for (i = 0; i < 24; i++)
ctrl->ar_coeffs_y_plus_128[i] = (uint8_t)(fg->ar_coeffs_y[i] + 128);
for (i = 0; i < 25; i++) {
ctrl->ar_coeffs_cb_plus_128[i] = (uint8_t)(fg->ar_coeffs_cb[i] + 128);
ctrl->ar_coeffs_cr_plus_128[i] = (uint8_t)(fg->ar_coeffs_cr[i] + 128);
}
}
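The last two loops of av1_fill_film_grain convert signed autoregression coefficients into the biased unsigned form that the `*_plus_128` arrays expect. A minimal sketch of that conversion follows; `sketch_bias_ar_coeffs` is a hypothetical helper name and the signed 8-bit input type is an assumption matching the cast used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shift signed AR coefficients (-128..127) into the
 * unsigned "plus 128" form (0..255). */
static void sketch_bias_ar_coeffs(const int8_t *src, uint8_t *dst, size_t count)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < count; i++)
        dst[i] = (uint8_t)(src[i] + 128); /* computed as int, then narrowed */
}
```

Because `src[i]` is promoted to `int` before the addition, the sum never overflows; the cast only narrows a value that is already within 0..255.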
/* ===== orchestrator ===== */
int av1_set_controls(struct request_data *driver_data,
struct object_context *context,
struct object_surface *surface_object)
{
VADecPictureParameterBufferAV1 *picture =
&surface_object->params.av1.picture;
unsigned int num_tiles = surface_object->params.av1.num_tile_group_entries;
struct v4l2_ctrl_av1_sequence sequence;
struct v4l2_ctrl_av1_frame frame;
struct v4l2_ctrl_av1_film_grain film_grain;
struct v4l2_ctrl_av1_tile_group_entry *tile_entries = NULL;
struct v4l2_ext_control controls[4];
unsigned int n = 0;
unsigned int i;
unsigned int alloc_tiles;
int rc;
(void)context;
/*
* AV1 film_grain link: when apply_grain=1, ffmpeg-vaapi allocates a
* separate display surface (current_display_picture) from the decode
* surface (current_frame). vpu981 HW applies grain inline to the
* decode CAPTURE buffer, so the consumable data is in current_frame's
* slot. ffmpeg then calls vaGetImage on current_display_picture which
* has no slot bound. Link the display surface back to the decode
* surface so copy_surface_to_image can borrow destination_data[].
*/
if (picture->current_display_picture != VA_INVALID_SURFACE &&
picture->current_display_picture != picture->current_frame) {
struct object_surface *display_surface =
SURFACE(driver_data, picture->current_display_picture);
if (display_surface != NULL)
display_surface->linked_decode_surface_id =
picture->current_frame;
}
if (num_tiles > AV1_MAX_TILES)
num_tiles = AV1_MAX_TILES;
/* DYNAMIC_ARRAY size = MAX(num_tiles, 1) per Janet v2 Q1
* amendment — kernel UB on size=0. */
alloc_tiles = num_tiles > 0 ? num_tiles : 1;
tile_entries = calloc(alloc_tiles, sizeof(*tile_entries));
if (tile_entries == NULL)
return -1;
for (i = 0; i < num_tiles; i++) {
VASliceParameterBufferAV1 *slice =
&surface_object->params.av1.tile_group_entries[i];
tile_entries[i].tile_offset = slice->slice_data_offset;
tile_entries[i].tile_size = slice->slice_data_size;
tile_entries[i].tile_row = (uint8_t)slice->tile_row;
tile_entries[i].tile_col = (uint8_t)slice->tile_column;
}
av1_fill_sequence(picture, &sequence);
av1_fill_frame(picture, &frame);
/*
* Phase 2.1 + frame-2 divergence fix: wire reference_frame_ts[].
* VAAPI exposes ref_frame_map[8] as VASurfaceIDs; the kernel needs
* v4l2-style timestamps to cross-reference the corresponding
* CAPTURE buffers (set on the OUTPUT buffer at QBUF time per
* picture.c::EndPicture, via surface_object->timestamp). Mirrors
* the vp9.c:614-628 pattern, scaled to AV1's 8 ref slots.
*
* VA_INVALID_SURFACE entries keep the zero timestamp left by the
* memset in av1_fill_frame (kernel reads zero, doesn't try to
* dereference).
*/
/*
* Empirical: DPB-slot iteration (i over ref_frame_map[i]) gives
* better correctness than ref-name iteration via ref_frame_idx[].
* Tried the ref-name reindex (Kwiboo convention via FFmpeg s->ref[i])
* and lost frames that previously PASSed (3/10 → 1/10) — so the V4L2
* uAPI semantic here may be DPB-slot-indexed despite the AV1 spec
* lexicon. Phase 3 open question pending kernel-side disambiguation.
*/
for (i = 0; i < BACKEND_AV1_TOTAL_REFS_PER_FRAME; i++) {
VASurfaceID ref_id = picture->ref_frame_map[i];
struct object_surface *ref_surface;
uint64_t ts;
if (ref_id == VA_INVALID_SURFACE)
continue;
ref_surface = SURFACE(driver_data, ref_id);
if (ref_surface == NULL)
continue;
ts = v4l2_timeval_to_ns(&ref_surface->timestamp);
if (ts == 0 &&
ref_surface->linked_decode_surface_id != VA_INVALID_SURFACE) {
struct object_surface *dec =
SURFACE(driver_data,
ref_surface->linked_decode_surface_id);
if (dec != NULL) {
ts = v4l2_timeval_to_ns(&dec->timestamp);
frame.order_hints[i] = dec->av1_order_hint;
}
} else {
frame.order_hints[i] = ref_surface->av1_order_hint;
}
frame.reference_frame_ts[i] = ts;
}
/* Phase 3: record this frame's order_hint on the surface so the
* NEXT frame's ref-loop can populate order_hints[] for slots that
* reference us. */
surface_object->av1_order_hint = picture->order_hint;
/* Also propagate to the linked display surface (if any), since
* future frames' ref_frame_map[] may point at either. */
if (picture->current_display_picture != VA_INVALID_SURFACE &&
picture->current_display_picture != picture->current_frame) {
struct object_surface *disp =
SURFACE(driver_data, picture->current_display_picture);
if (disp != NULL)
disp->av1_order_hint = picture->order_hint;
}
if (driver_data->has_av1_film_grain)
av1_fill_film_grain(picture, &film_grain);
controls[n++] = (struct v4l2_ext_control){
.id = V4L2_CID_STATELESS_AV1_SEQUENCE,
.ptr = &sequence,
.size = sizeof(sequence),
};
controls[n++] = (struct v4l2_ext_control){
.id = V4L2_CID_STATELESS_AV1_FRAME,
.ptr = &frame,
.size = sizeof(frame),
};
controls[n++] = (struct v4l2_ext_control){
.id = V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY,
.ptr = tile_entries,
.size = sizeof(*tile_entries) * alloc_tiles,
};
if (driver_data->has_av1_film_grain) {
controls[n++] = (struct v4l2_ext_control){
.id = V4L2_CID_STATELESS_AV1_FILM_GRAIN,
.ptr = &film_grain,
.size = sizeof(film_grain),
};
}
rc = v4l2_set_controls(driver_data->video_fd,
surface_object->request_fd,
controls, n);
free(tile_entries);
if (rc < 0) {
request_log("ampere-av1: VIDIOC_S_EXT_CTRLS failed rc=%d\n", rc);
return -1;
}
return 0;
}
+45
@@ -0,0 +1,45 @@
/*
* Copyright (C) 2026 claude-noether <claude-noether@reauktion.de>
*
* ampere-av1-enablement Phase 2: AV1 codec dispatcher header for libva-
* v4l2-request-fourier. Mirrors vp9.h shape — single set_controls entry
* point that translates surface->params.av1.* VAAPI structures into a
* batch of V4L2_CID_STATELESS_AV1_{SEQUENCE,FRAME,TILE_GROUP_ENTRY,
* FILM_GRAIN} controls + the underlying request_fd / OUTPUT plane setup.
*
* V4L2 target: V4L2_PIX_FMT_AV1_FRAME on the vpu981 hantro instance
* (RK3588's dedicated AV1 decoder).
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
* THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef _AV1_H_
#define _AV1_H_
#include "context.h"
#include "request.h"
#include "surface.h"
int av1_set_controls(struct request_data *driver_data,
struct object_context *context,
struct object_surface *surface);
#endif /* _AV1_H_ */
+2
@@ -44,6 +44,8 @@ unsigned int pixelformat_for_profile(VAProfile profile)
return V4L2_PIX_FMT_VP8_FRAME;
case VAProfileVP9Profile0:
return V4L2_PIX_FMT_VP9_FRAME;
case VAProfileAV1Profile0:
return V4L2_PIX_FMT_AV1_FRAME;
default:
return 0;
}
+19 -2
@@ -84,6 +84,10 @@ VAStatus RequestCreateConfig(VADriverContextP context, VAProfile profile,
// fresnel-fourier iter4: VP9 Profile 0 enabled on rkvdec.
// Same shape — no profile-specific validation here.
break;
case VAProfileAV1Profile0:
// ampere-av1-enablement: AV1 Profile 0 enabled on vpu981.
// Same shape — no profile-specific validation here.
break;
default:
return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
}
@@ -157,13 +161,14 @@ VAStatus RequestDestroyConfig(VADriverContextP context, VAConfigID config_id)
static bool any_fd_supports_output_format(struct request_data *driver_data,
unsigned int fmt)
{
int fds[4] = {
driver_data->video_fd,
driver_data->video_fd_rkvdec,
driver_data->video_fd_hantro,
driver_data->video_fd_vpu981,
};
int i;
for (i = 0; i < 4; i++) {
if (fds[i] < 0) continue;
if (v4l2_find_format(fds[i], V4L2_BUF_TYPE_VIDEO_OUTPUT, fmt))
return true;
@@ -207,6 +212,17 @@ VAStatus RequestQueryConfigProfiles(VADriverContextP context,
if (found && index < (V4L2_REQUEST_MAX_PROFILES - 1))
profiles[index++] = VAProfileVP9Profile0;
/*
* ampere-av1-enablement: AV1 routes to vpu981 (advertised via the
* new video_fd_vpu981 slot). V4L2_REQUEST_MAX_PROFILES=11 is now
* EXACTLY full with this addition. Future profile additions
* require bumping that constant + verifying libva consumers'
* profiles[] sizing.
*/
found = any_fd_supports_output_format(driver_data, V4L2_PIX_FMT_AV1_FRAME);
if (found && index < (V4L2_REQUEST_MAX_PROFILES - 1))
profiles[index++] = VAProfileAV1Profile0;
*profiles_count = index;
return VA_STATUS_SUCCESS;
@@ -228,6 +244,7 @@ VAStatus RequestQueryConfigEntrypoints(VADriverContextP context,
case VAProfileHEVCMain:
case VAProfileVP8Version0_3:
case VAProfileVP9Profile0:
case VAProfileAV1Profile0:
entrypoints[0] = VAEntrypointVLD;
*entrypoints_count = 1;
break;
+215 -1
@@ -70,6 +70,7 @@
#include "surface.h"
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
@@ -79,6 +80,9 @@
#include <linux/videodev2.h>
#include <linux/v4l2-controls.h>
#include "hevc-ctrls/v4l2-hevc-ext-controls.h"
#include "h265_parser/gst/codecparsers/gsth265parser.h"
#include "utils.h"
#include "v4l2.h"
@@ -582,6 +586,177 @@ static void h265_fill_scaling_matrix(VAIQMatrixBufferHEVC *iqmatrix,
}
/* ===== Clause 1: orchestrator — batched 5-control submission ===== */
/*
* iter2 (ampere-kernel-decoders) — parse the HEVC SPS NAL out of the
* decode-time bitstream buffer (when present — typically only on IDR
* frames) via the vendored GStreamer 1.28.2 H.265 parser, map the
* resulting GstH265ShortTermRefPicSet + GstH265ShortTermRefPicSetExt
* arrays into V4L2_CID_STATELESS_HEVC_EXT_SPS_{ST,LT}_RPS struct
* arrays, and cache them on driver_data for reuse by subsequent
* non-IDR frames whose source_data buffer doesn't carry the SPS.
*
* Why: Linux 7.0 VDPU381/383 rkvdec requires the kernel-side RPS
* arrays to be populated; userspace VAAPI doesn't expose this data
* via VAPictureParameterBufferHEVC (only the COUNTS). Mirrors
* GStreamer's gst_v4l2_codec_h265_dec_fill_ext_sps_rps shape
* (gst-plugins-bad/sys/v4l2codecs/gstv4l2codech265dec.c, merged in
* GStreamer 1.28 via MR !10820).
*
* Returns 0 on success (cache is valid after this call, controls
* arrays available in driver_data->hevc_rps_cache_*), negative on
* parse failure with cache left in its previous state.
*
* If source_data does NOT contain an SPS NAL and the cache is NOT
* yet valid (first frame of a stream where IDR happens to lack
* embedded SPS), returns -ENODATA. Caller decides what to do
* (typically: skip the controls submission and let the kernel hit
* its early-return path; if the kernel still OOPSes that's the
* F1 falsifier and we loop back to Phase 0).
*/
static int h265_populate_ext_sps_rps_cache(struct request_data *driver_data,
struct object_surface *surface_object)
{
const guint8 *src = surface_object->source_data;
gsize src_size = surface_object->slices_size;
GstH265Parser *parser;
GstH265NalUnit nalu;
GstH265SPS sps;
GstH265SPSEXT sps_ext;
GstH265ParserResult pr;
int err = -ENODATA;
parser = gst_h265_parser_new();
if (parser == NULL)
return -ENOMEM;
/* Walk source_data for NAL units; the first NAL with type == 33
* (SPS) is the one we parse. Annex-B start codes (3- or 4-byte)
* are detected by gst_h265_parser_identify_nalu. */
gsize offset = 0;
while (offset < src_size) {
pr = gst_h265_parser_identify_nalu(parser, src, offset, src_size,
&nalu);
if (pr != GST_H265_PARSER_OK && pr != GST_H265_PARSER_NO_NAL_END)
break;
if (nalu.type == GST_H265_NAL_SPS) {
/*
* gst_h265_parser_parse_sps_ext fills both the base
* SPS and the extended-RPS SPSEXT struct. The plain
* gst_h265_parser_parse_sps only fills the base —
* its internally-parsed sps_ext is discarded (see
* gsth265parser.c:2050+ where the function calls
* parse_sps_ext with a LOCAL sps_ext variable). We
* need the EXT data for the V4L2 EXT_SPS_*_RPS
* controls, so call the _ext variant directly.
*/
memset(&sps, 0, sizeof(sps));
memset(&sps_ext, 0, sizeof(sps_ext));
pr = gst_h265_parser_parse_sps_ext(parser, &nalu,
&sps, &sps_ext, TRUE);
if (pr != GST_H265_PARSER_OK)
break;
/* Allocate the V4L2 struct arrays sized by the
* parser's reported counts; free any previous
* cache before overwriting. */
free(driver_data->hevc_rps_cache_st);
driver_data->hevc_rps_cache_st = NULL;
free(driver_data->hevc_rps_cache_lt);
driver_data->hevc_rps_cache_lt = NULL;
driver_data->hevc_rps_cache_valid = false;
driver_data->hevc_rps_cache_st_count =
sps.num_short_term_ref_pic_sets;
driver_data->hevc_rps_cache_lt_count =
sps.num_long_term_ref_pics_sps;
if (driver_data->hevc_rps_cache_st_count > 0) {
driver_data->hevc_rps_cache_st = calloc(
driver_data->hevc_rps_cache_st_count,
sizeof(struct v4l2_ctrl_hevc_ext_sps_st_rps));
if (driver_data->hevc_rps_cache_st == NULL) {
err = -ENOMEM;
break;
}
for (unsigned int i = 0;
i < driver_data->hevc_rps_cache_st_count;
i++) {
struct v4l2_ctrl_hevc_ext_sps_st_rps *dst =
&driver_data->hevc_rps_cache_st[i];
const GstH265ShortTermRefPicSet *st =
&sps.short_term_ref_pic_set[i];
const GstH265ShortTermRefPicSetExt *ste =
&sps_ext.short_term_ref_pic_set_ext[i];
if (st->inter_ref_pic_set_prediction_flag)
dst->flags |=
V4L2_HEVC_EXT_SPS_ST_RPS_FLAG_INTER_REF_PIC_SET_PRED;
dst->delta_idx_minus1 = st->delta_idx_minus1;
dst->delta_rps_sign = st->delta_rps_sign;
dst->abs_delta_rps_minus1 = st->abs_delta_rps_minus1;
dst->num_negative_pics = st->NumNegativePics;
dst->num_positive_pics = st->NumPositivePics;
/* GStreamer's ShortTermRefPicSetExt
* carries the per-RPS-entry use_delta /
* used_by_curr_pic / delta_poc_s0/s1
* arrays (added GStreamer 1.28
* alongside the V4L2 controls). */
for (unsigned int j = 0; j < 16; j++) {
if (ste->used_by_curr_pic_flag[j])
dst->used_by_curr_pic |= (1u << j);
if (ste->use_delta_flag[j])
dst->use_delta_flag |= (1u << j);
dst->delta_poc_s0_minus1[j] =
ste->delta_poc_s0_minus1[j];
dst->delta_poc_s1_minus1[j] =
ste->delta_poc_s1_minus1[j];
}
}
}
if (driver_data->hevc_rps_cache_lt_count > 0) {
driver_data->hevc_rps_cache_lt = calloc(
driver_data->hevc_rps_cache_lt_count,
sizeof(struct v4l2_ctrl_hevc_ext_sps_lt_rps));
if (driver_data->hevc_rps_cache_lt == NULL) {
err = -ENOMEM;
break;
}
for (unsigned int i = 0;
i < driver_data->hevc_rps_cache_lt_count;
i++) {
struct v4l2_ctrl_hevc_ext_sps_lt_rps *dst =
&driver_data->hevc_rps_cache_lt[i];
dst->lt_ref_pic_poc_lsb_sps =
sps.lt_ref_pic_poc_lsb_sps[i];
if (sps.used_by_curr_pic_lt_sps_flag[i])
dst->flags |=
V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT;
}
}
driver_data->hevc_rps_cache_valid = true;
err = 0;
break;
}
offset = nalu.offset + nalu.size;
}
gst_h265_parser_free(parser);
/* If the SPS NAL wasn't in this frame's source_data but we have
* a cached valid RPS from a prior frame, that's the non-IDR
* common case — report success so the caller submits the
* cached arrays. */
if (err == -ENODATA && driver_data->hevc_rps_cache_valid)
err = 0;
return err;
}
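Review note: the NAL walk above relies on the vendored GStreamer parser. As a rough illustration of what it looks for, here is a minimal standalone sketch that scans an Annex-B buffer for the first SPS. The helper names (`find_nal_header`, `hevc_nal_type`, `find_first_sps`) are hypothetical and not part of this patch or of GStreamer; the sketch assumes the H.265 NAL header layout (nal_unit_type in bits 6..1 of the first header byte, SPS = 33) and ignores emulation-prevention bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEVC_NAL_SPS 33 /* nal_unit_type of an SPS in H.265 */

/* Return the offset of the NAL header byte that follows the first
 * 00 00 01 start code at or after `from`, or -1 if there is none.
 * A 4-byte start code (00 00 00 01) matches through its last 3 bytes. */
static inline long find_nal_header(const uint8_t *buf, size_t size, size_t from)
{
	assert(buf != NULL);
	for (size_t i = from; i + 3 < size; i++) {
		if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
			return (long)(i + 3);
	}
	return -1;
}

/* nal_unit_type occupies bits 6..1 of the first NAL header byte. */
static inline unsigned int hevc_nal_type(uint8_t first_header_byte)
{
	return (first_header_byte >> 1) & 0x3f;
}

/* Return the offset of the first SPS NAL header byte, or -1 if the
 * buffer carries no SPS (hypothetical helper, illustration only). */
static inline long find_first_sps(const uint8_t *buf, size_t size)
{
	long pos = find_nal_header(buf, size, 0);

	while (pos >= 0) {
		if (hevc_nal_type(buf[pos]) == HEVC_NAL_SPS)
			return pos;
		pos = find_nal_header(buf, size, (size_t)pos);
	}
	return -1;
}
```

A buffer without an SPS yields -1, which corresponds to the -ENODATA path in the function above.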
int h265_set_controls(struct request_data *driver_data,
struct object_context *context_object,
struct object_surface *surface_object)
@@ -599,7 +774,7 @@ int h265_set_controls(struct request_data *driver_data,
struct v4l2_ctrl_hevc_scaling_matrix scaling_matrix;
struct v4l2_ctrl_hevc_slice_params *slice_params_array = NULL;
-struct v4l2_ext_control controls[5];
+struct v4l2_ext_control controls[7];
unsigned int n = 0;
unsigned int i;
unsigned int prefix_bytes;
@@ -690,6 +865,45 @@ int h265_set_controls(struct request_data *driver_data,
.size = sizeof(decode_params),
};
/*
* iter2 (ampere-kernel-decoders): VDPU381/383 rkvdec on Linux
* 7.0+ requires the EXT_SPS_{ST,LT}_RPS controls populated with
* parser-derived data. RK3399 rkvdec (linux 6.x or 7.x pre-
* VDPU381 bindings) doesn't have these CIDs; probe at init time
* (request.c::probe_hevc_ext_sps_rps_controls) gates this block.
*
* Per feedback_per_driver_kludge_gating, also gate explicitly on
* driver-kind to keep the human-readable intent clear even though
* the probe naturally returns false for RK3399.
*/
if (driver_data->has_hevc_ext_sps_rps_rkvdec) {
int err = h265_populate_ext_sps_rps_cache(driver_data,
surface_object);
if (err == 0 && driver_data->hevc_rps_cache_valid) {
if (driver_data->hevc_rps_cache_st_count > 0) {
controls[n++] = (struct v4l2_ext_control){
.id = V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS,
.ptr = driver_data->hevc_rps_cache_st,
.size = sizeof(struct v4l2_ctrl_hevc_ext_sps_st_rps) *
driver_data->hevc_rps_cache_st_count,
};
}
if (driver_data->hevc_rps_cache_lt_count > 0) {
controls[n++] = (struct v4l2_ext_control){
.id = V4L2_CID_STATELESS_HEVC_EXT_SPS_LT_RPS,
.ptr = driver_data->hevc_rps_cache_lt,
.size = sizeof(struct v4l2_ctrl_hevc_ext_sps_lt_rps) *
driver_data->hevc_rps_cache_lt_count,
};
}
}
/* If err is -ENODATA AND cache not valid (first-ever
* frame happens to lack an SPS NAL): we DON'T submit the
* new controls. The kernel's early-return-on-NULL path in
* rkvdec_hevc_prepare_hw_st_rps should fire and prevent
* the OOPS — Phase 7 verifies this matches the prediction. */
}
rc = v4l2_set_controls(driver_data->video_fd,
surface_object->request_fd,
controls, n);
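Review note: the short-term RPS loop in this hunk folds the parser's per-entry flag arrays into bitmask fields with `|= (1u << j)`. The sketch below isolates that packing step. `pack_flags16` is a hypothetical helper, not part of the patch, and the 16-bit width is an assumption taken from the `j < 16` loop bound rather than from the V4L2 uAPI header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack up to 16 per-entry boolean flags into a bitmask: bit j of the
 * result is set when flags[j] is non-zero (hypothetical helper). */
static inline uint16_t pack_flags16(const uint8_t *flags, size_t count)
{
	uint16_t mask = 0;

	assert(flags != NULL && count <= 16);
	for (size_t j = 0; j < count; j++) {
		if (flags[j])
			mask |= (uint16_t)(1u << j);
	}
	return mask;
}
```

Entry 0 lands in the least significant bit, matching the bit order the loop in the patch produces.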
+14
@@ -0,0 +1,14 @@
/* Stub for <gst/base/base-prelude.h> — GStreamer base-lib prelude.
* In upstream GStreamer, this sets up the GstBaseExport macro + GObject
* boilerplate. We bypass all of that and provide only what our four
* vendored .c files actually need (gst_compat.h's typedefs).
*
* Crucially we also #define GST_BASE_API to nothing so the function
* declarations in gstbitreader.h / gstbytereader.h drop the
* dllimport / visibility attribute prefix.
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_BASE_PRELUDE_STUB
#define LIBVA_V4L2_REQUEST_FOURIER_BASE_PRELUDE_STUB
#include "gst_compat.h"
#define GST_BASE_API
#endif
+307
@@ -0,0 +1,307 @@
/* GStreamer
*
* Copyright (C) 2008 Sebastian Dröge <sebastian.droege@collabora.co.uk>.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Library General Public License for more details.
*
* You should have received a copy of the GNU Library General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
* Boston, MA 02110-1301, USA.
*/
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#define GST_BIT_READER_DISABLE_INLINES
#include "gstbitreader.h"
#include <string.h>
/**
* SECTION:gstbitreader
* @title: GstBitReader
* @short_description: Reads any number of bits from a memory buffer
* @symbols:
* - gst_bit_reader_skip_unchecked
* - gst_bit_reader_skip_to_byte_unchecked
* - gst_bit_reader_get_bits_uint8_unchecked
* - gst_bit_reader_peek_bits_uint8_unchecked
* - gst_bit_reader_get_bits_uint16_unchecked
* - gst_bit_reader_peek_bits_uint16_unchecked
* - gst_bit_reader_get_bits_uint32_unchecked
* - gst_bit_reader_peek_bits_uint32_unchecked
* - gst_bit_reader_get_bits_uint64_unchecked
* - gst_bit_reader_peek_bits_uint64_unchecked
*
* #GstBitReader provides a bit reader that can read any number of bits
* from a memory buffer. It provides functions for reading any number of bits
* into 8, 16, 32 and 64 bit variables.
*/
/**
* gst_bit_reader_new: (skip)
* @data: (array length=size): Data from which the #GstBitReader
* should read
* @size: Size of @data in bytes
*
* Create a new #GstBitReader instance, which will read from @data.
*
* Free-function: gst_bit_reader_free
*
* Returns: (transfer full): a new #GstBitReader instance
*/
GstBitReader *
gst_bit_reader_new (const guint8 * data, guint size)
{
GstBitReader *ret = g_new0 (GstBitReader, 1);
ret->data = data;
ret->size = size;
return ret;
}
/**
* gst_bit_reader_free:
* @reader: (in) (transfer full): a #GstBitReader instance
*
* Frees a #GstBitReader instance, which was previously allocated by
* gst_bit_reader_new().
*/
void
gst_bit_reader_free (GstBitReader * reader)
{
g_return_if_fail (reader != NULL);
g_free (reader);
}
/**
* gst_bit_reader_init:
* @reader: a #GstBitReader instance
* @data: (in) (array length=size): data from which the bit reader should read
* @size: Size of @data in bytes
*
* Initializes a #GstBitReader instance to read from @data. This function
* can be called on already initialized instances.
*/
void
gst_bit_reader_init (GstBitReader * reader, const guint8 * data, guint size)
{
g_return_if_fail (reader != NULL);
reader->data = data;
reader->size = size;
reader->byte = reader->bit = 0;
}
/**
* gst_bit_reader_set_pos:
* @reader: a #GstBitReader instance
* @pos: The new position in bits
*
* Sets the new position of a #GstBitReader instance to @pos in bits.
*
* Returns: %TRUE if the position could be set successfully, %FALSE
* otherwise.
*/
gboolean
gst_bit_reader_set_pos (GstBitReader * reader, guint pos)
{
g_return_val_if_fail (reader != NULL, FALSE);
if (pos > reader->size * 8)
return FALSE;
reader->byte = pos / 8;
reader->bit = pos % 8;
return TRUE;
}
/**
* gst_bit_reader_get_pos:
* @reader: a #GstBitReader instance
*
* Returns the current position of a #GstBitReader instance in bits.
*
* Returns: The current position of @reader in bits.
*/
guint
gst_bit_reader_get_pos (const GstBitReader * reader)
{
return _gst_bit_reader_get_pos_inline (reader);
}
/**
* gst_bit_reader_get_remaining:
* @reader: a #GstBitReader instance
*
* Returns the remaining number of bits of a #GstBitReader instance.
*
* Returns: The remaining number of bits of @reader instance.
*/
guint
gst_bit_reader_get_remaining (const GstBitReader * reader)
{
return _gst_bit_reader_get_remaining_inline (reader);
}
/**
* gst_bit_reader_get_size:
* @reader: a #GstBitReader instance
*
* Returns the total number of bits of a #GstBitReader instance.
*
* Returns: The total number of bits of @reader instance.
*/
guint
gst_bit_reader_get_size (const GstBitReader * reader)
{
return _gst_bit_reader_get_size_inline (reader);
}
/**
* gst_bit_reader_skip:
* @reader: a #GstBitReader instance
* @nbits: the number of bits to skip
*
* Skips @nbits bits of the #GstBitReader instance.
*
* Returns: %TRUE if @nbits bits could be skipped, %FALSE otherwise.
*/
gboolean
gst_bit_reader_skip (GstBitReader * reader, guint nbits)
{
return _gst_bit_reader_skip_inline (reader, nbits);
}
/**
* gst_bit_reader_skip_to_byte:
* @reader: a #GstBitReader instance
*
* Skips until the next byte.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
gboolean
gst_bit_reader_skip_to_byte (GstBitReader * reader)
{
return _gst_bit_reader_skip_to_byte_inline (reader);
}
/**
* gst_bit_reader_get_bits_uint8:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint8 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val and update the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_get_bits_uint16:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint16 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val and update the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_get_bits_uint32:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint32 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val and update the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_get_bits_uint64:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint64 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val and update the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_peek_bits_uint8:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint8 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val but keep the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_peek_bits_uint16:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint16 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val but keep the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_peek_bits_uint32:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint32 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val but keep the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
/**
* gst_bit_reader_peek_bits_uint64:
* @reader: a #GstBitReader instance
* @val: (out): Pointer to a #guint64 to store the result
* @nbits: number of bits to read
*
* Read @nbits bits into @val but keep the current position.
*
* Returns: %TRUE if successful, %FALSE otherwise.
*/
#define GST_BIT_READER_READ_BITS(bits) \
gboolean \
gst_bit_reader_peek_bits_uint##bits (const GstBitReader *reader, guint##bits *val, guint nbits) \
{ \
return _gst_bit_reader_peek_bits_uint##bits##_inline (reader, val, nbits); \
} \
\
gboolean \
gst_bit_reader_get_bits_uint##bits (GstBitReader *reader, guint##bits *val, guint nbits) \
{ \
return _gst_bit_reader_get_bits_uint##bits##_inline (reader, val, nbits); \
}
GST_BIT_READER_READ_BITS (8);
GST_BIT_READER_READ_BITS (16);
GST_BIT_READER_READ_BITS (32);
GST_BIT_READER_READ_BITS (64);
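Review note: `GST_BIT_READER_READ_BITS` above stamps out one peek/get pair per integer width through `##` token pasting. For readers unfamiliar with the idiom, here is a minimal sketch of the same technique on a simpler function family. `DEFINE_LOW_BITS` and the `low_bits_u*` names are hypothetical and only illustrate the macro pattern; they are not GStreamer API.

```c
#include <assert.h>
#include <stdint.h>

/* Generate one "keep the low nbits" helper per integer width. The
 * `bits` argument is pasted into both the type and the function name. */
#define DEFINE_LOW_BITS(bits)                                               \
	static inline uint##bits##_t low_bits_u##bits(uint##bits##_t v,     \
						      unsigned int nbits)   \
	{                                                                   \
		assert(nbits <= bits);                                      \
		if (nbits == bits)                                          \
			return v; /* avoid shifting by the full width */    \
		return (uint##bits##_t)(v & ((UINT64_C(1) << nbits) - 1)); \
	}

DEFINE_LOW_BITS(8)
DEFINE_LOW_BITS(16)
DEFINE_LOW_BITS(32)
DEFINE_LOW_BITS(64)
```

Each invocation expands to a complete function definition, so no trailing semicolon is needed after it.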
+328
@@ -0,0 +1,328 @@
/* GStreamer
*
* Copyright (C) 2008 Sebastian Dröge <sebastian.droege@collabora.co.uk>.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Library General Public License for more details.
*
* You should have received a copy of the GNU Library General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
* Boston, MA 02110-1301, USA.
*/
#ifndef __GST_BIT_READER_H__
#define __GST_BIT_READER_H__
#include <gst/gst.h>
#include <gst/base/base-prelude.h>
/* FIXME: inline functions */
G_BEGIN_DECLS
#define GST_BIT_READER(reader) ((GstBitReader *) (reader))
/**
* GstBitReader:
* @data: (array length=size): Data from which the bit reader will
* read
* @size: Size of @data in bytes
* @byte: Current byte position
* @bit: Bit position in the current byte
*
* A bit reader instance.
*/
typedef struct {
const guint8 *data;
guint size;
guint byte; /* Byte position */
guint bit; /* Bit position in the current byte */
/* < private > */
gpointer _gst_reserved[GST_PADDING];
} GstBitReader;
GST_BASE_API
GstBitReader * gst_bit_reader_new (const guint8 *data, guint size) G_GNUC_MALLOC;
GST_BASE_API
void gst_bit_reader_free (GstBitReader *reader);
GST_BASE_API
void gst_bit_reader_init (GstBitReader *reader, const guint8 *data, guint size);
GST_BASE_API
gboolean gst_bit_reader_set_pos (GstBitReader *reader, guint pos);
GST_BASE_API
guint gst_bit_reader_get_pos (const GstBitReader *reader);
GST_BASE_API
guint gst_bit_reader_get_remaining (const GstBitReader *reader);
GST_BASE_API
guint gst_bit_reader_get_size (const GstBitReader *reader);
GST_BASE_API
gboolean gst_bit_reader_skip (GstBitReader *reader, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_skip_to_byte (GstBitReader *reader);
GST_BASE_API
gboolean gst_bit_reader_get_bits_uint8 (GstBitReader *reader, guint8 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_get_bits_uint16 (GstBitReader *reader, guint16 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_get_bits_uint32 (GstBitReader *reader, guint32 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_get_bits_uint64 (GstBitReader *reader, guint64 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_peek_bits_uint8 (const GstBitReader *reader, guint8 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_peek_bits_uint16 (const GstBitReader *reader, guint16 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_peek_bits_uint32 (const GstBitReader *reader, guint32 *val, guint nbits);
GST_BASE_API
gboolean gst_bit_reader_peek_bits_uint64 (const GstBitReader *reader, guint64 *val, guint nbits);
/**
* GST_BIT_READER_INIT:
* @data: Data from which the #GstBitReader should read
* @size: Size of @data in bytes
*
* A #GstBitReader must be initialized with this macro, before it can be
* used. This macro can be used to initialize a variable, but it cannot
* be assigned to a variable. In that case you have to use
* gst_bit_reader_init().
*/
#define GST_BIT_READER_INIT(data, size) {data, size, 0, 0}
/* Unchecked variants */
static inline void
gst_bit_reader_skip_unchecked (GstBitReader * reader, guint nbits)
{
reader->bit += nbits;
reader->byte += reader->bit / 8;
reader->bit = reader->bit % 8;
}
static inline void
gst_bit_reader_skip_to_byte_unchecked (GstBitReader * reader)
{
if (reader->bit) {
reader->bit = 0;
reader->byte++;
}
}
#define __GST_BIT_READER_READ_BITS_UNCHECKED(bits) \
static inline guint##bits \
gst_bit_reader_peek_bits_uint##bits##_unchecked (const GstBitReader *reader, guint nbits) \
{ \
guint##bits ret = 0; \
const guint8 *data; \
guint byte, bit; \
\
data = reader->data; \
byte = reader->byte; \
bit = reader->bit; \
\
while (nbits > 0) { \
guint toread = MIN (nbits, 8 - bit); \
\
ret <<= toread; \
ret |= (data[byte] & (0xff >> bit)) >> (8 - toread - bit); \
\
bit += toread; \
if (bit >= 8) { \
byte++; \
bit = 0; \
} \
nbits -= toread; \
} \
\
return ret; \
} \
\
static inline guint##bits \
gst_bit_reader_get_bits_uint##bits##_unchecked (GstBitReader *reader, guint nbits) \
{ \
guint##bits ret; \
\
ret = gst_bit_reader_peek_bits_uint##bits##_unchecked (reader, nbits); \
\
gst_bit_reader_skip_unchecked (reader, nbits); \
\
return ret; \
}
__GST_BIT_READER_READ_BITS_UNCHECKED (8)
__GST_BIT_READER_READ_BITS_UNCHECKED (16)
__GST_BIT_READER_READ_BITS_UNCHECKED (32)
__GST_BIT_READER_READ_BITS_UNCHECKED (64)
#undef __GST_BIT_READER_READ_BITS_UNCHECKED
/* unchecked variants -- do not use */
static inline guint
_gst_bit_reader_get_size_unchecked (const GstBitReader * reader)
{
return reader->size * 8;
}
static inline guint
_gst_bit_reader_get_pos_unchecked (const GstBitReader * reader)
{
return reader->byte * 8 + reader->bit;
}
static inline guint
_gst_bit_reader_get_remaining_unchecked (const GstBitReader * reader)
{
return reader->size * 8 - (reader->byte * 8 + reader->bit);
}
/* inlined variants -- do not use directly */
static inline guint
_gst_bit_reader_get_size_inline (const GstBitReader * reader)
{
g_return_val_if_fail (reader != NULL, 0);
return _gst_bit_reader_get_size_unchecked (reader);
}
static inline guint
_gst_bit_reader_get_pos_inline (const GstBitReader * reader)
{
g_return_val_if_fail (reader != NULL, 0);
return _gst_bit_reader_get_pos_unchecked (reader);
}
static inline guint
_gst_bit_reader_get_remaining_inline (const GstBitReader * reader)
{
g_return_val_if_fail (reader != NULL, 0);
return _gst_bit_reader_get_remaining_unchecked (reader);
}
static inline gboolean
_gst_bit_reader_skip_inline (GstBitReader * reader, guint nbits)
{
g_return_val_if_fail (reader != NULL, FALSE);
if (_gst_bit_reader_get_remaining_unchecked (reader) < nbits)
return FALSE;
gst_bit_reader_skip_unchecked (reader, nbits);
return TRUE;
}
static inline gboolean
_gst_bit_reader_skip_to_byte_inline (GstBitReader * reader)
{
g_return_val_if_fail (reader != NULL, FALSE);
if (reader->byte > reader->size)
return FALSE;
gst_bit_reader_skip_to_byte_unchecked (reader);
return TRUE;
}
#define __GST_BIT_READER_READ_BITS_INLINE(bits) \
static inline gboolean \
_gst_bit_reader_get_bits_uint##bits##_inline (GstBitReader *reader, guint##bits *val, guint nbits) \
{ \
g_return_val_if_fail (reader != NULL, FALSE); \
g_return_val_if_fail (val != NULL, FALSE); \
g_return_val_if_fail (nbits <= bits, FALSE); \
\
if (_gst_bit_reader_get_remaining_unchecked (reader) < nbits) \
return FALSE; \
\
*val = gst_bit_reader_get_bits_uint##bits##_unchecked (reader, nbits); \
return TRUE; \
} \
\
static inline gboolean \
_gst_bit_reader_peek_bits_uint##bits##_inline (const GstBitReader *reader, guint##bits *val, guint nbits) \
{ \
g_return_val_if_fail (reader != NULL, FALSE); \
g_return_val_if_fail (val != NULL, FALSE); \
g_return_val_if_fail (nbits <= bits, FALSE); \
\
if (_gst_bit_reader_get_remaining_unchecked (reader) < nbits) \
return FALSE; \
\
*val = gst_bit_reader_peek_bits_uint##bits##_unchecked (reader, nbits); \
return TRUE; \
}
__GST_BIT_READER_READ_BITS_INLINE (8)
__GST_BIT_READER_READ_BITS_INLINE (16)
__GST_BIT_READER_READ_BITS_INLINE (32)
__GST_BIT_READER_READ_BITS_INLINE (64)
#undef __GST_BIT_READER_READ_BITS_INLINE
#ifndef GST_BIT_READER_DISABLE_INLINES
#define gst_bit_reader_get_size(reader) \
_gst_bit_reader_get_size_inline (reader)
#define gst_bit_reader_get_pos(reader) \
_gst_bit_reader_get_pos_inline (reader)
#define gst_bit_reader_get_remaining(reader) \
_gst_bit_reader_get_remaining_inline (reader)
/* we use defines here so we can add the G_LIKELY() */
#define gst_bit_reader_skip(reader, nbits)\
G_LIKELY (_gst_bit_reader_skip_inline(reader, nbits))
#define gst_bit_reader_skip_to_byte(reader)\
G_LIKELY (_gst_bit_reader_skip_to_byte_inline(reader))
#define gst_bit_reader_get_bits_uint8(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_get_bits_uint8_inline (reader, val, nbits))
#define gst_bit_reader_get_bits_uint16(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_get_bits_uint16_inline (reader, val, nbits))
#define gst_bit_reader_get_bits_uint32(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_get_bits_uint32_inline (reader, val, nbits))
#define gst_bit_reader_get_bits_uint64(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_get_bits_uint64_inline (reader, val, nbits))
#define gst_bit_reader_peek_bits_uint8(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_peek_bits_uint8_inline (reader, val, nbits))
#define gst_bit_reader_peek_bits_uint16(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_peek_bits_uint16_inline (reader, val, nbits))
#define gst_bit_reader_peek_bits_uint32(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_peek_bits_uint32_inline (reader, val, nbits))
#define gst_bit_reader_peek_bits_uint64(reader, val, nbits) \
G_LIKELY (_gst_bit_reader_peek_bits_uint64_inline (reader, val, nbits))
#endif
G_END_DECLS
#endif /* __GST_BIT_READER_H__ */
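Review note: the unchecked peek loop in this header reads bits MSB-first, consuming at most the rest of the current byte per iteration. The sketch below restates that loop without GLib types so it can be compiled on its own. `struct bit_reader`, `br_remaining`, and `br_get_bits` are hypothetical names; the sketch covers only reads of up to 32 bits and is not a replacement for the vendored reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bit_reader {
	const uint8_t *data;
	size_t size;      /* buffer size in bytes */
	size_t byte;      /* current byte position */
	unsigned int bit; /* bit position in the current byte, 0 = MSB */
};

/* Number of unread bits left in the buffer. */
static inline size_t br_remaining(const struct bit_reader *br)
{
	return br->size * 8 - (br->byte * 8 + br->bit);
}

/* Read nbits (at most 32) MSB-first into *val and advance the reader.
 * Returns 0 on success, -1 if fewer than nbits remain. */
static inline int br_get_bits(struct bit_reader *br, unsigned int nbits,
			      uint32_t *val)
{
	uint32_t ret = 0;

	assert(br != NULL && val != NULL && nbits <= 32);
	if (br_remaining(br) < nbits)
		return -1;

	while (nbits > 0) {
		unsigned int avail = 8 - br->bit;
		unsigned int toread = nbits < avail ? nbits : avail;

		/* Mask off already-consumed high bits, then drop the
		 * low bits that belong to a later read. */
		ret <<= toread;
		ret |= (br->data[br->byte] & (0xffu >> br->bit)) >>
		       (avail - toread);

		br->bit += toread;
		if (br->bit == 8) {
			br->byte++;
			br->bit = 0;
		}
		nbits -= toread;
	}
	*val = ret;
	return 0;
}
```

The bounds check happens once up front, which mirrors how the `_inline` wrappers in the header guard the `_unchecked` variants.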
+67
@@ -0,0 +1,67 @@
/* Stub for <gst/base/gstbitwriter.h>.
*
* The vendored nalutils.c uses GstBitWriter for NAL emulation-prevention
* byte INSERTION during write-side (encoder) operations. The libva
* backend never invokes those paths — we only PARSE NAL units, never
* write them. The functions must still compile + link though, so we
* stub them with abort() runtime guards: if any future code path
* accidentally invokes a writer function, we fail-fast instead of
* silently corrupting.
*
* Header surface mirrors upstream gstbitwriter.h minimally — enough
* for nalutils.c to compile.
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_GSTBITWRITER_STUB
#define LIBVA_V4L2_REQUEST_FOURIER_GSTBITWRITER_STUB
#include "gst_compat.h"
typedef struct {
guint8 *data;
guint bit_size;
guint bit_capacity;
gboolean auto_grow;
gboolean owned;
} GstBitWriter;
static inline void
gst_bit_writer_init(GstBitWriter *bw) { (void)bw; abort(); }
static inline void
gst_bit_writer_init_with_size(GstBitWriter *bw, guint size, gboolean fixed) {
(void)bw; (void)size; (void)fixed; abort();
}
static inline void
gst_bit_writer_reset(GstBitWriter *bw) { (void)bw; abort(); }
static inline gboolean
gst_bit_writer_put_bits_uint8(GstBitWriter *bw, guint8 value, guint nbits) {
(void)bw; (void)value; (void)nbits; abort();
}
static inline gboolean
gst_bit_writer_align_bytes(GstBitWriter *bw, guint8 trailing_bit) {
(void)bw; (void)trailing_bit; abort();
}
static inline guint8 *
gst_bit_writer_get_data(GstBitWriter *bw) { (void)bw; abort(); }
static inline guint
gst_bit_writer_get_size(const GstBitWriter *bw) { (void)bw; abort(); }
static inline guint
gst_bit_writer_reset_and_get_size(GstBitWriter *bw) { (void)bw; abort(); }
static inline guint8 *
gst_bit_writer_reset_and_get_data(GstBitWriter *bw) { (void)bw; abort(); }
static inline gboolean
gst_bit_writer_put_bits_uint16(GstBitWriter *bw, guint16 value, guint nbits) {
(void)bw; (void)value; (void)nbits; abort();
}
static inline gboolean
gst_bit_writer_put_bits_uint32(GstBitWriter *bw, guint32 value, guint nbits) {
(void)bw; (void)value; (void)nbits; abort();
}
static inline gboolean
gst_bit_writer_put_bytes(GstBitWriter *bw, const guint8 *data, guint nbytes) {
(void)bw; (void)data; (void)nbytes; abort();
}
#define GST_BIT_WRITER_BIT_SIZE(bw) ((bw)->bit_size)
#define GST_BIT_WRITER_DATA(bw) ((bw)->data)
#endif
File diff suppressed because it is too large
+684
@@ -0,0 +1,684 @@
/* GStreamer byte reader
*
* Copyright (C) 2008 Sebastian Dröge <sebastian.droege@collabora.co.uk>.
* Copyright (C) 2009 Tim-Philipp Müller <tim centricular net>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Library General Public License for more details.
*
* You should have received a copy of the GNU Library General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
* Boston, MA 02110-1301, USA.
*/
#ifndef __GST_BYTE_READER_H__
#define __GST_BYTE_READER_H__
#include <gst/gst.h>
#include <gst/base/base-prelude.h>
G_BEGIN_DECLS
#define GST_BYTE_READER(reader) ((GstByteReader *) (reader))
/**
* GstByteReader:
* @data: (array length=size): Data from which the byte reader will
* read
* @size: Size of @data in bytes
* @byte: Current byte position
*
* A byte reader instance.
*/
typedef struct {
const guint8 *data;
guint size;
guint byte; /* Byte position */
/* < private > */
gpointer _gst_reserved[GST_PADDING];
} GstByteReader;
GST_BASE_API
GstByteReader * gst_byte_reader_new (const guint8 *data, guint size) G_GNUC_MALLOC;
GST_BASE_API
void gst_byte_reader_free (GstByteReader *reader);
GST_BASE_API
void gst_byte_reader_init (GstByteReader *reader, const guint8 *data, guint size);
GST_BASE_API
gboolean gst_byte_reader_peek_sub_reader (GstByteReader * reader,
GstByteReader * sub_reader,
guint size);
GST_BASE_API
gboolean gst_byte_reader_get_sub_reader (GstByteReader * reader,
GstByteReader * sub_reader,
guint size);
GST_BASE_API
gboolean gst_byte_reader_set_pos (GstByteReader *reader, guint pos);
GST_BASE_API
guint gst_byte_reader_get_pos (const GstByteReader *reader);
GST_BASE_API
guint gst_byte_reader_get_remaining (const GstByteReader *reader);
GST_BASE_API
guint gst_byte_reader_get_size (const GstByteReader *reader);
GST_BASE_API
gboolean gst_byte_reader_skip (GstByteReader *reader, guint nbytes);
GST_BASE_API
gboolean gst_byte_reader_get_uint8 (GstByteReader *reader, guint8 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int8 (GstByteReader *reader, gint8 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint16_le (GstByteReader *reader, guint16 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int16_le (GstByteReader *reader, gint16 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint16_be (GstByteReader *reader, guint16 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int16_be (GstByteReader *reader, gint16 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint24_le (GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int24_le (GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint24_be (GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int24_be (GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint32_le (GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int32_le (GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint32_be (GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int32_be (GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint64_le (GstByteReader *reader, guint64 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int64_le (GstByteReader *reader, gint64 *val);
GST_BASE_API
gboolean gst_byte_reader_get_uint64_be (GstByteReader *reader, guint64 *val);
GST_BASE_API
gboolean gst_byte_reader_get_int64_be (GstByteReader *reader, gint64 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint8 (const GstByteReader *reader, guint8 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int8 (const GstByteReader *reader, gint8 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint16_le (const GstByteReader *reader, guint16 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int16_le (const GstByteReader *reader, gint16 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint16_be (const GstByteReader *reader, guint16 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int16_be (const GstByteReader *reader, gint16 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint24_le (const GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int24_le (const GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint24_be (const GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int24_be (const GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint32_le (const GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int32_le (const GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint32_be (const GstByteReader *reader, guint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int32_be (const GstByteReader *reader, gint32 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint64_le (const GstByteReader *reader, guint64 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int64_le (const GstByteReader *reader, gint64 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_uint64_be (const GstByteReader *reader, guint64 *val);
GST_BASE_API
gboolean gst_byte_reader_peek_int64_be (const GstByteReader *reader, gint64 *val);
GST_BASE_API
gboolean gst_byte_reader_get_float32_le (GstByteReader *reader, gfloat *val);
GST_BASE_API
gboolean gst_byte_reader_get_float32_be (GstByteReader *reader, gfloat *val);
GST_BASE_API
gboolean gst_byte_reader_get_float64_le (GstByteReader *reader, gdouble *val);
GST_BASE_API
gboolean gst_byte_reader_get_float64_be (GstByteReader *reader, gdouble *val);
GST_BASE_API
gboolean gst_byte_reader_peek_float32_le (const GstByteReader *reader, gfloat *val);
GST_BASE_API
gboolean gst_byte_reader_peek_float32_be (const GstByteReader *reader, gfloat *val);
GST_BASE_API
gboolean gst_byte_reader_peek_float64_le (const GstByteReader *reader, gdouble *val);
GST_BASE_API
gboolean gst_byte_reader_peek_float64_be (const GstByteReader *reader, gdouble *val);
GST_BASE_API
gboolean gst_byte_reader_dup_data (GstByteReader * reader, guint size, guint8 ** val);
GST_BASE_API
gboolean gst_byte_reader_get_data (GstByteReader * reader, guint size, const guint8 ** val);
GST_BASE_API
gboolean gst_byte_reader_peek_data (const GstByteReader * reader, guint size, const guint8 ** val);
#define gst_byte_reader_dup_string(reader,str) \
gst_byte_reader_dup_string_utf8(reader,str)
GST_BASE_API
gboolean gst_byte_reader_dup_string_utf8 (GstByteReader * reader, gchar ** str);
GST_BASE_API
gboolean gst_byte_reader_dup_string_utf16 (GstByteReader * reader, guint16 ** str);
GST_BASE_API
gboolean gst_byte_reader_dup_string_utf32 (GstByteReader * reader, guint32 ** str);
#define gst_byte_reader_skip_string(reader) \
gst_byte_reader_skip_string_utf8(reader)
GST_BASE_API
gboolean gst_byte_reader_skip_string_utf8 (GstByteReader * reader);
GST_BASE_API
gboolean gst_byte_reader_skip_string_utf16 (GstByteReader * reader);
GST_BASE_API
gboolean gst_byte_reader_skip_string_utf32 (GstByteReader * reader);
#define gst_byte_reader_get_string(reader,str) \
gst_byte_reader_get_string_utf8(reader,str)
#define gst_byte_reader_peek_string(reader,str) \
gst_byte_reader_peek_string_utf8(reader,str)
GST_BASE_API
gboolean gst_byte_reader_get_string_utf8 (GstByteReader * reader, const gchar ** str);
GST_BASE_API
gboolean gst_byte_reader_peek_string_utf8 (const GstByteReader * reader, const gchar ** str);
GST_BASE_API
guint gst_byte_reader_masked_scan_uint32 (const GstByteReader * reader,
guint32 mask,
guint32 pattern,
guint offset,
guint size);
GST_BASE_API
guint gst_byte_reader_masked_scan_uint32_peek (const GstByteReader * reader,
guint32 mask,
guint32 pattern,
guint offset,
guint size,
guint32 * value);
/**
* GST_BYTE_READER_INIT:
* @data: Data from which the #GstByteReader should read
* @size: Size of @data in bytes
*
* A #GstByteReader must be initialized with this macro, before it can be
* used. This macro can be used to initialize a variable, but it cannot
* be assigned to a variable. In that case you have to use
* gst_byte_reader_init().
*/
#define GST_BYTE_READER_INIT(data, size) {data, size, 0}
/* unchecked variants */
static inline void
gst_byte_reader_skip_unchecked (GstByteReader * reader, guint nbytes)
{
reader->byte += nbytes;
}
#define __GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(bits,type,lower,upper,adj) \
\
static inline type \
gst_byte_reader_peek_##lower##_unchecked (const GstByteReader * reader) \
{ \
type val = (type) GST_READ_##upper (reader->data + reader->byte); \
adj \
return val; \
} \
\
static inline type \
gst_byte_reader_get_##lower##_unchecked (GstByteReader * reader) \
{ \
type val = gst_byte_reader_peek_##lower##_unchecked (reader); \
reader->byte += bits / 8; \
return val; \
}
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(8,guint8,uint8,UINT8,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(8,gint8,int8,UINT8,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(16,guint16,uint16_le,UINT16_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(16,guint16,uint16_be,UINT16_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(16,gint16,int16_le,UINT16_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(16,gint16,int16_be,UINT16_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(32,guint32,uint32_le,UINT32_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(32,guint32,uint32_be,UINT32_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(32,gint32,int32_le,UINT32_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(32,gint32,int32_be,UINT32_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(24,guint32,uint24_le,UINT24_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(24,guint32,uint24_be,UINT24_BE,/* */)
/* fix up the sign for 24-bit signed ints stored in 32-bit signed ints */
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(24,gint32,int24_le,UINT24_LE,
if (val & 0x00800000) val |= 0xff000000;)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(24,gint32,int24_be,UINT24_BE,
if (val & 0x00800000) val |= 0xff000000;)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(64,guint64,uint64_le,UINT64_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(64,guint64,uint64_be,UINT64_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(64,gint64,int64_le,UINT64_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(64,gint64,int64_be,UINT64_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(32,gfloat,float32_le,FLOAT_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(32,gfloat,float32_be,FLOAT_BE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(64,gdouble,float64_le,DOUBLE_LE,/* */)
__GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED(64,gdouble,float64_be,DOUBLE_BE,/* */)
#undef __GST_BYTE_READER_GET_PEEK_BITS_UNCHECKED
static inline const guint8 *
gst_byte_reader_peek_data_unchecked (const GstByteReader * reader)
{
return (const guint8 *) (reader->data + reader->byte);
}
static inline const guint8 *
gst_byte_reader_get_data_unchecked (GstByteReader * reader, guint size)
{
const guint8 *data;
data = gst_byte_reader_peek_data_unchecked (reader);
gst_byte_reader_skip_unchecked (reader, size);
return data;
}
static inline guint8 *
gst_byte_reader_dup_data_unchecked (GstByteReader * reader, guint size)
{
gconstpointer data = gst_byte_reader_get_data_unchecked (reader, size);
guint8 *dup_data = (guint8 *) g_malloc (size);
memcpy (dup_data, data, size);
return dup_data;
}
/* Unchecked variants that should not be used */
static inline guint
_gst_byte_reader_get_pos_unchecked (const GstByteReader * reader)
{
return reader->byte;
}
static inline guint
_gst_byte_reader_get_remaining_unchecked (const GstByteReader * reader)
{
return reader->size - reader->byte;
}
static inline guint
_gst_byte_reader_get_size_unchecked (const GstByteReader * reader)
{
return reader->size;
}
/* inlined variants (do not use directly) */
static inline guint
_gst_byte_reader_get_remaining_inline (const GstByteReader * reader)
{
g_return_val_if_fail (reader != NULL, 0);
return _gst_byte_reader_get_remaining_unchecked (reader);
}
static inline guint
_gst_byte_reader_get_size_inline (const GstByteReader * reader)
{
g_return_val_if_fail (reader != NULL, 0);
return _gst_byte_reader_get_size_unchecked (reader);
}
#define __GST_BYTE_READER_GET_PEEK_BITS_INLINE(bits,type,name) \
\
static inline gboolean \
_gst_byte_reader_peek_##name##_inline (const GstByteReader * reader, type * val) \
{ \
g_return_val_if_fail (reader != NULL, FALSE); \
g_return_val_if_fail (val != NULL, FALSE); \
\
if (_gst_byte_reader_get_remaining_unchecked (reader) < (bits / 8)) \
return FALSE; \
\
*val = gst_byte_reader_peek_##name##_unchecked (reader); \
return TRUE; \
} \
\
static inline gboolean \
_gst_byte_reader_get_##name##_inline (GstByteReader * reader, type * val) \
{ \
g_return_val_if_fail (reader != NULL, FALSE); \
g_return_val_if_fail (val != NULL, FALSE); \
\
if (_gst_byte_reader_get_remaining_unchecked (reader) < (bits / 8)) \
return FALSE; \
\
*val = gst_byte_reader_get_##name##_unchecked (reader); \
return TRUE; \
}
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(8,guint8,uint8)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(8,gint8,int8)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(16,guint16,uint16_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(16,guint16,uint16_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(16,gint16,int16_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(16,gint16,int16_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(32,guint32,uint32_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(32,guint32,uint32_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(32,gint32,int32_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(32,gint32,int32_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(24,guint32,uint24_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(24,guint32,uint24_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(24,gint32,int24_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(24,gint32,int24_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(64,guint64,uint64_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(64,guint64,uint64_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(64,gint64,int64_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(64,gint64,int64_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(32,gfloat,float32_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(32,gfloat,float32_be)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(64,gdouble,float64_le)
__GST_BYTE_READER_GET_PEEK_BITS_INLINE(64,gdouble,float64_be)
#undef __GST_BYTE_READER_GET_PEEK_BITS_INLINE
#ifndef GST_BYTE_READER_DISABLE_INLINES
#define gst_byte_reader_init(reader,data,size) \
_gst_byte_reader_init_inline(reader,data,size)
#define gst_byte_reader_get_remaining(reader) \
_gst_byte_reader_get_remaining_inline(reader)
#define gst_byte_reader_get_size(reader) \
_gst_byte_reader_get_size_inline(reader)
#define gst_byte_reader_get_pos(reader) \
_gst_byte_reader_get_pos_inline(reader)
/* we use defines here so we can add the G_LIKELY() */
#define gst_byte_reader_get_uint8(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint8_inline(reader,val))
#define gst_byte_reader_get_int8(reader,val) \
G_LIKELY(_gst_byte_reader_get_int8_inline(reader,val))
#define gst_byte_reader_get_uint16_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint16_le_inline(reader,val))
#define gst_byte_reader_get_int16_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_int16_le_inline(reader,val))
#define gst_byte_reader_get_uint16_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint16_be_inline(reader,val))
#define gst_byte_reader_get_int16_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_int16_be_inline(reader,val))
#define gst_byte_reader_get_uint24_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint24_le_inline(reader,val))
#define gst_byte_reader_get_int24_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_int24_le_inline(reader,val))
#define gst_byte_reader_get_uint24_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint24_be_inline(reader,val))
#define gst_byte_reader_get_int24_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_int24_be_inline(reader,val))
#define gst_byte_reader_get_uint32_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint32_le_inline(reader,val))
#define gst_byte_reader_get_int32_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_int32_le_inline(reader,val))
#define gst_byte_reader_get_uint32_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint32_be_inline(reader,val))
#define gst_byte_reader_get_int32_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_int32_be_inline(reader,val))
#define gst_byte_reader_get_uint64_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint64_le_inline(reader,val))
#define gst_byte_reader_get_int64_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_int64_le_inline(reader,val))
#define gst_byte_reader_get_uint64_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_uint64_be_inline(reader,val))
#define gst_byte_reader_get_int64_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_int64_be_inline(reader,val))
#define gst_byte_reader_peek_uint8(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint8_inline(reader,val))
#define gst_byte_reader_peek_int8(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int8_inline(reader,val))
#define gst_byte_reader_peek_uint16_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint16_le_inline(reader,val))
#define gst_byte_reader_peek_int16_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int16_le_inline(reader,val))
#define gst_byte_reader_peek_uint16_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint16_be_inline(reader,val))
#define gst_byte_reader_peek_int16_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int16_be_inline(reader,val))
#define gst_byte_reader_peek_uint24_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint24_le_inline(reader,val))
#define gst_byte_reader_peek_int24_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int24_le_inline(reader,val))
#define gst_byte_reader_peek_uint24_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint24_be_inline(reader,val))
#define gst_byte_reader_peek_int24_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int24_be_inline(reader,val))
#define gst_byte_reader_peek_uint32_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint32_le_inline(reader,val))
#define gst_byte_reader_peek_int32_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int32_le_inline(reader,val))
#define gst_byte_reader_peek_uint32_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint32_be_inline(reader,val))
#define gst_byte_reader_peek_int32_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int32_be_inline(reader,val))
#define gst_byte_reader_peek_uint64_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint64_le_inline(reader,val))
#define gst_byte_reader_peek_int64_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int64_le_inline(reader,val))
#define gst_byte_reader_peek_uint64_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_uint64_be_inline(reader,val))
#define gst_byte_reader_peek_int64_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_int64_be_inline(reader,val))
#define gst_byte_reader_get_float32_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_float32_le_inline(reader,val))
#define gst_byte_reader_get_float32_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_float32_be_inline(reader,val))
#define gst_byte_reader_get_float64_le(reader,val) \
G_LIKELY(_gst_byte_reader_get_float64_le_inline(reader,val))
#define gst_byte_reader_get_float64_be(reader,val) \
G_LIKELY(_gst_byte_reader_get_float64_be_inline(reader,val))
#define gst_byte_reader_peek_float32_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_float32_le_inline(reader,val))
#define gst_byte_reader_peek_float32_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_float32_be_inline(reader,val))
#define gst_byte_reader_peek_float64_le(reader,val) \
G_LIKELY(_gst_byte_reader_peek_float64_le_inline(reader,val))
#define gst_byte_reader_peek_float64_be(reader,val) \
G_LIKELY(_gst_byte_reader_peek_float64_be_inline(reader,val))
#endif /* GST_BYTE_READER_DISABLE_INLINES */
static inline void
_gst_byte_reader_init_inline (GstByteReader * reader, const guint8 * data, guint size)
{
g_return_if_fail (reader != NULL);
reader->data = data;
reader->size = size;
reader->byte = 0;
}
static inline gboolean
_gst_byte_reader_peek_sub_reader_inline (GstByteReader * reader,
GstByteReader * sub_reader, guint size)
{
g_return_val_if_fail (reader != NULL, FALSE);
g_return_val_if_fail (sub_reader != NULL, FALSE);
if (_gst_byte_reader_get_remaining_unchecked (reader) < size)
return FALSE;
sub_reader->data = reader->data + reader->byte;
sub_reader->byte = 0;
sub_reader->size = size;
return TRUE;
}
static inline gboolean
_gst_byte_reader_get_sub_reader_inline (GstByteReader * reader,
GstByteReader * sub_reader, guint size)
{
if (!_gst_byte_reader_peek_sub_reader_inline (reader, sub_reader, size))
return FALSE;
gst_byte_reader_skip_unchecked (reader, size);
return TRUE;
}
static inline gboolean
_gst_byte_reader_dup_data_inline (GstByteReader * reader, guint size, guint8 ** val)
{
g_return_val_if_fail (reader != NULL, FALSE);
g_return_val_if_fail (val != NULL, FALSE);
if (G_UNLIKELY (size > reader->size || _gst_byte_reader_get_remaining_unchecked (reader) < size))
return FALSE;
*val = gst_byte_reader_dup_data_unchecked (reader, size);
return TRUE;
}
static inline gboolean
_gst_byte_reader_get_data_inline (GstByteReader * reader, guint size, const guint8 ** val)
{
g_return_val_if_fail (reader != NULL, FALSE);
g_return_val_if_fail (val != NULL, FALSE);
if (G_UNLIKELY (size > reader->size || _gst_byte_reader_get_remaining_unchecked (reader) < size))
return FALSE;
*val = gst_byte_reader_get_data_unchecked (reader, size);
return TRUE;
}
static inline gboolean
_gst_byte_reader_peek_data_inline (const GstByteReader * reader, guint size, const guint8 ** val)
{
g_return_val_if_fail (reader != NULL, FALSE);
g_return_val_if_fail (val != NULL, FALSE);
if (G_UNLIKELY (size > reader->size || _gst_byte_reader_get_remaining_unchecked (reader) < size))
return FALSE;
*val = gst_byte_reader_peek_data_unchecked (reader);
return TRUE;
}
static inline guint
_gst_byte_reader_get_pos_inline (const GstByteReader * reader)
{
g_return_val_if_fail (reader != NULL, 0);
return _gst_byte_reader_get_pos_unchecked (reader);
}
static inline gboolean
_gst_byte_reader_skip_inline (GstByteReader * reader, guint nbytes)
{
g_return_val_if_fail (reader != NULL, FALSE);
if (G_UNLIKELY (_gst_byte_reader_get_remaining_unchecked (reader) < nbytes))
return FALSE;
reader->byte += nbytes;
return TRUE;
}
#ifndef GST_BYTE_READER_DISABLE_INLINES
#define gst_byte_reader_dup_data(reader,size,val) \
G_LIKELY(_gst_byte_reader_dup_data_inline(reader,size,val))
#define gst_byte_reader_get_data(reader,size,val) \
G_LIKELY(_gst_byte_reader_get_data_inline(reader,size,val))
#define gst_byte_reader_peek_data(reader,size,val) \
G_LIKELY(_gst_byte_reader_peek_data_inline(reader,size,val))
#define gst_byte_reader_skip(reader,nbytes) \
G_LIKELY(_gst_byte_reader_skip_inline(reader,nbytes))
#endif /* GST_BYTE_READER_DISABLE_INLINES */
G_END_DECLS
#endif /* __GST_BYTE_READER_H__ */
@@ -0,0 +1,9 @@
/* Stub for <gst/codecparsers/codecparsers-prelude.h>.
* Same shape as base-prelude.h: drop the GObject boilerplate and define
* the GST_CODEC_PARSERS_API macro to nothing.
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_CODECPARSERS_PRELUDE_STUB
#define LIBVA_V4L2_REQUEST_FOURIER_CODECPARSERS_PRELUDE_STUB
#include "gst_compat.h"
#define GST_CODEC_PARSERS_API
#endif
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,545 @@
/* Gstreamer
* Copyright (C) <2011> Intel Corporation
* Copyright (C) <2011> Collabora Ltd.
* Copyright (C) <2011> Thibault Saunier <thibault.saunier@collabora.com>
*
* Some bits C-c,C-v'ed and s/4/3 from h264parse and videoparsers/h264parse.c:
* Copyright (C) <2010> Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
* Copyright (C) <2010> Collabora Multimedia
* Copyright (C) <2010> Nokia Corporation
*
* (C) 2005 Michal Benes <michal.benes@itonis.tv>
* (C) 2008 Wim Taymans <wim.taymans@gmail.com>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Library General Public License for more details.
*
* You should have received a copy of the GNU Library General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
* Boston, MA 02110-1301, USA.
*/
/*
* Common code for NAL parsing from h264 and h265 parsers.
*/
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif
#include "nalutils.h"
/****** Nal parser ******/
void
nal_reader_init (NalReader * nr, const guint8 * data, guint size)
{
nr->data = data;
nr->size = size;
nr->n_epb = 0;
nr->byte = 0;
nr->bits_in_cache = 0;
/* fill with something other than 0 to detect emulation prevention bytes */
nr->first_byte = 0xff;
nr->epb_cache = 0xff;
nr->cache = 0xff;
}
gboolean
nal_reader_read (NalReader * nr, guint nbits)
{
if (G_UNLIKELY (nr->byte * 8 + (nbits - nr->bits_in_cache) > nr->size * 8)) {
GST_DEBUG ("Can not read %u bits, bits in cache %u, Byte * 8 %u, size in "
"bits %u", nbits, nr->bits_in_cache, nr->byte * 8, nr->size * 8);
return FALSE;
}
while (nr->bits_in_cache < nbits) {
guint8 byte;
next_byte:
if (G_UNLIKELY (nr->byte >= nr->size))
return FALSE;
byte = nr->data[nr->byte++];
nr->epb_cache = (nr->epb_cache << 8) | byte;
/* check if the byte is an emulation_prevention_three_byte */
if ((nr->epb_cache & 0xffffff) == 0x3) {
nr->n_epb++;
goto next_byte;
}
nr->cache = (nr->cache << 8) | nr->first_byte;
nr->first_byte = byte;
nr->bits_in_cache += 8;
}
return TRUE;
}
/* Skips the specified number of bits. This is only suitable for a
cacheable number of bits */
gboolean
nal_reader_skip (NalReader * nr, guint nbits)
{
g_assert (nbits <= 8 * sizeof (nr->cache));
if (G_UNLIKELY (!nal_reader_read (nr, nbits)))
return FALSE;
nr->bits_in_cache -= nbits;
return TRUE;
}
/* Generic version to skip any number of bits */
gboolean
nal_reader_skip_long (NalReader * nr, guint nbits)
{
/* Leave out enough bits in the cache once we are finished */
const guint skip_size = 4 * sizeof (nr->cache);
guint remaining = nbits;
nbits %= skip_size;
while (remaining > 0) {
if (!nal_reader_skip (nr, nbits))
return FALSE;
remaining -= nbits;
nbits = skip_size;
}
return TRUE;
}
guint
nal_reader_get_pos (const NalReader * nr)
{
return nr->byte * 8 - nr->bits_in_cache;
}
guint
nal_reader_get_remaining (const NalReader * nr)
{
return (nr->size - nr->byte) * 8 + nr->bits_in_cache;
}
guint
nal_reader_get_epb_count (const NalReader * nr)
{
return nr->n_epb;
}
#define NAL_READER_READ_BITS(bits) \
gboolean \
nal_reader_get_bits_uint##bits (NalReader *nr, guint##bits *val, guint nbits) \
{ \
guint shift; \
\
if (!nal_reader_read (nr, nbits)) \
return FALSE; \
\
/* bring the required bits down and truncate */ \
shift = nr->bits_in_cache - nbits; \
*val = nr->first_byte >> shift; \
\
*val |= nr->cache << (8 - shift); \
/* mask out required bits */ \
if (nbits < bits) \
*val &= ((guint##bits)1 << nbits) - 1; \
\
nr->bits_in_cache = shift; \
\
return TRUE; \
}
NAL_READER_READ_BITS (8);
NAL_READER_READ_BITS (16);
NAL_READER_READ_BITS (32);
#define NAL_READER_PEEK_BITS(bits) \
gboolean \
nal_reader_peek_bits_uint##bits (const NalReader *nr, guint##bits *val, guint nbits) \
{ \
NalReader tmp; \
\
tmp = *nr; \
return nal_reader_get_bits_uint##bits (&tmp, val, nbits); \
}
NAL_READER_PEEK_BITS (8);
gboolean
nal_reader_get_ue (NalReader * nr, guint32 * val)
{
guint i = 0;
guint8 bit;
guint32 value;
if (G_UNLIKELY (!nal_reader_get_bits_uint8 (nr, &bit, 1)))
return FALSE;
while (bit == 0) {
i++;
if (G_UNLIKELY (!nal_reader_get_bits_uint8 (nr, &bit, 1)))
return FALSE;
}
if (G_UNLIKELY (i > 31))
return FALSE;
if (G_UNLIKELY (!nal_reader_get_bits_uint32 (nr, &value, i)))
return FALSE;
*val = ((guint32) 1 << i) - 1 + value;
return TRUE;
}
gboolean
nal_reader_get_se (NalReader * nr, gint32 * val)
{
guint32 value;
if (G_UNLIKELY (!nal_reader_get_ue (nr, &value)))
return FALSE;
if (value % 2)
*val = (value / 2) + 1;
else
*val = -(value / 2);
return TRUE;
}
gboolean
nal_reader_is_byte_aligned (NalReader * nr)
{
if (nr->bits_in_cache != 0)
return FALSE;
return TRUE;
}
gboolean
nal_reader_has_more_data (NalReader * nr)
{
NalReader nr_tmp;
guint remaining, nbits;
guint8 rbsp_stop_one_bit, zero_bits;
remaining = nal_reader_get_remaining (nr);
if (remaining == 0)
return FALSE;
nr_tmp = *nr;
nr = &nr_tmp;
/* The spec defines that more_rbsp_data() searches for the last bit
equal to 1, and that it is the rbsp_stop_one_bit. Subsequent bits
until the byte boundary is reached shall be zero.
This means that more_rbsp_data() is FALSE if the next bit is 1
and the remaining bits until the byte boundary are zero. One way to
be sure that this bit was the very last one is to check that every
bit after the byte boundary is also zero.
Otherwise, if the next bit is 0 or if there are non-zero bits
afterwards, then we have more_rbsp_data() */
if (!nal_reader_get_bits_uint8 (nr, &rbsp_stop_one_bit, 1))
return FALSE;
if (!rbsp_stop_one_bit)
return TRUE;
nbits = --remaining % 8;
while (remaining > 0) {
if (!nal_reader_get_bits_uint8 (nr, &zero_bits, nbits))
return FALSE;
if (zero_bits != 0)
return TRUE;
remaining -= nbits;
nbits = 8;
}
return FALSE;
}
/*********** end of nal parser ***************/
gint
scan_for_start_codes (const guint8 * data, guint size)
{
GstByteReader br;
gst_byte_reader_init (&br, data, size);
/* NALU not empty, so we can at least expect 1 (even 2) bytes following sc */
return gst_byte_reader_masked_scan_uint32 (&br, 0xffffff00, 0x00000100,
0, size);
}
void
nal_writer_init (NalWriter * nw, guint nal_prefix_size, gboolean packetized)
{
g_return_if_fail (nw != NULL);
g_return_if_fail ((packetized && nal_prefix_size > 1 && nal_prefix_size < 5)
|| (!packetized && (nal_prefix_size == 3 || nal_prefix_size == 4)));
gst_bit_writer_init (&nw->bw);
nw->nal_prefix_size = nal_prefix_size;
nw->packetized = packetized;
}
void
nal_writer_reset (NalWriter * nw)
{
g_return_if_fail (nw != NULL);
gst_bit_writer_reset (&nw->bw);
memset (nw, 0, sizeof (NalWriter));
}
gboolean
nal_writer_do_rbsp_trailing_bits (NalWriter * nw)
{
g_return_val_if_fail (nw != NULL, FALSE);
if (!gst_bit_writer_put_bits_uint8 (&nw->bw, 1, 1)) {
GST_WARNING ("Cannot put trailing bits");
return FALSE;
}
if (!gst_bit_writer_align_bytes (&nw->bw, 0)) {
GST_WARNING ("Cannot put align bits");
return FALSE;
}
return TRUE;
}
static gpointer
nal_writer_create_nal_data (NalWriter * nw, guint32 * ret_size)
{
GstBitWriter bw;
gint i;
guint8 *src, *dst;
gsize size;
gpointer data;
/* scan to put emulation_prevention_three_byte */
size = GST_BIT_WRITER_BIT_SIZE (&nw->bw) >> 3;
src = GST_BIT_WRITER_DATA (&nw->bw);
gst_bit_writer_init_with_size (&bw, size + nw->nal_prefix_size, FALSE);
for (i = 0; i < nw->nal_prefix_size - 1; i++)
gst_bit_writer_put_bits_uint8 (&bw, 0, 8);
gst_bit_writer_put_bits_uint8 (&bw, 1, 8);
for (i = 0; i < size; i++) {
guint pos = (GST_BIT_WRITER_BIT_SIZE (&bw) >> 3);
dst = GST_BIT_WRITER_DATA (&bw);
if (pos >= nw->nal_prefix_size + 2 &&
dst[pos - 2] == 0 && dst[pos - 1] == 0 && src[i] <= 0x3) {
gst_bit_writer_put_bits_uint8 (&bw, 0x3, 8);
}
gst_bit_writer_put_bits_uint8 (&bw, src[i], 8);
}
*ret_size = bw.bit_size >> 3;
data = gst_bit_writer_reset_and_get_data (&bw);
if (nw->packetized) {
size = *ret_size - nw->nal_prefix_size;
switch (nw->nal_prefix_size) {
case 1:
GST_WRITE_UINT8 (data, size);
break;
case 2:
GST_WRITE_UINT16_BE (data, size);
break;
case 3:
GST_WRITE_UINT24_BE (data, size);
break;
case 4:
GST_WRITE_UINT32_BE (data, size);
break;
default:
g_assert_not_reached ();
break;
}
}
return data;
}
GstMemory *
nal_writer_reset_and_get_memory (NalWriter * nw)
{
guint32 size = 0;
GstMemory *ret = NULL;
gpointer data;
g_return_val_if_fail (nw != NULL, NULL);
if ((GST_BIT_WRITER_BIT_SIZE (&nw->bw) >> 3) == 0) {
GST_WARNING ("No written byte");
goto done;
}
if ((GST_BIT_WRITER_BIT_SIZE (&nw->bw) & 0x7) != 0) {
GST_WARNING ("Written stream is not byte aligned");
if (!nal_writer_do_rbsp_trailing_bits (nw))
goto done;
}
data = nal_writer_create_nal_data (nw, &size);
if (!data) {
GST_WARNING ("Failed to create nal data");
goto done;
}
ret = gst_memory_new_wrapped (0, data, size, 0, size, data, g_free);
done:
gst_bit_writer_reset (&nw->bw);
return ret;
}
guint8 *
nal_writer_reset_and_get_data (NalWriter * nw, guint32 * ret_size)
{
guint32 size = 0;
guint8 *data = NULL;
g_return_val_if_fail (nw != NULL, NULL);
g_return_val_if_fail (ret_size != NULL, NULL);
*ret_size = 0;
if ((GST_BIT_WRITER_BIT_SIZE (&nw->bw) >> 3) == 0) {
GST_WARNING ("No written byte");
goto done;
}
if ((GST_BIT_WRITER_BIT_SIZE (&nw->bw) & 0x7) != 0) {
GST_WARNING ("Written stream is not byte aligned");
if (!nal_writer_do_rbsp_trailing_bits (nw))
goto done;
}
data = nal_writer_create_nal_data (nw, &size);
if (!data) {
GST_WARNING ("Failed to create nal data");
goto done;
}
*ret_size = size;
done:
gst_bit_writer_reset (&nw->bw);
return data;
}
gboolean
nal_writer_put_bits_uint8 (NalWriter * nw, guint8 value, guint nbits)
{
g_return_val_if_fail (nw != NULL, FALSE);
if (!gst_bit_writer_put_bits_uint8 (&nw->bw, value, nbits))
return FALSE;
return TRUE;
}
gboolean
nal_writer_put_bits_uint16 (NalWriter * nw, guint16 value, guint nbits)
{
g_return_val_if_fail (nw != NULL, FALSE);
if (!gst_bit_writer_put_bits_uint16 (&nw->bw, value, nbits))
return FALSE;
return TRUE;
}
gboolean
nal_writer_put_bits_uint32 (NalWriter * nw, guint32 value, guint nbits)
{
g_return_val_if_fail (nw != NULL, FALSE);
if (!gst_bit_writer_put_bits_uint32 (&nw->bw, value, nbits))
return FALSE;
return TRUE;
}
gboolean
nal_writer_put_bytes (NalWriter * nw, const guint8 * data, guint nbytes)
{
g_return_val_if_fail (nw != NULL, FALSE);
g_return_val_if_fail (data != NULL, FALSE);
g_return_val_if_fail (nbytes != 0, FALSE);
if (!gst_bit_writer_put_bytes (&nw->bw, data, nbytes))
return FALSE;
return TRUE;
}
gboolean
nal_writer_put_ue (NalWriter * nw, guint32 value)
{
guint leading_zeros;
guint rest;
g_return_val_if_fail (nw != NULL, FALSE);
count_exp_golomb_bits (value, &leading_zeros, &rest);
/* write leading zeros */
if (leading_zeros) {
if (!nal_writer_put_bits_uint32 (nw, 0, leading_zeros))
return FALSE;
}
/* write the rest */
if (!nal_writer_put_bits_uint32 (nw, value + 1, rest))
return FALSE;
return TRUE;
}
gboolean
count_exp_golomb_bits (guint32 value, guint * leading_zeros, guint * rest)
{
guint32 x;
guint count = 0;
/* https://en.wikipedia.org/wiki/Exponential-Golomb_coding */
/* count bits of value + 1 */
x = value + 1;
while (x) {
count++;
x >>= 1;
}
if (leading_zeros) {
if (count > 1)
*leading_zeros = count - 1;
else
*leading_zeros = 0;
}
if (rest) {
*rest = count;
}
return TRUE;
}
@@ -0,0 +1,269 @@
/* Gstreamer
* Copyright (C) <2011> Intel Corporation
* Copyright (C) <2011> Collabora Ltd.
* Copyright (C) <2011> Thibault Saunier <thibault.saunier@collabora.com>
*
* Some bits C-c,C-v'ed and s/4/3 from h264parse and videoparsers/h264parse.c:
* Copyright (C) <2010> Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
* Copyright (C) <2010> Collabora Multimedia
* Copyright (C) <2010> Nokia Corporation
*
* (C) 2005 Michal Benes <michal.benes@itonis.tv>
* (C) 2008 Wim Taymans <wim.taymans@gmail.com>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Library General Public License for more details.
*
* You should have received a copy of the GNU Library General Public
* License along with this library; if not, write to the
* Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
* Boston, MA 02110-1301, USA.
*/
/**
* Common code for NAL parsing from h264 and h265 parsers.
*/
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif
#include <gst/base/gstbytereader.h>
#include <gst/base/gstbitreader.h>
#include <gst/base/gstbitwriter.h>
typedef struct
{
const guint8 *data;
guint size;
guint n_epb; /* Number of emulation prevention bytes */
guint byte; /* Byte position */
guint bits_in_cache; /* bitpos in the cache of next bit */
guint8 first_byte;
guint32 epb_cache; /* cache 3 bytes to check emulation prevention bytes */
guint64 cache; /* cached bytes */
} NalReader;
typedef struct
{
GstBitWriter bw;
guint nal_prefix_size;
gboolean packetized;
} NalWriter;
G_GNUC_INTERNAL
void nal_reader_init (NalReader * nr, const guint8 * data, guint size);
G_GNUC_INTERNAL
gboolean nal_reader_read (NalReader * nr, guint nbits);
G_GNUC_INTERNAL
gboolean nal_reader_skip (NalReader * nr, guint nbits);
G_GNUC_INTERNAL
gboolean nal_reader_skip_long (NalReader * nr, guint nbits);
G_GNUC_INTERNAL
guint nal_reader_get_pos (const NalReader * nr);
G_GNUC_INTERNAL
guint nal_reader_get_remaining (const NalReader * nr);
G_GNUC_INTERNAL
guint nal_reader_get_epb_count (const NalReader * nr);
G_GNUC_INTERNAL
gboolean nal_reader_is_byte_aligned (NalReader * nr);
G_GNUC_INTERNAL
gboolean nal_reader_has_more_data (NalReader * nr);
#define NAL_READER_READ_BITS_H(bits) \
G_GNUC_INTERNAL \
gboolean nal_reader_get_bits_uint##bits (NalReader *nr, guint##bits *val, guint nbits)
NAL_READER_READ_BITS_H (8);
NAL_READER_READ_BITS_H (16);
NAL_READER_READ_BITS_H (32);
#define NAL_READER_PEEK_BITS_H(bits) \
G_GNUC_INTERNAL \
gboolean nal_reader_peek_bits_uint##bits (const NalReader *nr, guint##bits *val, guint nbits)
NAL_READER_PEEK_BITS_H (8);
G_GNUC_INTERNAL
gboolean nal_reader_get_ue (NalReader * nr, guint32 * val);
G_GNUC_INTERNAL
gboolean nal_reader_get_se (NalReader * nr, gint32 * val);
#define CHECK_ALLOWED_MAX_WITH_DEBUG(dbg, val, max) { \
if (val > max) { \
GST_WARNING ("value for '" dbg "' greater than max. value: %d, max %d", \
val, max); \
goto error; \
} \
}
#define CHECK_ALLOWED_MAX(val, max) \
CHECK_ALLOWED_MAX_WITH_DEBUG (G_STRINGIFY (val), val, max)
#define CHECK_ALLOWED_WITH_DEBUG(dbg, val, min, max) { \
if (val < min || val > max) { \
GST_WARNING ("value for '" dbg "' not in allowed range. value: %d, range %d-%d", \
val, min, max); \
goto error; \
} \
}
#define CHECK_ALLOWED(val, min, max) \
CHECK_ALLOWED_WITH_DEBUG (G_STRINGIFY (val), val, min, max)
#define READ_UINT8(nr, val, nbits) { \
if (!nal_reader_get_bits_uint8 (nr, &val, nbits)) { \
GST_WARNING ("failed to read uint8 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define READ_UINT16(nr, val, nbits) { \
if (!nal_reader_get_bits_uint16 (nr, &val, nbits)) { \
GST_WARNING ("failed to read uint16 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define READ_UINT32(nr, val, nbits) { \
if (!nal_reader_get_bits_uint32 (nr, &val, nbits)) { \
GST_WARNING ("failed to read uint32 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define READ_UINT64(nr, val, nbits) { \
if (!nal_reader_get_bits_uint64 (nr, &val, nbits)) { \
GST_WARNING ("failed to read uint64 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define READ_UE(nr, val) { \
if (!nal_reader_get_ue (nr, &val)) { \
GST_WARNING ("failed to read UE for '" G_STRINGIFY (val) "'"); \
goto error; \
} \
}
#define READ_UE_ALLOWED(nr, val, min, max) { \
guint32 tmp; \
READ_UE (nr, tmp); \
CHECK_ALLOWED_WITH_DEBUG (G_STRINGIFY (val), tmp, min, max); \
val = tmp; \
}
#define READ_UE_MAX(nr, val, max) { \
guint32 tmp; \
READ_UE (nr, tmp); \
CHECK_ALLOWED_MAX_WITH_DEBUG (G_STRINGIFY (val), tmp, max); \
val = tmp; \
}
#define READ_SE(nr, val) { \
if (!nal_reader_get_se (nr, &val)) { \
GST_WARNING ("failed to read SE for '" G_STRINGIFY (val) "'"); \
goto error; \
} \
}
#define READ_SE_ALLOWED(nr, val, min, max) { \
gint32 tmp; \
READ_SE (nr, tmp); \
CHECK_ALLOWED_WITH_DEBUG (G_STRINGIFY (val), tmp, min, max); \
val = tmp; \
}
G_GNUC_INTERNAL
gint scan_for_start_codes (const guint8 * data, guint size);
G_GNUC_INTERNAL
void nal_writer_init (NalWriter * nw, guint nal_prefix_size, gboolean packetized);
G_GNUC_INTERNAL
void nal_writer_reset (NalWriter * nw);
G_GNUC_INTERNAL
gboolean nal_writer_do_rbsp_trailing_bits (NalWriter * nw);
G_GNUC_INTERNAL
GstMemory * nal_writer_reset_and_get_memory (NalWriter * nw);
G_GNUC_INTERNAL
guint8 * nal_writer_reset_and_get_data (NalWriter * nw, guint32 * ret_size);
G_GNUC_INTERNAL
gboolean nal_writer_put_bits_uint8 (NalWriter * nw, guint8 value, guint nbits);
G_GNUC_INTERNAL
gboolean nal_writer_put_bits_uint16 (NalWriter * nw, guint16 value, guint nbits);
G_GNUC_INTERNAL
gboolean nal_writer_put_bits_uint32 (NalWriter * nw, guint32 value, guint nbits);
G_GNUC_INTERNAL
gboolean nal_writer_put_bytes (NalWriter * nw, const guint8 * data, guint nbytes);
G_GNUC_INTERNAL
gboolean nal_writer_put_ue (NalWriter * nw, guint32 value);
G_GNUC_INTERNAL
gboolean count_exp_golomb_bits (guint32 value, guint * leading_zeros, guint * rest);
#define WRITE_UINT8(nw, val, nbits) { \
if (!nal_writer_put_bits_uint8 (nw, val, nbits)) { \
GST_WARNING ("failed to write uint8 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define WRITE_UINT16(nw, val, nbits) { \
if (!nal_writer_put_bits_uint16 (nw, val, nbits)) { \
GST_WARNING ("failed to write uint16 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define WRITE_UINT32(nw, val, nbits) { \
if (!nal_writer_put_bits_uint32 (nw, val, nbits)) { \
GST_WARNING ("failed to write uint32 for '" G_STRINGIFY (val) "', nbits: %d", nbits); \
goto error; \
} \
}
#define WRITE_BYTES(nw, data, nbytes) { \
if (!nal_writer_put_bytes (nw, data, nbytes)) { \
GST_WARNING ("failed to write bytes for '" G_STRINGIFY (data) "', nbytes: %d", nbytes); \
goto error; \
} \
}
#define WRITE_UE(nw, val) { \
if (!nal_writer_put_ue (nw, val)) { \
GST_WARNING ("failed to write ue for '" G_STRINGIFY (val) "'"); \
goto error; \
} \
}
static inline guint32 div_ceil (guint32 a, guint32 b)
{
/* http://blog.pkh.me/p/36-figuring-out-round%2C-floor-and-ceil-with-integer-division.html */
g_assert (b > 0);
return a / b + (a % b > 0);
}
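The `nal_reader_get_ue` / `nal_reader_get_se` prototypes above read Exp-Golomb codes. As a rough illustration of that coding only, here is a minimal sketch. The names `demo_bits`, `demo_read_bit`, `demo_read_ue` and `demo_read_se` are hypothetical and not part of nalutils, and the sketch skips the emulation-prevention-byte handling that the real NalReader tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit cursor (MSB first); NOT the NalReader above. */
typedef struct {
    const uint8_t *data;
    size_t size_bits;
    size_t pos_bits;
} demo_bits;

static bool demo_read_bit(demo_bits *b, uint32_t *bit)
{
    assert(b != NULL && bit != NULL);
    if (b->pos_bits >= b->size_bits)
        return false;
    *bit = (b->data[b->pos_bits / 8] >> (7 - b->pos_bits % 8)) & 1u;
    b->pos_bits++;
    return true;
}

/* ue(v): count zeros up to the first 1 bit, then read that many suffix
 * bits; the value is 2^zeros - 1 + suffix. */
static bool demo_read_ue(demo_bits *b, uint32_t *val)
{
    uint32_t bit = 0, zeros = 0, suffix = 0;

    for (;;) {
        if (!demo_read_bit(b, &bit))
            return false;
        if (bit)
            break;
        if (++zeros > 31)          /* result would not fit in 32 bits */
            return false;
    }
    for (uint32_t i = 0; i < zeros; i++) {
        if (!demo_read_bit(b, &bit))
            return false;
        suffix = (suffix << 1) | bit;
    }
    *val = ((uint32_t)1 << zeros) - 1 + suffix;
    return true;
}

/* se(v): odd code numbers map to positive values, even ones to
 * non-positive values: 0, +1, -1, +2, -2, ... */
static bool demo_read_se(demo_bits *b, int32_t *val)
{
    uint32_t k;

    if (!demo_read_ue(b, &k))
        return false;
    *val = (k & 1) ? (int32_t)((k + 1) / 2) : -(int32_t)(k / 2);
    return true;
}
```

Both readers report failure through the return value, which matches how the `READ_UE` / `READ_SE` macros turn a failed read into `goto error`.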
@@ -0,0 +1,10 @@
/* Stub for <gst/glib-compat-private.h>.
* In upstream GStreamer this provides backwards-compat shims for older
* GLib versions (g_memdup2 polyfill being the load-bearing one).
* Our gst_compat.h already defines g_memdup2 as a static inline, so
* we just include the shim.
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_GLIB_COMPAT_PRIVATE_STUB
#define LIBVA_V4L2_REQUEST_FOURIER_GLIB_COMPAT_PRIVATE_STUB
#include "gst_compat.h"
#endif
@@ -0,0 +1,10 @@
/* Stub for <gst/gst.h> — redirects to the project's gst_compat shim.
* The vendored GStreamer 1.28.2 H.265 parser was originally built against
* full GStreamer; we only need the GLib type aliases + memory helpers +
* macro stubs, all provided by gst_compat.h. Original gst.h would pull
* in GObject + GstObject + the entire framework, which we don't link.
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_GST_H_STUB
#define LIBVA_V4L2_REQUEST_FOURIER_GST_H_STUB
#include "gst_compat.h"
#endif
@@ -0,0 +1,145 @@
/*
* gst_compat.c — GArray implementation for the vendored GStreamer parser.
*
* Scope: minimal subset of GArray API exercised by gsth265parser.c
* (g_array_new, g_array_sized_new, g_array_append_vals + the
* g_array_append_val macro, g_array_index macro, g_array_set_size,
* g_array_set_clear_func, g_array_free, g_array_unref).
*
* Non-thread-safe (matches GArray's documented semantics — GArray is
* not thread-safe in upstream GLib either, callers must serialize).
*
* License: MIT (matches backend's COPYING.MIT).
*/
#include "gst_compat.h"
/* ===== internal helpers ===== */
static gboolean
garray_grow(GArray *array, guint new_capacity)
{
if (new_capacity <= array->capacity)
return TRUE;
/* round up to next power of two for amortized O(1) growth */
guint cap = array->capacity > 0 ? array->capacity : 4;
while (cap < new_capacity)
cap *= 2;
char *new_data = realloc(array->data, (size_t)cap * array->element_size);
if (new_data == NULL)
return FALSE;
if (array->clear) {
memset(new_data + (size_t)array->capacity * array->element_size, 0,
(size_t)(cap - array->capacity) * array->element_size);
}
array->data = new_data;
array->capacity = cap;
return TRUE;
}
/* ===== public API ===== */
GArray *
g_array_sized_new(gboolean zero_terminated, gboolean clear,
guint element_size, guint reserved_size)
{
/* zero_terminated is GLib-specific (appends a zero-element sentinel
* for trailing-NULL semantics). The vendored parser does not use it;
* we ignore the flag. */
(void)zero_terminated;
GArray *a = calloc(1, sizeof(GArray));
if (a == NULL)
return NULL;
a->element_size = element_size;
a->clear = clear;
if (reserved_size > 0) {
if (!garray_grow(a, reserved_size)) {
free(a);
return NULL;
}
}
return a;
}
GArray *
g_array_new(gboolean zero_terminated, gboolean clear, guint element_size)
{
return g_array_sized_new(zero_terminated, clear, element_size, 0);
}
GArray *
g_array_set_size(GArray *array, guint length)
{
if (length > array->capacity) {
if (!garray_grow(array, length))
return array;
}
if (array->clear_func != NULL && length < array->len) {
for (guint i = length; i < array->len; i++)
array->clear_func(array->data + (size_t)i * array->element_size);
}
if (array->clear && length > array->len) {
memset(array->data + (size_t)array->len * array->element_size, 0,
(size_t)(length - array->len) * array->element_size);
}
array->len = length;
return array;
}
GArray *
g_array_append_vals(GArray *array, gconstpointer data, guint len)
{
if (len == 0)
return array;
if (!garray_grow(array, array->len + len))
return array;
memcpy(array->data + (size_t)array->len * array->element_size,
data, (size_t)len * array->element_size);
array->len += len;
return array;
}
void
g_array_set_clear_func(GArray *array, void (*clear_func)(gpointer))
{
array->clear_func = clear_func;
}
gchar *
g_array_free(GArray *array, gboolean free_segment)
{
if (array == NULL)
return NULL;
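gchar *data = NULL;
if (free_segment) {
/* GLib runs the clear func only when the segment itself is
* released; with free_segment == FALSE the caller takes ownership
* of the untouched elements. */
if (array->clear_func != NULL) {
for (guint i = 0; i < array->len; i++)
array->clear_func(array->data + (size_t)i * array->element_size);
}
free(array->data);
} else {
data = array->data;
}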
free(array);
return data;
}
GArray *
g_array_unref(GArray *array)
{
/* simplified to free; the backend never sub-references shared GArrays */
g_array_free(array, TRUE);
return NULL;
}
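`garray_grow` above doubles the capacity until the request fits, but it does not guard the doubling or the byte-size multiplication against wrap-around. A minimal sketch of the same growth policy with those checks added is shown here. `demo_next_capacity` is a hypothetical helper, not part of gst_compat.c, and the vendored parser is assumed to use small arrays, so the checks are defensive rather than known to be required.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Computes the capacity a grow step would pick: start at 4 slots (or
 * the current capacity) and double until `want` fits. Returns false
 * instead of wrapping when the slot count or byte size would overflow. */
static bool demo_next_capacity(unsigned cur, unsigned want,
                               size_t elem_size, unsigned *out)
{
    unsigned cap = cur > 0 ? cur : 4;

    assert(out != NULL);
    while (cap < want) {
        if (cap > UINT_MAX / 2)    /* cap * 2 would wrap */
            return false;
        cap *= 2;
    }
    if (elem_size != 0 && cap > SIZE_MAX / elem_size)
        return false;              /* cap * elem_size would wrap */
    *out = cap;
    return true;
}
```

A caller would pass the result, multiplied by the element size, to `realloc` only when the helper returns true.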
@@ -0,0 +1,463 @@
/*
* gst_compat.h — minimal GLib/GStreamer compatibility shim for vendored
* GStreamer 1.28.2 H.265 parser + bitreader + bytereader + nalutils.
*
* Strategy: provide #defines / typedefs for the GLib API surface those
* 4 vendored files use, so they can compile against libc + libv4l2 only
* (no glib2 / gst-base linkage). Vendored .c files are NOT modified
* directly; instead this header is force-included via the Makefile's
* `-include` flag on the vendored translation units.
*
* Coverage scoped to what gsth265parser.c + nalutils.c + gstbitreader.c
* + gstbytereader.c actually call. Surveyed in
* ampere-kernel-decoders phase4 step 2 prep — see
* ~/src/ampere-kernel-decoders/phase4_plan_iter2.md and the survey
* commit message for the empirical inventory.
*
* License: this shim is original work, MIT (matching the backend's
* COPYING.MIT). The vendored .c files keep their LGPL v2.1+ headers
* verbatim.
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_GST_COMPAT_H
#define LIBVA_V4L2_REQUEST_FOURIER_GST_COMPAT_H
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* ===== GLib type aliases ===== */
typedef bool gboolean;
typedef char gchar;
typedef unsigned char guchar;
typedef int gint;
typedef int8_t gint8;
typedef int16_t gint16;
typedef int32_t gint32;
typedef int64_t gint64;
typedef unsigned int guint;
typedef uint8_t guint8;
typedef uint16_t guint16;
typedef uint32_t guint32;
typedef uint64_t guint64;
typedef size_t gsize;
typedef ptrdiff_t gssize;
typedef void * gpointer;
typedef const void * gconstpointer;
typedef double gdouble;
typedef float gfloat;
/* GLib's gint64 / guint64 formatting is platform-conditional; for our
* aarch64 ALARM target we don't need the full G_*_FORMAT machinery, but
* gstbytereader uses G_GSIZE_FORMAT in a debug-only printf. */
#define G_GSIZE_FORMAT "zu"
#ifndef TRUE
# define TRUE true
#endif
#ifndef FALSE
# define FALSE false
#endif
/* ===== memory ===== */
#define g_malloc(n) malloc((size_t)(n))
#define g_malloc0(n) calloc(1, (size_t)(n))
#define g_realloc(p, n) realloc((p), (size_t)(n))
/* g_free needs to be addressable (passed as a function-pointer arg by
* nalutils.c::gst_memory_new_wrapped — even though that call site is
* dead code we don't invoke, it must compile). Plain `free` is
* compatible: signature is `void (void *)` either way. */
#define g_free free
#define g_new(type, n) ((type *)malloc(sizeof(type) * (size_t)(n)))
#define g_new0(type, n) ((type *)calloc((size_t)(n), sizeof(type)))
#define g_slice_new(type) ((type *)malloc(sizeof(type)))
#define g_slice_new0(type) ((type *)calloc(1, sizeof(type)))
#define g_slice_free(type, p) free(p)
#define g_slice_free1(size, p) free(p)
#define g_clear_pointer(pp, freefn) \
do { freefn(*(pp)); *(pp) = NULL; } while (0)
/* g_memdup2 — GLib's 64-bit-safe memdup, used by gstbytereader. */
static inline gpointer
g_memdup2(gconstpointer mem, gsize byte_size)
{
if (mem == NULL || byte_size == 0)
return NULL;
void *copy = malloc(byte_size);
if (copy != NULL)
memcpy(copy, mem, byte_size);
return copy;
}
/* g_strcmp0 — NULL-safe strcmp. Used by gsth265parser in profile-name lookup. */
static inline int
g_strcmp0(const char *a, const char *b)
{
if (a == b) return 0;
if (a == NULL) return -1;
if (b == NULL) return 1;
return strcmp(a, b);
}
/* ===== asserts / return-guards =====
*
* Per ampere-kernel-decoders iter2 Phase 2 §"new failure modes" #5:
* g_assert must NOT abort the process. It becomes a no-op here;
* malformed bitstream is caught by the explicit parse-result returns
* the parser already implements.
*
* g_return_if_fail / g_return_val_if_fail propagate as the original
* GLib semantics (early return with optional value). */
#define g_assert(cond) ((void)0)
#define g_assert_not_reached() __builtin_unreachable()
#define g_return_if_fail(cond) do { if (!(cond)) return; } while (0)
#define g_return_val_if_fail(cond, v) do { if (!(cond)) return (v); } while (0)
/* ===== GStreamer logging — no-ops =====
*
* The parser is heavy on debug logging. We compile all of it out;
* the backend's own logging (request_log/error_log) wraps the parser
* calls and reports parse-failure return codes from there. */
#define GST_DISABLE_GST_DEBUG 1
#define GST_DEBUG_CATEGORY_STATIC(name)
#define GST_DEBUG_CATEGORY_INIT(...) ((void)0)
#define GST_DEBUG_CATEGORY_GET(...) ((void)0)
#define GST_DEBUG(...) ((void)0)
#define GST_INFO(...) ((void)0)
#define GST_WARNING(...) ((void)0)
#define GST_ERROR(...) ((void)0)
#define GST_LOG(...) ((void)0)
#define GST_FIXME(...) ((void)0)
#define GST_MEMDUMP(...) ((void)0)
#define GST_CAT_DEFAULT (NULL)
/* ===== compiler / language helpers ===== */
#define G_LIKELY(x) __builtin_expect(!!(x), 1)
#define G_UNLIKELY(x) __builtin_expect(!!(x), 0)
#define G_GNUC_UNUSED __attribute__((unused))
#define G_GNUC_INTERNAL
#define G_GNUC_MALLOC __attribute__((malloc))
#define G_GNUC_NORETURN __attribute__((noreturn))
#define G_GNUC_DEPRECATED
#define G_GNUC_DEPRECATED_FOR(x)
#define G_GNUC_PURE __attribute__((pure))
#define G_GNUC_CONST __attribute__((const))
#define G_GNUC_PRINTF(a, b) __attribute__((format(printf, a, b)))
#define G_BEGIN_DECLS
#define G_END_DECLS
#define G_N_ELEMENTS(arr) (sizeof(arr) / sizeof((arr)[0]))
#define G_STMT_START do
#define G_STMT_END while (0)
#define G_STRINGIFY(x) G_STRINGIFY_(x)
#define G_STRINGIFY_(x) #x
/* GStreamer ABI-padding slot count; upstream uses 4 reserved gpointers
* at the end of public structs for future ABI extension. We replicate
* the size so struct layout matches what gst_byte_reader_init / friends
* write into. */
#define GST_PADDING 4
#define GST_PADDING_LARGE 20
/* Public-symbol visibility — backend's shared module uses
* -fvisibility=hidden, so we don't need to mark anything public from
* within the vendored parser. The original GST_*_API macros expand to
* extern + dllimport on Windows; on Linux ELF builds where
* fvisibility=hidden is active, they would mark public symbols. The
* vendored functions are never called from outside h265_parser/, so
* leaving these empty hides them automatically. */
#define GST_API
#define GST_API_EXPORT extern
#define GST_API_IMPORT extern
/* ===== Opaque GStreamer pipeline types =====
*
* GstBuffer + GstMemory are referenced by encoder-side dead-code
* functions in gsth265parser.c (gst_h265_parser_insert_sei_hevc).
* We never call those; declaring them as opaque structs lets the
* function pointers / declarations compile, and the linker keeps the
* dead-code .text section even though it's unreachable.
*
* If you ever need to actually USE GstBuffer in this tree, replace
* these opaque decls with the project's own buffer abstraction; do not
* try to vendor in libgst itself. */
typedef struct _GstBuffer GstBuffer;
typedef struct _GstMemory GstMemory;
typedef struct _GstMapInfo GstMapInfo; /* opaque — dead-code in gsth265parser SEI insert */
/* GLib min/max constants — dead-code unsigned-overflow guards in
* gsth265parser.c. */
#define G_MAXUINT8 ((guint8)0xFF)
#define G_MAXUINT16 ((guint16)0xFFFF)
#define G_MAXUINT32 ((guint32)0xFFFFFFFFU)
#define G_MAXUINT64 ((guint64)0xFFFFFFFFFFFFFFFFULL)
#define G_MAXINT8 ((gint8)0x7F)
#define G_MAXINT16 ((gint16)0x7FFF)
#define G_MAXINT32 ((gint32)0x7FFFFFFF)
#define G_MAXINT64 ((gint64)0x7FFFFFFFFFFFFFFFLL)
#define G_MININT8 ((gint8)(-0x80))
#define G_MININT16 ((gint16)(-0x8000))
#define G_MININT32 ((gint32)(-0x7FFFFFFF - 1))
#define G_MAXSIZE ((gsize)-1)
/* GLib function-pointer typedefs used by g_list_* APIs (which our
* gst_compat declares as abort-stubs). They show up in code paths
* we never invoke but must compile. */
typedef void (*GDestroyNotify)(gpointer data);
typedef int (*GCompareFunc)(gconstpointer a, gconstpointer b);
typedef int (*GCompareDataFunc)(gconstpointer a, gconstpointer b, gpointer user_data);
/* GstMapFlags — passed to gst_memory_map / gst_buffer_map. Dead-code. */
#define GST_MAP_READ (1 << 0)
#define GST_MAP_WRITE (1 << 1)
#define GST_MAP_READWRITE (GST_MAP_READ | GST_MAP_WRITE)
/* Dead-code stubs for buffer / memory mapping (only referenced by
* gst_h265_parser_insert_sei_hevc which we never call). The compile
* needs declarations + addressable functions; abort on call. */
static inline gboolean
gst_memory_map(GstMemory *mem G_GNUC_UNUSED, GstMapInfo *info G_GNUC_UNUSED,
int flags G_GNUC_UNUSED) { abort(); }
static inline void
gst_memory_unmap(GstMemory *mem G_GNUC_UNUSED, GstMapInfo *info G_GNUC_UNUSED) { abort(); }
static inline gboolean
gst_buffer_map(GstBuffer *buf G_GNUC_UNUSED, GstMapInfo *info G_GNUC_UNUSED,
int flags G_GNUC_UNUSED) { abort(); }
static inline void
gst_buffer_unmap(GstBuffer *buf G_GNUC_UNUSED, GstMapInfo *info G_GNUC_UNUSED) { abort(); }
static inline GstBuffer *
gst_buffer_new(void) { abort(); }
static inline gboolean
gst_buffer_copy_into(GstBuffer *dst G_GNUC_UNUSED, GstBuffer *src G_GNUC_UNUSED,
int flags G_GNUC_UNUSED, gsize offset G_GNUC_UNUSED,
gssize size G_GNUC_UNUSED) { abort(); }
static inline void
gst_buffer_append_memory(GstBuffer *buf G_GNUC_UNUSED, GstMemory *mem G_GNUC_UNUSED) { abort(); }
static inline GstMemory *
gst_memory_ref(GstMemory *mem G_GNUC_UNUSED) { abort(); }
static inline void
gst_memory_unref(GstMemory *mem G_GNUC_UNUSED) { abort(); }
static inline GstMemory *
gst_memory_copy(GstMemory *mem G_GNUC_UNUSED, gssize offset G_GNUC_UNUSED, gssize size G_GNUC_UNUSED) { abort(); }
static inline void
gst_clear_buffer(GstBuffer **buf) { *buf = NULL; }
#define GST_IS_BUFFER(b) (false)
/* GstBufferCopyFlags — used only by gst_buffer_copy_into in dead code. */
#define GST_BUFFER_COPY_METADATA (1 << 0)
#define GST_BUFFER_COPY_MEMORY (1 << 1)
#define GST_BUFFER_COPY_DEEP (1 << 2)
/* gst_util_ceil_log2(n) — ceil(log2(n)) for non-zero unsigned n.
* Used by gsth265parser.c::gst_h265_slice_parse_ref_pic_list_modification.
* That function is in the slice-header parser which the libva backend
* does NOT invoke (we only call parse_sps) — but the linker still
* needs a definition. Provide a real impl: cheaper to compute than to
* justify a dead-code stub at every call site. */
static inline guint
gst_util_ceil_log2(guint32 n)
{
if (n <= 1) return 0;
/* __builtin_clz returns leading zeros for a 32-bit value;
* 32 - clz(n-1) = bits needed = ceil(log2(n)). */
return 32 - (guint)__builtin_clz(n - 1);
}
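To sanity-check the `32 - clz(n - 1)` identity used by `gst_util_ceil_log2`, a small sketch can compare it against a plain doubling loop. `demo_ceil_log2_clz`, `demo_ceil_log2_loop` and `demo_ceil_log2_selfcheck` are hypothetical names. `__builtin_clz` is a GCC/Clang builtin, so the sketch assumes one of those compilers and a 32-bit `unsigned int`.

```c
#include <assert.h>
#include <stdint.h>

/* Same formula as the shim: bits needed to represent n - 1. */
static unsigned demo_ceil_log2_clz(uint32_t n)
{
    if (n <= 1)
        return 0;
    return 32u - (unsigned)__builtin_clz(n - 1);
}

/* Reference: smallest k such that 2^k >= n, found by doubling. */
static unsigned demo_ceil_log2_loop(uint32_t n)
{
    unsigned bits = 0;
    uint64_t p = 1;   /* 64-bit so 2^32 is representable */

    while (p < n) {
        p <<= 1;
        bits++;
    }
    return bits;
}

/* Spot-checks the two versions at powers of two and their neighbours. */
static void demo_ceil_log2_selfcheck(void)
{
    for (unsigned k = 1; k < 32; k++) {
        uint32_t p = (uint32_t)1 << k;

        assert(demo_ceil_log2_clz(p - 1) == demo_ceil_log2_loop(p - 1));
        assert(demo_ceil_log2_clz(p) == demo_ceil_log2_loop(p));
        assert(demo_ceil_log2_clz(p + 1) == demo_ceil_log2_loop(p + 1));
    }
}
```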
/* GstMapInfo's real definition is in <gst/gstmemory.h>; we need at
* least enough to make `info->data` / `info->size` compile. */
struct _GstMapInfo {
GstMemory *memory;
int flags;
guint8 *data;
gsize size;
gsize maxsize;
gpointer user_data[4];
gpointer _gst_reserved[GST_PADDING];
};
/* gst_memory_new_wrapped — dead-code stub (nalutils.c calls it from
* the SEI-insertion path the libva backend never invokes). */
static inline GstMemory *
gst_memory_new_wrapped(int flags, gpointer data, gsize maxsize,
gsize offset, gsize size, gpointer user_data,
void (*notify)(gpointer))
{
(void)flags; (void)data; (void)maxsize; (void)offset; (void)size;
(void)user_data; (void)notify;
abort();
}
/* ===== byte-order read / write macros =====
*
* GStreamer provides these as static-inline functions in
* <gst/gstutils.h>. We re-implement for aarch64 little-endian; the
* parser is byte-stream input, so endian-conversion is mechanical.
* The float / double variants are present in upstream but the parser
* never invokes them — provide stubs so the address-taking sites in
* gstbytereader.h's function table compile. */
#define GST_READ_UINT8(data) \
(*((const guint8 *)(data)))
#define GST_READ_UINT16_LE(data) ( \
((guint16)((const guint8 *)(data))[0]) | \
((guint16)((const guint8 *)(data))[1] << 8))
#define GST_READ_UINT16_BE(data) ( \
((guint16)((const guint8 *)(data))[0] << 8) | \
((guint16)((const guint8 *)(data))[1]))
#define GST_READ_UINT24_LE(data) ( \
((guint32)((const guint8 *)(data))[0]) | \
((guint32)((const guint8 *)(data))[1] << 8) | \
((guint32)((const guint8 *)(data))[2] << 16))
#define GST_READ_UINT24_BE(data) ( \
((guint32)((const guint8 *)(data))[0] << 16) | \
((guint32)((const guint8 *)(data))[1] << 8) | \
((guint32)((const guint8 *)(data))[2]))
#define GST_READ_UINT32_LE(data) ( \
((guint32)((const guint8 *)(data))[0]) | \
((guint32)((const guint8 *)(data))[1] << 8) | \
((guint32)((const guint8 *)(data))[2] << 16) | \
((guint32)((const guint8 *)(data))[3] << 24))
#define GST_READ_UINT32_BE(data) ( \
((guint32)((const guint8 *)(data))[0] << 24) | \
((guint32)((const guint8 *)(data))[1] << 16) | \
((guint32)((const guint8 *)(data))[2] << 8) | \
((guint32)((const guint8 *)(data))[3]))
#define GST_READ_UINT64_LE(data) ( \
((guint64)((const guint8 *)(data))[0]) | \
((guint64)((const guint8 *)(data))[1] << 8) | \
((guint64)((const guint8 *)(data))[2] << 16) | \
((guint64)((const guint8 *)(data))[3] << 24) | \
((guint64)((const guint8 *)(data))[4] << 32) | \
((guint64)((const guint8 *)(data))[5] << 40) | \
((guint64)((const guint8 *)(data))[6] << 48) | \
((guint64)((const guint8 *)(data))[7] << 56))
#define GST_READ_UINT64_BE(data) ( \
((guint64)((const guint8 *)(data))[0] << 56) | \
((guint64)((const guint8 *)(data))[1] << 48) | \
((guint64)((const guint8 *)(data))[2] << 40) | \
((guint64)((const guint8 *)(data))[3] << 32) | \
((guint64)((const guint8 *)(data))[4] << 24) | \
((guint64)((const guint8 *)(data))[5] << 16) | \
((guint64)((const guint8 *)(data))[6] << 8) | \
((guint64)((const guint8 *)(data))[7]))
/* Float / double readers — dead-code, abort if called. The function
* table in gstbytereader.h takes the address of the underlying inline
* which we don't need to be functional, only addressable. */
static inline gfloat
GST_READ_FLOAT_LE(const guint8 *data) { (void)data; abort(); }
static inline gfloat
GST_READ_FLOAT_BE(const guint8 *data) { (void)data; abort(); }
static inline gdouble
GST_READ_DOUBLE_LE(const guint8 *data) { (void)data; abort(); }
static inline gdouble
GST_READ_DOUBLE_BE(const guint8 *data) { (void)data; abort(); }
/* Write side — nalutils.c writes-out SEI bytes (dead path for us but
* must compile). */
#define GST_WRITE_UINT8(data, val) do { \
((guint8 *)(data))[0] = (guint8)(val); \
} while (0)
#define GST_WRITE_UINT16_BE(data, val) do { \
((guint8 *)(data))[0] = (guint8)((val) >> 8); \
((guint8 *)(data))[1] = (guint8)((val)); \
} while (0)
#define GST_WRITE_UINT24_BE(data, val) do { \
((guint8 *)(data))[0] = (guint8)((val) >> 16); \
((guint8 *)(data))[1] = (guint8)((val) >> 8); \
((guint8 *)(data))[2] = (guint8)((val)); \
} while (0)
#define GST_WRITE_UINT32_BE(data, val) do { \
((guint8 *)(data))[0] = (guint8)((val) >> 24); \
((guint8 *)(data))[1] = (guint8)((val) >> 16); \
((guint8 *)(data))[2] = (guint8)((val) >> 8); \
((guint8 *)(data))[3] = (guint8)((val)); \
} while (0)
#ifndef MIN
# define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif
#ifndef MAX
# define MAX(a, b) ((a) > (b) ? (a) : (b))
#endif
/* ===== GArray ===== */
typedef struct {
char *data; /* exposed via g_array_index / GArray->data */
guint len; /* element count */
guint capacity; /* allocated element slots */
guint element_size;
gboolean clear; /* zero-fill on grow */
void (*clear_func)(gpointer);
} GArray;
GArray *g_array_new(gboolean zero_terminated, gboolean clear, guint element_size);
GArray *g_array_sized_new(gboolean zero_terminated, gboolean clear,
guint element_size, guint reserved_size);
GArray *g_array_set_size(GArray *array, guint length);
GArray *g_array_append_vals(GArray *array, gconstpointer data, guint len);
void g_array_set_clear_func(GArray *array, void (*clear_func)(gpointer));
gchar *g_array_free(GArray *array, gboolean free_segment);
GArray *g_array_unref(GArray *array);
#define g_array_append_val(a, v) g_array_append_vals((a), &(v), 1)
#define g_array_index(a, t, i) (((t *)(void *)(a)->data)[i])
/* ===== GList — stubs that abort if reached =====
*
* Surveyed call sites: gsth265parser.c uses g_list_prepend / g_list_sort /
* g_list_free_full in code paths the libva backend does not invoke for
* basic SPS parsing (likely SEI message accumulation). Stub to abort so
* any future call surfaces immediately rather than silently corrupting. */
/* GList — full struct (not opaque) so callers can do `list->data`.
* The functions still abort because we never construct a GList. */
typedef struct _GList GList;
struct _GList {
gpointer data;
GList *next;
GList *prev;
};
static inline GList *g_list_prepend(GList *list G_GNUC_UNUSED, gpointer data G_GNUC_UNUSED) { abort(); }
static inline GList *g_list_sort(GList *list G_GNUC_UNUSED, int (*cmp)(gconstpointer, gconstpointer) G_GNUC_UNUSED) { abort(); }
static inline void g_list_free_full(GList *list G_GNUC_UNUSED, void (*free_func)(gpointer) G_GNUC_UNUSED) { abort(); }
/* ===== g_once_init_enter / g_once_init_leave =====
*
* GLib's lazy-init guards. The parser uses these for one-shot static
* initialization (e.g. profile-name table). Our backend is single-
* threaded at the parser-init site (driver_init), so we can simplify
* to a plain run-once gate. */
#define g_once_init_enter(loc) (*(loc) == 0)
#define g_once_init_leave(loc, val) (*(loc) = (val))
/* ===== conversions ===== */
#define GINT_TO_POINTER(i) ((gpointer)(uintptr_t)(gint)(i))
#define GPOINTER_TO_INT(p) ((gint)(uintptr_t)(p))
#endif /* LIBVA_V4L2_REQUEST_FOURIER_GST_COMPAT_H */
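The simplified `g_once_init_enter` / `g_once_init_leave` pair above reduces to a plain run-once gate. A minimal sketch of how a caller typically uses such a gate is given here. The macros are re-declared under `demo_` names so the block stands alone, and `demo_profile_name`, `demo_names` and `demo_init_calls` are hypothetical. Like the shim, the sketch is only safe when the first call cannot race with another thread.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the shim's single-threaded gate, renamed for the demo. */
#define demo_once_init_enter(loc)      (*(loc) == 0)
#define demo_once_init_leave(loc, val) (*(loc) = (val))

static int demo_init_calls;          /* counts how often init ran */
static const char *demo_names[2];    /* lazily filled lookup table */

static const char *demo_profile_name(size_t idx)
{
    static size_t once = 0;

    if (demo_once_init_enter(&once)) {
        demo_names[0] = "main";
        demo_names[1] = "main-10";
        demo_init_calls++;
        demo_once_init_leave(&once, 1);
    }
    assert(once != 0);               /* gate stays closed from here on */
    return idx < 2 ? demo_names[idx] : NULL;
}
```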
@@ -0,0 +1,90 @@
/*
* v4l2-hevc-ext-controls.h — verbatim mirror of Linux 7.0+ V4L2 stateless
* HEVC extended-SPS RPS control definitions, shipped as an internal
* header so this libva backend can be built against pre-7.0
* linux-api-headers packages (currently ampere ships 6.19-1).
*
* Upstream source: linux kernel, include/uapi/linux/v4l2-controls.h
* As-of: Linux 7.0-rc3 (Detlev Casanova / Collabora "VDPU381/VDPU383"
* series, see lkml.org/lkml/2026/1/9/1334). The two CIDs + two structs
* + two flag macros below are byte-for-byte the kernel UAPI definitions.
*
* Once linux-api-headers >= 7.0 is the floor across the fleet, this
* shim becomes redundant — `<linux/v4l2-controls.h>` will provide the
* same symbols. The include order in h265.c is: this header BEFORE
* <linux/v4l2-controls.h>, so when the system catches up, the macro
* guards below silently no-op and we use the system definitions.
*
* License: MIT (matches backend's COPYING.MIT). The kernel UAPI
* definitions mirrored here are covered upstream by the
* Linux-syscall-note exception, which lets userspace programs use
* them without inheriting the kernel's GPL.
*
* Rationale + iter2 plan: see
* ~/src/ampere-kernel-decoders/phase4_plan_iter2.md (§Step 3)
* ~/src/ampere-kernel-decoders/phase0_findings_iter2.md
*/
#ifndef LIBVA_V4L2_REQUEST_FOURIER_V4L2_HEVC_EXT_CONTROLS_H
#define LIBVA_V4L2_REQUEST_FOURIER_V4L2_HEVC_EXT_CONTROLS_H
#include <linux/types.h>
#include <linux/v4l2-controls.h>
/* Record which controls the system header lacks. System headers are
* not expected to define any *_DEFINED macro for the structs, so the
* struct definitions below are keyed on the CID having been absent;
* otherwise a 7.0+ header would trigger a struct redefinition. */
#ifndef V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS
# define V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS \
(V4L2_CID_CODEC_STATELESS_BASE + 408)
# define LIBVA_V4L2_REQUEST_FOURIER_NEED_HEVC_EXT_SPS_ST_RPS 1
#endif
#ifndef V4L2_CID_STATELESS_HEVC_EXT_SPS_LT_RPS
# define V4L2_CID_STATELESS_HEVC_EXT_SPS_LT_RPS \
(V4L2_CID_CODEC_STATELESS_BASE + 409)
# define LIBVA_V4L2_REQUEST_FOURIER_NEED_HEVC_EXT_SPS_LT_RPS 1
#endif
#ifndef V4L2_HEVC_EXT_SPS_ST_RPS_FLAG_INTER_REF_PIC_SET_PRED
# define V4L2_HEVC_EXT_SPS_ST_RPS_FLAG_INTER_REF_PIC_SET_PRED 0x1
#endif
#ifndef V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT
# define V4L2_HEVC_EXT_SPS_LT_RPS_FLAG_USED_LT 0x1
#endif
/*
* struct v4l2_ctrl_hevc_ext_sps_st_rps: HEVC short-term RPS parameters.
*
* Dynamic-size 1-dimension array. Number of elements is
* v4l2_ctrl_hevc_sps::num_short_term_ref_pic_sets
* Can contain up to 65 elements (the H.265 spec § 7.4.3.2.1 maximum).
*/
#if defined(LIBVA_V4L2_REQUEST_FOURIER_NEED_HEVC_EXT_SPS_ST_RPS) && \
!defined(V4L2_HEVC_EXT_SPS_ST_RPS_DEFINED)
# define V4L2_HEVC_EXT_SPS_ST_RPS_DEFINED 1
struct v4l2_ctrl_hevc_ext_sps_st_rps {
__u8 delta_idx_minus1;
__u8 delta_rps_sign;
__u8 num_negative_pics;
__u8 num_positive_pics;
__u32 used_by_curr_pic;
__u32 use_delta_flag;
__u16 abs_delta_rps_minus1;
__u16 delta_poc_s0_minus1[16];
__u16 delta_poc_s1_minus1[16];
__u16 flags;
};
#endif
/*
* struct v4l2_ctrl_hevc_ext_sps_lt_rps: HEVC long-term RPS parameters.
*
* Dynamic-size 1-dimension array. Number of elements is
* v4l2_ctrl_hevc_sps::num_long_term_ref_pics_sps
* Can contain up to 33 elements (the H.265 spec § 7.4.3.2.1 maximum).
*/
#if defined(LIBVA_V4L2_REQUEST_FOURIER_NEED_HEVC_EXT_SPS_LT_RPS) && \
!defined(V4L2_HEVC_EXT_SPS_LT_RPS_DEFINED)
# define V4L2_HEVC_EXT_SPS_LT_RPS_DEFINED 1
struct v4l2_ctrl_hevc_ext_sps_lt_rps {
__u16 lt_ref_pic_poc_lsb_sps;
__u16 flags;
};
#endif
#endif /* LIBVA_V4L2_REQUEST_FOURIER_V4L2_HEVC_EXT_CONTROLS_H */
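Because the short-term RPS control is a dynamic array of fixed-size elements, the payload handed to the kernel is the element count times the struct size. A rough layout sketch using `<stdint.h>` mirrors of the fields above is shown here. `demo_hevc_ext_sps_st_rps` and `demo_st_rps_payload_size` are hypothetical names, and the 80-byte size assumes a common ABI where `uint32_t` is 4-byte aligned, so treat it as a check to run on the target rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field-for-field mirror of the struct above, for illustration only. */
struct demo_hevc_ext_sps_st_rps {
    uint8_t  delta_idx_minus1;
    uint8_t  delta_rps_sign;
    uint8_t  num_negative_pics;
    uint8_t  num_positive_pics;
    uint32_t used_by_curr_pic;
    uint32_t use_delta_flag;
    uint16_t abs_delta_rps_minus1;
    uint16_t delta_poc_s0_minus1[16];
    uint16_t delta_poc_s1_minus1[16];
    uint16_t flags;
};

/* Fails the build if the compiler inserts unexpected padding. */
_Static_assert(sizeof(struct demo_hevc_ext_sps_st_rps) == 80,
               "unexpected padding in st_rps mirror");

#define DEMO_MAX_ST_RPS 65u   /* element limit quoted in the comment above */

/* Byte size of an array payload with num_sets elements; 0 if out of range. */
static size_t demo_st_rps_payload_size(unsigned num_sets)
{
    if (num_sets > DEMO_MAX_ST_RPS)
        return 0;
    return (size_t)num_sets * sizeof(struct demo_hevc_ext_sps_st_rps);
}
```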
@@ -216,7 +216,50 @@ static VAStatus copy_surface_to_image (struct request_data *driver_data,
}
}
/*
* AV1 film_grain: when this surface is the display surface of a
* decode (current_display_picture != current_frame with apply_grain=1),
* its slot is NULL because BeginPicture only fired on the decode
* surface. Follow the back-link set in av1_set_controls and borrow
* the decode surface's destination_data + sizes for the copy.
*/
if (surface_object->current_slot == NULL &&
surface_object->linked_decode_surface_id != VA_INVALID_SURFACE) {
struct object_surface *decode_surface =
SURFACE(driver_data,
surface_object->linked_decode_surface_id);
if (decode_surface != NULL &&
decode_surface->current_slot != NULL) {
/* Mirror the fields we read below. The surface heap
* pointer is stable for the surface's lifetime; we
* only need destination_data + destination_sizes +
* destination_planes_count from it. */
surface_object->destination_planes_count =
decode_surface->destination_planes_count;
for (i = 0; i < decode_surface->destination_planes_count; i++) {
surface_object->destination_data[i] =
decode_surface->destination_data[i];
surface_object->destination_sizes[i] =
decode_surface->destination_sizes[i];
}
}
}
for (i = 0; i < surface_object->destination_planes_count; i++) {
/* AV1 Phase 3 diag: surface NULL-deref hunt. */
if (buffer_object->data == NULL ||
surface_object->destination_data[i] == NULL) {
request_log("copy_surface_to_image NULL i=%u "
"buf_data=%p dest_data=%p dest_size=%u "
"planes=%u slot=%p linked=0x%x\n",
i, (void *)buffer_object->data,
(void *)surface_object->destination_data[i],
surface_object->destination_sizes[i],
surface_object->destination_planes_count,
(void *)surface_object->current_slot,
surface_object->linked_decode_surface_id);
return VA_STATUS_ERROR_OPERATION_FAILED;
}
#ifdef __arm__
if (!video_format_is_linear(driver_data->video_format))
tiled_to_planar(surface_object->destination_data[i],
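The link-follow described in the film_grain comment above can be sketched in isolation. This is a minimal illustration, not the driver's code: `struct sketch_surface`, `sketch_lookup` and `SKETCH_INVALID_ID` are hypothetical stand-ins for `struct object_surface`, the `SURFACE()` lookup and `VA_INVALID_SURFACE`, and the clamp to `SKETCH_MAX_PLANES` is an added assumption.

```c
#include <stddef.h>

#define SKETCH_MAX_PLANES 3
#define SKETCH_INVALID_ID 0xffffffffu /* stand-in for VA_INVALID_SURFACE */

/* Hypothetical, trimmed-down stand-in for struct object_surface. */
struct sketch_surface {
	unsigned int id;
	void *current_slot;            /* NULL when no capture slot is bound */
	unsigned int linked_decode_id; /* back-link stored on the display surface */
	unsigned int planes_count;
	void *data[SKETCH_MAX_PLANES];
	unsigned int sizes[SKETCH_MAX_PLANES];
};

/* Linear search standing in for the surface heap lookup. */
static struct sketch_surface *sketch_lookup(struct sketch_surface *pool,
					    size_t count, unsigned int id)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (pool[i].id == id)
			return &pool[i];
	return NULL;
}

/* Returns 1 if the display surface borrowed plane descriptors, else 0. */
static int sketch_borrow_from_decode(struct sketch_surface *pool, size_t count,
				     struct sketch_surface *display)
{
	struct sketch_surface *decode;
	unsigned int i, planes;

	/* Only a slot-less surface with a live link follows the link. */
	if (display->current_slot != NULL ||
	    display->linked_decode_id == SKETCH_INVALID_ID)
		return 0;

	decode = sketch_lookup(pool, count, display->linked_decode_id);
	if (decode == NULL || decode->current_slot == NULL)
		return 0;

	planes = decode->planes_count;
	if (planes > SKETCH_MAX_PLANES) /* defensive clamp (assumption) */
		planes = SKETCH_MAX_PLANES;

	display->planes_count = planes;
	for (i = 0; i < planes; i++) {
		display->data[i] = decode->data[i];
		display->sizes[i] = decode->sizes[i];
	}
	return 1;
}
```

The display surface only mirrors pointers and sizes; ownership of the mapped planes stays with the decode surface's slot.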
+35 -3
@@ -50,7 +50,17 @@ sources = [
'h265.c',
'vp8.c',
'vp9.c',
'codec.c' 'av1.c',
'codec.c',
# Vendored GStreamer 1.28.2 H.265 parser + utilities (LGPL v2.1+,
# see src/h265_parser/gst_compat.h for sourcing notes + per-iter2
# adaptation strategy).
'h265_parser/gst_compat.c',
'h265_parser/gst/base/gstbitreader.c',
'h265_parser/gst/base/gstbytereader.c',
'h265_parser/gst/codecparsers/nalutils.c',
'h265_parser/gst/codecparsers/gsth265parser.c'
]
headers = [
@@ -76,11 +86,33 @@ headers = [
'h265.h',
'vp8.h',
'vp9.h',
'codec.h' 'av1.h',
'codec.h',
# Internal mirror of Linux 7.0 V4L2 HEVC EXT_SPS_*_RPS UAPI defs
# (allows building against pre-7.0 linux-api-headers; redundant
# once the host headers are 7.0+).
'hevc-ctrls/v4l2-hevc-ext-controls.h',
# Vendored GStreamer + project shim headers (see sources above).
'h265_parser/gst_compat.h',
'h265_parser/gst/gst.h',
'h265_parser/gst/glib-compat-private.h',
'h265_parser/gst/base/base-prelude.h',
'h265_parser/gst/base/gstbitreader.h',
'h265_parser/gst/base/gstbytereader.h',
'h265_parser/gst/base/gstbitwriter.h',
'h265_parser/gst/codecparsers/codecparsers-prelude.h',
'h265_parser/gst/codecparsers/gsth265parser.h',
'h265_parser/gst/codecparsers/nalutils.h'
]
includes = [
include_directories('../include') include_directories('../include'),
# Vendored GStreamer parser tree — the parser's #include <gst/base/...>
# style references resolve here via stub headers that redirect to
# gst_compat.h.
include_directories('h265_parser')
]
cflags = [
+53
@@ -36,6 +36,7 @@
#include "mpeg2.h"
#include "vp8.h"
#include "vp9.h"
#include "av1.h"
#include <assert.h>
#include <stdio.h>
@@ -155,6 +156,15 @@ static VAStatus codec_store_buffer(struct request_data *driver_data,
sizeof(surface_object->params.vp9.picture));
break;
case VAProfileAV1Profile0:
memcpy(&surface_object->params.av1.picture,
buffer_object->data,
sizeof(surface_object->params.av1.picture));
/* Reset per-frame tile group entry array on each new
* picture parameter buffer (start of a new frame). */
surface_object->params.av1.num_tile_group_entries = 0;
break;
default:
break;
}
@@ -200,6 +210,17 @@ static VAStatus codec_store_buffer(struct request_data *driver_data,
sizeof(surface_object->params.vp9.slice));
break;
case VAProfileAV1Profile0: {
unsigned int n = surface_object->params.av1.num_tile_group_entries;
if (n < AV1_MAX_TILES) {
memcpy(&surface_object->params.av1.tile_group_entries[n],
buffer_object->data,
sizeof(VASliceParameterBufferAV1));
surface_object->params.av1.num_tile_group_entries = n + 1;
}
break;
}
default:
break;
}
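The two AV1 cases above form a small per-frame accumulator: the picture parameter buffer resets the count and each slice parameter buffer appends one entry up to a fixed bound. A minimal sketch of that pattern follows. The names `sketch_tile`, `sketch_av1_params` and `SKETCH_MAX_TILES` are hypothetical, and returning -1 on overflow is a variation for illustration; the diff itself drops the extra entry silently.

```c
#include <string.h>

#define SKETCH_MAX_TILES 4 /* small for illustration; the diff uses AV1_MAX_TILES = 128 */

/* Hypothetical stand-in for VASliceParameterBufferAV1. */
struct sketch_tile {
	unsigned int offset;
	unsigned int size;
};

struct sketch_av1_params {
	struct sketch_tile entries[SKETCH_MAX_TILES];
	unsigned int num_entries;
};

/* A new picture parameter buffer marks the start of a frame. */
static void sketch_on_picture_params(struct sketch_av1_params *p)
{
	p->num_entries = 0;
}

/* Append one tile entry; returns -1 when the array is already full. */
static int sketch_on_slice_params(struct sketch_av1_params *p,
				  const struct sketch_tile *tile)
{
	if (p->num_entries >= SKETCH_MAX_TILES)
		return -1;
	memcpy(&p->entries[p->num_entries], tile, sizeof(*tile));
	p->num_entries++;
	return 0;
}
```

Reporting the overflow lets a caller log the dropped tile instead of decoding an incomplete frame without any trace.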
@@ -309,6 +330,11 @@ static VAStatus codec_set_controls(struct request_data *driver_data,
if (rc < 0)
return VA_STATUS_ERROR_OPERATION_FAILED;
break;
case VAProfileAV1Profile0:
rc = av1_set_controls(driver_data, context, surface_object);
if (rc < 0)
return VA_STATUS_ERROR_OPERATION_FAILED;
break;
default:
return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
@@ -335,6 +361,12 @@ VAStatus RequestBeginPicture(VADriverContextP context, VAContextID context_id,
if (surface_object == NULL)
return VA_STATUS_ERROR_INVALID_SURFACE;
/* AV1 Phase 3 diag */
request_log("BeginPicture id=0x%x prev_slot=%p status=%d\n",
surface_object->base.id,
(void *)surface_object->current_slot,
surface_object->status);
if (surface_object->status == VASurfaceRendering)
RequestSyncSurface(context, surface_id);
@@ -346,9 +378,30 @@ VAStatus RequestBeginPicture(VADriverContextP context, VAContextID context_id,
* first. The new slot is bound and its V4L2 index + mmap pointers
* are mirrored into surface_object->destination_* so the existing
* QBUF/DQBUF/EXPBUF code paths see no behavioral change.
*
* AV1 Phase 3 finding: the LIBVA_SKIP_REBIND=1 experiment (skip the
* unbind on rebind) did not improve the PASS count for the
* av1_larger film_grain stress vector, which shows the iter2 Fix 3
* release is NOT the source of the inter-frame divergence. The
* issue lies deeper, in ffmpeg-vaapi's AV1 hwaccel: a byte-level
* comparison of OUTPUT buffers against the
* patched-ffmpeg-v4l2request reference run (LD_LIBRARY_PATH
* override on a debug libavcodec.so) finds the first 7 of 7
* EndPicture submissions byte-identical, while libva has 2 EXTRA.
*/
if (surface_object->current_slot != NULL)
surface_unbind_slot(driver_data, surface_object);
/*
* AV1 Phase 5 review Amendment 4: clear any stale
* linked_decode_surface_id from a prior film_grain display→decode
* link. If ffmpeg-vaapi recycles a former display surface as a
* decode target, BeginPicture binds a fresh slot — but without
* this reset, copy_surface_to_image's link-follow would still
* borrow from the now-stale linked surface and serve wrong data.
* Cleared unconditionally (cheap) so the next AV1 grain frame
* re-establishes the link if needed.
*/
surface_object->linked_decode_surface_id = VA_INVALID_SURFACE;
{
struct cap_pool_slot *cap_slot =
cap_pool_acquire(&driver_data->capture_pool, surface_id);
+211 -1
@@ -57,6 +57,8 @@
#include <linux/media.h>
#include <linux/videodev2.h>
#include "hevc-ctrls/v4l2-hevc-ext-controls.h"
/*
* fresnel-fourier iter4 Phase 6 commit Z + iter7 Phase 6 (B1a): device-path
* auto-detect via media controller topology with decoder-entity discrimination.
@@ -286,6 +288,74 @@ out:
* - non-NULL → match only that exact driver name
* - NULL → match any name in known_decoder_drivers[]
*/
/*
* iter2 (ampere-kernel-decoders campaign) — runtime probe for the
* V4L2 stateless HEVC EXT_SPS_{ST,LT}_RPS controls added in
* Linux 7.0 (Casanova VDPU381/VDPU383 series). Returns true iff BOTH
* controls are registered on the given fd. Stored per-fd on
* driver_data so the multi-device-probe model (iter38) doesn't
* silently misbehave when codec routing switches devices.
*
* The two CIDs together are the gate — neither alone is meaningful
* without the other (st-RPS + lt-RPS arrays both need to be set to
* match the SPS num_short_term_ref_pic_sets / num_long_term_ref_pics_sps
* counts). Old kernels (RK3399 rkvdec on linux 6.x) register neither;
* RK3588 rkvdec (VDPU381/383 path) registers both.
*
* Reference: phase4_plan_iter2.md §Step 3 in
* ~/src/ampere-kernel-decoders/.
*/
static bool probe_hevc_ext_sps_rps_controls(int video_fd)
{
struct v4l2_queryctrl q;
if (video_fd < 0)
return false;
memset(&q, 0, sizeof(q));
q.id = V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS;
if (ioctl(video_fd, VIDIOC_QUERYCTRL, &q) < 0)
return false;
memset(&q, 0, sizeof(q));
q.id = V4L2_CID_STATELESS_HEVC_EXT_SPS_LT_RPS;
if (ioctl(video_fd, VIDIOC_QUERYCTRL, &q) < 0)
return false;
return true;
}
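The "both CIDs together are the gate" rule in the probe above is an all-of check. A minimal sketch follows, with the ioctl replaced by an injected callback so the logic can be exercised without a device. `sketch_query_fn`, `sketch_fake_device` and the control ids used with it are hypothetical; the driver itself queries the real fd directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query callback: returns 0 when the control id is registered. */
typedef int (*sketch_query_fn)(void *ctx, uint32_t cid);

/* True only if every control in cids[] is registered. */
static bool sketch_all_controls_present(sketch_query_fn query, void *ctx,
					const uint32_t *cids, size_t count)
{
	size_t i;

	if (query == NULL)
		return false;
	for (i = 0; i < count; i++)
		if (query(ctx, cids[i]) != 0)
			return false; /* one missing control closes the gate */
	return true;
}

/* Fake device used only for illustration: a list of registered ids. */
struct sketch_fake_device {
	const uint32_t *registered;
	size_t count;
};

static int sketch_fake_query(void *ctx, uint32_t cid)
{
	const struct sketch_fake_device *dev = ctx;
	size_t i;

	for (i = 0; i < dev->count; i++)
		if (dev->registered[i] == cid)
			return 0;
	return -1;
}
```

A device that registers only one of the two controls is treated the same as one that registers neither, which matches the reasoning in the comment that neither control is meaningful alone.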
/*
* Inspect a /dev/videoN's OUTPUT formats for `want_pixfmt`. Returns true
* iff at least one OUTPUT/OUTPUT_MPLANE format matches.
*
* Used to discriminate between multiple devices sharing a driver name —
* RK3588 has 3 hantro-vpu instances and only one of them is vpu981 (the
* dedicated AV1 decoder advertising V4L2_PIX_FMT_AV1_FRAME).
*/
static bool video_node_supports_output_fmt(int video_fd, uint32_t want_pixfmt)
{
struct v4l2_fmtdesc desc;
const enum v4l2_buf_type types[] = {
V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
V4L2_BUF_TYPE_VIDEO_OUTPUT,
};
unsigned int t, i;
for (t = 0; t < sizeof(types) / sizeof(types[0]); t++) {
for (i = 0; i < 64; i++) {
memset(&desc, 0, sizeof desc);
desc.index = i;
desc.type = types[t];
if (ioctl(video_fd, VIDIOC_ENUM_FMT, &desc) < 0)
break;
if (desc.pixelformat == want_pixfmt)
return true;
}
}
return false;
}
static int find_decoder_device_by_driver(const char *want_driver,
char *video_out, size_t video_out_sz,
char *media_out, size_t media_out_sz)
@@ -333,6 +403,65 @@ static int find_decoder_device_by_driver(const char *want_driver,
return -1;
}
/*
* ampere-av1-enablement Phase 2 — like find_decoder_device_by_driver but
* additionally verifies the resolved /dev/videoN advertises `want_pixfmt`
* as an OUTPUT format. Required for RK3588 where 3 hantro-vpu instances
* share the driver name but only one is vpu981 (AV1 decoder).
*
* Walks all /dev/media* with matching driver name; takes the first hit
* whose OUTPUT formats include `want_pixfmt`. Non-matching candidates
* (encoder-only nodes, legacy hantro for MPEG2/VP8) are skipped.
*/
static int find_decoder_device_by_driver_with_fmt(const char *want_driver,
uint32_t want_pixfmt,
char *video_out,
size_t video_out_sz,
char *media_out,
size_t media_out_sz)
{
struct media_device_info info;
char path[32];
char vpath[32];
int fd, vfd, i;
for (i = 0; i < 16; i++) {
snprintf(path, sizeof path, "/dev/media%d", i);
fd = open(path, O_RDWR | O_NONBLOCK);
if (fd < 0)
continue;
memset(&info, 0, sizeof info);
if (ioctl(fd, MEDIA_IOC_DEVICE_INFO, &info) != 0) {
close(fd);
continue;
}
if (strcmp(info.driver, want_driver) != 0) {
close(fd);
continue;
}
if (find_decoder_video_node_via_topology(fd, vpath,
sizeof vpath) != 0) {
close(fd);
continue;
}
close(fd);
/* Capability check: does this /dev/videoN advertise the
* codec-specific OUTPUT format? */
vfd = open(vpath, O_RDWR | O_NONBLOCK);
if (vfd < 0)
continue;
if (video_node_supports_output_fmt(vfd, want_pixfmt)) {
close(vfd);
snprintf(video_out, video_out_sz, "%s", vpath);
snprintf(media_out, media_out_sz, "%s", path);
return 0;
}
close(vfd);
}
return -1;
}
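The selection rule implemented above (same driver name, first candidate whose OUTPUT formats include the wanted pixel format) can be shown on plain data. This is a sketch under stated assumptions: `sketch_candidate`, `sketch_pick` and `SKETCH_FOURCC` are hypothetical, and the format lists would come from `VIDIOC_ENUM_FMT` on a real system rather than from static arrays.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian four-character code, in the style V4L2 uses for pixel formats. */
#define SKETCH_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical description of one probed decoder node. */
struct sketch_candidate {
	const char *driver;
	const char *video_path;
	const uint32_t *output_fmts;
	size_t num_output_fmts;
};

static int sketch_has_fmt(const struct sketch_candidate *c, uint32_t want)
{
	size_t i;

	for (i = 0; i < c->num_output_fmts; i++)
		if (c->output_fmts[i] == want)
			return 1;
	return 0;
}

/* First candidate matching both the driver name and the OUTPUT format. */
static const struct sketch_candidate *
sketch_pick(const struct sketch_candidate *cands, size_t count,
	    const char *want_driver, uint32_t want_fmt)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (strcmp(cands[i].driver, want_driver) != 0)
			continue; /* different driver */
		if (!sketch_has_fmt(&cands[i], want_fmt))
			continue; /* same driver, wrong codec */
		return &cands[i];
	}
	return NULL;
}
```

With three nodes that share one driver name, only the capability check separates the AV1 decoder from its siblings, which is why the name alone is not enough.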
static int find_codec_device(char *video_out, size_t video_out_sz,
char *media_out, size_t media_out_sz)
{
@@ -369,6 +498,8 @@ char request_device_kind_for_profile(VAProfile profile)
case VAProfileMPEG2Main:
case VAProfileVP8Version0_3:
return 'h';
case VAProfileAV1Profile0:
return 'a'; /* ampere-av1-enablement: vpu981 dedicated AV1 */
default:
return '?';
}
@@ -398,6 +529,9 @@ int request_switch_device_for_profile(struct request_data *driver_data,
} else if (kind == 'h') {
target_video = driver_data->video_fd_hantro;
target_media = driver_data->media_fd_hantro;
} else if (kind == 'a') {
target_video = driver_data->video_fd_vpu981;
target_media = driver_data->media_fd_vpu981;
} else {
return -1;
}
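The two hunks above route a profile to a device kind character and then to an fd pair. A compact sketch of that two-step routing is given below. `sketch_profile`, `sketch_fds` and both functions are hypothetical, only the 'h' and 'a' kinds visible in the diff are modelled, and rejecting an unopened (negative) fd pair is an assumption about the surrounding code.

```c
/* Hypothetical profile enum; the driver switches on VAProfile values. */
enum sketch_profile {
	SKETCH_PROFILE_MPEG2,
	SKETCH_PROFILE_VP8,
	SKETCH_PROFILE_AV1,
	SKETCH_PROFILE_OTHER
};

struct sketch_fds {
	int video_fd_hantro;
	int media_fd_hantro;
	int video_fd_vpu981;
	int media_fd_vpu981;
};

/* Step 1: profile -> device kind ('h' = legacy hantro, 'a' = vpu981 AV1). */
static char sketch_kind_for_profile(enum sketch_profile profile)
{
	switch (profile) {
	case SKETCH_PROFILE_MPEG2:
	case SKETCH_PROFILE_VP8:
		return 'h';
	case SKETCH_PROFILE_AV1:
		return 'a';
	default:
		return '?';
	}
}

/* Step 2: device kind -> fd pair; returns 0 on success, -1 otherwise. */
static int sketch_select_fds(const struct sketch_fds *fds, char kind,
			     int *video, int *media)
{
	int v, m;

	if (kind == 'h') {
		v = fds->video_fd_hantro;
		m = fds->media_fd_hantro;
	} else if (kind == 'a') {
		v = fds->video_fd_vpu981;
		m = fds->media_fd_vpu981;
	} else {
		return -1;
	}
	if (v < 0 || m < 0) /* device was not found at init (assumption) */
		return -1;
	*video = v;
	*media = m;
	return 0;
}
```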
@@ -585,6 +719,8 @@ VAStatus VA_DRIVER_INIT_FUNC(VADriverContextP context)
driver_data->media_fd_rkvdec = -1;
driver_data->video_fd_hantro = -1;
driver_data->media_fd_hantro = -1;
driver_data->video_fd_vpu981 = -1;
driver_data->media_fd_vpu981 = -1;
/* /*
* iter38: probe BOTH rkvdec and hantro-vpu so a single libva session
@@ -642,6 +778,74 @@ VAStatus VA_DRIVER_INIT_FUNC(VADriverContextP context)
}
}
(void)primary_driver;
/*
* ampere-av1-enablement Phase 2 — additionally probe for
* vpu981 (RK3588's dedicated AV1 decoder). Driver name
* "hantro-vpu" alone is ambiguous on RK3588 (3 instances:
* legacy MPEG2/VP8, encoder, vpu981 AV1). Discriminate by
* V4L2_PIX_FMT_AV1_FRAME capability. If the primary or alt
* hantro happens to BE vpu981 (unlikely but possible on
* non-RK3588 boards), this probe finds it again and we just
* dedupe via the fd value.
*/
{
static char av1_video[32], av1_media[32];
if (find_decoder_device_by_driver_with_fmt(
"hantro-vpu", V4L2_PIX_FMT_AV1_FRAME,
av1_video, sizeof av1_video,
av1_media, sizeof av1_media) == 0) {
int av1_v = open(av1_video, O_RDWR | O_NONBLOCK);
int av1_m = (av1_v >= 0)
? open(av1_media, O_RDWR | O_NONBLOCK)
: -1;
if (av1_v >= 0 && av1_m >= 0) {
driver_data->video_fd_vpu981 = av1_v;
driver_data->media_fd_vpu981 = av1_m;
request_log(
"ampere-av1: vpu981 AV1 decoder "
"at %s + %s\n",
av1_video, av1_media);
} else {
if (av1_v >= 0) close(av1_v);
if (av1_m >= 0) close(av1_m);
}
}
}
}
/*
* iter2 (ampere-kernel-decoders): probe the new HEVC EXT_SPS_RPS
* controls on each rkvdec/hantro fd. Result is consumed by
* h265_set_controls per-codec gate. Per-fd storage matches the
* iter38 multi-device-probe pattern (Phase 5 review item).
*/
driver_data->has_hevc_ext_sps_rps_rkvdec =
probe_hevc_ext_sps_rps_controls(driver_data->video_fd_rkvdec);
driver_data->has_hevc_ext_sps_rps_hantro =
probe_hevc_ext_sps_rps_controls(driver_data->video_fd_hantro);
if (driver_data->has_hevc_ext_sps_rps_rkvdec) {
request_log("iter2: kernel registers HEVC EXT_SPS_{ST,LT}_RPS "
"controls on rkvdec fd (will route through "
"vendored GStreamer parser)\n");
}
/*
* ampere-av1 Phase 2.1: probe V4L2_CID_STATELESS_AV1_FILM_GRAIN
* on the vpu981 fd. Per Janet v3 amendment, this runs at backend
* init (not lazily) so any race window with concurrent device
* switching can't observe an inconsistent flag.
*/
driver_data->has_av1_film_grain = false;
if (driver_data->video_fd_vpu981 >= 0) {
struct v4l2_query_ext_ctrl qec;
if (v4l2_query_ext_ctrl(driver_data->video_fd_vpu981,
V4L2_CID_STATELESS_AV1_FILM_GRAIN,
&qec) == 0) {
driver_data->has_av1_film_grain = true;
request_log("ampere-av1: vpu981 advertises FILM_GRAIN "
"control (will include in per-frame batch)\n");
}
}
status = VA_STATUS_SUCCESS;
@@ -690,9 +894,15 @@ VAStatus RequestTerminate(VADriverContextP context)
close(driver_data->video_fd_hantro);
if (driver_data->media_fd_hantro >= 0)
close(driver_data->media_fd_hantro);
if (driver_data->video_fd_vpu981 >= 0)
close(driver_data->video_fd_vpu981);
if (driver_data->media_fd_vpu981 >= 0)
close(driver_data->media_fd_vpu981);
/* Fall back to direct close if neither alt fd captured the active
* pair (env-override path). */
if (driver_data->video_fd_rkvdec < 0 && driver_data->video_fd_hantro < 0) { if (driver_data->video_fd_rkvdec < 0 &&
driver_data->video_fd_hantro < 0 &&
driver_data->video_fd_vpu981 < 0) {
if (driver_data->video_fd >= 0)
close(driver_data->video_fd);
if (driver_data->media_fd >= 0)
+71
@@ -38,6 +38,8 @@
#include <linux/videodev2.h>
#include "hevc-ctrls/v4l2-hevc-ext-controls.h"
#define V4L2_REQUEST_STR_VENDOR "v4l2-request"
#define V4L2_REQUEST_MAX_PROFILES 11
@@ -77,6 +79,75 @@ struct request_data {
int video_fd_hantro;
int media_fd_hantro;
/*
* ampere-av1-enablement Phase 2 — vpu981 is a THIRD physical
* hantro-vpu instance on RK3588 (separate from the legacy MPEG2/VP8
* hantro at /dev/video2). It's the dedicated AV1 decoder at
* /dev/video4 with card name "rockchip,rk3588-av1-vpu-dec".
*
* Driver-name alone ("hantro-vpu") is ambiguous on RK3588 — three
* instances share the name. The probe discriminates by capability:
* which OUTPUT format does the device advertise? Only vpu981
* exposes V4L2_PIX_FMT_AV1_FRAME.
*/
int video_fd_vpu981;
int media_fd_vpu981;
/*
* iter2 (ampere-kernel-decoders campaign) — per-fd probe result
* for the V4L2_CID_STATELESS_HEVC_EXT_SPS_{ST,LT}_RPS controls
* introduced in Linux 7.0 (Casanova VDPU381/VDPU383 series).
* RK3399 rkvdec doesn't have them and the probe returns false;
* RK3588 rkvdec (VDPU381/383) registers them and the probe is
* true. h265_set_controls consults only the rkvdec entry because
* HEVC routes through rkvdec only — hantro's entry stays false
* naturally (it doesn't have rkvdec-specific controls).
*
* The pair-of-flags layout mirrors video_fd_rkvdec /
* video_fd_hantro above (iter38 multi-device-probe pattern,
* memory feedback_multi_device_probe_design). Phase 5 review
* surfaced this as a correctness item: a single scalar on
* driver_data would silently misbehave across device-switch
* boundaries; per-fd storage is the safe shape.
*/
bool has_hevc_ext_sps_rps_rkvdec;
bool has_hevc_ext_sps_rps_hantro;
/*
* ampere-av1 Phase 2.1: probe result for the optional
* V4L2_CID_STATELESS_AV1_FILM_GRAIN control on the vpu981 fd.
* Probed at VA_DRIVER_INIT (per Janet v3 amendment — init-time
* not lazy). Consumed by av1_set_controls to conditionally include
* the 4th control in the per-frame batch.
*
* True iff vpu981 advertises the control via VIDIOC_QUERY_EXT_CTRL.
* False for non-RK3588 hosts (no vpu981 fd) or older kernels.
*/
bool has_av1_film_grain;
/*
* iter2 — cached SPS-derived RPS arrays. SPS NALs only appear in
* source_data on IDR frames; non-IDR frames' h265_set_controls
* reuse the cached arrays so we don't submit zero-filled RPS to
* the kernel (which would re-trigger the OOPS the iter2 fix is
* designed to prevent). Single-slot cache (sps_id 0 only) —
* adequate for the BBB / typical-stream case; multi-SPS streams
* would need expanding to a [16] cache keyed by sps_id.
*
* The cache stores the post-mapped V4L2 control struct arrays
* (not the intermediate GstH265SPS) so request.h doesn't need
* to know about the vendored GStreamer parser types — only the
* V4L2 UAPI structs from hevc-ctrls/v4l2-hevc-ext-controls.h
* included above.
*
* Owned by h265.c; freed at RequestTerminate.
*/
struct v4l2_ctrl_hevc_ext_sps_st_rps *hevc_rps_cache_st;
unsigned int hevc_rps_cache_st_count;
struct v4l2_ctrl_hevc_ext_sps_lt_rps *hevc_rps_cache_lt;
unsigned int hevc_rps_cache_lt_count;
bool hevc_rps_cache_valid;
struct video_format *video_format;
/* /*
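The single-slot RPS cache described in the struct comment above (fill on frames that carry an SPS, reuse on frames that do not) can be sketched as follows. This is an illustration under assumptions, not the code in h265.c: `sketch_rps` stands in for the V4L2 RPS control structs, only the short-term array is modelled, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct v4l2_ctrl_hevc_ext_sps_st_rps (layout not modelled). */
struct sketch_rps {
	unsigned char payload[8];
};

/* Single-slot cache, mirroring the hevc_rps_cache_* fields. */
struct sketch_rps_cache {
	struct sketch_rps *st;
	unsigned int st_count;
	bool valid;
};

/* Called when an SPS was parsed: replace the cached array with a copy. */
static int sketch_cache_store(struct sketch_rps_cache *c,
			      const struct sketch_rps *st, unsigned int count)
{
	struct sketch_rps *copy = NULL;

	if (count > 0) {
		copy = malloc(count * sizeof(*copy));
		if (copy == NULL)
			return -1;
		memcpy(copy, st, count * sizeof(*copy));
	}
	free(c->st);
	c->st = copy;
	c->st_count = count;
	c->valid = true;
	return 0;
}

/* Called on frames without an SPS: reuse the cached array if there is one. */
static const struct sketch_rps *sketch_cache_get(const struct sketch_rps_cache *c,
						 unsigned int *count)
{
	if (!c->valid) {
		*count = 0;
		return NULL; /* nothing cached yet: caller must not submit RPS */
	}
	*count = c->st_count;
	return c->st;
}

/* Called at teardown. */
static void sketch_cache_free(struct sketch_rps_cache *c)
{
	free(c->st);
	memset(c, 0, sizeof(*c));
}
```

Extending this to multi-SPS streams, as the comment suggests, would mean an array of such slots indexed by sps_id instead of one slot.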
+9
@@ -111,6 +111,13 @@ void surface_unbind_slot(struct request_data *driver_data,
{
if (surface_object->current_slot == NULL)
return;
/* AV1 Phase 3 diag: log every unbind with surface id + slot idx
* + status — confirms whether BeginPicture rebind is racing the
* consumer's vaGetImage on the previous frame. */
request_log("surface_unbind_slot id=0x%x status=%d slot_idx=%u\n",
surface_object->base.id,
surface_object->status,
surface_object->current_slot->v4l2_index);
cap_pool_release(&driver_data->capture_pool, surface_object->current_slot);
surface_object->current_slot = NULL;
}
@@ -192,6 +199,8 @@ VAStatus RequestCreateSurfaces2(VADriverContextP context, unsigned int format,
return VA_STATUS_ERROR_ALLOCATION_FAILED;
surface_object->current_slot = NULL; /* iter2 Fix 3 */
surface_object->linked_decode_surface_id = VA_INVALID_SURFACE;
surface_object->av1_order_hint = 0;
surface_object->destination_index = 0; /* set on bind */
surface_object->destination_planes_count = 0; /* set at CreateContext */
surface_object->destination_buffers_count = 0; /* set at CreateContext */
+40
@@ -89,6 +89,33 @@ struct object_surface {
struct timeval timestamp;
/*
* AV1 Phase 3: for streams with apply_grain=1, VAAPI's
* VADecPictureParameterBufferAV1 carries current_display_picture
* (display-time surface) separate from current_frame (decode
* target). vpu981 HW applies grain inline to the decode CAPTURE
* buffer, so the decoded data lives in current_frame's slot but
* ffmpeg calls vaGetImage on current_display_picture which has no
* slot bound. linked_decode_surface_id, set in av1_set_controls
* on the display surface, points to the decode surface so
* copy_surface_to_image can borrow its destination_data[].
*
* VA_INVALID_SURFACE = no link (the common case: non-AV1 codecs,
* AV1 with apply_grain=0, AV1 frames where current_frame ==
* current_display_picture).
*/
VASurfaceID linked_decode_surface_id;
/*
* AV1 Phase 3: AV1 order_hint of the frame currently decoded into
* this surface. VAAPI's VADecPictureParameterBufferAV1.order_hint
* is per-frame; kernel's v4l2_ctrl_av1_frame.order_hints[8] is
* per-reference. We track each decoded frame's order_hint here so
* the next frame's av1_set_controls can populate order_hints[i]
* from the av1_order_hint of the surface in ref_frame_map[i].
*/
uint8_t av1_order_hint;
union {
struct {
VAPictureParameterBufferMPEG2 picture;
@@ -122,6 +149,19 @@ struct object_surface {
VADecPictureParameterBufferVP9 picture;
VASliceParameterBufferVP9 slice;
} vp9;
/*
* ampere-av1-enablement: AV1 needs picture-header +
* variable number of slice/tile params (one per tile).
* tile_group_entries[] holds parsed VASliceParameterBufferAV1
* entries up to MAX_TILES; av1.c builds the matching
* v4l2_ctrl_av1_tile_group_entry[] at set_controls time.
*/
struct {
#define AV1_MAX_TILES 128
VADecPictureParameterBufferAV1 picture;
VASliceParameterBufferAV1 tile_group_entries[AV1_MAX_TILES];
unsigned int num_tile_group_entries;
} av1;
} params;
int request_fd;
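The av1_order_hint comment above describes remembering each decoded frame's order hint on its surface so a later frame can fill the per-reference array. A minimal sketch of that lookup follows, taking the comment's description at face value. `sketch_surface`, `sketch_fill_order_hints` and the choice of 0 for an unresolved reference are assumptions; the exact index mapping used in av1.c is not shown in this diff.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REFS 8 /* matches order_hints[8] in the comment above */

/* Hypothetical, trimmed-down surface: id plus the remembered order hint. */
struct sketch_surface {
	uint32_t id;
	uint8_t av1_order_hint;
};

static const struct sketch_surface *
sketch_find(const struct sketch_surface *pool, size_t count, uint32_t id)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (pool[i].id == id)
			return &pool[i];
	return NULL;
}

/* Fill order_hints[i] from the surface referenced by ref_frame_map[i]. */
static void sketch_fill_order_hints(const struct sketch_surface *pool,
				    size_t count,
				    const uint32_t ref_frame_map[SKETCH_NUM_REFS],
				    uint32_t order_hints[SKETCH_NUM_REFS])
{
	size_t i;

	for (i = 0; i < SKETCH_NUM_REFS; i++) {
		const struct sketch_surface *ref =
			sketch_find(pool, count, ref_frame_map[i]);

		/* Unresolved reference: fall back to 0 (assumption). */
		order_hints[i] = ref != NULL ? ref->av1_order_hint : 0;
	}
}
```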
+23 -1
@@ -433,6 +433,7 @@ static int v4l2_ioctl_controls(int video_fd, int request_fd, unsigned long ioc,
unsigned int num_controls)
{
struct v4l2_ext_controls controls;
int rc;
memset(&controls, 0, sizeof(controls));
@@ -444,7 +445,28 @@ static int v4l2_ioctl_controls(int video_fd, int request_fd, unsigned long ioc,
controls.request_fd = request_fd;
}
return ioctl(video_fd, ioc, &controls); rc = ioctl(video_fd, ioc, &controls);
if (rc < 0) {
/* ampere-av1 Phase 2.1 diag: report error_idx so the caller's
* error path knows which CID failed validation. error_idx >=
* count means the failure was pre-validation (e.g., bad
* request_fd). errno carries the syscall-level reason; it is
* saved and restored because logging may clobber it. */
int saved_errno = errno;
unsigned int failed_cid = 0;
unsigned int failed_size = 0;
if (controls.error_idx < num_controls) {
failed_cid = control_array[controls.error_idx].id;
failed_size = control_array[controls.error_idx].size;
}
request_log("v4l2_ioctl_controls: rc=%d errno=%d (%s) "
"ioc=0x%lx error_idx=%u count=%u "
"failed_cid=0x%x failed_size=%u\n",
rc, saved_errno, strerror(saved_errno), ioc,
controls.error_idx, num_controls,
failed_cid, failed_size);
errno = saved_errno;
}
return rc;
}
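The error_idx handling in the diagnostic above reduces to one classification: an index below the control count names the failing control, anything else means the failure is not tied to a specific control. A minimal sketch follows; `sketch_ctrl`, `sketch_failure` and `sketch_classify` are hypothetical names, and the struct only mirrors the `id` and `size` fields that the diagnostic reads.

```c
#include <stdint.h>

/* Hypothetical stand-in for the id/size fields of struct v4l2_ext_control. */
struct sketch_ctrl {
	uint32_t id;
	uint32_t size;
};

struct sketch_failure {
	int has_control; /* 1 when error_idx points at a specific control */
	uint32_t cid;
	uint32_t size;
};

/* Map error_idx to the failing control, or to "no specific control". */
static struct sketch_failure sketch_classify(const struct sketch_ctrl *ctrls,
					     uint32_t count, uint32_t error_idx)
{
	struct sketch_failure f = { 0, 0, 0 };

	if (error_idx < count) {
		f.has_control = 1;
		f.cid = ctrls[error_idx].id;
		f.size = ctrls[error_idx].size;
	}
	return f;
}
```

Doing the bounds check once and keeping the result in a small struct avoids repeating the `error_idx < count` test at every place the failing control is reported.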
int v4l2_get_controls(int video_fd, int request_fd,