Commit Graph

18 Commits

Author SHA1 Message Date
claude-noether 9c30eccd52 ampere-av1 Phase 2.1: implement av1_set_controls body (~500 LoC)
Replaces stub av1_set_controls with full VAAPI → V4L2 stateless AV1
control translation. Four V4L2 controls batched per-frame:
  V4L2_CID_STATELESS_AV1_SEQUENCE       (sequence-level flags)
  V4L2_CID_STATELESS_AV1_FRAME          (heavy — quant, lf, cdef, lr, gm,
                                          tile_info, refs, frame flags)
  V4L2_CID_STATELESS_AV1_TILE_GROUP_ENTRY[] (DYNAMIC_ARRAY, size=MAX(N,1))
  V4L2_CID_STATELESS_AV1_FILM_GRAIN     (gated on driver_data->has_av1_film_grain)

Reference: Kwiboo/FFmpeg v4l2-request-n8.1:libavcodec/v4l2_request_av1.c
(636 LoC); same V4L2 output schema, sourced from VAAPI's
VADecPictureParameterBufferAV1 instead of FFmpeg's AV1RawSequenceHeader.

VAAPI gap notes (fields the spec needs but VAAPI doesn't expose):
  - sequence max_frame_{width,height}_minus_1 — use current frame size
  - enable_warped_motion / enable_ref_frame_mvs / enable_superres /
    enable_restoration sequence-level — conservative set-true (per-frame
    flags gate actual behavior)
  - order_hints[], reference_frame_ts[] — zero (kernel cross-refs by
    OUTPUT timestamp / surface id)
  - tile_start_col_sb[] / tile_start_row_sb[] — reconstruct via
    prefix-sum on VAAPI's width/height_in_sbs_minus_1[]
  - tile_size_bytes — set to 4 for multi-tile frames (max value), 0
    for single-tile (matches Kwiboo's conditional)
  - render_width/height — fall back to coded dimensions
  - current_frame_id / refresh_frame_flags / skip_mode_frame_idx /
    buffer_removal_time / frame_refs_short_signaling — zero
  - film_grain_params_ref_idx / update_grain — zero (only consulted in
    reuse paths; apply_grain=1 + populated arrays drive decode directly)

F1/F2/F3 risk mitigations per phase1_plan_v2:
  F1: mi_col/row_starts sentinel = 2 * ((frame_width + 7) >> 3) at
      index [tile_cols]/[tile_rows] — mirrors Kwiboo lines 238/244
  F2: superres_denom direct from VAAPI's superres_scale_denominator
      (VAAPI's encoding is the final value; no AV1_SUPERRES_DENOM_MIN
      math). Fallback to AV1_SUPERRES_NUM=8 if zero.
  F3: loop_restoration_size[] gated on USES_LR flag derived from
      y_t != 0 || cb_t != 0 || cr_t != 0 — mirrors Kwiboo lines 281-287

Plus:
  - request.h: has_av1_film_grain bool on driver_data
  - request.c: probe VIDIOC_QUERY_EXT_CTRL for FILM_GRAIN on vpu981 fd
    at VA_DRIVER_INIT (Janet v3 amendment A: init-time, not lazy)

Compile-tested on boltzmann (aarch64 native, gcc 15.2.1): clean .so,
0 errors, pre-existing GStreamer #warnings only.

Phase 3 verification on ampere is next: 208x208 smoke + film_grain
stress vector (av1-1-b8-23-film_grain-50.ivf) byte-compare libva vs
kdirect (Phase 0 proved kdirect bit-perfect).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 10:18:46 +00:00
claude-noether bed75c0cef ampere-av1 Phase 2 step 1: third-device fd scaffolding for vpu981
RK3588 has THREE hantro-vpu instances (legacy MPEG2/VP8 at /dev/video2,
encoder at /dev/video3, vpu981 AV1 at /dev/video4). The existing
2-device probe is "RK3399-shaped knowledge" — silently picks the first
hantro-vpu and never finds vpu981.

This commit adds:
- video_fd_vpu981 + media_fd_vpu981 slots to request_data
- video_node_supports_output_fmt(): capability probe via VIDIOC_ENUM_FMT
  on OUTPUT/OUTPUT_MPLANE queues
- find_decoder_device_by_driver_with_fmt(): walks /dev/media* matching
  driver name AND capability filter (V4L2_PIX_FMT_AV1_FRAME for vpu981)
- 'a' kind in request_device_kind_for_profile (VAProfileAV1Profile0)
- 'a' branch in request_switch_device_for_profile
- vpu981 probe at backend init, alongside existing rkvdec + hantro
- vpu981 fd cleanup in RequestTerminate
- VAProfileAV1Profile0 → V4L2_PIX_FMT_AV1_FRAME in codec.c

Verified on ampere:

  $ LIBVA_DRIVER_NAME=v4l2_request ffmpeg ... 2>&1 | grep iter38
  v4l2-request: auto-selected codec device: /dev/video1 + /dev/media0
  v4l2-request: iter38: also opened hantro-vpu decoder at /dev/video2 + /dev/media1
  v4l2-request: ampere-av1: vpu981 AV1 decoder at /dev/video4 + /dev/media3

Three devices opened. HEVC still works (iter2 EXT_SPS_RPS probe still
triggers on rkvdec, sibling-campaign bit-perfect behaviour preserved).

Next steps: config.c advertise VAProfileAV1Profile0, surface.h add
av1 substruct, picture.c dispatch, av1.{c,h} for the codec dispatch
(~700 LoC mirroring Kwiboo v4l2_request_av1.c).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:53:37 +02:00
claude-noether 4f6ba6c0e3 iter2 step3: HEVC EXT_SPS_*_RPS UAPI header + runtime probe
src/hevc-ctrls/v4l2-hevc-ext-controls.h (NEW, MIT, ~95 LOC):
  Verbatim mirror of Linux 7.0 V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS
  and _LT_RPS control IDs + struct definitions + flag macros. Each
  symbol is ifndef-guarded so when ampere's linux-api-headers
  eventually bumps to 7.0+, the kernel header takes precedence and
  this shim silently no-ops. Citation block links the upstream
  Casanova v8 series.

  Per LGPL section 3.b, kernel UAPI struct definitions are excepted
  from GPL inheritance, so copying them into MIT userspace is fine.

src/request.h: added has_hevc_ext_sps_rps_rkvdec + _hantro bool
  fields on struct request_data — pair-of-flags layout mirrors
  video_fd_rkvdec / video_fd_hantro (iter38 multi-device-probe
  pattern, per feedback_multi_device_probe_design). Phase 5 review
  identified single-scalar storage as a silent-misbehavior risk
  across device-switch boundaries.

src/request.c:
  - new probe_hevc_ext_sps_rps_controls(fd) helper: queries the two
    new CIDs via VIDIOC_QUERYCTRL; returns true iff both register.
    RK3399 rkvdec (linux 6.x or 7.x without VDPU381/383 bindings)
    returns false; RK3588 rkvdec (VDPU381/383) returns true.
  - probe each driver_data->video_fd_rkvdec / _hantro after the
    iter38 multi-device-probe block at VA_DRIVER_INIT time
  - log-line if rkvdec supports it - diagnostic for Phase 7

src/meson.build: added the new UAPI header to the headers list.

Build verified: ninja -C build clean, .so produced. The new probe
runs at driver init and stores the result, but nothing CONSUMES the
result yet — that's Step 4 (h265_set_controls wiring).

Per ampere-kernel-decoders campaign iter2 Phase 4 step 3 (amended
by Phase 5 review item 'per-fd storage').

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 11:08:10 +02:00
claude-noether c56a77bd4c iter38: multi-device probe — single libva session serves all 5 codecs
Probe BOTH rkvdec and hantro-vpu at VA_DRIVER_INIT and keep their
{video,media}_fd pairs in driver_data. RequestQueryConfigProfiles
enumerates the union of supported profiles from all open fds.
RequestCreateConfig retargets driver_data->{video,media}_fd to the
device that serves the requested profile; if a switch is needed
(active fd is wrong), tears down output_pool, capture_pool, video_format
cache, and fmt_valid so the next RequestCreateContext rebuilds them
on the new device.

Profile→device map (RK3399-shaped):
  H264 / HEVC / VP9  → rkvdec
  MPEG-2 / VP8       → hantro-vpu

Honours LIBVA_V4L2_REQUEST_VIDEO_PATH / MEDIA_PATH explicit overrides
(skips alt-probe when those are set).

Closes the 'libva multi-device probe' open item from iter36/iter37
campaign-close.
2026-05-14 18:52:12 +00:00
claude-noether 6df2159dd3 fresnel-fourier iter7 Phase 7 fix-forward: data links connect pads not entities directly
Empirical Phase 7 verification revealed the algorithm bug: data links
in MEDIA_IOC_G_TOPOLOGY connect PAD IDs, not entity IDs directly.
My iter7 Phase 6 commit compared link source_id/sink_id against
the proc entity_id, never matched → io_entity_ids stayed empty →
interface lookup never fired → returns -1 → falls back to legacy
hardcoded path.

Topology dump on fresnel /dev/media0 (rkvdec) confirmed:
- Entity 3 (rkvdec-proc) has function=0x4008 (DECODER) ✓
- Data link src=16777218 sink=16777220 — these are PAD ids
  (0x01000002, 0x01000004), NOT entity 3.
- Interface link src=50331660 (interface) sink=1 (entity) — for
  interface links source/sink ARE entity IDs.

Fix: resolve pads → entities via the topo.pads[] array.
1. Collect pads belonging to proc entity (via pads[].entity_id).
2. For each data link touching those pads, the OTHER pad's
   entity_id is an IO neighbor.
3. Find interface link to those IO entities (unchanged from prev).

Also allocate topo.pads[] in the 2-call ioctl pattern.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-13 11:00:20 +00:00
claude-noether c106d95869 fresnel-fourier iter7 Phase 6: auto-detect with decoder-entity discrimination (B1a)
Refactor request.c::find_video_node_via_topology to
find_decoder_video_node_via_topology — walks media-topology entities
looking for MEDIA_ENT_F_PROC_VIDEO_DECODER function, then follows the
kernel's link graph (data link from proc to IO entity, interface link
from IO entity to V4L_VIDEO interface) to the correct /dev/videoN.

Two-pass find_codec_device: pass 1 accepts only "rkvdec" (multi-codec
decoder, 3 of 5 codecs); pass 2 accepts any known_decoder_drivers
entry. Pre-iter7 the walk picked whichever media device matched the
hantro-vpu driver name first — which on RK3399 could be the encoder
half of the same media device, surfacing as an empty profile list.

Phase 5 amendments incorporated:
- CRIT-1: use MEDIA_LNK_FL_INTERFACE_LINK (1U<<28) to discriminate
  interface vs data links.
- CRIT-2: check both source_id and sink_id of each link.
- IMP-3: 2-call MEDIA_IOC_G_TOPOLOGY pattern (allocate all 3 arrays
  before second call); pre-iter7 had a spurious memset + third call.

iter4-B1b (multi-decoder routing — open BOTH rkvdec AND hantro from
one backend instance) still deferred. Post-iter7 MPEG-2/VP8 (hantro)
still need LIBVA_V4L2_REQUEST_VIDEO_PATH override.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-13 09:38:54 +00:00
claude-noether 7f8fa93213 fresnel-fourier iter4 Phase 6 commit Z: device-path auto-detect via media controller topology
Pre-iter4 backend hardcoded /dev/video0 + /dev/media0 as defaults
when no env override was set. Linux 7.0 udev/probe order changed,
rockchip-rga (RGB color converter, no codec) now claims
/dev/video0 — legacy default returns empty profile list.

Discovery is driven by the media controller graph (the canonical
v4l2-request approach). NOT a /dev/video* walk by enumeration
order — that mispairs video and media nodes when one driver
registers multiple media devices, and depends on probe-order
luck.

Algorithm:
  1. Walk /dev/media0..15. MEDIA_IOC_DEVICE_INFO names the driver.
     Match against {rkvdec, hantro-vpu, cedrus, sun4i_csi}.
  2. MEDIA_IOC_G_TOPOLOGY enumerates the entity/interface graph.
     The MEDIA_INTF_T_V4L_VIDEO interface carries major:minor of
     the V4L2 video node owned by THIS media controller — paired
     by the kernel, not by /dev/* enumeration order.
  3. Resolve major:minor to /dev/videoN via /sys/dev/char/<M>:<N>
     (the kernel's char-device sysfs symlink whose basename is
     the device node name).

LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 escape hatch reverts to legacy
/dev/video0 + /dev/media0 hardcoded behavior for callers that
depended on it.

Phase 5 C4 amendment: walk-and-pick-first selects rkvdec on RK3399
(rkvdec's media controller enumerates before hantro's). H.264 /
HEVC / VP9 (rkvdec codecs) work without env override after this
commit. MPEG-2 / VP8 (hantro) still require explicit
LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 override; full
multi-decoder dispatch is iter4-B1 backlog item.

Verified empirically on fresnel (linux-fresnel-fourier 7.0-1):
- vainfo (no env) -> "auto-selected codec device: /dev/video1 +
  /dev/media0", enumerates H264*5 + HEVCMain (rkvdec) — paired
  via topology graph, not /dev/video* enumeration.
- vainfo NO_AUTODETECT=1 -> empty list (legacy /dev/video0 = rga).
- vainfo with explicit /dev/video3 + /dev/media1 -> MPEG2*2 + VP8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 06:05:41 +00:00
test0r 565f5c0de4 context: introduce request_pool, decouple OUTPUT buffers from surfaces
Commit 3 of the upstreamable plan (upstreamable_design.md §1, §5).
Replaces the prior per-surface OUTPUT-buffer ownership model with a
small driver-wide pool sized by codec pipeline depth (4 H.264 frames
in flight), allocated unconditionally regardless of caller's
num_render_targets.

Prior art (kernel UAPI dev-stateless-decoder.rst, ffmpeg
v4l2_request.c, Chromium V4L2StatelessVideoDecoder, GStreamer
v4l2slh264dec) all decouple OUTPUT and CAPTURE pool sizing. fourier's
"output_count == surfaces_count" model was a category error: OUTPUT
buffers are request-time bitstream slots, CAPTURE buffers are
picture-time DPB slots; their lifecycles and sizing are independent.

Changes:
  * NEW src/request_pool.{c,h} (~200 LoC):
      - request_pool_init(): CREATE_BUFS + per-slot QUERYBUF + mmap.
      - request_pool_destroy(): munmap all, idempotent.
      - request_pool_acquire(): round-robin claim; returns V4L2 buffer
        index of an unused slot or -1.
      - request_pool_release(): mark slot free for reuse.
      - request_pool_slot(): accessor for ptr/size given a buffer index.

  * src/request.h: add struct request_pool output_pool to request_data.

  * src/context.c::RequestCreateContext: replace the per-surface
    OUTPUT loop with a single request_pool_init() call (count=4,
    independent of surfaces_count). Drop the now-unused locals
    (length, offset, source_data, output_buffers_count, index,
    index_base, i, surface_object). DELETES patch 0002's
    "output_buffers_count = ... ? ... : 4" hack inline — the pool's
    own count parameter supersedes it.

  * src/picture.c::RequestBeginPicture: borrow a pool slot at frame
    start, write its mmap pointer/size/index into the surface's
    transient source_* fields. The fields stay (still useful as
    a borrow handle that the existing codec_store_buffer memcpys
    target), but no longer represent surface-permanent ownership.
    Reset slices_size/slices_count here too (was implicit on first
    Render).

  * src/surface.c::RequestSyncSurface: after VIDIOC_DQBUF returns
    the OUTPUT buffer, release the pool slot and clear the surface's
    borrow handle. Fixes the segv on second-frame submission.

  * src/surface.c::RequestDestroySurfaces: remove the munmap of
    source_data — pool owns the mmap.

  * src/request.c::RequestTerminate: call request_pool_destroy()
    before close(video_fd) so munmaps still target a valid fd.

  * src/meson.build: add request_pool.c and request_pool.h to the
    sources/headers lists.

This commit removes 0002's OUTPUT-pool hack inline (the
"floor to 4" line is gone). The DECODE_MODE/START_CODE block in 0002
remains until commit 4 lands.

Build-verified clean on aarch64.

Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
2026-05-04 09:45:05 +00:00
Paul Kocialkowski 518d7a0c59 Update and harmonize heading author lists
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2019-03-07 11:37:12 +01:00
Paul Kocialkowski 7ff2543e64 Add support for the single-planar V4L2 API
Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
2018-09-07 16:43:13 +02:00
Paul Kocialkowski e96e5f392a request: Check required v4l2 capabilities after opening the video node
Since our use of the v4l2 API has some assumptions on the available
userspace APIs, check the capabilities reported by the driver to make
sure they are supported.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2018-08-06 16:44:37 +02:00
Paul Kocialkowski c764527c17 Add support for QuerySurfaceAttributes
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2018-07-18 15:07:42 +02:00
Paul Kocialkowski 7587ef6901 surface: Add ExportSurfaceHandle support for dma-buf export
This is the latest version of dma-buf export, that does support
specifying DRM modifiers.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2018-07-18 15:02:37 +02:00
Paul Kocialkowski 019a0ccb42 buffer: Add Acquire/ReleaseBufferHandle support for dma-buf export
This is the first version of dma-buf export, that does not support
specifying a DRM modifier.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2018-07-18 14:51:11 +02:00
Paul Kocialkowski 7a72782612 request: Reorder BufferInfo
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2018-07-18 14:46:47 +02:00
Paul Kocialkowski 829abae895 surface: Add basic support for CreateSurfaces2
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
2018-07-18 14:40:36 +02:00
Maxime Ripard 59c4a6e034 libva: Change the environment variables name
Change the environment variables for the media and video path to match the
libva name.

Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2018-07-17 17:02:23 +02:00
Maxime Ripard 22e0c6c3e1 tree: Rename the libva name for real
Change the autotools files, and with them the name of the libva.

Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2018-07-17 17:02:23 +02:00