Commit Graph

359 Commits

Author SHA1 Message Date
claude-noether 6bc12fe7e4 iter39 Option B: drop Hi10P + Main10 from RequestQueryConfigProfiles
Per Phase 7 close + user-directed Option B trigger (web research /
rockchip-mpp showed Hi10P is effectively impossible on the current
stack). Cross-test on ampere RK3588 confirmed the SAME failure mode
as fresnel RK3399 — both produce all-zero output via libva; kdirect
fails with EINVAL on both. The blocker is in ffmpeg-v4l2-request
userspace plumbing for the new uAPI controls Karlman's kernel patches
introduced, NOT in our backend or the kernel.

Sources confirming kernel + HW capable but userspace pending:
  - lwn.net/Articles/950434: "to fully runtime test... you may need
    upstream DRM commits, FFmpeg patches"
  - patchwork.kernel.org Karlman v6 → v10 series on linux-media
  - Rockchip RK3399 + RK3588 datasheets list 10-bit H.264 support

Stop enumerating Hi10P + Main10 so VAAPI consumers don't try the
broken path. The backend infrastructure (codec.c profile cases,
context.c NV15 CAPTURE + synthetic SPS bit_depth=2 + video_format
invalidation, image.c P010 reporting + NV15→P010 unpack, surface.c
RT_FORMAT_YUV420_10 guard + NV15 PRIME fourcc, nv15.c + nv15.h
unpack primitive, request.h is_10bit flag) is RETAINED — just
re-add the two profiles[index++] lines and bump the H264 guard
back to (-6) when upstream ffmpeg-vaapi V4L2 hwaccel learns 10-bit.

Memory: feedback_rk3399_h264_hi10p_advertised_not_functional.md
captures the empirical evidence for future iterations.

vainfo after this commit: 10 profiles (was 12), matches the iter38
baseline. iter38 5/5 PASS preserved (no other codec touched).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:43:44 +00:00
claude-noether 63fed87bc5 iter39 fresnel fix: advertise P010 unconditionally in QueryImageFormats
ffmpeg-vaapi's hwcontext_vaapi calls vaQueryImageFormats during
hwframes context setup, BEFORE vaCreateContext fires. Our previous
gate on driver_data->is_10bit meant P010 wasn't in the catalog at
that early query — ffmpeg's hwdownload then rejected pix_fmt=p010le
with "Invalid output format p010le for hwframe download" and decode
failed before our backend's CreateContext saw the 10-bit profile.

Fix: advertise P010 unconditionally in QueryImageFormats. Safe because
consumers ask for P010 only when their decode pipeline needs 10-bit,
and our P010 unpack path in copy_surface_to_image is gated on
image->format.fourcc == VA_FOURCC_P010 (independent of is_10bit).

Verified on fresnel: with this fix, Hi10P decode advances past the
hwdownload filter setup. (Run pending bundle to fresnel.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:34:52 +00:00
claude-noether a13215de45 iter39 fresnel fix: skip pre-S_FMT NV15 CAPTURE format probe
RK3399 rkvdec advertises NV15 in VIDIOC_ENUM_FMT(CAPTURE) only AFTER
S_FMT(OUTPUT) + S_EXT_CTRLS(SPS) resolve image_fmt to 420_10BIT.
Pre-flight v4l2_find_format(NV15) always returns 0 → video_format
stays NULL → CreateContext returns OPERATION_FAILED → ffmpeg-vaapi
hwaccel init fails with "Failed to create decode context: 1".

Verified on fresnel (kernel 7.0-14 / linux-fresnel-fourier):
  v4l2-ctl -d /dev/video1 --list-formats → only NV12 enumerated

Fix: for 10-bit profiles, skip the find_format probe and directly
map to our NV15 video_format entry. The later S_FMT(CAPTURE) in
the same RequestCreateContext path commits the actual NV15 mode
once the synthetic-SPS injection sets bit_depth_luma_minus8=2.

Discovered during Phase 7 sub-profile verification — Criterion 1
(vainfo enumeration) PASSed but Criteria 2/3 (Hi10P/Main10 decode)
failed with the hwaccel init error. iter38 5/5 baseline still PASSES
(no regression — non-10-bit path unchanged).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:34:14 +00:00
claude-noether f0ef69d279 iter2 step4: wire h265_set_controls to populate EXT_SPS_*_RPS controls
Per Phase 4 plan + Phase 5 review amendments (SPS parse-and-cache,
per-fd gating).

src/h265.c additions:
  - #include <errno.h>, the v4l2-hevc-ext-controls.h, and the
    vendored gst/codecparsers/gsth265parser.h
  - new static helper h265_populate_ext_sps_rps_cache(): walks
    surface_object->source_data for an SPS NAL (nal_unit_type == 33)
    using gst_h265_parser_identify_nalu; if found, calls
    gst_h265_parser_parse_sps_ext (NOT gst_h265_parser_parse_sps —
    the latter discards the per-RPS-entry EXT data we need); maps
    GstH265ShortTermRefPicSet (base) + GstH265ShortTermRefPicSetExt
    (carrying use_delta_flag[16], used_by_curr_pic_flag[16],
    delta_poc_s0_minus1[16], delta_poc_s1_minus1[16]) into the V4L2
    struct arrays; stores on driver_data->hevc_rps_cache_*
  - non-IDR-frame handling: cache holds across frames, so frames
    whose source_data lacks an SPS NAL reuse the previously-parsed
    cached arrays (Phase 5 review item #3)
  - controls[] grows from [5] to [7]; the 2 new entries are appended
    after the standard 5 (SPS/PPS/SLICE_PARAMS/SCALING_MATRIX/
    DECODE_PARAMS), gated by driver_data->has_hevc_ext_sps_rps_rkvdec
    (per-fd probe result from Step 3) + the cache being valid
  - field-by-field mapping mirrors GStreamer's
    gst_v4l2_codec_h265_dec_fill_ext_sps_rps verbatim (the upstream
    reference identified in Phase 0 prior-art survey)
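
A minimal sketch of the SPS lookup described above, for illustration only. The commit uses gst_h265_parser_identify_nalu; this sketch swaps in a plain Annex-B start-code scan instead, and assumes source_data carries 00 00 01 start codes. The name sketch_find_hevc_sps is hypothetical and not part of the backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HEVC_NAL_SPS 33u /* nal_unit_type value quoted in the message */

/*
 * Scan for a 00 00 01 start code followed by a NAL header whose
 * nal_unit_type is 33 (SPS). Returns the offset of the NAL header byte,
 * or -1 if no SPS is present (the case where the cache would be reused).
 */
long sketch_find_hevc_sps(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 3 < len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            /* HEVC NAL header: 1 forbidden bit, then 6 bits of type. */
            unsigned type = (buf[i + 3] >> 1) & 0x3fu;
            if (type == SKETCH_HEVC_NAL_SPS)
                return (long)(i + 3);
        }
    }
    return -1;
}
```

A return of -1 corresponds to the non-IDR case above, where the previously cached arrays would be reused.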

src/request.h additions:
  - struct request_data carries hevc_rps_cache_st (array pointer),
    _st_count, hevc_rps_cache_lt, _lt_count, hevc_rps_cache_valid.
    Single-slot cache (sps_id 0 only; multi-SPS streams would need
    expanding). Stores POST-MAPPED V4L2 structs so request.h doesn't
    need to know GstH265SPS / GstH265SPSEXT types.

Critical interpretation correction (Phase 5 review followup):
GstH265SPS has short_term_ref_pic_set[65] (base) but NOT
short_term_ref_pic_set_ext[]. The EXT array lives on a SEPARATE
GstH265SPSEXT struct accessed via gst_h265_parser_parse_sps_ext.
The 'plain' gst_h265_parser_parse_sps internally calls _ext with a
LOCAL discarded SPSEXT (see gsth265parser.c:2050). Our call must
use the _ext variant directly to keep the EXT data. Caught during
Step 4 first-build error.

Build verified: ninja -C build clean. .so is 759 KB (up from 485 KB
original, 682 KB after Step 2 vendor — the +80 KB is the new helper
+ extension).

iter2 Phase 6 Step 5 (install + reboot + smoke-test) is the F1
falsifier moment: if HEVC stops OOPSing, mechanism confirmed; if it
still OOPSes, loop back to Phase 0 with kernel-agent#11 re-opened.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:49:12 +00:00
claude-noether 393d02f413 iter2 step3: HEVC EXT_SPS_*_RPS UAPI header + runtime probe
src/hevc-ctrls/v4l2-hevc-ext-controls.h (NEW, MIT, ~95 LOC):
  Verbatim mirror of Linux 7.0 V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS
  and _LT_RPS control IDs + struct definitions + flag macros. Each
  symbol is ifndef-guarded so when ampere's linux-api-headers
  eventually bumps to 7.0+, the kernel header takes precedence and
  this shim silently no-ops. Citation block links the upstream
  Casanova v8 series.

  Kernel UAPI headers are published with the Linux-syscall-note
  exception (not under the LGPL); that exception is the basis for
  mirroring these struct definitions in an MIT userspace shim.

src/request.h: added has_hevc_ext_sps_rps_rkvdec + _hantro bool
  fields on struct request_data — pair-of-flags layout mirrors
  video_fd_rkvdec / video_fd_hantro (iter38 multi-device-probe
  pattern, per feedback_multi_device_probe_design). Phase 5 review
  identified single-scalar storage as a silent-misbehavior risk
  across device-switch boundaries.

src/request.c:
  - new probe_hevc_ext_sps_rps_controls(fd) helper: queries the two
    new CIDs via VIDIOC_QUERYCTRL; returns true iff both register.
    RK3399 rkvdec (linux 6.x or 7.x without VDPU381/383 bindings)
    returns false; RK3588 rkvdec (VDPU381/383) returns true.
  - probe each driver_data->video_fd_rkvdec / _hantro after the
    iter38 multi-device-probe block at VA_DRIVER_INIT time
  - log-line if rkvdec supports it - diagnostic for Phase 7

src/meson.build: added the new UAPI header to the headers list.

Build verified: ninja -C build clean, .so produced. The new probe
runs at driver init and stores the result, but nothing CONSUMES the
result yet — that's Step 4 (h265_set_controls wiring).

Per ampere-kernel-decoders campaign iter2 Phase 4 step 3 (amended
by Phase 5 review item 'per-fd storage').

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:49:09 +00:00
claude-noether 9f7437e8ee iter2 step2: GLib/GStreamer compat shim, build succeeds
Vendored gsth265parser + nalutils + gstbitreader + gstbytereader (the
Step 1 commit) compile cleanly against libc + libv4l2 only after
adding 1 compat translation unit + 5 stub headers, no edits to the
vendored .c/.h files themselves.

src/h265_parser/gst_compat.{h,c} — new files (MIT, original work):
  - GLib type aliases (gboolean, gchar, gint*, guint*, gsize, gpointer)
  - Memory helpers (g_malloc/g_free as #define free, g_memdup2 inline)
  - Asserts as no-op + parser-return-code-propagation
  - All GST_DEBUG/INFO/WARNING/ERROR/LOG/FIXME as no-ops (the parser
    is heavy on debug logging; we compile it all out)
  - GArray implementation (~100 LOC, just enough for gsth265parser.c's
    24 call sites)
  - GList full struct with .data/.next/.prev so callers compile;
    list-manipulation functions abort() — dead code paths only
  - Byte-order read/write macros (GST_READ_UINT8/16/24/32/64_LE/BE,
    GST_WRITE_UINT8/16/24/32_BE) — aarch64 LE inlines
  - g_once_init_enter/leave as simple gate
  - G_MAXUINT*, G_MAXINT*, G_MINxxx, G_GNUC_* attribute macros, etc.
  - Opaque GstBuffer/GstMemory/GstMapInfo + abort-stub functions for
    the encoder-side SEI-insertion paths the libva backend never invokes
  - gst_util_ceil_log2 real impl (used by slice-header parser; dead
    for our SPS-only call path but cheaper to implement than stub)

src/h265_parser/gst/{gst.h,base/base-prelude.h,base/gstbitwriter.h,
codecparsers/codecparsers-prelude.h,glib-compat-private.h} — 5 new
stub headers (MIT). All include gst_compat.h. gstbitwriter.h adds
abort-stub functions for the bit-writer API (used by nalutils.c's NAL
emulation-prevention encoder path — dead code for the parse-only
libva backend).

src/meson.build — added the 5 new .c source files and 10 new .h
headers; added include_directories('h265_parser') to the include path
so the vendored files' '#include <gst/base/...>' style references
resolve to the stub headers + actual vendored files in the local
tree.

Build verified: ninja -C build produces v4l2_request_drv_video.so
(682 KB, up from 485 KB pre-vendor — the +200 KB is the vendored
parser code). nm shows gst_h265_parse_sps, gst_h265_parse_sps_ext,
gst_h265_parser_identify_nalu, and the other functions we need for
Step 4 are present in the binary.

Two #warning messages from gsth265parser.h about API stability are
upstream-intentional and harmless ('The H.265 parsing library is
unstable API and may change in future').

This commit completes Step 2 of ampere-kernel-decoders iter2 Phase 6.
Backend remains functionally identical to pre-iter2 — the new code
compiles + links but is not yet called from h265_set_controls (that's
Step 4). Existing 5 codecs continue to work as before.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:49:06 +00:00
claude-noether c9b7fcff50 iter2 step1: vendor GStreamer 1.28.2 H.265 parser unchanged
Source: gitlab.freedesktop.org/gstreamer/gstreamer @ commit 43421c2a5b8a
(refs/tags/1.28.2). All 8 vendored files copied verbatim into
src/h265_parser/:

  gst-plugins-bad/gst-libs/gst/codecparsers/gsth265parser.c (168 KB)
  gst-plugins-bad/gst-libs/gst/codecparsers/gsth265parser.h ( 92 KB)
  gst-plugins-bad/gst-libs/gst/codecparsers/nalutils.c       (13 KB)
  gst-plugins-bad/gst-libs/gst/codecparsers/nalutils.h       (  8 KB)
  gstreamer/libs/gst/base/gstbitreader.c                     (  8 KB)
  gstreamer/libs/gst/base/gstbitreader.h                     ( 10 KB)
  gstreamer/libs/gst/base/gstbytereader.c                    ( 39 KB)
  gstreamer/libs/gst/base/gstbytereader.h                    ( 25 KB)

Total ~11 KLOC, LGPL v2.1+ per original headers (Intel + Sreerenj
Balachandran + others). LGPL headers preserved verbatim. Backend's
existing COPYING.LGPL covers redistribution.

** Build is INTENTIONALLY BROKEN at this commit. ** GLib dependencies
(GArray, g_malloc, gboolean, GST_DEBUG, etc.) are not yet satisfied;
src/Makefile.am is not yet updated to include these files. Step 2
performs the GLib-to-libc mechanical adaptation; Step 3 wires the
header + Makefile.

This vendor-unchanged commit is the upstream-tracking baseline. When
GStreamer ships a parser bug fix, the future-sync workflow is:
  git diff <this commit>..HEAD -- src/h265_parser/
to surface our adaptations, then rebase those over the upstream fix.

Per ampere-kernel-decoders campaign iter2 Phase 4 §Step 1
(/home/mfritsche/src/ampere-kernel-decoders/phase4_plan_iter2.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:48:52 +00:00
claude-noether a8a91d92d6 Revert "ampere iter2: HEVC EXT_SPS_ST_RPS / _LT_RPS dynamic-array submission (VDPU381)"
This reverts commit f61f736380.
2026-05-17 09:48:29 +00:00
claude-noether f61f736380 ampere iter2: HEVC EXT_SPS_ST_RPS / _LT_RPS dynamic-array submission (VDPU381)
Fixes the rkvdec_hevc_prepare_hw_st_rps out-of-bounds kernel OOPS that
blocked HEVC decode on ampere (RK3588) per
marfrit/libva-v4l2-request-fourier#3 and ampere-fourier iter1 close.

Mechanism (Phase 5 amendment to issue body):
The new EXT_SPS controls are registered as V4L2_CTRL_FLAG_DYNAMIC_ARRAY
in vdpu38x_hevc_ctrl_descs (rkvdec.c:279/284) with cfg.dims = { 65 }.
The v4l2-ctrl framework init-allocates 1 zeroed element (ctrls-core.c:2116).
When num_short_term_ref_pic_sets > 1, rkvdec_hevc_prepare_hw_st_rps
(rkvdec-hevc-common.c:393-405) iterates idx 0..N-1 and overruns the
1-element kernel allocation. Submitting an N-element dynamic-array
control via S_EXT_CTRLS extends the framework allocation.

Userspace fix:
  - VIDIOC_QUERY_EXT_CTRL probe at first HEVC CreateContext sets
    driver_data->has_ext_sps_rps (true on VDPU381/383, false on legacy
    RK3399 — control unregistered there, so fresnel iter38 5/5 + iter39
    sub-profile paths are byte-identical to pre-iter2).
  - When set, h265_set_controls appends EXT_SPS_ST_RPS + _LT_RPS as
    calloc'd zero arrays, sized by VAAPI's count fields and capped at
    H.265 §7.4.3.2 spec maxima (ST 64, LT 32). Min 1 (kernel rejects 0).
  - Free post-S_EXT_CTRLS.
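
The sizing rule in this (later reverted) change can be sketched as a clamp, assuming only what the message states: a minimum of 1 element and caps of 64 (ST) and 32 (LT). The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ST_RPS 64u /* short-term cap quoted in the message */
#define SKETCH_MAX_LT_RPS 32u /* long-term cap quoted in the message */

/*
 * Number of array elements to submit for a dynamic-array control:
 * at least 1 (the message says the kernel rejects 0), at most spec_max.
 */
size_t sketch_rps_elems(unsigned requested, size_t spec_max)
{
    assert(spec_max >= 1);
    if (requested < 1)
        return 1;
    if (requested > spec_max)
        return spec_max;
    return requested;
}
```

The caller would then calloc sketch_rps_elems(...) zeroed elements and free them once S_EXT_CTRLS returns.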

Decode correctness scope:
VAAPI does NOT expose per-set st_ref_pic_set syntax elements
(delta_idx_minus1, delta_rps_sign, etc.) — confirmed in va_dec_hevc.h.
All-zero entries give empty inter-pred RPS per set, which is correct
for IDR-only streams and incorrect for streams with inter-pred RPS
dependence. iter2 acceptance: stop the OOPS. Decode-correctness for
inter-RPS content is a known follow-up requiring either bitstream-snoop
or SPS-passthrough via a new VAAPI extension.

Files:
  - include/hevc-ctrls.h: #ifndef-guarded fallback definitions for
    V4L2_CID_STATELESS_HEVC_EXT_SPS_{ST,LT}_RPS + structs (ampere host
    is on linux-api-headers 6.19-1; the new CIDs land in 7.0).
  - src/request.h: driver_data->has_ext_sps_rps (persists for driver
    lifetime; gated solely by HEVC code path so cross-codec leakage
    impossible).
  - src/context.c: probe at HEVC CreateContext via v4l2_query_ext_ctrl.
  - src/h265.c: controls[5] → controls[7]; #include <hevc-ctrls.h>
    (replaces <linux/v4l2-controls.h>) for forward UAPI compatibility.

Compile-tested on boltzmann (aarch64 native, gcc 15.2.1): clean .so,
0 new warnings. Fresnel cross-device safety: legacy RK3399 rkvdec_ctrl
table omits the CIDs; probe returns false; new code path never executes.

iter39 sub-profile work (commits 662f887 + 8746690) is preserved
in-tree; iter2 is a forward-compatible additive change.

Refs:
  marfrit/libva-v4l2-request-fourier#3
  ampere-fourier/iter1_close.md HEVC blocker
  ampere-fourier/iter2_phase0_findings.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:34:58 +00:00
claude-noether 8746690739 iter39: add NV15 → P010 unpack self-test (tests/test_nv15_unpack.c)
Pure-C unit test for nv15_unpack_plane_to_p010, independent of any V4L2
hardware. Verifies bit layout against the spec at
Documentation/userspace-api/media/v4l/pixfmt-nv15.rst by packing known
10-bit pixel values, running the unpack, and asserting P010 output
matches pixel<<6.

Coverage:
  - zero, all-max
  - 8 known position/spread vectors
  - widths {1, 2, 3, 7, 8} including remainder paths
  - multi-row with stride padding
  - chroma-shape (half-height)

Build + run:
  cc -Wall -Werror -O2 -o test_nv15_unpack \
     tests/test_nv15_unpack.c src/nv15.c
  ./test_nv15_unpack

Confirmed PASS on noether (x86_64 native). Catches the highest-risk
class of regression in iter39 — silent bit-shift errors in the unpack —
without requiring fresnel hardware.
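
For illustration, a minimal round-trip sketch of the property this test asserts (P010 output equals pixel<<6). It assumes the LSB-first, unpadded 10-bit packing the messages attribute to NV15. The function names are hypothetical and this is not the code in src/nv15.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack width 10-bit samples LSB-first; dst needs (width*10+7)/8 bytes. */
void sketch_pack10_row(const uint16_t *src, uint8_t *dst, size_t width)
{
    size_t nbytes = (width * 10 + 7) / 8;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = 0;
    for (size_t x = 0; x < width; x++) {
        size_t bit = x * 10;
        unsigned v = src[x] & 0x3ffu;

        /* A 10-bit sample always straddles exactly two bytes. */
        dst[bit / 8] |= (uint8_t)(v << (bit % 8));
        dst[bit / 8 + 1] |= (uint8_t)(v >> (8 - bit % 8));
    }
}

/* Unpack one packed row into P010 samples (value in the top 10 bits). */
void sketch_unpack10_row_to_p010(const uint8_t *src, uint16_t *dst,
                                 size_t width)
{
    assert(src != NULL && dst != NULL);
    for (size_t x = 0; x < width; x++) {
        size_t bit = x * 10;
        unsigned lo = src[bit / 8];
        unsigned hi = src[bit / 8 + 1];
        unsigned v = ((lo >> (bit % 8)) | (hi << (8 - bit % 8))) & 0x3ffu;

        dst[x] = (uint16_t)(v << 6);
    }
}
```

Width 7 is not a multiple of 4, so it exercises a remainder path like the ones listed under Coverage.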

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:22:14 +00:00
claude-noether 662f8874ba iter39 α-31: H264 Hi10P + HEVC Main10 sub-profile support (10-bit, rkvdec NV15)
Adds VAProfileH264High10 and VAProfileHEVCMain10 to the libva-v4l2-request
backend. RK3399 rkvdec emits decoded frames as V4L2_PIX_FMT_NV15 (4 × 10-bit
values packed in 5 bytes per element); VAAPI consumers receive standard
VA_FOURCC_P010 via a new userspace unpack in copy_surface_to_image.

VP9 Profile 2 explicitly NOT added — RK3399 rkvdec kernel ctrl table
caps at V4L2_MPEG_VIDEO_VP9_PROFILE_0 (rkvdec.c::rkvdec_vp9_ctrl_descs).

Touchpoints (per Phase 5 sonnet-architect review amendments):
  - include/drm_fourcc.h: define DRM_FORMAT_NV15 (vendored libdrm lacks it)
  - src/nv15.{c,h}: NV15 → P010 plane unpack (LSB-first, per
    Documentation/userspace-api/media/v4l/pixfmt-nv15.rst)
  - src/video.c: NV15 entry in formats[] (else NULL-deref on video_format_find)
  - src/codec.c: pixelformat_for_profile cases for Hi10P + Main10
  - src/config.c: enumeration, validation, entrypoints, RT_FORMAT_YUV420_10
    advertisement for 10-bit profiles
  - src/context.c: per-profile CAPTURE pix_fmt (NV12/NV15), 10-bit synthetic
    SPS (bit_depth_luma_minus8=2), video_format invalidation on bit-depth
    transition (sibling to iter38 device-switch invalidation), is_10bit flag
  - src/surface.c: RT_FORMAT_YUV420_10 admission, NV15 fourcc on PRIME export
  - src/image.c: P010 reporting in DeriveImage + QueryImageFormats,
    P010-aware sizing in CreateImage, NV15 → P010 unpack call in
    copy_surface_to_image (gated on is_10bit + image.format.fourcc == P010)
  - src/picture.c: 4 switch blocks route Hi10P/Main10 to existing H264/HEVC
    per-codec paths
  - src/request.h: MAX_PROFILES bump 11 → 13, driver_data->is_10bit flag

Scope: COPY path (vaGetImage / vaDeriveImage) only. Standard ffmpeg-vaapi
hwdownload, mpv vaapi-copy, and any consumer using vaGetImage works
end-to-end. PRIME-path consumers that only know NV12/P010 must use the
COPY path; PRIME consumers aware of NV15 (panfrost-Mesa et al.) get the
correct fourcc on RequestExportSurfaceHandle. PRIME-side P010 emission is
follow-up scope (would need DRM_FORMAT_P010 + per-plane unpack into a
GPU-accessible buffer).

Compile-tested on boltzmann (aarch64 native, gcc 15.2.1, libva 1.23.0,
libdrm 2.4.133): clean build, .so produced, 0 new warnings.

Phase 0/2 evidence: linux-mmind-v7.0 drivers/media/platform/rockchip/rkvdec.
rkvdec_h264_decoded_fmts[] and rkvdec_hevc_decoded_fmts[] both list NV15;
ctrl tables cap at HEVC MAIN_10 and H264 HIGH_422_INTRA (Hi10P < cap, not
in menu_skip_mask). image_fmt resolution (rkvdec-h264-common.c:196,
rkvdec-hevc-common.c:467) dispatches on bit_depth_luma_minus8 only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:15:16 +00:00
claude-noether 7ac934e0c5 iter38b: bounds check uses MAX_PROFILES (11), not MAX_CONFIG_ATTRIBUTES (10)
Latent bug surfaced by iter38 multi-device probe. profiles[] array
in RequestQueryConfigProfiles is sized by V4L2_REQUEST_MAX_PROFILES
(set as context->max_profiles=11 in VA_DRIVER_INIT), but the bounds
checks used V4L2_REQUEST_MAX_CONFIG_ATTRIBUTES (10). Pre-iter38 only
a single device's profiles were enumerated, total ≤9, so the off-by-
one never bit. With iter38's rkvdec+hantro union (10 profiles total
across MPEG2/H264/HEVC/VP8/VP9), the last enumerator (VP9) hit
index=9 with the check 'index < 10-1 = 9' → skipped.
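
A sketch of the off-by-one, assuming the guard has the 'index < limit - 1' shape quoted above. The constants mirror the two values in the subject line; all names here are hypothetical.

```c
#include <assert.h>

#define SKETCH_MAX_PROFILES 11
#define SKETCH_MAX_CONFIG_ATTRIBUTES 10

/*
 * Try to append `wanted` profiles using the guard 'index < limit - 1'.
 * Returns how many were stored; profiles[] must hold limit - 1 entries.
 */
int sketch_fill_profiles(int *profiles, int limit, int wanted)
{
    int index = 0;

    assert(profiles != 0 && limit >= 1);
    for (int p = 0; p < wanted; p++) {
        if (index < limit - 1)
            profiles[index++] = p;
        /* else: silently skipped, which is how VP9 went missing */
    }
    return index;
}
```

With 10 profiles wanted, the limit of 10 stores 9 and drops the last, while the limit of 11 stores all 10.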
2026-05-14 18:55:27 +00:00
claude-noether c56a77bd4c iter38: multi-device probe — single libva session serves all 5 codecs
Probe BOTH rkvdec and hantro-vpu at VA_DRIVER_INIT and keep their
{video,media}_fd pairs in driver_data. RequestQueryConfigProfiles
enumerates the union of supported profiles from all open fds.
RequestCreateConfig retargets driver_data->{video,media}_fd to the
device that serves the requested profile; if a switch is needed
(active fd is wrong), tears down output_pool, capture_pool, video_format
cache, and fmt_valid so the next RequestCreateContext rebuilds them
on the new device.

Profile→device map (RK3399-shaped):
  H264 / HEVC / VP9  → rkvdec
  MPEG-2 / VP8       → hantro-vpu
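
The map above can be sketched as a lookup plus a switch test. The enums and function names are hypothetical stand-ins, not the backend's VAProfile handling.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_codec {
    SKETCH_MPEG2, SKETCH_H264, SKETCH_HEVC, SKETCH_VP8, SKETCH_VP9
};
enum sketch_device { SKETCH_DEV_RKVDEC, SKETCH_DEV_HANTRO };

/* RK3399-shaped map: H264/HEVC/VP9 on rkvdec, MPEG-2/VP8 on hantro-vpu. */
enum sketch_device sketch_device_for(enum sketch_codec codec)
{
    switch (codec) {
    case SKETCH_H264:
    case SKETCH_HEVC:
    case SKETCH_VP9:
        return SKETCH_DEV_RKVDEC;
    case SKETCH_MPEG2:
    case SKETCH_VP8:
        return SKETCH_DEV_HANTRO;
    }
    assert(!"unknown codec");
    return SKETCH_DEV_RKVDEC;
}

/* A switch (and pool teardown) is needed only when the device changes. */
bool sketch_needs_switch(enum sketch_device active, enum sketch_codec codec)
{
    return sketch_device_for(codec) != active;
}
```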

Honours LIBVA_V4L2_REQUEST_VIDEO_PATH / MEDIA_PATH explicit overrides
(skips alt-probe when those are set).

Closes the 'libva multi-device probe' open item from iter36/iter37
campaign-close.
2026-05-14 18:52:12 +00:00
claude-noether 25d3e5f06f iter37: revert α-26 — decode_params.short_term_ref_pic_set_size back to 0
α-26 (iter26) wrote VAAPI's picture->st_rps_bits to the V4L2 decode_params
field of the same name based on field-name match. Per V4L2 spec, this field
is the bit-count of st_ref_pic_set() *in the SPS* — VAAPI doesn't expose
that. The slice-header bit-count (which IS what VAAPI's st_rps_bits provides)
belongs in slice_params->short_term_ref_pic_set_size (handled correctly in
α-29).

rkvdec doesn't read decode_params->short_term_ref_pic_set_size, so the
misroute was harmless but stale. This revert restores spec-correct semantics
(0 when SPS bit-count is unknown).

Cosmetic cleanup; no functional change.
2026-05-14 18:38:26 +00:00
claude-noether 7db15a5685 iter36: remove env-gated DIAG probes (iter29/30/33/35)
Cleans up the campaign's exploratory env-gated dumps now that all
bugs are fixed:
- iter29 LIBVA_HEVC_DUMP_SLICE_TAIL (h265.c) — refuted 40-byte inflation theory
- iter30 LIBVA_TS_SCALE (picture.c) — refuted timestamp magnitude theory
- iter33 LIBVA_VP8_DUMP_FRAME (vp8.c) — led to α-30 fix
- iter35 LIBVA_MPEG2_DUMP_FRAME (mpeg2.c) — confirmed MPEG-2 ctrls correct

Total: -131 lines / +7 lines (α-7 comment refresh).

Preexisting framework env knobs retained:
- LIBVA_V4L2_DUMP_OUTPUT (picture.c α-16)
- LIBVA_V4L2_DUMP_CAPTURE (surface.c)
- LIBVA_V4L2_ZERO_CAPTURE (picture.c)
- LIBVA_V4L2_REQUEST_VIDEO_PATH / MEDIA_PATH / NO_AUTODETECT (request.c)

The 3 load-bearing fixes remain unchanged:
α-25 (rkvdec image_fmt pre-seed, src/context.c)
α-29 (slice_params.short_term_ref_pic_set_size, src/h265.c)
α-30 (VP8 OUTPUT header prepend, src/picture.c)
2026-05-14 18:12:55 +00:00
claude-noether 48fd0288c3 iter35 DIAG: env-gated dump of v4l2_ctrl_mpeg2_* contents
2026-05-14 17:55:09 +00:00
claude-noether 7e0848d7d2 iter33 α-30: prepend VP8 uncompressed frame header to OUTPUT buffer
ROOT CAUSE FIX for VP8 libva decode garbage output.

ffmpeg-vaapi's vaapi_vp8.c:191-192 STRIPS the VP8 uncompressed
header (3 bytes for interframe, 10 bytes for keyframe) before
submitting the slice data via VAAPI. ffmpeg-v4l2request (kdirect)
KEEPS the header in its OUTPUT buffer.

Hantro's rockchip_vpu2_vp8_dec_run (rockchip_vpu2_hw_vp8_dec.c:349)
hard-codes 'first_part_offset = V4L2_VP8_FRAME_IS_KEY_FRAME(hdr) ? 10 : 3'
as the byte offset into OUTPUT where the first compressed partition
starts. It uses this offset for:
  - mb_offset_bits = first_part_offset * 8 + first_part_header_bits + 8
  - dct_part_offset = first_part_offset + first_part_size

Without the header, every offset is wrong, the entropy decoder
spins on the wrong bytes, and every frame decodes to garbage.

Fix: in codec_store_buffer for VAProfileVP8Version0_3, prepend
header_size bytes (10 keyframe / 3 interframe) of zeros to OUTPUT
before the slice data memcpy. Hantro skips these bytes for actual
parsing (uses ctrl-struct values instead), so zero-fill is fine.

Empirical: iter33 kernel printk in vpu2_vp8_dec_run dumped the
v4l2_ctrl_vp8_frame struct for libva vs kdirect and confirmed
byte-identical control fields. Only the OUTPUT buffer bytes
differed, traced to ffmpeg-vaapi's header stripping.
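
A minimal sketch of the prepend described in the fix, assuming only the 10/3-byte header sizes and the zero-fill stated above. The function names and the explicit capacity check are illustrative, not the code in codec_store_buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Uncompressed VP8 header length: 10 bytes keyframe, 3 bytes interframe. */
size_t sketch_vp8_header_size(bool key_frame)
{
    return key_frame ? 10 : 3;
}

/*
 * Write a zero-filled header followed by the slice data into out.
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t sketch_vp8_store(uint8_t *out, size_t cap, const uint8_t *slice,
                        size_t slice_len, bool key_frame)
{
    size_t hdr = sketch_vp8_header_size(key_frame);

    assert(out != NULL && slice != NULL);
    if (slice_len > cap || hdr > cap - slice_len)
        return 0;
    memset(out, 0, hdr);                 /* placeholder header bytes */
    memcpy(out + hdr, slice, slice_len); /* first partition starts here */
    return hdr + slice_len;
}
```

The slice data then begins at the same byte offset that first_part_offset assumes.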
2026-05-14 16:35:41 +00:00
claude-noether bf3e3d8587 iter33: extend VP8 DIAG to dump VAAPI probability struct directly
2026-05-14 16:15:00 +00:00
claude-noether 4b3c21b105 iter33 DIAG: env-gated dump of v4l2_ctrl_vp8_frame contents
LIBVA_VP8_DUMP_FRAME=1 prints the v4l2_ctrl_vp8_frame struct fields
to stderr before VIDIOC_S_EXT_CTRLS. Goal: diff libva-side struct
against expected kdirect-side values for VP8 frame-2+ divergence
(libva produces non-trivial but wrong output; kdirect VP8 byte-equal
to SW). Env-gated, no behavior change otherwise.
2026-05-14 16:13:11 +00:00
claude-noether 23eb1bd5ae iter31 α-29: slice_params.short_term_ref_pic_set_size = picture->st_rps_bits
ROOT CAUSE FIX for HEVC frame 2+ divergence (Bug 5 remainder).

rkvdec's assemble_sw_rps (rkvdec-hevc.c:386-389) uses
sl_params->short_term_ref_pic_set_size to compute the bit offset where
long-term RPS data starts in the slice header. When zero, it falls back
to fls(num_short_term_ref_pic_sets - 1) — wrong when num=1 (BBB's case).

α-26 was misdirected: it set decode_params->short_term_ref_pic_set_size =
st_rps_bits, but rkvdec doesn't read that field. The correct consumer is
slice_params, per the V4L2 spec and the rkvdec source.

VAAPI's picture->st_rps_bits is documented as: 'number of bits that structure
short_term_ref_pic_set(num_short_term_ref_pic_sets) takes in slice segment
header when short_term_ref_pic_set_sps_flag equals 0' — exactly what
sl_params->short_term_ref_pic_set_size means.

Frame 1 (IDR) unaffected (V4L2 rkvdec gates on !IDR_PIC flag).
Frames 2+: bit offset for long-term RPS now correct, slice header parsing
no longer falls off the edge of the entropy bitstream.
2026-05-14 15:28:44 +00:00
claude-noether 68dbbdd4b7 iter30 DIAG: LIBVA_TS_SCALE env-gated timestamp multiplier
Default behavior unchanged: counter*1000ns same as before.
With LIBVA_TS_SCALE=N, multiplies the ns timestamp by N. Lets us
sweep timestamp magnitude to test whether small-ts collides with
stale CAPTURE entries in vb2_find_buffer for HEVC frame 2+ bug.

Also keeps iter29 slice-tail probe from previous commit.
2026-05-14 15:16:40 +00:00
claude-noether 0eca3ffc6b iter29 DIAG: dump trailing 80 bytes of HEVC slice_data per slice
Env-gated via LIBVA_HEVC_DUMP_SLICE_TAIL=1. Goal: characterise the
40-byte inflation in libva's slice_data buffer vs ffmpeg-v4l2request
(see iter27/28 close — HEVC frame 2+ divergence at byte 1382401).

Dumps per slice: nal_unit_type, slice_data_size, slice_data_byte_offset,
and the last 80 bytes of source_data for that slice. Lets us see if the
trailing 40 bytes are (a) real entropy, (b) trailing zeros, (c) a
next-NAL start code prefix, or (d) random memory.
2026-05-14 15:00:54 +00:00
claude-noether 6646b1635e Revert iter28b DIAG: trim=40 universal-trim breaks IDR frame 1
iter28b tested LIBVA_HEVC_TRIM_TRAILING=40 on HEVC. Result: hash
differed at byte 899745 (inside frame 1, NOT just frame 2 boundary at
byte 1382401). Trimming 40 bytes off the IDR slice (96890→96850)
corrupted frame 1. The 40-byte inflation is not uniform per slice;
requires dynamic detection (e.g., scan for rbsp_stop_one_bit) or
per-slice-type logic.
2026-05-14 14:42:24 +00:00
claude-noether c5557882aa iter28b DIAG: env-gated trim of HEVC slice_data trailing N bytes
2026-05-14 14:41:34 +00:00
claude-noether cd286d9bf0 iter28 α-28: bit_size = (slice_data_size - slice_data_byte_offset) * 8 for HEVC
VAAPI's slice_data_size includes NAL+slice header bytes that precede the
slice payload. rkvdec_hevc expects bit_size to cover the slice payload
(starting at data_byte_offset). Setting bit_size = slice_data_size * 8
made rkvdec read past slice payload → wrong entropy state → frame 2+
garbage despite correct ctx->image_fmt (iter25) and decode_params
(iter26).

Empirical match: with formula (slice_data_size - slice_data_byte_offset)
* 8, libva produces bit_size=44096 for BBB frame 2 matching kdirect's
44096 exactly per iter27 dmesg printk.
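
The formula can be written as a small helper. This is a sketch with a hypothetical function name; the two parameters correspond to the VAAPI slice fields named above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bits of slice payload: total slice bytes minus the header bytes that
 * precede the payload, converted to bits.
 */
uint32_t sketch_hevc_bit_size(uint32_t slice_data_size,
                              uint32_t slice_data_byte_offset)
{
    assert(slice_data_byte_offset <= slice_data_size);
    return (slice_data_size - slice_data_byte_offset) * 8;
}
```

For example, a payload of 5512 bytes gives the 44096 bits quoted above. The 5600/88 split used to check this is made up, since the message records only the total.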

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:24:40 +00:00
claude-noether 754be1de7e iter27 diag: env-gated VAAPI slice fields dump
2026-05-14 10:23:43 +00:00
claude-noether c9bfa21425 iter27: remove request_log diag (VAAPI reports 0; rkvdec doesn't use field)
2026-05-14 10:23:23 +00:00
claude-noether 719d813f4a iter27 α-27: populate slice_params.num_entry_point_offsets from VAAPI
BBB HEVC uses WPP (entropy_coding_sync_enabled_flag=1); slice header
contains entry_point_offset_minus1 syntax elements. libva was setting
num_entry_point_offsets=0 with the comment 'iter2 doesn't do tiles',
but WPP uses the same mechanism — rkvdec miscounted the slice header
skip distance and read slice data starting at wrong byte for P/B
frames → frame 2+ decoded with garbage reference data.

iter27 kernel printk diff:
  libva frame 2 sl[8..11]  = 00 00 00 00 (=0)
  kdirect frame 2 sl[8..11] = 16 00 00 00 (=22)

VAAPI exposes VASliceParameterBufferHEVC.num_entry_point_offsets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:19:14 +00:00
claude-noether 66ef848b34 iter26 α-26: populate decode_params.short_term_ref_pic_set_size from VAAPI
VAPictureParameterBufferHEVC exposes st_rps_bits — the number of bits
the inline short_term_ref_pic_set syntax element takes in the slice
header. rkvdec's DPB resolution for P/B frames uses this to skip the
RPS data correctly; with size=0 it skips wrong bytes and reads wrong
references → frame 2+ visual divergence.

iter25 evidence: libva HEVC frame 1 byte-identical to kdirect, but
frame 2 diverges at the decode_params bytes 4-5 (libva 0x00 0x00,
kdirect 0x0a 0x00 = 10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:06:09 +00:00
claude-noether d062fec65d iter25 α-25 fix: add FRAME_MBS_ONLY to H264 dummy SPS
rkvdec_h264_validate_sps doubles height when FRAME_MBS_ONLY is unset
(field-to-frame). Dummy with 1080-height was failing validation as
2176 > 1080, returning -EINVAL silently (void-cast). Even though libva
ignores the result of v4l2_set_controls, the side effect was leaving
ctx->image_fmt at ANY → first per-frame H264_SPS still hit -EBUSY in
try_or_set_cluster → setup loop broke (Bug 4 unchanged).
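
A sketch of the height arithmetic behind the 2176 figure, assuming the validator derives height from pic_height_in_map_units_minus1 in 16-pixel units and doubles it when FRAME_MBS_ONLY is unset, as described above. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Frame height in pixels implied by an H.264 SPS. */
unsigned sketch_h264_sps_height(unsigned pic_height_in_map_units_minus1,
                                bool frame_mbs_only)
{
    assert(pic_height_in_map_units_minus1 < 4096); /* keep the math small */
    unsigned height = (pic_height_in_map_units_minus1 + 1) * 16;

    if (!frame_mbs_only)
        height *= 2; /* map units are field-sized, so the frame is 2x */
    return height;
}
```

For 68 map units (1088 pixels), leaving the flag unset yields 2176, which is the value that failed validation.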

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:04:16 +00:00
claude-noether db0b7f9892 iter25 α-25: inject synthetic SPS before cap_pool_init to seed image_fmt
Root cause for Bug 5 (HEVC libva = all-zero CAPTURE) and Bug 4 (H.264
libva = keyframe partial), localized via iter17→iter24 kernel-printk
chain:

  rkvdec_s_ctrl() for HEVC_SPS / H264_SPS calls get_image_fmt() and,
  if the resolved image_fmt differs from cached ctx->image_fmt (default
  RKVDEC_IMG_FMT_ANY at open), tries to reset the CAPTURE format.
  Format reset returns -EBUSY when vb2_is_busy(CAPTURE_queue) — any
  CAPTURE buffer allocated blocks the change.

  libva (iter5b-β) pre-allocates 24 CAPTURE buffers at CreateContext
  via cap_pool_init, BEFORE any per-frame S_EXT_CTRLS. First per-frame
  HEVC_SPS therefore fails with -EBUSY in try_or_set_cluster, breaks
  v4l2_ctrl_request_setup's outer loop, leaves all 5 staged HEVC
  compound controls at zero in ctx->ctrl_hdl. rkvdec_hevc_run reads
  zero (iter20 dmesg: sps[0..16]=00..00), hardware sees w=0 h=0,
  CAPTURE comes out all-zero (Bug 5).

Fix: BEFORE cap_pool_init, inject one S_EXT_CTRLS (no request, no
which) with a synthetic SPS containing the profile's known chroma +
bit_depth. CAPTURE queue is still empty at this point → vb2_is_busy
returns false → rkvdec_s_ctrl succeeds, ctx->image_fmt is updated to
the profile's image_fmt. From then on, per-frame SPS submissions with
matching chroma + bit_depth see image_fmt_changed=false → skip reset
→ commit succeeds.
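
The ordering argument above can be modelled in a few lines of C. This is only a toy model of the behaviour described in this message, not kernel code: `model_ctx`, `model_set_sps` and `model_cap_pool_init` are hypothetical names, and the enum values are placeholders for the real rkvdec image formats.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel state described above. */
enum img_fmt { IMG_FMT_ANY = 0, IMG_FMT_420_8BIT, IMG_FMT_420_10BIT };

struct model_ctx {
    enum img_fmt image_fmt;  /* cached format, IMG_FMT_ANY at open */
    int capture_buffers;     /* > 0 models vb2_is_busy(CAPTURE queue) */
};

/* Models the SPS branch of rkvdec_s_ctrl() as described in the message. */
static int model_set_sps(struct model_ctx *ctx, enum img_fmt resolved)
{
    if (resolved == ctx->image_fmt)
        return 0;               /* image_fmt unchanged: no reset needed */
    if (ctx->capture_buffers > 0)
        return -EBUSY;          /* reset blocked by allocated buffers */
    ctx->image_fmt = resolved;  /* queue idle: reset succeeds */
    return 0;
}

/* Models cap_pool_init: allocating buffers makes the queue busy. */
static void model_cap_pool_init(struct model_ctx *ctx, int count)
{
    ctx->capture_buffers = count;
}
```

In this model, calling `model_set_sps` before `model_cap_pool_init` seeds `image_fmt`, so a later SPS with the same format returns 0. Calling it only after the pool exists returns `-EBUSY`, which is the failure mode attributed to Bug 4 and Bug 5.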

VP9 / MPEG-2 / VP8 paths are not affected: VP9's rkvdec coded_fmt_desc
has no get_image_fmt op; MPEG-2 + VP8 route to hantro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:00:08 +00:00
claude-noether e109306fd4 Revert "iter21 α-24 (diag): G_EXT_CTRLS readback after S_EXT_CTRLS staging"
This reverts commit a9c897fa8b.
2026-05-14 09:18:00 +00:00
claude-noether a9c897fa8b iter21 α-24 (diag): G_EXT_CTRLS readback after S_EXT_CTRLS staging
Env-gated by LIBVA_V4L2_REQ_GETBACK. After v4l2_set_controls() against
the request_fd in h265_set_controls(), issue G_EXT_CTRLS with the same
request_fd targeting SPS and log first 16 bytes returned.

iter20 (kernel printk) found rkvdec sees all-zero ctx->ctrl_hdl SPS for
libva HEVC vs correct bytes for kdirect. The remaining branch is whether
req->p_new was ever staged with libva's payload, or whether
v4l2_ctrl_request_setup failed to apply it.

α-24 distinguishes the two:
  zero readback  -> staging failed in v4l2_s_ext_ctrls
  non-zero       -> apply failed in v4l2_ctrl_request_setup
  EACCES         -> kernel disallows req readback; need deeper printk

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:17:26 +00:00
claude-noether 415688dab0 Revert "iter19 α-23 TEST: skip media_request_reinit() in RequestSyncSurface"
This reverts commit aa82bffa35.
2026-05-14 09:03:37 +00:00
claude-noether aa82bffa35 iter19 α-23 TEST: skip media_request_reinit() in RequestSyncSurface
Tests mechanism 2 (REINIT clears controls between S_EXT_CTRLS and QUEUE).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:03:08 +00:00
claude-noether fc78ed4204 Revert "iter18 α-22 (diag): log S_EXT_CTRLS error_idx + request_fd"
This reverts commit 0dbe1732f6.
2026-05-14 09:00:51 +00:00
claude-noether afe632fe68 Revert "iter18 α-21 (TEST): heap-persist HEVC controls past IOC_QUEUE"
This reverts commit e63bfd4dde.
2026-05-14 09:00:23 +00:00
claude-noether 65722e74bd Revert "iter18 α-22 TEST: skip DECODE_PARAMS to isolate validation failure"
This reverts commit 5a6eb4351d.
2026-05-14 09:00:23 +00:00
claude-noether 5a6eb4351d iter18 α-22 TEST: skip DECODE_PARAMS to isolate validation failure
If removing DECODE_PARAMS from libva's S_EXT_CTRLS batch lets the other
4 controls stage, rkvdec_hevc_run printk will show w=1280 h=720 etc.
That confirms DECODE_PARAMS specifically is failing kernel validation
and rolling back the whole batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:59:33 +00:00
claude-noether 0dbe1732f6 iter18 α-22 (diag): log S_EXT_CTRLS error_idx + request_fd
Tests mechanism 5 (silent partial failure). If error_idx != count after
S_EXT_CTRLS, one of the per-request controls was rejected by the kernel
even though the ioctl returned 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:58:12 +00:00
claude-noether e63bfd4dde iter18 α-21 (TEST): heap-persist HEVC controls past IOC_QUEUE
Static storage for sps/pps/decode_params/scaling_matrix + no-free for
slice_params_array. Tests the kernel-defers-compound-copy hypothesis
from iter17 P7 finding.

If hashes change -> mechanism 3 confirmed; will refactor to per-surface
heap allocation.
If hashes unchanged -> mechanism 3 disproved; iter19 explores
mechanisms 1/2/5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:57:18 +00:00
claude-noether 111f8bac8f iter17 α-20 revert: pool size 11 inert; back to 24
Test discriminator: lowering MIN_CAP_POOL from 24 to 11 (matching
kdirect) did not change any of the 5-codec hashes. Pool depth is
not the cause of Bug 4/5/6. Revert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:39:50 +00:00
claude-noether 7ae85c54fc iter17 α-20 (test): MIN_CAP_POOL 24 -> 11 to match kdirect
Quick discriminator: if pool depth affects rkvdec's per-codec state
machine, reducing libva's pool to kdirect's ~11 might change Bug 4/5/6
hashes. Reverts to 24 if test shows no change or regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:39:14 +00:00
claude-noether 3760a70006 iter15 α-19: explicit VIDIOC_S_FMT on CAPTURE side for rkvdec correctness
Phase 3 ioctl-sequence diff: kdirect (ffmpeg-v4l2request) S_FMTs CAPTURE
with NV12 + dimensions after S_FMT OUTPUT, BEFORE CREATE_BUFS. libva's
old code only G_FMTs CAPTURE (per iter5b-β's hantro-targeted comment
that explicit S_FMT puts hantro into an inconsistent state).

For rkvdec on RK3399, without an explicit S_FMT CAPTURE the chosen
NV12 format is not committed properly. rkvdec HEVC + H.264 silently
produce zero / garbage CAPTURE output — Bug 4 + Bug 5 root cause.

Now: S_FMT OUTPUT → S_FMT CAPTURE → G_FMT CAPTURE. Failure of S_FMT
CAPTURE is non-fatal: fall back to G_FMT (preserves the iter5b-β
hantro path).

Future iter to gate this on driver_kind explicitly per
feedback_per_driver_kludge_gating.md. For now, always-on is safe
because kdirect proves S_FMT CAPTURE works on both rkvdec AND hantro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:33:18 +00:00
claude-noether 522fb6daa5 iter14 α-16: env-gated OUTPUT bitstream byte dump pre-QBUF
LIBVA_V4L2_DUMP_OUTPUT=<dir> writes source_data[0..slices_size] to
<dir>/output_p<profile>_s<surface>_t<ts>.bin immediately before
v4l2_queue_buffer OUTPUT. Discriminates whether libva writes the
correct H.264/HEVC bitstream bytes (same as kdirect/input file).

Off by default. Wrapped in static-cache env check.
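
A minimal sketch of what a static-cache env check plus the dump filename could look like. The helper names `dump_output_dir` and `dump_output_path` are hypothetical, as are the integer types chosen for profile, surface and timestamp; only the environment variable and the filename pattern come from this message.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns the dump directory, or NULL when dumping is disabled.
 * getenv() is consulted once; later calls reuse the cached answer.
 * Assumes the environment is not modified after the first call. */
static const char *dump_output_dir(void)
{
    static int checked;
    static const char *dir;

    if (!checked) {
        const char *v = getenv("LIBVA_V4L2_DUMP_OUTPUT");
        dir = (v && v[0] != '\0') ? v : NULL;
        checked = 1;
    }
    return dir;
}

/* Builds <dir>/output_p<profile>_s<surface>_t<ts>.bin into buf.
 * Returns 0 on success, -1 if dumping is off or buf is too small. */
static int dump_output_path(char *buf, size_t len, int profile,
                            unsigned surface, unsigned long long ts)
{
    const char *dir = dump_output_dir();
    int n;

    if (!dir)
        return -1;
    n = snprintf(buf, len, "%s/output_p%d_s%u_t%llu.bin",
                 dir, profile, surface, ts);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Caching the lookup keeps the per-frame cost to one branch when the variable is unset, which is the point of gating a diagnostic this way.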

iter11+12+13 confirmed Bug 4/5 are not in S_EXT_CTRLS payload, not
in kernel substrate (RFC v2), not in CPU cache visibility (α-17 sync
ioctl works but inert). The remaining libva-side surface is the
actual bitstream bytes the kernel reads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:19:29 +00:00
claude-noether ca4dd88007 iter13 α-17: explicit DMA_BUF_IOCTL_SYNC around copy_surface_to_image
V4L2 CAPTURE buffers are V4L2_MEMORY_MMAP and mapped cached. DMA
writes are not guaranteed visible through the CPU cache; reading
destination_data[] without DMA_BUF_IOCTL_SYNC(START|READ) returns
stale data on RK3399 — observed as Bug 4 (H.264 partial-fill) and
Bug 5 (HEVC all-zero) when libva goes through cached-mmap readback
while kdirect ffmpeg-v4l2request + DRM_PRIME-mmap reads cleanly via
implicit sync.

Per Tomasz Figa's 2024 linaro-mm-sig discussion +
feedback_rfc_v2_vb2_dma_resv_scope.md: userspace responsibility for
cache sync on cached-mmap'd V4L2 buffers. RFC v2 fence work doesn't
engage this path; this ioctl pair does.

Just-in-time EXPBUF + SYNC + close per copy. Per-call cost is one
ioctl pair + one fd lifecycle per plane. Could cache the EXPBUF fd
on cap_pool slot but doing it transient keeps lifecycle simple.
Closing the EXPBUF fd is a no-op on V4L2 buffer memory.

If EXPBUF or SYNC fails, fall through to existing memcpy path —
preserves pre-iter13 behavior on the error branch.
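
A hedged sketch of the sync bracket with the memcpy fallback described above. `synced_copy` is a hypothetical helper, not the function in this commit; it assumes the caller already holds a DMA-BUF fd (for example from VIDIOC_EXPBUF, omitted here) and a mapping of the source buffer.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Copies len bytes out of a cached mapping, bracketing the read with
 * DMA_BUF_IOCTL_SYNC. Returns 1 if the sync bracket was applied, or 0
 * if the START sync failed and a plain memcpy was done instead. */
static int synced_copy(int dmabuf_fd, void *dst, const void *src, size_t len)
{
    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ,
    };
    int synced = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) == 0;

    /* The copy happens on both branches, matching the fallback rule. */
    memcpy(dst, src, len);

    if (synced) {
        sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
        (void)ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
    }
    return synced;
}
```

The END call is issued only when START succeeded, so a failed sync never leaves an unbalanced bracket.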

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:06:10 +00:00
claude-noether 8e2c04f84b iter11 Phase 6 α-13 + α-14: HEVC SPS hygiene + IRAP/IDR flags fix Bug 5
Two fixes in one commit:

α-13 (h265_fill_sps): sps_max_num_reorder_pics now derived from
sps_max_dec_pic_buffering_minus1 (safe upper bound per H.265 §A.4.2)
instead of hardcoded 0. Phase 5b empirically showed rkvdec ignores
this field on RK3399, so this is wire-correctness hygiene only — matches
kdirect's payload pattern without behavior change.

α-14 (h265_set_controls): derive IRAP_PIC / IDR_PIC flags from the
first slice's nal_unit_type (parsed by h265_fill_slice_params into
slice_params_array[0].nal_unit_type). Without these flags rkvdec
doesn't recognise the keyframe boundary, treats IDR as inter without
references, and produces all-zero CAPTURE output — observed as Bug 5
on libva HEVC (06b2c5a0...). kdirect sets these from the bitstream
parse and decodes correctly (9340b832...).

Mapping:
  nal_unit_type 16..23 -> IRAP_PIC
  nal_unit_type 19 (IDR_W_RADL) or 20 (IDR_N_LP) -> IDR_PIC
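
The mapping above can be sketched as a small pure function. The `SKETCH_FLAG_*` constants and `hevc_flags_from_nal_type` are hypothetical stand-ins; real code would OR the V4L2 HEVC decode-params IRAP/IDR flag macros from the uAPI header into the decode_params flags field.

```c
#include <stdint.h>

/* Illustrative bit values only; not taken from the uAPI header. */
#define SKETCH_FLAG_IRAP_PIC 0x1u
#define SKETCH_FLAG_IDR_PIC  0x2u

/* Derives keyframe flags from the first slice's nal_unit_type. */
static uint32_t hevc_flags_from_nal_type(uint8_t nal_unit_type)
{
    uint32_t flags = 0;

    /* 16..23 covers the IRAP range (BLA, IDR, CRA, reserved IRAP). */
    if (nal_unit_type >= 16 && nal_unit_type <= 23)
        flags |= SKETCH_FLAG_IRAP_PIC;

    /* 19 = IDR_W_RADL, 20 = IDR_N_LP. */
    if (nal_unit_type == 19 || nal_unit_type == 20)
        flags |= SKETCH_FLAG_IDR_PIC;

    return flags;
}
```

Every IDR picture is also an IRAP picture, so types 19 and 20 get both bits, while a CRA picture (type 21) gets only the IRAP bit.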

HEVC-only (no risk to other codecs). h265_set_controls already
profile-gated via picture.c::codec_set_controls VAProfileHEVCMain
dispatch. Per feedback_unconditional_codec_state.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:01:22 +00:00
claude-noether e0be4e6992 iter9 Phase 6 α-7: monotonic per-context timestamp counter
Replace gettimeofday in RequestEndPicture with object_context-scoped
counter producing small us values (1, 2, 3, ...) so OUTPUT QBUF
timestamp and DPB.reference_ts match ffmpeg-v4l2request's pattern.

Phase 5 IMP-1: counter scoped to object_context (not driver_data) to
avoid multi-context collisions.
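
A minimal sketch of a per-context counter of this kind. `sketch_context`, `next_timestamp` and `timestamp_to_ns` are hypothetical names; the nanosecond conversion is written out locally and assumes the usual seconds-plus-microseconds arithmetic for turning a timeval into a reference timestamp.

```c
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical per-context state; each context counts independently. */
struct sketch_context {
    uint64_t ts_counter;
};

/* Returns the next small timestamp: 1 us, 2 us, 3 us, ... */
static struct timeval next_timestamp(struct sketch_context *ctx)
{
    uint64_t us = ++ctx->ts_counter;
    struct timeval tv;

    tv.tv_sec = (time_t)(us / 1000000u);
    tv.tv_usec = (suseconds_t)(us % 1000000u);
    return tv;
}

/* Converts a timeval to nanoseconds for use as a DPB reference key. */
static uint64_t timestamp_to_ns(struct timeval tv)
{
    return (uint64_t)tv.tv_sec * 1000000000ull + (uint64_t)tv.tv_usec * 1000ull;
}
```

Keeping the counter in the context struct means two decoders in one process each start from 1 without colliding.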

Empirical confirmation only — reviewer's CRIT-1 predicts this is
inert (VP9/MPEG-2 use same path and PASS). If α-7 produces the same
broken hash, the libva wire-byte search space is exhausted and iter10
must pivot to slice-data inspection or kernel investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:55:33 +00:00
claude-noether 02266841c6 iter8 Phase 6c α-2: pass H.264 POC values through unchanged for rkvdec
Bug 4 root cause per Phase 7 γ + Phase 4c strace re-decode:
libva strips FFmpeg's bit-16 POC sentinel; kdirect (ffmpeg-v4l2request)
does NOT strip. rkvdec writes top/bottom_field_order_cnt directly to
MMIO via writel_relaxed; with libva sending 0 instead of kdirect's
65536, hardware POC comparisons mismatch and motion compensation
silently corrupts (16x32 patch + nothing else).

The original h264_strip_ffmpeg_poc_sentinel was hantro-specific
(hantro_h264.c prepare_table fed unmasked tbl->poc[]). Hantro+H.264
is not exercised on RK3399; deferring per-driver gating to iter9 if
it surfaces.

Preserve VA_PICTURE_H264_INVALID → return 0 (correct zero-init for
empty DPB slots per Phase 5c amendment).
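
A sketch of the pass-through rule, assuming the helper only needs the picture flags and the field order count. `poc_for_rkvdec` and `SKETCH_PICTURE_INVALID` are hypothetical names; the constant mirrors the value of VA_PICTURE_H264_INVALID so the block stays self-contained.

```c
#include <stdint.h>

/* Mirrors VA_PICTURE_H264_INVALID (0x00000001) from va.h. */
#define SKETCH_PICTURE_INVALID 0x00000001u

/* Empty DPB slots report 0; every valid POC is passed through as-is,
 * including values that carry FFmpeg's bit-16 sentinel (e.g. 65536). */
static int32_t poc_for_rkvdec(uint32_t picture_flags, int32_t field_order_cnt)
{
    if (picture_flags & SKETCH_PICTURE_INVALID)
        return 0;
    return field_order_cnt;
}
```

With this rule a current-frame POC of 65536 reaches the driver as 65536 rather than 0, matching the value kdirect is reported to send.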

4 call sites unchanged (h264.c:309, 312, 462, 465 — for ref and current
frame TopFieldOrderCnt / BottomFieldOrderCnt). Both reference and
current-frame POCs now pass through unchanged so hardware compares
agree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:57:51 +00:00
claude-noether 6f4e5833f0 iter8 Phase 7 fix-fwd: picture.c needs <stdlib.h> for getenv
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:22:40 +00:00