Commit Graph

333 Commits

Author SHA1 Message Date
claude-noether c9bfa21425 iter27: remove request_log diag (VAAPI reports 0; rkvdec doesn't use field) 2026-05-14 10:23:23 +00:00
claude-noether 719d813f4a iter27 α-27: populate slice_params.num_entry_point_offsets from VAAPI
BBB HEVC uses WPP (entropy_coding_sync_enabled_flag=1); slice header
contains entry_point_offset_minus1 syntax elements. libva was setting
num_entry_point_offsets=0 with the comment 'iter2 doesn't do tiles',
but WPP uses the same mechanism — rkvdec miscounted the slice header
skip distance and read slice data starting at wrong byte for P/B
frames → frame 2+ decoded with garbage reference data.

iter27 kernel printk diff:
  libva frame 2 sl[8..11]  = 00 00 00 00 (=0)
  kdirect frame 2 sl[8..11] = 16 00 00 00 (=22)

VAAPI exposes VASliceParameterBufferHEVC.num_entry_point_offsets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:19:14 +00:00
claude-noether 66ef848b34 iter26 α-26: populate decode_params.short_term_ref_pic_set_size from VAAPI
VAPictureParameterBufferHEVC exposes st_rps_bits — the number of bits
the inline short_term_ref_pic_set syntax element takes in the slice
header. rkvdec's DPB resolution for P/B frames uses this to skip the
RPS data correctly; with size=0 it skips wrong bytes and reads wrong
references → frame 2+ visual divergence.

iter25 evidence: libva HEVC frame 1 byte-identical to kdirect, but
frame 2 diverges at the decode_params bytes 4-5 (libva 0x00 0x00,
kdirect 0x0a 0x00 = 10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:06:09 +00:00
claude-noether d062fec65d iter25 α-25 fix: add FRAME_MBS_ONLY to H264 dummy SPS
rkvdec_h264_validate_sps doubles the height when FRAME_MBS_ONLY is unset
(field-to-frame). The dummy SPS with height 1080 failed validation
(2176 > 1080), and the -EINVAL was dropped silently because the call is
void-cast. Although libva ignores the result of v4l2_set_controls, the
side effect was that ctx->image_fmt stayed at ANY → the first per-frame
H264_SPS still hit -EBUSY in try_or_set_cluster → the setup loop broke
(Bug 4 unchanged).
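
The height check described in this message can be sketched roughly as follows. This is a simplified model, not the kernel source: `struct sketch_sps`, `SKETCH_SPS_FLAG_FRAME_MBS_ONLY` and `validate_sps_dims` are hypothetical names, and the limits passed in are assumed to be the negotiated coded size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SPS flag discussed above. */
#define SKETCH_SPS_FLAG_FRAME_MBS_ONLY 0x1u

/* Hypothetical subset of the SPS fields that matter for the size check. */
struct sketch_sps {
    uint32_t flags;
    uint16_t pic_width_in_mbs_minus1;
    uint16_t pic_height_in_map_units_minus1;
};

/* Returns 0 if the SPS fits the negotiated size, -EINVAL otherwise. */
static int validate_sps_dims(const struct sketch_sps *sps,
                             unsigned int max_width, unsigned int max_height)
{
    unsigned int width, height;

    assert(sps != NULL);
    width = (sps->pic_width_in_mbs_minus1 + 1u) * 16u;
    height = (sps->pic_height_in_map_units_minus1 + 1u) * 16u;

    /* Without FRAME_MBS_ONLY a map unit is a field, so a frame is twice as tall. */
    if (!(sps->flags & SKETCH_SPS_FLAG_FRAME_MBS_ONLY))
        height *= 2u;

    if (width > max_width || height > max_height)
        return -EINVAL;
    return 0;
}
```

With 68 map-unit rows the unflagged SPS yields 2 * 1088 = 2176 lines, which is the figure quoted above; setting the flag brings it back to 1088.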

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:04:16 +00:00
claude-noether db0b7f9892 iter25 α-25: inject synthetic SPS before cap_pool_init to seed image_fmt
Root cause for Bug 5 (HEVC libva = all-zero CAPTURE) and Bug 4 (H.264
libva = keyframe partial), localized via iter17→iter24 kernel-printk
chain:

  rkvdec_s_ctrl() for HEVC_SPS / H264_SPS calls get_image_fmt() and,
  if the resolved image_fmt differs from cached ctx->image_fmt (default
  RKVDEC_IMG_FMT_ANY at open), tries to reset the CAPTURE format.
  Format reset returns -EBUSY when vb2_is_busy(CAPTURE_queue) — any
  CAPTURE buffer allocated blocks the change.

  libva (iter5b-β) pre-allocates 24 CAPTURE buffers at CreateContext
  via cap_pool_init, BEFORE any per-frame S_EXT_CTRLS. First per-frame
  HEVC_SPS therefore fails with -EBUSY in try_or_set_cluster, breaks
  v4l2_ctrl_request_setup's outer loop, leaves all 5 staged HEVC
  compound controls at zero in ctx->ctrl_hdl. rkvdec_hevc_run reads
  zero (iter20 dmesg: sps[0..16]=00..00), hardware sees w=0 h=0,
  CAPTURE comes out all-zero (Bug 5).

Fix: BEFORE cap_pool_init, inject one S_EXT_CTRLS (no request, no
which) with a synthetic SPS containing the profile's known chroma +
bit_depth. CAPTURE queue is still empty at this point → vb2_is_busy
returns false → rkvdec_s_ctrl succeeds, ctx->image_fmt is updated to
the profile's image_fmt. From then on, per-frame SPS submissions with
matching chroma + bit_depth see image_fmt_changed=false → skip reset
→ commit succeeds.

VP9 / MPEG-2 / VP8 paths are not affected: VP9's rkvdec coded_fmt_desc
has no get_image_fmt op; MPEG-2 + VP8 route to hantro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:00:08 +00:00
claude-noether e109306fd4 Revert "iter21 α-24 (diag): G_EXT_CTRLS readback after S_EXT_CTRLS staging"
This reverts commit a9c897fa8b.
2026-05-14 09:18:00 +00:00
claude-noether a9c897fa8b iter21 α-24 (diag): G_EXT_CTRLS readback after S_EXT_CTRLS staging
Env-gated by LIBVA_V4L2_REQ_GETBACK. After v4l2_set_controls() against
the request_fd in h265_set_controls(), issue G_EXT_CTRLS with the same
request_fd targeting SPS and log first 16 bytes returned.

iter20 (kernel printk) found rkvdec sees all-zero ctx->ctrl_hdl SPS for
libva HEVC vs correct bytes for kdirect. The remaining branch is whether
req->p_new was ever staged with libva's payload, or whether
v4l2_ctrl_request_setup failed to apply it.

α-24 distinguishes the two:
  zero readback  -> staging failed in v4l2_s_ext_ctrls
  non-zero       -> apply failed in v4l2_ctrl_request_setup
  EACCES         -> kernel disallows req readback; need deeper printk

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:17:26 +00:00
claude-noether 415688dab0 Revert "iter19 α-23 TEST: skip media_request_reinit() in RequestSyncSurface"
This reverts commit aa82bffa35.
2026-05-14 09:03:37 +00:00
claude-noether aa82bffa35 iter19 α-23 TEST: skip media_request_reinit() in RequestSyncSurface
Tests mechanism 2 (REINIT clears controls between S_EXT_CTRLS and QUEUE).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:03:08 +00:00
claude-noether fc78ed4204 Revert "iter18 α-22 (diag): log S_EXT_CTRLS error_idx + request_fd"
This reverts commit 0dbe1732f6.
2026-05-14 09:00:51 +00:00
claude-noether afe632fe68 Revert "iter18 α-21 (TEST): heap-persist HEVC controls past IOC_QUEUE"
This reverts commit e63bfd4dde.
2026-05-14 09:00:23 +00:00
claude-noether 65722e74bd Revert "iter18 α-22 TEST: skip DECODE_PARAMS to isolate validation failure"
This reverts commit 5a6eb4351d.
2026-05-14 09:00:23 +00:00
claude-noether 5a6eb4351d iter18 α-22 TEST: skip DECODE_PARAMS to isolate validation failure
If removing DECODE_PARAMS from libva's S_EXT_CTRLS batch lets the other
4 controls stage, rkvdec_hevc_run printk will show w=1280 h=720 etc.
That confirms DECODE_PARAMS specifically is failing kernel validation
and rolling back the whole batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:59:33 +00:00
claude-noether 0dbe1732f6 iter18 α-22 (diag): log S_EXT_CTRLS error_idx + request_fd
Tests mechanism 5 (silent partial failure). If error_idx != count after
S_EXT_CTRLS, one of the per-request controls was rejected by the kernel
even though the ioctl returned 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:58:12 +00:00
claude-noether e63bfd4dde iter18 α-21 (TEST): heap-persist HEVC controls past IOC_QUEUE
Static storage for sps/pps/decode_params/scaling_matrix + no-free for
slice_params_array. Tests the kernel-defers-compound-copy hypothesis
from iter17 P7 finding.

If hashes change -> mechanism 3 confirmed; will refactor to per-surface
heap allocation.
If hashes unchanged -> mechanism 3 disproved; iter19 explores
mechanisms 1/2/5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:57:18 +00:00
claude-noether 111f8bac8f iter17 α-20 revert: pool size 11 inert; back to 24
Test discriminator: lowering MIN_CAP_POOL from 24 to 11 (matching
kdirect) did not change any of the 5-codec hashes. Pool depth is
not the cause of Bug 4/5/6. Revert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:39:50 +00:00
claude-noether 7ae85c54fc iter17 α-20 (test): MIN_CAP_POOL 24 -> 11 to match kdirect
Quick discriminator: if pool depth affects rkvdec's per-codec state
machine, reducing libva's pool to kdirect's ~11 might change Bug 4/5/6
hashes. Reverts to 24 if test shows no change or regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:39:14 +00:00
claude-noether 3760a70006 iter15 α-19: explicit VIDIOC_S_FMT on CAPTURE side for rkvdec correctness
Phase 3 ioctl-sequence diff: kdirect (ffmpeg-v4l2request) S_FMTs CAPTURE
with NV12 + dimensions after S_FMT OUTPUT, BEFORE CREATE_BUFS. libva's
old code only G_FMTs CAPTURE (per iter5b-β's hantro-targeted comment
that explicit S_FMT puts hantro into an inconsistent state).

For rkvdec on RK3399, the chosen NV12 format is not committed properly
without an explicit S_FMT on CAPTURE. rkvdec HEVC + H.264 then silently
produce zero / garbage CAPTURE output: the Bug 4 + Bug 5 root cause.

Now: S_FMT OUTPUT → S_FMT CAPTURE → G_FMT CAPTURE. Failure of S_FMT
CAPTURE is non-fatal: fall back to G_FMT (preserves the iter5b-β
hantro path).

A future iteration should gate this on driver_kind explicitly, per
feedback_per_driver_kludge_gating.md. For now, always-on is safe
because kdirect proves S_FMT CAPTURE works on both rkvdec AND hantro.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:33:18 +00:00
claude-noether 522fb6daa5 iter14 α-16: env-gated OUTPUT bitstream byte dump pre-QBUF
LIBVA_V4L2_DUMP_OUTPUT=<dir> writes source_data[0..slices_size] to
<dir>/output_p<profile>_s<surface>_t<ts>.bin immediately before
v4l2_queue_buffer OUTPUT. Discriminates whether libva writes the
correct H.264/HEVC bitstream bytes (same as kdirect/input file).

Off by default. Wrapped in static-cache env check.

iter11+12+13 confirmed Bug 4/5 are not in S_EXT_CTRLS payload, not
in kernel substrate (RFC v2), not in CPU cache visibility (α-17 sync
ioctl works but inert). The remaining libva-side surface is the
actual bitstream bytes the kernel reads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:19:29 +00:00
claude-noether ca4dd88007 iter13 α-17: explicit DMA_BUF_IOCTL_SYNC around copy_surface_to_image
V4L2 CAPTURE buffers are V4L2_MEMORY_MMAP and mapped cached. Kernel
DMA writes are not guaranteed to be visible through the CPU cache;
reading destination_data[] without DMA_BUF_IOCTL_SYNC(START|READ)
returns stale data on RK3399. This is observed as Bug 4 (H.264
partial-fill) and Bug 5 (HEVC all-zero) when libva goes through
cached-mmap readback, while kdirect (ffmpeg-v4l2request +
DRM_PRIME-mmap) reads cleanly via implicit sync.

Per Tomasz Figa's 2024 linaro-mm-sig discussion +
feedback_rfc_v2_vb2_dma_resv_scope.md: userspace responsibility for
cache sync on cached-mmap'd V4L2 buffers. RFC v2 fence work doesn't
engage this path; this ioctl pair does.

Just-in-time EXPBUF + SYNC + close per copy. Per-call cost is one
ioctl pair + one fd lifecycle per plane. Could cache the EXPBUF fd
on cap_pool slot but doing it transient keeps lifecycle simple.
Closing the EXPBUF fd is a no-op on V4L2 buffer memory.

If EXPBUF or SYNC fails, fall through to existing memcpy path —
preserves pre-iter13 behavior on the error branch.
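
A minimal sketch of the sync bracket with the memcpy fallback, assuming a dma-buf fd has already been obtained (for example via VIDIOC_EXPBUF, which is not shown). `dmabuf_cpu_read_sync` and `copy_with_sync` are hypothetical helper names, not the functions in this commit; `DMA_BUF_IOCTL_SYNC` and `struct dma_buf_sync` come from `<linux/dma-buf.h>`.

```c
#include <assert.h>
#include <linux/dma-buf.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: open (begin != 0) or close a CPU read window. */
static int dmabuf_cpu_read_sync(int dmabuf_fd, int begin)
{
    struct dma_buf_sync sync;

    memset(&sync, 0, sizeof(sync));
    sync.flags = (begin ? DMA_BUF_SYNC_START : DMA_BUF_SYNC_END) |
                 DMA_BUF_SYNC_READ;
    return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}

/*
 * Copy from a cached mapping. Returns 1 if the copy was bracketed by
 * START/END sync calls, 0 if the START call failed and a plain memcpy
 * was used instead.
 */
static int copy_with_sync(int dmabuf_fd, void *dst, const void *src, size_t len)
{
    int synced;

    assert(dst != NULL && src != NULL);
    synced = dmabuf_cpu_read_sync(dmabuf_fd, 1) == 0;
    memcpy(dst, src, len);
    if (synced)
        (void)dmabuf_cpu_read_sync(dmabuf_fd, 0);
    return synced;
}
```

The copy happens on both branches, so a failed sync degrades to the earlier behavior rather than dropping the frame.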

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:06:10 +00:00
claude-noether 8e2c04f84b iter11 Phase 6 α-13 + α-14: HEVC SPS hygiene + IRAP/IDR flags fix Bug 5
Two fixes in one commit:

α-13 (h265_fill_sps): sps_max_num_reorder_pics now derived from
sps_max_dec_pic_buffering_minus1 (safe upper bound per H.265 §A.4.2)
instead of hardcoded 0. Phase 5b empirically showed rkvdec ignores
this field on RK3399, so this is wire-correctness hygiene only — matches
kdirect's payload pattern without behavior change.

α-14 (h265_set_controls): derive IRAP_PIC / IDR_PIC flags from the
first slice's nal_unit_type (parsed by h265_fill_slice_params into
slice_params_array[0].nal_unit_type). Without these flags rkvdec
doesn't recognise the keyframe boundary, treats IDR as inter without
references, and produces all-zero CAPTURE output — observed as Bug 5
on libva HEVC (06b2c5a0...). kdirect sets these from the bitstream
parse and decodes correctly (9340b832...).

Mapping:
  nal_unit_type 16..23 -> IRAP_PIC
  nal_unit_type 19 (IDR_W_RADL) or 20 (IDR_N_LP) -> IDR_PIC
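
The mapping above can be sketched as a small pure function. The flag constants here are local, hypothetical names standing in for the kernel's decode-params flags, and `hevc_decode_flags_from_nal` is an illustrative helper rather than the code in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local flag values for illustration only. */
#define SKETCH_FLAG_IRAP_PIC 0x1u
#define SKETCH_FLAG_IDR_PIC  0x2u

/* Derive picture flags from the first slice's nal_unit_type. */
static uint32_t hevc_decode_flags_from_nal(uint8_t nal_unit_type)
{
    uint32_t flags = 0;

    assert(nal_unit_type < 64); /* nal_unit_type is a 6-bit field */

    /* 16..23 is the IRAP range. */
    if (nal_unit_type >= 16 && nal_unit_type <= 23)
        flags |= SKETCH_FLAG_IRAP_PIC;

    /* 19 (IDR_W_RADL) and 20 (IDR_N_LP) are additionally IDR. */
    if (nal_unit_type == 19 || nal_unit_type == 20)
        flags |= SKETCH_FLAG_IDR_PIC;

    return flags;
}
```

An IDR picture therefore carries both flags, while other types in the 16..23 range carry only the IRAP flag.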

HEVC-only (no risk to other codecs). h265_set_controls already
profile-gated via picture.c::codec_set_controls VAProfileHEVCMain
dispatch. Per feedback_unconditional_codec_state.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:01:22 +00:00
claude-noether e0be4e6992 iter9 Phase 6 α-7: monotonic per-context timestamp counter
Replace gettimeofday in RequestEndPicture with object_context-scoped
counter producing small us values (1, 2, 3, ...) so OUTPUT QBUF
timestamp and DPB.reference_ts match ffmpeg-v4l2request's pattern.

Phase 5 IMP-1: counter scoped to object_context (not driver_data) to
avoid multi-context collisions.
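
A minimal sketch of a per-context counter, assuming the timestamp is handed to the queue call as a `struct timeval`. `struct sketch_context` and `context_next_timestamp` are hypothetical names; the real object_context has many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical per-context state: one counter per decode context. */
struct sketch_context {
    uint64_t last_ts_us;
};

/* Returns 1 us, 2 us, 3 us, ... independently for each context. */
static struct timeval context_next_timestamp(struct sketch_context *ctx)
{
    struct timeval tv;

    assert(ctx != NULL);
    ctx->last_ts_us++;
    tv.tv_sec = (time_t)(ctx->last_ts_us / 1000000u);
    tv.tv_usec = (suseconds_t)(ctx->last_ts_us % 1000000u);
    return tv;
}
```

Keeping the counter inside the context means two contexts can both issue timestamp 1 without sharing state.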

Empirical confirmation only — reviewer's CRIT-1 predicts this is
inert (VP9/MPEG-2 use same path and PASS). If α-7 produces the same
broken hash, the libva wire-byte search space is exhausted and iter10
must pivot to slice-data inspection or kernel investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:55:33 +00:00
claude-noether 02266841c6 iter8 Phase 6c α-2: pass H.264 POC values through unchanged for rkvdec
Bug 4 root cause per Phase 7 γ + Phase 4c strace re-decode:
libva strips FFmpeg's bit-16 POC sentinel; kdirect (ffmpeg-v4l2request)
does NOT strip. rkvdec writes top/bottom_field_order_cnt directly to
MMIO via writel_relaxed; with libva sending 0 instead of kdirect's
65536, hardware POC comparisons mismatch and motion compensation
silently corrupts (16x32 patch + nothing else).

The original h264_strip_ffmpeg_poc_sentinel was hantro-specific
(hantro_h264.c prepare_table fed unmasked tbl->poc[]). Hantro+H.264
is not exercised on RK3399; deferring per-driver gating to iter9 if
it surfaces.

Preserve VA_PICTURE_H264_INVALID → return 0 (correct zero-init for
empty DPB slots per Phase 5c amendment).

4 call sites unchanged (h264.c:309, 312, 462, 465 — for ref and current
frame TopFieldOrderCnt / BottomFieldOrderCnt). Both reference and
current-frame POCs now pass through unchanged so hardware compares
agree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:57:51 +00:00
claude-noether 6f4e5833f0 iter8 Phase 7 fix-fwd: picture.c needs <stdlib.h> for getenv
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:22:40 +00:00
claude-noether 66ecbef5c6 iter8 Phase 7 IMP-1 experiment: LIBVA_V4L2_ZERO_CAPTURE pre-zero gate
Env-gated CAPTURE pre-zero in BeginPicture after cap_pool_acquire. With
LIBVA_V4L2_ZERO_CAPTURE=1, the slot mmap region is memset 0 before the
kernel decode runs. Discriminates "kernel writes partial then aborts"
from "kernel writes nothing, buffer carries stale residue from prior
allocation."

Per Phase 5 IMP-1: the 16x32 patch in libva H.264 frame 1 may be either
real partial kernel write OR stale residue. This gate makes the next
sweep run deterministically zero the buffer; if the patch still appears
after, the kernel really writes it; if not, it was stale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:21:54 +00:00
claude-noether 7eae6eab46 iter8 Phase 6: γ env-gated CAPTURE buffer diagnostic dump
After RequestSyncSurface DQBUFs CAPTURE and marks slot DECODED, optionally
dump first/last 32 bytes of each destination_data plane plus a non-zero
count over a per-plane scan window (one MB row for plane 0, 1024 bytes
for chroma). Gated behind LIBVA_V4L2_DUMP_CAPTURE=1; default off, no
regression on existing flows.

Diagnostic for Bug 4 (H.264 partial-fill): distinguishes "kernel didn't
write" from "libva mis-reads" from "stale-residue" by inspecting the
post-DQBUF buffer state directly.

Phase 5 amendments applied:
- Amendment 1 (CRIT-1): snprintf-buffered hex line, one request_log call.
- Amendment 2 (CRIT-2): dump nested inside current_slot != NULL guard.
- Amendment 4 (IMP-3): placed between cap_pool_mark_decoded and
  status=VASurfaceDisplaying on happy path only.
- Amendment 5 (MIN-2): scan window = max(1024 chroma, bpl*16 luma).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:10:24 +00:00
claude-noether 6df2159dd3 fresnel-fourier iter7 Phase 7 fix-forward: data links connect pads not entities directly
Empirical Phase 7 verification revealed the algorithm bug: data links
in MEDIA_IOC_G_TOPOLOGY connect PAD IDs, not entity IDs directly.
My iter7 Phase 6 commit compared link source_id/sink_id against
the proc entity_id, never matched → io_entity_ids stayed empty →
interface lookup never fired → returns -1 → falls back to legacy
hardcoded path.

Topology dump on fresnel /dev/media0 (rkvdec) confirmed:
- Entity 3 (rkvdec-proc) has function=0x4008 (DECODER) ✓
- Data link src=16777218 sink=16777220 — these are PAD ids
  (0x01000002, 0x01000004), NOT entity 3.
- Interface link src=50331660 (interface) sink=1 (entity) — for
  interface links source/sink ARE entity IDs.

Fix: resolve pads → entities via the topo.pads[] array.
1. Collect pads belonging to proc entity (via pads[].entity_id).
2. For each data link touching those pads, the OTHER pad's
   entity_id is an IO neighbor.
3. Find interface link to those IO entities (unchanged from prev).

Also allocate topo.pads[] in the 2-call ioctl pattern.
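
Steps 1 and 2 of the fix can be sketched on plain arrays. The structs below are simplified, hypothetical mirrors of the topology records (only the fields used here), and `find_io_neighbors` is an illustrative name; the interface-link flag value is the `(1U<<28)` quoted elsewhere in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LNK_FL_INTERFACE_LINK (1u << 28)

/* Hypothetical, reduced versions of the topology pad and link records. */
struct sketch_pad  { uint32_t id; uint32_t entity_id; };
struct sketch_link { uint32_t source_id; uint32_t sink_id; uint32_t flags; };

/* Resolve a pad id to its owning entity; returns 0 on success, -1 if unknown. */
static int pad_to_entity(const struct sketch_pad *pads, size_t n_pads,
                         uint32_t pad_id, uint32_t *entity_id)
{
    for (size_t i = 0; i < n_pads; i++) {
        if (pads[i].id == pad_id) {
            *entity_id = pads[i].entity_id;
            return 0;
        }
    }
    return -1;
}

/* Collect entities joined to proc_entity by a data link; returns the count. */
static size_t find_io_neighbors(const struct sketch_pad *pads, size_t n_pads,
                                const struct sketch_link *links, size_t n_links,
                                uint32_t proc_entity,
                                uint32_t *out, size_t max_out)
{
    size_t n = 0;

    assert(pads != NULL && links != NULL && out != NULL);
    for (size_t i = 0; i < n_links; i++) {
        uint32_t src_ent, sink_ent, other;
        size_t k;

        /* Interface links carry entity ids, not pad ids; skip them here. */
        if (links[i].flags & SKETCH_LNK_FL_INTERFACE_LINK)
            continue;
        if (pad_to_entity(pads, n_pads, links[i].source_id, &src_ent) != 0 ||
            pad_to_entity(pads, n_pads, links[i].sink_id, &sink_ent) != 0)
            continue;

        /* Check both ends: the neighbor is whichever end is not the proc. */
        if (src_ent == proc_entity && sink_ent != proc_entity)
            other = sink_ent;
        else if (sink_ent == proc_entity && src_ent != proc_entity)
            other = src_ent;
        else
            continue;

        for (k = 0; k < n; k++)
            if (out[k] == other)
                break;
        if (k == n && n < max_out)
            out[n++] = other;
    }
    return n;
}
```

The resulting entity list is what step 3 would then match against the sink of each interface link.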

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-13 11:00:20 +00:00
claude-noether c106d95869 fresnel-fourier iter7 Phase 6: auto-detect with decoder-entity discrimination (B1a)
Refactor request.c::find_video_node_via_topology to
find_decoder_video_node_via_topology — walks media-topology entities
looking for MEDIA_ENT_F_PROC_VIDEO_DECODER function, then follows the
kernel's link graph (data link from proc to IO entity, interface link
from IO entity to V4L_VIDEO interface) to the correct /dev/videoN.

Two-pass find_codec_device: pass 1 accepts only "rkvdec" (multi-codec
decoder, 3 of 5 codecs); pass 2 accepts any known_decoder_drivers
entry. Pre-iter7 the walk picked whichever media device matched the
hantro-vpu driver name first — which on RK3399 could be the encoder
half of the same media device, surfacing as an empty profile list.

Phase 5 amendments incorporated:
- CRIT-1: use MEDIA_LNK_FL_INTERFACE_LINK (1U<<28) to discriminate
  interface vs data links.
- CRIT-2: check both source_id and sink_id of each link.
- IMP-3: 2-call MEDIA_IOC_G_TOPOLOGY pattern (allocate all 3 arrays
  before second call); pre-iter7 had a spurious memset + third call.

iter4-B1b (multi-decoder routing — open BOTH rkvdec AND hantro from
one backend instance) still deferred. Post-iter7 MPEG-2/VP8 (hantro)
still need LIBVA_V4L2_REQUEST_VIDEO_PATH override.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-13 09:38:54 +00:00
claude-noether 70196f8065 fresnel-fourier iter5b-β Phase 7 fix-forward commit D: destination_* for vaapi-copy late-surface flow
Phase 7 empirical: all 5 libva codecs returned all-zero because
CreateContext's surfaces_ids[] walk was a no-op for ffmpeg-vaapi-copy
which passes surfaces_count=0 to vaCreateContext (per the iter6
comment at context.c:262). Surfaces existed in driver_data's
surface_heap but weren't in the param array → destination_* stayed
at the zero initialization from CreateSurfaces2 β → BeginPicture's
surface_bind_slot saw destination_planes_count=0 → no data
assignment → copy_surface_to_image read all-zero.

Fix: cache the format-uniform CAPTURE geometry in driver_data
(fmt_valid, fmt_planes_count, fmt_buffers_count, fmt_format_height,
fmt_sizes[], fmt_bytesperlines[]). Populate at CreateContext after
v4l2_get_format(CAPTURE). Walk surface_heap (not just surfaces_ids[])
to fill every existing surface. Add lazy-fill in CreateSurfaces2 for
surfaces created AFTER CreateContext. Invalidate cache in
DestroyContext.

New helper: surface_fill_format_uniform(driver_data, surface_object).
Idempotent on destination_planes_count != 0.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 18:52:33 +00:00
claude-noether 7055b14f5e fresnel-fourier iter5b-β Phase 6 commit C: β refactor — OUTPUT lifecycle to CreateContext + CRIT-1 + CRIT-2
Strip OUTPUT-side V4L2 device-format lifecycle out of
RequestCreateSurfaces2 entirely. Move S_FMT(OUTPUT), CAPTURE-format
probe, cap_pool_init, per-surface destination_* fill into
RequestCreateContext where config_id (and therefore the bound
VAProfile) is known via config_object->pixelformat (wired by
commit B). The α' multi-CreateSurfaces2-mid-stream failure mode
disappears because β has no in-CreateSurfaces2 teardown branch;
each context cycle does its own setup, DestroyContext handles
teardown.

Phase 5 v2 review amendments:
- CRIT-1: removed video_format==NULL early-return at context.c:64-66
  (would have rejected every first β CreateContext).
- CRIT-2: added request_pool_destroy() to DestroyContext before
  REQBUFS(0). Pre-β only surface.c's resolution-change branch
  called request_pool_destroy; β strips that, so DestroyContext
  becomes the sole per-session teardown site.
- IMP-1: probe CAPTURE format first to derive output_type from
  video_format->v4l2_mplane (eliminates the hardcoded mplane=true
  hack from the Phase 4 v2 plan).
- IMP-2: surface_reset_format_cache() deleted (function + declaration
  in surface.h + call in DestroyContext + last_output_{width,height}
  fields in request.h). All dead under β.

CreateSurfaces2 now ~50 LOC (was ~250). Pure surface ID allocation
+ per-surface lifecycle bookkeeping; no V4L2 device state touched.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 14:41:35 +00:00
claude-noether cc077a0c06 fresnel-fourier iter5b-β Phase 6 commit B: config.c — wire object_config->pixelformat
Populate the previously-dead pixelformat field at config.h:46 from
pixelformat_for_profile(profile). The switch at lines 54-88 already
rejects unsupported profiles, so by the time we reach the assignment
at line 98, pixelformat_for_profile returns non-zero.

Commit C reads this field at CreateContext to set the V4L2 OUTPUT
format correctly per profile (the β architectural fix for Bug 2 —
HEVC/VP9/VP8 currently dispatch through the pre-iter5b H264_SLICE
hardcode at surface.c:173 because surface.c has no config_id to look
up the profile).

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 14:11:32 +00:00
claude-noether 1c548b136a fresnel-fourier iter5b-β Phase 6 commit A: NEW src/codec.{h,c} — pixelformat_for_profile helper
Re-introduce after the iter5b-α' revert. Helper maps VAProfile to V4L2
OUTPUT-side FOURCC, used at CreateConfig in commit B to populate the
previously-dead object_config->pixelformat field. β reads from there
at CreateContext (commit C).

Single source of truth for the profile→pixelformat mapping; mirrors
the per-profile probes in config.c::RequestQueryConfigProfiles
(lines 138-188).
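
A sketch of what such a helper could look like. The enum is a hypothetical stand-in for VAProfile, and the FOURCC values are assumed to match the kernel's stateless `V4L2_PIX_FMT_*_SLICE` / `*_FRAME` codes; unsupported profiles return 0.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the kernel's v4l2_fourcc(): first character in the low byte. */
#define SKETCH_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

static_assert(SKETCH_FOURCC('S', '2', '6', '4') == 0x34363253u,
              "FOURCC packing is little-endian by character");

/* Hypothetical stand-in for the VAProfile values the backend handles. */
enum sketch_profile {
    SKETCH_PROFILE_NONE,
    SKETCH_PROFILE_MPEG2,
    SKETCH_PROFILE_H264,
    SKETCH_PROFILE_HEVC_MAIN,
    SKETCH_PROFILE_VP8,
    SKETCH_PROFILE_VP9,
};

/* Map a profile to its OUTPUT-side pixel format; 0 means unsupported. */
static uint32_t sketch_pixelformat_for_profile(enum sketch_profile profile)
{
    switch (profile) {
    case SKETCH_PROFILE_MPEG2:     return SKETCH_FOURCC('M', 'G', '2', 'S');
    case SKETCH_PROFILE_H264:      return SKETCH_FOURCC('S', '2', '6', '4');
    case SKETCH_PROFILE_HEVC_MAIN: return SKETCH_FOURCC('S', '2', '6', '5');
    case SKETCH_PROFILE_VP8:       return SKETCH_FOURCC('V', 'P', '8', 'F');
    case SKETCH_PROFILE_VP9:       return SKETCH_FOURCC('V', 'P', '9', 'F');
    default:                       return 0;
    }
}
```

Returning 0 for unknown profiles lets the caller treat a zero pixelformat as "reject", which fits the non-zero expectation stated in commit B.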

Register codec.c in meson.build sources, codec.h in headers.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 14:10:46 +00:00
claude-noether 6bc29ec582 Revert "fresnel-fourier iter5b Phase 6 commit A: NEW src/codec.{h,c} — pixelformat_for_profile helper"
This reverts commit ce304ef5af.
2026-05-12 12:32:57 +00:00
claude-noether 9a7f888f1b Revert "fresnel-fourier iter5b Phase 6 commit B: state-tracking — request.h field + config.c wire-up"
This reverts commit f8256e6c2d.
2026-05-12 12:32:57 +00:00
claude-noether 709ab34624 Revert "fresnel-fourier iter5b Phase 6 commit C: surface.c — profile-derived OUTPUT pixel format"
This reverts commit 4b2288fa9a.
2026-05-12 12:32:56 +00:00
claude-noether 4b2288fa9a fresnel-fourier iter5b Phase 6 commit C: surface.c — profile-derived OUTPUT pixel format
Replace the hardcoded `V4L2_PIX_FMT_H264_SLICE` at surface.c:173 with
a profile-derived lookup via find_sole_active_pixelformat(). The
helper walks the config_heap; with one active config (universal across
mpv, ffmpeg, Firefox, Chromium) it returns the cached pixelformat
populated at CreateConfig in commit B. Falls back to the pre-iter5b
H264_SLICE for the pathological "zero or multiple configs" case
(probe surfaces before CreateConfig; multi-config-then-surfaces).

Extend the existing resolution-change gate to also fire on
pixelformat (codec) change. The teardown branch handles both cases
identically — REQBUFS(0) on both queues before re-S_FMT.

The kernel behavior pre-iter5b on RK3399:
- hantro: hantro_find_format(H264_SLICE) returns NULL on the RK3399
  decoder block (no H.264 support); hantro_try_fmt silently
  substitutes the first format in rk3399_vpu_dec_fmts =
  MPEG2_SLICE → codec_mode = MPEG2_DECODER. VP8 bitstream
  dispatched to MPEG2 ops → all-zero CAPTURE. MPEG-2 worked by
  accident (bitstream matched the substituted codec_mode).
- rkvdec: format/control mismatch; decoder silently drops the
  request → all-zero CAPTURE.

Same bug class as iter4 commit `692eaa0` (h264_start_code
unconditional set). Both fixes thread the active VAProfile into
codec-specific kernel state.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 09:23:31 +00:00
claude-noether f8256e6c2d fresnel-fourier iter5b Phase 6 commit B: state-tracking — request.h field + config.c wire-up
request.h: add last_output_pixelformat to struct request_data, alongside
the existing last_output_{width,height} V4L2 device state cache. Gates
re-S_FMT on codec change in addition to resolution change.

config.c::RequestCreateConfig: wire up object_config->pixelformat
(previously dead field at config.h:46) by calling pixelformat_for_profile
on the active profile. The pixelformat field becomes the source of truth
that surface.c reads in commit C.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 09:08:33 +00:00
claude-noether ce304ef5af fresnel-fourier iter5b Phase 6 commit A: NEW src/codec.{h,c} — pixelformat_for_profile helper
Add a small helper that maps a VAProfile to its V4L2 OUTPUT-side
pixel format FOURCC. Single source of truth, mirrors the per-profile
probes in config.c::RequestQueryConfigProfiles (lines 138-188).

Used by commits B + C in this series:
- commit B: populate object_config->pixelformat at CreateConfig
- commit C: surface.c reads the populated field to set OUTPUT format
  per-profile instead of hardcoded H264_SLICE

Register in meson.build sources + headers.

Signed-off-by: claude-noether <claude-noether@reauktion.de>
2026-05-12 09:04:02 +00:00
claude-noether 692eaa0053 fresnel-fourier iter4 Phase 7 fix-forward: gate ANNEX-B start-code prepend on H.264/HEVC profiles
Root cause for VP9 criterion-4 failure traced via runtime
instrumentation: context.c:194 unconditionally set
context_object->h264_start_code = true for every CreateContext,
regardless of codec profile. picture.c:70 then prepends 0x00 0x00 0x01
(ANNEX-B start code) to ALL slice data including VP9 frames.

VP9 has no start codes: its uncompressed_header begins with the
frame_marker field (0b10 in the high 2 bits of the first byte). The 3-byte prefix
shifted the rkvdec driver's bitstream-read by 24 bits, producing a
silent decode failure (frame_marker mismatch -> driver fails to
locate a valid frame -> CAPTURE slot stays at cap_pool init pattern,
the dim 0x4c green visible in Phase 7 hwdownload PNGs).

iter4 fix: switch on config_object->profile in RequestCreateContext.
Set h264_start_code = true only for VAProfileH264* and VAProfileHEVCMain.
False for MPEG2/VP8/VP9.
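
The gating can be sketched as below. The enum is a hypothetical stand-in for the VAProfile values, and `write_slice` is an illustrative helper, not the code in picture.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the profiles the backend dispatches on. */
enum sketch_profile {
    SKETCH_PROFILE_MPEG2,
    SKETCH_PROFILE_H264,
    SKETCH_PROFILE_HEVC_MAIN,
    SKETCH_PROFILE_VP8,
    SKETCH_PROFILE_VP9,
};

/* Only the NAL-based codecs get an Annex-B start code. */
static bool profile_uses_start_code(enum sketch_profile profile)
{
    switch (profile) {
    case SKETCH_PROFILE_H264:
    case SKETCH_PROFILE_HEVC_MAIN:
        return true;
    default:
        return false;
    }
}

/* Copy a slice into dst, prefixed with 00 00 01 when the profile needs it.
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t write_slice(uint8_t *dst, size_t cap,
                          const uint8_t *slice, size_t len,
                          enum sketch_profile profile)
{
    static const uint8_t start_code[3] = { 0x00, 0x00, 0x01 };
    size_t prefix = profile_uses_start_code(profile) ? sizeof(start_code) : 0;

    assert(dst != NULL && slice != NULL);
    if (len > cap || prefix > cap - len)
        return 0;
    memcpy(dst, start_code, prefix);
    memcpy(dst + prefix, slice, len);
    return prefix + len;
}
```

For a VP9 frame the first output byte is the first input byte, so the decoder sees the frame header at bit 0 instead of 24 bits late.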

iter1 (MPEG-2) and iter3 (VP8) had this same bug latent — they passed
because their criterion-4 verification used different paths (iter1
direct readback was small enough to mask, iter3 used transitive proof
not pixel comparison). The Phase 7 byte-level pixel comparison is what
exposed it.

Empirical proof of the fix on fresnel:
- pre-fix submission FRAME control bytes 0-23: lf.flags=0x01 (only
  DELTA_ENABLED), base_q_idx=0x41 — bit-misaligned because parser was
  reading the prefix bytes.
- post-fix submission FRAME control bytes 0-23 byte-match Phase 3
  kernel-direct anchor: lf.flags=0x03 (ENABLED|UPDATE), base_q_idx=0x2e
  (46). Transitive-proof leg 1 (backend-payload == kernel-direct-payload)
  satisfied for the keyframe.
- s(6) bit-width fix in vp9.c (4 mag + 1 sign -> 6 mag + 1 sign per
  VP9 spec) was a real bug too, latent because Bug 1 (this commit's fix)
  prevented its code path from running. Both fixes ship together.
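
To illustrate why the magnitude width matters, here is a minimal MSB-first bit reader with a magnitude-plus-sign read. `struct sketch_bits`, `read_bits` and `read_signed` are hypothetical names for this sketch and are not the parser in vp9.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit cursor over a byte buffer. */
struct sketch_bits {
    const uint8_t *data;
    size_t size; /* bytes available */
    size_t pos;  /* current bit position */
};

/* Read n bits, most significant first; bits past the end read as 0. */
static unsigned int read_bits(struct sketch_bits *br, unsigned int n)
{
    unsigned int value = 0;

    assert(br != NULL && n <= 16);
    while (n--) {
        unsigned int bit = 0;

        if (br->pos / 8 < br->size)
            bit = (br->data[br->pos / 8] >> (7 - br->pos % 8)) & 1u;
        br->pos++;
        value = (value << 1) | bit;
    }
    return value;
}

/* Read an n-bit magnitude followed by one sign bit. */
static int read_signed(struct sketch_bits *br, unsigned int n)
{
    int value = (int)read_bits(br, n);

    return read_bits(br, 1) ? -value : value;
}
```

Reading the same bits with a 4-bit magnitude consumes 5 bits instead of 7, so every later field in the header is parsed two bits early.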

Pixels still produce 0x4c constant pattern post-fix — that is Bug 2
(substrate-wide cap_pool readback regression on
linux-fresnel-fourier 7.0-1) per phase7_iter4_verification.md.
Bug 2 is out of iter4 scope per Option-A choice; transitive proof
remains the criterion-4 verification path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 09:50:25 +00:00
claude-noether beaa914680 fresnel-fourier iter4 Phase 6 commit C: picture.c VP9 dispatch + 2 buffer-type cases
5 sites:
1. include block: add #include "vp9.h".
2. codec_set_controls: add VP9 case calling vp9_set_controls().
3. codec_store_buffer VAPictureParameterBufferType: VP9 inner case
   memcpy'ing into surface_object->params.vp9.picture.
4. codec_store_buffer VASliceParameterBufferType: VP9 inner case
   memcpy'ing into surface_object->params.vp9.slice.
5. (No reset in RequestBeginPicture — VP9 has no iqmatrix_set/
   probability_set-style flag, Picture/Slice are unconditionally
   populated by VAAPI consumer per frame.)

Per Phase 2 B12: NO buffer.c changes. VP9 uses Picture+Slice+Data,
which are already in the iter3 allow-list. Per memory note
feedback_runtime_enumerates_allowlists.md, a Commit D fix-forward is
planned if a runtime miss surfaces; predicted clean.

Verified end-to-end on fresnel:
- vainfo enumerates VAProfileVP9Profile0 alongside H.264 + HEVC.
- LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hwaccel vaapi VP9 decode
  exits 0 (criterion 3 PASS): 5 frames decoded at 0.307x speed,
  cap_pool_init OK, no kernel ioctl errors.
- mpv vp9-vaapi engagement still SW-fallback (iter4-B2 backlog —
  mpv-DRM device-create path doesn't honor LIBVA_DRIVER_NAME the
  way ffmpeg-vaapi does; investigation deferred).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 06:48:49 +00:00
claude-noether 406d08e122 fresnel-fourier iter4 Phase 6 commit B: NEW src/vp9.c + src/vp9.h + meson.build + context.h (vp9_lf) + surface.h (params.vp9)
VP9 codec dispatcher implementing 12 contract clauses against
V4L2_CID_STATELESS_VP9_FRAME (0xa40a2c) +
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (0xa40a2d). 2 batched
controls per frame; rkvdec on RK3399 mandatorily requires both
per drivers/staging/media/rkvdec/rkvdec-vp9.c::rkvdec_vp9_run_preamble:752.

Implementation:
- ~80 LOC VPX range coder (vp9_rac_*) — minimal port of FFmpeg
  vpx_rac.[ch] + vp89_rac.h. Stateless static helpers.
- inv_map_table[255] + read_prob_delta — verbatim copy from
  v4l2_request_vp9.c:44-97.
- vp9_parse_uncompressed_header_lf_quant — partial parse for the
  fields VAAPI doesn't expose: lf_delta_enabled / lf_delta_update /
  lf_ref_delta[4] / lf_mode_delta[2] / base_q_idx /
  delta_q_y_dc / delta_q_uv_dc / delta_q_uv_ac. ~120 LOC.
- vp9_fill_compressed_hdr — port of FFmpeg fill_compressed_hdr
  with Phase 5 C3 out_reference_mode parameter. ~140 LOC.
- vp9_set_controls — orchestrates Clauses 1+2+4+5+7+10+11+12.
  ~120 LOC.

Phase 5 amendments incorporated in code:
- C1: frame.interpolation_filter = direct from VAAPI's
  mcomp_filter_type (NO XOR; vaapi_vp9.c:62 already applied it
  before storing into VAAPI's mcomp_filter_type).
- C2: persistent vp9_lf state added to object_context (in
  context.h). Initialized to VP9 spec defaults
  {1,0,-1,-1,0,0} on keyframe / intra_only / error_resilient.
  Updated only when parser sees lf_delta.update=1. Always
  copied to kernel control.
- C3: vp9_fill_compressed_hdr takes uint8_t *out_reference_mode;
  threaded through call site. allowcompinter derived from VAAPI
  sign-bias bits.

Phase 5 S4: uv_mode memcpy from FFmpeg's fill_compressed_hdr
omitted — rkvdec reads uv_mode from kernel's persistent
probability_tables, NOT from prob_updates ctrl.

Clause 3 compile-time _Static_assert on struct sizes (168/2040)
matches Phase 3 empirical baseline; UAPI shifts will fail loudly.

surface.h: extends params union with vp9 { picture, slice }.
context.h: adds vp9_lf { ref_deltas[4], mode_deltas[2], initialized }.
meson.build: adds vp9.c + vp9.h.

Build: clean on fresnel (linux-fresnel-fourier 7.0-1, libva 1.23).
Runtime: not yet wired in picture.c — next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 06:46:11 +00:00
claude-noether 16b397305d fresnel-fourier iter4 Phase 6 commit A: VP9 enumeration + dispatch in config.c
3 sites:
1. RequestQueryConfigProfiles: probe V4L2_PIX_FMT_VP9_FRAME against
   single + MPLANE OUTPUT formats; advertise VAProfileVP9Profile0.
2. RequestCreateConfig: VAProfileVP9Profile0 case (no profile-specific
   validation; defer to vaCreateContext / control submission time).
3. RequestQueryConfigEntrypoints: add VAProfileVP9Profile0 to the
   VAEntrypointVLD fall-through.

Verified on fresnel: vainfo (auto-detect rkvdec) now shows
VAProfileVP9Profile0 alongside H.264x5 + HEVCMain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 06:34:14 +00:00
claude-noether 7f8fa93213 fresnel-fourier iter4 Phase 6 commit Z: device-path auto-detect via media controller topology
Before iter4 the backend hardcoded /dev/video0 + /dev/media0 as
defaults when no env override was set. The udev/probe order changed
in Linux 7.0: rockchip-rga (RGB color converter, no codec) now claims
/dev/video0, so the legacy default returns an empty profile list.

Discovery is driven by the media controller graph (the canonical
v4l2-request approach). NOT a /dev/video* walk by enumeration
order — that mispairs video and media nodes when one driver
registers multiple media devices, and depends on probe-order
luck.

Algorithm:
  1. Walk /dev/media0..15. MEDIA_IOC_DEVICE_INFO names the driver.
     Match against {rkvdec, hantro-vpu, cedrus, sun4i_csi}.
  2. MEDIA_IOC_G_TOPOLOGY enumerates the entity/interface graph.
     The MEDIA_INTF_T_V4L_VIDEO interface carries major:minor of
     the V4L2 video node owned by THIS media controller — paired
     by the kernel, not by /dev/* enumeration order.
  3. Resolve major:minor to /dev/videoN via /sys/dev/char/<M>:<N>
     (the kernel's char-device sysfs symlink whose basename is
     the device node name).
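
Step 3 can be sketched as follows, assuming major:minor has already
been read from the MEDIA_INTF_T_V4L_VIDEO interface in step 2. The
demo_* helpers and the example link target are hypothetical; only
readlink(2) and snprintf are real calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "/sys/dev/char/<major>:<minor>". Returns 0, or -1 on overflow. */
int demo_sysfs_char_path(unsigned major, unsigned minor,
                         char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "/sys/dev/char/%u:%u", major, minor);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Turn a symlink target such as ".../video4linux/video1" into
 * "/dev/video1" by taking its basename. */
int demo_dev_node_from_link(const char *target, char *out, size_t out_len)
{
    const char *base = strrchr(target, '/');
    int n;

    base = base ? base + 1 : target;
    if (*base == '\0')
        return -1;

    n = snprintf(out, out_len, "/dev/%s", base);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Full resolution; needs a live sysfs, so it only works on target. */
int demo_resolve_video_node(unsigned major, unsigned minor,
                            char *out, size_t out_len)
{
    char link_path[64];
    char target[512];
    ssize_t n;

    if (demo_sysfs_char_path(major, minor, link_path,
                             sizeof(link_path)) != 0)
        return -1;

    /* readlink does not NUL-terminate; reserve one byte for it. */
    n = readlink(link_path, target, sizeof(target) - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';

    return demo_dev_node_from_link(target, out, out_len);
}
```

Because the pairing comes from the kernel's own topology data, the
result does not depend on which driver probed first.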

LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 escape hatch reverts to legacy
/dev/video0 + /dev/media0 hardcoded behavior for callers that
depended on it.

Phase 5 C4 amendment: walk-and-pick-first selects rkvdec on RK3399
(rkvdec's media controller enumerates before hantro's). H.264 /
HEVC / VP9 (rkvdec codecs) work without env override after this
commit. MPEG-2 / VP8 (hantro) still require an explicit
LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 override; full
multi-decoder dispatch is an iter4-B1 backlog item.

Verified empirically on fresnel (linux-fresnel-fourier 7.0-1):
- vainfo (no env) -> "auto-selected codec device: /dev/video1 +
  /dev/media0", enumerates H264*5 + HEVCMain (rkvdec) — paired
  via topology graph, not /dev/video* enumeration.
- vainfo NO_AUTODETECT=1 -> empty list (legacy /dev/video0 = rga).
- vainfo with explicit /dev/video3 + /dev/media1 -> MPEG2*2 + VP8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 06:05:41 +00:00
claude-noether e1aca9cc6b fresnel-fourier iter3 Phase 6 commit D: buffer.c whitelist for
VAProbabilityBufferType

Phase 2 source-read assumed buffer.c was type-agnostic ("the buffer
registry is type-agnostic" per phase2_iter3_situation.md non-bugs
list). FALSE. RequestCreateBuffer at buffer.c:59-70 has an explicit
allow-list switch:

  case VAPictureParameterBufferType:
  case VAIQMatrixBufferType:
  case VASliceParameterBufferType:
  case VASliceDataBufferType:
  case VAImageBufferType:
      break;
  default:
      return VA_STATUS_ERROR_UNSUPPORTED_BUFFERTYPE;

Without VAProbabilityBufferType in the allow-list, the consumer gets
VA_STATUS_ERROR_UNSUPPORTED_BUFFERTYPE on vaCreateBuffer for the
probability buffer, BEFORE codec_store_buffer is ever reached.
ffmpeg-vaapi log:

  [vp8] Failed to create parameter buffer (type 13): 15
        (the requested VABufferType is not supported).

Same pattern as iter1 Commit D: the Phase 2 grep missed this site and
the runtime failure enumerated it authoritatively. This extends the
rule in memory feedback_header_deletion_check.md ("let the compiler
enumerate them"): the runtime enumerates allow-list violations the
same way the compiler enumerates include-site violations.

Fix: add `case VAProbabilityBufferType:` to the buffer.c allow-list.
+1 line, mechanical.
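
The shape of the fixed allow-list, as a self-contained sketch. The
DEMO_* enum is a hypothetical stand-in for the va.h constants so the
example builds without libva; only the values 13 (buffer type) and
15 (status) are taken from the log line above, the rest are
placeholders.

```c
/* Stand-ins for VABufferType values; only 13 comes from the log. */
enum demo_buffer_type {
    DEMO_PICTURE_PARAMETER,
    DEMO_IQ_MATRIX,
    DEMO_SLICE_PARAMETER,
    DEMO_SLICE_DATA,
    DEMO_IMAGE,
    DEMO_PROBABILITY = 13
};

#define DEMO_STATUS_SUCCESS 0
#define DEMO_STATUS_ERROR_UNSUPPORTED_BUFFERTYPE 15

/* Gate that runs before any codec-specific buffer handling. */
int demo_check_buffer_type(enum demo_buffer_type type)
{
    switch (type) {
    case DEMO_PICTURE_PARAMETER:
    case DEMO_IQ_MATRIX:
    case DEMO_SLICE_PARAMETER:
    case DEMO_SLICE_DATA:
    case DEMO_IMAGE:
    case DEMO_PROBABILITY:      /* the +1 line added by this commit */
        return DEMO_STATUS_SUCCESS;
    default:
        return DEMO_STATUS_ERROR_UNSUPPORTED_BUFFERTYPE;
    }
}
```

Any buffer type a future codec introduces has to be added here
first, or creation fails before the codec dispatcher sees it.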

Refs:
  ../fresnel-fourier/phase2_iter3_situation.md (incorrect non-bug
                                                 claim about buffer.c)
  ../fresnel-fourier/phase4_iter3_plan.md (Commit D placeholder for
                                            fix-forward — used)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 23:03:59 +00:00
claude-noether 7f84bbb50f fresnel-fourier iter3 Phase 6 commit C: picture.c VP8 dispatch + 4
buffer-type cases + new VAProbabilityBufferType outer case + per-
frame reset + surface.h params.vp8 union extension

Seven sites in picture.c + one site in surface.h wire up the VP8
codec dispatcher introduced by commit B:

  1. Add #include "vp8.h" to the codec headers block.

  2. codec_set_controls: NEW case VAProfileVP8Version0_3 calling
     vp8_set_controls(driver_data, context, surface_object).
     Same shape as MPEG-2 + HEVC dispatch.

  3. codec_store_buffer VAPictureParameterBufferType: NEW VP8 case
     memcpy'ing into surface_object->params.vp8.picture
     (sizeof VAPictureParameterBufferVP8).

  4. codec_store_buffer VASliceParameterBufferType: NEW VP8 case
     memcpy'ing into surface_object->params.vp8.slice (single,
     no slices[] array — VP8 is frame-mode, no multi-slice).

  5. codec_store_buffer VAIQMatrixBufferType: NEW VP8 case
     memcpy'ing into surface_object->params.vp8.iqmatrix +
     setting iqmatrix_set true.

  6. codec_store_buffer NEW outer case VAProbabilityBufferType
     (Phase 5 C3: NOT VAProbabilityDataBufferType — that's the
     STRUCT name; the buffer-type enum constant is
     VAProbabilityBufferType = 13 per va.h:2058). Inner switch
     dispatches by profile, with VP8 case memcpy'ing into
     surface_object->params.vp8.probability + setting
     probability_set true.

  7. RequestBeginPicture: NEW per-frame reset for the two VP8
     flags — params.vp8.iqmatrix_set = false +
     params.vp8.probability_set = false. Mirrors the existing
     iter1 (h264.matrix_set) + iter2 (h265.num_slices) per-frame
     resets.

surface.h extension:

  8. params union: NEW vp8 struct after h265 — holds the 4 VAAPI
     buffer-type structs (VAPictureParameterBufferVP8,
     VASliceParameterBufferVP8, VAIQMatrixBufferVP8 + iqmatrix_set,
     VAProbabilityDataBufferVP8 + probability_set).

The NEW vp8 union member adds ~5300 bytes (sizeof
VAProbabilityDataBufferVP8 dominated by dct_coeff_probs[4][8][3]
[11] = 1056 + bookkeeping). The h265 member with slices[64] array
remains the largest (~17 KB), so the union size doesn't grow.

After this commit: backend builds clean, links cleanly. mpv-vaapi
VP8 decode should engage end-to-end on hantro env binding. Phase
1 criteria 1 + 2 + 3 expected satisfied; criterion 4 (HW=SW byte-
identical) and criterion 5 (3-codec regression) verified at Phase
6 smoke + Phase 7.

Refs:
  ../fresnel-fourier/phase4_iter3_plan.md (Commit C site list)
  ../fresnel-fourier/phase2_iter3_situation.md (B6, B7, B8, B9
                                                 bug enumeration)
  ../fresnel-fourier/phase5_iter3_review.md (C3 VAProbabilityBuffer
                                              Type rename
                                              empirically verified)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:52:24 +00:00
claude-noether 017e27f389 fresnel-fourier iter3 Phase 6 commit B: NEW src/vp8.c + src/vp8.h
+ meson.build VP8 entries

Net-new VP8 codec dispatcher implemented against
V4L2_CID_STATELESS_VP8_FRAME (kernel UAPI <linux/v4l2-controls.h>:
1900-1958). Single batched control per frame, no init-time device-
wide menus (VP8 has no DECODE_MODE/START_CODE).

Per-frame submission: ONE VIDIOC_S_EXT_CTRLS, count=1, with full
v4l2_ctrl_vp8_frame struct (1232 bytes — corrected vs Phase 2
implicit ~400 estimate; entropy.coeff_probs[4][8][3][11] alone is
1056 bytes).

vp8_set_controls() implements 10 contract clauses per
phase4_iter3_plan.md:

  Clause 1: single-control batched submission (count=1)
  Clause 2: stack alloc + memset zero (covers all padding)
  Clause 3: width/height/version/per-frame scalars; off-by-one
            num_dct_parts = num_of_partitions - 1
  Clause 4: DPB timestamp resolution (3 refs: last/golden/alt;
            NULL surface → 0-sentinel via memset; mirrors iter1
            mpeg2.c::pic.forward_ref_ts)
  Clause 5: loop filter (6 fields + 3 flag bits; ADJ_ENABLE/
            DELTA_UPDATE/FILTER_TYPE_SIMPLE)
  Clause 6: quant base + delta derivation from VAAPI's per-segment
            absolute index matrix (subtraction recovers signed
            deltas; correct for typical content per Phase 5 S1)
  Clause 7: segment fields (segment_probs direct copy; flags
            assembled with DELTA_VALUE_MODE set unconditionally
            per FFmpeg pattern)
  Clause 8: entropy table — 3 VAAPI sources merged (Picture: y_mode +
            uv_mode + mv_probs; ProbabilityData: coeff_probs[4][8][3]
            [11] direct memcpy; IQMatrix: quant)
  Clause 9: coder state + first-partition fields + flags assembly
  Clause 10: v4l2_set_controls submission
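
Clause 6 in miniature: a sketch of recovering a signed delta from
an absolute quantizer index, and of why the recovery is only exact
when the absolute index was not clipped. Which row and column of
the VAAPI matrix holds the base index is not shown here, so both
inputs are plain parameters; the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Forward direction: VP8 quantizer indices are clipped to [0, 127]. */
uint8_t demo_clip_q_index(int v)
{
    if (v < 0)
        return 0;
    if (v > 127)
        return 127;
    return (uint8_t)v;
}

/* Backward direction: both inputs are in [0, 127], so the difference
 * always fits in int8_t. int8_t is used rather than the kernel-only
 * s8 typedef. */
int8_t demo_quant_delta(uint8_t base_q_idx, uint8_t absolute_q_idx)
{
    return (int8_t)((int)absolute_q_idx - (int)base_q_idx);
}
```

With base 120 and a true delta of +15 the absolute index clips to
127, so subtraction yields +7. That is the "typical content" caveat:
the result is exact whenever no clipping occurred.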

Phase 5 review amendments incorporated:

  C1 first_part_header_bits = slice->macroblock_offset
     NOT 0 — kernel hantro_g1_vp8_dec.c:260 + rockchip_vpu2_hw_vp8_
     dec.c:372 read this field unconditionally to compute the MB-
     data DMA offset. Verified via source identity: vaapi_vp8.c:204
     and v4l2_request_vp8.c:83 use byte-identical formulas
     (8 * (input - data) - bit_count - 8); VAAPI exposes via
     slice->macroblock_offset, V4L2 names it first_part_header_bits.

  C2 first_part_size = slice->partition_size[0] +
                       ((macroblock_offset + 7) / 8)
     VAAPI's partition_size[0] is the REMAINING bytes after parsing
     (vaapi_vp8.c:209; va_dec_vp8.h:193-196). Kernel needs the
     TOTAL control partition size; recover by adding back ceil
     (macroblock_offset/8) bytes.
     Phase 3 keyframe verbatim cross-check: 21923 + 819 = 22742 ✓

  C4 (int8_t) cast (NOT (s8); s8 is kernel-internal typedef from
     <linux/types.h> not exposed to userspace; userspace UAPI
     exposes __s8 with double-underscore; portable userspace cast
     is int8_t from <stdint.h>).

  S3 assert(probability_set) — kernel hantro_vp8.c::hantro_vp8_
     prob_update reads coeff_probs unconditionally; NO default-
     table fallback. Practical risk low (FFmpeg vaapi_vp8.c always
     sends VAProbabilityBufferType per frame), but assert surfaces
     immediately if a future consumer doesn't.
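
Amendment C2 as arithmetic, in a minimal sketch. The function name
is hypothetical; the formula is the one stated above.

```c
#include <stdint.h>

/*
 * remaining_bytes:        VAAPI slice->partition_size[0]
 * macroblock_offset_bits: VAAPI slice->macroblock_offset (in bits)
 * Returns the total first-partition size the kernel expects.
 */
uint32_t demo_first_part_size(uint32_t remaining_bytes,
                              uint32_t macroblock_offset_bits)
{
    /* Round the consumed header bits up to whole bytes. */
    return remaining_bytes + (macroblock_offset_bits + 7) / 8;
}
```

Any bit offset from 6545 to 6552 rounds up to the 819 bytes in the
keyframe cross-check (21923 + 819 = 22742); the exact bit offset of
that frame is not recorded here, so 6552 below is illustrative:
demo_first_part_size(21923, 6552) gives 22742.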

Flags assembly: 5 of the 6 mainline-documented bits (KEY_FRAME,
SHOW_FRAME, MB_NO_SKIP_COEFF, SIGN_BIAS_GOLDEN, SIGN_BIAS_ALT). EXP +
bit 0x40 NOT replicated despite ffmpeg-v4l2-request-git setting
them on inter frames: kernel hantro_vp8.c only inspects the KEY_FRAME
bit. SHOW_FRAME is set unconditionally per Phase 3 Q4 (BBB has no
alt-ref invisible frames; documented fidelity gap).

VAAPI inverts: key_frame=0 means it IS a keyframe per VP8 spec.
Backend writes V4L2_VP8_FRAME_FLAG_KEY_FRAME iff
!picture->pic_fields.bits.key_frame.
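
A sketch of the flags assembly including the key_frame inversion.
The DEMO_* bit values are assumptions believed to match
V4L2_VP8_FRAME_FLAG_* and should be taken from
<linux/v4l2-controls.h> in real code; struct demo_vp8_pic and its
field names are hypothetical stand-ins for the VAAPI picture fields.

```c
#include <stdint.h>

/* Assumed bit values; verify against the kernel UAPI header. */
#define DEMO_VP8_FLAG_KEY_FRAME        0x01
#define DEMO_VP8_FLAG_EXPERIMENTAL     0x02  /* never set here */
#define DEMO_VP8_FLAG_SHOW_FRAME       0x04
#define DEMO_VP8_FLAG_MB_NO_SKIP_COEFF 0x08
#define DEMO_VP8_FLAG_SIGN_BIAS_GOLDEN 0x10
#define DEMO_VP8_FLAG_SIGN_BIAS_ALT    0x20

/* Hypothetical subset of the VAAPI VP8 picture parameters. */
struct demo_vp8_pic {
    unsigned key_frame : 1;        /* 0 means "is a keyframe" */
    unsigned mb_no_coeff_skip : 1;
    unsigned sign_bias_golden : 1;
    unsigned sign_bias_alternate : 1;
};

uint64_t demo_vp8_frame_flags(const struct demo_vp8_pic *pic)
{
    /* SHOW_FRAME is set on every frame, as described above. */
    uint64_t flags = DEMO_VP8_FLAG_SHOW_FRAME;

    if (!pic->key_frame)           /* inverted sense */
        flags |= DEMO_VP8_FLAG_KEY_FRAME;
    if (pic->mb_no_coeff_skip)
        flags |= DEMO_VP8_FLAG_MB_NO_SKIP_COEFF;
    if (pic->sign_bias_golden)
        flags |= DEMO_VP8_FLAG_SIGN_BIAS_GOLDEN;
    if (pic->sign_bias_alternate)
        flags |= DEMO_VP8_FLAG_SIGN_BIAS_ALT;

    return flags;
}
```

Getting the inversion wrong would mark every inter frame as a
keyframe, so it is worth isolating in one helper.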

After this commit alone: vp8.o compiles standalone; meson.build
links it into the shared library. picture.c can't dispatch yet
(commit C wires that).

Refs:
  ../fresnel-fourier/phase4_iter3_plan.md (10 contract clauses,
                                            Phase 5 amendments
                                            section)
  ../fresnel-fourier/phase5_iter3_review.md (C1, C2, C3, C4, S3
                                              all incorporated)
  ../fresnel-fourier/phase3_iter3_baseline.md (verbatim payload
                                                anchors)
  references/ffmpeg-kwiboo/libavcodec/v4l2_request_vp8.c (V4L2 ref)
  references/ffmpeg-kwiboo/libavcodec/vaapi_vp8.c (VAAPI source ref)
  references/linux-mainline/drivers/media/platform/verisilicon/
    hantro_g1_vp8_dec.c (RK3399 kernel driver — first_part_header_
    bits + first_part_size usage)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:51:12 +00:00
claude-noether 27d82e3cf4 fresnel-fourier iter3 Phase 6 commit A: VP8 enumeration + dispatch in config.c
Three sites enabling VP8 profile recognition through the libva config
path:

  1. RequestQueryConfigProfiles: NEW enumeration block probing
     V4L2_PIX_FMT_VP8_FRAME against single + MPLANE OUTPUT formats.
     Mirrors iter2 HEVC enumeration block. Surfaces VAProfileVP8
     Version0_3 in vainfo on hantro env binding.

  2. RequestCreateConfig: NEW case VAProfileVP8Version0_3 with
     break — same shape as iter1 MPEG-2 + iter2 HEVCMain (no
     profile-specific config validation in the libva backend;
     validation deferred to vaCreateContext / control submission).

  3. RequestQueryConfigEntrypoints: VAProfileVP8Version0_3 added to
     the existing fall-through case list — surfaces VAEntrypointVLD.

After this commit alone, vainfo lists VP8Version0_3 (Phase 1
criterion 1) but vaCreateContext / runtime decode would fail at
later stages because no codec dispatcher exists yet (added in
commit B + C).

Refs:
  ../fresnel-fourier/phase4_iter3_plan.md (Commit A site list)
  ../fresnel-fourier/phase2_iter3_situation.md (B1, B2, B3
                                                  bug enumeration)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:49:28 +00:00
claude-noether 8d71e20bf7 fresnel-fourier iter2 Phase 6 commit B: rewrite h265.c against new V4L2 stateless HEVC API
Rewrites src/h265.c (407 lines → 588 lines) and the picture.c HEVC
dispatch + per-slice accumulation against the modern split V4L2_CID_
STATELESS_HEVC_{SPS,PPS,SLICE_PARAMS,SCALING_MATRIX,DECODE_PARAMS,
DECODE_MODE,START_CODE} stateless controls. Replaces the staging-era
V4L2_CID_MPEG_VIDEO_HEVC_{SPS,PPS,SLICE_PARAMS} CIDs that were
removed from the kernel UAPI.

Per-frame submission: ONE batched VIDIOC_S_EXT_CTRLS, count=5,
ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS:

  0xa40a90 SPS            (40  bytes)
  0xa40a91 PPS            (64  bytes)
  0xa40a92 SLICE_PARAMS   (variable; dynamic-array; one entry per slice)
  0xa40a93 SCALING_MATRIX (1296 bytes; memset-zero when no scaling list)
  0xa40a94 DECODE_PARAMS  (328 bytes; per-frame DPB info)

Plus device-wide menus set once at context.c init (separate batched
S_EXT_CTRLS call so a kernel without HEVC controls — e.g. hantro on
RK3568/RK3399 — silently fails its batch without invalidating H.264):

  0xa40a95 DECODE_MODE  (FRAME_BASED on rkvdec)
  0xa40a96 START_CODE   (ANNEX_B on rkvdec)

Reference: FFmpeg libavcodec/v4l2_request_hevc.c:505-565
           (v4l2_request_hevc_queue_decode batched submission shape).

Phase 5 review amendments incorporated:

  C1 (data_byte_offset NOT data_bit_offset):
    Old h265.c at lines 184-209 ran an 8-bit search to compute
    bit-granularity offset. New API renames the field to
    data_byte_offset (u32 byte offset). Bit-search dropped; replaced
    with plain byte offset = source_offset + slice->slice_data_byte_offset.

  C2 (dpb_entry.flags only LONG_TERM_REFERENCE; pic_order_cnt_val
      singular; poc_st_curr_*[] arrays hold DPB INDICES not POC):
    h265_fill_decode_params replaces old slice-params DPB iteration
    with explicit DPB classification + index-array population.
    For each VAAPI ReferenceFrames[i]:
      - Classify into ST_CURR_BEFORE / ST_CURR_AFTER / LT_CURR via
        VA_PICTURE_HEVC_RPS_* flags.
      - Set dpb[j].timestamp, .pic_order_cnt_val (singular), .field_pic.
      - Set dpb[j].flags = LONG_TERM_REFERENCE iff RPS_LT_CURR.
      - Append j (DPB index, u8) to poc_st_curr_before[k] /
        poc_st_curr_after[k] / poc_lt_curr[k] based on classification.

  C3 (union-aliasing reasoning corrected):
    BeginPicture's params.h265.num_slices = 0 reset is benign for
    non-HEVC profiles because byte ~17764 of the params union is past
    any field non-HEVC profiles read, NOT because RenderPicture's
    per-buffer copies overwrite that location. Wording amended in
    phase4_iter2_plan.md per phase5_iter2_review.md.

  S1 (PPS flags 19 + 20 — DEBLOCKING_FILTER_CONTROL_PRESENT and
      UNIFORM_SPACING):
    Empirically VAAPI does NOT expose either flag in the
    VAPictureParameterBufferHEVC pic_fields.bits or
    slice_parsing_fields.bits. Both bits left zero. BBB-720p10s_hevc
    fixture uses neither tiles nor explicit deblocking-control
    parameters, so the omission is correct for the iter2 binding cell.

  S2 (3 PPS scalars added):
    pic_parameter_set_id (default 0; VAAPI doesn't expose),
    num_ref_idx_l0_default_active_minus1, num_ref_idx_l1_default_
    active_minus1 (both populated from VAAPI picture struct).

  Q2 (slice_segment_addr populated):
    Was missing in old h265.c. Now sourced from
    VAAPI's slice->slice_segment_address.

  S3 (SCALING_MATRIX content choice):
    Implementer choice taken: when iqmatrix_set==false (BBB has no
    scaling list per SPS flags = SAO|STRONG_INTRA_SMOOTHING),
    h265_fill_scaling_matrix sends memset-zero. Matches FFmpeg's
    sl=NULL pattern at v4l2_request_hevc.c:384-403 (preserves
    byte-equality vs cross-validator anchor).

  S4 (FFmpeg function name fix): cosmetic; no code impact.
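
Amendment C2 above, sketched with stand-in types so it builds
without libva or kernel headers. The DEMO_* flag values and struct
layouts are assumptions for illustration; field_pic handling is
omitted. The point is that the poc_* arrays receive DPB indices,
not POC values.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_DPB 16

/* Stand-ins for the VA_PICTURE_HEVC_RPS_* classification flags. */
#define DEMO_RPS_ST_CURR_BEFORE 0x10
#define DEMO_RPS_ST_CURR_AFTER  0x20
#define DEMO_RPS_LT_CURR        0x40

/* Stand-in for the kernel's LONG_TERM_REFERENCE dpb flag. */
#define DEMO_DPB_LONG_TERM      0x01

struct demo_ref {               /* one VAAPI ReferenceFrames[i] */
    int valid;
    uint32_t flags;
    int32_t poc;
    uint64_t timestamp;
};

struct demo_dpb_entry {
    uint64_t timestamp;
    int32_t pic_order_cnt_val;
    uint8_t flags;
};

struct demo_decode_params {
    struct demo_dpb_entry dpb[DEMO_MAX_DPB];
    uint8_t num_active_dpb_entries;
    uint8_t poc_st_curr_before[DEMO_MAX_DPB];
    uint8_t num_poc_st_curr_before;
    uint8_t poc_st_curr_after[DEMO_MAX_DPB];
    uint8_t num_poc_st_curr_after;
    uint8_t poc_lt_curr[DEMO_MAX_DPB];
    uint8_t num_poc_lt_curr;
};

void demo_fill_decode_params(const struct demo_ref *refs, int count,
                             struct demo_decode_params *out)
{
    int i;

    memset(out, 0, sizeof(*out));
    for (i = 0; i < count; i++) {
        uint8_t j;

        if (!refs[i].valid || out->num_active_dpb_entries >= DEMO_MAX_DPB)
            continue;

        j = out->num_active_dpb_entries++;
        out->dpb[j].timestamp = refs[i].timestamp;
        out->dpb[j].pic_order_cnt_val = refs[i].poc;

        /* Append the DPB index j to the matching list. */
        if (refs[i].flags & DEMO_RPS_ST_CURR_BEFORE) {
            out->poc_st_curr_before[out->num_poc_st_curr_before++] = j;
        } else if (refs[i].flags & DEMO_RPS_ST_CURR_AFTER) {
            out->poc_st_curr_after[out->num_poc_st_curr_after++] = j;
        } else if (refs[i].flags & DEMO_RPS_LT_CURR) {
            out->dpb[j].flags = DEMO_DPB_LONG_TERM;
            out->poc_lt_curr[out->num_poc_lt_curr++] = j;
        }
    }
}
```

Entries with none of the three RPS flags still occupy a DPB slot but
appear in no index list.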

Plus one Phase 6 inline correction: phase 5 review S1 suggested
VAAPI exposes uniform_spacing_flag in pic_fields.bits; empirical
test-compile shows it doesn't. Comment added in h265_fill_pps
documenting the omission.

Picture.c changes (3 edits):

  1. codec_set_controls HEVCMain dispatch (lines 204-206): now calls
     h265_set_controls; replaces the explicit reject (Fourier-local:
     HEVC stripped).
  2. codec_store_buffer HEVC VASliceParameterBufferType case: append
     VAAPI slice param to params.h265.slices[N] array, increment
     num_slices. Single-slice mirror at .slice retained for
     h265_fill_pps (which reads dependent_slice_segment_flag from
     LongSliceFlags).
  3. RequestBeginPicture: add params.h265.num_slices = 0 reset
     alongside existing h264.matrix_set = false reset.
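
Edits 2 and 3 together, as a minimal sketch. struct demo_slice is a
hypothetical stand-in for the VAAPI slice parameter struct, and the
bounds check is an assumption about how overflow past 64 slices
would be handled; the committed code may differ.

```c
#include <stdint.h>

#define DEMO_MAX_SLICES_PER_FRAME 64

/* Hypothetical stand-in for the VAAPI HEVC slice parameters. */
struct demo_slice {
    uint32_t slice_data_size;
    uint32_t slice_data_byte_offset;
};

struct demo_h265_params {
    struct demo_slice slice;    /* single-slice mirror (latest) */
    struct demo_slice slices[DEMO_MAX_SLICES_PER_FRAME];
    unsigned num_slices;
};

/* BeginPicture: start each frame with an empty slice list. */
void demo_begin_picture(struct demo_h265_params *p)
{
    p->num_slices = 0;
}

/* RenderPicture, slice-parameter buffer: append one slice. */
int demo_store_slice(struct demo_h265_params *p,
                     const struct demo_slice *s)
{
    if (p->num_slices >= DEMO_MAX_SLICES_PER_FRAME)
        return -1;              /* assumed: reject, do not overwrite */

    p->slices[p->num_slices++] = *s;
    p->slice = *s;              /* keep the mirror in sync */
    return 0;
}
```

Without the per-frame reset, slices from the previous frame would be
resubmitted with the next one.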

Surface.h: extend params.h265 struct with slices[HEVC_MAX_SLICES_PER_
FRAME=64] array + num_slices counter. ~17 KB extra per surface union;
24 surfaces in iter7 cap_pool = ~400 KB total surface_heap growth.
object_heap allocator picks up new size automatically via
sizeof(struct object_surface).

Context.c: separate 2-control batched call sets HEVC DECODE_MODE +
START_CODE device-wide. Same best-effort (void)v4l2_set_controls
pattern as the existing H.264 device-init block; if kernel doesn't
advertise HEVC controls (hantro on RK3568/RK3399), the batch silently
fails without invalidating the H.264 batch.
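
Why the two batches are separate, in a sketch with an injected
submit function so it runs without a device. The HEVC control IDs
are the ones listed in this commit; the H.264 IDs, the demo_* names
and the returned bitmask are assumptions (the backend itself
discards the result with a (void) cast).

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_H264_OK 0x1
#define DEMO_HEVC_OK 0x2

/* Hypothetical stand-in for one batched S_EXT_CTRLS submission. */
typedef int (*demo_submit_fn)(const uint32_t *ids, size_t count);

/* One batch per codec: a rejected batch leaves the other intact. */
int demo_device_init(demo_submit_fn submit)
{
    /* Assumed H.264 DECODE_MODE / START_CODE control IDs. */
    static const uint32_t h264_menus[2] = { 0xa40900, 0xa40901 };
    /* HEVC DECODE_MODE / START_CODE, as listed in this commit. */
    static const uint32_t hevc_menus[2] = { 0xa40a95, 0xa40a96 };
    int ok = 0;

    if (submit(h264_menus, 2) == 0)
        ok |= DEMO_H264_OK;
    if (submit(hevc_menus, 2) == 0)
        ok |= DEMO_HEVC_OK;
    return ok;
}

/* Fake driver without HEVC controls: rejects any batch naming one. */
int demo_fake_no_hevc_submit(const uint32_t *ids, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (ids[i] >= 0xa40a90 && ids[i] <= 0xa40a96)
            return -1;
    return 0;
}
```

Had all four controls shared one batch, the fake driver above would
reject the whole call and the H.264 menus would stay unset too.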

Meson.build: uncomment 'h265.c' (line 50) and 'h265.h' (line 73)
in sources + headers lists.

H265.h: added HEVC_MAX_SLICES_PER_FRAME=64 #define before struct
forward declarations.

Phase 6 smoke test on fresnel (post Commit A + Commit B):

  Criterion 1: vainfo lists VAProfileHEVCMain on rkvdec env binding
              (/dev/video1 + /dev/media0). PASS.

  Criterion 3: ffmpeg -hwaccel vaapi HEVC decode of bbb_720p10s_hevc.mp4
              -frames:v 5 -f null -, exit 0. cap_pool_init: 24 slots
              ready. PASS.

  Criterion 4: mpv --hwdec=vaapi --vo=image at +02s seek, HEVC fixture:
    HW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
    SW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
    HW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
    SW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
    HW=SW byte-identical for both frames; frame1 != frame2 (real motion).
    PASS.

  Criterion 5: regression hashes hold for both prior cells:
    H.264 +30s HW frame 1: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9 (T4 ref MATCH)
    H.264 +30s HW frame 2: 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8 (T4 ref MATCH)
    MPEG-2 +02s HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092 (iter1 ref MATCH)
    MPEG-2 +02s HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de (iter1 ref MATCH)
    PASS.

All five criteria green on first build attempt — Phase 5 review
caught the 3 Critical UAPI errors (data_bit_offset → data_byte_offset
rename; dpb.rps field gone + pic_order_cnt_val rename + index-array
semantics) that would have been Phase 6 compile failures or silent
Phase 7 byte-compare divergences. Without that review pass, this
commit would have been the start of a 2+ loopback debugging cycle.

Refs:
  ../fresnel-fourier/phase4_iter2_plan.md (10 contract clauses,
                                            File 4 patch shape)
  ../fresnel-fourier/phase5_iter2_review.md (C1, C2, C3, S1, S2,
                                              S3, S4, Q2 amendments
                                              all incorporated)
  ../fresnel-fourier/phase0_evidence/2026-05-08/iter2_phase3/
    ffmpeg_v4l2req.stdout (cross-validator anchor — Phase 7
    bonus byte-compare verification target)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:58:34 +02:00
claude-noether cca539d5f9 fresnel-fourier iter2 Phase 6 commit A: config.c break for HEVCMain case
RequestCreateConfig dispatches H.264 + MPEG-2 cases via break.
HEVCMain previously fell through to default returning
VA_STATUS_ERROR_UNSUPPORTED_PROFILE (= 12). Same fall-through
pattern iter1 fixed for MPEG-2; iter2 closes the loop for HEVC.

Add break for VAProfileHEVCMain. Same shape as iter1 Commit A
pattern — no profile-specific config validation in
RequestCreateConfig (validation happens at vaCreateContext /
control submission time).

This is the substrate fix only. After this commit:
  - vaCreateConfig(VAProfileHEVCMain) returns SUCCESS
  - mpv-vaapi HEVC ATTEMPTS to set up the hwaccel path
  - codec_set_controls at picture.c:204-206 still has the
    explicit case VAProfileHEVCMain: return UNSUPPORTED_PROFILE
    reject in place
  - decode fails downstream with -5 (Input/output error)

Bug 2 (picture.c reject removal) + Bug 3-7 (h265.c rewrite +
meson re-enable + slice_params accumulation + device-init
extension) land together in commit B, where h265_set_controls
exists to dispatch to.

Verified empirically Phase 3 Baseline D (scratch test on
throwaway branch): with this break alone, vaCreateConfig
SUCCESS for HEVCMain, V4L2 setup proceeds, decode fails at
the picture.c reject — confirms Phase 2 prediction. T4 H.264
+ iter1 MPEG-2 reference hashes hold (no collateral
regression).

Refs:
  ../fresnel-fourier/phase0_findings_iter2.md (Phase 1 lock)
  ../fresnel-fourier/phase2_iter2_situation.md Bug 1
  ../fresnel-fourier/phase3_iter2_baseline.md Baseline D
  ../fresnel-fourier/phase4_iter2_plan.md Clause 8, File 1
  ../fresnel-fourier/phase5_iter2_review.md (no Critical findings
                                              touch this commit)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:00:30 +02:00
claude-noether 229d6d11be fresnel-fourier iter1 Phase 6 commit D: drop missed mpeg2-ctrls.h include from context.c
Fix-forward for commit C (3aab187): Phase 2 source-read missed a
third occurrence of #include <mpeg2-ctrls.h> in src/context.c:42.
The Phase 2 grep audit reported only two callsites
(src/config.c:37, src/mpeg2.c:38), both removed in commit B.
After commit C deleted include/mpeg2-ctrls.h from disk, the build
broke on context.c with:

  ../src/context.c:42:10: fatal error: mpeg2-ctrls.h:
  No such file or directory
     42 | #include <mpeg2-ctrls.h>
        |          ^~~~~~~~~~~~~~~

The include in context.c was vestigial — context.c references no
V4L2_CID_MPEG_VIDEO_MPEG2_* symbols and never needed the header
even before iter1's rewrite. The Phase 2 grep was simply incomplete.

This commit drops the orphan include line. Build now passes; install
clean; Phase 1 criterion 4 (DMA-BUF GL HW=SW byte-identical pixel
hashes) still PASS:

  HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
  SW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
  HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
  SW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de

Per feedback_dev_process.md Phase 6 discipline:
"If a plan revision is needed mid-implementation, surface it
explicitly and re-enter Phase 4."

This is a 1-line scope expansion of commit B's "drop mpeg2-ctrls.h
include from all callsites" intent. Surfacing explicitly here
rather than silently amending B (which is already pushed). No
re-lock of plan needed; the spirit of File 1+2 in
phase4_iter1_plan.md was "drop the include from every file that
has it." The audit method (Phase 2 grep) was the gap.

Lesson for Phase 8 memory update: before deleting a header, use a
more authoritative completeness check than a path-filtered grep.
A recursive build attempt to drive out hidden includes, or a grep
with no path filter, would have caught this.

Refs:
  ../fresnel-fourier/phase4_iter1_plan.md (File 3 + audit)
  ../fresnel-fourier/phase2_iter1_situation.md Bug 3 (incomplete
                                                     audit)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 10:24:50 +02:00