110 Commits

Author SHA1 Message Date
marfrit 407c7c56e1 iter39 Phase 4-6 LANDED on backend — Phase 7 awaiting fresnel power-on
Adds the iter39 sub-profile (H264 Hi10P + HEVC Main10) FR landing
materials and resumption sequence to the campaign repo.

- phase4_iter39_subprofile_plan.md: full Phase 4 plan with Phase 5
  sonnet-architect review amendments folded in. Documents the
  Option A/B/C/D scope tree, the locked Option C choice (full NV15→P010
  userspace unpack), the LOC breakdown (~180), and the test plan.
- phase7_iter39_test_rig.sh: end-to-end test script for fresnel. Encodes
  Hi10P + Main10 fixtures, runs libva vs kdirect bit-exact comparison
  (both via `-vf hwdownload,format=p010le` to normalize the NV15 stride
  difference between paths), SSIM_Y check vs SW reference, and verifies
  the iter38 5/5 baseline still holds.
- PRE_COMPACT_HANDOFF.md: TL;DR table row for iter39 (committed
  pending validation), Phase 7 resumption sequence, internals-summary
  for future-session resumption.

Backend tip: `662f887` (iter39 α-31) + `8746690` (unpack self-test) on
gitea master. Self-test passes on noether x86_64; compile-test clean on
boltzmann aarch64 native; self-review of commit vs Phase 5 amendments
APPROVED. Phase 7 actual decode test blocked on fresnel power-on.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:22:34 +00:00
marfrit e66c5c0583 Update handoff doc for final iter38 close — 5/5 PASS in single libva session 2026-05-14 19:23:09 +00:00
marfrit ba4b6fd240 iter38 close: multi-device probe — 5/5 codecs in one libva session
Closes the last architectural open item from iter37 campaign-close.

VA_DRIVER_INIT now opens BOTH rkvdec and hantro fds when no env
override is set. RequestQueryConfigProfiles enumerates the union
(via any_fd_supports_output_format helper). RequestCreateConfig
calls request_switch_device_for_profile to retarget the active fd
to the device that serves the profile; tears down output_pool +
capture_pool + video_format cache on switch so the next
RequestCreateContext rebuilds them on the new device.

Bonus iter38b: fixed latent bounds-check bug in
RequestQueryConfigProfiles — profile array bounds checks should use
MAX_PROFILES (11), not MAX_CONFIG_ATTRIBUTES (10). Pre-iter38 single-
device probes never returned more than 9 profiles so the off-by-one
never bit. iter38's 10-profile union surfaced it.

Result: 5/5 codecs PASS bit-exact vs kdirect with NO env override.

  $ vainfo (no env)
  v4l2-request: auto-selected: /dev/video3 + /dev/media1
  v4l2-request: iter38: also opened hantro at /dev/video2 + /dev/media0
  → all 10 profiles enumerated (MPEG2+H264+HEVC+VP8+VP9)

Backend tip 7ac934e.
2026-05-14 18:58:17 +00:00
marfrit 7e3eadf983 iter36 close: env-gated DIAG removed, 5/5 PASS retained
Backend cleanup: -131 lines (iter29/30/33/35 env-gated dump probes).
Load-bearing fixes (α-25, α-29, α-30) untouched. Framework env knobs
(LIBVA_V4L2_DUMP_OUTPUT, etc.) retained.

5-codec regression post-cleanup: all 5 PASS with identical hashes to
pre-cleanup. Backend tip 7db15a5.
2026-05-14 18:13:39 +00:00
marfrit 7c06c519e7 iter35 close: MPEG-2 verified libva-correct; HW IDCT precision intrinsic
Investigation requested via /user 'mpeg2 next' after iter33 closed
MPEG-2 at libva==kdirect with libva!=SW gap. iter35 verifies:

1. libva==kdirect holds across:
   - BBB 720p MPEG-2 240F (sha a75015b10fe205ea)
   - Synthetic testsrc 720p MPEG-2 48F (sha 5557dae9fa99d01b)
   Rules out fixture coincidence.

2. v4l2 ctrl structs (SEQUENCE + PICTURE + QUANTISATION) semantically
   correct per V4L2 spec via iter35 env-gated DIAG dump. Intra Q
   matrix in zigzag scan order; flags = PROGRESSIVE+FRAME_PRED_DCT.

3. HW vs SW gap quantified:
   - mean byte-diff = 0.015, max = 3
   - SSIM = 0.999923 (visually identical)
   Within IEEE 1180 / ISO 13818-2 Annex A MPEG-2 IDCT precision
   tolerance. Not fixable at V4L2/libva layer.
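
A minimal sketch of how the mean and max byte-diff figures in item 3 could be computed from two raw frame dumps of equal size. The function name and the toy buffers are illustrative assumptions, not the campaign's actual tooling; SSIM is not computed here.

```python
def byte_diff_stats(hw: bytes, sw: bytes) -> tuple[float, int]:
    """Return (mean, max) absolute per-byte difference of two equal-size buffers."""
    if len(hw) != len(sw):
        raise ValueError("buffers must have the same length")
    if not hw:
        return 0.0, 0
    diffs = [abs(a - b) for a, b in zip(hw, sw)]
    return sum(diffs) / len(diffs), max(diffs)


# Toy data (hypothetical): 200 samples, three of them off by 1, 1 and 3.
sw_frame = bytes([128] * 200)
hw_frame = bytearray(sw_frame)
hw_frame[10] += 1
hw_frame[50] -= 1
hw_frame[120] += 3

mean_diff, max_diff = byte_diff_stats(bytes(hw_frame), sw_frame)
print(f"mean byte-diff = {mean_diff:.3f}, max = {max_diff}")
```

On real dumps the same call would take the hwdownload'ed HW frame and the SW-decoded frame in the same pixel format.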

ffmpeg's own SW IDCT options (-idct auto/int/simple/...) produce 4
distinct hashes for the same bitstream — libavcodec internally has
the same precision divergence between impls. Hantro HW IDCT is yet
another impl, not matching any SW.

Backend tip 48fd028 (added env-gated LIBVA_MPEG2_DUMP_FRAME).
Memory: feedback_mpeg2_hw_sw_idct_precision.md added.
2026-05-14 17:58:31 +00:00
marfrit 7ef4ac234b Final handoff doc: campaign closed at 5/5 PASS on 7.0-14 clean kernel 2026-05-14 16:52:18 +00:00
marfrit 70ddbd6c4b iter34 close: kernel 7.0-14 CLEAN ship — 5/5 codecs PASS
iter33 VP8 diagnostic printks reverted from rockchip_vpu2_hw_vp8_dec.c
and hantro_g1_vp8_dec.c. 6 base patches retained (DTS + dma_resv +
hantro/rga fences).

Post-reboot 5-codec regression on 7.0-14:
- h264   libva=dd4f5f2d552c07bc L==K L==SW
- hevc   libva=108f925bb6cbb6c9 L==K L==SW
- vp9    libva=cf35908ae0f9ab60 L==K L==SW
- vp8    libva=d3231e5b6c0ee10b L==K L==SW
- mpeg2  libva=95c5905890c937d4 L==K L!=SW (HW IDCT ≤1 LSB precision)

dmesg | grep iter — empty. Substrate clean.

Note: /dev/video* and /dev/media* device numbers SHIFTED between
kernel boots (rkvdec /dev/video1→/dev/video3, hantro /dev/video3→
/dev/video2 between 7.0-13 and 7.0-14). Updated memory entry to
flag this.
2026-05-14 16:50:59 +00:00
marfrit cd2d077cb6 iter33: MPEG-2 closed (libva==kdirect bit-exact) — 5/5 codecs PASS
After VP8 fix landed, ran 5-codec libva-vs-kdirect anchor sweep.
All 5 codecs produce byte-identical libva and kdirect output:
  h264   sha=dd4f5f2d552c07bc
  hevc   sha=108f925bb6cbb6c9
  vp9    sha=cf35908ae0f9ab60
  vp8    sha=d3231e5b6c0ee10b
  mpeg2  sha=95c5905890c937d4

MPEG-2 HW output (libva and kdirect agree) differs from libavcodec
SW MPEG-2 by ~1 LSB per ~67 pixels — hantro IDCT precision artifact,
not a libva bug. Effectively 5/5 PASS for libva correctness contract.
2026-05-14 16:40:05 +00:00
marfrit 51eee192b8 iter33 α-30 close: VP8 FIXED — 4/5 codecs PASS
ffmpeg-vaapi strips the VP8 uncompressed frame header before
submitting VASliceData. Hantro hard-codes first_part_offset = 10
or 3 based on keyframe flag. libva must prepend matching placeholder
bytes. Backend commit 7e0848d.
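
The placeholder logic described above can be sketched as follows. This is an illustration only: the function name is hypothetical, and zero-filled padding is an assumption (the commit only says the bytes must match the offset hantro expects). The 10-byte and 3-byte lengths follow the VP8 bitstream layout: a 3-byte frame tag, plus a 3-byte start code and 4 bytes of dimensions on keyframes.

```python
KEYFRAME_HEADER_LEN = 10    # frame tag (3) + start code (3) + width/height (4)
INTERFRAME_HEADER_LEN = 3   # frame tag only


def prepend_vp8_placeholder(slice_data: bytes, is_keyframe: bool) -> bytes:
    """Re-create the header length that the VAAPI client stripped.

    Assumption: the padding content is irrelevant and only the length has
    to match the first_part_offset the driver hard-codes.
    """
    pad = KEYFRAME_HEADER_LEN if is_keyframe else INTERFRAME_HEADER_LEN
    return bytes(pad) + slice_data


payload = b"\xaa" * 5
key_buf = prepend_vp8_placeholder(payload, is_keyframe=True)
inter_buf = prepend_vp8_placeholder(payload, is_keyframe=False)
print(len(key_buf), len(inter_buf))
```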

3-codec rkvdec anchors unchanged (H264 + HEVC + VP9 all PASS).
VP8 newly PASS through hantro (env-override LIBVA_V4L2_REQUEST_VIDEO_PATH).
MPEG-2 surfaces as next codec — same hantro device, different bug.

Memory: feedback_vaapi_strips_vp8_uncompressed_header.md added.
2026-05-14 16:38:11 +00:00
marfrit acacf3d7eb iter32 close: kernel substrate cleanup landed → 7.0-11 SHIPPING
All iter17/20-31 diagnostic pr_info printks removed from
v4l2-ctrls-request.c + rkvdec-hevc.c. 6 base patches retained.

3-codec anchor regression on 7.0-11 post-reboot:
- H.264 10F: PASS (sha dd4f5f2d552c)
- HEVC 10F:  PASS (sha 108f925bb6cb)
- VP9 10F:   PASS (sha cf35908ae0f9)

Hashes identical to 7.0-10 — confirms no kernel-side regression
from removing printks. dmesg clean of iter* entries.

Memory entry reference_fresnel_kernel_substrate.md updated:
substrate now at 7.0-11; Bug 4/Bug 5 marked RESOLVED as libva-side
fixes (NOT kernel-side as originally hypothesised).
2026-05-14 15:41:05 +00:00
marfrit 85cc1781e1 Update campaign session doc: full-day arc closes at 3/3 PASS 2026-05-14 15:34:22 +00:00
marfrit fde8a25779 Update handoff doc: HEVC Bug 5 fully fixed (3/3 PASS) 2026-05-14 15:32:27 +00:00
marfrit c1f9738368 iter31 α-29 close: HEVC Bug 5 remainder FIXED — 3/3 PASS
Fix: slice_params.short_term_ref_pic_set_size = picture->st_rps_bits
(was 0, mis-routed via α-26 into decode_params with same field name).

Final 5-codec state:
- H.264 10F: PASS (byte-equal SW)
- HEVC 10F: PASS (byte-equal SW)  ← THIS ITER
- VP9 10F: PASS (byte-equal SW)
- MPEG-2 / VP8: untestable through libva single-device probe
  (pre-existing limitation, orthogonal to Bug 4/5)

Backend fork tip: 23eb1bd. Kernel: 7.0-10 (diagnostic printks still in,
production cleanup outstanding).
2026-05-14 15:30:48 +00:00
marfrit 422ecafca9 Add pre-compact handoff doc for session resumption 2026-05-14 14:55:15 +00:00
marfrit c15fc6c0f6 iter28b DIAG documented: universal trim=40 breaks IDR (reverted)
Confirmed the 40-byte inflation is non-uniform — IDR slice has correct
size from VAAPI; only P/B slices are inflated. Real fix requires dynamic
rbsp_stop_bit detection or per-slice-type logic.
2026-05-14 14:45:35 +00:00
marfrit 8b17bf797a Final session summary: H264 + VP9 + HEVC frame 1 byte-equal to SW
Bug 4 (H264 keyframe-partial): FIXED.
Bug 5 (HEVC libva all-zero): partial fix, frame 1 byte-equal.
Root cause: rkvdec_s_ctrl -EBUSY when first SPS triggers image_fmt
reset on busy CAPTURE queue (libva pre-allocates buffers at
CreateContext, kernel blocks the reset).
Fix: 90-LOC synthetic SPS injection in libva CreateContext before
cap_pool_init pre-seeds ctx->image_fmt.

Remaining: HEVC frame 2+ (ffmpeg-vaapi slice_data 40-byte inflation),
MPEG-2/VP8 (libva multi-device probe). Both deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 12:10:08 +00:00
marfrit 02c4192902 iter27/28: probe HEVC frame 2+ divergence; α-27/α-28 no-op; ffmpeg-vaapi slice_data inflation localized
α-27: num_entry_point_offsets — VAAPI returns 0, rkvdec doesn't use it
α-28: bit_size = (slice_data_size - data_byte_offset) * 8 — matches kdirect's
      printk value, but rkvdec doesn't use bit_size either. Output unchanged.

Remaining HEVC frame 2+ root cause: libva's slice_data buffer (from VAAPI)
is 40 bytes larger per slice than what ffmpeg-v4l2request appends from
libavcodec for the same frame. The trailing bytes inflate OUTPUT buffer
content → rkvdec reads past slice payload into garbage → frame 2+ wrong.

Campaign status: H264 PASS (Bug 4 fixed), HEVC frame 1 PASS (Bug 5 partial),
VP9 PASS, HEVC frame 2+ ⚠️ (deferred to ffmpeg-vaapi-level fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:28:34 +00:00
marfrit bf67900cd8 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24:    pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
           to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
           before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
           ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
           still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
           HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
           picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
           dp now match kdirect. HEVC frame 2+ still diverges
           (separate bug, likely DPB entry mapping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:10:56 +00:00
marfrit a443ad73d3 iter19 Phase 8 close: mechanism 2 (REINIT) disproved; ctrl_hdl mismatch is sole remaining hypothesis
α-23 test (skip media_request_reinit): no change. HEVC still 06b2c5a0...
all-zero. Kernel printk still shows w=0 h=0 for libva.

Cumulative disproved mechanisms (iter17-iter19):
  2. REINIT clears between S_EXT_CTRLS and QUEUE: DISPROVED (α-23)
  3. Stale stack-local pointer: DISPROVED (α-21)
  5. Silent partial failure via error_idx: DISPROVED (α-22)
  1. request_fd mismatch: unlikely per strace evidence

Remaining:
  4. ctrl_hdl mismatch — libva submits to one v4l2_ctrl_handler,
     rkvdec reads from another.

iter20 candidate: kernel printk dumping &ctx->ctrl_hdl, per-ID
ctrl pointer, and *p_cur.p first bytes during rkvdec_hevc_run_preamble.
Comparing libva vs kdirect will pinpoint where the mismatch sits.

State at close: backend c1d4bb53... (iter15 stable). Fork tip 415688d.
5-codec anchors held. Diagnostic kernel 7.0-3 still running on fresnel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:04:11 +00:00
marfrit a449cec92e iter18 Phase 8 close: mechanisms 3 + 5 disproved; iter17 finding stands
α-21 (heap-persist HEVC controls past IOC_QUEUE): hash unchanged.
  -> Kernel does copy at S_EXT_CTRLS time, not deferred. Mechanism 3 dead.

α-22 (log error_idx after S_EXT_CTRLS): error_idx = count - 1 in BOTH
  the working device-init batch AND the broken per-frame batch. Not
  a failure indicator in this kernel version. Mechanism 5 dead.

Backend reverted to iter15 stable state c1d4bb53... All 5-codec
anchors preserved.

Remaining mechanisms (untested):
  1. request_fd mismatch (unlikely; strace shows consistent fd)
  2. REINIT clears controls between S_EXT_CTRLS and QUEUE (LEADING)
  4. ctrl_hdl mismatch (libva submits to one, rkvdec reads from another)

iter17's empirical finding still stands as the campaign's strongest
narrowing: rkvdec sees zero SPS for libva, correct for kdirect. The
mechanism is between S_EXT_CTRLS submission and ctx->ctrl_hdl->p_cur
read, specific to libva's invocation pattern.

iter19 candidate (α-23): test mechanism 2 by disabling
media_request_reinit() in libva's RequestSyncSurface. If hashes
change, REINIT timing is the bug. Alternative (mechanism 4): kernel
printk that dumps &ctx->ctrl_hdl + per-request handler pointer,
comparing libva vs kdirect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:02:19 +00:00
marfrit cbead4ec64 iter17 Phase 7: KERNEL PRINTK FINDS THE BUG — controls lost between S_EXT_CTRLS and rkvdec read
DEFINITIVE FINDING via pr_info in rkvdec_hevc_run on RK3399:

libva HEVC:    w=0 h=0 reorder=0 chroma=0 nal_unit_type=0 decode_flags=0x0
kdirect HEVC:  w=1280 h=720 reorder=2 chroma=1 nal_unit_type=20 decode_flags=0x3

The kernel sees ALL-ZERO control structs for libva HEVC, but CORRECT values
for kdirect. Same kernel, same code path, same /dev/video1, same
rkvdec_hevc_run_preamble fetching v4l2_ctrl_find(ctx->ctrl_hdl,
HEVC_SPS)->p_cur.p.

This overturns iter11-iter15's "wire-byte search exhausted" conclusion.
The S_EXT_CTRLS payloads ARE byte-correct at the strace observer level,
but the kernel sees zeros. The bug is in the
S_EXT_CTRLS -> request -> ctx->ctrl_hdl path, specifically for libva.

Five mechanisms hypothesized:
  1. request_fd mismatch
  2. REINIT clears controls before QUEUE
  3. Compound-control copy deferred until QUEUE -> stack-locals stale
  4. ctrl_hdl mismatch (libva submits to one, rkvdec reads another)
  5. error_idx silently fails

Key difference observed:
  libva stores SPS/PPS/decode_params as STACK LOCALS in h265_set_controls
  kdirect stores them in heap-allocated hwaccel_picture_private

Mechanism 3 (kernel defers compound-ctrl copy_from_user) is the leading
hypothesis. iter18 α-21: heap-allocate libva's HEVC control structs;
if Bug 5 fixes, apply same pattern to H.264 (Bug 4) and VP8 (Bug 6).

This is the strongest narrowing since iter5b-β.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:55:58 +00:00
marfrit 57051b665c iter17 Phase 0: kernel-side rkvdec_hevc_run diagnostic printk
Per iter16 close (Bug 4/5/6 confirmed kernel-side, libva byte-correct),
add a single pr_info at rkvdec_hevc_run entry dumping key state values
from run->sps / pps / slices_params[0] / decode_params. Build 7.0-3,
deploy, reboot, run libva-HEVC + kdirect-HEVC, diff dmesg output.

Outcome interpretations:
  identical -> bug is in rkvdec assemble_hw_*/config_registers/HW path
  different -> libva somehow leaks different struct contents via non-
                ioctl path despite identical V4L2 ioctls

Build running on boltzmann via kernel-agent workflow; pkgrel 7.0-2 -> 7.0-3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:44:57 +00:00
marfrit caf480ef71 iter16 Phase 8 close: VP8 OUTPUT byte-verified — Bug 4/5/6 same cause class
Applied iter14's α-16 OUTPUT byte verification to VP8. Result:
  libva VP8 frame 1 OUTPUT dump: 300614 bytes
  input IVF frame 1: 300624 bytes
  diff with +10-byte offset (VP8 uncompressed header stripped by VAAPI
  consumer client-side): 0 bytes differ.

Libva's VP8 OUTPUT bytes are byte-identical to the input frame minus
the 10-byte uncompressed header. Same correctness as iter14's HEVC
verification.
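
A sketch of the offset comparison used above, assuming both the OUTPUT dump and the IVF frame payload are available as byte strings. The helper name and the synthetic data are hypothetical.

```python
def count_diffs_with_offset(output: bytes, source: bytes, offset: int) -> int:
    """Count positions where output[i] != source[i + offset].

    The lengths must line up exactly, i.e. the output is the source minus
    its first `offset` bytes (the stripped uncompressed header).
    """
    if len(output) + offset != len(source):
        raise ValueError("length mismatch: output + offset != source")
    return sum(1 for a, b in zip(output, source[offset:]) if a != b)


# Synthetic stand-in for an IVF frame: 10 header bytes followed by payload.
ivf_frame = bytes(10) + bytes(range(256)) * 4
output_dump = ivf_frame[10:]

differing = count_diffs_with_offset(output_dump, ivf_frame, offset=10)
print(differing)
```

With the sizes reported in this commit, 300624 - 10 = 300614, so the length check passes for the real frame as well.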

Cumulative finding: ALL THREE remaining campaign bugs (Bug 4 H.264
partial-fill, Bug 5 HEVC all-zero, Bug 6 VP8 partial) have:
- libva controls byte-equal to kdirect on rkvdec-read fields
- libva OUTPUT bitstream bytes byte-identical to input
- libva ioctl sequence structurally close to kdirect after iter15 α-19

But:
- VP9 + MPEG-2 work via the same libva backend on the same kernel.
- libva HEVC/H.264 hash to wrong output; kdirect HEVC/H.264 hash to
  correct output. Same kernel.

Therefore Bug 4 + 5 + 6 are kernel-side rkvdec/hantro per-codec bugs
specific to libva's ioctl pattern. Per
feedback_libva_byte_correct_kernel_bug.md (saved iter14), libva-side
changes are confirmed inert for these bugs.

iter17 productive direction: kernel-side investigation via
kernel-agent workflow. Read rkvdec source, instrument via ftrace/
eBPF kprobe, compare kernel state evolution between libva-trigger
and kdirect-trigger for same bitstream.

No code changes in iter16. Substrate unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:37:26 +00:00
marfrit 42c0515900 iter15 Phase 8 close: α-19 S_FMT CAPTURE wires up, 14 hypotheses eliminated
Phase 3 ioctl-sequence diff identified missing S_FMT CAPTURE in libva
init (only G_FMT was being called, per iter5b-β's hantro-targeted
comment). α-19 added explicit S_FMT CAPTURE with NV12 + dims after
S_FMT OUTPUT, before CREATE_BUFS. strace confirms libva now emits
identical S_FMT CAPTURE call to kdirect:
  S_FMT CAPTURE NV12 1280x720 -> sizeimage=1843200, bytesperline=1280

5-codec sweep on α-19 backend: byte-identical anchors. HEVC still
06b2c5a0... all-zero, H.264 still 71ac099b... partial. Wire correct,
behavior unchanged.

Cumulative iter8-iter15: 14 hypotheses eliminated for Bug 4 + 5. Libva
backend ioctl + payload sequence is now structurally equivalent to
kdirect's at every byte/field level rkvdec reads. Remaining diffs are
in allocation pattern (REQBUFS vs incremental CREATE_BUFS) and pool
sizes (libva 24+16, kdirect ~13+4) — high-risk to change without
clearer kernel evidence; VP9/MPEG-2 work with libva's pattern.

Bug 4 + 5 confirmed kernel-side rkvdec failures specific to HEVC +
H.264 paths on RK3399 that libva's pattern triggers and kdirect's
doesn't. Per-codec kernel-level investigation is the only productive
direction; route via kernel-agent.

α-19 ships as wire-correctness hygiene (zero regression). Backend
SHA c1d4bb53...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:35:37 +00:00
marfrit 18f24cd26d iter14 Phase 8 close: α-16 finds libva HEVC OUTPUT bytes BYTE-IDENTICAL to input
α-16 OUTPUT byte dump: libva HEVC frame 1 = 96893 bytes = 1 ANNEX-B
start code + 96890 byte IDR NAL with header 0x28 (nal_unit_type 20 =
IDR_N_LP, correct). Byte-compared against input file's raw HEVC
ANNEX-B stream (after VPS+SPS+PPS): 0 bytes differ over 96890 byte
overlap. The 1-byte tail diff is an inter-NAL boundary marker, not
slice payload.
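
The header check above can be reproduced with a few lines. In HEVC the first NAL header byte holds a 1-bit forbidden_zero_bit followed by the 6-bit nal_unit_type; the 3-byte start-code length is the one implied by the sizes in this commit. Names here are illustrative.

```python
START_CODE_LEN = 3  # Annex-B 00 00 01, as implied by 96893 - 96890


def hevc_nal_unit_type(first_header_byte: int) -> int:
    """Extract nal_unit_type (bits 6..1) from the first NAL header byte."""
    return (first_header_byte >> 1) & 0x3F


nal_type = hevc_nal_unit_type(0x28)
total_len = START_CODE_LEN + 96890
print(nal_type, total_len)
```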

Libva submits slice bytes that are BYTE-IDENTICAL to what the input
contains and to what kdirect submits. Combined with iter11's wire-byte audit
showing every libva-vs-kdirect control diff is in a field rkvdec
ignores, AND iter12's RFC v2 substrate upgrade producing zero
codec-correctness change, AND iter13's DMA_BUF_IOCTL_SYNC ioctl
working but inert:

Cumulative iter8-iter14: 13 hypotheses eliminated. Libva backend
is empirically byte-correct on its side. Bug 4 + Bug 5 are
KERNEL-SIDE failures specific to how rkvdec processes the libva
ioctl sequence vs the kdirect sequence — NOT a libva backend bug.

iter15+ candidates:
  - Full ioctl-sequence trace diff (libva vs kdirect, find first
    divergence in syscall order/args).
  - kernel-side rkvdec ftrace/eBPF kprobe instrumentation; route
    via kernel-agent.
  - Campaign close-out: VP9+MPEG-2 PASS direct, HEVC+H.264+VP8 narrowed
    to kernel-side with byte-clean libva submission.

Backend SHA fa2098b6... 8 cumulative iter11-iter14 commits all ship
clean (wire-correctness, env-gated diagnostics, zero regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:29:10 +00:00
marfrit 2eaf737145 iter13 Phase 8 close: α-17 DMA_BUF_IOCTL_SYNC ioctls fire but hashes unchanged
α-17 implemented and deployed. strace confirms VIDIOC_EXPBUF +
DMA_BUF_IOCTL_SYNC(START|READ) before memcpy + END after, all returning 0.
The libva backend now follows the V4L2+dma-buf cache-sync contract
correctly. But 5-codec sweep hashes are byte-identical to anchors:
no Bug 4/5 movement.

Cache-sync hypothesis empirically falsified. Bug 4 + 5 are NOT a CPU
cache-coherency issue on the libva cached-mmap path.

Three consecutive PARTIAL closes (iter11 wire-byte, iter12 RFC v2,
iter13 cache-sync) confirm that the libva-backend-side hypothesis space for
Bug 4+5 is exhausted. The live source is kernel-side write-
completeness for HEVC and H.264 on RK3399 rkvdec — distinct from
cache visibility (γ dump iter8 already confirmed destination_data[]
post-DQBUF matches YUV output).

Backend SHA on fresnel: 9ba47002...

iter14 candidates:
  α-16: OUTPUT byte dump (cheapest remaining)
  kernel-side rkvdec audit (deepest; route via kernel-agent)
  pivot to Bug 6 VP8 or campaign close-out documentation

α-17 itself is real wire-correctness progress even as a non-fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:08:11 +00:00
marfrit 33f74b07c8 iter12 Phase 8 close: kernel 7.0-2 with RFC v2 deployed; Bug 4/5 unchanged
Boltzmann built linux-fresnel-fourier 7.0-2 in ~50 min (8-core native,
no distcc). Package sha 843fd4462a09b3d9... Deployed to fresnel:
sudo pacman -U clean. extlinux hook updated entry. sddm autologin as
mfritsche persisted. Reboot succeeded; fresnel up on new kernel
within 30s.

5-codec sweep post-reboot: all 5 hashes BYTE-IDENTICAL to pre-iter12
anchors. RFC v2's dma_resv fence machinery does NOT engage libva's
cached-mmap pixel readback path. Consistent with what
reference_dmabuf_resv_blocker.md memo always said: vaDeriveImage /
cached-mmap is the broken path; RFC v2 helps DRM_PRIME / compositor
paths.

Substrate state moved forward (kernel 7.0-1 -> 7.0-2 with RFC v2).
Memory entries updated:
  reference_fresnel_kernel_substrate.md (pkg version + patch list)
  feedback_rfc_v2_vb2_dma_resv_scope.md (NEW — scope clarification)

iter13 candidates ranked:
  α-17: DMA_BUF_IOCTL_SYNC(START|END) in libva backend around image
        read sites (~30 LOC).
  α-18: switch libva image export to DRM_PRIME (larger refactor).
  α-16: OUTPUT byte dump (deferred again).

α-17 is the natural follow-on — Figa's 2024 "userspace responsibility
for explicit sync" line directly addresses the libva-cached-mmap path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 07:47:51 +00:00
marfrit de889898b8 iter12 Phase 4 + 6: integrate vb2_dma_resv RFC v2 into linux-fresnel-fourier 7.0-2
User signaled RFC v2 is prepared at boltzmann:~/v2-patch-work/v2-out/.
Three patches:
  0001 media: videobuf2: add opt-in dma_resv producer fence helper
  0002 media: hantro: attach dma_resv release fence at device_run
  0003 media: rockchip-rga: attach dma_resv release fence at ...

v2 key change vs v1: attach moves from buf_queue to m2m device_run
(Dufresne's finite-time-contract concern). Build the kernel package
on boltzmann (~/src/kernel-agent-bootstrap/.../linux-fresnel-fourier/),
deploy to fresnel, reboot, retest.

sddm auto-login as mfritsche staged in /etc/sddm.conf.d/20-autologin.conf
on fresnel before reboot per user authorization.

Phase 0's α-16 OUTPUT-byte dump candidate parked; kernel substrate
upgrade takes precedence given RFC v2 is the long-stalled
reference_dmabuf_resv_blocker.md unblock.

Iter12 outcomes:
  PASS  = Bug 4/5 hashes shift toward kdirect after reboot.
  PARTIAL = kernel upgraded cleanly, no regression, hashes unchanged.

Either outcome is valuable — substrate moves forward regardless.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:56:15 +00:00
marfrit f40b025868 iter12 Phase 0: lock OUTPUT bitstream byte dump as next candidate
After iter11 close, both Bug 4 (H.264 partial fill) and Bug 5 (HEVC
all-zero) share the same architectural pattern: libva control payloads
can be made byte-equivalent to kdirect for fields rkvdec consumes,
yet libva produces wrong output while kdirect succeeds.

Remaining unexamined surface = OUTPUT bitstream bytes (source_data
that the kernel reads). iter12 candidate α-16: extend γ infra to
dump source_data pre-QBUF, compare with kdirect.

If bytes match → both bugs are outside libva (kernel/HW state).
If bytes differ → narrow to bitstream-write divergence site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:08:27 +00:00
marfrit 7807326aff iter11 Phase 8 close: HEVC wire-byte search exhausted; same wall as iter9
α-13 + α-14 changed two HEVC wire-byte fields to match kdirect
(sps_max_num_reorder_pics, decode_params IRAP|IDR flags). Output
unchanged (06b2c5a0... still all-zero). 5-codec regression sweep:
zero regression.

Cumulative iter11 eliminations: 4 fields (sps_max_num_reorder_pics,
sps_max_latency_increase_plus1, IRAP/IDR flags, num_entry_point_offsets)
all confirmed kernel-ignored on RK3399 per rkvdec-hevc.c grep.

Wire-byte landscape after iter11: every observable libva-vs-kdirect
HEVC control-payload diff is in a field rkvdec ignores. Bug 5 root
cause is NOT in S_EXT_CTRLS payload.

Same wall as Bug 4 / iter9: wire-byte search exhausted. Real cause is
in OUTPUT bitstream bytes the kernel reads. iter12 candidate: extend
γ infrastructure to dump source_data pre-QBUF, compare with kdirect
byte-by-byte. Bug 4 and Bug 5 likely both close via this same
instrumentation given the parallel structure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:07:35 +00:00
marfrit 7a1bd8ec0a iter11 Phase 5: α-13 inert; pivot to α-14 num_entry_point_offsets
Reviewer empirically read rkvdec-hevc.c on boltzmann kernel-agent
tree. sps_max_num_reorder_pics is NOT read by rkvdec. α-13 would
match kdirect's wire bytes but produce no behavioural change.

CRIT-2: num_entry_point_offsets (libva hardcoded 0 at h265.c:356,
kdirect 22 from slice header parse) + PPS UNIFORM_SPACING flag are
the live candidates. BBB HEVC uses WPP (ENTROPY_CODING_SYNC flag
set in PPS) not tiles; 22 entry points = 23 CTB rows for 720p with
32-pixel CTBs.
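
The entry-point arithmetic above, as a small sketch. It assumes WPP with a single slice covering the whole picture, so each CTB row after the first contributes one entry point; the function name is hypothetical.

```python
def wpp_entry_points(height_px: int, ctb_size: int) -> int:
    """Entry points for a WPP picture coded as one slice: CTB rows minus one."""
    ctb_rows = -(-height_px // ctb_size)  # ceiling division
    return ctb_rows - 1


rows_720p = -(-720 // 32)
entries_720p = wpp_entry_points(720, 32)
print(rows_720p, entries_720p)
```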

Decision: land α-13 as wire-correctness hygiene (matches kdirect,
no regression risk), then α-14 for the actual fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 02:06:37 +00:00
marfrit a18ba53d6b iter11 Phase 3 + 4: HEVC SPS wire-byte diff narrows Bug 5 to α-13
Phase 3 deep strace: only meaningful SPS diff is bytes 10-11.
  libva   bytes 10-11 = 00 00 (sps_max_num_reorder_pics=0, latency=0)
  kdirect bytes 10-11 = 02 04 (reorder=2, latency=4)

Hardcoded at h265.c:110-111 with comment "/* not exposed */". VAAPI's
VAPictureParameterBufferHEVC doesn't forward these; kdirect parses
SPS NAL directly. sps_max_num_reorder_pics = 0 tells rkvdec "no
reordering" -> B-frame decode blocked -> all-zero output (Bug 5 fits).

Secondary diffs (Phase 4b candidates if α-13 doesn't close):
  - SLICE_PARAMS num_entry_point_offsets = 0 (hardcoded at h265.c:356
    with "iter2 doesn't do tiles" comment); kdirect submits 22.
  - PPS UNIFORM_SPACING flag bit 20 (don't-care for non-tiled).

Phase 4 α-13: ~2 LOC fix. Set sps_max_num_reorder_pics =
sps_max_dec_pic_buffering_minus1 (safe upper bound per H.265 §A.4.2).
Leave sps_max_latency_increase_plus1 = 0 (spec "unconstrained").

Phase 5b review required before Phase 6b implementation per
"reviews never skippable".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:58:03 +00:00
marfrit 5f94d7a9ae iter10 close + iter11 Phase 0: pivot to HEVC wire-byte diff for Bug 5
iter10 closed negative at Phase 0 (Bommarito unreachable on RK3399).
Saved kernel build + reboot cycle by source-tree reachability check.

iter11 opens with Bug 5 (HEVC libva all-zero) as research target.
Replay iter8/iter9 methodology: deep strace HEVC libva vs kdirect,
decode V4L2_CID_STATELESS_HEVC_* control payload bytes, find the
diff that causes rkvdec to produce all-zero output for libva while
kdirect's submission produces correct decode.

In scope: src/h265.c (libva HEVC), Phase 3 strace + byte-decode.
Out of scope: ext_sps_st/lt_rps (VDPU381/383-only, not RK3399),
kernel patches until empirical evidence of a kernel-side gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:37:45 +00:00
marfrit 917e9b2691 iter10 Phase 0: Bommarito patch unreachable on RK3399 — close Phase 0 negative
Empirical reachability check on linux-fresnel-fourier 7.0-1 source tree
at boltzmann:~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/.../linux-7.0/.

rkvdec_hevc_assemble_hw_rps() is defined in rkvdec-hevc-common.c:411 and
called ONLY from rkvdec-vdpu381-hevc.c:609 (RK3576) and
rkvdec-vdpu383-hevc.c:620 (RK3588). RK3399's variant_ops bind to
rockchip,rk3399-vdec and route HEVC through the older standalone
rkvdec-hevc.c, which does NOT call rkvdec_hevc_assemble_hw_rps.

Bommarito's May 13 patch is real and load-bearing on RK3588/3576,
but inert on RK3399 / fresnel. Not iter10 vehicle for Bug 5.

Saved a kernel build/reboot cycle by Phase-0 reachability check.

Memory rule candidate: before applying any upstream patch to fresnel's
kernel, verify the patched path is reachable from rockchip,rk3399-vdec.
mainline rkvdec has diverging per-variant code (VDPU381/383 vs RK3399
legacy).

iter10 candidate pivots:
- α-10: audit rkvdec-hevc.c (RK3399 legacy) for analogous OOB gaps;
  same KUnit/KASAN methodology Bommarito used. Route via kernel-agent
  per user directive.
- α-12: stay on Bug 4 H.264 (PPS deep diff / OUTPUT bitstream).

User directive registered: "consult kernel-agent for kernel work."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:34:20 +00:00
marfrit 5e2a228cfd iter9 Phase 8 close: α-7 inert as predicted; wire-byte search exhausted
α-7 (monotonic timestamp counter) changed wire bytes but H.264 output
unchanged (71ac099b...). Confirms Phase 5 CRIT-1 prediction: VP9/MPEG-2
PASS via libva with the same v4l2_timeval_to_ns(&ref->timestamp)
pattern; therefore timestamp magnitude was never load-bearing.

5-codec regression sweep: all 4 non-H.264 anchors hold. Zero regression.

Cumulative state after iter8+iter9:
- 6 hypotheses eliminated (libva-readback, slot-binding, stale-residue,
  constraint_set_flags, POC sentinel, reference_ts magnitude)
- libva-vs-kdirect H.264 wire-byte diff is now empirically zero
- α-2 + α-7 shipped as wire-payload hygiene cleanups (zero behavior
  change but cleaner semantics)

iter10 candidate ranking:
1. α-8 OUTPUT bitstream byte dump (compare in-memory slice bytes)
2. α-9 untraced control diff (device-wide controls beyond DECODE_MODE
   + START_CODE)
3. Kernel-side investigation (rkvdec source dive for 16x32 partial-
   decode signature)
4. Pivot to Bug 5 (HEVC) or Bug 6 (VP8)

Two more iterations of diminishing returns suggest either deeper
empirical work (OUTPUT-byte dump or kernel investigation) or pivot
to a different bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:57:26 +00:00
marfrit 3b0880a97f iter9 Phase 5: CRIT-1 — α-7 contradicted by VP9/MPEG-2 PASS evidence
VP9 (vp9.c:624) and MPEG-2 (mpeg2.c:150,156) use v4l2_timeval_to_ns
identically to H.264. Both PASS via libva with the same gettimeofday-
based giant ns values. If timestamp magnitude were the bug, VP9/MPEG-2
should also fail. They don't.

Reviewer flagged α-7 as low-probability fix and pointed to iter10
kernel-side investigation (M-A vb2_find_buffer_by_timestamp overflow)
if α-7 confirmed inert.

IMP-1: timestamp_counter should live in object_context not driver_data
to avoid multi-context collisions.

Decision: implement α-7 anyway as empirical confirmation (5 min) since
test cost is trivial. If α-7 fails as predicted, iter9 closes PARTIAL
with wire-byte search exhausted; iter10 candidates pivot to slice-data
encoding or kernel investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:33:54 +00:00
marfrit 4832ffc401 iter9 Phase 4: α-7 implementation contract — monotonic per-context counter
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:40 +00:00
marfrit fa771b0625 iter9 Phase 0: lock α-7 timestamp scheme — only remaining wire diff
Phase 0 deep-strace yielded a critical narrowing:
- Post-DPB DECODE_PARAMS bytes (512-559): IDENTICAL libva vs kdirect
- PPS: IDENTICAL
- SPS: identical except inert constraint_set_flags
- DPB[0] beyond reference_ts: IDENTICAL after α-2

The ONLY remaining wire-byte diff between libva (broken) and kdirect
(working) is reference_ts magnitude. libva uses gettimeofday giving
~1.78e18 ns; kdirect uses an internal counter giving ~10000 ns.

α-7 hypothesis: V4L2 stateless decoder (rkvdec) reference-resolution
fails for very large reference_ts values. Possible mechanisms:
M-A: vb2_find_buffer_by_timestamp truncates/overflows on giant values.
M-B: V4L2 framework transforms OUTPUT QBUF ts before storing on CAPTURE
     but DPB.reference_ts left untransformed → mismatch.
M-C: gettimeofday + v4l2_timeval_to_ns produce slightly different ns
     values than the kernel computes from the timeval QBUF.

Fix: ~10 LOC. Add timestamp_counter to driver_data; replace
gettimeofday in EndPicture with monotonic counter.
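
A minimal sketch of the two timestamp schemes compared above, assuming a hypothetical per-context struct (`ts_context`) and helper (`next_timestamp`) that are not taken from the backend. `timeval_to_ns` repeats the arithmetic of the kernel's `v4l2_timeval_to_ns()` locally so the example builds without V4L2 headers. The counter step of one microsecond is an arbitrary choice for illustration.

```c
#include <stdint.h>
#include <sys/time.h>

/* Same arithmetic as v4l2_timeval_to_ns(): seconds and microseconds to ns. */
static uint64_t timeval_to_ns(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/* Hypothetical per-context state; the real field placement may differ. */
struct ts_context {
    uint64_t timestamp_counter; /* counts microsecond ticks, starts at 0 */
};

/*
 * Return a small, strictly increasing timeval instead of wall-clock time.
 * Each call advances the counter by one microsecond, so the resulting
 * reference_ts values stay tiny compared with gettimeofday()-based ones.
 */
static struct timeval next_timestamp(struct ts_context *ctx)
{
    struct timeval tv;

    ctx->timestamp_counter++;
    tv.tv_sec = (time_t)(ctx->timestamp_counter / 1000000ULL);
    tv.tv_usec = (suseconds_t)(ctx->timestamp_counter % 1000000ULL);
    return tv;
}
```

The same `timeval` would be used both for the OUTPUT QBUF and for deriving `reference_ts`, so the two values cannot drift apart regardless of magnitude.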

If α-7 works → iter9 PASS, Bug 4 closed.
If α-7 doesn't → iter9 PARTIAL, wire-byte search space effectively
exhausted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:01 +00:00
marfrit 3ed1e454fb iter8 Phase 7c + 8: close iter8 PARTIAL — Bug 4 narrowed via 5 eliminations
α-2 (POC strip removal) changed wire bytes (POC now matches kdirect's
sentinel-encoded 0x10000) but H.264 output unchanged. POC not load-bearing.

5-codec regression sweep on α-2 backend: all 4 non-H.264 anchors hold.
Zero regression.

Iter8 close: 5/6 PASS, criterion-1 PARTIAL. Bug 4 narrowed but not fixed.

Eliminations achieved:
  1. libva-readback bug (γ dump)
  2. Slot-binding wrong (γ dump shows correct slot per surface)
  3. Stale residue (IMP-1 memset confirmed deterministic kernel write)
  4. constraint_set_flags (Phase 5b CRIT-1: rkvdec source review)
  5. POC sentinel strip (α-2 wire change, no output change)

Remaining candidates for iter9: PPS diff (α-3), DECODE_PARAMS post-DPB
fields (α-6), DPB entry order (α-4), slice data encoding (α-5).

Fork tip 0226684 carries γ + IMP-1 diagnostic + α-2 hygiene. All
env-gated off by default; α-2 is a wire-payload cleanup with zero
behavior effect.

Lessons distilled:
- Reviews are never skippable — Phase 5b CRIT-1 saved a build cycle.
- Wire-byte equivalence ≠ behavior equivalence.
- Per-driver kludges in shared codec code need explicit gating.
- Bug carryover labels can mislead (Bug 4 != "inter race-loss").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:01:36 +00:00
marfrit 16034152a8 iter8 Phase 4c: α-2 plan — remove POC sentinel strip for rkvdec
Phase 3 strace re-decoded with correct struct layout:
- libva sends dpb[0] tfoc=0, bfoc=0 (sentinel stripped)
- kdirect sends dpb[0] tfoc=65536, bfoc=65536 (FFmpeg sentinel preserved)
- flags match between both (0x03 VALID|ACTIVE)

rkvdec config_registers() writes top/bottom_field_order_cnt directly to
MMIO. The strip was added in h264.c:219 for hantro's prepare_table; for
rkvdec, kdirect's path (no strip) decodes correctly while libva's
(strip) produces 16x32 partial decode.

Option A: remove the strip entirely (~5 LOC).
Option B: per-driver gating (~20 LOC).

Hantro+H.264 not exercised on RK3399 — Option A is safe. Phase 5c
review then Phase 6c implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:48:52 +00:00
marfrit 64b370d863 iter8 Phase 5b review: CRIT-1 kills α-1 (rkvdec ignores constraint_set_flags)
Sonnet-architect Phase 5b read rkvdec-h264.c end-to-end and confirmed:
constraint_set_flags is NEVER accessed by the driver. assemble_hw_pps()
reads only chroma_format_idc, bit_depth_*, log2_max_frame_num_minus4,
max_num_ref_frames, pic_order_cnt_type, log2_max_pic_order_cnt_lsb_minus4,
and dimension fields. rkvdec_h264_validate_sps() doesn't validate it.

CONSTRAINT_SET3_FLAG and PROFILE_IDC in the hardware PPS packet are
hardcoded constants (1 and 0xFF respectively), not propagated from the
incoming SPS.

α-1 will not unblock Bug 4. Plan-killer.

CRIT-2: ConstrainedBaseline 0x42 mapping is wrong (bit 6 reserved);
correct value 0x12 (bit 1 | bit 4) per H.264 §A.2.1.1.

IMP-1 redirects: DPB entry flags + POC fields are the next candidate.
rkvdec config_registers() reads dpb[i].flags ACTIVE/FIELD bits and
dpb[i].fields TOP/BOT bits. lookup_ref_buf_idx() substitutes destination
buffer as reference when ACTIVE missing — silent corruption matching
observed symptom.

IMP-2/3: full PPS byte comparison + close-criteria framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:33:40 +00:00
marfrit 678c072d75 iter8 Phase 4b: α-1 plan — per-profile SPS constraint_set_flags
Single-byte fix candidate. Add h264_constraint_set_flags(VAProfile)
helper to h264.c, mirror pattern of h264_profile_to_idc + level_idc
derivation. VAAPI doesn't forward this field; libva backend must derive
per profile.

Mapping per H.264 typical-stream conventions:
  Main → 0x02 (constraint_set1_flag, matches BBB + kdirect)
  ConstrainedBaseline → 0x42
  High / MultiviewHigh / StereoHigh → 0x00

LOC ~15 in h264.c only. Per-VAProfile-gated; no risk to VP9/VP8/HEVC/
MPEG-2. Phase 5b architect review required before Phase 6b implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:25:23 +00:00
marfrit 84c939692f iter8 Phase 7 (γ + IMP-1): root cause confirmed kernel-side
γ dump confirms libva reads buffer correctly; the 16x32 patch and
stride-4 UV markers appear at YUV output exactly as in the dump.

IMP-1 memset-before-QBUF test: pre-zeroing buffer does NOT change output
(identical hash). The 512 bytes ARE deterministic kernel writes, not
stale residue.

Bug root cause: rkvdec accepts libva's H.264 decode request without
error flags but writes only 16x32 of luma-neutral data + stride-4 UV
scratch. Kernel decoded a tiny bit then stopped.

Phase 3 SPS diff: libva SPS.constraint_set_flags=0x00 vs kdirect's
0x02 — likely the kernel hint that triggers rkvdec's full decode path
for Main profile. Phase 4b α-1 fix: derive constraint_set_flags per
VAProfile in h264_set_controls. ~10 LOC. Phase 5b review required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 12:23:55 +00:00
marfrit d4c04b4a3b iter8 Phase 5: sonnet-architect review — 2 CRIT + 4 IMP + 3 MIN
CRIT-1: request_log prepends prefix on every call; per-byte loop in γ
sketch would emit 32 prefix-only lines. Fix: snprintf buffered emit.
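
A minimal sketch of the buffered-emit fix, assuming a hypothetical helper name (`format_hex_line`). It formats all bytes into one caller-supplied buffer with `snprintf`, so the logger is invoked once per line rather than once per byte. The call into `request_log` itself is left out because its exact signature is not shown here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format `len` bytes as space-separated hex into `out`.
 * Stops early if the buffer cannot hold another byte, so the result is
 * always NUL-terminated. Returns the number of characters written,
 * excluding the terminator.
 */
static size_t format_hex_line(const unsigned char *data, size_t len,
                              char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        int n;

        /* Worst case per byte: separator + two hex digits + NUL. */
        if (out_size - used < 4)
            break;
        n = snprintf(out + used, out_size - used,
                     i ? " %02x" : "%02x", data[i]);
        if (n < 0)
            break;
        used += (size_t)n;
    }
    return used;
}
```

For a 32-byte dump, a 96-byte buffer is enough (32 × 3 characters including the terminator), and the finished string can be passed to the logger in a single call.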

CRIT-2: γ dump block missing null guard on destination_data[]; the
plan's env-var check is outside the current_slot != NULL guard. Fix:
nest the dump inside the existing slot-null guard.

IMP-1: "stale residue from prior decode" not eliminated as alternative
explanation for the 16x32 patch. Add memset-zero-before-QBUF experiment
to Phase 7 to discriminate.

IMP-2: γ-first defensible but on IMP-1 grounds, not the
three-signature argument (which is weaker than stated).

IMP-3/4 placement clarifications. MIN-1/2/3 cosmetic.

5 mechanical amendments locked for Phase 6. γ-first strategy stands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:55:51 +00:00
marfrit 3a6307638d iter8 Phase 4: γ-then-α plan — diagnostic dump first, fix after
Phase 3 redefined Bug 4 to partial-fill (not inter race). Three distinct
per-codec signatures (VP9 correct, HEVC zero, H.264 partial-leak) can't
be explained by a single hypothesis. Phase 4 commits to γ first: a
~30 LOC env-gated diagnostic dump in RequestSyncSurface that fires
after CAPTURE DQBUF, prints first/last 32 bytes of each destination_data
plane and a non-zero-count of the first 1024 bytes.
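
The per-plane statistics described above can be sketched as follows. This is an illustration only: the helper names (`count_nonzero`, `tail_offset`) are hypothetical and the real dump in RequestSyncSurface may be structured differently.

```c
#include <stddef.h>

/*
 * Count non-zero bytes in the first `window` bytes of a plane
 * (or in the whole plane if it is shorter than the window).
 */
static size_t count_nonzero(const unsigned char *plane, size_t plane_size,
                            size_t window)
{
    size_t limit = plane_size < window ? plane_size : window;
    size_t count = 0;

    for (size_t i = 0; i < limit; i++) {
        if (plane[i] != 0)
            count++;
    }
    return count;
}

/*
 * Offset at which the last `n` bytes of a plane start; 0 if the plane
 * is shorter than `n`, so the "last bytes" dump never reads out of range.
 */
static size_t tail_offset(size_t plane_size, size_t n)
{
    return plane_size > n ? plane_size - n : 0;
}
```

A count of zero over the first 1024 bytes points at "kernel didn't write", while a non-zero count that matches the downstream YUV output points away from a readback bug.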

γ definitively distinguishes "kernel didn't write" from "libva mis-reads"
from "slot binding wrong". Phase 4b targeted fix follows γ's outcome.

Out of scope: per-codec H.264 control-fill changes (gated on γ's
findings), VP9/VP8/HEVC/MPEG-2 paths, kernel patches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:51:47 +00:00
marfrit 4320d7860f iter8 Phase 3: empirical Bug 4 redefinition — partial-fill, not inter race
Phase 3 strace + byte-level analysis on fresnel rkvdec. Findings:

1. Bug 4 is NOT inter-race-loss. The IDR keyframe itself fails through
   libva (only 512 bytes of real Y data at top-left 16x32 region).
2. The 16x32 leak is structured real image content (smooth gradients,
   neutral luma ~0x80) — kernel decoded one tile / one MB pair, then
   stopped.
3. VP9 via libva WORKS through the same readback path (100% non-zero,
   real image data). So the bug isn't generic DMA-BUF cache coherency.
4. HEVC fails via libva (all-zero, distinct from H.264 partial-fill).
5. OUTPUT sizeimage = 1MB (SOURCE_SIZE_MAX) is sufficient — BBB IDR is
   only 6321 bytes. Not the bug.
6. Control payload diffs: SPS.constraint_set_flags = 0 vs kdirect's 2
   (probably cosmetic); DECODE_PARAMS.dpb[0].bottom_field_order_cnt = 0
   vs kdirect's 1 (load-bearing for POC).

Refined hypothesis: a specific H.264 control field libva sends causes
rkvdec to abort after partial decode. Phase 4 candidates: α fix POC
fields, β bump OUTPUT sizeimage, γ instrumentation dump, δ relative
timestamps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:48:41 +00:00
marfrit abd97e3eb6 iter8 Phase 2: H.264 backend source-read + refined hypothesis surface
Maps the per-frame decode pipeline (BeginPicture → RenderPicture →
EndPicture → SyncSurface) and walks frame-1 IDR + frame-2 P state
transitions through h264_set_controls and the DPB.

Eliminates 6 of 13 hypotheses from Phase 0 by source-read alone (H-A
DPB stale, H-B POC sentinel for small POCs, H-C SLICE flags in FRAME_BASED,
H-D request_fd lifecycle, H-F pred-weight, H-G scaling matrix re-upload).
Adds 4 new hypotheses (H-J reference_ts derivation, H-K CAPTURE buf count,
H-L slice_data alignment vs h264_start_code, H-M frame_num cross-check).
Live hypotheses for Phase 3: H-E (CAPTURE rotation/reference-resolution),
H-H (start_code prefix), H-L (slice_data alignment), H-K (cap_pool size).

Phase 3 plan: strace-diff libva-vaapi-H.264 vs kdirect-H.264 on the same
fixture; byte-level frame-1/2/3 examination; dmesg check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:19:49 +00:00
marfrit e47a7ba309 iter8 Phase 0: lock Bug 4 — H.264 inter-frame race-loss
User pick at iter8 open. Carried unchanged through 5 iters (iter4..iter7);
keyframe partially decodes (frame-1 first 16 bytes = real chroma) while
inter frames return all-zero. Pass criterion: libva_h264 == kdirect_h264
== sw_h264 byte-identical for bbb_1080p30_h264.mp4 3-frame, including
inter frames.

In scope: src/h264.c, src/h264_slice_header.c, src/picture.c H.264 paths,
per-frame request_fd lifecycle. Out of scope: VP9/VP8/HEVC/MPEG-2, kernel
patches, performance, all other backlog items.

Substrate at iter8 open: fork tip 6df2159 (iter7), backend SHA 520507f6..,
kernel linux-fresnel-fourier 7.0-1, auto-detect picks rkvdec on every boot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:15:45 +00:00
marfrit b0ebe67673 iter7 PASS close: auto-detect picks rkvdec reliably; iter4-B1a closed
Phase 7 verification 5/5 PASS:
- C1 auto-detect picks decoder (verified: auto-selected /dev/video1 +
  /dev/media0 on rkvdec, NOT encoder)
- C2 prefer rkvdec (pass-1 short-circuit confirmed)
- C3 zero regression: all 5 codec hashes (H.264 71ac099b..., HEVC
  06b2c5a0..., VP9 4f1565e8..., MPEG-2 19eefbf4..., VP8 bcc57ed5...)
  identical to iter5b-β/iter6 anchors
- C4 multi-boot stability: SOFT PASS (architectural — algorithm is
  deterministic given kernel topology; physical reboot not session-
  blocking)
- C5 vainfo lists 7 rkvdec profiles (H.264 variants + HEVC + VP9)

Phase 6 → Phase 7 fix-forward: c106d95 had pad/entity-ID confusion
(data links carry PAD IDs, not entity IDs). Empirical topology dump
on fresnel /dev/media0 revealed it; fix-forward 6df2159 allocates
topo.pads[] and resolves data-link endpoints via pads[].entity_id.
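
A minimal sketch of the pad-to-entity resolution described above. The structs are simplified local stand-ins that keep only the ID fields of `struct media_v2_pad` and `struct media_v2_link`; the names `topo_pad`, `topo_link`, `pad_to_entity` and `link_entities` are hypothetical and not taken from src/request.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins holding only the ID fields used for resolution. */
struct topo_pad {
    uint32_t id;        /* pad ID, as referenced by data links */
    uint32_t entity_id; /* entity that owns this pad */
};

struct topo_link {
    uint32_t source_id; /* a PAD ID for data links, not an entity ID */
    uint32_t sink_id;   /* likewise a PAD ID */
};

/* Look up the entity that owns `pad_id`. Returns 0 on success, -1 if absent. */
static int pad_to_entity(const struct topo_pad *pads, size_t num_pads,
                         uint32_t pad_id, uint32_t *entity_id)
{
    for (size_t i = 0; i < num_pads; i++) {
        if (pads[i].id == pad_id) {
            *entity_id = pads[i].entity_id;
            return 0;
        }
    }
    return -1;
}

/* Resolve both endpoints of a data link to their owning entities. */
static int link_entities(const struct topo_pad *pads, size_t num_pads,
                         const struct topo_link *link,
                         uint32_t *source_entity, uint32_t *sink_entity)
{
    if (pad_to_entity(pads, num_pads, link->source_id, source_entity) < 0)
        return -1;
    return pad_to_entity(pads, num_pads, link->sink_id, sink_entity);
}
```

Comparing a link's `source_id` or `sink_id` directly against entity IDs silently matches nothing (or the wrong object), which is why an empirical topology dump is a useful check for this class of bug.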

Phase 5 reviewer caught 2 CRIT + 4 IMP + 3 MIN — all incorporated.
Phase 5 missed the pad/entity ID encoding distinction; future
media-topology code reviews should ask for empirical dumps.

Net iter7 contribution: quality-of-life. Auto-detect now reliable
across boot orderings for rkvdec codecs (H.264/HEVC/VP9). MPEG-2/VP8
still need LIBVA_V4L2_REQUEST_VIDEO_PATH env override (iter4-B1b
backlog — multi-decoder routing deferred to future iter).

Fork tip 6df2159. Backend SHA 520507f6...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:10:23 +00:00
marfrit 5bf6acb964 iter7 Phase 6: 1 commit landed on fork — auto-detect refactor pending fresnel build
Fork tip c106d95 (was 70196f8). 165 LOC added / 57 removed in
src/request.c. All 9 Phase 5 amendments (2 CRIT + 4 IMP + 3 MIN)
incorporated.

Fresnel offline at push time. Build + install + Phase 7 verify
deferred until host returns. Phase 7 sweep ready to execute:
vainfo + ffmpeg-vaapi + reboot stability + iter5b/iter6 regression
check.

Code review verified algorithm correctness against the Phase 5 reviewer
pseudocode. boltzmann's linux-rockchip source confirms that
MEDIA_ENT_F_PROC_VIDEO_DECODER is set at rkvdec.c:1382 and on the
hantro_drv.c proc entities. Compile-time syntax untested
(no va-api dev headers on noether).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 09:41:12 +00:00