Files
fresnel-fourier/phase8_iteration14_close.md
marfrit 18f24cd26d iter14 Phase 8 close: α-16 finds libva HEVC OUTPUT bytes BYTE-IDENTICAL to input
α-16 OUTPUT byte dump: libva HEVC frame 1 = 96893 bytes = 1 ANNEX-B
start code + 96890 byte IDR NAL with header 0x28 (nal_unit_type 20 =
IDR_N_LP, correct). Byte-compared against input file's raw HEVC
ANNEX-B stream (after VPS+SPS+PPS): 0 bytes differ over 96890 byte
overlap. The 1-byte tail diff is an inter-NAL boundary marker, not
slice payload.

Libva submits BYTE-IDENTICAL slice bytes as what the input contains
and what kdirect submits. Combined with iter11's wire-byte audit
showing every libva-vs-kdirect control diff is in a field rkvdec
ignores, AND iter12's RFC v2 substrate upgrade producing zero
codec-correctness change, AND iter13's DMA_BUF_IOCTL_SYNC ioctl
working but inert:

Cumulative iter8-iter14: 13 hypotheses eliminated. Libva backend
is empirically byte-correct on its side. Bug 4 + Bug 5 are
KERNEL-SIDE failures specific to how rkvdec processes the libva
ioctl sequence vs the kdirect sequence — NOT a libva backend bug.

iter15+ candidates:
  - Full ioctl-sequence trace diff (libva vs kdirect, find first
    divergence in syscall order/args).
  - kernel-side rkvdec ftrace/eBPF kprobe instrumentation; route
    via kernel-agent.
  - Campaign close-out: VP9+MPEG-2 PASS direct, HEVC+H.264+VP8 narrowed
    to kernel-side with byte-clean libva submission.

Backend SHA fa2098b6... 8 cumulative iter11-iter14 commits all ship
clean (wire-correctness, env-gated diagnostics, zero regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 08:29:10 +00:00

6.1 KiB
Raw Permalink Blame History

Iteration 14 — Phase 8 (close)

Closes 2026-05-14. iter14 = α-16 OUTPUT bitstream byte dump. Definitive empirical narrowing of Bug 4 + Bug 5 to kernel-side. PARTIAL on the campaign's success criteria but represents the largest single jump in understanding since iter5b.

Outcome

Metric Value
Fork tip end 522fb6d (α-16 OUTPUT dump)
LOC delta +43 in src/picture.c
Backend SHA on fresnel fa2098b69fd484ea2e4e9b6208d9e1a996358ae64401b47b5ac8bdb166e3c972
Phase 1 criteria 5/6 PASS — Bug 4/5 hashes unchanged but cause definitively localized

The key result

For HEVC frame 1 (IDR keyframe), 96893-byte OUTPUT dump from libva:

size: 96893
start codes (00 00 01) at: [0]    # exactly ONE start code, at position 0
total: 1
  pos 0: NAL header 0x28, nal_unit_type=20    # IDR_N_LP, correct

Comparison against input file's raw HEVC ANNEX-B IDR NAL:

libva slice (after start code) size: 96890 bytes
input file's slice NAL+data size (up to next start code): 96891 bytes
byte-by-byte diff over min(96890,96891)=96890: 0 bytes differ

0 bytes differ. libva's OUTPUT buffer contains exactly the IDR NAL the kernel should decode. kdirect (ffmpeg-v4l2request) submits the same bytes from the same parser. Both backends submit identical bitstream.

What this empirically rules out

Cumulative iter8iter14 eliminations for Bug 4 + Bug 5:

Iter Hypothesis Status
iter8 P7 γ dump: libva mis-reads Eliminated
iter8 P7 Slot binding wrong Eliminated
iter8 IMP-1 Stale residue (memset test) Eliminated
iter8 Phase 5b SPS constraint_set_flags Eliminated (rkvdec ignores)
iter9 α-2 POC sentinel Eliminated
iter9 α-7 Reference_ts magnitude Eliminated
iter11 α-13 sps_max_num_reorder_pics Eliminated (rkvdec ignores)
iter11 α-14 DECODE_PARAMS IRAP/IDR flags Eliminated (rkvdec ignores)
iter11 num_entry_point_offsets Eliminated (rkvdec ignores)
iter11 Slice qp_delta Eliminated (rkvdec ignores)
iter12 RFC v2 vb2_dma_resv fences Eliminated (orthogonal path)
iter13 α-17 DMA_BUF_IOCTL_SYNC CPU cache Eliminated (ioctls work, output unchanged)
iter14 α-16 OUTPUT bitstream bytes wrong Eliminated (byte-identical to input)

13 hypotheses eliminated. Libva backend produces byte-correct ioctls + controls + bitstream. Bug 4 + Bug 5 are kernel-side, not libva-side.

Where the bug actually is

Given:

  • libva submits byte-identical bitstream as kdirect.
  • libva submits kernel-correct controls (rkvdec reads the same SPS / PPS / DPB / slice fields from both).
  • libva uses the same V4L2 ioctl sequence shape (REQBUFS, S_FMT, EXPBUF, QBUF, MEDIA_REQUEST_IOC_QUEUE, DQBUF).
  • Same kernel (linux-fresnel-fourier 7.0-2).
  • Same hardware (rkvdec on RK3399).

But:

  • libva HEVC → all-zero CAPTURE.
  • kdirect HEVC → correct CAPTURE.

The cause must be:

  • Some subtle ioctl-sequence difference (timing of STREAMON, QBUF ordering, request_fd reuse pattern) that triggers different rkvdec state.
  • Some allocator difference (libva's CAPTURE buffer goes through one vb2 allocator, kdirect's through another, even though both end up V4L2_MEMORY_MMAP).
  • Some kernel-side state-machine bug specific to how libva sequences calls.

These are NOT visible at the wire-byte / payload level. They are visible at the syscall-sequence level. The natural next investigation is a full ioctl trace comparison (not just S_EXT_CTRLS payload):

  • libva strace: every ioctl from open → REQBUFS → S_FMT → EXPBUF → STREAMON → QBUF → MEDIA_REQUEST_IOC_QUEUE → DQBUF → close.
  • kdirect strace: same.
  • Find the FIRST diverging ioctl or its FIRST diverging argument.

Lessons

  1. OUTPUT byte verification is the gold-standard ruling-out check. Two iters (12, 13) thrashed on kernel-substrate / cache hypotheses before this one byte-compared the actual slice data. Doing α-16 EARLIER (iter5 / iter6) would have saved many cycles.
  2. The campaign has been chasing wire-byte fields the kernel ignores. Same anti-pattern as iter8 α-1. The reviewer's "grep rkvdec source for field reference" methodology saves iterations.
  3. VP9 works through the same libva backend — so this isn't a categorical libva failure. It's a kernel codec-specific failure (HEVC + H.264 paths) that libva's particular ioctl sequence triggers and kdirect's doesn't.

Substrate state at iter14 close

  • Fork tip 522fb6d on noether + fresnel + gitea.
  • Backend SHA fa2098b6… on fresnel.
  • Kernel 7.0-2 (RFC v2 included).
  • Cumulative libva improvements that ship clean (zero regression, wire correctness): γ dump (iter8), IMP-1 memset gate (iter8), α-2 POC strip removed (iter9), α-7 timestamp counter (iter9), α-13 SPS hygiene (iter11), α-14 IRAP/IDR flags (iter11), α-17 DMA_BUF_IOCTL_SYNC (iter13), α-16 OUTPUT dump (iter14). 8 commits, all env-gated or wire-correctness.

iter15+ candidates

Given iter14's localization:

  • Full ioctl-sequence trace diff — strace libva vs kdirect, complete syscall sequence, find the first divergence. Likely 1-2 hours.
  • kernel-side rkvdec hot-path trace — instrument rkvdec-hevc.c via ftrace or eBPF kprobe; compare what kernel state evolves between libva-trigger and kdirect-trigger for the same input. Route via kernel-agent.
  • Investigate libva STREAMON timing — libva's STREAMON happens at CreateContext (iter5b-β); kdirect's STREAMON timing may differ.
  • Campaign close-out documentation — VP9 + MPEG-2 PASS direct via libva; HEVC + H.264 + VP8 remain kernel-side bugs, narrowing complete to wire-byte AND OUTPUT-byte byte-identity. Campaign deliverable: a libva backend that's byte-correct on its side; kernel-side gap is upstream.

Memory rule candidate (defer)

Strong empirical evidence that libva backend ioctl/control/OUTPUT byte production for HEVC + H.264 on RK3399 is byte-correct relative to the working reference (ffmpeg-v4l2request). Bug 4 + Bug 5 are KERNEL-SIDE failures in how rkvdec processes specific ioctl sequences. Future iters that target libva-side fixes for these bugs are unlikely to succeed without kernel cooperation.