Validates marfrit-packages patch 0016 (PR #106) end-to-end against
the daedalus_decode_h264 CLI. The callback fires once per macroblock
in coded order; this PR checks the count and uniqueness invariants
WITHOUT yet driving daedalus-decoder differently (that lands in PR-A3).
Infrastructure landed
---------------------
CMake gains DAEDALUS_FFMPEG_PREFIX option pointing at a private
FFmpeg install carrying patch 0016. When set, the CLI links
against it (static .a's from $prefix/lib) and the inspection
codepath is compiled in (DAEDALUS_HAVE_H264_MB_INSPECT_CB). When
unset, the CLI falls back to the pkg-config-discovered system
FFmpeg and behaves as PR-A1b did (identity-passthrough only, no
callback).
The H264Context struct stays opaque (forward declaration only; its
real definition lives in libavcodec's internal h264dec.h, which
isn't installed). Real per-MB state extraction (sl->mb coeffs,
mb_type, intra modes, deblock params) will land in PR-A3
alongside an internal-header include path.
The callback's only job in this PR: assert (mb_x, mb_y) lies in
the coded grid, mark "seen" in a per-frame bitmap, count
invocations. At end-of-frame: assert seen-count == mb_w*mb_h,
0 duplicates, 0 out-of-bounds.
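
The bookkeeping described above can be sketched roughly as follows. This
is a minimal illustration, not the CLI's actual code: the mb_tracker
type and its function names are hypothetical, and the sketch assumes
callbacks are serialized (it is not thread-safe).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tracker; field and function names are illustrative only. */
typedef struct {
    int      mb_w, mb_h;   /* coded grid size in macroblocks */
    uint8_t *seen;         /* one byte per MB, 0 = not seen this frame */
    long     invocations;  /* running total of callback calls */
    long     duplicates;   /* in-grid MBs reported more than once */
    long     oob;          /* calls with (mb_x, mb_y) outside the grid */
} mb_tracker;

/* Allocate the bitmap; must run before the first packet is sent. */
static bool mb_tracker_init(mb_tracker *t, int mb_w, int mb_h) {
    memset(t, 0, sizeof(*t));
    t->mb_w = mb_w;
    t->mb_h = mb_h;
    t->seen = calloc((size_t)mb_w * (size_t)mb_h, 1);
    return t->seen != NULL;
}

/* What the per-MB callback would do with its coordinates. */
static void mb_tracker_mark(mb_tracker *t, int mb_x, int mb_y) {
    t->invocations++;
    if (mb_x < 0 || mb_x >= t->mb_w || mb_y < 0 || mb_y >= t->mb_h) {
        t->oob++;
        return;
    }
    uint8_t *cell = &t->seen[(size_t)mb_y * (size_t)t->mb_w + (size_t)mb_x];
    if (*cell)
        t->duplicates++;
    else
        *cell = 1;
}

/* End of frame: return the number of MBs never seen, then reset the bitmap. */
static long mb_tracker_end_frame(mb_tracker *t) {
    size_t n = (size_t)t->mb_w * (size_t)t->mb_h;
    long missing = 0;
    for (size_t i = 0; i < n; i++)
        if (!t->seen[i])
            missing++;
    memset(t->seen, 0, n);
    return missing;
}
```

A frame passes when mb_tracker_end_frame returns 0 and the duplicates
and oob counters are still 0.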
Per-frame mb-grid init goes BEFORE the first avcodec_send_packet call
(callbacks fire from inside send_packet, before the first
receive_frame ever returns, so lazy init from AVFrame would miss
all of frame 0). Dims come from codecpar->width/height rounded
up to the next multiple of 16 (H.264 codes a 1080-line display
height as 1088 coded lines).
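
The rounding itself is plain integer arithmetic. A minimal sketch, with
hypothetical helper names, assuming progressive content (interlaced
streams have additional height constraints that this ignores):

```c
#include <assert.h>

/* Number of 16x16 macroblocks needed to cover `pixels` samples. */
static int mb_count(int pixels) {
    assert(pixels > 0);
    return (pixels + 15) / 16;
}

/* Coded dimension: the display dimension rounded up to a multiple of 16. */
static int coded_dim(int pixels) {
    return mb_count(pixels) * 16;
}
```

For a 1920x1080 display size this gives a 120x68 grid and a coded
height of 1088, matching the numbers reported under the results heading.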
Raster-order check considered but dropped: libavcodec's threaded
decode paths (e.g. slice threading) can fire callbacks out of
raster order in some configs. The contract is "each MB exactly
once", not "in raster order"; the bitmap check captures that.
Result on hertz (Pi 5, patched FFmpeg at /tmp/ffmpeg-inspect-prefix)
-------------------------------------------------------------------
320x240 I-only, 3 frames:
  mb-grid 20x15
  callback invocations: 900 (= 3 * 300)
  missing/duplicates/oob: 0/0/0
  identity-passthrough Y diff 0/230400, UV diff 0/115200
  PASS

1920x1088 I-only, 3 frames:
  mb-grid 120x68
  callback invocations: 24480 (= 3 * 8160)
  missing/duplicates/oob: 0/0/0
  identity-passthrough Y diff 0/6266880, UV diff 0/3133440
  PASS
Followups
---------
- PR-A3: include libavcodec/h264dec.h via -I to access H264Context
  internals; extract sl->mb coefficients in the callback, compute
  P = pre-deblock pixels - IDCT(C) using a transcribed C reference;
  feed daedalus_decoder with REAL (P, C, edges) instead of identity.
  Use avctx->skip_loop_filter = AVDISCARD_ALL to make libavcodec
  output pre-deblock so the subtraction is exact.
- PR-A4 onwards: extend to P/B frames + chroma DC + intra prediction
  coverage.
Option A's standalone end-to-end gate against real H.264 streams.
First iteration: identity-passthrough validation. daedalus-decoder
produces output byte-identical to libavcodec's AVFrame when fed the
reconstructed pixels as `predicted`, zero coeffs, and no deblock edges.
Validates: daedalus-decoder data path (append_mb + flush_frame +
NV12 output + coded-vs-display dim handling) at real-stream frame
sizes (320x240 and 1920x1088) with real H.264-decoded
predicted-sample distributions, rather than the random patterns that
the existing test_idct_bitexact and test_deblock_smoke tests synthesize.
Identity-passthrough math:
  - mb_input.predicted = AVFrame pixels at MB raster position
  - mb_input.coeffs = 384 int16's, all zero
  - mb_input.edges = NULL, n_edges = 0
  flush_frame:
    scratch_y/_uv pre-fill from predicted (= AVFrame pixels)
    IDCT dispatches with all-zero coeffs add 0 (no-op compute)
    No deblock dispatches (no edges)
    copy-out → caller's NV12 planes
  Result MUST equal AVFrame pixels byte-for-byte.
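
As an illustration of how one identity-passthrough input could be
assembled, here is a rough sketch. The mb_input_sketch struct and
fill_identity_mb are hypothetical stand-ins (the real mb_input layout is
not shown in this message), and the sketch assumes planar YUV 4:2:0
source planes that cover the full coded grid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the decoder's per-MB input. */
typedef struct {
    uint8_t     pred_y[16 * 16];  /* predicted luma, row-major */
    uint8_t     pred_cb[8 * 8];   /* predicted Cb */
    uint8_t     pred_cr[8 * 8];   /* predicted Cr */
    int16_t     coeffs[384];      /* 256 luma + 64 Cb + 64 Cr */
    const void *edges;            /* deblock edges, unused here */
    int         n_edges;
} mb_input_sketch;

/* Copy a w x h block out of a strided plane into a packed buffer. */
static void copy_block(uint8_t *dst, const uint8_t *src,
                       int stride, int w, int h) {
    for (int r = 0; r < h; r++)
        memcpy(dst + (size_t)r * (size_t)w,
               src + (size_t)r * (size_t)stride, (size_t)w);
}

/* Predicted = decoded pixels, coeffs = 0, no edges. */
static void fill_identity_mb(mb_input_sketch *in,
                             const uint8_t *y,  int y_stride,
                             const uint8_t *cb, int cb_stride,
                             const uint8_t *cr, int cr_stride,
                             int mb_x, int mb_y) {
    size_t y_off  = (size_t)mb_y * 16 * (size_t)y_stride  + (size_t)mb_x * 16;
    size_t cb_off = (size_t)mb_y * 8  * (size_t)cb_stride + (size_t)mb_x * 8;
    size_t cr_off = (size_t)mb_y * 8  * (size_t)cr_stride + (size_t)mb_x * 8;

    copy_block(in->pred_y,  y  + y_off,  y_stride,  16, 16);
    copy_block(in->pred_cb, cb + cb_off, cb_stride, 8, 8);
    copy_block(in->pred_cr, cr + cr_off, cr_stride, 8, 8);

    memset(in->coeffs, 0, sizeof in->coeffs);  /* zero residual */
    in->edges   = NULL;                        /* no deblocking */
    in->n_edges = 0;
}
```

With zero coefficients the inverse transform contributes nothing, so
whatever goes in as predicted should come back out unchanged.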
Build
-----
New CMake option DAEDALUS_BUILD_TOOLS (default OFF). When enabled,
the build locates libavcodec / libavformat / libavutil via pkg-config
and builds the daedalus_decode_h264 binary against the system FFmpeg.
Stock libavcodec is sufficient for THIS PR (identity passthrough
reads from AVFrame after avcodec_receive_frame; no per-MB internal
state extraction needed). Follow-up PRs (A2+) will use the per-MB
inspection callback added in marfrit-packages patch 0016 (PR #106)
to feed REAL per-MB state (pre-residual predicted samples, residual
coeffs, deblock edges) for actual non-trivial daedalus-decoder
validation.
Usage
-----
daedalus_decode_h264 [--substrate cpu|qpu|auto]
                     [--max-frames N]
                     <input.h264> <output_dadec.yuv> <output_ref.yuv>

Exit codes:
  0 = byte-exact match across all frames
  1 = argument / setup error
  2 = decode error from libavcodec
  3 = daedalus-decoder error (ctx, append, flush)
  4 = byte-exact comparison failed
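
For callers that script around the tool, the exit codes could be
mirrored in an enum like the one below. The identifier names are
hypothetical; only the numeric values are taken from the list above.

```c
/* Hypothetical names; the values follow the documented exit codes. */
enum dadec_exit_code {
    DADEC_EXIT_OK       = 0,  /* all frames matched */
    DADEC_EXIT_SETUP    = 1,  /* argument / setup error */
    DADEC_EXIT_AVCODEC  = 2,  /* libavcodec decode error */
    DADEC_EXIT_DECODER  = 3,  /* daedalus-decoder error */
    DADEC_EXIT_MISMATCH = 4   /* output comparison failed */
};

/* Short human-readable label for a given exit code. */
static const char *dadec_exit_str(int code) {
    switch (code) {
    case DADEC_EXIT_OK:       return "match";
    case DADEC_EXIT_SETUP:    return "setup error";
    case DADEC_EXIT_AVCODEC:  return "libavcodec error";
    case DADEC_EXIT_DECODER:  return "daedalus-decoder error";
    case DADEC_EXIT_MISMATCH: return "mismatch";
    default:                  return "unknown";
    }
}
```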
Result on hertz (Pi 5 V3D 7.1)
------------------------------
I-only test clip via ffmpeg testsrc2 + libx264 -bf 0 -g 1:
  320x240, 5 frames:
    substrate=auto: Y diff 0/76800 UV diff 0/38400 PASS
    substrate=cpu: Y diff 0/76800 UV diff 0/38400 PASS
    substrate=qpu: Y diff 0/76800 UV diff 0/38400 PASS
  1920x1088 (coded; 1080 display), 3 frames:
    substrate=auto: Y diff 0/2088960 UV diff 0/1044480 PASS
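
The "diff N/M" figures read as differing bytes over total bytes per
plane. A minimal sketch of such a comparison, assuming NV12 output
(full-size Y plane plus one interleaved UV plane of half that size);
the function names are illustrative, not the tool's own:

```c
#include <stddef.h>
#include <stdint.h>

/* Count the byte positions at which two equally sized buffers differ. */
static size_t count_diff(const uint8_t *a, const uint8_t *b, size_t n) {
    size_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff += (a[i] != b[i]);
    return diff;
}

/* NV12 plane sizes in bytes for a coded frame of w x h pixels. */
static size_t nv12_y_bytes(size_t w, size_t h)  { return w * h; }
static size_t nv12_uv_bytes(size_t w, size_t h) { return w * h / 2; }
```

For 320x240 this gives 76800 Y bytes and 38400 UV bytes, the
denominators shown in the results above.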
Followups
---------
- PR-A2: wire the per-MB inspection callback (marfrit-packages
  0016) so that per-MB state (coeffs from sl->mb,
  predicted-before-residual samples from the prediction kernels,
  bS/alpha/beta) flows into mb_input instead of zeros, and the
  IDCT / deblock dispatches do real GPU work. At that point real
  H.264 streams are decoded through daedalus-decoder itself.
- PR-A3: extend to P/B frames once MC dispatch lands.