Option A's standalone end-to-end gate against real H.264 streams.
First iteration: identity-passthrough validation. daedalus-decoder
produces output byte-exact to libavcodec's AVFrame when fed the
reconstructed pixels as `predicted`, zero coeffs, and no deblock edges.
This validates the daedalus-decoder data path (append_mb + flush_frame +
NV12 output + coded-vs-display dim handling) at real-stream frame
sizes (320x240 and 1920x1088), with real H.264-decoded sample
distributions as `predicted` rather than the random patterns that the
existing test_idct_bitexact and test_deblock_smoke synthesize.
Identity-passthrough math:
- mb_input.predicted = AVFrame pixels at MB raster position
- mb_input.coeffs = 384 int16 values, all zero
- mb_input.edges = NULL, n_edges = 0
flush_frame:
  - scratch_y/_uv pre-fill from predicted (= AVFrame pixels)
  - IDCT dispatches with all-zero coeffs add 0 (no-op compute)
  - No deblock dispatches (no edges)
  - copy-out → caller's NV12 planes
Result MUST equal AVFrame pixels byte-for-byte.
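
The identity argument above can be checked in isolation. The sketch
below is not daedalus-decoder code: `copy_luma_mb` and `add_residual`
are hypothetical helpers, and it assumes 8-bit samples with
reconstruction modelled as clamp(predicted + residual). It copies one
16x16 luma macroblock out of a strided plane at its raster position,
then adds an all-zero residual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MB_SIZE = 16 };

/* Hypothetical helper: copy the 16x16 luma block of macroblock
 * (mb_x, mb_y) out of a strided 8-bit plane into a packed buffer. */
static void copy_luma_mb(uint8_t dst[MB_SIZE * MB_SIZE],
                         const uint8_t *plane, size_t stride,
                         size_t mb_x, size_t mb_y)
{
    const uint8_t *src = plane + mb_y * MB_SIZE * stride + mb_x * MB_SIZE;

    /* The requested macroblock must lie fully inside one row. */
    assert(stride >= (mb_x + 1) * MB_SIZE);
    for (size_t row = 0; row < MB_SIZE; row++)
        memcpy(dst + row * MB_SIZE, src + row * stride, MB_SIZE);
}

static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Assumed reconstruction model: out = clamp(pred + residual).
 * With an all-zero residual this is a plain copy of pred. */
static void add_residual(uint8_t *out, const uint8_t *pred,
                         const int16_t *residual, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = clip_u8(pred[i] + residual[i]);
}
```

Calling `copy_luma_mb` and then `add_residual` with a zeroed residual
buffer should leave `out` identical to the copied block, which is the
property this gate relies on at frame scale.
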
Build
-----
New CMake option DAEDALUS_BUILD_TOOLS (default OFF). When enabled,
the build looks up libavcodec / libavformat / libavutil via pkg-config
and builds the daedalus_decode_h264 binary against the system FFmpeg.
Stock libavcodec is sufficient for THIS PR (identity passthrough
reads from AVFrame after avcodec_receive_frame; no per-MB internal
state extraction needed). Follow-up PRs (A2+) will use the per-MB
inspection callback added in marfrit-packages patch 0016 (PR #106)
to feed REAL per-MB state (pre-residual predicted samples, residual
coeffs, deblock edges) for actual non-trivial daedalus-decoder
validation.
Usage
-----
  daedalus_decode_h264 [--substrate cpu|qpu|auto]
                       [--max-frames N]
                       <input.h264> <output_dadec.yuv> <output_ref.yuv>
Exit codes:
0 = byte-exact match across all frames
1 = argument / setup error
2 = decode error from libavcodec
3 = daedalus-decoder error (ctx, append, flush)
4 = bit-exact comparison failed
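
For scripts or test harnesses that wrap the tool, the exit codes above
could be mirrored as an enum. This is only a sketch: the identifier
names (`dadec_exit`, `DADEC_EXIT_*`, `dadec_exit_str`) are hypothetical
and not taken from the tool's source; only the numeric values and
their meanings come from the list above.

```c
#include <assert.h>

/* Hypothetical names; values follow the exit-code list above. */
enum dadec_exit {
    DADEC_EXIT_OK       = 0,  /* byte-exact match across all frames */
    DADEC_EXIT_USAGE    = 1,  /* argument / setup error */
    DADEC_EXIT_DECODE   = 2,  /* decode error from libavcodec */
    DADEC_EXIT_DAEDALUS = 3,  /* daedalus-decoder error (ctx, append, flush) */
    DADEC_EXIT_MISMATCH = 4,  /* bit-exact comparison failed */
    DADEC_EXIT_COUNT
};

static const char *const dadec_exit_text[DADEC_EXIT_COUNT] = {
    [DADEC_EXIT_OK]       = "byte-exact match across all frames",
    [DADEC_EXIT_USAGE]    = "argument / setup error",
    [DADEC_EXIT_DECODE]   = "decode error from libavcodec",
    [DADEC_EXIT_DAEDALUS] = "daedalus-decoder error (ctx, append, flush)",
    [DADEC_EXIT_MISMATCH] = "bit-exact comparison failed",
};

/* Map an exit code to its description; the code must be a listed value. */
static const char *dadec_exit_str(enum dadec_exit code)
{
    assert((int)code >= 0 && (int)code < DADEC_EXIT_COUNT);
    return dadec_exit_text[code];
}
```

Keeping 2, 3 and 4 distinct lets a CI job tell a broken input stream
apart from a daedalus-decoder failure or a genuine pixel mismatch.
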
Result on hertz (Pi 5 V3D 7.1)
------------------------------
I-only test clip via ffmpeg testsrc2 + libx264 -bf 0 -g 1:
  320x240, 5 frames:
    substrate=auto:  Y diff 0/76800    UV diff 0/38400    PASS
    substrate=cpu:   Y diff 0/76800    UV diff 0/38400    PASS
    substrate=qpu:   Y diff 0/76800    UV diff 0/38400    PASS
  1920x1088 (coded; 1080 display), 3 frames:
    substrate=auto:  Y diff 0/2088960  UV diff 0/1044480  PASS
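
The denominators in these results are per-frame NV12 plane sizes at
the coded dimensions. The sketch below reproduces that arithmetic and
shows one way a per-plane diff count could be computed. The helper
names are hypothetical, and the rounding assumes progressive frame
coding with 16x16 macroblocks; it is not the tool's actual comparison
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a display dimension up to whole 16x16 macroblocks
 * (assumption: progressive frame coding). */
static size_t coded_dim(size_t display)
{
    return (display + 15) & ~(size_t)15;
}

/* NV12: one full-resolution Y plane ... */
static size_t nv12_y_bytes(size_t w, size_t h)
{
    return w * h;
}

/* ... plus one interleaved UV plane subsampled 2x in both
 * directions, i.e. (w/2)*(h/2) U,V pairs = w*h/2 bytes. */
static size_t nv12_uv_bytes(size_t w, size_t h)
{
    assert(w % 2 == 0 && h % 2 == 0);
    return w * h / 2;
}

/* Count differing bytes between two equally sized planes. */
static size_t count_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t diff = 0;

    for (size_t i = 0; i < n; i++)
        diff += (a[i] != b[i]);
    return diff;
}
```

With these helpers, 320x240 gives 76800 Y bytes and 38400 UV bytes,
and 1920 x coded_dim(1080) = 1920x1088 gives 2088960 and 1044480,
matching the totals reported above.
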
Followups
---------
- PR-A2: wire the per-MB inspection callback (marfrit-packages
  0016) so that real per-MB state flows into mb_input instead of
  zeros: coeffs (sl->mb), predicted-before-residual samples (from
  the prediction kernels), and bS/alpha/beta. The IDCT / deblock
  dispatches then do real GPU work, and real H.264 streams are
  actually decoded through daedalus-decoder rather than passed
  through.
- PR-A3: extend to P/B frames once MC dispatch lands.