PR-A6: enable libavcodec deblock + drive daedalus deblock on real streams #16
PARTIAL PASS — full I-frame pipeline (IDCT + deblock) now runs on real H.264 streams via daedalus-decoder's frame-major dispatch. Residual divergence vs libavcodec reference: 0.09%–0.86% Y / 0.35%–2.0% UV depending on substrate + resolution. Kernel-level / dispatch-order off-by-one issues remain — same family as task #179.
Architecture (verified vs `dejavu` memory before coding)
Different shape from the banned per-kernel substitution arc. Re-checked the `dejavu` memory and `frame_major_uma_verdict` memory before any tool call (per user's explicit instruction).
What changed in the CLI
- Removed `avctx->skip_loop_filter = AVDISCARD_ALL`: libavcodec's deblock now runs, so the AVFrame is post-deblock.
- Per-MB: `qp_y`, `mb_type_intra`, `transform_8x8`. Slice-level: `slice_alpha_c0_offset`, `slice_beta_offset`, `slice_deblocking_filter`.
- `alpha_table[156]`, `beta_table[156]`, `tc0_table[156][4]` from FFmpeg's `h264_loopfilter.c`.
- `chroma_qp_table[52]`.
- `daedalus_decoder_mb_input.edges` from spec rules. 16 edges/MB (4 V-luma + 4 H-luma + 2 V-Cb + 2 V-Cr + 2 H-Cb + 2 H-Cr). bS=4 at MB boundary, bS=3 internal, bS=0 at frame boundary. 8×8 DCT MBs skip the internal edges at cols/rows 4 and 12 (only the 8×8-block boundary fires).
- `flush_frame` runs IDCT-add for real-coeffs MBs + identity passthrough for skipped MBs, then dispatches the 4 deblock kernels (luma V/H + chroma V/H, plus their bS=4 intra variants) across the frame.

Subtle bug hunted: `sl->deblocking_filter` convention inversion
FFmpeg's `h264_slice.c` line 1901: `sl->deblocking_filter ^= 1`. This inverts the spec's `disable_deblocking_filter_idc` semantics. Internal values:
- `0` = DISABLED (spec=1)
- `1` = ENABLED (spec=0)
- `2` = enabled-but-not-across-slice-boundaries (unchanged)

First implementation treated `== 1` as "disabled" per spec semantics → silently skipped all edge emission → diff count identical to no-edges baseline. Inverting to `deblock_off = (sl->deblocking_filter == 0)` dropped diffs 5346 → 438 Y per frame (92% reduction).
Results on hertz (Pi 5 V3D 7.1)
testsrc2 I-only via `libx264 -bf 0 -g 1`:
Residual divergence: root cause analysis
`ff_h264_*_loop_filter_*_neon` (same kernel libavcodec uses). Same kernel + same `alpha`/`beta`/`tc0`/`bS` → output SHOULD be identical. But still 0.52% Y diff.
These are kernel-level / dispatch-order issues, not CLI bugs. Task #179 extended in scope (now includes luma + cross-MB edge ordering on real-stream layouts); root cause investigation belongs in daedalus-fourier.
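For anyone re-checking the CLI-side inputs while the kernel investigation proceeds, the two rules stated in this description can be sketched in a few lines of C: the `sl->deblocking_filter` convention, and the bS assignment for luma edges of an intra MB. This is a minimal sketch, not the CLI's code. The helper names (`spec_idc_to_internal`, `is_deblock_off`, `intra_luma_bs`) are hypothetical, chroma edges are omitted, and mode 2 (no filtering across slice boundaries) is treated the same as mode 1.

```c
#include <stdbool.h>

/* Hypothetical helper: map the spec's disable_deblocking_filter_idc to the
 * FFmpeg-internal convention described above (0 <-> 1 swapped, 2 unchanged). */
static int spec_idc_to_internal(int idc)
{
    return idc < 2 ? (idc ^ 1) : idc;
}

/* Hypothetical helper: with the internal convention, 0 means disabled. */
static bool is_deblock_off(int internal_deblocking_filter)
{
    return internal_deblocking_filter == 0;
}

/* Hypothetical helper: bS for one luma edge of an intra MB.
 *   edge          0..3, i.e. sample offset 0, 4, 8, 12 inside the MB
 *   mb_pos        mb_x for vertical edges, mb_y for horizontal edges
 *   transform_8x8 true when the MB uses the 8x8 transform
 * Slice-boundary handling (mode 2) is deliberately left out of this sketch. */
static int intra_luma_bs(int edge, int mb_pos, bool transform_8x8, bool off)
{
    if (off)
        return 0;
    if (edge == 0)
        return mb_pos == 0 ? 0 : 4;   /* frame boundary vs. MB boundary */
    if (transform_8x8 && (edge & 1))
        return 0;                     /* offsets 4 and 12 are skipped */
    return 3;                         /* internal edge */
}
```

With these helpers, a slice carrying spec idc 0 maps to internal 1 and keeps deblocking on, which is the case the first implementation got backwards.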
Honest framing
This is a partial-pass delivery. The infrastructure works (real coefficients + real edges flowing through daedalus-decoder's frame-major dispatch on real H.264 streams) and output is within ~1% of reference. Full byte-exact closure depends on the daedalus-fourier deblock kernel / dispatch-order investigation.
Per the user's "correctness before speed" principle, this is clearly called out as partial — not pretended to be byte-exact.
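To make "within ~1% of reference" and "byte-exact" concrete, here is a minimal sketch of one plausible per-plane metric. It assumes the percentages quoted in this PR are the number of differing 8-bit samples divided by the plane size; the description does not state the exact definition, so treat that as an assumption. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Count samples that differ between two 8-bit planes of equal dimensions.
 * Strides are in bytes and may exceed the visible width. */
static size_t count_plane_diffs(const uint8_t *a, ptrdiff_t a_stride,
                                const uint8_t *b, ptrdiff_t b_stride,
                                int width, int height)
{
    size_t diffs = 0;
    for (int y = 0; y < height; y++) {
        const uint8_t *row_a = a + (ptrdiff_t)y * a_stride;
        const uint8_t *row_b = b + (ptrdiff_t)y * b_stride;
        for (int x = 0; x < width; x++)
            diffs += (row_a[x] != row_b[x]);
    }
    return diffs;
}

/* Express a diff count as a percentage of the plane's sample count. */
static double diff_percent(size_t diffs, int width, int height)
{
    return 100.0 * (double)diffs / ((double)width * (double)height);
}
```

Under this definition, byte-exact closure means `count_plane_diffs` returns 0 for every plane of every frame.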
Followups