h264: chroma DC 2x2 Hadamard pre-pass primitive #23
Adds the H.264 §8.5.11.1 chroma DC Hadamard transform. In 4:2:0 chroma, the four DC coefficients go through a 2x2 Hadamard before quant-scaling and before being added back to each chroma 4x4 AC block's [0,0] coefficient.
Pure transform; QP-dependent scaling per §8.5.11.2 is caller-side composition (it varies by slice/PPS context).
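To illustrate what the caller-side composition might look like, here is a minimal sketch of the 4:2:0 chroma DC scaling step. It rests on assumptions that are not part of this PR: a flat scaling list (every weight 16), the single-expression form `((f * LevelScale4x4[qP % 6][0][0]) << (qP / 6)) >> 5` as I read §8.5.11.2, and the hypothetical names `kNormAdjust00` and `chroma_dc_scale_flat_sketch`. Under a flat list this expression should be equivalent to the two-branch (>=6 vs <6) form. The real intercept patch may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

/* normAdjust4x4 values at position (0,0) for qP % 6 = 0..5.
 * With a flat scaling list (assumed here), LevelScale4x4[m][0][0] = 16 * this. */
static const int kNormAdjust00[6] = {10, 11, 13, 14, 16, 18};

/* Hypothetical helper, not a fourier or libavcodec symbol.
 * f    : one output of the 2x2 chroma DC Hadamard
 * qp_c : chroma QP after all offsets have been applied by the caller */
static int chroma_dc_scale_flat_sketch(int f, int qp_c)
{
    assert(qp_c >= 0 && qp_c <= 51);
    const int level_scale = 16 * kNormAdjust00[qp_c % 6];
    /* Multiply by 2^(qp_c / 6) instead of left-shifting, so a negative f
     * never goes through << (undefined behaviour in C). */
    const int64_t scaled = (int64_t)f * level_scale * (1 << (qp_c / 6));
    /* >> on a negative value is implementation-defined in C; this sketch
     * assumes the usual arithmetic shift. */
    return (int)(scaled >> 5);
}
```

With these assumptions, `chroma_dc_scale_flat_sketch(20, 0)` gives 100 and `chroma_dc_scale_flat_sketch(20, 6)` gives 200, i.e. the scale doubles for every 6 QP steps.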
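All 7 tests pass on the first try, including the algebraic invariant test: with H the 4x4 matrix form of the transform acting on the flattened coefficients, H·H = 4·I, so applying the transform twice yields 4*input. The invariant test is a strong gate: any sign error in the butterfly would break it immediately.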
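The invariant follows from a short calculation. The notation below is introduced here for illustration only: H_2 is the 2x2 Hadamard matrix, C is the 2x2 block of DC coefficients, F is the transform output, and H is the equivalent 4x4 matrix acting on the flattened coefficients.

```latex
\[
H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
H_2 H_2 = 2 I_2, \qquad
F = H_2 \, C \, H_2
\]
\[
H_2 \, F \, H_2 = (H_2 H_2) \, C \, (H_2 H_2) = 4C
\]
\[
H = H_2 \otimes H_2
\quad\Longrightarrow\quad
H \cdot H = (H_2 H_2) \otimes (H_2 H_2) = 4 I_4
\]
```

Flipping a single sign in one output row of H changes the product H·H, which is why the double application no longer returns 4*input.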
With this PR the H.264 8-bit 4:2:0 pixel-math primitive matrix is complete in fourier: IDCT 4x4, IDCT 8x8, chroma DC Hadamard 2x2, deblock, intra prediction, and MC qpel.
Remaining work all sits at the libavcodec intercept layer (CABAC/CAVLC entropy decode, SPS/PPS parsing, MB header decode); fourier provides all the pixel-math primitives the intercept needs.
Adds the H.264 §8.5.11.1 chroma DC Hadamard transform. In 4:2:0 chroma, the four DC coefficients (one from each chroma 4x4 AC block within an MB) go through a 2x2 Hadamard before quant-scaling and before being added back to each block's [0,0] coefficient prior to the 4x4 AC IDCT.

This PR ships the pure Hadamard transform:

    f[0,0] = c[0,0] + c[0,1] + c[1,0] + c[1,1]
    f[0,1] = c[0,0] - c[0,1] + c[1,0] - c[1,1]
    f[1,0] = c[0,0] + c[0,1] - c[1,0] - c[1,1]
    f[1,1] = c[0,0] - c[0,1] - c[1,0] + c[1,1]

implemented as the 2-stage row+col butterfly (1:1 with the NEON SIMD shape upstream). Operates in-place on int16[4].

What this does NOT do (deferred to caller-side composition):

- QP-dependent scaling per §8.5.11.2. The scale depends on QP_C (which already includes the chroma QP offset adjustment), so the formula has branches (>=6 vs <6) and looks up LevelScale4x4 table values. The libavcodec intercept patch composes Hadamard + scale + shift itself, since the scale varies with codec-level context (slice-level QP, the PPS chroma_qp_index_offset, and second_chroma_qp_index_offset).
- A separate inverse transform. The decode-time transform is the same Hadamard as the forward direction up to scaling; the spec conceptually distinguishes them in §8.5.11, but we expose only the matrix.

Test design (tests/test_chroma_dc_hadamard.c): 7 cases, all spec-derived hand-computations:

- all-uniform 5 → [20, 0, 0, 0]
- col gradient [0,10,0,10] → [20, -20, 0, 0]
- row gradient [0,0,10,10] → [20, 0, -20, 0]
- anti-diagonal [10,0,0,10] → [20, 0, 0, 20]
- asymmetric [1,2,3,4] → [10, -2, -4, 0]
- sign-alternating [-5,5,-5,5] → [0, -20, 0, 0]
- double-Hadamard invariant: with H the 4x4 matrix form of the transform on the flattened int16[4], H·H = 4·I, so applying it twice gives [4*c[0], 4*c[1], 4*c[2], 4*c[3]] for any input.

The double-Hadamard test is the strongest correctness gate: any single sign error in the butterfly would break the H·H = 4·I algebraic property and surface immediately. All 7 PASS first try.
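For readers who want to see the butterfly shape concretely, here is a minimal sketch of a 2-stage row+col 2x2 Hadamard operating in place on int16[4]. It is not the code from this PR: the function name `chroma_dc_hadamard_2x2_sketch` is hypothetical, and the row-major layout (c[0]=c[0,0], c[1]=c[0,1], c[2]=c[1,0], c[3]=c[1,1]) is an assumption inferred from the test vectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; the real fourier symbol is not shown in this PR.
 * Assumed layout: c[0]=c[0,0], c[1]=c[0,1], c[2]=c[1,0], c[3]=c[1,1]. */
static void chroma_dc_hadamard_2x2_sketch(int16_t c[4])
{
    assert(c != NULL);

    /* Stage 1: sum/difference butterfly within each row. */
    const int r0s = c[0] + c[1];
    const int r0d = c[0] - c[1];
    const int r1s = c[2] + c[3];
    const int r1d = c[2] - c[3];

    /* Stage 2: sum/difference butterfly down each column. */
    c[0] = (int16_t)(r0s + r1s); /* f[0,0] */
    c[1] = (int16_t)(r0d + r1d); /* f[0,1] */
    c[2] = (int16_t)(r0s - r1s); /* f[1,0] */
    c[3] = (int16_t)(r0d - r1d); /* f[1,1] */
}
```

Applied to [1,2,3,4] this yields [10, -2, -4, 0], matching the asymmetric case above; applied a second time it yields [4, 8, 12, 16], which is the double-Hadamard invariant.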
Verified on hertz:

    $ ./build/test_chroma_dc_hadamard
    all-uniform 5                        PASS
    col gradient [0,10,0,10]             PASS
    row gradient [0,0,10,10]             PASS
    anti-diagonal [10,0,0,10]            PASS
    asymmetric [1,2,3,4]                 PASS
    sign-alternating [-5,5,-5,5]         PASS
    double-Hadamard = 4*orig             PASS
    ALL chroma DC Hadamard tests PASS

With this primitive the H.264 8-bit 4:2:0 pixel-math primitive matrix is complete in fourier:

- IDCT 4x4 (luma + chroma) ✓
- IDCT 8x8 (luma, High profile) ✓
- Chroma DC Hadamard 2x2 ✓ (this PR)
- Deblock (8 variants) ✓
- Intra prediction (26 modes) ✓
- MC qpel (30 dispatches) ✓

What remains for the libavcodec intercept patch: CABAC/CAVLC entropy decode, SPS/PPS parsing, slice header parsing, MB type / QP / CBP / intra mode prediction. All of that lives at the intercept layer (it's spec-derived from the bitstream syntax, not pixel-math); the intercept patch will call into these fourier primitives once the metadata is decoded.