phase1: substrate selector API + cross-substrate bit-exact ctest #9

Merged
marfrit merged 1 commit from noether/phase1-substrate-select into main 2026-05-24 21:14:06 +00:00

Surfaces daedalus-fourier's substrate-override at the decoder boundary: new daedalus_decoder_set_substrate(dec, AUTO/CPU/QPU) setter with the same mid-frame restriction as set_output_format. Default unchanged (AUTO).

New ctest idct_bitexact_cpu re-runs QVGA forcing the NEON path; produces byte-identical output to the AUTO/QPU run, confirming the V3D shaders and daedalus-fourier's CPU reference agree at the spec level. Also unlocks the bit-exact suite on hosts without V3D7 (CI runners, x86).

marfrit added 1 commit 2026-05-24 21:08:13 +00:00
Surfaces daedalus-fourier's substrate-override capability at the
decoder boundary.  Lets tests run on CPU-only hosts (CI runners,
x86 dev boxes) AND cross-checks V3D shader output against NEON
reference on hosts that have both.

API additions (pre-0.1 ABI, additive):

  - daedalus_decoder_substrate enum { AUTO, CPU, QPU }
    (mirrors daedalus_substrate; isolated for ABI reasons).
  - daedalus_decoder_set_substrate(dec, sub) setter, same
    mid-frame-change restrictions as set_output_format.
  - Default remains AUTO — the only sensible choice for production.
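
A minimal sketch of how a setter with these restrictions might behave.  The real header is not shown in this PR, so the enumerator names, the `sketch_decoder` struct, the `in_frame` flag, and the 0 / -1 return convention are all assumptions made for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enumerator names; the PR only states { AUTO, CPU, QPU }. */
typedef enum {
    DAEDALUS_DECODER_SUBSTRATE_AUTO = 0,
    DAEDALUS_DECODER_SUBSTRATE_CPU  = 1,
    DAEDALUS_DECODER_SUBSTRATE_QPU  = 2
} daedalus_decoder_substrate;

/* Stand-in for the opaque decoder; only the fields this sketch needs. */
typedef struct {
    daedalus_decoder_substrate substrate;
    bool in_frame;  /* assumed: true while a frame is being assembled */
} sketch_decoder;

/* Returns 0 on success, -1 on a NULL decoder, an out-of-range value,
 * or an attempted change while a frame is in progress.  Takes an int so
 * that bogus values can be range-checked before conversion. */
static int sketch_set_substrate(sketch_decoder *dec, int sub)
{
    if (dec == NULL)
        return -1;
    if (sub < DAEDALUS_DECODER_SUBSTRATE_AUTO ||
        sub > DAEDALUS_DECODER_SUBSTRATE_QPU)
        return -1;
    if (dec->in_frame)
        return -1;  /* mid-frame change rejected; state left untouched */
    dec->substrate = (daedalus_decoder_substrate)sub;
    return 0;
}
```

A rejected call leaves the previous substrate in place, which is the behaviour the "valid + bogus" smoke checks would presumably exercise.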

Internal:

  - flush_frame now calls daedalus_dispatch_h264_idct{4,8} with an
    explicit substrate instead of daedalus_recipe_dispatch_*.  Mapped
    via a small map_substrate() helper.  No perf delta on AUTO (recipe
    layer was just doing the same dispatch under the hood).
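
For illustration, a helper like map_substrate() could be as small as the sketch below.  Both enums here are stand-ins with made-up names and values (the library-side values are deliberately different to show why an explicit mapping is safer than a cast); the actual daedalus_substrate definition is not part of this PR:

```c
/* Decoder-facing enum (hypothetical names and values). */
typedef enum { DEC_SUB_AUTO, DEC_SUB_CPU, DEC_SUB_QPU } dec_substrate;

/* Stand-in for daedalus-fourier's daedalus_substrate (values assumed). */
typedef enum {
    FOURIER_SUB_AUTO = 10,
    FOURIER_SUB_CPU  = 11,
    FOURIER_SUB_QPU  = 12
} fourier_substrate;

/* Translate the decoder-level choice into the library-level one.
 * Unknown values fall back to AUTO, matching the documented default. */
static fourier_substrate sketch_map_substrate(dec_substrate s)
{
    switch (s) {
    case DEC_SUB_CPU:
        return FOURIER_SUB_CPU;
    case DEC_SUB_QPU:
        return FOURIER_SUB_QPU;
    case DEC_SUB_AUTO:
    default:
        return FOURIER_SUB_AUTO;
    }
}
```

Keeping the two enums separate means the library can renumber its own values without changing the decoder's ABI.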

Test changes:

  - test_smoke: new EXPECTs for set_substrate (valid + bogus).
  - test_idct_bitexact: new argv[4] takes "auto" (default), "cpu", or
    "qpu" to force the substrate.
  - CMakeLists.txt: new ctest entry `idct_bitexact_cpu` re-runs the
    QVGA case forcing the CPU path.  Catches silent drift between
    the V3D shader and the NEON reference; both must produce
    identical output for the same coefficient input (and they do —
    see ctest log below).
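
The argv[4] handling described above could be sketched roughly as follows.  The function name, the enum, and the choice to reject unknown strings (rather than silently fall back to auto) are assumptions, not the test's actual code:

```c
#include <string.h>

typedef enum { ARG_SUB_AUTO, ARG_SUB_CPU, ARG_SUB_QPU } arg_substrate;

/* Parse the optional substrate argument.  A NULL pointer models an
 * omitted argv[4] and selects the default, "auto".
 * Returns 0 and writes *out on success, -1 on an unrecognised string. */
static int sketch_parse_substrate(const char *arg, arg_substrate *out)
{
    if (arg == NULL || strcmp(arg, "auto") == 0) {
        *out = ARG_SUB_AUTO;
        return 0;
    }
    if (strcmp(arg, "cpu") == 0) {
        *out = ARG_SUB_CPU;
        return 0;
    }
    if (strcmp(arg, "qpu") == 0) {
        *out = ARG_SUB_QPU;
        return 0;
    }
    return -1;
}
```

A caller would pass `argc > 4 ? argv[4] : NULL`, so existing ctest entries that supply only three arguments keep running on AUTO.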

Verified on hertz (Pi 5 / V3D 7.1 / daedalus-fourier 0.1.0):

  $ ctest --test-dir build --output-on-failure
      Start 1: smoke
  1/4 Test #1: smoke ............................   Passed    0.10 sec
      Start 2: idct_bitexact
  2/4 Test #2: idct_bitexact ....................   Passed    0.03 sec
      Start 3: idct_bitexact_cpu
  3/4 Test #3: idct_bitexact_cpu ................   Passed    0.03 sec
      Start 4: idct_bitexact_1080p
  4/4 Test #4: idct_bitexact_1080p ..............   Passed    0.06 sec

  100% tests passed, 0 tests failed out of 4

CPU substrate produces byte-identical Y + Cb + Cr planes against the
same C reference that the AUTO/QPU path matches — confirming the V3D
shaders and the daedalus-fourier NEON path agree at the spec level.
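
A byte-identical plane check of this kind can be sketched as below.  The `sketch_plane` struct and the row-by-row comparison are assumptions about how such a test might be written; the real test's buffer layout is not shown here:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One image plane (Y, Cb, or Cr).  stride >= width, in bytes. */
typedef struct {
    const uint8_t *data;
    size_t width;
    size_t height;
    size_t stride;
} sketch_plane;

/* Returns 1 when every visible byte matches, 0 otherwise.
 * Rows are compared individually so that stride padding, which may
 * legitimately differ between substrates, is ignored. */
static int sketch_planes_equal(const sketch_plane *a, const sketch_plane *b)
{
    if (a->width != b->width || a->height != b->height)
        return 0;
    for (size_t y = 0; y < a->height; y++) {
        const uint8_t *ra = a->data + y * a->stride;
        const uint8_t *rb = b->data + y * b->stride;
        if (memcmp(ra, rb, a->width) != 0)
            return 0;
    }
    return 1;
}
```

Running this once per plane against the C reference gives the Y + Cb + Cr verdict; because both substrates are checked against the same reference, agreement between them follows transitively.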

Why we plumbed the lower-level dispatch instead of leaving recipe in
place: recipe is just a thin wrapper that calls dispatch with
AUTO.  Once we needed substrate control, the wrapper became a
liability (would have required adding a parallel recipe API for each
substrate); going direct is simpler and the AUTO path is unchanged.

Coverage note: idct_bitexact_cpu runs at QVGA (300 macroblocks) only,
not also at 1080p.  The CPU path's wall time scales linearly with
block count, and a 1080p CPU run takes ~0.5s on hertz: fine
standalone, but slow enough in ctest that it would tempt opt-in
gating.  The bit-exact content is the same regardless of frame size;
the 1080p variant exists only to catch index-arithmetic bugs that
surface once block counts exceed small integer ranges.
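
To make the frame-size argument concrete, the arithmetic below counts macroblocks and 4x4 luma blocks per frame.  The commit does not say which integer width is at risk; the 16-bit limit used in the comments is an illustrative assumption:

```c
#include <stdint.h>

/* Macroblocks per frame: each dimension rounds up to a multiple of 16,
 * so 1080 rows are coded as 1088. */
static uint32_t sketch_mb_count(uint32_t width, uint32_t height)
{
    return ((width + 15u) / 16u) * ((height + 15u) / 16u);
}

/* 4x4 luma blocks per frame: 16 per macroblock. */
static uint32_t sketch_luma4x4_count(uint32_t width, uint32_t height)
{
    return sketch_mb_count(width, height) * 16u;
}

/* QVGA  (320x240):   20 * 15 =  300 MBs ->   4800 blocks, fits in 16 bits.
 * 1080p (1920x1080): 120 * 68 = 8160 MBs -> 130560 blocks, exceeds 65535,
 * so a block index held in a uint16_t would wrap only at 1080p. */
```

Under that assumption, a QVGA-only suite would pass even with a truncated index, which is the class of bug the 1080p variant is there to expose.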
marfrit merged commit 43aa43017c into main 2026-05-24 21:14:06 +00:00
marfrit deleted branch noether/phase1-substrate-select 2026-05-24 21:14:06 +00:00
Reference: marfrit/daedalus-decoder#9