daedalus-fourier/external/dav1d-snapshot/PROVENANCE.md
marfrit 2cd2258a7b Cycle 5 setup (Phase 1+2): vendor dav1d 1.4.3 CDEF sources
First AV1 kernel cycle and first dav1d-vendored sources. Phase 1+2
docs lay out the structural complexity (CDEF needs pre-padded 12x12
working buffer + external edge context + direction lookup +
constraint function — meaningfully more complex than cycles 1-4).

Phase 3+ deferred to next session — CDEF is the first cycle that
doesn't fit cleanly into a single autonomous run.

Vendored from dav1d 1.4.3 (BSD-2-Clause, cleaner license than
FFmpeg's LGPL-2.1+):

  src/arm/64/cdef.S            520 lines — NEON impl
  src/arm/64/util.S            278 lines — NEON helpers
  src/arm/asm.S                335 lines — GAS preamble
  src/cdef_tmpl.c              331 lines — C reference (templated)
  include/common/intops.h       84 lines — utility helpers
  src/tables_cdef_subset.c      hand-extracted — dav1d_cdef_directions
                                only (avoids dragging full 1013-line
                                tables.c + transitive includes)

Discovery from Phase 2 analysis:
- Filter type and shape: dav1d_cdef_filter8_pri_sec_8bpc_neon takes
  (dst, dst_stride, tmp, pri_strength, sec_strength, dir, damping, h).
  The 'tmp' arg is the pre-padded 12x12 buffer constructed externally
  by the dav1d C-side padding() function.
- Tap weights are inline-computed (not table): pri_tap = 4 or 3
  (based on pri_strength bit), sec_tap = 2 or 1. Only
  dav1d_cdef_directions[12][2] is an external table.
- Constraint function: constrain(diff, threshold, shift) =
  apply_sign(min(abs(diff), max(0, threshold - (abs(diff) >> shift))),
             diff)
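
The constraint formula and the inline tap selection above can be sketched in self-contained C. This is a minimal sketch for the 8 bpc case, not the vendored code: the helper names follow the ones listed for intops.h, but their bodies here are assumptions, and pri_tap_8bpc is a hypothetical name introduced only for illustration (it assumes the "pri_strength bit" is the lowest bit).

```c
#include <stdlib.h>

/* Assumed helper bodies; only the names come from intops.h. */
static inline int imin(const int a, const int b) { return a < b ? a : b; }
static inline int imax(const int a, const int b) { return a > b ? a : b; }

/* Give the non-negative value v the sign of s. */
static inline int apply_sign(const int v, const int s) { return s < 0 ? -v : v; }

/* constrain(diff, threshold, shift) as stated above: the magnitude of diff
 * is capped by a limit that shrinks as |diff| grows, then the sign of diff
 * is restored. */
static inline int constrain(const int diff, const int threshold, const int shift) {
    const int adiff = abs(diff);
    return apply_sign(imin(adiff, imax(0, threshold - (adiff >> shift))), diff);
}

/* Hypothetical helper: primary tap weight of 4 or 3, selected by the low
 * bit of pri_strength (an assumption for 8 bpc). */
static inline int pri_tap_8bpc(const int pri_strength) {
    return 4 - (pri_strength & 1);
}
```

With threshold 4 and shift 1, a difference of 3 passes through unchanged, 6 is reduced to 1, and 10 is suppressed to 0, so large differences (likely real edges) contribute nothing to the filter.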

Predicted R5 band: 0.15-0.30 (ORANGE). CDEF is compute-heavier than
LPF (per-pixel min/max conditional logic), so likely worse R than
cycle 2/4 but better than cycle 3 MC. M4 gate likely required.

What Phase 3+ needs (next session):
1. config.h shim for dav1d's asm preamble (defines TBD on first build)
2. Standalone C reference for cdef_filter_block_8x8_c
   (cdef_tmpl.c references several dav1d private headers; cleaner to
   transcribe to a self-contained tests/cdef_ref.c)
3. tests/bench_neon_cdef.c — M1+M3 bench
4. Phase 4 plan, Phase 5 review (mandatory), Phase 6 shader, Phase 7 measure

PROVENANCE.md documents pin + per-file role + re-vendoring procedure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 13:12:25 +00:00


dav1d source snapshot

Verbatim subset of dav1d source pinned for use as reference implementations of AV1 CDEF (cycle 5 of daedalus-fourier) and potentially future AV1 kernels. dav1d is the canonical AV1 decoder library (BSD-2-Clause, maintained by VideoLAN).

See ../../docs/k5_cdef_phase1_2.md for the cycle 5 scope and rationale.

Upstream pin

dav1d release tag 1.4.3.

Files in this snapshot

All files are byte-for-byte copies of the upstream source at the tagged commit, except tables_cdef_subset.c which is a hand-extracted single-table copy from src/tables.c (see §"Why each file" below).

| Path | Lines | SHA-256 |
|---|---|---|
| src/arm/64/cdef.S | 520 | 88d048cbed93f168... (TODO full hash) |
| src/arm/64/util.S | 278 | 582acd8e2b74a1e8... |
| src/arm/asm.S | 335 | 6a22def2799876c4... |
| src/cdef_tmpl.c | 331 | 26a7a5f9fda65c58... |
| include/common/intops.h | 84 | c1e7d52b421d6417... |
| src/tables_cdef_subset.c | hand-extracted | |

The hashes above are truncated. The full SHA-256 values are to be regenerated during Phase 3 setup with:

( cd external/dav1d-snapshot && sha256sum \
    src/arm/64/cdef.S src/arm/64/util.S src/arm/asm.S \
    src/cdef_tmpl.c include/common/intops.h )

License

BSD-2-Clause. Copyright (c) 2018 VideoLAN and dav1d authors; (c) 2019 Martin Storsjö (NEON aarch64). Original copyright headers preserved in each vendored file.

This license is more permissive than the LGPL-2.1+ of the FFmpeg snapshot: BSD-2-Clause allows binaries to be distributed without the LGPL's requirement that recipients be able to relink against a modified library. A daedalus-fourier bench that links only this snapshot is covered by BSD-2-Clause alone. A bench that combines both snapshots (none currently) falls under LGPL-2.1+ because of FFmpeg's stronger terms.

Why each file

  • src/arm/64/cdef.S — the NEON aarch64 implementation. Provides dav1d_cdef_filter8_pri_sec_8bpc_neon and pri-only / sec-only variants. The Phase 3 NEON baseline (M3₅) measures this symbol.

  • src/arm/64/util.S — helper macros (load_px_8, handle_pixel_8, etc.) referenced by cdef.S.

  • src/arm/asm.S — top-level GAS preamble (function/endfunc, movrel, register macros). dav1d's version is similar to FFmpeg's but uses different defines (PRIVATE_PREFIX dav1d_ etc.); Phase 3 setup will identify the config.h shim needed to assemble it standalone.

  • src/cdef_tmpl.c — the C reference (templated; the cdef_filter_block_c core function is in here, expanded to cdef_filter_block_8x8_c via cdef_fn(8, 8)).

  • include/common/intops.h — utility helpers (apply_sign, imin, imax, iclip, umin) used by cdef_tmpl.c.

  • src/tables_cdef_subset.c — hand-extracted dav1d_cdef_directions table from src/tables.c (lines 400-414). It provides the only table symbol that both cdef.S and cdef_tmpl.c reference externally. Pulling in the full src/tables.c (1013 lines) would drag in its transitive includes, which is far more than the CDEF benches need. See the header comment in tables_cdef_subset.c for the line-range reference back to upstream.

Re-vendoring procedure

Same as FFmpeg snapshot — see ../ffmpeg-snapshot/PROVENANCE.md.

TAG=1.x.y
BASE=https://raw.githubusercontent.com/videolan/dav1d/$TAG
cd external/dav1d-snapshot
for f in src/arm/64/cdef.S src/arm/64/util.S src/arm/asm.S \
         src/cdef_tmpl.c include/common/intops.h; do
  curl -sSf -o "$f" "$BASE/$f"
done
# tables_cdef_subset.c needs manual re-extraction from
# upstream src/tables.c — search for "dav1d_cdef_directions ="

Pending work (Phase 3+, next session)

  • config.h shim for assembling cdef.S standalone (dav1d's defines differ from FFmpeg's; will identify exact list on first build)
  • Standalone C reference for cdef_filter_block_8x8_c (this snapshot's cdef_tmpl.c references several private headers — easier to transcribe to a self-contained tests/cdef_ref.c)
  • tests/bench_neon_cdef.c to capture M3₅ baseline