ffmpeg-v4l2-request-fourier: substitute H.264 luma-v deblock → daedalus-fourier #86

Merged
marfrit merged 1 commit from claude-noether/marfrit-packages:noether/ffmpeg-fourier-deblock-luma-v-daedalus into main 2026-05-22 10:29:10 +00:00

Cycle 8 of the libavcodec.so substitution arc (reauktion/daedalus-v4l2#11 step 2). H264DSPContext.v_loop_filter_luma — non-intra bS<4 vertical luma deblock, called per macroblock-row edge from the slice deblock loop in libavcodec/h264_loopfilter.c — now dispatches through daedalus_recipe_dispatch_h264_deblock_luma_v instead of ff_h264_v_loop_filter_luma_neon.
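For orientation, the arithmetic behind a non-intra (bS<4) vertical luma filter can be sketched in scalar C. This is a minimal 8-bit reference written from the standard H.264 normal-strength filter; it is not the daedalus recipe or the NEON routine, and the names `clip3` and `ref_v_loop_filter_luma` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Clamp v into [lo, hi]. */
static int clip3(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }

/* Scalar 8-bit reference for the bS<4 luma filter across one horizontal edge.
 * pix points at the first q0 sample (the row just below the edge), 16 columns
 * wide. tc0[i] covers columns 4*i .. 4*i+3; a negative entry skips them.
 * Assumes arithmetic right shift of negative ints (true for gcc and clang). */
static void ref_v_loop_filter_luma(uint8_t *pix, ptrdiff_t stride,
                                   int alpha, int beta, const int8_t tc0[4])
{
    assert(alpha >= 0 && beta >= 0);
    for (int x = 0; x < 16; x++) {
        const int tc_orig = tc0[x >> 2];
        if (tc_orig < 0)
            continue;
        uint8_t *c = pix + x;
        const int p2 = c[-3 * stride], p1 = c[-2 * stride], p0 = c[-1 * stride];
        const int q0 = c[0], q1 = c[stride], q2 = c[2 * stride];
        /* Edge-activity test: leave genuine image edges untouched. */
        if (abs(p0 - q0) >= alpha || abs(p1 - p0) >= beta || abs(q1 - q0) >= beta)
            continue;
        int tc = tc_orig;
        const int avg = (p0 + q0 + 1) >> 1;
        if (abs(p2 - p0) < beta) {          /* smooth p side: also adjust p1 */
            if (tc_orig)
                c[-2 * stride] = (uint8_t)(p1 + clip3(((p2 + avg) >> 1) - p1, -tc_orig, tc_orig));
            tc++;
        }
        if (abs(q2 - q0) < beta) {          /* smooth q side: also adjust q1 */
            if (tc_orig)
                c[stride] = (uint8_t)(q1 + clip3(((q2 + avg) >> 1) - q1, -tc_orig, tc_orig));
            tc++;
        }
        const int delta = clip3((((q0 - p0) * 4) + (p1 - q1) + 4) >> 3, -tc, tc);
        c[-1 * stride] = (uint8_t)clip3(p0 + delta, 0, 255);
        c[0]           = (uint8_t)clip3(q0 - delta, 0, 255);
    }
}
```

A reference of this shape is mainly useful for bit-exactness checks: run it and the substituted path on the same edge and compare the six affected rows.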

What

  • Add 0005-h264-deblock-luma-v-daedalus-fourier.patch (in both arch/ and debian/ ffmpeg-v4l2-request-fourier/). Extends libavcodec/aarch64/h264_idct_daedalus.c with ff_h264_v_loop_filter_luma_daedalus (constructs a daedalus_h264_deblock_meta from FFmpeg's (alpha, beta, tc0[4]) and calls daedalus_recipe_dispatch_h264_deblock_luma_v with n_edges=1). Patches libavcodec/aarch64/h264dsp_init_aarch64.c to wire c->v_loop_filter_luma to the new shim.
  • arch/PKGBUILD + debian/build-deb.sh: append patch + bump pkgrel/PKGREL to 8.
  • No new build-deps, no Depends change, no daedalus-fourier pin bump: the d87239d pin already exposes daedalus_recipe_dispatch_h264_deblock_luma_v.
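The shape of the shim and its wiring can be sketched as below. The real daedalus-fourier headers are not part of this PR text, so everything here is an assumption made for illustration: the layout of the meta struct, the `mock_dispatch_h264_deblock_luma_v` stand-in for daedalus_recipe_dispatch_h264_deblock_luma_v, its argument order, and the `MockH264DSP` stand-in for H264DSPContext.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* HYPOTHETICAL layout: the real daedalus_h264_deblock_meta may differ. */
typedef struct {
    int32_t alpha, beta;
    int8_t  tc0[4];
} mock_deblock_meta;

/* Records the last dispatch so the sketch can be checked without the library. */
static struct {
    int calls;
    size_t n_edges;
    uint8_t *pix;
    ptrdiff_t stride;
    mock_deblock_meta meta;
} last_dispatch;

/* HYPOTHETICAL stand-in for daedalus_recipe_dispatch_h264_deblock_luma_v. */
static int mock_dispatch_h264_deblock_luma_v(uint8_t *pix, ptrdiff_t stride,
                                             const mock_deblock_meta *meta,
                                             size_t n_edges)
{
    assert(meta != NULL && n_edges >= 1);
    last_dispatch.calls++;
    last_dispatch.n_edges = n_edges;
    last_dispatch.pix = pix;
    last_dispatch.stride = stride;
    last_dispatch.meta = *meta;
    return 0;
}

/* Shim with the (pix, stride, alpha, beta, tc0) argument list of FFmpeg's
 * luma loop-filter hook: pack the parameters and dispatch a single edge. */
static void v_loop_filter_luma_shim(uint8_t *pix, ptrdiff_t stride,
                                    int alpha, int beta, int8_t *tc0)
{
    mock_deblock_meta meta = { .alpha = alpha, .beta = beta };
    memcpy(meta.tc0, tc0, sizeof(meta.tc0));
    (void)mock_dispatch_h264_deblock_luma_v(pix, stride, &meta, 1 /* n_edges */);
}

/* HYPOTHETICAL stand-in for the one H264DSPContext member touched here. */
typedef struct {
    void (*v_loop_filter_luma)(uint8_t *pix, ptrdiff_t stride,
                               int alpha, int beta, int8_t *tc0);
} MockH264DSP;

/* Wiring step, analogous to the change in h264dsp_init_aarch64.c. */
static void mock_h264dsp_init(MockH264DSP *c)
{
    c->v_loop_filter_luma = v_loop_filter_luma_shim;
}
```

Because the hook keeps its existing signature, callers in the deblock loop stay unchanged; only the function pointer assignment differs.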

Why

Cycle 8 is marked "CPU primary; QPU opportunistic" in the daedalus-fourier API docstring. Per the hybrid substrate philosophy ("if there's a coprocessor, use it") we eventually want the QPU opportunism active here. But the libavcodec.so context is process-global and shared with cycles 6/7 via pthread_once, and it uses daedalus_ctx_create_no_qpu deliberately to avoid implicit Vulkan init in arbitrary host processes (Firefox content, mpv-fourier, ffmpeg-fourier CLI, ...). Switching to daedalus_ctx_create here without a feature flag would be a footgun.

So cycle 8 lands as plumbing-only NEON-by-recipe substitution for now; opportunistic QPU enablement is a separate follow-up that adds a DAEDALUS_FOURIER_ENABLE_QPU env var or equivalent.
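One way the follow-up could gate QPU use is sketched below, assuming the DAEDALUS_FOURIER_ENABLE_QPU variable named above is read once when the process-global context is created. The `mock_ctx` type and the two `mock_ctx_create*` functions are hypothetical stand-ins for daedalus_ctx_create and daedalus_ctx_create_no_qpu, and treating only the value "1" as opt-in is likewise an assumption.

```c
#define _POSIX_C_SOURCE 200809L  /* pthread_once and setenv under strict -std */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL stand-in for the daedalus context handle. */
typedef struct { int qpu_enabled; } mock_ctx;
static mock_ctx ctx_storage;

/* HYPOTHETICAL stand-ins for daedalus_ctx_create / daedalus_ctx_create_no_qpu. */
static mock_ctx *mock_ctx_create(void)        { ctx_storage.qpu_enabled = 1; return &ctx_storage; }
static mock_ctx *mock_ctx_create_no_qpu(void) { ctx_storage.qpu_enabled = 0; return &ctx_storage; }

static pthread_once_t ctx_once = PTHREAD_ONCE_INIT;
static mock_ctx *g_ctx;

/* Opt-in only: an unset variable or any value other than "1" keeps the QPU off. */
static int qpu_opted_in(void)
{
    const char *v = getenv("DAEDALUS_FOURIER_ENABLE_QPU");
    return v != NULL && strcmp(v, "1") == 0;
}

/* Runs exactly once per process, whichever cycle's shim gets there first. */
static void ctx_init(void)
{
    g_ctx = qpu_opted_in() ? mock_ctx_create() : mock_ctx_create_no_qpu();
}

/* Shared accessor used by every shim in the library. */
static mock_ctx *shared_ctx(void)
{
    if (pthread_once(&ctx_once, ctx_init) != 0)
        return NULL;
    assert(g_ctx != NULL);
    return g_ctx;
}
```

Since the decision is latched by `pthread_once`, changing the variable after the first decode call has no effect; a host that wants QPU dispatch would have to set it before loading any media.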

Scope NOT covered

  • Intra (bS=4) loop filter c->v_loop_filter_luma_intra — daedalus's daedalus_h264_deblock_meta only covers the non-intra path.
  • Horizontal-filter variant c->h_loop_filter_luma (applied across vertical edges): separate kernel, not yet in the daedalus-fourier API.
  • Chroma loop filters — separate kernels.
  • Bulk batching — single-edge dispatch wastes the kernel's n_edges>1 amortization. Same caveat as cycles 6/7; follow-up.
  • QPU opportunism — see Why above.

SONAME

Unchanged. libavcodec.so.62 / libavformat.so.62 / libavutil.so.60.

Refs

  • reauktion/daedalus-v4l2 issue #11: https://git.reauktion.de/reauktion/daedalus-v4l2/issues/11
  • marfrit-packages PR #76 (cycle 6 IDCT 4×4)
  • marfrit-packages PR #85 (cycle 7 IDCT 8×8)
marfrit added 1 commit 2026-05-22 10:17:38 +00:00
Cycle 8 of the libavcodec.so substitution arc (reauktion/daedalus-v4l2#11
step 2).  H264DSPContext.v_loop_filter_luma — non-intra bS<4 vertical
luma deblock, called per macroblock-row edge from the slice deblock
loop in libavcodec/h264_loopfilter.c — now dispatches through
daedalus_recipe_dispatch_h264_deblock_luma_v instead of
ff_h264_v_loop_filter_luma_neon.

## What

- Add 0005-h264-deblock-luma-v-daedalus-fourier.patch (in both arch/
  and debian/ ffmpeg-v4l2-request-fourier/).  Extends
  libavcodec/aarch64/h264_idct_daedalus.c with
  ff_h264_v_loop_filter_luma_daedalus (constructs a
  daedalus_h264_deblock_meta from FFmpeg's (alpha, beta, tc0[4]) and
  calls daedalus_recipe_dispatch_h264_deblock_luma_v with n_edges=1).
  Patches libavcodec/aarch64/h264dsp_init_aarch64.c to wire
  c->v_loop_filter_luma to the new shim.
- arch/PKGBUILD + debian/build-deb.sh: append patch + bump pkgrel/PKGREL
  to 8.
- No new build-deps, no Depends change, no daedalus-fourier rev — the
  d87239d pin already exposes daedalus_recipe_dispatch_h264_deblock_luma_v.

## Why

Cycle 8 is marked "CPU primary; QPU opportunistic" in the daedalus-
fourier API docstring.  Per the hybrid substrate philosophy
("if there's a coprocessor, use it") we eventually want the QPU
opportunism active here.  But the libavcodec.so context is
process-global and shared with cycles 6/7 via pthread_once, and it
uses daedalus_ctx_create_no_qpu deliberately to avoid implicit
Vulkan init in arbitrary host processes (Firefox content, mpv-fourier,
ffmpeg-fourier CLI, ...).  Switching to daedalus_ctx_create here
without a feature flag would be a footgun.

So cycle 8 lands as plumbing-only NEON-by-recipe substitution for
now; opportunistic QPU enablement is a separate follow-up that adds
a DAEDALUS_FOURIER_ENABLE_QPU env var or equivalent.

## Scope NOT covered

- Intra (bS=4) loop filter c->v_loop_filter_luma_intra — daedalus's
  daedalus_h264_deblock_meta only covers the non-intra path.
- Horizontal-filter variant c->h_loop_filter_luma (applied across
  vertical edges): separate kernel, not yet in the daedalus-fourier API.
- Chroma loop filters — separate kernels.
- Bulk batching — single-edge dispatch wastes the kernel's n_edges>1
  amortization.  Same caveat as cycles 6/7; follow-up.
- QPU opportunism — see "Why" above.

## SONAME

Unchanged.  libavcodec.so.62 / libavformat.so.62 / libavutil.so.60.

## Refs

- reauktion/daedalus-v4l2 issue #11: https://git.reauktion.de/reauktion/daedalus-v4l2/issues/11
- marfrit-packages PR #76 (cycle 6 IDCT 4×4)
- marfrit-packages PR #85 (cycle 7 IDCT 8×8)
- marfrit/daedalus-fourier cycle 8 close (deblock luma-v NEON green)
marfrit merged commit d11a52405d into main 2026-05-22 10:29:10 +00:00
marfrit deleted branch noether/ffmpeg-fourier-deblock-luma-v-daedalus 2026-05-22 10:29:10 +00:00

Reference: marfrit/marfrit-packages#86