From 29e0852d119fcda27e701b972141596a41f839a3 Mon Sep 17 00:00:00 2001
From: claude-noether
Date: Fri, 22 May 2026 12:17:14 +0200
Subject: [PATCH] =?UTF-8?q?ffmpeg-v4l2-request-fourier:=20substitute=20H.2?=
 =?UTF-8?q?64=20luma-v=20deblock=20=E2=86=92=20daedalus-fourier?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cycle 8 of the libavcodec.so substitution arc (reauktion/daedalus-v4l2#11
step 2). H264DSPContext.v_loop_filter_luma (the non-intra bS<4 vertical
luma deblock, called per macroblock-row edge from the slice deblock loop
in libavcodec/h264_loopfilter.c) now dispatches through
daedalus_recipe_dispatch_h264_deblock_luma_v instead of
ff_h264_v_loop_filter_luma_neon.

## What

- Add 0005-h264-deblock-luma-v-daedalus-fourier.patch (in both arch/ and
  debian/ ffmpeg-v4l2-request-fourier/). Extends
  libavcodec/aarch64/h264_idct_daedalus.c with
  ff_h264_v_loop_filter_luma_daedalus (constructs a
  daedalus_h264_deblock_meta from FFmpeg's (alpha, beta, tc0[4]) and
  calls daedalus_recipe_dispatch_h264_deblock_luma_v with n_edges=1).
  Patches libavcodec/aarch64/h264dsp_init_aarch64.c to wire
  c->v_loop_filter_luma to the new shim.
- arch/PKGBUILD + debian/build-deb.sh: append patch + bump pkgrel/PKGREL
  to 8.
- No new build-deps, no Depends change, no daedalus-fourier rev: the
  d87239d pin already exposes
  daedalus_recipe_dispatch_h264_deblock_luma_v.

## Why

Cycle 8 is marked "CPU primary; QPU opportunistic" in the
daedalus-fourier API docstring. Per the hybrid substrate philosophy
("if there's a coprocessor, use it") we eventually want the QPU
opportunism active here. But the libavcodec.so context is process-global
and shared with cycles 6/7 via pthread_once, and it uses
daedalus_ctx_create_no_qpu deliberately to avoid implicit Vulkan init in
arbitrary host processes (Firefox content, mpv-fourier, ffmpeg-fourier
CLI, ...). Switching to daedalus_ctx_create here without a feature flag
would be a footgun.
So cycle 8 lands as a plumbing-only NEON-by-recipe substitution for now;
opportunistic QPU enablement is a separate follow-up that adds a
DAEDALUS_FOURIER_ENABLE_QPU env var or equivalent.

## Scope NOT covered

- Intra (bS=4) loop filter c->v_loop_filter_luma_intra: daedalus's
  daedalus_h264_deblock_meta only covers the non-intra path.
- Horizontal-edge variant c->h_loop_filter_luma: separate kernel (not yet
  in daedalus-fourier API).
- Chroma loop filters: separate kernels.
- Bulk batching: single-edge dispatch wastes the kernel's n_edges>1
  amortization. Same caveat as cycles 6/7; follow-up.
- QPU opportunism: see "Why" above.

## SONAME

Unchanged. libavcodec.so.62 / libavformat.so.62 / libavutil.so.60.

## Refs

- reauktion/daedalus-v4l2 issue #11:
  https://git.reauktion.de/reauktion/daedalus-v4l2/issues/11
- marfrit-packages PR #76 (cycle 6 IDCT 4×4)
- marfrit-packages PR #85 (cycle 7 IDCT 8×8)
- marfrit/daedalus-fourier cycle 8 close (deblock luma-v NEON green)
---
 ...h264-deblock-luma-v-daedalus-fourier.patch | 121 ++++++++++++++++++
 arch/ffmpeg-v4l2-request-fourier/PKGBUILD     |   8 +-
 ...h264-deblock-luma-v-daedalus-fourier.patch | 121 ++++++++++++++++++
 .../ffmpeg-v4l2-request-fourier/build-deb.sh  |  13 +-
 .../debian/changelog                          |  25 ++++
 5 files changed, 280 insertions(+), 8 deletions(-)
 create mode 100644 arch/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch
 create mode 100644 debian/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch

diff --git a/arch/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch b/arch/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch
new file mode 100644
index 000000000..5146c3710
--- /dev/null
+++ b/arch/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch
@@ -0,0 +1,121 @@
+From 68731c41d7ea68be0e912b128cb4e71fb56e8263 Mon Sep 17 00:00:00 2001
+From: Markus Fritsche
+Date: Fri, 22 May 2026 12:15:16 +0200
+Subject: 
[PATCH] avcodec/aarch64/h264dsp: route H.264 luma-v deblock through + daedalus-fourier +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +H264DSPContext.v_loop_filter_luma (non-intra bS<4 vertical luma +deblock, called per macroblock-row edge from the slice deblock +loop) now dispatches through +daedalus_recipe_dispatch_h264_deblock_luma_v instead of +ff_h264_v_loop_filter_luma_neon. + +The recipe layer picks the substrate; for cycle 8 the daedalus +docstring marks the kernel "CPU primary; QPU opportunistic", but +the libavcodec.so context here is built with +daedalus_ctx_create_no_qpu — process-global pthread_once init, +shared with cycles 6/7. QPU opportunism stays gated off until a +follow-up adds an explicit feature flag (no implicit Vulkan init +in arbitrary host processes). In the meantime cycle 8 is a +plumbing-only substitution, NEON-to-NEON via the daedalus recipe. + +Intra (bS=4) loop filter — c->v_loop_filter_luma_intra — stays on +the in-tree NEON .S code; daedalus's daedalus_h264_deblock_meta +only covers the non-intra path per its docstring. + +FFmpeg `int alpha/beta/int8_t tc0[4]` → daedalus_h264_deblock_meta +(int32_t alpha/beta + inline int8_t tc0[4]). pix already points +to row 0 of the bottom block per FFmpeg's deblock convention, +satisfying daedalus's `dst_off >= 4 * dst_stride` constraint. + +Refs reauktion/daedalus-v4l2#11 — substitution arc step 2 cycle 8. +--- + libavcodec/aarch64/h264_idct_daedalus.c | 36 +++++++++++++++++++---- + libavcodec/aarch64/h264dsp_init_aarch64.c | 4 ++- + 2 files changed, 33 insertions(+), 7 deletions(-) + +diff --git a/libavcodec/aarch64/h264_idct_daedalus.c b/libavcodec/aarch64/h264_idct_daedalus.c +index cbb98af..92365fa 100644 +--- a/libavcodec/aarch64/h264_idct_daedalus.c ++++ b/libavcodec/aarch64/h264_idct_daedalus.c +@@ -1,11 +1,14 @@ + /* +- * H.264 4x4 / 8x8 IDCT + add — daedalus-fourier substitution shims. 
++ * H.264 4x4 / 8x8 IDCT + luma-v deblock — daedalus-fourier substitution shims. + * +- * Routes H264DSPContext.idct_add → daedalus_recipe_dispatch_h264_idct4 +- * H264DSPContext.idct8_add → daedalus_recipe_dispatch_h264_idct8 +- * instead of the in-tree ff_h264_idct{,8}_add_neon assembly. The +- * recipe layer picks the substrate (CPU NEON by default for cycles +- * 6 + 7; future cycles may dispatch to V3D opportunistically). ++ * Routes H264DSPContext.idct_add → daedalus_recipe_dispatch_h264_idct4 ++ * H264DSPContext.idct8_add → daedalus_recipe_dispatch_h264_idct8 ++ * H264DSPContext.v_loop_filter_luma → daedalus_recipe_dispatch_h264_deblock_luma_v ++ * instead of the in-tree ff_h264_*_neon assembly. The recipe layer ++ * picks the substrate (CPU NEON for cycles 6 + 7 by default; cycle 8 ++ * is CPU primary with QPU opportunistic — the ctx below is no-QPU, ++ * so cycle 8 stays on the CPU NEON path until a separate change ++ * gates QPU init on a daedalus-fourier feature flag). + * + * FFmpeg's 4x4 and 8x8 block memory layouts match daedalus's + * column-major convention: block[r + N*c] = coefficient at +@@ -40,6 +43,8 @@ static void daedalus_ctx_init_once(void) + + void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride); + void ff_h264_idct8_add_daedalus(uint8_t *dst, int16_t *block, int stride); ++void ff_h264_v_loop_filter_luma_daedalus(uint8_t *pix, ptrdiff_t stride, ++ int alpha, int beta, int8_t *tc0); + + void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride) + { +@@ -60,3 +65,22 @@ void ff_h264_idct8_add_daedalus(uint8_t *dst, int16_t *block, int stride) + daedalus_recipe_dispatch_h264_idct8(g_dctx, dst, (size_t)stride, + block, 1, &meta); + } ++ ++void ff_h264_v_loop_filter_luma_daedalus(uint8_t *pix, ptrdiff_t stride, ++ int alpha, int beta, int8_t *tc0) ++{ ++ daedalus_h264_deblock_meta meta = { ++ .dst_off = 0, ++ .alpha = alpha, ++ .beta = beta, ++ }; ++ meta.tc0[0] = tc0[0]; ++ meta.tc0[1] = tc0[1]; ++ meta.tc0[2] 
= tc0[2]; ++ meta.tc0[3] = tc0[3]; ++ ++ pthread_once(&g_dctx_once, daedalus_ctx_init_once); ++ ++ daedalus_recipe_dispatch_h264_deblock_luma_v(g_dctx, pix, (size_t)stride, ++ 1, &meta); ++} +diff --git a/libavcodec/aarch64/h264dsp_init_aarch64.c b/libavcodec/aarch64/h264dsp_init_aarch64.c +index 741e551..85ac381 100644 +--- a/libavcodec/aarch64/h264dsp_init_aarch64.c ++++ b/libavcodec/aarch64/h264dsp_init_aarch64.c +@@ -27,6 +27,8 @@ + + void ff_h264_v_loop_filter_luma_neon(uint8_t *pix, ptrdiff_t stride, int alpha, + int beta, int8_t *tc0); ++void ff_h264_v_loop_filter_luma_daedalus(uint8_t *pix, ptrdiff_t stride, ++ int alpha, int beta, int8_t *tc0); + void ff_h264_h_loop_filter_luma_neon(uint8_t *pix, ptrdiff_t stride, int alpha, + int beta, int8_t *tc0); + void ff_h264_v_loop_filter_luma_intra_neon(uint8_t *pix, ptrdiff_t stride, int alpha, +@@ -114,7 +116,7 @@ av_cold void ff_h264dsp_init_aarch64(H264DSPContext *c, const int bit_depth, + int cpu_flags = av_get_cpu_flags(); + + if (have_neon(cpu_flags) && bit_depth == 8) { +- c->v_loop_filter_luma = ff_h264_v_loop_filter_luma_neon; ++ c->v_loop_filter_luma = ff_h264_v_loop_filter_luma_daedalus; + c->h_loop_filter_luma = ff_h264_h_loop_filter_luma_neon; + c->v_loop_filter_luma_intra= ff_h264_v_loop_filter_luma_intra_neon; + c->h_loop_filter_luma_intra= ff_h264_h_loop_filter_luma_intra_neon; +-- +2.47.3 + diff --git a/arch/ffmpeg-v4l2-request-fourier/PKGBUILD b/arch/ffmpeg-v4l2-request-fourier/PKGBUILD index 6670cfadd..c113842cd 100644 --- a/arch/ffmpeg-v4l2-request-fourier/PKGBUILD +++ b/arch/ffmpeg-v4l2-request-fourier/PKGBUILD @@ -24,7 +24,7 @@ _srcname=FFmpeg _version='8.1' _commit='b57fbbe50c9b2656fad86a1a7eeabfd2b2a50935' # v4l2-request-n8.1 tip 2026-04-24 pkgver=8.1.r123329.b57fbbe -pkgrel=7 # pkgrel=7 — H.264 IDCT 8x8 daedalus-fourier substitution (cycle 7, 2026-05-22) +pkgrel=8 # pkgrel=8 — H.264 luma-v deblock daedalus-fourier substitution (cycle 8, 2026-05-22) epoch=2 # daedalus-fourier pin — first 
kernel substitution in libavcodec @@ -91,8 +91,9 @@ source=("git+https://github.com/Kwiboo/FFmpeg.git#commit=${_commit}" '0001-libudev-bypass-fallback.patch' '0002-nv15-to-p010-unpack.patch' '0003-h264-idct4-daedalus-fourier.patch' - '0004-h264-idct8-daedalus-fourier.patch') -sha256sums=('SKIP' 'SKIP' 'SKIP' 'SKIP' 'SKIP' 'SKIP') + '0004-h264-idct8-daedalus-fourier.patch' + '0005-h264-deblock-luma-v-daedalus-fourier.patch') +sha256sums=('SKIP' 'SKIP' 'SKIP' 'SKIP' 'SKIP' 'SKIP' 'SKIP') pkgver() { cd "${_srcname}" @@ -107,6 +108,7 @@ prepare() { patch -Np1 -i "${srcdir}/0002-nv15-to-p010-unpack.patch" patch -Np1 -i "${srcdir}/0003-h264-idct4-daedalus-fourier.patch" patch -Np1 -i "${srcdir}/0004-h264-idct8-daedalus-fourier.patch" + patch -Np1 -i "${srcdir}/0005-h264-deblock-luma-v-daedalus-fourier.patch" } build() { diff --git a/debian/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch b/debian/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch new file mode 100644 index 000000000..5146c3710 --- /dev/null +++ b/debian/ffmpeg-v4l2-request-fourier/0005-h264-deblock-luma-v-daedalus-fourier.patch @@ -0,0 +1,121 @@ +From 68731c41d7ea68be0e912b128cb4e71fb56e8263 Mon Sep 17 00:00:00 2001 +From: Markus Fritsche +Date: Fri, 22 May 2026 12:15:16 +0200 +Subject: [PATCH] avcodec/aarch64/h264dsp: route H.264 luma-v deblock through + daedalus-fourier +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +H264DSPContext.v_loop_filter_luma (non-intra bS<4 vertical luma +deblock, called per macroblock-row edge from the slice deblock +loop) now dispatches through +daedalus_recipe_dispatch_h264_deblock_luma_v instead of +ff_h264_v_loop_filter_luma_neon. 
+ +The recipe layer picks the substrate; for cycle 8 the daedalus +docstring marks the kernel "CPU primary; QPU opportunistic", but +the libavcodec.so context here is built with +daedalus_ctx_create_no_qpu — process-global pthread_once init, +shared with cycles 6/7. QPU opportunism stays gated off until a +follow-up adds an explicit feature flag (no implicit Vulkan init +in arbitrary host processes). In the meantime cycle 8 is a +plumbing-only substitution, NEON-to-NEON via the daedalus recipe. + +Intra (bS=4) loop filter — c->v_loop_filter_luma_intra — stays on +the in-tree NEON .S code; daedalus's daedalus_h264_deblock_meta +only covers the non-intra path per its docstring. + +FFmpeg `int alpha/beta/int8_t tc0[4]` → daedalus_h264_deblock_meta +(int32_t alpha/beta + inline int8_t tc0[4]). pix already points +to row 0 of the bottom block per FFmpeg's deblock convention, +satisfying daedalus's `dst_off >= 4 * dst_stride` constraint. + +Refs reauktion/daedalus-v4l2#11 — substitution arc step 2 cycle 8. +--- + libavcodec/aarch64/h264_idct_daedalus.c | 36 +++++++++++++++++++---- + libavcodec/aarch64/h264dsp_init_aarch64.c | 4 ++- + 2 files changed, 33 insertions(+), 7 deletions(-) + +diff --git a/libavcodec/aarch64/h264_idct_daedalus.c b/libavcodec/aarch64/h264_idct_daedalus.c +index cbb98af..92365fa 100644 +--- a/libavcodec/aarch64/h264_idct_daedalus.c ++++ b/libavcodec/aarch64/h264_idct_daedalus.c +@@ -1,11 +1,14 @@ + /* +- * H.264 4x4 / 8x8 IDCT + add — daedalus-fourier substitution shims. ++ * H.264 4x4 / 8x8 IDCT + luma-v deblock — daedalus-fourier substitution shims. + * +- * Routes H264DSPContext.idct_add → daedalus_recipe_dispatch_h264_idct4 +- * H264DSPContext.idct8_add → daedalus_recipe_dispatch_h264_idct8 +- * instead of the in-tree ff_h264_idct{,8}_add_neon assembly. The +- * recipe layer picks the substrate (CPU NEON by default for cycles +- * 6 + 7; future cycles may dispatch to V3D opportunistically). 
++ * Routes H264DSPContext.idct_add → daedalus_recipe_dispatch_h264_idct4 ++ * H264DSPContext.idct8_add → daedalus_recipe_dispatch_h264_idct8 ++ * H264DSPContext.v_loop_filter_luma → daedalus_recipe_dispatch_h264_deblock_luma_v ++ * instead of the in-tree ff_h264_*_neon assembly. The recipe layer ++ * picks the substrate (CPU NEON for cycles 6 + 7 by default; cycle 8 ++ * is CPU primary with QPU opportunistic — the ctx below is no-QPU, ++ * so cycle 8 stays on the CPU NEON path until a separate change ++ * gates QPU init on a daedalus-fourier feature flag). + * + * FFmpeg's 4x4 and 8x8 block memory layouts match daedalus's + * column-major convention: block[r + N*c] = coefficient at +@@ -40,6 +43,8 @@ static void daedalus_ctx_init_once(void) + + void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride); + void ff_h264_idct8_add_daedalus(uint8_t *dst, int16_t *block, int stride); ++void ff_h264_v_loop_filter_luma_daedalus(uint8_t *pix, ptrdiff_t stride, ++ int alpha, int beta, int8_t *tc0); + + void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride) + { +@@ -60,3 +65,22 @@ void ff_h264_idct8_add_daedalus(uint8_t *dst, int16_t *block, int stride) + daedalus_recipe_dispatch_h264_idct8(g_dctx, dst, (size_t)stride, + block, 1, &meta); + } ++ ++void ff_h264_v_loop_filter_luma_daedalus(uint8_t *pix, ptrdiff_t stride, ++ int alpha, int beta, int8_t *tc0) ++{ ++ daedalus_h264_deblock_meta meta = { ++ .dst_off = 0, ++ .alpha = alpha, ++ .beta = beta, ++ }; ++ meta.tc0[0] = tc0[0]; ++ meta.tc0[1] = tc0[1]; ++ meta.tc0[2] = tc0[2]; ++ meta.tc0[3] = tc0[3]; ++ ++ pthread_once(&g_dctx_once, daedalus_ctx_init_once); ++ ++ daedalus_recipe_dispatch_h264_deblock_luma_v(g_dctx, pix, (size_t)stride, ++ 1, &meta); ++} +diff --git a/libavcodec/aarch64/h264dsp_init_aarch64.c b/libavcodec/aarch64/h264dsp_init_aarch64.c +index 741e551..85ac381 100644 +--- a/libavcodec/aarch64/h264dsp_init_aarch64.c ++++ b/libavcodec/aarch64/h264dsp_init_aarch64.c +@@ -27,6 
+27,8 @@ + + void ff_h264_v_loop_filter_luma_neon(uint8_t *pix, ptrdiff_t stride, int alpha, + int beta, int8_t *tc0); ++void ff_h264_v_loop_filter_luma_daedalus(uint8_t *pix, ptrdiff_t stride, ++ int alpha, int beta, int8_t *tc0); + void ff_h264_h_loop_filter_luma_neon(uint8_t *pix, ptrdiff_t stride, int alpha, + int beta, int8_t *tc0); + void ff_h264_v_loop_filter_luma_intra_neon(uint8_t *pix, ptrdiff_t stride, int alpha, +@@ -114,7 +116,7 @@ av_cold void ff_h264dsp_init_aarch64(H264DSPContext *c, const int bit_depth, + int cpu_flags = av_get_cpu_flags(); + + if (have_neon(cpu_flags) && bit_depth == 8) { +- c->v_loop_filter_luma = ff_h264_v_loop_filter_luma_neon; ++ c->v_loop_filter_luma = ff_h264_v_loop_filter_luma_daedalus; + c->h_loop_filter_luma = ff_h264_h_loop_filter_luma_neon; + c->v_loop_filter_luma_intra= ff_h264_v_loop_filter_luma_intra_neon; + c->h_loop_filter_luma_intra= ff_h264_h_loop_filter_luma_intra_neon; +-- +2.47.3 + diff --git a/debian/ffmpeg-v4l2-request-fourier/build-deb.sh b/debian/ffmpeg-v4l2-request-fourier/build-deb.sh index 398b710ba..c9add2eba 100755 --- a/debian/ffmpeg-v4l2-request-fourier/build-deb.sh +++ b/debian/ffmpeg-v4l2-request-fourier/build-deb.sh @@ -33,11 +33,13 @@ FFMPEG_VERSION=8.1 # epoch 2 matches Debian's stock ffmpeg (currently 7:7.1.x in trixie); # +rfourier suffix to avoid colliding with upstream/Debian rebuilds. PKGVER=2:${FFMPEG_VERSION}+rfourier+gb57fbbe -PKGREL=7 # pkgrel=7 — H.264 IDCT 8x8 daedalus-fourier substitution - # (cycle 7). Stacks on top of cycle-6 IDCT 4x4 (PR #76) and - # the libxml2-drop ABI-skew workaround (PR #78). Wires - # H264DSPContext.idct8_add through - # daedalus_recipe_dispatch_h264_idct8. (2026-05-22) +PKGREL=8 # pkgrel=8 — H.264 luma-v deblock daedalus-fourier substitution + # (cycle 8, non-intra bS<4 vertical luma). Stacks on cycles + # 6/7 (IDCT 4x4 + 8x8). Wires H264DSPContext.v_loop_filter_luma + # through daedalus_recipe_dispatch_h264_deblock_luma_v. 
+ # ctx stays no-QPU until a separate change gates Vulkan init + # on a feature flag; cycle-8 dispatch is NEON-by-recipe for + # now. (2026-05-22) # daedalus-fourier pin — first kernel substitution in libavcodec (cycle 6 # H.264 IDCT 4x4). Same SHA as the daedalus-v4l2 daemon already ships @@ -68,6 +70,7 @@ patch -Np1 -i "$HERE/0001-libudev-bypass-fallback.patch" patch -Np1 -i "$HERE/0002-nv15-to-p010-unpack.patch" patch -Np1 -i "$HERE/0003-h264-idct4-daedalus-fourier.patch" patch -Np1 -i "$HERE/0004-h264-idct8-daedalus-fourier.patch" +patch -Np1 -i "$HERE/0005-h264-deblock-luma-v-daedalus-fourier.patch" # --- daedalus-fourier: fetch + build static .a with PIC, install to a # per-build prefix; libavcodec.so links it into the shared object so diff --git a/debian/ffmpeg-v4l2-request-fourier/debian/changelog b/debian/ffmpeg-v4l2-request-fourier/debian/changelog index 977a692c8..c9bd89376 100644 --- a/debian/ffmpeg-v4l2-request-fourier/debian/changelog +++ b/debian/ffmpeg-v4l2-request-fourier/debian/changelog @@ -1,3 +1,28 @@ +ffmpeg-v4l2-request-fourier (2:8.1+rfourier+gb57fbbe-8) bookworm trixie; urgency=medium + + * Add 0005-h264-deblock-luma-v-daedalus-fourier.patch — + H264DSPContext.v_loop_filter_luma (non-intra bS<4 vertical luma + deblock, called per macroblock-row edge from the slice deblock + loop in libavcodec/h264_loopfilter.c) now dispatches through + daedalus_recipe_dispatch_h264_deblock_luma_v instead of + ff_h264_v_loop_filter_luma_neon. Cycle 8 of the daedalus-v4l2#11 + step 2 substitution arc. + * Cycle 8 is marked "CPU primary; QPU opportunistic" in + daedalus-fourier, but the libavcodec.so context here uses + daedalus_ctx_create_no_qpu (process-global pthread_once, + shared with cycles 6/7). Opportunistic QPU is deferred to a + separate change that gates Vulkan init on a feature flag, to + avoid implicit Vulkan init in arbitrary host processes. For + now cycle 8 is plumbing-only — NEON-by-recipe. 
+ * Intra (bS=4) loop filter c->v_loop_filter_luma_intra stays on + the in-tree NEON .S code; daedalus's daedalus_h264_deblock_meta + only covers the non-intra path per its API docstring. + * Bit-exact against ff_h264_v_loop_filter_luma_neon (daedalus-fourier + cycle 8 green). + * No SONAME change, no Depends change. + + -- Markus Fritsche Fri, 22 May 2026 12:30:00 +0000 + ffmpeg-v4l2-request-fourier (2:8.1+rfourier+gb57fbbe-7) bookworm trixie; urgency=medium * Add 0004-h264-idct8-daedalus-fourier.patch — H264DSPContext.idct8_add