50 Commits

Author SHA1 Message Date
marfrit 9bf97fdb49 ffmpeg-v4l2-request-fourier: PKGREL 3 → 5 (force rebuild past orphan -4 .deb)
PR #76 (H.264 IDCT 4×4 daedalus-fourier substitution) was merged but
the resulting .deb was not actually built: an orphan
ffmpeg-v4l2-request-fourier_8.1+rfourier+gb57fbbe-4_arm64.deb (dated
2026-05-19, no matching source commit in main) sat in the apt pool.
.gitea/scripts/check-already-published.sh's debian branch compares
`dpkg --compare-versions $pool_ver ge $source_full` — pool -4
≥ source -3, so CI's skip-check emitted skip=1 and short-circuited
the build.  The ffmpeg-v4l2-request-debian Action reported success
without actually publishing.

Bump source PKGREL past -4 so the next CI run sees source >= pool
and proceeds to build + publish.

No source code change beyond PKGREL + changelog.  Arch side
unaffected (its skip-check is exact-URL-match, not pool-head-ge).
2026-05-21 22:17:00 +02:00
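The skip decision described in the commit above can be illustrated with a minimal shell sketch. This is not the real check-already-published.sh: `skip_check` is a hypothetical helper that only covers the case where pool and source share the same upstream version and differ in a numeric Debian revision, whereas the actual script delegates to `dpkg --compare-versions`.

```shell
#!/bin/sh
# Hypothetical helper, NOT the real check-already-published.sh.
# Assumes both versions look like <upstream>-<numeric revision>.
skip_check() {
    pool_ver=$1
    source_ver=$2
    # Different upstream part: out of scope for this sketch, always build.
    if [ "${pool_ver%-*}" != "${source_ver%-*}" ]; then
        echo "skip=0"
        return 0
    fi
    # Same upstream: skip when the pool revision is >= the source revision.
    if [ "${pool_ver##*-}" -ge "${source_ver##*-}" ]; then
        echo "skip=1"
    else
        echo "skip=0"
    fi
}

skip_check "8.1+rfourier+gb57fbbe-4" "8.1+rfourier+gb57fbbe-3"   # orphan -4 vs source -3
skip_check "8.1+rfourier+gb57fbbe-4" "8.1+rfourier+gb57fbbe-5"   # source after this bump
```

The first call prints skip=1 (the build is short-circuited by the orphan), the second prints skip=0 (the build proceeds), which is why bumping PKGREL past -4 unblocks CI.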
marfrit a536e20218 Merge pull request 'ffmpeg-v4l2-request-fourier: substitute H.264 IDCT 4×4 → daedalus-fourier' (#76) from claude-noether/marfrit-packages:noether/ffmpeg-fourier-idct4-daedalus into main
Reviewed-on: marfrit/marfrit-packages#76
2026-05-21 19:57:31 +00:00
marfrit a1dba5f630 Merge remote-tracking branch 'origin/main' into noether/ffmpeg-fourier-idct4-daedalus 2026-05-21 21:56:41 +02:00
marfrit 88a65cb6d0 CI: add cmake / ninja-build / libvulkan-dev / glslang-tools to ffmpeg-debian deps
The ffmpeg-v4l2-request-debian job now needs to build daedalus-fourier
into a temp prefix before configuring FFmpeg (substitution patch
0003-h264-idct4-daedalus-fourier.patch links libdaedalus_core.a into
libavcodec.so).  Mirror the build-deps the daedalus-v4l2-debian job
already declared for the same reason.

No-op on Arch — makepkg --syncdeps auto-installs cmake/ninja/
vulkan-headers from the PKGBUILD makedepends.
2026-05-21 21:47:48 +02:00
marfrit e641d679d3 ffmpeg-v4l2-request-fourier: substitute H.264 IDCT 4×4 → daedalus-fourier
First cycle of the libavcodec.so substitution arc (reauktion/daedalus-v4l2#11
step 2).  H264DSPContext.idct_add — called per 4×4 block from the
intra-4×4 decode path in libavcodec/h264_mb.c — now dispatches through
daedalus_recipe_dispatch_h264_idct4 instead of ff_h264_idct_add_neon.

## What

- Add 0003-h264-idct4-daedalus-fourier.patch (in both arch/ and
  debian/ ffmpeg-v4l2-request-fourier/).  Creates
  libavcodec/aarch64/h264_idct_daedalus.c (ff_h264_idct_add_daedalus
  shim + lazy pthread_once context init via
  daedalus_ctx_create_no_qpu), patches
  libavcodec/aarch64/h264dsp_init_aarch64.c to wire c->idct_add to
  the shim, adds the new .o to libavcodec/aarch64/Makefile.
- arch/PKGBUILD + debian/build-deb.sh: fetch + build
  daedalus-fourier (pinned at d87239d — lockstep with the
  daedalus-v4l2 daemon's inline build) with
  -DCMAKE_POSITION_INDEPENDENT_CODE=ON into a per-build temp prefix,
  then pass --extra-cflags=-I.../include --extra-ldflags=-L.../lib
  --extra-libs="-ldaedalus_core -lvulkan -lpthread" to FFmpeg
  configure.  daedalus_core.a is static-linked into libavcodec.so.62.
- debian/control Depends gains libvulkan1 (daedalus_core PUBLIC-links
  Vulkan::Vulkan for the queryable QPU substrate; the no-QPU
  constructor still works at runtime but the loader needs
  libvulkan.so.1 present to dlopen libavcodec.so.62).
- arch/PKGBUILD depends gains vulkan-icd-loader, makedepends gains
  cmake / ninja / vulkan-headers.

## Why

The recipe layer picks the substrate; for cycle 6 (H.264 IDCT 4×4)
the recipe is CPU NEON, so this is effectively a NEON-to-NEON
substitution with one extra dispatch call and recipe-table lookup.
The point of this first cycle isn't perf wins — it's plumbing.  Once
the path is wired and stable, follow-up patches batch through the
bulk paths (idct_add16 / idct_add16intra / idct_add8) and stack
cycles 7/8/9 (IDCT 8×8, luma-v deblock, qpel mc20).

Bit-exact against ff_h264_idct_add_neon (daedalus-fourier cycle 6
green; FFmpeg's 4×4 block storage matches daedalus's column-major
convention).

## Scope NOT covered

- Bulk paths (idct_add16 / idct_add16intra / idct_add8) — most IDCT
  4×4 calls in real H.264 streams go through these, not the per-
  block c->idct_add path; intra-4×4-only macroblocks are a minority.
  Batched substitution lands in a follow-up.
- High-bit-depth (10-bit) path — not touched; 8-bit only.
- Cycles 7/8/9 — separate PRs.

## SONAME

Unchanged.  libavcodec.so.62 / libavformat.so.62 / libavutil.so.60.
No daedalus-v4l2-dkms or daedalus-v4l2 bump required.

## Refs

- reauktion/daedalus-v4l2 issue #11 (substitution arc): reauktion/daedalus-v4l2#11
- marfrit/daedalus-fourier cycle 6 close (H.264 IDCT 4×4 NEON green)
2026-05-21 21:44:35 +02:00
marfrit 877238bd1b Merge pull request 'daedalus-v4l2: 3bc0da1 -> 6e6dfa1 — dlopen Kwiboo soname 62 + CI build-deps swap' (#75) from claude-noether/marfrit-packages:noether/daedalus-bump-6e6dfa1-soname62 into main
Reviewed-on: marfrit/marfrit-packages#75
2026-05-21 19:26:39 +00:00
claude-noether 27617e4cb0 daedalus-v4l2: 3bc0da1 -> 6e6dfa1 — dlopen Kwiboo soname 62 (#16) 2026-05-21 21:24:03 +02:00
marfrit a2daab1b28 Merge pull request 'daedalus-v4l2: 77e14e5 -> 3bc0da1 — decode_us + periodic stats' (#74) from claude-noether/marfrit-packages:noether/daedalus-bump-3bc0da1 into main
Reviewed-on: marfrit/marfrit-packages#74
2026-05-21 18:50:46 +00:00
claude-noether 9146e83710 daedalus-v4l2: 77e14e5 -> 3bc0da1 — decode_us + periodic stats (#15) 2026-05-21 20:29:07 +02:00
marfrit abf8fb3077 Merge pull request 'ci: add libvulkan-dev + glslang-tools for daedalus-fourier build dep' (#73) from claude-noether/marfrit-packages:noether/ci-fourier-build-deps into main
Reviewed-on: marfrit/marfrit-packages#73
2026-05-21 18:05:59 +00:00
claude-noether 1414dfeac2 .gitea/workflows: add libvulkan-dev + glslang-tools to daedalus-v4l2 Debian build deps
The daedalus-v4l2 build-deb.sh (post marfrit-packages#72) now fetches
+ cmake-builds daedalus-fourier into a per-build temp prefix before
building the daemon, so the static-archive can be linked in.
daedalus-fourier's CMakeLists requires Vulkan headers and glslangValidator
(for SPIR-V compilation of the .comp compute shaders).  Without them
the configure step on the debian-aarch64 runner fails with:

  CMake Error at FindPackageHandleStandardArgs.cmake:233 (message):
    Could NOT find Vulkan (missing: Vulkan_LIBRARY Vulkan_INCLUDE_DIR)

(Observed on Gitea Actions run 1056.)

Add `libvulkan-dev` and `glslang-tools` to the apt-get install line so
the in-build daedalus-fourier compile succeeds and the daemon can link.
2026-05-21 19:58:19 +02:00
marfrit 41c1e0b6b9 Merge pull request 'daedalus-v4l2: 5d8b436 -> 77e14e5 — #12 (LOW_DELAY) + #13 (daedalus-fourier linkage)' (#72) from claude-noether/marfrit-packages:noether/daedalus-bump-77e14e5-with-fourier into main
Reviewed-on: marfrit/marfrit-packages#72
2026-05-21 17:15:12 +00:00
claude-noether c9a4b82f2c daedalus-v4l2: 5d8b436 -> 77e14e5 — picks up #12 (LOW_DELAY) + #13 (daedalus-fourier linkage)
Daemon-only bump (no daedalus-v4l2-dkms change needed; PROTO_VERSION
stays at 0).

#12 (LOW_DELAY half-measure): daemon sets AV_CODEC_FLAG_LOW_DELAY on
the H.264 AVCodecContext so libavcodec emits frames in decode order
~99% of the time (a few stragglers at GOP boundaries when the
stream's SPS num_reorder_frames overrides the flag).  Visible
improvement vs the 2-1-4-3 pair-swap on Firefox + mpv playback;
not the permanent fix — see daedalus-v4l2#11 for the architectural
plan to substitute daedalus-fourier kernels for libavcodec's
pixel math one cycle at a time.

#13 (daedalus-fourier linkage): daemon now pkg-config-links against
the daedalus-fourier kernel library (marfrit/daedalus-fourier) and
logs substrate availability at startup.  No kernels dispatched yet
— this is the build-time foundation for the substitution work.

build-deb.sh updated to fetch + build + install daedalus-fourier
(pinned at d87239d, marfrit/daedalus-fourier PR #1) into a per-
build temp prefix before invoking the daemon's cmake, exposing it
via PKG_CONFIG_PATH.  Static-linked, so the resulting .deb has no
new runtime deps.  Requires libvulkan-dev + glslang-tools on the
CI runner.

Arch PKGBUILD bumped to the same upstream commit but Arch packaging
for daedalus-fourier itself is a follow-up; until that lands the
Arch build expects daedalus-fourier installed by the user (AUR-style).
Debian-side is end-to-end self-contained via build-deb.sh.

Refs:
  * reauktion/daedalus-v4l2#12
  * reauktion/daedalus-v4l2#13
  * reauktion/daedalus-v4l2#11
  * marfrit/daedalus-fourier#1
2026-05-21 18:39:22 +02:00
marfrit 736b6da176 Merge pull request 'daedalus-v4l2{,-dkms}: 79256dc/6ffe92b -> 5d8b436 — revert parking design' (#71) from claude-noether/marfrit-packages:noether/daedalus-revert-bump-5d8b436 into main
Reviewed-on: marfrit/marfrit-packages#71
2026-05-21 14:54:18 +00:00
claude-noether 34972ae9c1 daedalus-v4l2{,-dkms}: 79256dc/6ffe92b -> 5d8b436 — revert parking design
Lock-step downgrade of both packages to the revert tip of
daedalus-v4l2 (PR #10 closed PRs #7 + #8).  After
0.1.0+r28+g79256dc-1 / 0.1.0+r30+g6ffe92b-1 landed in production,
mpv (--hwdec=vaapi-copy) failed pre-playing with "Unable to dequeue
buffer: Resource temporarily unavailable" because the daemon
parked CAPTURE buffers waiting for libavcodec's display-order
reorder, violating libva's V4L2 stateless 1:1 contract.  See
daedalus-v4l2#9 for the diagnostic, #10 for the revert PR.

DAEDALUS_PROTO_VERSION drops 1 → 0; install both .debs in the same
apt transaction.  Userspace ABI returns to the f0d4186-equivalent
behaviour, plus PR #4 (cosmetic H.264 menu controls).  The
daedalus-v4l2-dkms #64 multi-kernel postinst behaviour stays in
build-deb.sh.

Visible regression: H.264 B-frame streams in Firefox return to the
"2 1 4 3 6 5" pair-swap visual.  Proper fix (concurrent in-flight
requests in daemon + display-order reorder moved into libva-v4l2-
request-fourier) tracked at daedalus-v4l2#11.

Refs:
  * reauktion/daedalus-v4l2#9
  * reauktion/daedalus-v4l2#10  (merged)
  * reauktion/daedalus-v4l2#11
2026-05-21 15:42:03 +02:00
marfrit a9f1b833b9 Merge pull request 'mesa-panvk-bifrost: r3 -> r4 — iter17 XFB primitive decomposition' (#70) from claude-noether/marfrit-packages:noether/mesa-panvk-bifrost-r4-iter17-xfb-decomp into main
Reviewed-on: marfrit/marfrit-packages#70
2026-05-21 12:18:23 +00:00
marfrit 83e8eca56d mesa-panvk-bifrost: r3 -> r4 — iter17 XFB primitive decomposition
iter17 closes the 162 winding_* CTS failures from iter15's baseline by
replacing the upstream pan_nir_lower_xfb call with a panvk-specific NIR
pass (panvk_per_arch(nir_lower_xfb)) that handles per-primitive
decomposition for non-LIST topologies (LINE_STRIP, TRIANGLE_STRIP,
TRIANGLE_FAN, and the four _WITH_ADJACENCY variants).

Topology + per-instance output vertex count are threaded as new sysvals
(vs.xfb_topology + vs.xfb_output_count) so the NIR pass can dispatch
per-topology at runtime without compiling 7+ shader variants.

dEQP-VK.transform_feedback.simple.* result (133596 cases total):
                  iter15 baseline  ->  iter17
  Pass:             796               958   (+162)
  Fail:             243               81    (-162; resume_* by-design only)
  NotSupported:     132551            132551
  Fatal-skip:       6                 6
  Pass rate of runnable: 76.2% -> 91.7% (+15.5pp)

100% of the iter15 winding-fail cluster closed. The remaining 81 fails
are all resume_* (pause/resume XFB, by design — we advertise
transformFeedbackDraw=false).

Second-model review (janet) produced 3 findings; Findings 1+2 were
already fixed in the in-tree applied state (stale applied_state/ snapshot
read by reviewer), Finding 3 (degenerate N underflow on N<2) addressed
by gating non-LIST emission on `output_count > 0` predicate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 14:07:00 +02:00
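The pass-rate figures in the table above can be re-derived with a small shell sketch. It assumes that "runnable" means Pass + Fail + Fatal-skip (NotSupported excluded); that definition is inferred from the numbers, not stated in the commit, and `pass_rate` is a hypothetical helper.

```shell
#!/bin/sh
# pass_rate PASS FAIL FATAL_SKIP -> percentage with one decimal place.
# Assumption: runnable = Pass + Fail + Fatal-skip (NotSupported excluded).
pass_rate() {
    awk -v p="$1" -v f="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 100 * p / (p + f + s) }'
}

pass_rate 796 243 6   # iter15 baseline
pass_rate 958 81 6    # iter17
```

Under that assumption the two calls print 76.2 and 91.7, matching the percentages quoted in the commit message.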
marfrit 1c8c186681 Merge pull request 'daedalus-v4l2-dkms: 79256dc -> 6ffe92b — fix kernel panic regression from #67' (#69) from claude-noether/marfrit-packages:noether/daedalus-dkms-bump-6ffe92b into main
Reviewed-on: marfrit/marfrit-packages#69
2026-05-21 12:00:15 +00:00
claude-noether a0be2dcc9f daedalus-v4l2-dkms: 79256dc -> 6ffe92b — fix kernel panic from #7
Kernel-only bump.  Fixes the hard-reboot regression introduced by
the daedalus-v4l2#7 split-completion design and observed on higgs
(Pi CM5) during the first mpv vaapi-copy playback of 720p H.264:
device_run now removes src + dst from m2m_ctx's rdy_queue at the
moment it picks them up, not at buf_done time.  Without this, a
parked dst_buf (waiting for libavcodec's display-order release)
stayed in the rdy_queue and got re-picked by the next device_run
after SRC_CONSUMED's job_finish released the scheduler — two
inflight entries on the same vb2_buffer, later HAS_PIXELS calls
list_del on an already-detached list_head, panic.

DAEDALUS_PROTO_VERSION stays at 1 — daemon (userspace
daedalus-v4l2) need NOT bump in lockstep with this DKMS update.
The existing daedalus-v4l2 0.1.0+r28+g79256dc is wire-compatible
with daedalus-v4l2-dkms 0.1.0+r30+g6ffe92b.

Refs:
  * reauktion/daedalus-v4l2#8
2026-05-21 13:56:42 +02:00
marfrit eb89f12c3e Merge pull request 'libva-v4l2-request-fourier: bump pin to c454618 (#15 transparent resize)' (#68) from claude-noether/marfrit-packages:bump-libva-fourier-c454618-issue-15 into main
Reviewed-on: marfrit/marfrit-packages#68
2026-05-21 11:25:39 +00:00
marfrit ce2fff1a4f libva-v4l2-request-fourier: bump pin to c454618 (#15 transparent resize)
Bumps both Arch PKGBUILD and Debian build-deb.sh pins to PR #16 —
codec_store_buffer + request_pool_resize transparent OUTPUT-pool grow
on a mid-session resolution upshift overrun.  Picks up the frame-
survival path that supersedes #13's drop-and-recreate fallback.

Dual-pin per feedback_marfrit_packages_dual_pin so both Arch and
Debian repos see check-already-published.sh report a new version.
2026-05-21 13:24:21 +02:00
marfrit 9301894997 Merge pull request 'daedalus-v4l2{,-dkms}: f0d4186 -> 79256dc — H.264 B-frame reorder fix + menu ctrls' (#67) from claude-noether/marfrit-packages:noether/daedalus-bump-79256dc into main
Reviewed-on: marfrit/marfrit-packages#67
2026-05-21 10:51:14 +00:00
claude-noether f21c1ff80a daedalus-v4l2{,-dkms}: f0d4186 -> 79256dc — H.264 B-frame reorder + menu ctrls
Lock-step bump of both packages to daedalus-v4l2#7 + #4.  PROTO_VERSION
bumps 0 → 1 at the daemon ↔ kernel chardev wire: REQ_DECODE adds
__u64 src_pts (the OUTPUT vb2 timestamp); RESP_FRAME adds __u32 flags
(HAS_PIXELS / SRC_CONSUMED) + __u64 output_src_pts (= frame->pts on
drain).  Both .debs must be installed atomically or the chardev
handshake rejects the version mismatch.

  * daedalus-v4l2: daemon's send_packet → receive_frame loop now
    stamps pkt->pts = req->src_pts and looks up the cookie for each
    drained frame via frame->pts.  chardev_client emits multiple
    RESP_FRAME messages per REQ_DECODE when libavcodec's display-
    order DPB releases an earlier frame on receipt of a later
    bitstream — fixes the "2 1 4 3 6 5" pair-swap on H.264 streams
    with B-frames.

  * daedalus-v4l2-dkms: kernel device_run mirrors src_buf timestamp
    into REQ_DECODE.src_pts.  Completion path splits HAS_PIXELS /
    SRC_CONSUMED: src is released as soon as send_packet succeeds
    (so the m2m scheduler moves on), dst stays parked until the
    matching frame is drained later.  TIMESTAMP_COPY's auto src→dst
    pairing no longer applies once lifecycles decouple — dst is
    stamped explicitly from inflight->src_pts at HAS_PIXELS time.

  * daedalus-v4l2-dkms also carries forward the -2 multi-kernel
    postinst fix (#64) from the prior PKGREL.  PKGREL resets to 1 on
    the new upstream pin.

The daedalus-v4l2#4 H.264 DECODE_MODE + START_CODE menu controls (a
cosmetic warning fix that landed alongside #7) are also picked up:
"Unable to set control(s) error_idx=2/2" no longer fires.

Refs:
  * reauktion/daedalus-v4l2#7
  * reauktion/daedalus-v4l2#4
  * reauktion/daedalus-v4l2#6
2026-05-21 12:41:12 +02:00
marfrit e15b887d8d Merge pull request 'libva-v4l2-request-fourier: bump pin to 2860d75 (#13 bounds-check fix)' (#66) from claude-noether/marfrit-packages:bump-libva-fourier-2860d75-issue-13 into main
Reviewed-on: marfrit/marfrit-packages#66
2026-05-21 10:38:03 +00:00
marfrit b69db65037 libva-v4l2-request-fourier: bump pin to 2860d75 (#13 bounds-check fix)
Bumps both the Arch PKGBUILD and the Debian build-deb.sh pins to PR
#14 merge — codec_store_buffer bounds-checks for VASliceDataBufferType.
Picks up the SIGSEGV fix for mpv --hwdec=vaapi-copy on resolution
upshift mid-stream (issue #13).

Dual-pin so check-already-published.sh detects both pool ABIs as
needing a fresh build.
2026-05-21 12:19:04 +02:00
marfrit adcc824bf7 Merge pull request 'daedalus-v4l2-dkms: postinst — autoinstall for all installed kernels (#64)' (#65) from claude-noether/marfrit-packages:fix/daedalus-dkms-multi-kernel-64 into main
Reviewed-on: marfrit/marfrit-packages#65
2026-05-21 09:28:47 +00:00
claude-noether 7213b23861 daedalus-v4l2-dkms: postinst — autoinstall for all installed kernels (#64)
Previously dkms autoinstall ran only against $(uname -r), so installing
the package on kernel A and rebooting into a separately installed kernel B
left /lib/modules/B/updates/dkms/ empty.  With /dev/daedalus-v4l2 absent,
the daedalus daemon had nothing to talk to, and browser/VAAPI silently
fell back to software decoding with no obvious diagnostic for the user.

Now we enumerate every /lib/modules/*/build that resolves to a real
directory (i.e. headers are actually installed for that kernel) and run
'dkms autoinstall -k <kver>' for each.  Per-kernel verify; aggregated
warning only for the kernels that didn't build.

Tested locally: enumeration filters dangling /build symlinks correctly
(2 kernels installed, 1 has headers → only that one is built against).

Bumps PKGREL 1 → 2.  Closes #64.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 11:07:35 +02:00
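The enumeration described in the commit above might look roughly like the following sketch. It is not the packaged postinst: `kernels_with_headers` is a hypothetical helper, and the loop only prints the `dkms autoinstall -k <kver>` commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper, not the actual postinst script.
# Lists kernel versions under a modules root whose build entry resolves to a
# real directory. Dangling build symlinks (headers not installed) are skipped.
kernels_with_headers() {
    modules_root=${1:-/lib/modules}
    for build in "$modules_root"/*/build; do
        [ -d "$build" ] || continue      # -d follows symlinks
        kver=${build%/build}
        echo "${kver##*/}"
    done
}

# Dry run: print the per-kernel command instead of executing it.
for kver in $(kernels_with_headers /lib/modules); do
    echo "dkms autoinstall -k $kver"
done
```

Because `[ -d ]` follows symlinks, a kernel whose build link points at missing headers is filtered out, which corresponds to the "2 kernels installed, 1 has headers" case mentioned in the local test.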
marfrit 2cd3acd680 Merge pull request 'firefox-fourier 0003: proper V4L2REQUEST type acceptance patch (closes #60)' (#63) from firefox-0003-v4l2request-proper-2026-05-21 into main
Reviewed-on: marfrit/marfrit-packages#63
2026-05-21 05:10:27 +00:00
marfrit 22ac3c9845 firefox-fourier 0003: V4L2REQUEST type acceptance (proper patch, regenerated from real source)
Closes #60.

Resolves the malformed-patch issue from #61 (since reverted in #62)
by regenerating the 0003 patch via actual application against firefox
150.0.3 Pi-OS source.

Functional change vs prior 0003: walking hw_configs accepts
AV_HWDEVICE_TYPE_DRM (legacy) OR integer device_type values 13/14
(AV_HWDEVICE_TYPE_V4L2REQUEST in Kwiboo's no-AMF / upstream-AMF trees).
CreateV4L2RequestDeviceContext passes integer 13 (Kwiboo's value) cast
to enum AVHWDeviceType for the av_hwdevice_ctx_create call.

Tested: applied cleanly via patch -p1 against firefox-150.0.3 source
post-Pi-OS-quilt-patches. Test build follow-up in firefox-rpios EC2
script (drops the in-source sed hack from v7-v8).
2026-05-21 06:59:20 +02:00
marfrit 3275d06728 Merge pull request 'Revert #61: malformed firefox-fourier 0003 patch' (#62) from revert-pr-61-malformed-patch into main
Reviewed-on: marfrit/marfrit-packages#62
2026-05-21 04:33:35 +00:00
marfrit 33b91cf7dc Revert "Merge pull request 'firefox-fourier patch #3: accept AV_HWDEVICE_TYPE_V4L2REQUEST too' (#61) from fix/firefox-v4l2request-type-accept-2026-05-21 into main"
This reverts commit a640633ea7, reversing
changes made to de3c2c6744.
2026-05-21 06:32:39 +02:00
marfrit a640633ea7 Merge pull request 'firefox-fourier patch #3: accept AV_HWDEVICE_TYPE_V4L2REQUEST too' (#61) from fix/firefox-v4l2request-type-accept-2026-05-21 into main
Reviewed-on: marfrit/marfrit-packages#61
2026-05-21 04:18:28 +00:00
marfrit 5f21a71770 firefox-fourier patch #3: accept AV_HWDEVICE_TYPE_V4L2REQUEST too
Closes part of #60 (firefox-side patch update for fourier2 ffmpeg).

Background: libavcodec61-fourier2 (Kwiboo v4l2-request-n7.1.3 backed)
registers its hwaccels with AV_HWDEVICE_TYPE_V4L2REQUEST (the dedicated
enum added in FFmpeg 7.1+), not AV_HWDEVICE_TYPE_DRM as fourier1 did.
The firefox-fourier patch #3 walked hw_configs looking only for DRM
and fell through to software for every codec.

Patch updates:
- CreateV4L2RequestDeviceContext now takes an int aDeviceType (Mozilla's
  bundled libavutil headers may lack the V4L2REQUEST enumerator), passed
  through to av_hwdevice_ctx_create.
- hw_configs walk accepts DRM (legacy) OR V4L2REQUEST integer value
  (13 on Kwiboo's no-AMF tree, 14 on upstream-AMF tree).
- Renamed mDRMDeviceContext to mV4L2RequestDeviceContext for accuracy.

Build pkgrel will be bumped at debian-package level to +fourier2.
2026-05-21 00:09:54 +02:00
marfrit de3c2c6744 Merge pull request 'daedalus-v4l2{,-dkms}: 462aa4b -> f0d4186 — per-ctx vb2 lock' (#58) from claude-noether/marfrit-packages:noether/daedalus-bump-f0d4186 into main
Reviewed-on: marfrit/marfrit-packages#58
2026-05-20 19:27:39 +00:00
marfrit e7e79e5a76 daedalus-v4l2{,-dkms}: 462aa4b -> f0d4186 — per-ctx vb2 lock
Upstream PR #3 — kernel per-context vb2_queue lock so concurrent
clients of /dev/video0 don't serialise on a device-wide mutex.
Pi 5 Firefox VAAPI playback (RDD + content + GPU processes each
opening the device) now works without S_FMT EBUSY collisions.

Verified on higgs: YouTube playback engages daedalus at sustained
~230 fps decode through the libavcodec dlopen path, ~7× headroom
over the 30fps@1080p Pi 5 Fourier target.

Both packages: pkgver 0.1.0.r24.f0d4186, pkgrel reset to 1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 21:26:16 +02:00
marfrit 130a259c69 Merge pull request 'libva-v4l2-request-fourier: c1bb444 -> 77f9236 (PR #12 / issue #11 libva side)' (#57) from claude-noether/marfrit-packages:noether/libva-bump-77f9236 into main
Reviewed-on: marfrit/marfrit-packages#57
2026-05-20 19:18:38 +00:00
claude-noether 9580f33cb6 libva-v4l2-request-fourier: c1bb444 -> 77f9236 (PR #12 / issue #11 libva side)
Bumps both Arch (PKGBUILD) and Debian (build-deb.sh) sides in one commit
this time — following the dual-pin lesson from PR #53.

77f9236 = libva PR #12 merge: src/av1.{c,h} implements av1_set_controls
mapping VAPictureParameterBufferAV1 onto struct v4l2_ctrl_av1_sequence,
queued via S_EXT_CTRLS as V4L2_CID_STATELESS_AV1_SEQUENCE.  The
daedalus_v4l2 daemon track will consume the ctrl to synthesise an
OBU_SEQUENCE_HEADER and prepend it to the slice bitstream, so libdav1d
can parse the OUTPUT buffer that ffmpeg-vaapi delivers without the
sequence header.

Until the daemon-side OBU synth lands (issue #11 operator track), the
SEQUENCE ctrl is just sitting in the request unused.  Harmless on the
RK3588 vpu981 hardware path (vpu981 parses OBU bytes directly, ignores
the ctrl payload).

pkgver: r382.c1bb444 -> r386.77f9236 (commit count 382 -> 386, two new
upstream commits: 9fa18f2 av1 + 77f9236 merge).
pkgrel: 1 (fresh pkgver, no rebuild-only iteration).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 21:17:54 +02:00
marfrit eab66cfab8 Merge pull request 'build.yml: convert ffmpeg+mpv-debian install-deps to apt-get (closes #55)' (#56) from fix/debian-runner-deps-2026-05-20 into main
Reviewed-on: marfrit/marfrit-packages#56
2026-05-20 19:13:50 +00:00
marfrit d2cecbcd05 build.yml: convert ffmpeg+mpv-debian install-deps to apt-get
Closes #55.

PR #47 routed ffmpeg-v4l2-request-debian and mpv-fourier-debian to
runs-on: debian-aarch64 (bohr), but their install-deps steps still
called pacman -Syu. That is a latent break that would surface on the
next pkgver bump (currently silent-skipped by check-already-published.sh
since pool versions match the staged PKGVER).

This patch follows PR #50's pattern (daedalus-v4l2{,-dkms}-debian):

- Replace retry pacman -Syu ... with retry apt-get install ...
- Translate Arch package names to Debian (base-devel -> build-essential,
  pkgconf -> pkg-config, libdrm -> libdrm-dev, x264 -> libx264-dev, etc.).
- For mpv: drop the "configure [marfrit] repo + pre-install
  ffmpeg-v4l2-request-fourier" step entirely. Under apt, stock
  libavcodec-dev / libavformat-dev / libavutil-dev provide trixie-ABI
  headers matching what mpv-fourier's binary will see at runtime; the
  daemon dlopens the fourier libs if installed but doesn't link against
  them at build time.

Validated upstream: equivalent debian build-deps installed cleanly in
PRs #44 (libva) and #50 (daedalus).
2026-05-20 21:09:50 +02:00
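The package-name translation listed in the commit above can be sketched as a small lookup. Only the four mappings named in the commit message are included; `arch_to_debian` and `translate_deps` are hypothetical helpers, and any other name is passed through unchanged rather than guessed.

```shell
#!/bin/sh
# Hypothetical helpers; only the four mappings from the commit message.
arch_to_debian() {
    case "$1" in
        base-devel) echo "build-essential" ;;
        pkgconf)    echo "pkg-config" ;;
        libdrm)     echo "libdrm-dev" ;;
        x264)       echo "libx264-dev" ;;
        *)          echo "$1" ;;   # unmapped: passed through unchanged
    esac
}

# Translate a whole dependency list into one space-separated line.
translate_deps() {
    out=""
    for pkg in "$@"; do
        out="$out${out:+ }$(arch_to_debian "$pkg")"
    done
    echo "$out"
}

translate_deps base-devel pkgconf libdrm x264
```

The call prints `build-essential pkg-config libdrm-dev libx264-dev`, which could then be handed to the `retry apt-get install ...` line.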
marfrit 2028eccc3c Merge pull request 'daedalus-v4l2{,-dkms}: 3dd0eb0 -> 462aa4b — kernel ctrl-binding fix' (#54) from claude-noether/marfrit-packages:noether/daedalus-bump-462aa4b into main
Reviewed-on: marfrit/marfrit-packages#54
2026-05-20 18:45:09 +00:00
marfrit 70c8c2b417 daedalus-v4l2{,-dkms}: 3dd0eb0 -> 462aa4b — kernel ctrl-binding fix
Upstream PR #2 landed the one-line kernel fix that was the missing
half of issue libva-v4l2-request-fourier#8: device_run now calls
v4l2_ctrl_request_setup() before reading ctrl->p_cur, so the
daedalus_h264_meta the daemon receives reflects the in-flight
media_request's bound H.264 stateless control values instead of
stale/default ones.

Pairs with libva-v4l2-request-fourier 1.0.0+r382+gc1bb444 (max_num_
ref_frames fallback + Fix 4 instrumentation that exposed the
control-binding gap in the first place).

Effect on Pi 5 / CM5 hosts (higgs): ffmpeg -hwaccel vaapi against
H.264 sources now produces actual decoded content (per-frame
fnv1a hashes differ, zero MB-decode errors) instead of the
constant 0x6a6a05c5 "best-effort give-up" hash and cascading
decode warnings.

Both packages: pkgver 0.1.0.r22.462aa4b, pkgrel reset to 1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 20:38:32 +02:00
marfrit 793187ff9e Merge pull request 'libva-v4l2-request-fourier (deb): catch build-deb.sh up to c1bb444 (follow-up to #52)' (#53) from claude-noether/marfrit-packages:noether/libva-deb-bump-c1bb444 into main
Reviewed-on: marfrit/marfrit-packages#53
2026-05-20 18:30:24 +00:00
claude-noether 42bf6b1633 libva-v4l2-request-fourier (deb): 9898331 -> c1bb444 (parallel to PR #52)
PR #52 bumped only arch/libva-v4l2-request-fourier/PKGBUILD; the
sibling debian/libva-v4l2-request-fourier/build-deb.sh has its own
parallel UPSTREAM_COMMIT + PKGVER + PKGREL pin that I missed.

Result: the libva-v4l2-request-fourier-debian CI job ran post-merge,
check-already-published.sh saw the .deb-side filename derived from
build-deb.sh (libva-v4l2-request-fourier_1.0.0+r380+g9898331-1_arm64.deb)
was already in the pool, returned skip=1, and the job short-circuited.
trixie repo Packages still advertises r380 instead of r382.

This bump catches build-deb.sh up to the same pin (c1bb444) so the
next merge triggers the build + reprepro publish path.

No code change beyond the three pinned variables + the comment block.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 20:29:10 +02:00
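A minimal sketch of why the stale Debian pin caused a silent skip: the .deb filename is derived from the pins in build-deb.sh, so an unchanged pin yields a filename that already exists in the pool. The `<name>_<PKGVER>-<PKGREL>_<arch>.deb` pattern matches the filename quoted above; `deb_filename`, `already_published`, and the flat pool directory are illustrative assumptions, not the real check-already-published.sh.

```shell
#!/bin/sh
# Hypothetical helpers; the flat pool directory layout is an assumption.
deb_filename() {
    # deb_filename NAME PKGVER PKGREL ARCH
    echo "${1}_${2}-${3}_${4}.deb"
}

already_published() {
    # already_published POOL_DIR FILENAME -> prints skip=1 or skip=0
    if [ -e "$1/$2" ]; then
        echo "skip=1"
    else
        echo "skip=0"
    fi
}

deb_filename libva-v4l2-request-fourier 1.0.0+r380+g9898331 1 arm64
```

With the pin left at 9898331 the derived name is the r380 file already in the pool, so the check reports skip=1; bumping the pin to c1bb444 produces a new filename and the build runs.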
marfrit 40719efc43 Merge pull request 'libva-v4l2-request-fourier: 9898331 -> c1bb444 (PR #9 / issue #8 fix)' (#52) from claude-noether/marfrit-packages:noether/libva-bump-c1bb444 into main
Reviewed-on: marfrit/marfrit-packages#52
2026-05-20 18:23:42 +00:00
claude-noether e540384f50 libva-v4l2-request-fourier: 9898331 -> c1bb444 (PR #9 / issue #8 fix)
Bumps the libva backend pin to include marfrit/libva-v4l2-request-fourier
PR #9 — h264_set_controls fix for the bitstream-vs-session value drift
that breaks the daedalus_v4l2 strict-consumer path (issue #8):

  * max_num_ref_frames fallback when VAAPI client left it 0 (count
    valid DPB entries, then per-profile spec minimum)
  * one-line request_log at h264_set_controls entry dumping raw
    VAAPI bitfields for disambiguating remaining PPS-flag-zero
    portion of #8

The PR explicitly defers the deeper "profile_idc / level_idc from
bitstream" portion of #8 — VAAPI's VAPictureParameterBufferH264 omits
both fields, so a real fix needs SPS-NAL parsing or daedalus
wire-protocol pass-through. Not in this bump.

pkgver: 1.0.0.r380.9898331 -> 1.0.0.r382.c1bb444 (commit count 380->382)
pkgrel: 1 (fresh pkgver, no rebuild-only iteration)

Verified on higgs (Debian 13 trixie, gcc 14.2.0, libva 2.22.0):
clean meson build, vainfo enumerates all 8 codec profiles, multi-device
probe still wires rkvdec / rpi-hevc-dec / daedalus_v4l2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 20:21:46 +02:00
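The Arch and Debian version strings quoted in these commits (Arch `1.0.0.r382.c1bb444`, Debian `1.0.0+r382+gc1bb444`) appear to differ only in punctuation. The sketch below converts one form into the other; the mapping is inferred from those strings, `arch_to_deb_version` is a hypothetical helper, and the repository's scripts may well derive the two forms independently.

```shell
#!/bin/sh
# Hypothetical helper; mapping inferred from version strings in the log.
# <base>.r<count>.<shorthash>  ->  <base>+r<count>+g<shorthash>
arch_to_deb_version() {
    printf '%s\n' "$1" | sed -E 's/\.r([0-9]+)\.([0-9a-f]+)$/+r\1+g\2/'
}

arch_to_deb_version 1.0.0.r382.c1bb444
```

The call prints `1.0.0+r382+gc1bb444`. A check like this could help spot a dual-pin mismatch between PKGBUILD and build-deb.sh before merging.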
marfrit 9ca97374c8 Merge pull request 'mesa-panvk-bifrost: iter13 — implement VK_EXT_transform_feedback for Bifrost' (#51) from claude-noether/marfrit-packages:noether/mesa-panvk-bifrost-iter13 into main
Reviewed-on: marfrit/marfrit-packages#51
2026-05-20 17:33:05 +00:00
marfrit 902e855d92 mesa-panvk-bifrost: iter13 — implement VK_EXT_transform_feedback for Bifrost
iter12 hit a wall: Brave's ANGLE-Vulkan path requires GLES3, which
requires VK_EXT_transform_feedback, which PanVk-Bifrost did not
implement. This iter implements that extension, unlocking the full
ANGLE-Vulkan-on-Bifrost stack.

The implementation follows Panfrost-Gallium's well-validated XFB lowering
(nir_io_add_intrinsic_xfb_info + pan_nir_lower_xfb) wired into the PanVk
shader pipeline after nir_lower_io. Adds 4 XFB buffer address sysvals
plus per-draw num_vertices to the graphics sysval struct. Buffer state
is tracked on the cmd buffer; per-draw sysval upload populates either
the bound buffer's GPU address or PAN_SHADER_OOB_ADDRESS (memory-sink)
so XFB-capable pipelines used outside Begin/End survive without GPU
fault — the Panfrost-Gallium idiom from gallium/drivers/panfrost/
pan_cmdstream.c:1350.

Verified on PineTab2 (Mali-G52 r1 MC1, RK3566):
- /tmp/panvk-iter13/probe_xfb: 3 vertices captured byte-exact
- /tmp/panvk-iter13/probe_xfb_nodraw: XFB pipeline used without Bind/
  Begin/End survives — DEVICE_LOST regression closed
- Brave 148 with --use-angle=vulkan: WebGL 2.0 (OpenGL ES 3.0) creates
  cleanly, renderer reports
  "ANGLE (ARM, Vulkan 1.2.335 (Mali-G52 r1 MC1), panvk)"
- chrome://gpu graphics feature status: Canvas/Compositing/OpenGL/
  Rasterization/WebGL/WebGL2/WebGPU/Video Decode all hardware accelerated

Phase docs:
- ~/src/panvk-bifrost/phase4_iter13_close.md  (build green)
- ~/src/panvk-bifrost/phase5_iter13_close.md  (review fixes applied)
- ~/src/panvk-bifrost/phase6_iter13_close.md  (Brave integration green)

pkgver bumped 26.0.6.r2 -> 26.0.6.r3; iter13 patch applied via
unified-diff (the 328-line change scope is past sed-of-individual-
lines territory). Sanity checks in prepare() verify the patch landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 19:13:05 +02:00
marfrit 64269d69ee Merge pull request 'ci: convert daedalus-v4l2{,-dkms}-debian install steps to apt-get' (#50) from claude-noether/marfrit-packages:noether/ci-fourier-debian-apt into main
Reviewed-on: marfrit/marfrit-packages#50
2026-05-20 17:06:21 +00:00
marfrit e976c88016 ci: convert daedalus-v4l2{,-dkms}-debian install steps to apt-get
PR #47 moved the daedalus-v4l2-debian + daedalus-v4l2-dkms-debian
jobs from runs-on: arch-aarch64 to runs-on: debian-aarch64, but
left the install-deps steps using `pacman -Syu` — which doesn't
exist on the Debian runner.  Both jobs were latent-broken; the
break only surfaces once a daedalus pkgver actually changes (the
rebuild guard skipped them in runs #133-134 since nothing about
daedalus moved between PR #47 and PR #48).

PR #49 bumped both daedalus packages to 0.1.0+r20+g3dd0eb0 (the
DAEMON-PPS H.264 SPS/PPS NAL synth landing) — so run #135's
daedalus-debian + daedalus-dkms-debian jobs actually executed and
hit the broken pacman step.  Result: instant failure on `pacman -Syu`.

Fix: replace the pacman invocations with apt-get equivalents.
For daedalus-v4l2-debian, drop the [marfrit] ffmpeg-v4l2-request-
fourier preinstall — Debian's stock libavcodec-dev / libavformat-
dev / libavutil-dev provide matching headers (both trixie ffmpeg
and the daedalus daemon's runtime dlopen target are libavcodec
61.x), and the daemon never link-binds against libav (Option γ —
dlopen at runtime), so any header set with the right struct
definitions works.

Verified end-to-end on higgs (Debian trixie aarch64, equivalent
to bohr): clone the source tarball, run build-deb.sh, produces
daedalus-v4l2_0.1.0+r20+g3dd0eb0-1_arm64.deb cleanly (10/10
ninja steps, daedalus_v4l2_daemon binary linked).

NOTE: ffmpeg-v4l2-request-debian (line ~907) and mpv-fourier-
debian (line ~1048) have the same pacman-on-Debian bug from
PR #47 but are still skipped because their pkgvers haven't moved.
Not fixing those in this PR to keep the change focused on
unblocking DAEMON-PPS verification — they'll need the same
treatment the next time they bump.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 19:03:36 +02:00
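
The failure described above (a `pacman` step running on a runner that only has `apt-get`) stayed hidden until the step actually executed. A preflight check could surface it with a clearer message. A minimal sketch, assuming a hypothetical helper `first_available` that is not part of the workflow:

```shell
# Hypothetical preflight helper: print the first command from the argument
# list that exists on PATH, or return 1 if none of them does.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}
```

A step could then start with `pm=$(first_available apt-get pacman) || { echo "no supported package manager" >&2; exit 1; }` and branch on `$pm`, rather than assuming the runner image.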
marfrit 29cc145d44 Merge pull request 'daedalus-v4l2{,-dkms}: 481279c -> 3dd0eb0 (DAEMON-PPS close)' (#49) from claude-noether/marfrit-packages:noether/daedalus-bump-3dd0eb0 into main
Reviewed-on: marfrit/marfrit-packages#49
2026-05-20 16:54:11 +00:00
18 changed files with 1867 additions and 158 deletions
+72 -60
@@ -924,11 +924,21 @@ jobs:
 run: |
 set -e
 retry() { for i in 1 2 3; do "$@" && return 0; rc=$?; echo "retry $i (exit=$rc)" >&2; sleep $((i*5)); done; return 1; }
-retry pacman -Syu --noconfirm --needed \
-dpkg openssh rsync curl base-devel git nasm yasm \
-linux-api-headers mesa alsa-lib bzip2 fontconfig fribidi gmp \
-gnutls lame libass dav1d libdrm freetype2 libpulse libva \
-libvorbis libvpx libwebp x264 x265 libxml2 opus v4l-utils xz zlib
+export DEBIAN_FRONTEND=noninteractive
+retry apt-get update -qq
+# Debian build-deps for the FFmpeg fourier-fork build. These
+# map 1:1 to the previous Arch list; libav*-dev intentionally
+# absent (we are FFmpeg itself, providing those libs).
+retry apt-get install -y --no-install-recommends \
+build-essential cmake ninja-build git pkg-config nasm yasm \
+linux-libc-dev libgl1-mesa-dev libasound2-dev libbz2-dev \
+libfontconfig-dev libfribidi-dev libgmp-dev libgnutls28-dev \
+libmp3lame-dev libass-dev libdav1d-dev libdrm-dev \
+libfreetype-dev libpulse-dev libva-dev libvorbis-dev libvpx-dev \
+libwebp-dev libx264-dev libx265-dev libxml2-dev libopus-dev \
+libvulkan-dev glslang-tools \
+v4l-utils liblzma-dev zlib1g-dev \
+curl ca-certificates openssh-client rsync dpkg-dev

 - name: install hertz deploy ssh key
 if: steps.skip-check.outputs.skip != '1'
@@ -1063,31 +1073,30 @@ jobs:
 run: |
 set -e
 retry() { for i in 1 2 3; do "$@" && return 0; rc=$?; echo "retry $i (exit=$rc)" >&2; sleep $((i*5)); done; return 1; }
-retry pacman -Syu --noconfirm --needed \
-dpkg openssh rsync curl base-devel git meson ninja python-docutils \
-ladspa wayland-protocols vulkan-headers \
-alsa-lib desktop-file-utils glibc hicolor-icon-theme jack lcms2 \
-libarchive libass libbluray libcdio libcdio-paranoia libdisplay-info \
-libdrm libdvdnav libdvdread libegl libgl libglvnd libjpeg-turbo \
-libplacebo libpulse libsixel libva libvdpau libx11 libxext \
-libxkbcommon libxpresent libxrandr libxss libxv luajit mesa mujs \
-libpipewire rubberband sdl2 openal uchardet vapoursynth \
-vulkan-icd-loader wayland zlib
-
-- name: configure [marfrit] repo + pre-install ffmpeg-v4l2-request-fourier
-if: steps.skip-check.outputs.skip != '1'
-run: |
-set -e
-curl -sLo /tmp/marfrit.gpg https://packages.reauktion.de/marfrit.gpg
-pacman-key --add /tmp/marfrit.gpg
-pacman-key --lsign-key 92D5E96D8F63C75E4116AA1FF5C8C4603D0D250C
-rm -f /tmp/marfrit.gpg
-if ! grep -q '^\[marfrit\]' /etc/pacman.conf; then
-printf '\n[marfrit]\nServer = https://packages.reauktion.de/arch/$arch\nSigLevel = Required\n' >> /etc/pacman.conf
-fi
-pacman -Sy --noconfirm
-rm -f /var/cache/pacman/pkg/ffmpeg-v4l2-request-fourier-*-aarch64.pkg.tar.*
-printf 'y\ny\ny\n' | pacman -S marfrit/ffmpeg-v4l2-request-fourier
+export DEBIAN_FRONTEND=noninteractive
+retry apt-get update -qq
+# Debian libav*-dev is ABI-compatible with the fourier ffmpeg
+# fork at the header level; mpv link-binds against system
+# libav at build time, runtime dlopen picks up the fourier
+# libs if installed. The previous [marfrit] pre-install of
+# ffmpeg-v4l2-request-fourier under pacman is unnecessary
+# under apt: stock Debian libav*-dev provides the trixie
+# ABI mpv-fourier's binary will encounter.
+retry apt-get install -y --no-install-recommends \
+build-essential git meson ninja-build pkg-config python3-docutils \
+ladspa-sdk wayland-protocols libvulkan-dev libwayland-dev \
+libasound2-dev desktop-file-utils libc6-dev hicolor-icon-theme \
+libjack-jackd2-dev liblcms2-dev libarchive-dev libass-dev \
+libbluray-dev libcdio-dev libcdio-paranoia-dev libdisplay-info-dev \
+libdrm-dev libdvdnav-dev libdvdread-dev libegl-dev libgl-dev \
+libglvnd-dev libjpeg-dev libplacebo-dev libpulse-dev libsixel-dev \
+libva-dev libvdpau-dev libx11-dev libxext-dev libxkbcommon-dev \
+libxpresent-dev libxrandr-dev libxss-dev libxv-dev libluajit-5.1-dev \
+libmujs-dev libpipewire-0.3-dev librubberband-dev libsdl2-dev \
+libopenal-dev libuchardet-dev libvapoursynth-dev liblzma-dev \
+libavcodec-dev libavformat-dev libavutil-dev libswscale-dev libswresample-dev \
+zlib1g-dev \
+curl ca-certificates openssh-client rsync dpkg-dev

 - name: install hertz deploy ssh key
 if: steps.skip-check.outputs.skip != '1'
@@ -1144,39 +1153,39 @@ jobs:
 echo "$result" >> "$GITHUB_OUTPUT"
 echo "decision: $result"

-- name: install build-deps (sans ffmpeg — see [marfrit] step)
+- name: install build-deps
 if: steps.skip-check.outputs.skip != '1'
 run: |
 set -e
 retry() { for i in 1 2 3; do "$@" && return 0; rc=$?; echo "retry $i (exit=$rc)" >&2; sleep $((i*5)); done; return 1; }
-# Do NOT pull stock 'ffmpeg' here: the arch-aarch64 runner has
-# ffmpeg-v4l2-request-fourier pre-installed from the mpv-aarch64
-# job (configured via [marfrit]), and pacman -S ffmpeg would
-# conflict on the libav* drop-in. Daedalus build only needs
-# libavcodec/libavformat headers, which the fourier package
-# already supplies. Keep cmake/ninja/pkgconf/libdrm here; the
-# ffmpeg-dev equivalent comes via the next step.
-retry pacman -Syu --noconfirm --needed \
-dpkg openssh rsync curl base-devel git cmake ninja pkgconf \
-libdrm
-
-- name: ensure ffmpeg-v4l2-request-fourier installed (link-time ABI source)
-if: steps.skip-check.outputs.skip != '1'
-run: |
-set -e
-# Idempotent: pre-install the marfrit fourier ffmpeg so cmake
-# finds libavcodec / libavformat / libavutil headers + .so's.
-# Mirrors mpv-fourier-debian's [marfrit] step.
-curl -sLo /tmp/marfrit.gpg https://packages.reauktion.de/marfrit.gpg
-pacman-key --add /tmp/marfrit.gpg
-pacman-key --lsign-key 92D5E96D8F63C75E4116AA1FF5C8C4603D0D250C
-rm -f /tmp/marfrit.gpg
-if ! grep -q '^\[marfrit\]' /etc/pacman.conf; then
-printf '\n[marfrit]\nServer = https://packages.reauktion.de/arch/$arch\nSigLevel = Required\n' >> /etc/pacman.conf
-fi
-pacman -Sy --noconfirm
-rm -f /var/cache/pacman/pkg/ffmpeg-v4l2-request-fourier-*-aarch64.pkg.tar.*
-printf 'y\ny\ny\n' | pacman -S --needed marfrit/ffmpeg-v4l2-request-fourier
+export DEBIAN_FRONTEND=noninteractive
+retry apt-get update -qq
+# FFmpeg headers + sonames the daemon dlopens. As of
+# daedalus-v4l2 PR #16 (commit 514da29), the daemon targets
+# the Kwiboo fork's libavcodec.so.62 / libavformat.so.62 /
+# libavutil.so.60 at /opt/fourier — so the build needs
+# /opt/fourier/include and /opt/fourier/lib/pkgconfig.
+# ffmpeg-v4l2-request-fourier provides both (plus the
+# runtime libs the .deb will dlopen on the target host;
+# we install it as a build-dep here and the dpkg-shlibdeps
+# step pulls it into the daemon .deb's Depends automatically).
+# Debian-stock libav*-dev removed — would conflict on
+# /usr/include/libavcodec/avcodec.h vs /opt/fourier's copy.
+#
+# libvulkan-dev + glslang-tools: needed by the in-build
+# daedalus-fourier fetch (build-deb.sh fetches the sibling
+# library, cmake-builds it into a temp prefix, then the
+# daedalus daemon static-links against it via pkg-config).
+# Without these, daedalus-fourier's find_package(Vulkan)
+# and glslangValidator find_program both fail at configure
+# time. See marfrit/daedalus-fourier PR #1 +
+# reauktion/daedalus-v4l2 PR #13.
+retry apt-get install -y --no-install-recommends \
+build-essential cmake ninja-build pkg-config git \
+ffmpeg-v4l2-request-fourier libdrm-dev \
+libvulkan-dev glslang-tools \
+linux-libc-dev \
+curl ca-certificates openssh-client rsync dpkg-dev

 - name: install hertz deploy ssh key
 if: steps.skip-check.outputs.skip != '1'
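
The install-step comment in this hunk states that the build needs `/opt/fourier/include` and `/opt/fourier/lib/pkgconfig`, and that the daemon targets `libavcodec.so.62`, `libavformat.so.62` and `libavutil.so.60`. A minimal sketch of a check that reports which of those are missing under a prefix. The function name is hypothetical, and placing the `.so` files under `lib/` is an assumption, since the comment only says they live at `/opt/fourier`:

```shell
# Sketch: print every expected path (relative to the prefix) that does not
# exist. The lib/ location of the shared objects is an assumption.
missing_fourier_files() {
  prefix=${1:-/opt/fourier}
  for rel in include lib/pkgconfig \
             lib/libavcodec.so.62 lib/libavformat.so.62 lib/libavutil.so.60; do
    [ -e "$prefix/$rel" ] || echo "$rel"
  done
}
```

Empty output means everything was found. Run right after the `apt-get install` line, a non-empty result would point at a packaging problem before cmake fails with a less specific message.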
@@ -1238,7 +1247,10 @@ jobs:
 run: |
 set -e
 retry() { for i in 1 2 3; do "$@" && return 0; rc=$?; echo "retry $i (exit=$rc)" >&2; sleep $((i*5)); done; return 1; }
-retry pacman -Syu --noconfirm --needed dpkg openssh rsync curl tar gzip
+export DEBIAN_FRONTEND=noninteractive
+retry apt-get update -qq
+retry apt-get install -y --no-install-recommends \
+dpkg-dev openssh-client rsync curl ca-certificates tar gzip

 - name: install hertz deploy ssh key
 if: steps.skip-check.outputs.skip != '1'
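
Every install step in this workflow wraps its package-manager calls in the same one-line `retry` helper: up to three attempts, with a back-off of 5, 10 and 15 seconds. The sketch below spells that helper out so it can be exercised in isolation. `RETRY_DELAY` and the `flaky` test command are hypothetical additions for illustration and are not in the workflow:

```shell
# Same shape as the workflow's helper. RETRY_DELAY is a hypothetical knob
# (default 5, matching the original) so the sketch can run without waiting.
retry() {
  for i in 1 2 3; do
    "$@" && return 0
    rc=$?
    echo "retry $i (exit=$rc)" >&2
    sleep $(( i * ${RETRY_DELAY:-5} ))
  done
  return 1
}

# Hypothetical flaky command: bumps a counter file on each call and only
# succeeds once the counter reaches the requested attempt number.
flaky() {
  counter_file=$1
  succeed_on=$2
  n=$(( $(cat "$counter_file" 2>/dev/null || echo 0) + 1 ))
  echo "$n" > "$counter_file"
  [ "$n" -ge "$succeed_on" ]
}
```

With `RETRY_DELAY=0`, `retry flaky /tmp/count 3` succeeds on the third attempt, while `retry flaky /tmp/count 4` (starting from a fresh counter) gives up and returns 1. Note that the helper also sleeps after the final failed attempt, which is true of the original one-liner as well.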
+8 -3
@@ -18,10 +18,15 @@ _module=daedalus_v4l2
 # Same pin as arch/daedalus-v4l2 — keep kernel module + daemon
 # bit-versioned together so the chardev wire protocol stays in sync.
-_commit=3dd0eb070a75893f78368ce819b9e9ebf08c124d
+# 5d8b436 reverts PRs #7 + #8 (parking design that broke libva's
+# 1:1 contract — see daedalus-v4l2#9 + #10). Tree is
+# content-equivalent to f0d4186 plus PR #4 (cosmetic menu ctrls).
+# PROTO_VERSION drops 1 → 0; lock-step install with
+# daedalus-v4l2 0.1.0.r33.5d8b436 REQUIRED.
+_commit=5d8b4369e58ab947d1c56b1f718293c57c6065b5
-pkgver=0.1.0.r20.3dd0eb0
-pkgrel=1 # reset for new upstream pin (3dd0eb0 — DAEMON-PPS H.264 SPS/PPS NAL synth)
+pkgver=0.1.0.r33.5d8b436
+pkgrel=1 # reset for new upstream pin (5d8b436 — revert parking design)
 pkgdesc="V4L2 stateless decoder shim kernel module (DKMS) — Pi 5 / CM5"
 arch=('any')
 url="https://git.reauktion.de/reauktion/daedalus-v4l2"
+11 -9
@@ -16,17 +16,19 @@
 pkgname=daedalus-v4l2
 _upstreampkg=daedalus-v4l2
-# Pin the daedalus-v4l2 tip. 481279c = "Phase 8.13: byte-exact end-to-
-# end via libva (consumer target hit)" — first commit where the full
-# ffmpeg -hwaccel vaapi → libva → /dev/video0 → daemon path lands a
-# pixel-correct decoded frame back in ffmpeg. Promote to a later pin
-# only after a future phase closes cleanly.
-_commit=3dd0eb070a75893f78368ce819b9e9ebf08c124d
+# 6e6dfa1 = picks up daedalus-v4l2 PR #16 — daemon now dlopens
+# the Kwiboo fourier fork's libavcodec.so.62 / libavformat.so.62 /
+# libavutil.so.60 at /opt/fourier instead of Debian-stock soname
+# 61/61/59. First step on the daedalus-fourier substitution arc
+# (daedalus-v4l2#11). Daemon still needs daedalus-fourier at
+# build time (Arch packaging for that is a follow-up; Debian side
+# fetches inline via build-deb.sh).
+_commit=6e6dfa144da7bc7fa8be50c8da91d7d1c6132a2c
 # 0.1.0 (pre-1.0) + commit count + short sha. Bump the .Y on each
 # Phase 8.x close. pkgver() recomputes at build time.
-pkgver=0.1.0.r20.3dd0eb0
-pkgrel=1 # reset for new upstream pin (3dd0eb0 — DAEMON-PPS H.264 SPS/PPS NAL synth)
+pkgver=0.1.0.r41.6e6dfa1
+pkgrel=1 # reset for new upstream pin (6e6dfa1 — soname 62 via /opt/fourier)
 pkgdesc="Userspace daemon for the daedalus-v4l2 V4L2 stateless decoder shim (VP9/AV1/H.264 on Pi 5 / CM5)"
 arch=('aarch64')
 url="https://git.reauktion.de/reauktion/daedalus-v4l2"
@@ -34,7 +36,7 @@ license=('BSD-2-Clause' 'GPL-2.0-or-later')
 # Daemon dlopens libavformat.so.61 / libavcodec.so.61 / libavutil.so.59
 # at runtime (Option γ — see daemon/src/ffmpeg_loader.h). ffmpeg
 # provides those; we don't link them.
-depends=('ffmpeg' 'libdrm')
+depends=('ffmpeg-v4l2-request-fourier' 'libdrm')
 # Headers from libav*-dev needed at compile time for type-safe function
 # pointer signatures; pkg-config locates them.
 makedepends=('cmake' 'ninja' 'pkgconf' 'git' 'ffmpeg')
@@ -0,0 +1,137 @@
From f760c0541586f43334c02611fcb4c212c08ad576 Mon Sep 17 00:00:00 2001
From: Markus Fritsche <mfritsche@reauktion.de>
Date: Thu, 21 May 2026 21:40:22 +0200
Subject: [PATCH] avcodec/aarch64/h264dsp: route H.264 4x4 IDCT through
daedalus-fourier
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

H264DSPContext.idct_add (called per 4x4 block from the intra-4x4
decode path in h264_mb.c) now dispatches through
daedalus_recipe_dispatch_h264_idct4 instead of ff_h264_idct_add_neon.
The recipe layer picks the substrate; for cycle 6 (H.264 IDCT 4x4)
the recipe is CPU NEON, so this is effectively a NEON-to-NEON
substitution with one extra dispatch call and recipe-table lookup.

Provides the first end-to-end exercise of the daedalus-fourier
kernel pack inside the libavcodec.so decode hot path; follow-up
patches wire IDCT 8x8, luma-v deblock, and qpel mc20.

The library context is process-global, lazily initialised under
pthread_once on first call. We pick the no-QPU constructor because
libavcodec.so is loaded into arbitrary host processes
(firefox-fourier, mpv-fourier, daedalus_v4l2_daemon, ...) and we
cannot assume the host has a usable Vulkan instance. Higher cycles
(deblock luma-v, MC) that benefit from the QPU will provision their
own recipe-selected context once that path is wired.

Bulk paths (idct_add16, idct_add16intra, idct_add8 — used for
non-intra4x4 macroblocks) remain on the stock NEON .S implementations
and will be batched through daedalus_recipe_dispatch_h264_idct4 with
n_blocks>1 in a follow-up.

Bit-exact against ff_h264_idct_add_neon (daedalus-fourier cycle 6
green; see marfrit/daedalus-fourier/CYCLE_LOGS.md).

Refs reauktion/daedalus-v4l2#11 — substitution arc step 2.
---
libavcodec/aarch64/Makefile | 3 +-
libavcodec/aarch64/h264_idct_daedalus.c | 49 +++++++++++++++++++++++
libavcodec/aarch64/h264dsp_init_aarch64.c | 3 +-
3 files changed, 53 insertions(+), 2 deletions(-)
create mode 100644 libavcodec/aarch64/h264_idct_daedalus.c
diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 41ab025..7b95fb1 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -3,7 +3,8 @@ OBJS-$(CONFIG_AC3DSP) += aarch64/ac3dsp_init_aarch64.o
OBJS-$(CONFIG_FDCTDSP) += aarch64/fdctdsp_init_aarch64.o
OBJS-$(CONFIG_FMTCONVERT) += aarch64/fmtconvert_init.o
OBJS-$(CONFIG_H264CHROMA) += aarch64/h264chroma_init_aarch64.o
-OBJS-$(CONFIG_H264DSP) += aarch64/h264dsp_init_aarch64.o
+OBJS-$(CONFIG_H264DSP) += aarch64/h264dsp_init_aarch64.o \
+ aarch64/h264_idct_daedalus.o
OBJS-$(CONFIG_HUFFYUVDSP) += aarch64/huffyuvdsp_init_aarch64.o
OBJS-$(CONFIG_H264PRED) += aarch64/h264pred_init.o
OBJS-$(CONFIG_H264QPEL) += aarch64/h264qpel_init_aarch64.o
diff --git a/libavcodec/aarch64/h264_idct_daedalus.c b/libavcodec/aarch64/h264_idct_daedalus.c
new file mode 100644
index 0000000..538d223
--- /dev/null
+++ b/libavcodec/aarch64/h264_idct_daedalus.c
@@ -0,0 +1,49 @@
+/*
+ * H.264 4x4 IDCT + add — daedalus-fourier substitution shim.
+ *
+ * Routes H264DSPContext.idct_add through
+ * daedalus_recipe_dispatch_h264_idct4 instead of ff_h264_idct_add_neon.
+ * The recipe layer picks the substrate (CPU NEON by default for
+ * cycle 6; future cycles may dispatch to V3D opportunistically).
+ *
+ * FFmpeg's 4x4 block memory layout matches daedalus's column-major
+ * convention: block[r + 4*c] = coefficient at (row r, col c). Both
+ * sides destructively zero the block after the transform.
+ *
+ * The library context is process-global and lazily initialised under
+ * pthread_once. We pick the no-QPU constructor here because
+ * libavcodec.so is loaded into arbitrary host processes
+ * (firefox-fourier, mpv-fourier, daedalus_v4l2_daemon, ...) and we
+ * cannot assume the host has a usable Vulkan instance. Higher cycles
+ * (deblock, MC) that benefit from the QPU initialise their own
+ * recipe-selected context once that path is wired.
+ */
+
+#include <pthread.h>
+#include <stddef.h>
+#include <stdint.h>
+
+#include <daedalus.h>
+
+#include "libavutil/attributes.h"
+#include "libavcodec/h264dsp.h"
+
+static daedalus_ctx *g_dctx;
+static pthread_once_t g_dctx_once = PTHREAD_ONCE_INIT;
+
+static void daedalus_ctx_init_once(void)
+{
+ g_dctx = daedalus_ctx_create_no_qpu();
+}
+
+void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride);
+
+void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride)
+{
+ static const daedalus_h264_block_meta meta = { .dst_off = 0 };
+
+ pthread_once(&g_dctx_once, daedalus_ctx_init_once);
+
+ daedalus_recipe_dispatch_h264_idct4(g_dctx, dst, (size_t)stride,
+ block, 1, &meta);
+}
diff --git a/libavcodec/aarch64/h264dsp_init_aarch64.c b/libavcodec/aarch64/h264dsp_init_aarch64.c
index c684574..b993df2 100644
--- a/libavcodec/aarch64/h264dsp_init_aarch64.c
+++ b/libavcodec/aarch64/h264dsp_init_aarch64.c
@@ -66,6 +66,7 @@ void ff_biweight_h264_pixels_4_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride
int weights, int offset);
void ff_h264_idct_add_neon(uint8_t *dst, int16_t *block, int stride);
+void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride);
void ff_h264_idct_dc_add_neon(uint8_t *dst, int16_t *block, int stride);
void ff_h264_idct_add16_neon(uint8_t *dst, const int *block_offset,
int16_t *block, int stride,
@@ -139,7 +140,7 @@ av_cold void ff_h264dsp_init_aarch64(H264DSPContext *c, const int bit_depth,
c->biweight_pixels_tab[1] = ff_biweight_h264_pixels_8_neon;
c->biweight_pixels_tab[2] = ff_biweight_h264_pixels_4_neon;
- c->idct_add = ff_h264_idct_add_neon;
+ c->idct_add = ff_h264_idct_add_daedalus;
c->idct_dc_add = ff_h264_idct_dc_add_neon;
c->idct_add16 = ff_h264_idct_add16_neon;
c->idct_add16intra = ff_h264_idct_add16intra_neon;
--
2.47.3
+33 -3
@@ -24,8 +24,13 @@ _srcname=FFmpeg
 _version='8.1'
 _commit='b57fbbe50c9b2656fad86a1a7eeabfd2b2a50935' # v4l2-request-n8.1 tip 2026-04-24
 pkgver=8.1.r123329.b57fbbe
-pkgrel=5
+pkgrel=6 # pkgrel=6 — H.264 IDCT 4x4 daedalus-fourier substitution (2026-05-21)
 epoch=2
+# daedalus-fourier pin — first kernel substitution in libavcodec
+# (cycle 6 H.264 IDCT 4x4). Same SHA as the daedalus-v4l2 daemon's
+# inline build; lockstep with that until the public API rolls.
+_daedalus_fourier_commit='d87239d8172307d9a1b93c95cbed116d175b85cc'
 pkgdesc='FFmpeg with V4L2 Request API hwaccel (Rockchip / Allwinner stateless decode)'
 arch=('aarch64')
 url='https://github.com/Kwiboo/FFmpeg'
@@ -34,6 +39,7 @@ depends=(
 alsa-lib
 bzip2
 fontconfig
+vulkan-icd-loader
 fribidi
 gmp
 gnutls
@@ -59,10 +65,13 @@ depends=(
 zlib
 )
 makedepends=(
+cmake
 git
 linux-api-headers
 mesa
 nasm
+ninja
+vulkan-headers
 )
 provides=(
 libavcodec.so
@@ -78,9 +87,11 @@ provides=(
 conflicts=(ffmpeg)
 replaces=(ffmpeg ffmpeg-v4l2-request-git)
 source=("git+https://github.com/Kwiboo/FFmpeg.git#commit=${_commit}"
+"daedalus-fourier-${_daedalus_fourier_commit}.tar.gz::https://git.reauktion.de/marfrit/daedalus-fourier/archive/${_daedalus_fourier_commit}.tar.gz"
 '0001-libudev-bypass-fallback.patch'
-'0002-nv15-to-p010-unpack.patch')
-sha256sums=('SKIP' 'SKIP' 'SKIP')
+'0002-nv15-to-p010-unpack.patch'
+'0003-h264-idct4-daedalus-fourier.patch')
+sha256sums=('SKIP' 'SKIP' 'SKIP' 'SKIP' 'SKIP')

 pkgver() {
 cd "${_srcname}"
@@ -93,9 +104,25 @@ prepare() {
 cd "${_srcname}"
 patch -Np1 -i "${srcdir}/0001-libudev-bypass-fallback.patch"
 patch -Np1 -i "${srcdir}/0002-nv15-to-p010-unpack.patch"
+patch -Np1 -i "${srcdir}/0003-h264-idct4-daedalus-fourier.patch"
 }

 build() {
+# --- daedalus-fourier: build static .a with PIC, install to a
+# per-build prefix; libavcodec.so links it into the shared object so
+# H264DSPContext.idct_add (and follow-up kernels) dispatch through
+# the daedalus recipe layer instead of the in-tree NEON .S code. ---
+local _fourier_prefix="${srcdir}/fourier-prefix"
+mkdir -p "${_fourier_prefix}"
+pushd "${srcdir}"/daedalus-fourier >/dev/null
+cmake -B build -G Ninja \
+-DCMAKE_BUILD_TYPE=Release \
+-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
+-DCMAKE_INSTALL_PREFIX="${_fourier_prefix}"
+cmake --build build --target daedalus_core
+cmake --install build
+popd >/dev/null
 cd "${_srcname}"

 # FFmpeg's configure resolves the compiler via `which` and bakes the
@@ -147,6 +174,9 @@ build() {
 --enable-libx265 \
 --enable-libwebp \
 \
+--extra-cflags="-I${_fourier_prefix}/include" \
+--extra-ldflags="-L${_fourier_prefix}/lib" \
+--extra-libs="-ldaedalus_core -lvulkan -lpthread" \
 --host-cflags='-fPIC'

 make
@@ -18,27 +18,30 @@ This patch adds a sibling init path, `InitV4L2RequestDecoder`, that:
 * looks up the codec via two complementary mechanisms libavcodec
 uses for v4l2_request:
 - **named codec** (`h264_v4l2request`, `vp8_v4l2request`, etc.):
-the legacy AVCodec-per-hwaccel registration. ALARM, Debian,
-and most distros building with --enable-v4l2-request expose
-this (avcodec_find_decoder_by_name lookup).
-- **generic codec + AV_HWDEVICE_TYPE_DRM** in `hw_configs`:
-the modern hwaccel registration on some upstream-only ffmpeg
-builds.
+the legacy AVCodec-per-hwaccel registration.
+- **generic codec + hw_configs walk**: the modern hwaccel
+registration. Accepts EITHER AV_HWDEVICE_TYPE_DRM (legacy
+ffmpeg-v4l2-request-fork output prior to FFmpeg 7.1) OR
+AV_HWDEVICE_TYPE_V4L2REQUEST (FFmpeg 7.1+ dedicated enum,
+value 13 on Kwiboo's no-AMF tree, 14 on upstream-AMF tree).
+Mozilla's bundled libavutil headers may not have the V4L2REQUEST
+enumerator, so the test is on the integer value via `(int)cast`.
 Probes named-codec first (explicit, portable) and falls back to
-walking the generic codec's `hw_configs` for the DRM device type;
-* creates an `AV_HWDEVICE_TYPE_DRM` hwdevice context bound to
-`/dev/dri/renderD128` via the new `av_hwdevice_ctx_create` wrapper
-(patch 2/4) and attaches it to the codec context;
+walking the generic codec's `hw_configs` for either device type;
+* creates an hwdevice context bound to `/dev/dri/renderD128`. Uses
+integer 13 (V4L2REQUEST as defined by Kwiboo's v4l2-request-n7.1.3
+tree, what our libavcodec61-fourier emits) cast to enum
+AVHWDeviceType for the av_hwdevice_ctx_create call;
 * reuses the existing `ChooseV4L2PixelFormat` get-format callback
 (already returns `AV_PIX_FMT_DRM_PRIME`) and the existing
 `apply_cropping = 0` constraint.

 `InitV4L2RequestDecoder` is invoked **before** `InitV4L2Decoder` in
 `InitHWDecoderIfAllowed`. On Rockchip mainline it succeeds via either
-mechanism (ALARM uses the named codec). On Pi4 / Mediatek /
-vendor-MPP-stateful boards neither mechanism is registered for the
-codec, the function bails out, and the existing stateful
-`InitV4L2Decoder` runs as before. No regression of stateful boards.
+mechanism. On Pi4 / Mediatek / vendor-MPP-stateful boards neither
+mechanism is registered for the codec, the function bails out, and the
+existing stateful `InitV4L2Decoder` runs as before. No regression of
+stateful boards.

 `mDRMDeviceContext` is unconditionally `av_buffer_unref`'d in
 `ProcessShutdown` (no-op when null). Gated behind
@@ -46,9 +49,8 @@ codec, the function bails out, and the existing stateful
 Bug 1969297.
-diff --git a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h b/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h
---- a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h 2026-03-18 19:22:14.000000000 +0000
-+++ b/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h 2026-04-27 20:43:39.347992674 +0000
+--- a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h 2026-05-21 04:57:59.570946601 +0000
++++ b/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h 2026-05-21 04:57:59.876488776 +0000
 @@ -225,7 +225,12 @@
 bool IsLinuxHDR() const;
 MediaResult InitVAAPIDecoder();
@@ -73,9 +75,8 @@ diff --git a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.h b/dom/media/platfor
 // If video overlay is used we want to upload SW decoded frames to
 // DMABuf and present it as a external texture to rendering pipeline.
 bool mUploadSWDecodeToDMABuf = false;
-diff --git a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp b/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp
---- a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp 2026-04-27 16:09:10.000000000 +0200
-+++ b/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp 2026-04-29 00:10:00.098884335 +0200
+--- a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp 2026-05-21 04:57:59.566685221 +0000
++++ b/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp 2026-05-21 04:58:00.136004159 +0000
 @@ -403,6 +403,129 @@
 return NS_OK;
 }
@@ -90,7 +91,7 @@ diff --git a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp b/dom/media/platf
 + }
 + const char* drmDevice = "/dev/dri/renderD128";
 + if (mLib->av_hwdevice_ctx_create(&mDRMDeviceContext,
-+ AV_HWDEVICE_TYPE_DRM, drmDevice,
++ (enum AVHWDeviceType)13, drmDevice,
 + nullptr, 0) < 0) {
 + FFMPEG_LOG(" av_hwdevice_ctx_create(DRM, %s) failed", drmDevice);
 + return false;
@@ -143,7 +144,7 @@ diff --git a/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp b/dom/media/platf
 + for (int i = 0;; i++) {
 + const AVCodecHWConfig* cfg = mLib->avcodec_get_hw_config(generic, i);
 + if (!cfg) break;
-+ if (cfg->device_type == AV_HWDEVICE_TYPE_DRM) {
++ if (cfg->device_type == AV_HWDEVICE_TYPE_DRM || (int)cfg->device_type == 13 || (int)cfg->device_type == 14) {
 + codec = generic;
 + FFMPEG_LOG(" using generic codec %s with DRM hwaccel", codec->name);
 + break;
+17 -19
@@ -24,31 +24,29 @@ pkgname=libva-v4l2-request-fourier
 epoch=1
 _upstreampkg=libva-v4l2-request
-# Pin the fork tip. de27e95 = "v4l2: log error_idx + failing ctrl id
-# on S_EXT_CTRLS failure" — Phase 8.13 diagnostic that surfaced the
-# real root cause of the libva→daedalus_v4l2 request-completion
-# timeout (turned out the EINVAL libva was logging was a harmless
-# H264/HEVC probe; actual VP9 stateless control SET worked all along).
+# Pin the fork tip. c454618 = PR #16 merge "picture, request_pool:
+# transparent OUTPUT-pool resize on bitstream overrun (#15)" —
+# follow-up root-cause fix to #13/#14. On a mid-stream bitstream-
+# budget overrun (typical cause: SPS-driven resolution upshift in an
+# adaptive-bitrate stream), codec_store_buffer now snapshots the in-
+# flight surface's accumulated bytes, releases its OUTPUT pool slot,
+# calls request_pool_resize (STREAMOFF → REQBUFS(0) → S_FMT with
+# 2×sizeimage hint, capped at 1 GiB, page-aligned → CREATE_BUFS →
+# mmap → media_request_alloc → STREAMON), re-acquires a slot, re-
+# mirrors the surface's source_{data,size,request_fd}, restores the
+# bytes, and continues. The frame survives instead of being dropped
+# back to libavcodec for surface recreation. CAPTURE side untouched
+# (per-queue V4L2 streaming independence).
 #
-# Prior pin (7ac934e) was iter38b — fresnel-fourier multi-device probe
-# + MAX_PROFILES bounds-check fix. de27e95 added the daedalus_v4l2
-# probe slot (b5b3acf), the meson option gate (2146341), and the
-# S_EXT_CTRLS diagnostic (de27e95 itself). c332d34 (LIBVA-1) added
-# the per-codec dispatch: rpi-hevc-dec + daedalus_v4l2 both probe each
-# other as alts, VP9/AV1/H.264 route to daedalus via new 'd' kind,
-# HEVC stays on 'p' (rpi-hevc-dec). 9898331 (LIBVA-2) completes that
-# by adding video_fd_daedalus to any_fd_supports_output_format's probe
-# array — without it, H.264/VP9/AV1 profiles never got advertised on
-# Pi 5 mixed deployments (rpi-hevc-dec primary, daedalus alt) and
-# ffmpeg bailed with "No support for codec h264 profile 578" before
-# the per-codec dispatch could fire.
-_commit=989833114a7708ad999dc68309cbc181d9913bdb
+# Prior pin (2860d75) = PR #14 merge — codec_store_buffer bounds-
+# check floor (#13).
+_commit=c454618ae11addce2e17b560f4deeacbed067d98
 # Project version from meson.build (1.0.0) + commit count + short sha,
 # matching the ffmpeg-v4l2-request-fourier convention. Recomputed at
 # build time by pkgver() below; the static value here is a placeholder
 # so AUR-style consumers see something coherent before src/ exists.
-pkgver=1.0.0.r380.9898331
+pkgver=1.0.0.r390.c454618
 pkgrel=1
 pkgdesc="VA-API backend for V4L2 stateless decoders (multiplanar fork — fourier umbrella)"
 arch=('aarch64')
@@ -0,0 +1,328 @@
--- a/src/panfrost/vulkan/panvk_shader.h 2026-04-29 22:19:00.000000000 +0200
+++ b/src/panfrost/vulkan/panvk_shader.h 2026-05-20 18:52:53.312698258 +0200
@@ -150,6 +150,10 @@
struct {
#if PAN_ARCH < 9
int32_t raw_vertex_offset;
+ uint32_t num_vertices; /* iter13: XFB needs per-draw vertex count */
+ /* aligned_u64 attribute below inserts the 4-byte alignment gap
+ * after num_vertices automatically — no explicit pad needed. */
+ aligned_u64 xfb_address[4]; /* iter13: 4 transform feedback buffer base addresses */
#endif
int32_t first_vertex;
int32_t base_instance;
--- a/src/panfrost/vulkan/panvk_vX_physical_device.c 2026-05-20 19:09:29.711145446 +0200
+++ b/src/panfrost/vulkan/panvk_vX_physical_device.c 2026-05-20 18:52:54.832720445 +0200
@@ -169,6 +169,7 @@
.EXT_provoking_vertex = true,
.EXT_queue_family_foreign = true,
.EXT_robustness2 = true,
+ .EXT_transform_feedback = PAN_ARCH < 9, /* iter13: JM-class only for now */
.EXT_sampler_filter_minmax = PAN_ARCH >= 10,
.EXT_scalar_block_layout = true,
.EXT_separate_stencil_usage = true,
@@ -495,6 +496,10 @@
.robustImageAccess2 = false,
.nullDescriptor = true,
+ /* VK_EXT_transform_feedback (iter13) */
+ .transformFeedback = PAN_ARCH < 9,
+ .geometryStreams = false,
+
/* VK_KHR_shader_clock */
.shaderSubgroupClock = device->kmod.dev->props.gpu_can_query_timestamp,
.shaderDeviceClock = device->kmod.dev->props.timestamp_device_coherent,
@@ -1020,6 +1025,18 @@
.robustStorageBufferAccessSizeAlignment = 1,
.robustUniformBufferAccessSizeAlignment = 1,
+ /* VK_EXT_transform_feedback (iter13) */
+ .maxTransformFeedbackStreams = 1,
+ .maxTransformFeedbackBuffers = 4,
+ .maxTransformFeedbackBufferSize = UINT32_MAX,
+ .maxTransformFeedbackStreamDataSize = 512,
+ .maxTransformFeedbackBufferDataSize = 512,
+ .maxTransformFeedbackBufferDataStride = 2048,
+ .transformFeedbackQueries = false,
+ .transformFeedbackStreamsLinesTriangles = false,
+ .transformFeedbackRasterizationStreamSelect = false,
+ .transformFeedbackDraw = false,
+
/* VK_EXT_shader_object */
/* We do not currently support VK_EXT_shader_object but this is used
* internally by vk_shader
--- a/src/panfrost/vulkan/panvk_vX_shader.c 2026-04-29 22:19:00.000000000 +0200
+++ b/src/panfrost/vulkan/panvk_vX_shader.c 2026-05-20 18:52:56.556745611 +0200
@@ -21,6 +21,7 @@
#include "panvk_physical_device.h"
#include "panvk_sampler.h"
#include "panvk_shader.h"
+#include "pan_nir.h" /* iter13: pan_nir_lower_xfb */
#include "spirv/nir_spirv.h"
#include "util/memstream.h"
@@ -100,6 +101,20 @@
case nir_intrinsic_load_raw_vertex_offset_pan:
val = load_sysval(b, graphics, bit_size, vs.raw_vertex_offset);
break;
+ case nir_intrinsic_load_num_vertices: /* iter13: XFB index calc */
+ val = load_sysval(b, graphics, bit_size, vs.num_vertices);
+ break;
+ case nir_intrinsic_load_xfb_address: { /* iter13: XFB buffer N base address */
+ unsigned idx = nir_intrinsic_base(intr);
+ switch (idx) {
+ case 0: val = load_sysval(b, graphics, bit_size, vs.xfb_address[0]); break;
+ case 1: val = load_sysval(b, graphics, bit_size, vs.xfb_address[1]); break;
+ case 2: val = load_sysval(b, graphics, bit_size, vs.xfb_address[2]); break;
+ case 3: val = load_sysval(b, graphics, bit_size, vs.xfb_address[3]); break;
+ default: return false;
+ }
+ break;
+ }
case nir_intrinsic_load_layer_id:
assert(b->shader->info.stage == MESA_SHADER_FRAGMENT);
val = load_sysval(b, graphics, bit_size, layer_id);
@@ -457,6 +472,7 @@
core_max_id);
pan_preprocess_nir(nir, pdev->kmod.dev->props.gpu_id);
+
}
static void
@@ -870,6 +886,18 @@
nir_var_shader_in | nir_var_shader_out, UINT32_MAX);
NIR_PASS(_, nir, nir_lower_io, nir_var_shader_in | nir_var_shader_out,
glsl_type_size, nir_lower_io_use_interpolated_input_intrinsics);
+
+#if PAN_ARCH < 9
+ /* iter13: VK_EXT_transform_feedback — runs AFTER nir_lower_io so that
+ * shader outputs are now store_output intrinsics that pan_nir_lower_xfb
+ * can rewrite to nir_store_global+nir_load_xfb_address. */
+ if (nir->info.stage == MESA_SHADER_VERTEX &&
+ nir->info.has_transform_feedback_varyings) {
+ NIR_PASS(_, nir, nir_opt_constant_folding);
+ NIR_PASS(_, nir, nir_io_add_intrinsic_xfb_info);
+ NIR_PASS(_, nir, pan_nir_lower_xfb);
+ }
+#endif
}
static VkResult
@@ -1288,6 +1316,9 @@
.view_mask = (state && state->rp) ? state->rp->view_mask : 0,
.robust2_modes = robust2_modes,
.robust_descriptors = dev->vk.enabled_features.nullDescriptor,
+ /* iter13: XFB shaders must disable IDVS (matches Panfrost-Gallium). */
+ .no_idvs = (info->stage == MESA_SHADER_VERTEX) &&
+ info->nir->info.has_transform_feedback_varyings,
};
switch (info->stage) {
--- a/src/panfrost/vulkan/panvk_cmd_draw.h 2026-04-29 22:19:00.000000000 +0200
+++ b/src/panfrost/vulkan/panvk_cmd_draw.h 2026-05-20 18:52:57.748763011 +0200
@@ -135,6 +135,19 @@
struct panvk_graphics_sysvals sysvals;
#if PAN_ARCH < 9
+ /* iter13: VK_EXT_transform_feedback state (JM-class only for now). */
+ struct {
+ bool active;
+ uint32_t buffer_count;
+ struct {
+ uint64_t addr;
+ uint64_t offset;
+ uint64_t size;
+ } buffers[4];
+ } xfb;
+#endif
+
+#if PAN_ARCH < 9
struct panvk_shader_link link;
#endif
--- a/src/panfrost/vulkan/panvk_vX_cmd_draw.c 2026-04-29 22:19:00.000000000 +0200
+++ b/src/panfrost/vulkan/panvk_vX_cmd_draw.c 2026-05-20 19:10:23.031919662 +0200
@@ -10,6 +10,7 @@
#include "panvk_entrypoints.h"
#include "pan_desc.h"
+#include "pan_compiler.h" /* PAN_SHADER_OOB_ADDRESS */
#include "pan_util.h"
static void
@@ -722,6 +723,35 @@
set_gfx_sysval(cmdbuf, dirty_sysvals, vs.raw_vertex_offset,
info->vertex.raw_offset);
set_gfx_sysval(cmdbuf, dirty_sysvals, layer_id, info->layer_id);
+
+ /* iter13: VK_EXT_transform_feedback sysvals — always set (per draw),
+ * reflect bound XFB state. set_gfx_sysval is a no-op if value unchanged. */
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.num_vertices, info->vertex.count);
+ {
+ const struct panvk_cmd_graphics_state *_gfx = &cmdbuf->state.gfx;
+ /* iter13: default each XFB buffer address to PAN_SHADER_OOB_ADDRESS
+ * (= 1<<63). This is the Panfrost-Gallium memory-sink idiom — the
+ * Bifrost MMU silently discards stores to this address, so a pipeline
+ * with XFB outputs used in a non-XFB draw (or in an XFB draw with
+ * fewer bound buffers than the shader declares) is safe instead of
+ * faulting. See gallium/drivers/panfrost/pan_cmdstream.c PAN_SYSVAL_XFB. */
+ uint64_t _xa0 = PAN_SHADER_OOB_ADDRESS, _xa1 = PAN_SHADER_OOB_ADDRESS,
+ _xa2 = PAN_SHADER_OOB_ADDRESS, _xa3 = PAN_SHADER_OOB_ADDRESS;
+ if (_gfx->xfb.active) {
+ if (_gfx->xfb.buffer_count > 0 && _gfx->xfb.buffers[0].addr)
+ _xa0 = _gfx->xfb.buffers[0].addr + _gfx->xfb.buffers[0].offset;
+ if (_gfx->xfb.buffer_count > 1 && _gfx->xfb.buffers[1].addr)
+ _xa1 = _gfx->xfb.buffers[1].addr + _gfx->xfb.buffers[1].offset;
+ if (_gfx->xfb.buffer_count > 2 && _gfx->xfb.buffers[2].addr)
+ _xa2 = _gfx->xfb.buffers[2].addr + _gfx->xfb.buffers[2].offset;
+ if (_gfx->xfb.buffer_count > 3 && _gfx->xfb.buffers[3].addr)
+ _xa3 = _gfx->xfb.buffers[3].addr + _gfx->xfb.buffers[3].offset;
+ }
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.xfb_address[0], _xa0);
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.xfb_address[1], _xa1);
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.xfb_address[2], _xa2);
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.xfb_address[3], _xa3);
+ }
#endif
if (dyn_gfx_state_dirty(cmdbuf, CB_BLEND_CONSTANTS)) {
--- a/src/panfrost/vulkan/meson.build 2026-04-29 22:19:00.000000000 +0200
+++ b/src/panfrost/vulkan/meson.build 2026-05-20 18:53:04.484861338 +0200
@@ -73,6 +73,7 @@
jm_inc_dir = ['jm']
jm_files = [
'jm/panvk_vX_bind_queue.c',
+ 'jm/panvk_vX_cmd_xfb.c', # iter13
'jm/panvk_vX_cmd_buffer.c',
'jm/panvk_vX_cmd_dispatch.c',
'jm/panvk_vX_cmd_draw.c',
--- a/src/panfrost/vulkan/jm/panvk_vX_cmd_buffer.c 2026-04-29 22:19:00.000000000 +0200
+++ b/src/panfrost/vulkan/jm/panvk_vX_cmd_buffer.c 2026-05-20 19:10:26.163965149 +0200
@@ -473,5 +473,12 @@
vk_command_buffer_begin(&cmdbuf->vk, pBeginInfo);
+#if PAN_ARCH < 9
+ /* iter13: clear XFB state on Begin so a reused command buffer does not
+ * inherit stale xfb.buffer_count / xfb.active / xfb.buffers[] from a
+ * prior recording. */
+ memset(&cmdbuf->state.gfx.xfb, 0, sizeof(cmdbuf->state.gfx.xfb));
+#endif
+
return VK_SUCCESS;
}
--- a/src/panfrost/vulkan/jm/panvk_vX_cmd_xfb.c 2026-05-18 12:50:53.067999996 +0200
+++ b/src/panfrost/vulkan/jm/panvk_vX_cmd_xfb.c 2026-05-20 19:10:27.175979847 +0200
@@ -0,0 +1,111 @@
+/*
+ * Copyright © 2026 mfritsche / claude-noether
+ * SPDX-License-Identifier: MIT
+ *
+ * iter13: VK_EXT_transform_feedback command handlers for the JM
+ * architecture path (Bifrost v6/v7 + Valhall-JM v9).
+ *
+ * The runtime contract:
+ * - vkCmdBindTransformFeedbackBuffersEXT: stash (gpu_addr, offset, size)
+ * for each slot into cmdbuf->state.gfx.xfb.buffers[].
+ * - vkCmdBeginTransformFeedbackEXT: set cmdbuf->state.gfx.xfb.active = true.
+ * The next draw's set_gfx_sysval calls re-emit vs.xfb_address[] on their own.
+ * - vkCmdEndTransformFeedbackEXT: set active = false.
+ *
+ * Counter buffers (firstCounterBuffer/counterBufferCount/pCounterBuffers/
+ * pCounterBufferOffsets) are accepted by API but ignored — v1 doesn't
+ * support pause/resume. transformFeedbackDraw is advertised as false.
+ *
+ * Per-draw integration: jm/panvk_vX_cmd_draw.c reads cmdbuf->state.gfx.xfb
+ * and populates vs.xfb_address[i] for shader use. The pan_nir_lower_xfb
+ * pass in panvk_vX_shader.c emits nir_load_xfb_address(i) which lowers
+ * (via panvk_vX_shader.c sysval handler) to a load from the per-draw
+ * sysval push area.
+ */
+
+#include "vk_log.h"
+#include "util/log.h"
+
+#include "panvk_cmd_buffer.h"
+#include "panvk_cmd_draw.h"
+#include "panvk_buffer.h"
+#include "panvk_entrypoints.h"
+
+VKAPI_ATTR void VKAPI_CALL
+panvk_per_arch(CmdBindTransformFeedbackBuffersEXT)(
+ VkCommandBuffer commandBuffer,
+ uint32_t firstBinding,
+ uint32_t bindingCount,
+ const VkBuffer *pBuffers,
+ const VkDeviceSize *pOffsets,
+ const VkDeviceSize *pSizes)
+{
+ VK_FROM_HANDLE(panvk_cmd_buffer, cmdbuf, commandBuffer);
+ struct panvk_cmd_graphics_state *gfx = &cmdbuf->state.gfx;
+
+ for (uint32_t i = 0; i < bindingCount; i++) {
+ uint32_t slot = firstBinding + i;
+ if (slot >= 4)
+ continue;
+
+ VK_FROM_HANDLE(panvk_buffer, buf, pBuffers[i]);
+ gfx->xfb.buffers[slot].addr = panvk_buffer_gpu_ptr(buf, 0);
+ gfx->xfb.buffers[slot].offset = pOffsets[i];
+ gfx->xfb.buffers[slot].size =
+ (pSizes != NULL && pSizes[i] != VK_WHOLE_SIZE)
+ ? pSizes[i]
+ : (buf->vk.size - pOffsets[i]);
+ }
+
+ if (firstBinding + bindingCount > gfx->xfb.buffer_count)
+ gfx->xfb.buffer_count = firstBinding + bindingCount;
+}
+
+VKAPI_ATTR void VKAPI_CALL
+panvk_per_arch(CmdBeginTransformFeedbackEXT)(
+ VkCommandBuffer commandBuffer,
+ uint32_t firstCounterBuffer,
+ uint32_t counterBufferCount,
+ const VkBuffer *pCounterBuffers,
+ const VkDeviceSize *pCounterBufferOffsets)
+{
+ VK_FROM_HANDLE(panvk_cmd_buffer, cmdbuf, commandBuffer);
+ struct panvk_cmd_graphics_state *gfx = &cmdbuf->state.gfx;
+
+ /* Counter buffers are ignored in v1; see transformFeedbackDraw = false in
+ * VkPhysicalDeviceTransformFeedbackPropertiesEXT (panvk_vX_physical_device.c).
+ * App is spec-compliant if it does not pass counter buffers (which our
+ * features advertisement allows), but warn loudly if it does so we do not
+ * silently produce wrong capture state. */
+ (void)firstCounterBuffer;
+ (void)pCounterBufferOffsets;
+ if (counterBufferCount > 0 && pCounterBuffers != NULL) {
+ mesa_logw("panvk: CmdBeginTransformFeedbackEXT: counter buffers not "
+ "implemented (transformFeedbackDraw=false); XFB resume will "
+ "restart at buffer offset 0");
+ }
+
+ gfx->xfb.active = true;
+ /* Per-draw set_gfx_sysval picks up the change automatically — no
+ * explicit dirty marking required (set_gfx_sysval uses memcmp +
+ * BITSET to detect state diffs and re-emit sysvals). */
+}
+
+VKAPI_ATTR void VKAPI_CALL
+panvk_per_arch(CmdEndTransformFeedbackEXT)(
+ VkCommandBuffer commandBuffer,
+ uint32_t firstCounterBuffer,
+ uint32_t counterBufferCount,
+ const VkBuffer *pCounterBuffers,
+ const VkDeviceSize *pCounterBufferOffsets)
+{
+ VK_FROM_HANDLE(panvk_cmd_buffer, cmdbuf, commandBuffer);
+ struct panvk_cmd_graphics_state *gfx = &cmdbuf->state.gfx;
+
+ (void)firstCounterBuffer;
+ (void)counterBufferCount;
+ (void)pCounterBuffers;
+ (void)pCounterBufferOffsets;
+
+ gfx->xfb.active = false;
+}
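The per-draw behaviour that the patch above wires up can be checked on the host with a small model. The following is a minimal sketch, not driver code: every `model_*` name and `MODEL_SINK_ADDRESS` is hypothetical, the sink value is taken from the patch comment that gives PAN_SHADER_OOB_ADDRESS as 1<<63, and the store-address formula assumes the base + (instance * count + vertex) * stride + offset layout used by the LIST path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_XFB_SLOTS 4
/* Hypothetical stand-in for PAN_SHADER_OOB_ADDRESS (1<<63 per the patch comment). */
#define MODEL_SINK_ADDRESS (UINT64_C(1) << 63)

/* Mirrors the (addr, offset, size) triple stashed per binding slot. */
struct model_xfb_buffer {
   uint64_t addr;
   uint64_t offset;
   uint64_t size;
};

/* Mirrors the xfb sub-struct added to the graphics state. */
struct model_xfb_state {
   bool active;
   uint32_t buffer_count;
   struct model_xfb_buffer buffers[MODEL_XFB_SLOTS];
};

/* Address a shader would see for one slot: the start of the bound range
 * when XFB is active and the slot is bound, otherwise the memory sink. */
static uint64_t
model_xfb_slot_address(const struct model_xfb_state *xfb, uint32_t slot)
{
   assert(slot < MODEL_XFB_SLOTS);
   if (xfb->active && slot < xfb->buffer_count && xfb->buffers[slot].addr != 0)
      return xfb->buffers[slot].addr + xfb->buffers[slot].offset;
   return MODEL_SINK_ADDRESS;
}

/* Byte address of one captured output on the LIST path. The byte offset is
 * computed in 32 bits and then widened, as a shader-side u2u64 would do. */
static uint64_t
model_list_store_address(uint64_t base, uint32_t instance_id,
                         uint32_t verts_per_instance, uint32_t vertex_id,
                         uint16_t stride, uint16_t offset_bytes)
{
   uint32_t out_idx = instance_id * verts_per_instance + vertex_id;
   uint32_t byte_off = out_idx * stride + offset_bytes;
   return base + (uint64_t)byte_off;
}
```

Calling `model_xfb_slot_address` before and after flipping `active` shows why a pipeline with XFB outputs stays safe in a non-XFB draw: every slot resolves to the sink until Begin is recorded, and unbound slots keep resolving to it afterwards.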
@@ -0,0 +1,629 @@
diff -urN a/src/panfrost/vulkan/meson.build b/src/panfrost/vulkan/meson.build
--- a/src/panfrost/vulkan/meson.build 2026-05-21 14:04:02.529474145 +0200
+++ b/src/panfrost/vulkan/meson.build 2026-05-21 14:04:04.106755486 +0200
@@ -123,6 +123,7 @@
'panvk_vX_nir_lower_input_attachment_loads.c',
'panvk_vX_sampler.c',
'panvk_vX_shader.c',
+ 'panvk_vX_xfb_lower.c',
sha1_h,
]
diff -urN a/src/panfrost/vulkan/panvk_shader.h b/src/panfrost/vulkan/panvk_shader.h
--- a/src/panfrost/vulkan/panvk_shader.h 2026-05-21 14:04:02.525251986 +0200
+++ b/src/panfrost/vulkan/panvk_shader.h 2026-05-21 14:04:04.084251800 +0200
@@ -154,6 +154,8 @@
/* aligned_u64 attribute below inserts the 4-byte alignment gap
* after num_vertices automatically — no explicit pad needed. */
aligned_u64 xfb_address[4]; /* iter13: 4 transform feedback buffer base addresses */
+ uint32_t xfb_topology; /* iter17: panvk_xfb_topology enum value */
+ uint32_t xfb_output_count; /* iter17: per-instance output verts after decomp */
#endif
int32_t first_vertex;
int32_t base_instance;
@@ -569,4 +571,76 @@
struct pan_compute_dim local_size, const void *bin_ptr, size_t bin_size,
struct panvk_shader **shader_out);
+
+#if PAN_ARCH < 9
+/* iter17: encoding for vs.xfb_topology sysval. Maps VkPrimitiveTopology values
+ * we need to distinguish at shader runtime for XFB capture. LIST topologies
+ * use the iter13 single-store fast path; non-LIST need per-vertex decomposition. */
+enum panvk_xfb_topology {
+ PANVK_XFB_TOPO_LIST = 0,
+ PANVK_XFB_TOPO_LINE_STRIP = 1,
+ PANVK_XFB_TOPO_TRI_STRIP = 2,
+ PANVK_XFB_TOPO_TRI_FAN = 3,
+ PANVK_XFB_TOPO_LINE_LIST_ADJ = 4,
+ PANVK_XFB_TOPO_LINE_STRIP_ADJ = 5,
+ PANVK_XFB_TOPO_TRI_LIST_ADJ = 6,
+ PANVK_XFB_TOPO_TRI_STRIP_ADJ = 7,
+};
+
+#include "panvk_macros.h"
+struct nir_shader;
+bool panvk_per_arch(nir_lower_xfb)(struct nir_shader *nir);
+
+/* Map VkPrimitiveTopology to panvk_xfb_topology enum (driver-side helper). */
+static inline uint32_t
+panvk_vk_topology_to_xfb_enum(VkPrimitiveTopology topo)
+{
+ switch (topo) {
+ case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP:
+ return PANVK_XFB_TOPO_LINE_STRIP;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP:
+ return PANVK_XFB_TOPO_TRI_STRIP;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN:
+ return PANVK_XFB_TOPO_TRI_FAN;
+ case VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY:
+ return PANVK_XFB_TOPO_LINE_LIST_ADJ;
+ case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY:
+ return PANVK_XFB_TOPO_LINE_STRIP_ADJ;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY:
+ return PANVK_XFB_TOPO_TRI_LIST_ADJ;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY:
+ return PANVK_XFB_TOPO_TRI_STRIP_ADJ;
+ case VK_PRIMITIVE_TOPOLOGY_POINT_LIST:
+ case VK_PRIMITIVE_TOPOLOGY_LINE_LIST:
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST:
+ default:
+ return PANVK_XFB_TOPO_LIST;
+ }
+}
+
+/* Compute the per-instance output vertex count for a given (topology, input count). */
+static inline uint32_t
+panvk_xfb_output_count(VkPrimitiveTopology topo, uint32_t input_count)
+{
+ switch (topo) {
+ case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP:
+ return input_count >= 1 ? 2u * (input_count - 1u) : 0u;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP:
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN:
+ return input_count >= 2 ? 3u * (input_count - 2u) : 0u;
+ case VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY:
+ return (input_count / 4u) * 2u;
+ case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY:
+ return input_count >= 3 ? 2u * (input_count - 3u) : 0u;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY:
+ return (input_count / 6u) * 3u;
+ case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY:
+ return input_count >= 6 ? 3u * (input_count / 2u - 2u) : 0u;
+ default:
+ return input_count; /* LIST topologies: 1:1 mapping */
+ }
+}
+#endif
+
+
#endif
diff -urN a/src/panfrost/vulkan/panvk_vX_cmd_draw.c b/src/panfrost/vulkan/panvk_vX_cmd_draw.c
--- a/src/panfrost/vulkan/panvk_vX_cmd_draw.c 2026-05-21 14:04:02.528576354 +0200
+++ b/src/panfrost/vulkan/panvk_vX_cmd_draw.c 2026-05-21 14:04:04.091357598 +0200
@@ -727,6 +727,20 @@
/* iter13: VK_EXT_transform_feedback sysvals — always set (per draw),
* reflect bound XFB state. set_gfx_sysval is a no-op if value unchanged. */
set_gfx_sysval(cmdbuf, dirty_sysvals, vs.num_vertices, info->vertex.count);
+
+ /* iter17: XFB primitive-decomposition sysvals.
+ * xfb_topology = enum value for the current bound topology.
+ * xfb_output_count = per-instance output vertex count after decomposition.
+ * For LIST topologies, output_count == input vertex count and the shader
+ * takes the iter13 single-store fast path. */
+ {
+ VkPrimitiveTopology vk_topo =
+ cmdbuf->vk.dynamic_graphics_state.ia.primitive_topology;
+ uint32_t topo_enum = panvk_vk_topology_to_xfb_enum(vk_topo);
+ uint32_t out_count = panvk_xfb_output_count(vk_topo, info->vertex.count);
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.xfb_topology, topo_enum);
+ set_gfx_sysval(cmdbuf, dirty_sysvals, vs.xfb_output_count, out_count);
+ }
{
const struct panvk_cmd_graphics_state *_gfx = &cmdbuf->state.gfx;
/* iter13: default each XFB buffer address to PAN_SHADER_OOB_ADDRESS
diff -urN a/src/panfrost/vulkan/panvk_vX_shader.c b/src/panfrost/vulkan/panvk_vX_shader.c
--- a/src/panfrost/vulkan/panvk_vX_shader.c 2026-05-21 14:04:02.527576494 +0200
+++ b/src/panfrost/vulkan/panvk_vX_shader.c 2026-05-21 14:04:04.098356619 +0200
@@ -895,7 +895,10 @@
nir->info.has_transform_feedback_varyings) {
NIR_PASS(_, nir, nir_opt_constant_folding);
NIR_PASS(_, nir, nir_io_add_intrinsic_xfb_info);
- NIR_PASS(_, nir, pan_nir_lower_xfb);
+ /* iter17: panvk-specific replacement for pan_nir_lower_xfb that handles
+ * primitive decomposition for non-LIST topologies. Single-store LIST
+ * fast path matches iter13 behavior. */
+ NIR_PASS(_, nir, panvk_per_arch(nir_lower_xfb));
}
#endif
}
diff -urN a/src/panfrost/vulkan/panvk_vX_xfb_lower.c b/src/panfrost/vulkan/panvk_vX_xfb_lower.c
--- a/src/panfrost/vulkan/panvk_vX_xfb_lower.c 1970-01-01 01:00:00.000000000 +0100
+++ b/src/panfrost/vulkan/panvk_vX_xfb_lower.c 2026-05-21 14:04:04.115354242 +0200
@@ -0,0 +1,486 @@
+/*
+ * Copyright © 2026 mfritsche / claude-noether
+ * SPDX-License-Identifier: MIT
+ *
+ * iter17: panvk-specific replacement for pan_nir_lower_xfb that handles
+ * primitive decomposition for transform_feedback on non-LIST topologies
+ * (TRIANGLE_STRIP/FAN, LINE_STRIP, *_WITH_ADJACENCY).
+ *
+ * Approach: emit a topology dispatch at the start of each store_output
+ * lowering. The shader reads vs.xfb_topology sysval at runtime and branches
+ * into per-topology emission logic. For each affected topology, the lowered
+ * code emits guarded conditional stores — one per primitive this vertex
+ * contributes to, computing the output buffer position via primitive index
+ * and slot within the decomposed primitive.
+ *
+ * For LIST topologies (POINT/LINE/TRIANGLE LIST), takes a fast path that
+ * matches iter13's single-store behavior.
+ *
+ * For TRIANGLE_FAN, the central vertex (v=0) contributes to ALL primitives
+ * as slot 2 — handled via a NIR loop bounded by num_vertices.
+ *
+ * See ~/src/panvk-bifrost/iter17/phase{0,1,2}_*.md for full design context.
+ */
+
+#include "panvk_macros.h"
+
+#if PAN_ARCH < 9
+
+#include "panvk_shader.h"
+
+#include "compiler/nir/nir_builder.h"
+#include "pan_nir.h"
+
+#include <vulkan/vulkan_core.h>
+
+/* ----- Address arithmetic ----- */
+
+static nir_def *
+xfb_store_addr(nir_builder *b, nir_def *buf, nir_def *out_idx,
+ uint16_t stride, uint16_t offset_bytes)
+{
+ nir_def *byte_off = nir_iadd_imm(b,
+ nir_imul_imm(b, out_idx, stride), offset_bytes);
+ return nir_iadd(b, buf, nir_u2u64(b, byte_off));
+}
+
+static void
+emit_list_store(nir_builder *b, nir_def *buf, nir_def *output_count,
+ nir_def *instance_id, nir_def *raw_vid, nir_def *value,
+ uint16_t stride, uint16_t offset_bytes)
+{
+ nir_def *out_idx = nir_iadd(b,
+ nir_imul(b, instance_id, output_count), raw_vid);
+ nir_def *addr = xfb_store_addr(b, buf, out_idx, stride, offset_bytes);
+ nir_store_global(b, value, addr);
+}
+
+static void
+emit_prim_store(nir_builder *b, nir_def *buf, nir_def *output_count,
+ nir_def *instance_id, nir_def *eligible,
+ nir_def *prim_idx, nir_def *slot,
+ uint32_t verts_per_prim,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ nir_push_if(b, eligible);
+ {
+ nir_def *out_idx = nir_iadd(b,
+ nir_imul(b, instance_id, output_count),
+ nir_iadd(b, nir_imul_imm(b, prim_idx, verts_per_prim), slot));
+ nir_def *addr = xfb_store_addr(b, buf, out_idx, stride, offset_bytes);
+ nir_store_global(b, value, addr);
+ }
+ nir_pop_if(b, NULL);
+}
+
+/* ----- Per-topology emission ----- */
+
+/* TRIANGLE_STRIP: vertex v contributes to prims v, v-1, v-2 (per eligibility). */
+static void
+emit_tri_strip(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ nir_def *Nm2 = nir_iadd_imm(b, N, -2);
+ nir_def *Nm1 = nir_iadd_imm(b, N, -1);
+
+ /* Prim v, slot 0: v < N-2 */
+ emit_prim_store(b, buf, output_count, instance_id,
+ nir_ult(b, v, Nm2),
+ v, nir_imm_int(b, 0), 3, value, stride, offset_bytes);
+
+ /* Prim v-1, slot = 1 if prim even else 2: 1 <= v < N-1 */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -1);
+ nir_def *parity = nir_iand_imm(b, prim, 1u);
+ nir_def *slot = nir_iadd_imm(b, parity, 1);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 1)),
+ nir_ult(b, v, Nm1));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, slot, 3, value, stride, offset_bytes);
+ }
+
+ /* Prim v-2, slot = 2 if prim even else 1: 2 <= v < N */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -2);
+ nir_def *parity = nir_iand_imm(b, prim, 1u);
+ nir_def *slot = nir_isub(b, nir_imm_int(b, 2), parity);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 2)),
+ nir_ult(b, v, N));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, slot, 3, value, stride, offset_bytes);
+ }
+}
+
+/* LINE_STRIP: vertex v contributes to prim v slot 0 + prim v-1 slot 1. */
+static void
+emit_line_strip(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ nir_def *Nm1 = nir_iadd_imm(b, N, -1);
+
+ /* Prim v, slot 0: v < N-1 */
+ emit_prim_store(b, buf, output_count, instance_id,
+ nir_ult(b, v, Nm1),
+ v, nir_imm_int(b, 0), 2, value, stride, offset_bytes);
+
+ /* Prim v-1, slot 1: 1 <= v < N */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -1);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 1)),
+ nir_ult(b, v, N));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, nir_imm_int(b, 1), 2, value, stride, offset_bytes);
+ }
+}
+
+/* TRIANGLE_FAN: prim p emits {p+1, p+2, 0}.
+ * vertex v=0: contributes to ALL prims as slot 2 (loop required)
+ * vertex v>=1: contributes to prim v-1 as slot 0 (if 1 <= v <= N-2)
+ * vertex v>=2: contributes to prim v-2 as slot 1 (if 2 <= v <= N-1)
+ */
+static void
+emit_tri_fan(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ nir_def *Nm1 = nir_iadd_imm(b, N, -1);
+ nir_def *Nm2 = nir_iadd_imm(b, N, -2);
+
+ /* Prim v-1, slot 0: 1 <= v < N-1 */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -1);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 1)),
+ nir_ult(b, v, Nm1));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, nir_imm_int(b, 0), 3, value, stride, offset_bytes);
+ }
+
+ /* Prim v-2, slot 1: 2 <= v < N */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -2);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 2)),
+ nir_ult(b, v, N));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, nir_imm_int(b, 1), 3, value, stride, offset_bytes);
+ }
+
+ /* Central vertex (v == 0): loop over all prims, write to slot 2. */
+ nir_push_if(b, nir_ieq_imm(b, v, 0));
+ {
+ nir_variable *p_var = nir_local_variable_create(b->impl,
+ glsl_uint_type(), "fan_p");
+ nir_store_var(b, p_var, nir_imm_int(b, 0), 0x1);
+ nir_push_loop(b);
+ {
+ nir_def *p = nir_load_var(b, p_var);
+ nir_push_if(b, nir_uge(b, p, Nm2));
+ {
+ nir_jump(b, nir_jump_break);
+ }
+ nir_pop_if(b, NULL);
+
+ nir_def *out_idx = nir_iadd(b,
+ nir_imul(b, instance_id, output_count),
+ nir_iadd_imm(b, nir_imul_imm(b, p, 3), 2));
+ nir_def *addr = xfb_store_addr(b, buf, out_idx, stride, offset_bytes);
+ nir_store_global(b, value, addr);
+
+ nir_store_var(b, p_var, nir_iadd_imm(b, p, 1), 0x1);
+ }
+ nir_pop_loop(b, NULL);
+ }
+ nir_pop_if(b, NULL);
+}
+
+/* LINE_LIST_WITH_ADJACENCY: 4-vertex groups [4i..4i+3]; output {4i+1, 4i+2}.
+ * v contributes if v%4 == 1: prim v/4 slot 0
+ * v contributes if v%4 == 2: prim v/4 slot 1
+ */
+static void
+emit_line_list_adj(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ (void)N; /* eligibility is mod-based, not range-based */
+ nir_def *vmod4 = nir_iand_imm(b, v, 3u);
+ nir_def *prim = nir_ushr_imm(b, v, 2); /* v / 4 */
+
+ emit_prim_store(b, buf, output_count, instance_id,
+ nir_ieq_imm(b, vmod4, 1),
+ prim, nir_imm_int(b, 0), 2, value, stride, offset_bytes);
+
+ emit_prim_store(b, buf, output_count, instance_id,
+ nir_ieq_imm(b, vmod4, 2),
+ prim, nir_imm_int(b, 1), 2, value, stride, offset_bytes);
+}
+
+/* LINE_STRIP_WITH_ADJACENCY: N-3 prims; prim p emits {p+1, p+2}.
+ * v contributes to prim v-1 slot 0 (1 <= v <= N-3)
+ * v contributes to prim v-2 slot 1 (2 <= v <= N-2)
+ */
+static void
+emit_line_strip_adj(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ nir_def *Nm1 = nir_iadd_imm(b, N, -1);
+ nir_def *Nm2 = nir_iadd_imm(b, N, -2);
+
+ /* Prim v-1, slot 0: 1 <= v <= N-3 ⇔ v >= 1 AND v < N-2 */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -1);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 1)),
+ nir_ult(b, v, Nm2));
+ /* Vertices 0 and N-1 are adjacency-only and are never captured. */
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, nir_imm_int(b, 0), 2, value, stride, offset_bytes);
+ }
+
+ /* Prim v-2, slot 1: 2 <= v <= N-2 ⇔ v >= 2 AND v < N-1 */
+ {
+ nir_def *prim = nir_iadd_imm(b, v, -2);
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 2)),
+ nir_ult(b, v, Nm1));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, nir_imm_int(b, 1), 2, value, stride, offset_bytes);
+ }
+}
+
+/* TRIANGLE_LIST_WITH_ADJACENCY: 6-vertex groups; output {6i, 6i+2, 6i+4}.
+ * v contributes if v%6 == 0: prim v/6 slot 0
+ * v contributes if v%6 == 2: prim v/6 slot 1
+ * v contributes if v%6 == 4: prim v/6 slot 2
+ */
+static void
+emit_tri_list_adj(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ (void)N;
+ nir_def *vmod6 = nir_umod_imm(b, v, 6);
+ nir_def *prim = nir_udiv_imm(b, v, 6);
+
+ for (uint32_t slot = 0; slot < 3; slot++) {
+ emit_prim_store(b, buf, output_count, instance_id,
+ nir_ieq_imm(b, vmod6, slot * 2),
+ prim, nir_imm_int(b, slot), 3, value, stride, offset_bytes);
+ }
+}
+
+/* TRIANGLE_STRIP_WITH_ADJACENCY: prim i emits:
+ * even i: {2i, 2i+2, 2i+4} (slots 0, 1, 2 ← input indices 2i, 2i+2, 2i+4)
+ * odd i: {2i, 2i+4, 2i+2} (slots 0, 1, 2 ← input indices 2i, 2i+4, 2i+2)
+ *
+ * Only EVEN input vertices contribute (since all output indices are 2*something).
+ * For even input v:
+ * prim v/2 slot 0 (always, if v/2 < N/2-2)
+ * prim (v-2)/2 slot 1 if (v-2)/2 even, slot 2 if odd (when v >= 2)
+ * prim (v-4)/2 slot 2 if (v-4)/2 even, slot 1 if odd (when v >= 4)
+ */
+static void
+emit_tri_strip_adj(nir_builder *b, nir_def *v, nir_def *N,
+ nir_def *buf, nir_def *output_count, nir_def *instance_id,
+ nir_def *value, uint16_t stride, uint16_t offset_bytes)
+{
+ /* Bail for odd input vertices — they never contribute. */
+ nir_def *v_is_even = nir_ieq_imm(b, nir_iand_imm(b, v, 1u), 0);
+ nir_push_if(b, v_is_even);
+ {
+ nir_def *N_half = nir_ushr_imm(b, N, 1);
+ nir_def *max_prim = nir_iadd_imm(b, N_half, -2); /* N/2 - 2 */
+ nir_def *v_half = nir_ushr_imm(b, v, 1);
+
+ /* Prim v/2 slot 0: v/2 < N/2 - 2 */
+ emit_prim_store(b, buf, output_count, instance_id,
+ nir_ult(b, v_half, max_prim),
+ v_half, nir_imm_int(b, 0), 3, value, stride, offset_bytes);
+
+ /* Prim (v-2)/2 = v/2 - 1: v >= 2 AND prim < N/2-2 */
+ {
+ nir_def *prim = nir_iadd_imm(b, v_half, -1);
+ nir_def *parity = nir_iand_imm(b, prim, 1u);
+ nir_def *slot = nir_iadd_imm(b, parity, 1); /* even→1, odd→2 */
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 2)),
+ nir_ult(b, prim, max_prim));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, slot, 3, value, stride, offset_bytes);
+ }
+
+ /* Prim (v-4)/2 = v/2 - 2: v >= 4 AND prim < N/2-2 */
+ {
+ nir_def *prim = nir_iadd_imm(b, v_half, -2);
+ nir_def *parity = nir_iand_imm(b, prim, 1u);
+ nir_def *slot = nir_isub(b, nir_imm_int(b, 2), parity); /* even→2, odd→1 */
+ nir_def *eligible = nir_iand(b,
+ nir_uge(b, v, nir_imm_int(b, 4)),
+ nir_ult(b, prim, max_prim));
+ emit_prim_store(b, buf, output_count, instance_id, eligible,
+ prim, slot, 3, value, stride, offset_bytes);
+ }
+ }
+ nir_pop_if(b, NULL);
+}
+
+/* ----- Main lowering: per store_output XFB channel ----- */
+
+static void
+lower_xfb_output_iter17(nir_builder *b, nir_intrinsic_instr *intr,
+ unsigned channel_idx, unsigned num_components,
+ unsigned buffer, unsigned offset_words)
+{
+ assert(buffer < MAX_XFB_BUFFERS);
+ assert(nir_intrinsic_component(intr) == 0);
+
+ uint16_t stride = b->shader->info.xfb_stride[buffer] * 4;
+ assert(stride != 0);
+ uint16_t offset_bytes = offset_words * 4;
+
+ BITSET_SET(b->shader->info.system_values_read, SYSTEM_VALUE_VERTEX_ID_ZERO_BASE);
+ BITSET_SET(b->shader->info.system_values_read, SYSTEM_VALUE_INSTANCE_ID);
+
+ nir_def *topology = load_sysval(b, graphics, 32, vs.xfb_topology);
+ nir_def *out_count = load_sysval(b, graphics, 32, vs.xfb_output_count);
+ nir_def *N = nir_load_num_vertices(b);
+ nir_def *v = nir_load_raw_vertex_id_pan(b);
+ nir_def *instance = nir_load_instance_id(b);
+ nir_def *buf = nir_load_xfb_address(b, 64, .base = buffer);
+
+ nir_def *src = intr->src[0].ssa;
+ nir_component_mask_t mask = nir_component_mask(num_components);
+ nir_def *value = nir_channels(b, src, mask << channel_idx);
+
+ /* Topology dispatch ladder. LIST first (fast path). */
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_LIST));
+ {
+ emit_list_store(b, buf, out_count, instance, v, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ /* iter17 Janet Finding 3: gate all non-LIST emission on
+ * output_count > 0. For degenerate input counts (N < min required
+ * for the topology), output_count is 0 and we must emit NO stores
+ * — otherwise N-2 / N-3 / etc. arithmetic underflows in the
+ * eligibility predicates and we falsely fire stores. */
+ nir_push_if(b, nir_ult(b, nir_imm_int(b, 0), out_count));
+ {
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_TRI_STRIP));
+ {
+ emit_tri_strip(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_LINE_STRIP));
+ {
+ emit_line_strip(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_TRI_FAN));
+ {
+ emit_tri_fan(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_LINE_LIST_ADJ));
+ {
+ emit_line_list_adj(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_LINE_STRIP_ADJ));
+ {
+ emit_line_strip_adj(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_TRI_LIST_ADJ));
+ {
+ emit_tri_list_adj(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_push_else(b, NULL);
+ {
+ /* TRI_STRIP_ADJ — last case */
+ emit_tri_strip_adj(b, v, N, buf, out_count, instance, value,
+ stride, offset_bytes);
+ }
+ nir_pop_if(b, NULL);
+ }
+ nir_pop_if(b, NULL);
+ }
+ nir_pop_if(b, NULL);
+ }
+ nir_pop_if(b, NULL);
+ }
+ nir_pop_if(b, NULL);
+ }
+ nir_pop_if(b, NULL);
+ }
+ nir_pop_if(b, NULL); /* Janet Finding 3: close output_count > 0 guard */
+ }
+ nir_pop_if(b, NULL);
+}
+
+/* Mirror of pan_nir_lower_xfb's lower_xfb: load_vertex_id rewrite +
+ * dispatch store_output through our topology-aware emission. */
+static bool
+lower_xfb_iter17(nir_builder *b, nir_intrinsic_instr *intr,
+ UNUSED void *data)
+{
+ if (intr->intrinsic == nir_intrinsic_load_vertex_id) {
+ b->cursor = nir_instr_remove(&intr->instr);
+ nir_def *repl = nir_iadd(b, nir_load_raw_vertex_id_pan(b),
+ nir_load_raw_vertex_offset_pan(b));
+ nir_def_rewrite_uses(&intr->def, repl);
+ return true;
+ }
+
+ if (intr->intrinsic != nir_intrinsic_store_output)
+ return false;
+
+ bool progress = false;
+ b->cursor = nir_before_instr(&intr->instr);
+
+ /* io_xfb has only out[0,1]; the other 2 channels are in io_xfb2.
+ * Outer loop selects which annotation; inner picks which channel. */
+ for (unsigned i = 0; i < 2; ++i) {
+ nir_io_xfb xfb = i ? nir_intrinsic_io_xfb2(intr)
+ : nir_intrinsic_io_xfb(intr);
+ for (unsigned j = 0; j < 2; ++j) {
+ if (!xfb.out[j].num_components)
+ continue;
+ lower_xfb_output_iter17(b, intr, i * 2 + j, xfb.out[j].num_components,
+ xfb.out[j].buffer, xfb.out[j].offset);
+ progress = true;
+ }
+ }
+
+ if (progress)
+ nir_instr_remove(&intr->instr);
+ return progress;
+}
+
+bool
+panvk_per_arch(nir_lower_xfb)(nir_shader *nir)
+{
+ return nir_shader_intrinsics_pass(
+ nir, lower_xfb_iter17, nir_metadata_control_flow, NULL);
+}
+
+#endif /* PAN_ARCH < 9 */
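The `output_count > 0` guard in `lower_xfb_output_iter17` exists because the strip and fan eligibility predicates subtract a constant from an unsigned vertex count. The following is a minimal sketch in plain C rather than NIR, assuming the usual "n - 2 triangles per strip" rule; every helper name here is hypothetical and none of them appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: triangles produced by a strip of n vertices.
 * Returning 0 for n < 3 plays the role of output_count == 0. */
static uint32_t tri_strip_prim_count(uint32_t n)
{
    return n >= 3 ? n - 2u : 0u;
}

/* Unguarded eligibility test: "vertex v starts a triangle".
 * With n < 2 the unsigned subtraction wraps to a huge value,
 * so the comparison is true for almost every v. */
static bool starts_triangle_unguarded(uint32_t v, uint32_t n)
{
    return v < n - 2u;
}

/* Guarded form: skip the predicate entirely when nothing is emitted. */
static bool starts_triangle_guarded(uint32_t v, uint32_t n)
{
    if (tri_strip_prim_count(n) == 0)
        return false;
    return v < n - 2u;
}

/* Usage: a degenerate one-vertex draw must not store anything. */
static void check_strip_guard(void)
{
    assert(starts_triangle_unguarded(0, 1));  /* false positive from wraparound */
    assert(!starts_triangle_guarded(0, 1));   /* the guard suppresses it */
    assert(starts_triangle_guarded(2, 5));    /* last triangle of a 5-vertex strip */
    assert(!starts_triangle_guarded(3, 5));
}
```

Hoisting a single check in front of the whole non-LIST ladder, as the patch does, avoids repeating the same test inside each per-topology emitter.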
+29 -1
@@ -30,7 +30,7 @@
 pkgname=mesa-panvk-bifrost
 _mesaver=26.0.6
-pkgver=26.0.6.r2
+pkgver=26.0.6.r4
 pkgrel=1
 pkgdesc="Patched Mesa libvulkan_panfrost.so exposing Bifrost-gen Mali to Vulkan apps (panvk-bifrost campaign)"
 arch=('aarch64')
@@ -79,6 +79,8 @@ source=(
 "https://archive.mesa3d.org/mesa-${_mesaver}.tar.xz"
 "0001-panvk-expose-robustness2-nullDescriptor-bifrost.patch"
 "0002-panvk-expose-vulkan-1.1-1.2-on-bifrost.patch"
+"0003-panvk-bifrost-vk-ext-transform-feedback.patch"
+"0004-panvk-bifrost-xfb-primitive-decomposition.patch"
 "brave-vulkan"
 "icd.json"
 )
@@ -88,6 +90,8 @@ sha256sums=(
 'SKIP'
 'SKIP'
 'SKIP'
+'SKIP'
+'SKIP'
 )
 prepare() {
@@ -107,12 +111,36 @@ prepare() {
 sed -i 's|bool has_vk1_1 = PAN_ARCH >= 10;|bool has_vk1_1 = true;|' src/panfrost/vulkan/panvk_vX_physical_device.c
 sed -i 's|bool has_vk1_2 = PAN_ARCH >= 10;|bool has_vk1_2 = true;|' src/panfrost/vulkan/panvk_vX_physical_device.c
+# iter13: VK_EXT_transform_feedback implementation for Bifrost (PAN_ARCH<9).
+# Applied as a real unified-diff patch — the change is too large for sed.
+# Phase-doc context: ~/src/panvk-bifrost/phase{4,5,6}_iter13_close.md.
+# Unlocks ANGLE-Vulkan → GLES3 → WebGL2 / WebGPU on Brave (chrome://gpu
+# reports "Hardware accelerated" across the board for the affected paths).
+patch -p1 < "${srcdir}/0003-panvk-bifrost-vk-ext-transform-feedback.patch"
+# iter17: XFB primitive decomposition for non-LIST topologies (TRI_STRIP,
+# TRI_FAN, LINE_STRIP, *_WITH_ADJACENCY). Replacement panvk-specific
+# NIR pass (panvk_per_arch(nir_lower_xfb)) substituted for upstream
+# pan_nir_lower_xfb. Closes the 162 dEQP-VK winding_* failures from
+# iter15 (958 P / 81 F / 0 Crash on full XFB CTS — remaining 81 fails
+# are by-design resume_* tests, transformFeedbackDraw=false).
+# Phase-doc context: ~/src/panvk-bifrost/iter17/phase{0,1,2,4,5,6,8}_*.md.
+patch -p1 < "${srcdir}/0004-panvk-bifrost-xfb-primitive-decomposition.patch"
 # Sanity-check the patches landed.
 grep -q "KHR_robustness2 = true," src/panfrost/vulkan/panvk_vX_physical_device.c
 grep -q "EXT_robustness2 = true," src/panfrost/vulkan/panvk_vX_physical_device.c
 grep -q "nullDescriptor = true," src/panfrost/vulkan/panvk_vX_physical_device.c
 grep -q "has_vk1_1 = true;" src/panfrost/vulkan/panvk_vX_physical_device.c
 grep -q "has_vk1_2 = true;" src/panfrost/vulkan/panvk_vX_physical_device.c
+# iter13 sanity:
+grep -q "EXT_transform_feedback = PAN_ARCH < 9," src/panfrost/vulkan/panvk_vX_physical_device.c
+test -f src/panfrost/vulkan/jm/panvk_vX_cmd_xfb.c
+# iter17 sanity: pan_nir_lower_xfb call site has been replaced; new file present.
+grep -q "panvk_per_arch(nir_lower_xfb)" src/panfrost/vulkan/panvk_vX_shader.c
+grep -q "xfb_topology" src/panfrost/vulkan/panvk_shader.h
+grep -q "panvk_xfb_topology" src/panfrost/vulkan/panvk_shader.h
+test -f src/panfrost/vulkan/panvk_vX_xfb_lower.c
 }
 build() {
+49 -23
@@ -14,9 +14,9 @@
 # Sibling userspace package: ../daedalus-v4l2/build-deb.sh
 set -euo pipefail
-UPSTREAM_COMMIT=3dd0eb070a75893f78368ce819b9e9ebf08c124d
-PKGVER=0.1.0+r20+g3dd0eb0
-PKGREL=1 # reset for new upstream pin (3dd0eb0 — DAEMON-PPS H.264 SPS/PPS NAL synth)
+UPSTREAM_COMMIT=5d8b4369e58ab947d1c56b1f718293c57c6065b5
+PKGVER=0.1.0+r33+g5d8b436
+PKGREL=1 # reset for new upstream pin (5d8b436 — revert parking design); still carries the #64 multi-kernel postinst fix
 MODULE_NAME=daedalus_v4l2
 HERE=$(dirname "$(readlink -f "$0")")
@@ -78,7 +78,6 @@ set -e
 NAME=${MODULE_NAME}
 VERSION=${PKGVER}
-KERNELVER=\$(uname -r)
 # Yellow + bold ANSI for the warning so it stands out in apt's
 # stream of "Setting up" lines. Disable colour on non-TTY.
@@ -101,29 +100,56 @@ if [ "\$1" = "configure" ]; then
 dkms add "\$NAME/\$VERSION" 2>/dev/null || true
-# Don't let autoinstall failure mask the actual problem behind '|| true'.
-# Run it, capture the result, then verify post-condition.
-autoinstall_rc=0
-dkms autoinstall "\$NAME/\$VERSION" || autoinstall_rc=\$?
+# Enumerate every kernel whose headers are actually present
+# (/lib/modules/<kver>/build resolves to a directory). We iterate
+# all of them — not just \$(uname -r) — so that installing this
+# package after a kernel update covers the newly-installed kernel
+# too, and so that a later kernel-headers install for a previously
+# uncovered version gets picked up on dpkg-reconfigure. Without
+# this, autoinstall (which targets only the running kernel) leaves
+# /dev/daedalus-v4l2 absent after a kernel switch + reboot
+# (marfrit/marfrit-packages#64).
+kvers=''
+for d in /lib/modules/*/build; do
+[ -d "\$d" ] || continue
+k=\$(basename "\$(dirname "\$d")")
+kvers="\$kvers \$k"
+done
-# Verify the module actually built + installed for the running kernel.
-status=\$(dkms status -m "\$NAME" -v "\$VERSION" -k "\$KERNELVER" 2>/dev/null || true)
-if ! printf '%s\\n' "\$status" | grep -q -E 'installed|loaded'; then
+if [ -z "\$kvers" ]; then
 warn ""
-warn "DKMS build did NOT land for kernel \$KERNELVER."
-warn " dkms status -m \$NAME -v \$VERSION -k \$KERNELVER:"
-warn " \$(printf '%s' "\$status" | head -1)"
-warn ""
-warn "Most likely cause: kernel headers package is missing."
-warn " Raspberry Pi OS / Pi 5: apt install linux-headers-rpi-2712"
-warn " Debian generic: apt install linux-headers-\$KERNELVER"
-warn ""
-warn "After installing headers, finish the install with:"
+warn "No kernels with headers found under /lib/modules/*/build."
+warn "Install kernel headers (e.g. linux-headers-rpi-2712 on Pi OS)"
+warn "then finish with:"
 warn " sudo dkms autoinstall \$NAME/\$VERSION"
-warn " sudo modprobe daedalus_v4l2"
+exit 0
+fi
+failed=''
+for k in \$kvers; do
+dkms autoinstall -k "\$k" "\$NAME/\$VERSION" >/dev/null 2>&1 || true
+s=\$(dkms status -m "\$NAME" -v "\$VERSION" -k "\$k" 2>/dev/null || true)
+if ! printf '%s\\n' "\$s" | grep -q -E 'installed|loaded'; then
+failed="\$failed \$k"
+fi
+done
+if [ -n "\$failed" ]; then
 warn ""
-warn "Until then daedalus_v4l2 will NOT be loadable and the"
-warn "userspace daedalus-v4l2 daemon will have nothing to talk to."
+warn "DKMS build did NOT land for kernel(s):\$failed"
+warn ""
+warn "Most likely cause: kernel headers missing for those versions."
+warn " Raspberry Pi OS / Pi 5: apt install linux-headers-rpi-2712"
+warn " Debian generic: apt install linux-headers-<version>"
+warn ""
+warn "After installing headers, finish with:"
+for k in \$failed; do
+warn " sudo dkms autoinstall -k \$k \$NAME/\$VERSION"
+done
+warn " sudo modprobe daedalus_v4l2 (after booting that kernel)"
+warn ""
+warn "Until then daedalus_v4l2 will NOT be loadable on those kernels"
+warn "and the userspace daedalus-v4l2 daemon will have nothing to talk to."
 fi
 fi
+96
@@ -1,3 +1,99 @@
daedalus-v4l2-dkms (0.1.0+r33+g5d8b436-1) bookworm trixie; urgency=medium
* Bump to 5d8b436 — reverts daedalus-v4l2 PRs #7 + #8. Kernel
module returns to the pre-#7 buf_done_and_job_finish completion
model: no src/dst lifecycle decoupling, no parked dst_bufs, no
1:1-contract violation against libva-v4l2-request-fourier
(closes daedalus-v4l2#9 + #10 as won't-fix at this layer; proper
fix tracked at daedalus-v4l2#11).
* Wire-protocol drops 1 → 0; lock-step install with daedalus-v4l2
0.1.0+r33+g5d8b436 REQUIRED.
* Carries forward the #64 multi-kernel postinst fix.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 14:50:00 +0000
daedalus-v4l2-dkms (0.1.0+r30+g6ffe92b-1) bookworm trixie; urgency=medium
* Bump to 6ffe92b — fixes the kernel panic regression introduced
by 79256dc's split-completion design (closes daedalus-v4l2#8).
`device_run` now removes both src + dst from `m2m_ctx`'s
rdy_queue at pickup time, not at `buf_done` time. Without
this, after `SRC_CONSUMED`'s `job_finish` released the m2m
scheduler, the NEXT `device_run` saw the still-queued parked
dst_buf and paired it with a fresh src — two inflight entries
referencing the same vb2_buffer, the later `HAS_PIXELS`
triggered list_del on an already-detached list_head, smashing
the rdy_queue → hard reboot on Pi CM5 during `mpv vaapi-copy`
playback of 720p H.264 (2026-05-21).
* Wire protocol unchanged — DAEDALUS_PROTO_VERSION stays at 1.
Daemon (userspace daedalus-v4l2 package) need NOT bump in
lockstep with this DKMS update; the existing
daedalus-v4l2 0.1.0+r28+g79256dc is wire-compatible with
daedalus-v4l2-dkms 0.1.0+r30+g6ffe92b.
* Carries forward the #64 multi-kernel postinst fix.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 14:00:00 +0000
daedalus-v4l2-dkms (0.1.0+r28+g79256dc-1) bookworm trixie; urgency=medium
* Bump to 79256dc — H.264 B-frame display reorder fix (closes
daedalus-v4l2#6). libavcodec's H.264 decoder reorders output to
display order before returning from avcodec_receive_frame; the
daemon was binding each REQ_DECODE's pixels to the cookie of the
bitstream that triggered the receive_frame call, not the cookie
of the bitstream that actually produced the picture. For B-frame
sequences this paired cookie N's CAPTURE buffer with cookie N-2's
pixels and silently lost intermediate frames — visible as
"2 1 4 3 6 5" frame pairing in mpv / Firefox on Pi CM5.
* Wire-protocol bump (DAEDALUS_PROTO_VERSION 0 → 1): REQ_DECODE
gains __u64 src_pts; RESP_FRAME gains __u32 flags +
__u64 output_src_pts. Kernel + daemon must install atomically
(this package + daedalus-v4l2 0.1.0+r28+g79256dc).
* Carries forward the #64 multi-kernel postinst fix from -2:
autoinstall for every /lib/modules/*/build that resolves to real
headers, not just $(uname -r).
* Closes #64.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 12:00:00 +0000
daedalus-v4l2-dkms (0.1.0+r24+gf0d4186-2) bookworm trixie; urgency=medium
* postinst: autoinstall for every installed kernel with headers, not
just the running one. Previously `dkms autoinstall $NAME/$VERSION`
built only against `$(uname -r)`, so installing the package on
kernel A and then rebooting into a separately-installed kernel B
left /lib/modules/B/updates/dkms/ empty — /dev/daedalus-v4l2 absent,
daedalus daemon nothing to talk to, browser/VAAPI silently falling
back to software with no obvious diagnostic. Now we enumerate every
/lib/modules/*/build that resolves to a real directory and run
`dkms autoinstall -k <kver>` for each, reporting per-kernel failure
only when headers are missing. Closes #64.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 09:30:00 +0000
daedalus-v4l2-dkms (0.1.0+r24+gf0d4186-1) bookworm trixie; urgency=medium
* Bump to f0d4186 — per-ctx vb2 lock fix. daedalus_queue_init now
uses ctx->vb_mutex instead of ctx->dev->m2m_lock for each
vb2_queue's lock, unblocking Firefox's multi-process VAAPI
clients (they were colliding on the device-wide mutex and one
would EBUSY-fail S_FMT while another was mid-streamon).
-- Markus Fritsche <mfritsche@reauktion.de> Wed, 20 May 2026 23:00:00 +0000
daedalus-v4l2-dkms (0.1.0+r22+g462aa4b-1) bookworm trixie; urgency=medium
* Bump to 462aa4b — kernel device_run() now calls
v4l2_ctrl_request_setup() before reading the H.264 stateless
control values from the bound media_request, so the values
daedalus ships to the userspace daemon match what the V4L2
client (libva-v4l2-request-fourier) actually set. Closes the
libva→kernel control-binding gap that was causing decoded
frames to come back as best-effort zero garbage from libavcodec.
* Wire-ABI lockstep with daedalus-v4l2 0.1.0+r22+g462aa4b.
-- Markus Fritsche <mfritsche@reauktion.de> Wed, 20 May 2026 22:00:00 +0000
daedalus-v4l2-dkms (0.1.0+r20+g3dd0eb0-1) bookworm trixie; urgency=medium
* Bump to 3dd0eb0 — DAEMON-PPS kernel-side changes. device_run()
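The changelog entries above repeatedly require the kernel module and the daemon to be installed in lockstep whenever DAEDALUS_PROTO_VERSION changes (0 → 1 at 79256dc, back to 0 at 5d8b436). A minimal sketch of what a version check between the two sides could look like follows. The `hello_msg` struct and `check_peer_version` function are hypothetical illustrations; the real wire format and handshake are not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Value after the 5d8b436 revert, per the changelog above. */
#define DAEDALUS_PROTO_VERSION 0u

/* Hypothetical hello message; the real wire format is not shown here. */
struct hello_msg {
    uint32_t proto_version;
};

/* Returns 0 when the peer speaks the same protocol version, -1 otherwise.
 * A mismatch is what a non-lockstep install (new dkms, old daemon) would
 * produce. */
static int check_peer_version(const struct hello_msg *peer)
{
    assert(peer != 0);
    return peer->proto_version == DAEDALUS_PROTO_VERSION ? 0 : -1;
}
```

Under these assumptions, a daemon still built against protocol 1 would be rejected at connect time instead of exchanging messages whose fields (src_pts, output_src_pts, flags) the other side no longer understands.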
+41 -8
@@ -11,13 +11,23 @@
 # Upstream repo: https://git.reauktion.de/reauktion/daedalus-v4l2
 set -euo pipefail
-# Same pin as the Arch PKGBUILD. 481279c = "Phase 8.13: byte-exact
-# end-to-end via libva (consumer target hit)" — first commit where the
-# full ffmpeg -hwaccel vaapi → libva → /dev/video0 → daemon path lands
-# a pixel-correct decoded frame back in ffmpeg.
-UPSTREAM_COMMIT=3dd0eb070a75893f78368ce819b9e9ebf08c124d
-PKGVER=0.1.0+r20+g3dd0eb0
-PKGREL=1 # reset for new upstream pin (3dd0eb0 — DAEMON-PPS H.264 SPS/PPS NAL synth)
+# 6e6dfa1 = picks up daedalus-v4l2 PR #16 — daemon now dlopens
+# the Kwiboo fourier fork's libavcodec.so.62 / libavformat.so.62 /
+# libavutil.so.60 at /opt/fourier instead of Debian-stock soname
+# 61/61/59. First step on the daedalus-fourier substitution arc
+# (daedalus-v4l2#11): routes the daemon through the libavcodec
+# source tree we own in marfrit-packages. Headers + .pc files
+# come from ffmpeg-v4l2-request-fourier (installed by the CI
+# workflow before this script runs; see PKG_CONFIG_PATH below).
+UPSTREAM_COMMIT=6e6dfa144da7bc7fa8be50c8da91d7d1c6132a2c
+PKGVER=0.1.0+r41+g6e6dfa1
+PKGREL=1 # reset for new upstream pin (6e6dfa1 — soname 62 via /opt/fourier)
+# daedalus-fourier pin. d87239d = marfrit/daedalus-fourier PR #1 merge
+# (install rules + pkg-config, enables this consumer to find_package
+# + link). Bump in lockstep with the upstream daemon when daedalus-
+# fourier's API or installed shaders are changed by a new consumer.
+DAEDALUS_FOURIER_COMMIT=d87239d8172307d9a1b93c95cbed116d175b85cc
 HERE=$(dirname "$(readlink -f "$0")")
@@ -27,14 +37,37 @@ export SOURCE_DATE_EPOCH=1779231600
 work=$(mktemp -d)
 trap "rm -rf $work" EXIT
+# --- daedalus-fourier: fetch + build + install to per-build prefix ---
+#
+# Static-linked into the daemon, so the temp prefix is only for the
+# duration of this build script. Requires libvulkan-dev + glslang-tools
+# on the runner (already needed for the daedalus-fourier benches).
+FOURIER_PREFIX=$work/fourier-prefix
+mkdir -p "$FOURIER_PREFIX"
+cd "$work"
+curl --connect-timeout 10 --max-time 600 --retry 3 --retry-delay 5 -sSLfo daedalus-fourier.tar.gz \
+"https://git.reauktion.de/marfrit/daedalus-fourier/archive/${DAEDALUS_FOURIER_COMMIT}.tar.gz"
+tar xzf daedalus-fourier.tar.gz
+cd daedalus-fourier
+cmake -B build -G Ninja \
+-DCMAKE_BUILD_TYPE=Release \
+-DCMAKE_INSTALL_PREFIX="$FOURIER_PREFIX"
+cmake --build build --target daedalus_core
+cmake --install build
+# --- daedalus-v4l2: fetch + build daemon against installed daedalus-fourier ---
 cd "$work"
 curl --connect-timeout 10 --max-time 600 --retry 3 --retry-delay 5 -sSLfo daedalus-v4l2.tar.gz \
 "https://git.reauktion.de/reauktion/daedalus-v4l2/archive/${UPSTREAM_COMMIT}.tar.gz"
 tar xzf daedalus-v4l2.tar.gz
 SRCDIR=daedalus-v4l2
-# Build daemon (CMake)
+# Build daemon (CMake) — point pkg-config at the daedalus-fourier
+# temp prefix so pkg_check_modules(DAEDALUS_FOURIER …) resolves to it.
 cd "$SRCDIR/daemon"
+PKG_CONFIG_PATH="$FOURIER_PREFIX/lib/pkgconfig:/opt/fourier/lib/pkgconfig" \
 cmake -B build -G Ninja \
 -DCMAKE_BUILD_TYPE=Release \
 -DCMAKE_INSTALL_PREFIX=/usr
+152
@@ -1,3 +1,155 @@
daedalus-v4l2 (0.1.0+r41+g6e6dfa1-1) bookworm trixie; urgency=medium
* Bump to 6e6dfa1 — daedalus-v4l2 PR #16. Daemon dlopens Kwiboo
fourier fork's libavcodec.so.62 / libavformat.so.62 /
libavutil.so.60 at /opt/fourier instead of Debian-stock
soname 61/61/59. First step on the daedalus-fourier
substitution arc (daedalus-v4l2#11): the next PR series
layers daedalus_recipe_dispatch_h264_* substitution patches
into ffmpeg-v4l2-request-fourier's H264DSPContext NEON init,
reaching the daemon's production decode path.
* Build: PKG_CONFIG_PATH now includes /opt/fourier/lib/pkgconfig
so daemon's pkg_check_modules picks up the Kwiboo .pc files.
* CI workflow build-deps: libavcodec-dev / libavformat-dev /
libavutil-dev (Debian stock 7.1.3) → ffmpeg-v4l2-request-fourier
(provides /opt/fourier/include + .pc files).
* Wire protocol unchanged. No daedalus-v4l2-dkms bump.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 21:30:00 +0000
daedalus-v4l2 (0.1.0+r39+g3bc0da1-1) bookworm trixie; urgency=medium
* Bump to 3bc0da1 — picks up daedalus-v4l2 PR #15. Per-frame
`decoder: OK ...` log line gains `decode_us=N` (libavcodec
send_packet + receive_frame wall-clock cost in microseconds).
New `decoder stats` summary line every 60 decoded frames with
codec, fps, avg decode_us, MBs/s throughput, B/MB bitrate.
* Pure observability — no decode-path behaviour change.
Establishes baseline metrics for the substitution work in
daedalus-v4l2#11 step 2 (replacing libavcodec primitives with
daedalus-fourier kernels one cycle at a time).
* On Pi CM5 / bbb 720p H.264 baseline: ~4 ms decode_us / 24 fps
/ 90 K MBs/s — workload is well under 1 % of any single
daedalus-fourier kernel's NEON ceiling.
* Wire protocol unchanged. No daedalus-v4l2-dkms bump needed.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 18:30:00 +0000
daedalus-v4l2 (0.1.0+r37+g77e14e5-1) bookworm trixie; urgency=medium
* Bump to 77e14e5 — picks up daedalus-v4l2 PRs #12 + #13.
* #12 (LOW_DELAY half-measure): the daemon now sets
AV_CODEC_FLAG_LOW_DELAY on the H.264 AVCodecContext so libavcodec
emits frames in decode order ~99% of the time (a few stragglers
at GOP boundaries when the stream's SPS num_reorder_frames
overrides the flag). Visible improvement vs the 2-1-4-3
pair-swap on Firefox YouTube + mpv playback; not a permanent
fix (see #11 for the architectural plan).
* #13 (daedalus-fourier linkage): the daemon now pkg-config-links
against the daedalus-fourier kernel library (marfrit/
daedalus-fourier) and logs substrate availability at startup.
No kernels dispatched yet — this is the build-time / link-time
foundation for the H.264 daemon-rewrite plan in #11
(substituting daedalus-fourier IDCT 4×4 / IDCT 8×8 / luma
deblock primitives for libavcodec's per-MB pixel math, one
cycle at a time, measuring CPU saved per substitution).
* Build-deb.sh now fetches + builds + installs daedalus-fourier
(pinned at d87239d, marfrit/daedalus-fourier PR #1) into a
per-build temp prefix, then builds the daemon with
PKG_CONFIG_PATH pointing at it. daedalus-fourier is
statically linked into the daemon binary, so the resulting
.deb has no new runtime deps. Requires libvulkan-dev +
glslang-tools on the CI runner (the daedalus-fourier benches
already needed those).
* Wire protocol unchanged — DAEDALUS_PROTO_VERSION stays at 0.
No daedalus-v4l2-dkms bump needed.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 16:30:00 +0000
daedalus-v4l2 (0.1.0+r33+g5d8b436-1) bookworm trixie; urgency=medium
* Bump to 5d8b436 — reverts daedalus-v4l2 PRs #7 + #8 (the parking
design that broke libva-v4l2-request-fourier's 1:1 CAPTURE
contract; see daedalus-v4l2#9 + #10). After daemon-r28+g79256dc
landed, mpv (--hwdec=vaapi-copy) failed pre-playing with
"Unable to dequeue buffer: Resource temporarily unavailable" /
"Failed to end picture decode" because the daemon parked CAPTURE
buffers waiting for libavcodec to release H.264 B-frames in
display order — violating the V4L2 stateless 1:1 contract.
Firefox tolerated the mess (visible "2 1 4 3" pair-swap); mpv
bailed.
* This bump restores f0d4186-equivalent behaviour, plus PR #4
(cosmetic H.264 DECODE_MODE / START_CODE menu controls). PR #7
+ PR #8 wire-protocol additions (src_pts / output_src_pts /
RESP_FRAME flags) are reverted — DAEDALUS_PROTO_VERSION drops
back from 1 → 0. Lock-step install with daedalus-v4l2-dkms
0.1.0+r33+g5d8b436 REQUIRED.
* Visible regression: H.264 B-frame streams in Firefox revert to
the original "2 1 4 3 6 5" pair-swap visual. The proper fix
(concurrent in-flight requests in daemon + display-order reorder
in libva-v4l2-request-fourier) is tracked at daedalus-v4l2#11.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 14:50:00 +0000
daedalus-v4l2 (0.1.0+r28+g79256dc-1) bookworm trixie; urgency=medium
* Bump to 79256dc — H.264 B-frame display reorder fix (closes
daedalus-v4l2#6 + #4 menu controls). Daemon side: the
avcodec_send_packet → receive_frame loop now stamps pkt->pts =
req->src_pts so libavcodec's display-ordered frame->pts identifies
which OUTPUT bitstream's pixels each drained frame belongs to.
chardev_client maintains a (src_pts → cookie) lookup table so the
daemon can ship pixels to the cookie of the *originating*
bitstream, not the cookie of whatever REQ triggered the
receive_frame call. Multiple RESP_FRAME messages per REQ_DECODE
are now possible (one for the just-consumed src, one or more for
drained pixels).
* Wire-protocol bump (DAEDALUS_PROTO_VERSION 0 → 1): REQ_DECODE
gains __u64 src_pts; RESP_FRAME gains __u32 flags +
__u64 output_src_pts. Daemon + kernel must install atomically
(this package + daedalus-v4l2-dkms 0.1.0+r28+g79256dc).
* Also subsumes 79256dc's predecessor 7ff2d89 — H.264 DECODE_MODE +
START_CODE menu-control registration that retires the
"Unable to set control(s) error_idx=2/2" warning libva-v4l2-
request emitted on every context init.
-- Markus Fritsche <mfritsche@reauktion.de> Thu, 21 May 2026 12:00:00 +0000
daedalus-v4l2 (0.1.0+r24+gf0d4186-1) bookworm trixie; urgency=medium
* Bump to f0d4186 — kernel per-ctx vb2 lock fix. daedalus_queue_init
was wiring src_vq->lock and dst_vq->lock to ctx->dev->m2m_lock (a
device-wide mutex), serialising every vb2 ioctl across all
concurrent clients of /dev/video0. For Firefox (which spawns
separate content + RDD + GPU processes that each open the device
and run libva probe simultaneously), one libva session's
S_FMT(OUTPUT_MPLANE) hit EBUSY while another was mid-streamon —
Firefox VAAPI playback fell apart at startup.
* Fix gives each open() its own ctx->vb_mutex; vb2 ioctls run
independently per client. Matches cedrus / rkvdec / hantro
pattern.
* Verified on higgs: Firefox YouTube playback engages VAAPI cleanly,
sustained ~230 fps decode at 640x368 through the daedalus daemon,
zero EBUSY in stderr or daemon journal.
-- Markus Fritsche <mfritsche@reauktion.de> Wed, 20 May 2026 23:00:00 +0000
daedalus-v4l2 (0.1.0+r22+g462aa4b-1) bookworm trixie; urgency=medium
* Bump to 462aa4b — kernel-side fix for control-binding gap that
closes the libva→daemon SPS/PPS pipeline. Kernel device_run now
calls v4l2_ctrl_request_setup() before reading ctrl->p_cur, so
the daemon's daedalus_h264_meta block actually carries THIS
request's V4L2 stateless H.264 control values instead of stale
/default ones. Pairs with libva-v4l2-request-fourier r382+gc1bb444
(Fix 3 + Fix 4 from issue libva-v4l2-request-fourier#8).
* After-fix on higgs (Pi CM5): ffmpeg -hwaccel vaapi -i h264.mp4
produces unique decoded P-frames (per-frame fnv1a hashes differ)
and zero "error while decoding MB" / "reference frames exceeds
max" warnings.
-- Markus Fritsche <mfritsche@reauktion.de> Wed, 20 May 2026 22:00:00 +0000
daedalus-v4l2 (0.1.0+r20+g3dd0eb0-1) bookworm trixie; urgency=medium
* Bump to 3dd0eb0 — DAEMON-PPS H.264 SPS/PPS NAL synthesiser.
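The 0.1.0+r28+g79256dc entry above describes a (src_pts → cookie) lookup table that lets the daemon attach display-ordered frames to the request that originally submitted their bitstream. That design was reverted in 5d8b436, so the following is only an illustrative sketch of the idea, not the daemon's code. The table layout, `MAX_INFLIGHT`, and both function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INFLIGHT 16  /* assumed bound on outstanding requests */

struct pts_cookie {
    uint64_t pts;
    uint64_t cookie;
    int used;
};

struct pts_table {
    struct pts_cookie slot[MAX_INFLIGHT];
};

/* Record which request cookie submitted the bitstream stamped with pts.
 * Returns 0 on success, -1 when the table is full. */
static int pts_table_put(struct pts_table *t, uint64_t pts, uint64_t cookie)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_INFLIGHT; i++) {
        if (!t->slot[i].used) {
            t->slot[i].pts = pts;
            t->slot[i].cookie = cookie;
            t->slot[i].used = 1;
            return 0;
        }
    }
    return -1;
}

/* Find and release the cookie for a frame the decoder just returned.
 * Returns 0 and writes *cookie, or -1 if the pts is unknown. */
static int pts_table_take(struct pts_table *t, uint64_t pts, uint64_t *cookie)
{
    assert(t != NULL && cookie != NULL);
    for (size_t i = 0; i < MAX_INFLIGHT; i++) {
        if (t->slot[i].used && t->slot[i].pts == pts) {
            *cookie = t->slot[i].cookie;
            t->slot[i].used = 0;
            return 0;
        }
    }
    return -1;
}
```

For a stream submitted in decode order I(pts 0), P(pts 2), B(pts 1) with cookies 1, 2, 3, frames come back in display order 0, 1, 2, and the lookups yield cookies 1, 3, 2. Binding pixels to whichever request triggered the receive call instead produces the swapped pairing the changelog describes.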
@@ -0,0 +1,137 @@
From f760c0541586f43334c02611fcb4c212c08ad576 Mon Sep 17 00:00:00 2001
From: Markus Fritsche <mfritsche@reauktion.de>
Date: Thu, 21 May 2026 21:40:22 +0200
Subject: [PATCH] avcodec/aarch64/h264dsp: route H.264 4x4 IDCT through
daedalus-fourier
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
H264DSPContext.idct_add (called per 4x4 block from the intra-4x4
decode path in h264_mb.c) now dispatches through
daedalus_recipe_dispatch_h264_idct4 instead of ff_h264_idct_add_neon.
The recipe layer picks the substrate; for cycle 6 (H.264 IDCT 4x4)
the recipe is CPU NEON, so this is effectively a NEON-to-NEON
substitution with one extra dispatch call and recipe-table lookup.
Provides the first end-to-end exercise of the daedalus-fourier
kernel pack inside the libavcodec.so decode hot path; follow-up
patches wire IDCT 8x8, luma-v deblock, and qpel mc20.
The library context is process-global, lazily initialised under
pthread_once on first call. We pick the no-QPU constructor because
libavcodec.so is loaded into arbitrary host processes
(firefox-fourier, mpv-fourier, daedalus_v4l2_daemon, ...) and we
cannot assume the host has a usable Vulkan instance. Higher cycles
(deblock luma-v, MC) that benefit from the QPU will provision their
own recipe-selected context once that path is wired.
Bulk paths (idct_add16, idct_add16intra, idct_add8 — used for
non-intra4x4 macroblocks) remain on the stock NEON .S implementations
and will be batched through daedalus_recipe_dispatch_h264_idct4 with
n_blocks>1 in a follow-up.
Bit-exact against ff_h264_idct_add_neon (daedalus-fourier cycle 6
green; see marfrit/daedalus-fourier/CYCLE_LOGS.md).
Refs reauktion/daedalus-v4l2#11 — substitution arc step 2.
---
libavcodec/aarch64/Makefile | 3 +-
libavcodec/aarch64/h264_idct_daedalus.c | 49 +++++++++++++++++++++++
libavcodec/aarch64/h264dsp_init_aarch64.c | 3 +-
3 files changed, 53 insertions(+), 2 deletions(-)
create mode 100644 libavcodec/aarch64/h264_idct_daedalus.c
diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 41ab025..7b95fb1 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -3,7 +3,8 @@ OBJS-$(CONFIG_AC3DSP) += aarch64/ac3dsp_init_aarch64.o
OBJS-$(CONFIG_FDCTDSP) += aarch64/fdctdsp_init_aarch64.o
OBJS-$(CONFIG_FMTCONVERT) += aarch64/fmtconvert_init.o
OBJS-$(CONFIG_H264CHROMA) += aarch64/h264chroma_init_aarch64.o
-OBJS-$(CONFIG_H264DSP) += aarch64/h264dsp_init_aarch64.o
+OBJS-$(CONFIG_H264DSP) += aarch64/h264dsp_init_aarch64.o \
+ aarch64/h264_idct_daedalus.o
OBJS-$(CONFIG_HUFFYUVDSP) += aarch64/huffyuvdsp_init_aarch64.o
OBJS-$(CONFIG_H264PRED) += aarch64/h264pred_init.o
OBJS-$(CONFIG_H264QPEL) += aarch64/h264qpel_init_aarch64.o
diff --git a/libavcodec/aarch64/h264_idct_daedalus.c b/libavcodec/aarch64/h264_idct_daedalus.c
new file mode 100644
index 0000000..538d223
--- /dev/null
+++ b/libavcodec/aarch64/h264_idct_daedalus.c
@@ -0,0 +1,49 @@
+/*
+ * H.264 4x4 IDCT + add — daedalus-fourier substitution shim.
+ *
+ * Routes H264DSPContext.idct_add through
+ * daedalus_recipe_dispatch_h264_idct4 instead of ff_h264_idct_add_neon.
+ * The recipe layer picks the substrate (CPU NEON by default for
+ * cycle 6; future cycles may dispatch to V3D opportunistically).
+ *
+ * FFmpeg's 4x4 block memory layout matches daedalus's column-major
+ * convention: block[r + 4*c] = coefficient at (row r, col c). Both
+ * sides destructively zero the block after the transform.
+ *
+ * The library context is process-global and lazily initialised under
+ * pthread_once. We pick the no-QPU constructor here because
+ * libavcodec.so is loaded into arbitrary host processes
+ * (firefox-fourier, mpv-fourier, daedalus_v4l2_daemon, ...) and we
+ * cannot assume the host has a usable Vulkan instance. Higher cycles
+ * (deblock, MC) that benefit from the QPU initialise their own
+ * recipe-selected context once that path is wired.
+ */
+
+#include <pthread.h>
+#include <stddef.h>
+#include <stdint.h>
+
+#include <daedalus.h>
+
+#include "libavutil/attributes.h"
+#include "libavcodec/h264dsp.h"
+
+static daedalus_ctx *g_dctx;
+static pthread_once_t g_dctx_once = PTHREAD_ONCE_INIT;
+
+static void daedalus_ctx_init_once(void)
+{
+    g_dctx = daedalus_ctx_create_no_qpu();
+}
+
+void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride);
+
+void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride)
+{
+    static const daedalus_h264_block_meta meta = { .dst_off = 0 };
+
+    pthread_once(&g_dctx_once, daedalus_ctx_init_once);
+
+    daedalus_recipe_dispatch_h264_idct4(g_dctx, dst, (size_t)stride,
+                                        block, 1, &meta);
+}
diff --git a/libavcodec/aarch64/h264dsp_init_aarch64.c b/libavcodec/aarch64/h264dsp_init_aarch64.c
index c684574..b993df2 100644
--- a/libavcodec/aarch64/h264dsp_init_aarch64.c
+++ b/libavcodec/aarch64/h264dsp_init_aarch64.c
@@ -66,6 +66,7 @@ void ff_biweight_h264_pixels_4_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride
int weights, int offset);
void ff_h264_idct_add_neon(uint8_t *dst, int16_t *block, int stride);
+void ff_h264_idct_add_daedalus(uint8_t *dst, int16_t *block, int stride);
void ff_h264_idct_dc_add_neon(uint8_t *dst, int16_t *block, int stride);
void ff_h264_idct_add16_neon(uint8_t *dst, const int *block_offset,
int16_t *block, int stride,
@@ -139,7 +140,7 @@ av_cold void ff_h264dsp_init_aarch64(H264DSPContext *c, const int bit_depth,
c->biweight_pixels_tab[1] = ff_biweight_h264_pixels_8_neon;
c->biweight_pixels_tab[2] = ff_biweight_h264_pixels_4_neon;
- c->idct_add = ff_h264_idct_add_neon;
+ c->idct_add = ff_h264_idct_add_daedalus;
c->idct_dc_add = ff_h264_idct_dc_add_neon;
c->idct_add16 = ff_h264_idct_add16_neon;
c->idct_add16intra = ff_h264_idct_add16intra_neon;
--
2.47.3
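The shim's header comment describes a process-global context that is created lazily under pthread_once. A minimal sketch of that pattern follows, assuming a hypothetical `demo_ctx` type and `demo_ctx_get` helper that stand in for the daedalus context; none of these names come from the daedalus-fourier API. The return-value check on `pthread_once` is an addition for illustration and is not something the patch does.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the library context; not the daedalus API. */
typedef struct demo_ctx {
    int ready;
} demo_ctx;

static demo_ctx g_ctx_storage;
static demo_ctx *g_ctx;                 /* written only inside the once-callback */
static pthread_once_t g_ctx_once = PTHREAD_ONCE_INIT;
static int g_init_calls;                /* counts constructor runs, for illustration */

/* Runs at most once per process, no matter how many threads race to it. */
static void demo_ctx_init_once(void)
{
    g_ctx_storage.ready = 1;
    g_ctx = &g_ctx_storage;
    g_init_calls++;
}

/* Returns the shared context, creating it on first use. */
static demo_ctx *demo_ctx_get(void)
{
    if (pthread_once(&g_ctx_once, demo_ctx_init_once) != 0)
        return NULL;                    /* illustrative error path only */
    return g_ctx;
}

/* Exposes the constructor call count so callers can observe the guarantee. */
static int demo_init_calls(void)
{
    return g_init_calls;
}
```

Calling `demo_ctx_get()` from any number of decode threads yields the same pointer, and the constructor body runs once. This is the property the shim relies on when libavcodec.so is loaded into hosts whose threading model it does not control.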
+41 -1
@@ -33,7 +33,15 @@ FFMPEG_VERSION=8.1
 # epoch 2 matches Debian's stock ffmpeg (currently 7:7.1.x in trixie);
 # +rfourier suffix to avoid colliding with upstream/Debian rebuilds.
 PKGVER=2:${FFMPEG_VERSION}+rfourier+gb57fbbe
-PKGREL=2 # pkgrel=2: Path A move to /opt/fourier prefix (2026-05-19)
+PKGREL=5 # pkgrel=5: H.264 IDCT 4x4 daedalus-fourier substitution; skip past
+# an orphan -4 .deb sitting in the apt pool that made
+# check-already-published.sh's `pool_ver ge source_full` short-
+# circuit the previous -3 build (PR #76). (2026-05-21)
+# daedalus-fourier pin — first kernel substitution in libavcodec (cycle 6
+# H.264 IDCT 4x4). Same SHA as the daedalus-v4l2 daemon already ships
+# inline; rev in lockstep with the daemon when the public API rolls.
+DAEDALUS_FOURIER_COMMIT=d87239d8172307d9a1b93c95cbed116d175b85cc
 HERE=$(dirname "$(readlink -f "$0")")
@@ -57,6 +65,34 @@ fi
 # Apply patches (same as Arch).
 patch -Np1 -i "$HERE/0001-libudev-bypass-fallback.patch"
 patch -Np1 -i "$HERE/0002-nv15-to-p010-unpack.patch"
+patch -Np1 -i "$HERE/0003-h264-idct4-daedalus-fourier.patch"
+# --- daedalus-fourier: fetch + build static .a with PIC, install to a
+# per-build prefix; libavcodec.so links it into the shared object so
+# H264DSPContext.idct_add (and follow-up kernels) dispatch through the
+# daedalus recipe layer instead of the in-tree NEON .S code. ---
+#
+# PIC is mandatory — the static .a is linked into a .so, so all object
+# code must be relocatable. Vulkan is PUBLIC-linked by daedalus_core
+# (queryable QPU substrate); we add libvulkan1 to Debian Depends below
+# so dlopen of libavcodec.so.62 succeeds on stock trixie.
+FOURIER_PREFIX=$work/fourier-prefix
+mkdir -p "$FOURIER_PREFIX"
+pushd "$work" >/dev/null
+curl --connect-timeout 10 --max-time 600 --retry 3 --retry-delay 5 -sSLfo daedalus-fourier.tar.gz \
+    "https://git.reauktion.de/marfrit/daedalus-fourier/archive/${DAEDALUS_FOURIER_COMMIT}.tar.gz"
+tar xzf daedalus-fourier.tar.gz
+pushd daedalus-fourier >/dev/null
+cmake -B build -G Ninja \
+    -DCMAKE_BUILD_TYPE=Release \
+    -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
+    -DCMAKE_INSTALL_PREFIX="$FOURIER_PREFIX"
+cmake --build build --target daedalus_core
+cmake --install build
+popd >/dev/null
+popd >/dev/null
+cd "$work/FFmpeg"
 # Configure with Arch-parity flags. Drops the same set of features
 # (X11, AMF, CUDA, FireWire, AviSynth, Bluray, OpenMPT, JPEG-XL,
@@ -73,6 +109,9 @@ patch -Np1 -i "$HERE/0002-nv15-to-p010-unpack.patch"
     --mandir=/opt/fourier/share/man \
     --extra-ldexeflags='-Wl,-rpath,/opt/fourier/lib' \
     --extra-ldsoflags='-Wl,-rpath,/opt/fourier/lib' \
+    --extra-cflags="-I${FOURIER_PREFIX}/include" \
+    --extra-ldflags="-L${FOURIER_PREFIX}/lib" \
+    --extra-libs="-ldaedalus_core -lvulkan -lpthread" \
     --disable-debug \
     --disable-static \
     --disable-doc \
@@ -147,6 +186,7 @@ Priority: optional
 Architecture: arm64
 Depends: libc6,
  libdrm2,
+ libvulkan1,
  libfontconfig1,
  libfreetype6,
  libfribidi0,
+45
@@ -1,3 +1,48 @@
+ffmpeg-v4l2-request-fourier (2:8.1+rfourier+gb57fbbe-5) bookworm trixie; urgency=medium
+
+  * pkgrel-only bump (3 → 5) to force a rebuild of the H.264 IDCT 4x4
+    daedalus-fourier substitution that landed in marfrit-packages PR
+    #76. An orphan -4 .deb already sat in the apt pool (dated
+    2026-05-19, no matching source commit in main); CI's
+    check-already-published.sh compares with `dpkg --compare-versions
+    pool_ver ge source_full`, which short-circuited PR #76's -3
+    build. Skipping past -4 lets the CI workflow actually publish the
+    substitution.
+  * No source code change beyond PKGREL and this changelog entry.
+    Substitution + control + build-deb.sh wiring stay as PR #76 left
+    them.
+
+ -- Markus Fritsche <mfritsche@reauktion.de>  Thu, 21 May 2026 21:30:00 +0000
+
+ffmpeg-v4l2-request-fourier (2:8.1+rfourier+gb57fbbe-3) bookworm trixie; urgency=medium
+
+  * Add 0003-h264-idct4-daedalus-fourier.patch — H264DSPContext.idct_add
+    (per-block 4x4 IDCT, called from the intra-4x4 decode path in
+    libavcodec/h264_mb.c) now dispatches through
+    daedalus_recipe_dispatch_h264_idct4 instead of
+    ff_h264_idct_add_neon. First end-to-end exercise of the
+    daedalus-fourier kernel pack inside libavcodec.so on the
+    production decode hot path (daedalus-v4l2#11 step 2 — cycle 6
+    H.264 IDCT 4x4, NEON-by-recipe).
+  * build-deb.sh: fetches + builds daedalus-fourier (pinned at
+    d87239d, lockstep with the daemon's static link) with
+    -fPIC into a per-build temp prefix, then passes
+    --extra-cflags=-I.../include --extra-ldflags=-L.../lib
+    --extra-libs="-ldaedalus_core -lvulkan -lpthread" to FFmpeg
+    configure. Static-linked into libavcodec.so.62.
+  * Bulk paths (idct_add16 / idct_add16intra / idct_add8) remain on
+    the stock NEON .S code and will be batched through
+    daedalus_recipe_dispatch_h264_idct4 with n_blocks>1 in a
+    follow-up. Cycles 7/8/9 (IDCT 8x8 / luma-v deblock / qpel mc20)
+    land in subsequent patches.
+  * Depends gains libvulkan1 — daedalus_core PUBLIC-links Vulkan
+    (queryable QPU substrate); the no-QPU constructor still works,
+    but the loader refuses libavcodec.so.62 at dlopen time without
+    libvulkan.so.1 present.
+  * No ABI change; SONAMEs stay 62/62/60.
+
+ -- Markus Fritsche <mfritsche@reauktion.de>  Thu, 21 May 2026 20:00:00 +0000
+
 ffmpeg-v4l2-request-fourier (2:8.1+rfourier+gb57fbbe-1) bookworm trixie; urgency=medium
 
   * Initial Debian packaging for the Kwiboo FFmpeg fork with V4L2
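The -5 changelog entry turns on a "pool version ≥ source version means skip" rule. The sketch below is a deliberately simplified model of that decision: it assumes both versions share the same upstream part and compares only the integer Debian revision after the last `-`. It is not dpkg's full version-comparison algorithm and not the contents of check-already-published.sh; `skip_build` and its helpers are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Debian revision: the integer after the last '-' (0 if there is none). */
static long debian_revision(const char *version)
{
    const char *dash = strrchr(version, '-');
    return dash ? strtol(dash + 1, NULL, 10) : 0;
}

/* Length of the part before the revision (epoch + upstream version). */
static size_t upstream_len(const char *version)
{
    const char *dash = strrchr(version, '-');
    return dash ? (size_t)(dash - version) : strlen(version);
}

/*
 * Simplified skip rule: 1 = skip (pool already at or past source),
 * 0 = build, -1 = upstream parts differ, which this model does not cover.
 */
static int skip_build(const char *pool_ver, const char *source_full)
{
    size_t pl = upstream_len(pool_ver);
    size_t sl = upstream_len(source_full);

    if (pl != sl || strncmp(pool_ver, source_full, pl) != 0)
        return -1;
    return debian_revision(pool_ver) >= debian_revision(source_full);
}
```

With the orphan `2:8.1+rfourier+gb57fbbe-4` in the pool, a source at `-3` gives skip, which is the silent no-op described above, while a source at `-5` gives build.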
+19 -9
@@ -10,15 +10,25 @@
 # Upstream fork: https://git.reauktion.de/marfrit/libva-v4l2-request-fourier
 set -euo pipefail
-# Same pin as the Arch PKGBUILD. 9898331 = LIBVA-2 close — completes
-# the per-codec dispatch from c332d34 (LIBVA-1) by adding video_fd_
-# daedalus to any_fd_supports_output_format's probe array. Without
-# it, H.264/VP9/AV1 profiles never got advertised on Pi 5 mixed
-# deployments (rpi-hevc-dec primary, daedalus_v4l2 alt) — ffmpeg
-# bailed with "No support for codec h264 profile 578" before the
-# per-codec dispatch could even fire.
-UPSTREAM_COMMIT=989833114a7708ad999dc68309cbc181d9913bdb
-PKGVER=1.0.0+r380+g9898331
+# Same pin as the Arch PKGBUILD. c454618 = PR #16 merge "picture,
+# request_pool: transparent OUTPUT-pool resize on bitstream overrun
+# (#15)" — follow-up root-cause fix to #13/#14. On a mid-stream
+# bitstream-budget overrun (typical cause: SPS-driven resolution
+# upshift in an adaptive-bitrate stream), codec_store_buffer now
+# snapshots the in-flight surface's accumulated bytes, releases its
+# OUTPUT pool slot, calls request_pool_resize (STREAMOFF →
+# REQBUFS(0) → S_FMT with 2×sizeimage hint, capped at 1 GiB, page-
+# aligned → CREATE_BUFS → mmap → media_request_alloc → STREAMON),
+# re-acquires a slot, re-mirrors the surface's source_{data,size,
+# request_fd}, restores the bytes, and continues. The frame
+# survives instead of being dropped back to libavcodec for surface
+# recreation. CAPTURE side untouched (per-queue V4L2 streaming
+# independence).
+#
+# Prior pin (2860d75) = PR #14 merge — codec_store_buffer bounds-
+# check floor (#13).
+UPSTREAM_COMMIT=c454618ae11addce2e17b560f4deeacbed067d98
+PKGVER=1.0.0+r390+gc454618
 PKGREL=1
 HERE=$(dirname "$(readlink -f "$0")")