ffmpeg-v4l2-request-fourier: preserve sl->mb for inspection callback (0017) #107

Merged
marfrit merged 1 commit from claude-noether/marfrit-packages:noether/h264-mb-coeffs-side-buffer into main 2026-05-26 07:48:48 +00:00

Companion to 0016 (PR #106). Foundation for daedalus-decoder PR-A3 — extracting real H.264 coefficients into daedalus-decoder for full-pipeline IDCT validation.

Problem

Patch 0016's per-MB inspection callback fires at the end of ff_h264_hl_decode_mb. By that time the IDCT-add path has already zeroed sl->mb (FFmpeg convention — ff_h264_idct_add_neon and friends destroy the input block buffer as they go). Consumers reading coefficients in the callback get zeros.

What 0017 adds

  • New field in H264Context: int16_t mb_inspect_coeffs[16 * 48] — side buffer matching the 8-bit half of sl->mb's declared size.
  • At the start of ff_h264_hl_decode_mb, a single memcpy snapshots sl->mb into the side buffer — BEFORE the variant hl_decode_mb_* runs and zeros the original.

The memcpy is gated on h->mb_inspect_cb != NULL, so nothing is copied when no inspection consumer is registered; the existing decode path gains only one branch per MB.
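The snapshot-then-decode ordering described above can be sketched in isolation. This is a minimal mock, not the patch itself: `MockH264Context`, `MockSliceContext`, `mock_idct_add`, `mock_hl_decode_mb`, `mb_inspect_opaque` and `record_first_coeff` are hypothetical stand-ins invented for illustration, and only the field names `mb_inspect_cb`, `mb_inspect_coeffs` and `mb` come from the description.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB_COEFFS (16 * 48) /* 8-bit half of sl->mb, per the description */

/* Hypothetical stand-ins for H264Context and H264SliceContext. */
typedef struct MockH264Context MockH264Context;
typedef void (*mb_inspect_fn)(const MockH264Context *h, void *opaque);

struct MockH264Context {
    mb_inspect_fn mb_inspect_cb;          /* NULL when no consumer is registered */
    void *mb_inspect_opaque;              /* assumption: user pointer for the callback */
    int16_t mb_inspect_coeffs[MB_COEFFS]; /* side buffer added by this patch */
};

typedef struct MockSliceContext {
    int16_t mb[MB_COEFFS * 2]; /* declared size, including the high-bit-depth half */
} MockSliceContext;

/* Stand-in for the IDCT-add path: it consumes the block and zeros it. */
static void mock_idct_add(int16_t *block, size_t n)
{
    memset(block, 0, n * sizeof(*block));
}

/* Mock of the per-MB decode entry point with the gated snapshot. */
static void mock_hl_decode_mb(MockH264Context *h, MockSliceContext *sl)
{
    /* Snapshot first, and only if a consumer exists (one branch per MB). */
    if (h->mb_inspect_cb)
        memcpy(h->mb_inspect_coeffs, sl->mb, sizeof(h->mb_inspect_coeffs));

    mock_idct_add(sl->mb, MB_COEFFS); /* the original buffer is now all zeros */

    /* The end-of-MB callback reads the side buffer, not sl->mb. */
    if (h->mb_inspect_cb)
        h->mb_inspect_cb(h, h->mb_inspect_opaque);
}

/* Example consumer: records the first coefficient it sees. */
static void record_first_coeff(const MockH264Context *h, void *opaque)
{
    *(int16_t *)opaque = h->mb_inspect_coeffs[0];
}
```

With `record_first_coeff` registered and `sl.mb[0]` set to a nonzero value, the callback observes that value even though `sl.mb[0]` is zero by the time it fires.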

The existing 0016 callback now reads:

  • h->mb_inspect_coeffs = pre-IDCT coefficients (this patch)
  • h->cur_pic.f->data = post-pixel-work pre-deblock reconstruction

and the consumer can derive P = pixels − IDCT(C) for daedalus-decoder's frame-major dispatch.
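As a rough illustration of that derivation, the sketch below subtracts an already-computed residual from reconstructed pixels for one 4x4 block. It assumes the residual R = IDCT(C) has been produced elsewhere and is stored in raster order; the function name and signature are hypothetical. One caveat follows from how reconstruction works: the decoder clips P + R to the 8-bit range, so the identity is only guaranteed exact where the reconstructed sample is not saturated at 0 or 255. The sketch counts such samples so a consumer can treat them with care.

```c
#include <stdint.h>

/*
 * Derive the prediction for one 4x4 block as P = pixels - residual.
 *
 * recon        reconstructed (pre-deblock) 8-bit samples
 * recon_stride distance in samples between rows of recon
 * residual     IDCT output in raster order (assumption of this sketch)
 * pred         output; int16_t because saturated samples can yield
 *              values outside 0..255
 *
 * Returns the number of samples whose reconstruction is 0 or 255,
 * i.e. where clipping may have occurred and P may be inexact.
 */
static int derive_prediction_4x4(const uint8_t *recon, int recon_stride,
                                 const int16_t residual[16],
                                 int16_t pred[16])
{
    int saturated = 0;

    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            int pix = recon[y * recon_stride + x];

            if (pix == 0 || pix == 255)
                saturated++;

            pred[y * 4 + x] = (int16_t)(pix - residual[y * 4 + x]);
        }
    }
    return saturated;
}
```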

Limitations

  • 8-bit H.264 only. High-bit-depth uses the upper half of sl->mb (int16_t[16 * 48 * 2] declared); preserving the high-depth case would need a wider side buffer.
  • Single-threaded decode is assumed (avctx->thread_count = 1). With multi-slice / multi-threaded decoding, concurrent writers would race on the single side buffer; this is an explicit limitation of the inspection mechanism. A future extension would put per-slice buffers in H264SliceContext.

These match the daedalus-decoder consumer's existing scope.
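The size relationship behind the 8-bit limitation can be checked with a few lines of arithmetic. The constant and function names below are hypothetical; the element counts (16 * 48 and 16 * 48 * 2) and the 1536-byte figure come from the PR text and commit message.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    MB_COEFFS_8BIT     = 16 * 48,     /* elements kept in the side buffer */
    MB_COEFFS_DECLARED = 16 * 48 * 2  /* elements declared for sl->mb */
};

/* Bytes copied per macroblock by the snapshot. */
static size_t side_buffer_bytes(void)
{
    return MB_COEFFS_8BIT * sizeof(int16_t);
}

/* Bytes a wider side buffer would need to cover all of sl->mb. */
static size_t full_mb_bytes(void)
{
    return MB_COEFFS_DECLARED * sizeof(int16_t);
}

/* Compile-time check of the figure quoted in the commit message (C11). */
_Static_assert(MB_COEFFS_8BIT * sizeof(int16_t) == 1536,
               "side buffer should be 1536 bytes");
```

Covering the high-bit-depth case would therefore double the per-MB copy from 1536 to 3072 bytes.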

Verified

Patches 0016 + 0017 apply cleanly and build in sequence against the Kwiboo v4l2-request-n8.1 fork at the pinned commit b57fbbe5. ff_h264_set_mb_inspect_cb symbol still exported.

Wiring

  • arch PKGBUILD: source[] + prepare().
  • debian build-deb.sh: patch sequence.
  • Both pkgrel: 13 → 14.

Refs reauktion/daedalus-decoder!14 (PR-A2 callback wiring complete; PR-A3 will consume the side buffer).

marfrit added 1 commit 2026-05-26 07:46:38 +00:00
Companion to 0016 (PR #106).  Adds a coefficient side buffer in
H264Context, populated at the start of ff_h264_hl_decode_mb with a
single memcpy from sl->mb BEFORE IDCT-add zeros it.  The existing
post-pixel-work callback (still in 0016) can now read:
  - h->mb_inspect_coeffs  = pre-IDCT coefficients (this patch)
  - h->cur_pic.f->data    = post-pixel-work pre-deblock reconstruction

and derive P = pixels − IDCT(C) for daedalus-decoder's frame-major
dispatch in PR-A3+.

Memcpy gated on (h->mb_inspect_cb != NULL).  Zero cost when no
consumer is registered.  Side buffer = 16 * 48 int16 = 1536 bytes
(matches the 8-bit half of sl->mb's int16_t[16 * 48 * 2] declared
size; high-bit-depth uses the upper half — not preserved here since
the daedalus-decoder consumer is 8-bit-only).

Single-threaded decode assumed at the consumer side
(avctx->thread_count = 1).  Multi-slice / multi-threaded streams
would race on the single side buffer; this is an explicit limitation
of the inspection mechanism.  A future extension would put per-slice
buffers in H264SliceContext.

Verified: patches 0016 + 0017 apply cleanly and build in sequence
against the Kwiboo v4l2-request-n8.1 fork at the pinned commit
b57fbbe5.  ff_h264_set_mb_inspect_cb symbol exported as before.

Wired into arch PKGBUILD + debian build-deb.sh patch sequence.
pkgrel bumped 13 → 14.

Refs reauktion/daedalus-decoder!14 (PR-A2 callback wiring complete,
PR-A3 coefficient extraction is the next consumer).
marfrit merged commit 368fcff41f into main 2026-05-26 07:48:48 +00:00
marfrit deleted branch noether/h264-mb-coeffs-side-buffer 2026-05-26 07:48:48 +00:00

Reference: marfrit/marfrit-packages#107