Files
fresnel-fourier/phase8_iteration1_close.md
claude-noether dc6937868a iter1 Phase 8 close: 2/5 codecs passing, 3 lessons distilled to memory
Iteration 1 closes with all five Phase 1 boolean-correctness criteria
green. Second codec passes — campaign scoreboard 1/5 → 2/5 (H.264
in T4, MPEG-2 in iter1). Loop terminates per
feedback_dev_process.md Phase 8.

What landed:

  Code (libva-v4l2-request-fourier master 65969da..229d6d1):
    e7dad7a iter1 Phase 6 commit A: config.c break for MPEG-2 cases
    5fe873c iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API
    3aab187 iter1 Phase 6 commit C: delete staging-era include/mpeg2-ctrls.h
    229d6d1 iter1 Phase 6 commit D: drop missed mpeg2-ctrls.h include from context.c (fix-forward)

  All four authored as Claude (noether) per feedback_gitea_as_claude_noether.md.

  Campaign docs (fresnel-fourier):
    f720c77 iter1 Phase 0 + Phase 1 lock
    cc55a6e iter1 Phase 2 situation analysis (3 bugs)
    b9625af iter1 Phase 3 baseline measurements (4 baselines)
    3e996d0 iter1 Phase 4 plan (6 contract clauses)
    0e2e1c2 iter1 Phase 5 sonnet review (6 findings, 4 amendments)
    ec9133a iter1 Phase 7 verification (5/5 GREEN)
    [this commit] iter1 Phase 8 close

Three lessons distilled to durable memory:

  L1 — feedback_header_deletion_check.md
    Phase 2 grep audit found 2 of 3 include sites for the
    staging-era mpeg2-ctrls.h header; build broke on Commit C
    delete because of context.c:42. Rule: when removing a header,
    let the compiler enumerate includes authoritatively (clean
    rebuild after include-removal patches, before git rm). Grep
    is a hint; the compiler is the authority.

  L2 — feedback_review_empirical_over_theoretical.md
    Phase 5 reviewer S2 flagged a numerical mismatch in
    vbv_buffer_size between Baseline C (1.31 MB) and predicted
    post-fix (1 MB). I rejected with confident source-read
    reasoning (slot->size = sizeimage = matches). Phase 7
    byte-compare empirically showed the reviewer was right —
    post-fix produces 1 MB, not 1.31 MB. Operational impact nil
    (kernel ignores the field), but my Phase 5 rebuttal had a
    source-read gap. Rule: when a reviewer cites a concrete
    numerical discrepancy, defer to Phase 7 byte-compare; don't
    reject on source-read alone.

  L3 — feedback_rockchip_pixel_verify_path.md
    Iter1 + T4 (H.264) both empirically confirm: libva backend's
    vaDeriveImage / cached-mmap readback returns all-zero NV12
    on RK3399 — same iter1 patch-0011 cache-coherency bug class
    observed on RK3568. Pixel verification must use DMA-BUF GL
    import (mpv --vo=image, ffmpeg-v4l2request DRM_PRIME +
    hwdownload). NOT ffmpeg-vaapi+hwdownload (cache-stale on
    Rockchip). Codec-agnostic; applies to all 5 codecs in
    campaign scope.

Backlog items deferred (campaign-internal, not durable memory):

  B1: V4L2 /dev/videoN numbering shuffles across reboots
      (rkvdec moved video3+media1 → video1+media0 between
       Phase 0/3 and Phase 7). Backend should probe /dev/media*
      for driver match. Iter2+ Phase 4 cross-cutting candidate.

  B2: mpv --hwdec=vaapi-copy silently filters MPEG-2 out before
      libva is loaded. mpv --hwdec=vaapi (DMA-BUF) DOES engage.
      Phase 1 criterion 3 ended up anchored on ffmpeg-direct.
      Mpv-side investigation as separate follow-up.

  B3: Latent surface-reuse bug — picture.c:287 h264.matrix_set
      reset writes byte 240 of params union, lands inside
      mpeg2.iqmatrix.chroma_intra_quantiser_matrix[20] (verified
      via offsetof on fresnel via gcc + libva). Per-frame
      RenderPicture overwrites this byte for ffmpeg-vaapi flows
      that send VAIQMatrixBufferType every frame. Latent for
      VAAPI clients that reuse a surface without re-sending
      IQMatrix. Iter2+ candidate.

  B4: src/context.c:142-155 H.264 device-init runs unconditionally
      on every CreateContext, EINVALs on hantro. Intentional
      best-effort but request_log fires "Unable to set
      control(s)" cosmetically. Suppress-log candidate, low
      priority.

  B5: vbv_buffer_size = SOURCE_SIZE_MAX (1 MB) rather than
      negotiated sizeimage. Kernel ignores. Polish candidate.

Phase 4 cross-cutting work items collected:
  - Add VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC to libva backend
    image-export (fixes L3's vaDeriveImage cache-stale bug for
    all codecs).
  - V4L2 device-discovery probe (fixes B1).
  - Picture.c BeginPicture profile-aware reset (fixes B3).
  - Context.c H.264 device-init log suppression (fixes B4).

Campaign roadmap (codec iterations remaining):
  iter2: HEVC on rkvdec — re-enable h265.c in build, audit against
         rkvdec kernel HEVC contract.
  iter3: VP8 on hantro — implement vp8.c.
  iter4: VP9 on rkvdec — implement vp9.c (largest control surface).

Phase 5 review S2 historical-record correction: Phase 5 reviewer
was numerically right about vbv_buffer_size. My Phase 5 rebuttal
in phase5_iter1_review.md was empirically wrong. Acknowledged in
phase7_iter1_verification.md and phase8_iteration1_close.md;
Phase 5 doc preserved as-is for the historical record.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 09:37:19 +00:00

12 KiB
Raw Permalink Blame History

Iteration 1 — Phase 8 (close)

Iteration 1 of the fresnel-fourier campaign closes 2026-05-08 with all five Phase 1 boolean-correctness criteria green. Second codec passes: campaign scoreboard advances 1/5 → 2/5 (H.264 from T4, MPEG-2 from iter1).

What iter1 fixed

MPEG-2 hardware-accelerated decode on Rockchip RK3399 / hantro-vpu-dec / libva-v4l2-request-fourier — bit-exact correct against software reference, when read via the cache-coherency-safe DMA-BUF GL import path.

Three independent bugs (per Phase 2 source-read), all in the libva backend (kernel + driver path was already solid per Phase 0 cross-validator sweep):

  1. src/config.c:55-69 fall-through to default: — MPEG-2 case labels existed but lacked break;, so vaCreateConfig(VAProfileMPEG2Main) returned VA_STATUS_ERROR_UNSUPPORTED_PROFILE (= 12). Fix: 5-line break; addition.

  2. src/mpeg2.c used staging-era UAPIV4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (= V4L2_CID_MPEG_BASE+250 = 0x9909fa) and the combined slice_params struct. Mainline kernel removed these in favor of the split V4L2_CID_STATELESS_MPEG2_{SEQUENCE,PICTURE,QUANTISATION} (= 0xa409dc/dd/de) with three separate control structs. Fix: full rewrite (~120 lines) against the new API; one batched VIDIOC_S_EXT_CTRLS count=3 per frame matching FFmpeg's libavcodec/v4l2_request_mpeg2.c:130-155 reference.

  3. include/mpeg2-ctrls.h was the staging-era local header — masking the kernel's modern <linux/v4l2-controls.h> definitions. Fix: delete the local header. (Required dropping #include from THREE files — src/config.c:37, src/mpeg2.c:38, and src/context.c:42. Phase 2 grep audit only found the first two; Commit D fix-forward caught the third.)

Contract that was followed

Per feedback_dev_process.md Phase 6 contract-before-code, citations from authoritative sources, mapped 1:1 to the patch hunks (full detail in phase4_iter1_plan.md Clauses 16):

  • Linux mainline UAPI: include/uapi/linux/v4l2-controls.h:1985-2105 for the new MPEG-2 stateless control IDs, struct layouts, and flag macros.
  • FFmpeg Kwiboo branch: libavcodec/v4l2_request_mpeg2.c:130-155 for the batched submission shape (3 v4l2_ext_control entries, single VIDIOC_S_EXT_CTRLS).
  • Linux mainline driver: drivers/media/platform/verisilicon/hantro_mpeg2.c::hantro_mpeg2_dec_copy_qtable — confirms zigzag-to-raster permutation is kernel-side; libva backend's quantisation matrices stay in zigzag order (no permutation in userspace).
  • Empirical anchor: Phase 3 Baseline C (phase0_evidence/2026-05-07/iter1_phase3/baseline_C_xvalidator/) — verbatim 256-byte QUANTISATION payload from ffmpeg-v4l2request decode of the BBB MPEG-2 fixture, used for byte-by-byte transcription of mpeg2_default_intra_matrix[64] (per Phase 5 S3 amendment that flagged spec-recall as unreliable).

What landed where

Code commits on git.reauktion.de/marfrit/libva-v4l2-request-fourier:

229d6d1 fresnel-fourier iter1 Phase 6 commit D: drop missed mpeg2-ctrls.h include from context.c (fix-forward)
3aab187 fresnel-fourier iter1 Phase 6 commit C: delete staging-era include/mpeg2-ctrls.h
5fe873c fresnel-fourier iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API
e7dad7a fresnel-fourier iter1 Phase 6 commit A: config.c break for MPEG-2 cases
65969da iter8 Phase 4: tests/run_perf_binding_cell.sh — perf binding cell harness  (libva-multiplanar tip)

All four commits authored as Claude (noether) per memory feedback_gitea_as_claude_noether.md.

Campaign documents on git.reauktion.de/marfrit/fresnel-fourier:

ec9133a iter1 Phase 7: verification — all 5 criteria GREEN, second codec PASS
0e2e1c2 iter1 Phase 5: sonnet-architect review — 6 findings, 4 amendments
3e996d0 iter1 Phase 4: plan — contract clauses, diff scope, Phase 7 harness
cc55a6e iter1 Phase 2: situation analysis — three bugs in MPEG-2 path
b9625af iter1 Phase 3: baseline measurements — Phase 2 confirmed empirically
f720c77 iter1 Phase 0 + Phase 1 lock: MPEG-2 boolean correctness on hantro

(Plus campaign Phase 0 setup commits + this Phase 8 close.)

Phase 1 binding-cell verification (Phase 7 verbatim)

Criterion Result
1: vainfo MPEG-2 Simple+Main enumeration both profiles with VAEntrypointVLD
2: vaCreateConfig(VAProfileMPEG2Main) VA_STATUS_SUCCESS (libva trace verbatim)
3: ffmpeg-hwaccel-vaapi engages, exit 0 5 frames, exit 0, no Failed to create decode configuration
4: DMA-BUF GL HW=SW byte-identical (+02s) HW1=SW1=6e7873030dbf..., HW2=SW2=ccc7ce08810d..., frames differ
5: T4 H.264 reference hashes match HW + SW frames match f623d5f7... and 7d7bc6f2...

Bonus: post-fix VIDIOC_S_EXT_CTRLS payload structurally matches Baseline C cross-validator (count=3, ctrl_class=0xf010000=V4L2_CTRL_CLASS_CODEC_STATELESS, sizes 12/32/256, QUANTISATION intra_matrix bytes byte-identical). One numerical divergence in vbv_buffer_size (1 MB post-fix vs 1.31 MB cross-validator) — kernel-ignored informational field, non-blocking.

Lessons (for Phase 8 memory update)

The 8-phase loop caught five lessons worth distilling. Three become durable memory entries (next section); two stay as campaign-internal backlog notes.

Worth a memory entry

L1 — Header deletion: completeness check is git rm + clean rebuild, not grep alone.

Phase 2 source-read used git grep "mpeg2-ctrls.h" to enumerate include sites. The grep returned src/config.c:37 and src/mpeg2.c:38 — Phase 4 plan and Commit B + C scope expanded only those two files. After Commit C (git rm include/mpeg2-ctrls.h), the build broke on src/context.c:42 — a third include the grep had missed (most likely because of grep tooling quirk + my path filtering). Cost: one fix-forward commit (D) and a moment of confused debugging.

Rule: when removing a header, schedule git rm during the patch series and verify with a clean rebuild. The compiler enumerates includes authoritatively; grep is best-effort. Memory entry: feedback_header_deletion_check.md.

L2 — When the Phase 5 reviewer flags a numerical discrepancy, lean empirical, not theoretical.

Phase 5 reviewer's S2 finding said: "post-fix vbv_buffer_size will be SOURCE_SIZE_MAX (1 MB), Baseline C shows 0x151800 (1.31 MB) — discrepancy unexplained." I responded with confident source-read reasoning that slot->size = sizeimage = 1382400 ergo equality holds, and rejected the finding. Phase 7 byte-compare empirically showed: post-fix produces 1 MB, NOT 1.31 MB. The reviewer was numerically right; my source-read was wrong (the actual slot->size resolution path doesn't equal the negotiated sizeimage in our backend's S_FMT setup).

Operational impact: nil (kernel ignores vbv_buffer_size per the doc). But the historical record of "I rejected this finding with the wrong reason" is itself the lesson.

Rule: when a reviewer cites a concrete numerical mismatch between two captured payloads, run the empirical byte-compare in Phase 7 BEFORE rejecting on source-read grounds. Source-read can have gaps; verbatim-vs-verbatim comparison can't lie. Memory entry: feedback_review_empirical_over_theoretical.md.

L3 — On RK3399 (and Rockchip in general?): vaDeriveImage cache-stale bug class is real and codec-agnostic.

Both H.264 (T4) and MPEG-2 (iter1) decode paths produce all-zero NV12 when read via libva's vaDeriveImage + cached mmap (the path that ffmpeg-vaapi + hwdownload uses). The kernel + driver decode is correct (verified by ffmpeg-v4l2request DRM_PRIME path producing matching pixels to SW reference). The bug is in the libva backend's image-export — same iter1 patch-0011 bug class observed on RK3568 (ohm) but pre-fixed there.

Rule: pixel verification on Rockchip platforms must use the DMA-BUF GL import path (mpv --vo=image or equivalent), NEVER the cached-mmap readback path. Treat ffmpeg-vaapi+hwdownload's correct-pixel guarantee as suspect on these SoCs until backend gains VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC support. Memory entry: feedback_rockchip_pixel_verify_path.md.

Stays as campaign-internal backlog (not durable memory)

B1 — V4L2 /dev/videoN numbering shuffles across reboots on RK3399. Phase 0/3 had rkvdec=video3+media1, hantro=video5+media2; this boot has rkvdec=video1+media0, hantro=video3+media1. Hardcoded env-var paths in Phase 1 binding cells are fragile. Backend should probe /dev/media* for driver=hantro-vpu/rkvdec rather than rely on env var stability. Iter2+ Phase 4 cross-cutting candidate.

B2 — mpv --hwdec=vaapi-copy silently filters MPEG-2 out before libva is loaded. mpv's hwdec policy or ffmpeg-mpv glue layer rejects MPEG-2 from the vaapi-copy candidates, so the binding cell never engaged the libva backend. mpv --hwdec=vaapi (DMA-BUF, not -copy) DOES engage. Phase 1 criterion 3 anchored on ffmpeg-direct as a result; mpv-vaapi-copy-MPEG-2 is a separate follow-up task (mpv-side investigation, possibly an --hwdec-codecs filter override).

Latent bugs surfaced but deferred

B3 — src/picture.c:287 params.h264.matrix_set = false writes byte 240 of the surface's params union, which (per offsetof verification on fresnel via gcc + libva) lands inside mpeg2.iqmatrix.chroma_intra_quantiser_matrix[20]. For ffmpeg-vaapi-driven flows this byte is overwritten by the per-frame VAIQMatrixBufferType (which codec_store_buffer:140-146 copies wholesale). For VAAPI clients that reuse a surface without re-sending the IQMatrix on every cycle, the corruption persists. Latent bug; not exercised by iter1's binding cells. Iter2+ candidate — the fix is making the BeginPicture matrix_set reset profile-aware, or simpler memset(&params, 0, sizeof params).

B4 — src/context.c:142-155 H.264 device-init runs unconditionally on every CreateContext regardless of profile. Returns EINVAL on hantro-vpu-dec (which has no H.264 controls). Intentional best-effort behavior — return cast to (void) and discarded — but src/v4l2.c:484: request_log("Unable to set control(s): %s\n", strerror(errno)) still fires unconditionally, so every MPEG-2 decode session prints v4l2-request: Unable to set control(s): Invalid argument once at init. Cosmetic only; iter2+ cleanup candidate (suppress the log when caller intent is best-effort silent fail).

B5 — vbv_buffer_size = SOURCE_SIZE_MAX = 1 MB rather than the kernel-negotiated sizeimage. Informational field, kernel ignores. Low-priority post-iter1 polish.

What's next (campaign roadmap)

Per phase0_evidence/2026-05-07/cross_validator_traces.md suggested ordering, the campaign now has:

Iter Codec Status
iter1 MPEG-2 (hantro) closed (this iteration)
iter2 HEVC (rkvdec) open — re-enable h265.c in build, audit against rkvdec kernel contract
iter3 VP8 (hantro) open — implement vp8.c
iter4 VP9 (rkvdec) open — implement vp9.c (largest kernel control surface)
iterN Phase 4 cross-cutting open — vaDeriveImage cache-stale fix; V4L2 device-discovery; B3-B5 polish

Phase 8 doesn't pre-lock iter2; that decision is the next operator-driven moment. When iter2 opens, its Phase 0 substrate carries forward iter1's findings (especially B3-B5 if those iterations risk exercising them).

Memory updates (next section)

Three new memory entries land alongside this close:

  • feedback_header_deletion_check.md (L1)
  • feedback_review_empirical_over_theoretical.md (L2)
  • feedback_rockchip_pixel_verify_path.md (L3)

Plus an update to phase5_iter1_review.md to acknowledge S2 was numerically right (the historical-record correction).

Plus update to MEMORY.md index.

Iteration close acknowledgement

Per feedback_dev_process.md's Phase 8: "Loop terminates here." Iteration 1 of fresnel-fourier closes here. The 8-phase loop ran once end-to-end, no Phase 7 → Phase 4 loopbacks needed (one Phase 6 → Phase 4 mid-implementation surface for Commit D fix-forward). Phase 5 sonnet review caught two amendments worth landing (S3, Q4) plus one finding (S2) my response was wrong about. The campaign is on track.