From df787a6cc29cde88edbd45bf67a392f140afcf1b Mon Sep 17 00:00:00 2001 From: "Claude (noether)" Date: Fri, 8 May 2026 15:16:55 +0000 Subject: [PATCH] iter2 Phase 8 close: 3/5 codecs passing, lesson L1 extended (BOTH directions) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Iteration 2 closes with all 5 Phase 1 boolean-correctness criteria green. Third codec passes — campaign scoreboard 2/5 → 3/5 (H.264 in T4, MPEG-2 in iter1, HEVC in iter2). Loop terminates per feedback_dev_process.md Phase 8. Notable: ZERO Phase 7 → Phase 4 loopbacks needed. Phase 5 review caught all 3 would-be loopback triggers in advance (data_byte_offset rename, dpb.rps→index-arrays semantics, pic_order_cnt_val rename). This is the dev-process ideal: review catches bugs before implementation lands; verification confirms contract. What landed: Code (libva-v4l2-request-fourier master 229d6d1 → 8d71e20): cca539d iter2 Phase 6 commit A: config.c break for HEVCMain case 8d71e20 iter2 Phase 6 commit B: rewrite h265.c against new V4L2 stateless HEVC API (6 files, 463 ins, 236 del) Both authored as Claude (noether) per feedback_gitea_as_claude_noether.md. Campaign docs (fresnel-fourier): 6e8c970 iter2 Phase 0 + Phase 1 lock b3ba157 iter2 Phase 2 situation analysis (6 bugs) d35a247 iter2 Phase 3 baselines (substrate post-pacman-Syu + HEVC anchor) 348736e iter2 Phase 4 plan (10 contract clauses) 9eae068 iter2 Phase 5 sonnet review (3 Critical UAPI errors caught) 05b4bd5 iter2 Phase 7 verification (5/5 GREEN) [this commit] iter2 Phase 8 close Lesson L1 distilled to memory (extension of iter1 entry): feedback_review_empirical_over_theoretical.md updated with Direction 2 corollary. Original iter1 lesson covered author-rebuttal-of-reviewer-finding (lean empirical, defer to Phase 7 byte-compare). iter2 surfaced opposite direction: author-too-credulously-adopting-reviewer-amendment. Concrete iter2 instance: Phase 5 S1 suggested picture->pic_fields.bits.uniform_spacing_flag exists in VAAPI as source for V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING. I adopted into amended plan without verifying. Phase 6 build failed: error: struct has no member named 'uniform_spacing_flag' VAAPI's VAPictureParameterBufferHEVC doesn't expose either bit 19 or bit 20. Reviewer-cited mapping was wrong; cheap gcc test-compile would have caught it. Memory updated with Direction 2 protocol: when reviewer suggests a field mapping, verify empirically (test-compile, struct dump, kernel UAPI grep) BEFORE incorporating into the amended plan. Generalized rule: empirical evidence trumps source-read theory in BOTH directions of Phase 5 review. Backlog items deferred (campaign-internal, not durable memory): B6 — VAAPI ↔ V4L2 SPS field-fidelity gaps: sps_max_num_reorder_pics (post-fix=0, baseline=2), sps_max_latency_increase_plus1 (post-fix=0, baseline=4), possibly PPS bit 12 ENTROPY_CODING_SYNC_ENABLED. VAAPI doesn't expose; FFmpeg parses from bitstream directly. Operational impact NIL (Phase 7 Criterion 4 byte-identical pixel pass). Phase 8 polish-backlog candidate: add SPS bitstream parsing to h265_fill_sps when VAAPI doesn't supply the fields. Probably low ROI — kernel HEVC handler tolerates 0 values for the BBB fixture. Defer until a real-world consumer surfaces a fixture that breaks on this. B7 — Phase 4 plan body sizeof typo: Plan claimed sizeof(scaling_matrix) = 1296. Empirical = 1000 bytes. Code uses sizeof() symbolically so produces correct bytes; only plan body's expected-value comment was wrong. Phase 7 byte-compare structural check caught it. Future polish: state struct sizes via sizeof() references in plan bodies, not hand-computed values. iter1 carryover backlog (still deferred): B3 latent surface-reuse bug — picture.c:287 h264.matrix_set=false hits union byte 240. For HEVC: byte 240 lands in h265.picture. RenderPicture's per-frame VAPictureParameterBufferType overwrite masks the corruption (verified Phase 5 Q3 — mpv-vaapi sends VAPictureParameterBufferType per frame for HEVC, no MPEG-2-style filtering). Iter2+ Phase 4 cross-cutting candidate. B4 context.c H.264 device-init log noise (rkvdec accepts both H.264 + HEVC controls cleanly; on hantro both EINVAL with cosmetic log). Iter2 added a 2nd batched call for HEVC; same (void) swallow pattern. Cosmetic. B5 vbv_buffer_size 1MB vs 1.31MB (MPEG-2-only; not exercised by HEVC). Phase 4 cross-cutting work items collected: - VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC for vaDeriveImage cache-stale fix (still applies to all codecs) - V4L2 device-discovery probe (still applies) - picture.c BeginPicture profile-aware reset (B3) - context.c H.264+HEVC device-init log suppression (B4) - mpeg2 vbv_buffer_size negotiation polish (B5) - h265 SPS bitstream-parse fidelity polish (B6) Campaign roadmap (codec iterations remaining): iter3: VP8 on hantro — implement vp8.c. Smaller scope than iter2; predicting closer to iter1 MPEG-2 in size. No slice_params dynamic-array; single v4l2_ctrl_vp8_frame struct per kernel UAPI. iter4: VP9 on rkvdec — implement vp9.c. Largest control surface remaining. Phase 5 review value confirmed empirically AGAIN: 3 Critical findings caught (data_byte_offset rename, dpb.rps→index-arrays semantics, pic_order_cnt_val rename) — would have been Phase 6 compile failures or silent semantic bugs. Without that review pass, iter2 would have required at least 1-2 Phase 7 → Phase 4 loopback cycles. Reviews are never skippable per global ~/.claude/CLAUDE.md rule; iter2 exemplifies why. Co-Authored-By: Claude Opus 4.7 (1M context) --- phase8_iteration2_close.md | 138 +++++++++++++++++++++++++++++++++++++ 1 file changed, 138 insertions(+) create mode 100644 phase8_iteration2_close.md diff --git a/phase8_iteration2_close.md b/phase8_iteration2_close.md new file mode 100644 index 0000000..3be0cce --- /dev/null +++ b/phase8_iteration2_close.md @@ -0,0 +1,138 @@ +# Iteration 2 — Phase 8 (close) + +Iteration 2 of the fresnel-fourier campaign closes 2026-05-08 with all five Phase 1 boolean-correctness criteria green. **Third codec passes**: campaign scoreboard advances 2/5 → **3/5** (H.264 in T4, MPEG-2 in iter1, HEVC in iter2). + +## What iter2 fixed + +HEVC Main hardware-accelerated decode on Rockchip RK3399 / rkvdec / libva-v4l2-request-fourier — bit-exact correct against software reference, when read via the cache-coherency-safe DMA-BUF GL import path. + +Six bugs (per Phase 2 source-read), all in the libva backend (kernel + driver path was already proven by Phase 0 cross-validator sweep). Iter1's three-bug pattern repeated for HEVC, plus three additional bugs unique to HEVC's larger contract surface: + +1. **`src/config.c:67` HEVCMain fall-through** — case label existed but lacked `break;`, so `vaCreateConfig(VAProfileHEVCMain)` returned `VA_STATUS_ERROR_UNSUPPORTED_PROFILE` (12). Same iter1-Bug-1 pattern. Fix: 5-line `break;` (Commit A). + +2. **`src/picture.c:204-206` HEVCMain explicit reject** — `case VAProfileHEVCMain: return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;` with stale `/* Fourier-local: HEVC stripped, no HW support on RK3566. */` comment. RK3566 was ohm context; fresnel is RK3399 where rkvdec does support HEVC. Fix: replace explicit reject with dispatch to `h265_set_controls()` (Commit B). + +3. **`src/h265.c` used staging-era CIDs** — `V4L2_CID_MPEG_VIDEO_HEVC_PPS/SPS/SLICE_PARAMS` removed from kernel UAPI (mainline split into 7 controls: `V4L2_CID_STATELESS_HEVC_{SPS,PPS,SLICE_PARAMS,SCALING_MATRIX,DECODE_PARAMS,DECODE_MODE,START_CODE}`). Test-compile rejected the old CIDs. Fix: full 588-line rewrite (Commit B). + +4. **HEVC slice_params is now a dynamic-array, not single-slot** — kernel rkvdec advertises `elems=1 dims=[600] dynamic-array`. Old h265.c memcpy-overwrote the single slot per VASliceParameterBufferType arrival; multi-slice frames lost data. Fix: per-slice append to `params.h265.slices[64]` array in `codec_store_buffer`, batched dynamic-array submission in `h265_set_controls` (Commit B). + +5. **Missing controls in submission** — old code submitted only PPS+SPS+SLICE_PARAMS. New API needs DECODE_PARAMS (per-frame DPB info, NEW), SCALING_MATRIX (conditional), DECODE_MODE+START_CODE (device-wide). Fix: 5-control batched per-frame submission + 2-control device-wide init in `context.c` (Commit B). + +6. **`src/meson.build` excluded h265.c + h265.h** — left commented out from the libva-multiplanar iter5 sweep. Fix: uncomment both (Commit B). + +## Contract that was followed + +Per `feedback_dev_process.md` Phase 6 contract-before-code: 10 contract clauses cited from authoritative sources, mapped 1:1 to the patch hunks (full detail in [`phase4_iter2_plan.md`](phase4_iter2_plan.md) + Phase 5 review amendments in [`phase5_iter2_review.md`](phase5_iter2_review.md)): + +- Linux mainline UAPI `:2090-2300` for the 7 stateless HEVC control struct definitions, flag bit positions, and DPB entry semantics. +- FFmpeg Kwiboo branch `libavcodec/v4l2_request_hevc.c:505-565` for batched submission shape and `get_ref_pic_index()` DPB-classification pattern. +- Linux mainline driver `drivers/staging/media/rkvdec/` — **NOT available** for HEVC (mainline 7.1-rc2 has only h264 + vp9 in staging rkvdec; HEVC is out-of-tree per Christian Hewitt patches). Contract anchor instead leans on UAPI doc + FFmpeg + Phase 3 cross-validator bytes. +- Empirical anchor: Phase 3 Baseline B verbatim ([`phase0_evidence/2026-05-08/iter2_phase3/`](phase0_evidence/2026-05-08/iter2_phase3/)) — verbatim 5-control batched submission per frame, used to validate Phase 7 byte-compare. + +## What landed where + +**Code commits on `git.reauktion.de/marfrit/libva-v4l2-request-fourier`** (master 229d6d1 → 8d71e20): + +``` +8d71e20 fresnel-fourier iter2 Phase 6 commit B: rewrite h265.c against new V4L2 stateless HEVC API + (6 files, 463 insertions, 236 deletions: src/h265.c rewrite + src/picture.c + dispatch+accumulation+reset + src/surface.h slices[64] extension + src/h265.h + HEVC_MAX_SLICES_PER_FRAME #define + src/context.c HEVC device-init + src/meson.build + uncomment h265.c+h265.h) +cca539d fresnel-fourier iter2 Phase 6 commit A: config.c break for HEVCMain case +229d6d1 fresnel-fourier iter1 Phase 6 commit D (libva-multiplanar tip + iter1 work) +``` + +Both iter2 commits authored as `Claude (noether)` per memory `feedback_gitea_as_claude_noether.md`. + +**Campaign documents on `git.reauktion.de/marfrit/fresnel-fourier`**: + +``` +05b4bd5 iter2 Phase 7: verification — all 5 criteria GREEN, third codec PASS +9eae068 iter2 Phase 5: sonnet review — 3 critical UAPI errors caught, 7 amendments +348736e iter2 Phase 4: plan — 10 contract clauses, ~400-line h265.c rewrite +d35a247 iter2 Phase 3: baselines — substrate verified post-upgrade, HEVC anchor captured +b3ba157 iter2 Phase 2: situation analysis — six bugs in HEVC path +6e8c970 iter2 Phase 0 + Phase 1 lock: HEVC Main on rkvdec +``` + +(Plus this Phase 8 close.) + +## Phase 1 binding-cell verification (Phase 7 verbatim) + +| Criterion | Result | +|---|---| +| 1: vainfo `VAProfileHEVCMain` enumeration on rkvdec bind | ✅ enumerated, no regression | +| 2: `vaCreateConfig(VAProfileHEVCMain)` | ✅ `VA_STATUS_SUCCESS` (libva trace verbatim) | +| 3: ffmpeg-hwaccel-vaapi exit 0 | ✅ 5 frames, no Failed-to-create, cap_pool engaged | +| 4: DMA-BUF GL HW=SW byte-identical (+02s) | ✅ HW1=SW1=`47a5f3850df5...`, HW2=SW2=`a467b3bc9d7b...`, frames differ | +| 5: iter1 MPEG-2 + T4 H.264 reference hashes | ✅ all 4 hashes match references byte-identical | + +Bonus: post-fix `VIDIOC_S_EXT_CTRLS` payload structurally matches Baseline B cross-validator (count=5, ctrl_class=`V4L2_CTRL_CLASS_CODEC_STATELESS`, sizes 40/64/280/1000/328). Two informational SPS field-value divergences (`sps_max_num_reorder_pics`, `sps_max_latency_increase_plus1`) — VAAPI doesn't expose; non-blocking per Criterion 4 byte-identical pixel pass. + +## Lessons (for Phase 8 memory update) + +The 8-phase loop ran cleanly: zero Phase 7 → Phase 4 loopbacks. **Phase 5 review caught all the would-be loopback triggers in advance** — 3 Critical findings (data_byte_offset rename, dpb.rps→index-arrays semantics, pic_order_cnt_val rename) that would have been Phase 6 compile failures or silent semantic bugs. + +### Worth a memory update + +**L1 — Empirical-over-theoretical applies in BOTH directions of Phase 5 review** + +iter1 lesson `feedback_review_empirical_over_theoretical.md` codified: when a Phase 5 reviewer cites a numerical mismatch, lean empirical (defer to Phase 7 byte-compare) instead of rejecting on source-read theory. iter2 surfaced the **reverse direction**: when a reviewer SUGGESTS a field mapping that the original plan was missing, the author should also verify the suggestion empirically before adopting. + +Concretely: Phase 5 review S1 suggested `picture->pic_fields.bits.uniform_spacing_flag` exists in VAAPI as the source for `V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING`. I incorporated the suggestion into the Phase 4 plan amendments without verifying. Phase 6 build failed: + +``` +../src/h265.c:224:37: error: 'struct ' has no member named 'uniform_spacing_flag' +``` + +Test-compile would have caught it before Phase 6. The lesson generalizes the iter1 rule: **empirical evidence trumps source-read theory in BOTH directions** — author-rebuttals AND reviewer-suggested-amendments. The right discipline: + +- Reviewer flags a numerical mismatch → defer to Phase 7 byte-compare; don't rebut on source-read. +- Reviewer suggests a field mapping → empirically verify (gcc test-compile, struct dump, kernel UAPI grep) before incorporating into the plan. + +**Memory action**: extend the existing `feedback_review_empirical_over_theoretical.md` entry with this iter2-derived corollary. Keep the existing iter1 example, add the iter2 example. + +### Stays as campaign-internal backlog (not durable memory) + +**B6 — VAAPI ↔ V4L2 SPS field-fidelity gap (low-priority polish)**: post-iter2 SPS bytes diverge from Baseline B on `sps_max_num_reorder_pics` (post-fix=0, baseline=2) and `sps_max_latency_increase_plus1` (post-fix=0, baseline=4). VAAPI's `VAPictureParameterBufferHEVC` doesn't expose these; FFmpeg parses them from the bitstream's SPS header. Non-blocking (Criterion 4 byte-identical pixel pass). Phase 8 polish-backlog candidate: add SPS bitstream parsing to `h265_fill_sps` to extract these fields from the slice OUTPUT buffer when VAAPI doesn't supply them. Probably low ROI — kernel HEVC handler tolerates 0 values for the BBB fixture. Defer until a real-world consumer surfaces a fixture that breaks on this. + +**B7 — Phase 4 plan body had a `sizeof()` typo**: claimed `sizeof(struct v4l2_ctrl_hevc_scaling_matrix) = 1296`; empirical sizeof is 1000 = 96 + 384 + 384 + 128 + 6 + 2. Code uses `sizeof()` symbolically so produces correct bytes; only the plan body's expected-value comment was wrong. Phase 7 byte-compare's structural check caught this (plan-body number wrong, code right). Future polish: state struct sizes via `sizeof()` references in plan bodies, not hand-computed values; or have Phase 7 byte-compare always validate sizes via runtime sizeof. + +**B8 — Phase 5 amendment empirically wrong (uniform_spacing_flag)**: same as L1 above; documented inline in `src/h265.c::h265_fill_pps`. Already fixed (PPS flag bits 19+20 left zero — VAAPI doesn't expose either). Lesson is durable (L1); the bug is fixed. + +### Latent bugs surfaced but deferred (carryover) + +**iter1 B3 — picture.c:287 `params.h264.matrix_set = false` writes union byte 240** — for HEVC surfaces, byte 240 lands inside `h265.picture` (Phase 2 Bug 8 verified). RenderPicture's `VAPictureParameterBufferHEVC` per-frame copy overwrites the corrupted byte. Phase 7 confirmed mpv-vaapi sends VAPictureParameterBufferType per frame for HEVC (no silent filtering like vaapi-copy did for MPEG-2), so the masking holds for iter2 binding cells. Latent for VAAPI clients that reuse a surface without re-sending picture params. iter2+ Phase 4 cross-cutting backlog. + +**iter1 B4 — `src/context.c` H.264 device-init log noise on hantro** — extended in iter2 Commit B with HEVC device-init in a separate batched call. New behavior on rkvdec: BOTH H.264 and HEVC device-init batches succeed (rkvdec advertises both control families). On hantro: H.264 batch EINVALs (existing noise), HEVC batch ALSO EINVALs (HEVC controls don't exist on hantro). Net: cosmetic log line might fire twice on hantro instead of once. Same intentional `(void)v4l2_set_controls(...)` swallow pattern; not blocking. + +**iter1 B5 — vbv_buffer_size 1 MB vs negotiated 1.31 MB**: not exercised by HEVC (HEVC's SPS doesn't have an analogous field). MPEG-2-only. + +## What's next (campaign roadmap) + +Per `phase0_evidence/2026-05-07/cross_validator_traces.md` suggested ordering, the campaign now has: + +| Iter | Codec | Status | +|---|---|---| +| iter1 | MPEG-2 (hantro) | ✅ closed | +| iter2 | HEVC (rkvdec) | ✅ closed (this iteration) | +| iter3 | VP8 (hantro) | open — implement `vp8.c` | +| iter4 | VP9 (rkvdec) | open — implement `vp9.c` (largest control surface) | +| iterN | Phase 4 cross-cutting | open — vaDeriveImage cache-stale fix; V4L2 device-discovery; B3 BeginPicture profile-aware reset; B4 context.c log suppression; B5 vbv_buffer_size; B6 SPS bitstream-parse fidelity polish | + +Phase 8 doesn't pre-lock iter3; that's the next operator-driven moment. When iter3 opens, its Phase 0 substrate carries forward iter2's findings (especially B6, B7, B8 polish items if convenient to bundle with the VP8 work). + +Predicted iter3 difficulty (VP8): smaller than iter2 (VP8 has fewer controls than HEVC). Closer to iter1 MPEG-2 in scope. Reuse the per-frame batched submission pattern; no slice_params dynamic-array (VP8 is frame-mode); no DECODE_PARAMS-equivalent (VP8 control state fits in a single `v4l2_ctrl_vp8_frame` struct per kernel UAPI). + +## Memory updates + +One extension to an existing memory entry (no new files): + +- **Extend `feedback_review_empirical_over_theoretical.md`** with iter2's reverse-direction example (reviewer-suggested mapping that proved empirically wrong on test-compile). Same rule, both directions. + +## Iteration close acknowledgement + +Per `feedback_dev_process.md`'s Phase 8: "Loop terminates here." Iteration 2 of fresnel-fourier closes here. The 8-phase loop ran end-to-end with **zero Phase 7 → Phase 4 loopbacks** — Phase 5 review caught all the would-be loopback triggers in advance (3 Critical UAPI errors). This is the dev-process ideal: review catches the bugs before the implementation lands; verification confirms the contract; close documents both the substrate and the lessons. + +Phase 5 review value confirmed empirically. The campaign continues against the corrected memory and the 3-codec-passing scoreboard.