Files
libva-multiplanar/phase8_iteration7_close.md
T
claude-noether ec769a9687 docs: clarify Rockchip silicon across operative docs (RK3566)
PineTab2 is Rockchip RK3566 silicon, not RK3568. The hantro driver
attaches via the rockchip,rk3568-vpu DT compatible because RK3566/
RK3568 silicon is close enough to share that variant. The proper
RK3566 mainline driver target (rkvdec2 / vdpu346) has no kernel
support yet — Christian Hewitt's patch series LKML 2025/12/26/206
is unmerged.

Updated operative docs to use the consistent form:
"PineTab2 (Rockchip RK3566 silicon; hantro driver via the
rockchip,rk3568-vpu DT compatible)" or shorter variants.

Files updated:
- README.md (campaign top-level): TL;DR, deliverable, KWin link,
  hardware target, hardware listing
- firefox-fourier/README.md: tested-on line
- phase8_iteration7_close.md: hardware carry
- phase8_iteration6_close.md: hardware carry, MPEG-2 drop
  rationale
- phase0_findings_iter7.md: predecessor summary, fourier-fresnel
  description, hardware carry
- phase2_iter7_situation.md: msync hypothesis hardware reference

Historical iter1-iter5 phase docs left as-is — they're snapshots
of what the campaign believed at the time. The canonical source
for the silicon-ID correction is track_F_research_2026-05-06.md
(commit 358801b).

Not a correctness change. The campaign's empirical evidence is
unaffected — the hantro/rk3568-vpu driver path that we exercised
was always the actual decode path on PineTab2 silicon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 11:39:28 +00:00

9.7 KiB
Raw Blame History

Iteration 7 close (Phase 8) — A+B+C all GREEN

Opened 2026-05-06 immediately after iter6 close. Closing same day. Locked candidates: A (msync pixel-correctness verification — iter5 sonnet C3), B (slot-leak error recovery — iter6 internal carry), C (cap_pool race synthetic harness — iter5 sonnet C4 / iter6 candidate A formal anchor).

All three closed GREEN. One additional latent bug (OUTPUT pool stale-slot on context-bound resolution change) was surfaced by the Track C harness during Phase 7 verification and fixed in-iteration. One distinct latent bug (STREAMON-on-context-recreate after resolution change) was also surfaced and is logged as iter8 candidate.

Verdict per track

Track A: GREEN — msync removal pixel-verified safe

tests/run_msync_pixel_verify.sh: 100-frame bbb_1080p30_h264.mp4 decoded twice — once via FFmpeg's libavcodec SW path, once via FFmpeg-VAAPI through our v4l2_request driver with hwdownload,format=nv12,crop=$W:$H to normalize MB-padding artifacts. Result: byte-for-byte identical, sha256 58c8f3f4320fe1d761ba3b461d2f63fbbdbad3d2a608a83ae1bca9b0aa3ff6fb for both. iter5 Phase 5 sonnet caveat C3 formally closes — the kernel V4L2 framework does cache sync at DQBUF for CMA-backed allocations on hantro G1 / kernel 6.19.10, the iter1 msync(MS_SYNC|MS_INVALIDATE) was structurally redundant.

Track B: GREEN — slot-leak error recovery

Adds request_pool_force_release(pool, slot_index) that REINITs the slot's fd, falls back to close + media_request_alloc on REINIT failure, and leaves the slot dead-busy if even alloc fails (other slots unaffected, pool capacity reduced by 1). Wired into RequestSyncSurface error paths.

Phase 5 sonnet code review caught a discriminator subtlety pre-commit: the original "always force_release on source_data != NULL" path was unsafe for the OUTPUT-DQBUF-failure sub-case (V4L2 buffer in indeterminate kernel state, reusing the slot would QBUF on a kernel-still-held buffer and hit exactly the iter6 race). Fixed: separate error_buffer_indeterminate label leaves the slot dead-busy in that specific case.

Empirical happy-path regression: mpv 100 frames clean, Firefox bbb 35s+ clean, RDD holds /dev/video1 + /dev/media0 throughout. No Unable to set control(s) / Unable to queue buffer events. Synthetic fault-injection deferred — code path is small + sonnet-reviewed; full inject test would require a separate debug build.

Track C: GREEN — cap_pool race synthetic harness

tests/cap_pool_probe_pattern.c (~140 lines) + tests/run_cap_pool_probe.sh: synthetic test that allocates 4 surfaces at 128×128, destroys them, then allocates 4 surfaces at 1920×1080 — the iter5 sonnet C4 specified pattern. Driver stderr shows two clean cap_pool_init: 24 slots ready events (small + big), no REQBUFS / EBUSY / Unable to request buffers / Unable to set format race indicators. Harness exits 0.

Phase 5 sonnet pre-commit suggested adding vaCreateContext to the test. That suggestion exposed an additional latent bug during Phase 7 (STREAMON-on-context-recreate); reverted to the no-context pattern that matches the actual C4 specification. The new STREAMON bug is logged as iter8 candidate.

What landed

Fork commits (libva-v4l2-request-fourier)

  • 988b848 — iter7 main: A+B+C — slot-leak fix, cap_pool harness, msync verify harness. Three new files in tests/, modifications to src/request_pool.{h,c} and src/surface.c.
  • 7bd0818 — iter7 Phase 7 finalization: OUTPUT-pool teardown on resolution change in src/surface.c (latent bug surfaced by harness). Test refinements (sonnet-recommended context creation reverted; runner grep tightened).

Net: ~545 lines added, ~13 lines modified across 6 files. Three test artifacts in tests/ ready for use as regression checks in iter8+.

Campaign artifacts (libva-multiplanar)

  • phase0_findings_iter7.md — substrate (6 candidates A..F locked-deferred → A+B+C locked)
  • phase2_iter7_situation.md — Phase 2 design + risk + plan + effort per track
  • phase8_iteration7_close.md — this file

Built driver

  • iter7 build: /usr/lib/dri/v4l2_request_drv_video.so sha256 54999017ebd433228a5060cc5e0c3b353e8dd7d3062d158dc63c55eb59503317
  • Note: this sha is the BEFORE the Phase 7 OUTPUT-pool teardown fix landed. Final iter7-end driver requires rebuild from 7bd0818 HEAD; see "Carries to iter8" for the deploy step.

State that carries to iter8 (or campaign close)

  • Hardware: ohm PineTab2 (Rockchip RK3566 silicon; hantro driver via rockchip,rk3568-vpu DT compatible), kernel 6.19.10. Access: ohm (LAN). VPN currently flaky.
  • Userspace: firefox 150.0.1-1.1 (iter5 amendment), libva 2.23.0, mesa 26.0.5, libdrm 2.4.131, mpv 0.41.0-3.
  • Test fixture: bbb_1080p30_h264.mp4 sha256 dcf8a7170fbd....
  • Build container: firefox-fourier LXD on boltzmann, persistent.

Documented limitations carried to iter8+ (or campaign close)

  1. STREAMON-on-context-recreate after resolution change — NEW from iter7 Phase 7. Surfaced when vaCreateContext is called twice with different resolutions, with vaDestroyContext between. Second STREAMON returns EINVAL. Real consumers (Firefox, mpv) don't trigger this (single context per decoder lifetime), so it's a corner case. Fix likely involves ordering in RequestCreateContext — possibly the dev_ctrls S_EXT_CTRLS rejecting on a re-init device, or a STREAMOFF/REQBUFS(0) sequence being incomplete somewhere.

  2. Synthetic fault-injection for slot-leak recovery — Track B's success criterion called for a fault-inject build (REINIT returns -EBUSY after N frames) demonstrating pool starvation pre-fix and recovery post-fix. Deferred — sonnet code review covered the semantic correctness, happy-path regression confirmed no normal-flow impact. Could be added as a debug-build target in iter8 if a hard guarantee is needed.

  3. Probe-pattern test for context-recreate path — iter7 Track C harness covers no-context resolution change. A future test could exercise the context-recreate path once the STREAMON bug above is fixed.

  4. RequestDestroyContext auto-frees render_targets — iter7 verified that our RequestDestroyContext calls RequestDestroySurfaces on its surfaces_ids array. This is non-standard libva behavior but matches consumer organic flow (mpv / Firefox tolerate the subsequent vaDestroySurfaces returning INVALID_SURFACE). Worth documenting as known-deviation in the upstream submission package.

  5. Pool size hardcoded to 16 — iter6 sonnet review suggested max(surfaces_count, DPB_SIZE_H264_MAX). Empirically 16 is fine for the workloads tested; deferring the parameterization until upstream submission requires it.

Lessons distilled to memory

Memory updates this iteration

No new memory entries — Track A/B/C closes existing carry items rather than introducing new diagnostic patterns. Existing memory:

  • feedback_request_fd_lifecycle.md (iter6) covers the REINIT discipline
  • feedback_kernel_obfuscation_compound.md (iter4) covers the cluster-validation pattern
  • feedback_no_fixture_hardcoding.md (iter6) covers test-harness honesty

Process lessons (recorded here, not in standalone memory)

Sonnet design reviews surfaced two real issues in iter7 Phase 5 pre-commit + one in Phase 7:

  1. Pre-commit: discriminator subtlety in surface.c error path (Track B) — fixed to error_buffer_indeterminate label
  2. Pre-commit: stride/padding false-failure in msync verify harness (Track A) — fixed with explicit crop + diagnostic
  3. Pre-commit: cap_pool harness needed vaCreateContext to actually exercise pool init — added; later REVERTED in Phase 7 because the addition surfaced a separate STREAMON bug that wasn't iter7 scope

The third point is meta-instructive: a well-intentioned pre-commit refinement can inadvertently expand scope. Holding to the iter5 sonnet C4 specification ("vaCreateSurfaces small then big, allocation-only") was the right scoping move once Phase 7 evidence showed the context-recreate path had a separate bug class.

Bootlin upstream outlook

iter7 cleanly closes the iter5/iter6 carry list. The fork now has:

  • iter6's per-slot REINIT request_fd discipline (commit a09c03c)
  • iter7's OUTPUT-pool teardown on resolution change (commit 7bd0818)
  • iter7's slot-leak error recovery (commit 988b848)

Three test artifacts in tests/ provide regression coverage: cap_pool probe pattern, msync pixel verify, run-script orchestrators.

Outstanding for upstream-readiness:

  • STREAMON-on-context-recreate (iter8 candidate, latent corner case)
  • Pool-size parameterization (iter6 sonnet review carry, low priority)
  • Multi-codec audit was DROPPED at iter6 close (CPU handles MPEG-2 fine)

Per feedback_no_upstream.md, no PR/MR happens without explicit operator instruction.

Phase 1 success criterion — final per track

Criterion Result
A: 100-frame vaapi-copy frame-by-frame sha matches FFmpeg SW reference ✓ HIT — sha 58c8f3f4... identical for both
B: synthetic fault-inject demonstrates pre-fix starvation, post-fix recovery ⚠ PARTIAL — code-reviewed by sonnet (Phase 5), happy-path regression clean (Phase 7); fault-inject build deferred to iter8 if hard guarantee needed
C: synthetic test issues vaCreateSurfaces small→big, zero REQBUFS-EBUSY in driver stderr ✓ HIT — run_cap_pool_probe.sh exits 0, no race indicators
Phase 5 sonnet review confirms fix is libva-side, all three paths verified, no new mutable global state ✓ HIT — APPROVE-WITH-CHANGES, all changes addressed

Joint success: all three tracks closed on the same iter7-end driver build. Phase 7 verified each. Phase 5 sonnet pre-commit caveats addressed. iter7 closes GREEN with one named carry (STREAMON-on-context-recreate) for iter8.