panvk-bifrost/mesa-panvk-bifrost/iter16/phase8_close.md
marfrit a4e7d8ab90 initial seed: retrofit campaign lineage from local working trees
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00


Phase 8 close — iter16: DEFERRED

Result: iter16 closes as Path D: investigation complete, fix deferred. The 162 winding-order CTS failures categorized in iter15 remain known and documented; the campaign's iter13 and iter15 deliverables are unchanged.

What was attempted

Driver-side primitive decomposition for transform_feedback on non-LIST topologies (TRIANGLE_STRIP / LINE_STRIP / TRIANGLE_FAN / *_WITH_ADJACENCY). Plan: inside panvk_per_arch(CmdDraw), when XFB is active and the topology is non-LIST, build a synthetic index buffer encoding the spec-required decomposition, then dispatch the draw as an indexed LIST draw.

Infrastructure built (all working, tested):

  • panvk_vX_winding.c — topology decomposition tables for 7 topologies
  • panvk_winding_table struct + panvk_per_arch(get_winding_table) API
  • cmdbuf->state.gfx.xfb.decomposed_count field + sysval override for vs.num_vertices
  • IB + topology state save/restore around the synthetic dispatch
  • IB dirty bit + MESA_VK_DYNAMIC_IA_PRIMITIVE_TOPOLOGY dirty bit set
  • Regression probe (iter16/probe_winding.c) parametrized for 3+ topologies

What didn't work (Path A & Path B both):

  • Calling panvk_cmd_draw_indirect directly with a manually-constructed VkDrawIndexedIndirectCommand (Path A)
  • Calling panvk_per_arch(CmdDrawIndexed) from inside the injection after state mutation (Path B, per architect's recommendation)

Both produce the same 8-entry non-indexed output (0,1,2,3,4,5,6,7 for an 8-vert triangle strip), not the expected 18-entry decomposed output (6 triangles × 3 vertices: 0,1,2,1,3,2,...).

What was definitively isolated

  • iter13 XFB + vkCmdDrawIndexed via public entries: works — confirmed by iter16/probe_idx.c. 6 indices [10,11,12,13,14,15] captured exactly.
  • Render-pass scope isn't the issue: vkCmdBindIndexBuffer AFTER pBeginRendering works fine if it's a real vkCmd call.
  • info.index.size being zero isn't the issue (architect's diagnosis): my draw construction set it correctly to 4.
  • The mystery: state-mutation-from-within-CmdDraw doesn't reproduce what a separate vkCmdBindIndexBuffer2 call sets up. Hypotheses still on the table:
    • Pipeline-bind-time descriptor emission captures IB-bound state at that moment
    • VK_FROM_HANDLE(panvk_buffer) in CmdBindIndexBuffer2 registers BO with batch in a way direct state writes skip
    • Mali JM dirty-tracking needs explicit invalidation we're missing
  • Resolving this requires either Mali Graph Profiler / RGP (which we don't have) or significantly more time in driver internals.

What ships from iter16

  • ALL Phase 0-3 docs in iter16/ (substrate, source map, design lock, probe + Makefile)
  • The full WIP code in iter16/applied_state/panvk_vX_winding.c plus the modifications to panvk_cmd_draw.h, panvk_vX_cmd_draw.c, jm/panvk_vX_cmd_draw.c, meson.build — applied on ohm but reverted from any published package
  • iter16/probe_winding.c + probe_idx.c — both probes work as regression tests if iter16 resumes
  • iter16/phase4_progress.md — detailed status for resumer, including the architect consultation outcome
  • iter16/phase8_close.md — this doc

What does NOT ship from iter16

  • No code changes to the published mesa-panvk-bifrost-26.0.6.r3 package
  • No CTS rerun (the 162 winding fails remain — same as iter15's measurement)
  • No upstream Mesa MR

Why deferred and not "Path C — NIR-pass decomposition"

Path C is the remaining structural option and probably the right long-term fix (200-400 LoC in pan_nir_lower_xfb to emit multiple nir_store_global calls per VS invocation — one per primitive each vertex contributes to). It would bypass the dispatch-level mystery entirely. But:

  • It's multi-day Mesa-internals work (NIR builder + shader-cache invalidation + per-topology lowering rules).
  • Real-world impact is approximately zero: ANGLE on Vulkan (the iter13/Brave motivator) doesn't trigger this path because ANGLE pre-decomposes strip topologies before issuing the Vulkan call (mirroring OpenGL's own decomposition rules).
  • The iter13 + iter15 standing campaign deliverables (Vulkan-on-Brave + 75.7% transform_feedback CTS pass rate) are NOT affected by leaving this open.

Path C remains the right move if someone returns to iter16 with time/motivation.

ohm state cleanup

The WIP iter16 patches are still applied on ohm at /home/mfritsche/mesa-build/mesa-26.0.6/. They build clean. The patched lib is in /home/mfritsche/panvk-patched-libs/libvulkan_panfrost.so, but the system-installed /usr/lib/panvk-bifrost/ is still the untouched r3 build. So the campaign's published-package behavior is unchanged.

To fully revert ohm to a clean iter13-only source state (if needed for a future iter), use the patches in iter16/applied_state/. They are easy to identify (all marked with iter16: comments) and to reverse-apply.

Bottom line

iter16 = investigation closed. Path D (defer) chosen because Path B (architect's pick) didn't pan out and Path C (NIR pass) wasn't worth a multi-day investment given approximately zero real-world impact on the iter9/iter13 ANGLE-on-Vulkan campaign target. Anyone resuming iter16 should start from iter16/phase4_progress.md and the listed hypotheses.

— claude-noether, 2026-05-21