panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.
This retrofit imports:
- mesa-panvk-bifrost/: r1..r4 era phase docs (iter1..iter18); libmali stub blobs at iter18/blob/ are excluded (109 MB of RE artifacts replaced with a README pointer)
- mesa-panvk-bifrost-video/: sibling campaign phase docs + probe
- evidence/: frozen .tgz source snapshots at each milestone (basis for the 0005 patch diff generation)
Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.
Total: 1.9 MB across 124 files.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 8 close — iter16: DEFERRED
Result: iter16 closes as Path D: investigation complete, fix deferred. The 162 winding-order CTS fails categorized in iter15 remain known and documented; the campaign's iter13 + iter15 deliverables are unchanged.
What was attempted
Driver-side primitive decomposition for transform_feedback on non-LIST topologies (TRIANGLE_STRIP / LINE_STRIP / TRIANGLE_FAN / *_WITH_ADJACENCY). Plan: inside panvk_per_arch(CmdDraw), when XFB-active + non-LIST, build a synthetic index buffer encoding the spec-required decomposition, dispatch as indexed-LIST.
Infrastructure built (all working, tested):
- panvk_vX_winding.c: topology decomposition tables for 7 topologies
- panvk_winding_table struct + panvk_per_arch(get_winding_table) API
- cmdbuf->state.gfx.xfb.decomposed_count field + sysval override for vs.num_vertices
- IB + topology state save/restore around the synthetic dispatch
- IB dirty bit + MESA_VK_DYNAMIC_IA_PRIMITIVE_TOPOLOGY dirty bit set
- Regression probe (iter16/probe_winding.c) parametrized for 3+ topologies
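For orientation, here is a minimal sketch of how a decomposed vertex count (the value a field like decomposed_count would carry) could be derived per topology. It is not the code in panvk_vX_winding.c: the enum and function names are hypothetical, the choice of these seven topologies is an assumption based on the list above, and it assumes adjacency vertices are dropped when no geometry shader is present.

```c
#include <stdint.h>

/* Hypothetical enum for illustration only; the real driver works with the
 * Vulkan/Mesa topology enums. */
enum sketch_topology {
   SKETCH_LINE_STRIP,
   SKETCH_TRIANGLE_STRIP,
   SKETCH_TRIANGLE_FAN,
   SKETCH_LINE_LIST_ADJ,
   SKETCH_LINE_STRIP_ADJ,
   SKETCH_TRIANGLE_LIST_ADJ,
   SKETCH_TRIANGLE_STRIP_ADJ,
};

/* Number of list-topology vertices produced when an n-vertex draw with the
 * given topology is expanded into independent lines or triangles.
 * Assumption: adjacency vertices are not captured. */
static uint32_t
sketch_decomposed_count(enum sketch_topology topo, uint32_t n)
{
   switch (topo) {
   case SKETCH_LINE_STRIP:
      return n >= 2 ? 2 * (n - 1) : 0;        /* n - 1 segments */
   case SKETCH_TRIANGLE_STRIP:
   case SKETCH_TRIANGLE_FAN:
      return n >= 3 ? 3 * (n - 2) : 0;        /* n - 2 triangles */
   case SKETCH_LINE_LIST_ADJ:
      return 2 * (n / 4);                     /* one segment per 4 vertices */
   case SKETCH_LINE_STRIP_ADJ:
      return n >= 4 ? 2 * (n - 3) : 0;        /* n - 3 segments */
   case SKETCH_TRIANGLE_LIST_ADJ:
      return 3 * (n / 6);                     /* one triangle per 6 vertices */
   case SKETCH_TRIANGLE_STRIP_ADJ:
      return n >= 6 ? 3 * ((n - 4) / 2) : 0;  /* (n - 4) / 2 triangles */
   }
   return 0;
}
```

For the 8-vertex triangle strip used by the probe, this gives 18, which is the size the synthetic index buffer and the vs.num_vertices override would both need to agree on.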
What didn't work (Path A & Path B both):
- Calling panvk_cmd_draw_indirect directly with a manually-constructed VkDrawIndexedIndirectCommand (Path A)
- Calling panvk_per_arch(CmdDrawIndexed) from inside the injection after state mutation (Path B, per architect's recommendation)
Both produce the same 8-entry non-indexed output (0,1,2,3,4,5,6,7 for an 8-vert triangle strip), not the expected 18-entry decomposed output (0,1,2,1,3,2,...).
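The expected 18-entry sequence follows from the Vulkan triangle-strip rule, where primitive i is (i, i + 1 + i%2, i + 2 - i%2). A minimal sketch of that expansion, using a hypothetical helper name rather than the actual winding-table code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the code in panvk_vX_winding.c.
 * Expands an n-vertex triangle strip into triangle-list indices.
 * Odd primitives swap their last two vertices so the winding order of
 * every emitted triangle matches the first one.
 * Returns the number of indices written; `out` needs room for
 * 3 * (vertex_count - 2) entries. */
static size_t
decompose_triangle_strip(uint32_t vertex_count, uint32_t *out)
{
   assert(out != NULL);
   if (vertex_count < 3)
      return 0;

   size_t written = 0;
   for (uint32_t i = 0; i + 2 < vertex_count; i++) {
      uint32_t odd = i & 1;
      out[written++] = i;
      out[written++] = i + 1 + odd;
      out[written++] = i + 2 - odd;
   }
   return written;
}
```

Called with vertex_count = 8, this writes 0,1,2, 1,3,2, 2,3,4, 3,5,4, 4,5,6, 5,7,6, which is the stream the probe should capture once the synthetic indexed dispatch actually takes effect.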
What was definitively isolated
- iter13 XFB + vkCmdDrawIndexed via public entries: works, confirmed by iter16/probe_idx.c. 6 indices [10,11,12,13,14,15] captured exactly.
- Render-pass scope isn't the issue: vkCmdBindIndexBuffer AFTER pBeginRendering works fine if it's a real vkCmd call.
- info.index.size being zero isn't the issue (architect's diagnosis): my draw construction set it correctly to 4.
- The mystery: state-mutation-from-within-CmdDraw doesn't reproduce what a separate vkCmdBindIndexBuffer2 call sets up. Hypotheses still on the table:
  - Pipeline-bind-time descriptor emission captures IB-bound state at that moment
  - VK_FROM_HANDLE(panvk_buffer) in CmdBindIndexBuffer2 registers BO with batch in a way direct state writes skip
  - Mali JM dirty-tracking needs explicit invalidation we're missing
- Resolving requires either Mali Graph Profiler / RGP (we don't have) or significantly more time in driver internals.
What ships from iter16
- ALL Phase 0-3 docs in iter16/ (substrate, source map, design lock, probe + Makefile)
- The full WIP code in iter16/applied_state/: panvk_vX_winding.c plus the modifications to panvk_cmd_draw.h, panvk_vX_cmd_draw.c, jm/panvk_vX_cmd_draw.c, meson.build; applied on ohm but reverted from any published package
- iter16/probe_winding.c + probe_idx.c: both probes work as regression tests if iter16 resumes
- iter16/phase4_progress.md: detailed status for resumer, including the architect consultation outcome
- iter16/phase8_close.md: this doc
What does NOT ship from iter16
- No code changes to the published mesa-panvk-bifrost-26.0.6.r3 package
- No CTS rerun (the 162 winding fails remain, same as iter15's measurement)
- No upstream Mesa MR
Why deferred and not "Path C — NIR-pass decomposition"
Path C is the remaining structural option and probably the right long-term fix (200-400 LoC in pan_nir_lower_xfb to emit multiple nir_store_global calls per VS invocation — one per primitive each vertex contributes to). It would bypass the dispatch-level mystery entirely. But:
- It's multi-day Mesa-internals work (NIR builder + shader-cache invalidation + per-topology lowering rules).
- Real-world impact is approximately zero: ANGLE on Vulkan (the iter13/Brave motivator) doesn't trigger this path because ANGLE pre-decomposes strip topologies before issuing the Vulkan call (mirroring OpenGL's own decomposition rules).
- The iter13 + iter15 standing campaign deliverables (Vulkan-on-Brave + 75.7% transform_feedback CTS pass rate) are NOT affected by leaving this open.
Path C remains the right move if someone returns to iter16 with time/motivation.
ohm state cleanup
The WIP iter16 patches are still applied on ohm at /home/mfritsche/mesa-build/mesa-26.0.6/. They build clean. The patched lib is in /home/mfritsche/panvk-patched-libs/libvulkan_panfrost.so but the system-installed /usr/lib/panvk-bifrost/ is r3 untouched. So the campaign's published-package behavior is unchanged.
To fully revert ohm to a clean iter13-only source state (if needed for a future iter): the patches are in iter16/applied_state/. They are easy to identify (all marked with iter16: comments) and to reverse-patch.
Bottom line
iter16 = investigation closed. Path D (defer) chosen because Path B (architect's pick) didn't pan out and Path C (NIR pass) wasn't worth a multi-day investment given approximately zero real-world impact on the iter9/iter13 ANGLE-on-Vulkan campaign target. Anyone resuming iter16 should start from iter16/phase4_progress.md and the listed hypotheses.
— claude-noether, 2026-05-21