Files
panvk-bifrost/mesa-panvk-bifrost/phase5_iter13_close.md
T
marfrit a4e7d8ab90 initial seed: retrofit campaign lineage from local working trees
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00

5.2 KiB

Phase 5 close — iter13 second-model review

Reviewer: janet (ARM/DDR bare-metal specialist agent — closest available to driver/NIR review). Verdict: NEEDS FIX BEFORE MERGE (one CRITICAL, two HIGH, two MEDIUM, two LOW). Outcome: all CRITICAL + HIGH addressed in this phase, MEDIUM + LOW addressed where cheap.

CRITICAL: single-variant ships, dual-variant was Phase 2 lock

Janet's catch: a pipeline with XFB-decorated outputs used in a NON-XFB draw (or in an XFB draw with fewer buffers bound than declared) would write nir_store_global to address 0 → GPU page fault → DEVICE_LOST.

The Phase 2 lock specified dual-variant (B). Phase 4 shipped single-variant (closer to option C). Janet recommended the dual-variant refactor.

Resolution: option Z (better than B or C) — Panfrost-Gallium memory-sink idiom.

While re-reading gallium/drivers/panfrost/pan_cmdstream.c:1339-1366, I found the Gallium PAN_SYSVAL_XFB handler does exactly this: when no XFB target is bound, it sets the address sysval to 0x8000_0000_0000_0000 (= PAN_SHADER_OOB_ADDRESS). The Bifrost MMU silently discards stores to this address. No fault. No dual variants. Single-variant solution at no runtime cost.

Applied in panvk_vX_cmd_draw.c:

uint64_t _xa0 = PAN_SHADER_OOB_ADDRESS, _xa1 = PAN_SHADER_OOB_ADDRESS,
         _xa2 = PAN_SHADER_OOB_ADDRESS, _xa3 = PAN_SHADER_OOB_ADDRESS;
if (_gfx->xfb.active) {
   if (_gfx->xfb.buffer_count > 0 && _gfx->xfb.buffers[0].addr)
      _xa0 = _gfx->xfb.buffers[0].addr + _gfx->xfb.buffers[0].offset;
   /* ... 1..3 ... */
}

Plus #include "pan_compiler.h" for the constant.

Saved as project memory project_pan_shader_oob_address.md — the canonical conditional-write idiom for Panfrost. Will be useful for any future feature with driver-state-conditional shader writes.

A new regression probe probe_xfb_nodraw.c covers Janet's exact scenario: same XFB-capable pipeline as probe_xfb.c but no Bind/Begin/End and no buffer bound, just a raw vkCmdDraw. Expected: [PASS] XFB-capable pipeline survives non-XFB draw — memory-sink active. (DEVICE_LOST = FAIL).

HIGH: _pad_xfb ghost padding

Janet's catch: the explicit uint32_t _pad_xfb after num_vertices was supposed to keep 8-byte alignment for aligned_u64 xfb_address[4], but the compiler inserts another 4 anonymous bytes regardless because aligned_u64's alignment attribute already triggers padding. The named field is misleading.

Fix: removed _pad_xfb entirely. aligned_u64 does the right thing on its own. Comment in struct explains.

HIGH: dirty-mark on Begin/End

Janet's concern was about variant re-selection. With the memory-sink fix, only one variant exists — variant selection is moot. The sysval address value changing across Begin/End is caught by set_gfx_sysval's memcmp + BITSET mechanism, which marks the FAU upload dirty. No additional dirty-mark needed.

Confirmed by the probe: Begin → buffer addr propagates → store_global writes captured data; End → buffer addr would flip back to OOB on the next draw if there was one. (No new probe needed; this path is exercised by probe_xfb.c.)

MEDIUM: counter-buffer silent drop

CmdBeginTransformFeedbackEXT silently ignored pCounterBuffers != NULL despite advertising transformFeedbackDraw=false. Apps reading the spec carefully will not pass counter buffers, but defensive logging helps debugging.

Fix: loud mesa_logw when counter buffers are passed:

panvk: CmdBeginTransformFeedbackEXT: counter buffers not implemented
       (transformFeedbackDraw=false); XFB resume will restart at buffer offset 0

MEDIUM: buffer_count not reset on cmd buffer reset

Stale xfb.buffer_count from a previous recording could leak into a new one.

Fix: in panvk_per_arch(BeginCommandBuffer) for JM, memset(&cmdbuf->state.gfx.xfb, 0, sizeof(...)). Three-line change. Gated on PAN_ARCH < 9 because the xfb substruct only exists there.

LOW: UINT32_MAX vs (1ULL<<32)-1

Janet noted Phase 2 spec was (1ULL << 32) - 1 but I used UINT32_MAX. Numerically identical. Type expression matters less here than I'd thought — VkDeviceSize is uint64 and both forms widen identically. Left as UINT32_MAX for terseness.

LOW: duplicated transformFeedbackPreservesProvokingVertex

Janet noted a duplicated = false between the features block and the EXT_provoking_vertex properties block. Both correctly false; no behavior impact. Defer to upstream Mesa style — this is how mainline panvk_vX_physical_device.c shapes its physical-device fill-in. Not iter13's place to refactor.

What's open

Items Janet flagged as missing for ANGLE/Chromium GLES3 emulation later:

  1. vkCmdDrawIndirectByteCountEXT — needed if ANGLE hits glDrawTransformFeedback. Deferred — transformFeedbackDraw=false is spec-compliant. If iter14 hits this in Brave testing, add then.
  2. VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_PRIMITIVES_WRITTEN_EXT — needed if ANGLE uses GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN. Deferred for the same reason — transformFeedbackQueries=false.

Both are loud-fail (extension/feature not present), not silent-broken. Acceptable.

Verdict

Review obstacles cleared. All CRITICAL + HIGH addressed; both MEDIUM addressed. iter13 ready for Phase 6 integration test.

— claude-noether, 2026-05-20