a4e7d8ab90
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.
This retrofit imports:
- mesa-panvk-bifrost/ — r1..r4 era phase docs (iter1..iter18)
(libmali stub blobs at iter18/blob/ excluded
— 109MB of RE artifacts replaced with a README
pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/ — frozen .tgz source snapshots at each milestone
(basis for the 0005 patch diff generation)
Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.
Total: 1.9 MB across 124 files.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
85 lines
5.2 KiB
Markdown
85 lines
5.2 KiB
Markdown
# Phase 5 close — iter13 second-model review
|
|
|
|
Reviewer: `janet` (ARM/DDR bare-metal specialist agent — closest available to driver/NIR review).
|
|
Verdict: **NEEDS FIX BEFORE MERGE** (one CRITICAL, two HIGH, two MEDIUM, two LOW).
|
|
Outcome: all CRITICAL + HIGH addressed in this phase, MEDIUM + LOW addressed where cheap.
|
|
|
|
## CRITICAL: single-variant ships, dual-variant was Phase 2 lock
|
|
|
|
Janet's catch: a pipeline with XFB-decorated outputs used in a NON-XFB draw (or in an XFB draw with fewer buffers bound than declared) would write `nir_store_global` to address 0 → GPU page fault → DEVICE_LOST.
|
|
|
|
The Phase 2 lock specified dual-variant (B). Phase 4 shipped single-variant (closer to option C). Janet recommended the dual-variant refactor.
|
|
|
|
**Resolution: option Z (better than B or C) — Panfrost-Gallium memory-sink idiom.**
|
|
|
|
While re-reading `gallium/drivers/panfrost/pan_cmdstream.c:1339-1366`, I found the Gallium PAN_SYSVAL_XFB handler does exactly this: when no XFB target is bound, it sets the address sysval to `0x8000_0000_0000_0000` (= `PAN_SHADER_OOB_ADDRESS`). The Bifrost MMU silently discards stores to this address. No fault. No dual variants. Single-variant solution at no runtime cost.
|
|
|
|
Applied in `panvk_vX_cmd_draw.c`:
|
|
|
|
```c
|
|
uint64_t _xa0 = PAN_SHADER_OOB_ADDRESS, _xa1 = PAN_SHADER_OOB_ADDRESS,
|
|
_xa2 = PAN_SHADER_OOB_ADDRESS, _xa3 = PAN_SHADER_OOB_ADDRESS;
|
|
if (_gfx->xfb.active) {
|
|
if (_gfx->xfb.buffer_count > 0 && _gfx->xfb.buffers[0].addr)
|
|
_xa0 = _gfx->xfb.buffers[0].addr + _gfx->xfb.buffers[0].offset;
|
|
/* ... 1..3 ... */
|
|
}
|
|
```
|
|
|
|
Plus `#include "pan_compiler.h"` for the constant.
|
|
|
|
Saved as project memory `project_pan_shader_oob_address.md` — the canonical conditional-write idiom for Panfrost. Will be useful for any future feature with driver-state-conditional shader writes.
|
|
|
|
A new regression probe `probe_xfb_nodraw.c` covers Janet's exact scenario: same XFB-capable pipeline as `probe_xfb.c` but no Bind/Begin/End and no buffer bound, just a raw vkCmdDraw. Expected: `[PASS] XFB-capable pipeline survives non-XFB draw — memory-sink active.` (DEVICE_LOST = FAIL).
|
|
|
|
## HIGH: `_pad_xfb` ghost padding
|
|
|
|
Janet's catch: the explicit `uint32_t _pad_xfb` after `num_vertices` was supposed to keep 8-byte alignment for `aligned_u64 xfb_address[4]`, but the compiler inserts another 4 anonymous bytes regardless because `aligned_u64`'s alignment attribute already triggers padding. The named field is misleading.
|
|
|
|
**Fix:** removed `_pad_xfb` entirely. `aligned_u64` does the right thing on its own. Comment in struct explains.
|
|
|
|
## HIGH: dirty-mark on Begin/End
|
|
|
|
Janet's concern was about variant re-selection. With the memory-sink fix, only one variant exists — variant selection is moot. The sysval address value changing across Begin/End is caught by `set_gfx_sysval`'s memcmp + BITSET mechanism, which marks the FAU upload dirty. No additional dirty-mark needed.
|
|
|
|
Confirmed by the probe: Begin → buffer addr propagates → store_global writes captured data; End → buffer addr would flip back to OOB on the next draw if there was one. (No new probe needed; this path is exercised by `probe_xfb.c`.)
|
|
|
|
## MEDIUM: counter-buffer silent drop
|
|
|
|
`CmdBeginTransformFeedbackEXT` silently ignored `pCounterBuffers != NULL` despite advertising `transformFeedbackDraw=false`. Apps reading the spec carefully will not pass counter buffers, but defensive logging helps debugging.
|
|
|
|
**Fix:** loud `mesa_logw` when counter buffers are passed:
|
|
|
|
```
|
|
panvk: CmdBeginTransformFeedbackEXT: counter buffers not implemented
|
|
(transformFeedbackDraw=false); XFB resume will restart at buffer offset 0
|
|
```
|
|
|
|
## MEDIUM: buffer_count not reset on cmd buffer reset
|
|
|
|
Stale `xfb.buffer_count` from a previous recording could leak into a new one.
|
|
|
|
**Fix:** in `panvk_per_arch(BeginCommandBuffer)` for JM, `memset(&cmdbuf->state.gfx.xfb, 0, sizeof(...))`. Three-line change. Gated on `PAN_ARCH < 9` because the `xfb` substruct only exists there.
|
|
|
|
## LOW: `UINT32_MAX` vs `(1ULL<<32)-1`
|
|
|
|
Janet noted Phase 2 spec was `(1ULL << 32) - 1` but I used `UINT32_MAX`. Numerically identical. Type expression matters less here than I'd thought — `VkDeviceSize` is uint64 and both forms widen identically. Left as `UINT32_MAX` for terseness.
|
|
|
|
## LOW: duplicated `transformFeedbackPreservesProvokingVertex`
|
|
|
|
Janet noted a duplicated `= false` between the features block and the EXT_provoking_vertex properties block. Both correctly false; no behavior impact. Defer to upstream Mesa style — this is how mainline panvk_vX_physical_device.c shapes its physical-device fill-in. Not iter13's place to refactor.
|
|
|
|
## What's open
|
|
|
|
Items Janet flagged as missing for ANGLE/Chromium GLES3 emulation later:
|
|
1. `vkCmdDrawIndirectByteCountEXT` — needed if ANGLE hits `glDrawTransformFeedback`. Deferred — `transformFeedbackDraw=false` is spec-compliant. If iter14 hits this in Brave testing, add then.
|
|
2. `VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_PRIMITIVES_WRITTEN_EXT` — needed if ANGLE uses `GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN`. Deferred for the same reason — `transformFeedbackQueries=false`.
|
|
|
|
Both are loud-fail (extension/feature not present), not silent-broken. Acceptable.
|
|
|
|
## Verdict
|
|
|
|
Review obstacles cleared. All CRITICAL + HIGH addressed; both MEDIUM addressed. iter13 ready for Phase 6 integration test.
|
|
|
|
— claude-noether, 2026-05-20
|