panvk-bifrost/mesa-panvk-bifrost/phase4_iter13_close.md
marfrit a4e7d8ab90 initial seed: retrofit campaign lineage from local working trees
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00

# Phase 4 close — iter13 VK_EXT_transform_feedback implementation
**Result:** GREEN. PanVk-Bifrost now implements VK_EXT_transform_feedback end-to-end.
## Probe outcome
```
[info] VK_EXT_transform_feedback present on device
[info] transformFeedback=1 geometryStreams=0
[info] vertex 0: (0.000000, 0.000000, 4660.000000, 51966.000000)
[info] vertex 1: (1.000000, 0.000000, 4660.000000, 51966.000000)
[info] vertex 2: (2.000000, 0.000000, 4660.000000, 51966.000000)
[PASS] PanVk-Bifrost transform feedback: 3 vertices captured correctly.
```
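The captured data is a byte-exact match against the expected `vec4(vertex_id, instance_id=0, 0x1234, 0xcafe)` for each of the 3 vertices (`0x1234` = 4660 and `0xcafe` = 51966, both exactly representable as floats). The output buffer was pre-filled with a `0xDEADBEEF` sentinel, which confirms that the GPU wrote real data rather than leaving a stale init pattern in place.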
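The sentinel check can be sketched on the CPU side as follows. This is a minimal illustration, not the probe's actual source: the helper names `xfb_prefill_sentinel` and `xfb_check_capture` are hypothetical, and the readback buffer is assumed to be host-visible and tightly packed as one `vec4` of floats per vertex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define XFB_SENTINEL 0xDEADBEEFu

/* Hypothetical helper: fill the readback buffer with the sentinel before the
 * draw, so words the GPU never touched remain recognizable afterwards. */
static void xfb_prefill_sentinel(uint32_t *words, size_t word_count)
{
    for (size_t i = 0; i < word_count; i++)
        words[i] = XFB_SENTINEL;
}

/* Hypothetical helper: byte-compare each captured vec4 against
 * vec4(vertex_id, 0, 0x1234, 0xcafe). A buffer still holding the sentinel
 * fails, because 0xDEADBEEF is not the bit pattern of any expected float. */
static bool xfb_check_capture(const void *data, uint32_t vertex_count)
{
    assert(data != NULL);
    const uint8_t *bytes = data;

    for (uint32_t v = 0; v < vertex_count; v++) {
        const float expected[4] = {
            (float)v, 0.0f, (float)0x1234, (float)0xcafe
        };
        if (memcmp(bytes + (size_t)v * sizeof(expected), expected,
                   sizeof(expected)) != 0)
            return false;
    }
    return true;
}
```

Comparing raw bytes with `memcmp` rather than comparing floats numerically matches the "byte-exact" wording: it also rejects values such as `-0.0f` that would compare equal as floats.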
## Source landings on ohm (mesa 26.0.6)
Files modified (1 NEW + 6 edited):
| File | Change |
|---|---|
| `src/panfrost/vulkan/panvk_shader.h` | sysval struct: + `uint32_t num_vertices`, `uint64_t xfb_address[4]` (under `PAN_ARCH < 9`) |
| `src/panfrost/vulkan/panvk_vX_physical_device.c` | extension + feature + properties exposure (`PAN_ARCH < 9` gate) |
| `src/panfrost/vulkan/panvk_vX_shader.c` | (1) `#include "pan_nir.h"` (2) sysval lowering cases for `load_num_vertices` + `load_xfb_address[0..3]` (3) the 3-pass XFB lowering (`nir_opt_constant_folding` → `nir_io_add_intrinsic_xfb_info` → `pan_nir_lower_xfb`) inserted **AFTER `nir_lower_io`** in `panvk_lower_nir` (4) `inputs.no_idvs` true for XFB-bearing vertex shaders |
| `src/panfrost/vulkan/panvk_cmd_draw.h` | + `xfb` substruct in `panvk_cmd_graphics_state` (active flag + buffer_count + 4× buffers) |
| `src/panfrost/vulkan/panvk_vX_cmd_draw.c` | per-draw `set_gfx_sysval` for `vs.num_vertices` + `vs.xfb_address[0..3]` |
| `src/panfrost/vulkan/jm/panvk_vX_cmd_xfb.c` | NEW — `CmdBind/Begin/EndTransformFeedbackEXT` entry points |
| `src/panfrost/vulkan/meson.build` | + `'jm/panvk_vX_cmd_xfb.c'` in jm_files |
## Key learnings (vs Phase 1 source map)
1. **Pass placement matters.** Phase 1's plan put `pan_nir_lower_xfb` inside `panvk_preprocess_nir`. Wrong — at that point the shader still has `store_deref` (var-based) intrinsics. `nir_lower_io` (which converts var-stores → `store_output` intrinsics) runs later inside `panvk_lower_nir`. The pass must run **right after `nir_lower_io`**, mirroring Panfrost-Gallium's flow where `nir_lower_io` precedes the XFB block in `pan_create_shader_state`.
2. **`nir_io_add_intrinsic_xfb_info` is mandatory.** Phase 1 assumed `nir->xfb_info` was the gate. Wrong — Mesa's pass that converts SPV xfb decorations into intrinsic-attached `io_xfb` info needs to run first. Gating on `nir->info.has_transform_feedback_varyings` instead (set by SPV→NIR for XFB-decorated outputs) is the correct trigger.
3. **`no_idvs` is non-negotiable.** Phase 1 noted that Panfrost-Gallium sets `inputs.no_idvs = has_transform_feedback_varyings` but framed it as optional. It isn't: IDVS splits vertex shading into position and varying paths, but under the JM job model the varying path doesn't run for raster-discarded draws. A single non-IDVS vertex job is therefore required for XFB.
4. **The sysval dirty mechanism does work for array fields.** `set_gfx_sysval(..., vs.xfb_address[0], _xa0)` expands correctly via `offsetof(struct, vs.xfb_address[0])` + `sizeof(uint64_t)` macros. Confirmed empirically — the FAU upload triggered as expected and the shader read the correct address.
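The array-field behavior from item 4 can be reproduced in a stand-alone sketch. The struct layout and the `SYSVAL_OFFSET` / `SYSVAL_SIZE` macros below are hypothetical stand-ins, not PanVk's actual `set_gfx_sysval` definition. They only demonstrate that `offsetof` accepts a nested member designator with a constant array subscript, as GCC and Clang do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the graphics sysval struct;
 * only the two fields named in this document are modeled. */
struct demo_sysvals {
    struct {
        uint32_t num_vertices;
        uint64_t xfb_address[4];
    } vs;
};

/* Hypothetical macros: byte offset and size of one sysval member.
 * The member argument may contain '.' and '[n]' designators. */
#define SYSVAL_OFFSET(member) offsetof(struct demo_sysvals, member)
#define SYSVAL_SIZE(member) sizeof(((struct demo_sysvals *)0)->member)

/* Byte offset of xfb_address[i], derived from element 0 plus i elements. */
static size_t xfb_address_offset(unsigned i)
{
    return SYSVAL_OFFSET(vs.xfb_address[0]) +
           (size_t)i * SYSVAL_SIZE(vs.xfb_address[0]);
}
```

Because each array element gets its own offset and size, a dirty-tracking scheme keyed on byte ranges can mark exactly the 8 bytes of the element that changed.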
## What the working shader looks like
After all passes, the vertex shader does:
```
store_global(addr = xfb_address[0] + (instance_id * num_vertices + vertex_id) * stride,
value = (vertex_id_as_float, instance_id_as_float, 4660.0, 51966.0))
```
Where `xfb_address[0]` is a 64-bit FAU sysval populated per-draw from `cmdbuf->state.gfx.xfb.buffers[0].addr + offset`.
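The address arithmetic can be mirrored on the CPU for sanity checks. The function below is an illustrative sketch: `xfb_store_address` and its parameter names are hypothetical, and `stride` is assumed to be the per-vertex byte stride of the XFB buffer (16 bytes for the single `vec4` captured by the probe).

```c
#include <stdint.h>

/* Hypothetical CPU-side mirror of the shader's store address:
 * base + (instance_id * num_vertices + vertex_id) * stride. */
static uint64_t xfb_store_address(uint64_t xfb_base, uint32_t instance_id,
                                  uint32_t num_vertices, uint32_t vertex_id,
                                  uint32_t stride)
{
    /* Assumption of this sketch: widen to 64 bits before multiplying so
     * large draws cannot overflow. The width the lowered shader actually
     * uses for the index is not stated in this document. */
    uint64_t index = (uint64_t)instance_id * num_vertices + vertex_id;
    return xfb_base + index * stride;
}
```

With a base of `0x1000`, 3 vertices per instance, and a 16-byte stride, vertex 2 of instance 0 lands at `0x1020` and vertex 0 of instance 1 at `0x1030`.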
## Phase 4 artifact snapshot
Working state of all 7 source files captured in `iter13/applied_state/` for replication.
## Next: Phase 5
Per CLAUDE.md "Reviews are never skippable" — second-model review of the implementation.