initial seed: retrofit campaign lineage from local working trees

panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00
parent 430d0da278
commit a4e7d8ab90
124 changed files with 22551 additions and 1 deletions
@@ -0,0 +1,59 @@
# Phase 4 close — iter13 VK_EXT_transform_feedback implementation
**Result:** GREEN. PanVk-Bifrost now implements VK_EXT_transform_feedback end-to-end.
## Probe outcome
```
[info] VK_EXT_transform_feedback present on device
[info] transformFeedback=1 geometryStreams=0
[info] vertex 0: (0.000000, 0.000000, 4660.000000, 51966.000000)
[info] vertex 1: (1.000000, 0.000000, 4660.000000, 51966.000000)
[info] vertex 2: (2.000000, 0.000000, 4660.000000, 51966.000000)
[PASS] PanVk-Bifrost transform feedback: 3 vertices captured correctly.
```
Byte-exact match against the expected `vec4(vertex_id, instance_id=0, 0x1234, 0xcafe)` for each of the 3 vertices. The output buffer was pre-filled with a `0xDEADBEEF` sentinel, which confirms that the GPU wrote the captured data and that the probe did not read back a stale init pattern.
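The sentinel check can be sketched on the host side as follows. This is a minimal illustration and not the probe's actual source: the helper names (`xfb_prefill`, `xfb_vertex_matches`, `xfb_count_valid`) are hypothetical, and it assumes the readback is a tightly packed array of `vec4` values with 32-bit IEEE floats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pre-fill pattern named in the probe notes. */
#define XFB_SENTINEL 0xDEADBEEFu

/* Hypothetical helper: fill the capture buffer before the draw so that
 * any word the GPU does not overwrite is recognisable afterwards. */
static void xfb_prefill(uint32_t *words, size_t count)
{
    for (size_t i = 0; i < count; i++)
        words[i] = XFB_SENTINEL;
}

/* Hypothetical helper: returns 1 if vertex `vid` holds exactly the bit
 * pattern of vec4(vertex_id, 0, 0x1234, 0xcafe), and 0 otherwise. */
static int xfb_vertex_matches(const uint32_t *words, uint32_t vid)
{
    /* Assumption: float is a 32-bit type on this host. */
    assert(sizeof(float) == sizeof(uint32_t));
    const float expected[4] = {
        (float)vid, 0.0f, (float)0x1234, (float)0xcafe
    };
    return memcmp(&words[(size_t)vid * 4], expected, sizeof(expected)) == 0;
}

/* Counts vertices whose captured data is byte-exact. A vertex that still
 * holds sentinel words does not match and is therefore not counted. */
static uint32_t xfb_count_valid(const uint32_t *words, uint32_t num_vertices)
{
    uint32_t ok = 0;
    for (uint32_t v = 0; v < num_vertices; v++)
        ok += (uint32_t)xfb_vertex_matches(words, v);
    return ok;
}
```

Comparing with `memcmp` rather than `==` on floats keeps the check byte-exact, so a stray `-0.0` or NaN payload would be reported as a mismatch instead of being silently accepted.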
## Source landings on ohm (mesa 26.0.6)
Files modified (1 NEW + 6 edited):
| File | Change |
|---|---|
| `src/panfrost/vulkan/panvk_shader.h` | sysval struct: + `uint32_t num_vertices`, `uint64_t xfb_address[4]` (under `PAN_ARCH < 9`) |
| `src/panfrost/vulkan/panvk_vX_physical_device.c` | extension + feature + properties exposure (`PAN_ARCH < 9` gate) |
| `src/panfrost/vulkan/panvk_vX_shader.c` | (1) `#include "pan_nir.h"` (2) sysval lowering cases for `load_num_vertices` + `load_xfb_address[0..3]` (3) the 3-pass XFB lowering (`nir_opt_constant_folding` → `nir_io_add_intrinsic_xfb_info` → `pan_nir_lower_xfb`) inserted **AFTER `nir_lower_io`** in `panvk_lower_nir` (4) `inputs.no_idvs` true for XFB-bearing vertex shaders |
| `src/panfrost/vulkan/panvk_cmd_draw.h` | + `xfb` substruct in `panvk_cmd_graphics_state` (active flag + buffer_count + 4× buffers) |
| `src/panfrost/vulkan/panvk_vX_cmd_draw.c` | per-draw `set_gfx_sysval` for `vs.num_vertices` + `vs.xfb_address[0..3]` |
| `src/panfrost/vulkan/jm/panvk_vX_cmd_xfb.c` | NEW — `CmdBind/Begin/EndTransformFeedbackEXT` entry points |
| `src/panfrost/vulkan/meson.build` | + `'jm/panvk_vX_cmd_xfb.c'` in jm_files |
## Key learnings (vs Phase 1 source map)
1. **Pass placement matters.** Phase 1's plan put `pan_nir_lower_xfb` inside `panvk_preprocess_nir`. Wrong — at that point the shader still has `store_deref` (var-based) intrinsics. `nir_lower_io` (which converts var-stores → `store_output` intrinsics) runs later inside `panvk_lower_nir`. The pass must run **right after `nir_lower_io`**, mirroring Panfrost-Gallium's flow where `nir_lower_io` precedes the XFB block in `pan_create_shader_state`.
2. **`nir_io_add_intrinsic_xfb_info` is mandatory.** Phase 1 assumed `nir->xfb_info` was the gate. Wrong — Mesa's pass that converts SPV xfb decorations into intrinsic-attached `io_xfb` info needs to run first. Gating on `nir->info.has_transform_feedback_varyings` instead (set by SPV→NIR for XFB-decorated outputs) is the correct trigger.
3. **`no_idvs` is non-negotiable.** Phase 1 noted that Panfrost-Gallium sets `inputs.no_idvs = has_transform_feedback_varyings` but framed it as optional. It isn't: IDVS splits vertex shading into position + varying paths, but under the JM job model the varying path doesn't run for raster-discarded draws. A single non-IDVS vertex job is therefore required for XFB.
4. **The sysval dirty mechanism does work for array fields.** `set_gfx_sysval(..., vs.xfb_address[0], _xa0)` expands correctly via `offsetof(struct, vs.xfb_address[0])` + `sizeof(uint64_t)` macros. Confirmed empirically — the FAU upload triggered as expected and the shader read the correct address.
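The array-field behaviour from item 4 can be illustrated with a small standalone sketch. Everything here is a hypothetical stand-in: `struct demo_sysvals`, `demo_set_sysval`, and the 8-byte dirty granularity are assumptions made for illustration, not PanVk's actual `set_gfx_sysval` definition or FAU layout. The point is only that `offsetof` accepts a nested, indexed member designator such as `vs.xfb_address[2]` on GCC and Clang, so a macro built on `offsetof` + `sizeof` can dirty exactly the slot that changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sysval block; mirrors only the two fields named above. */
struct demo_sysvals {
    struct {
        uint32_t num_vertices;
        uint64_t xfb_address[4];
    } vs;
};

/* Assumption: dirty state is tracked per 8-byte word. */
#define DEMO_WORD_SIZE 8u

struct demo_state {
    struct demo_sysvals sysvals;
    uint32_t dirty_words; /* bit i set => 8-byte word i needs re-upload */
};

/* Mark every 8-byte word overlapped by [offset, offset + size) as dirty. */
static void demo_mark_dirty(struct demo_state *st, size_t offset, size_t size)
{
    assert(size > 0);
    size_t first = offset / DEMO_WORD_SIZE;
    size_t last = (offset + size - 1) / DEMO_WORD_SIZE;
    assert(last < 32);
    for (size_t w = first; w <= last; w++)
        st->dirty_words |= 1u << w;
}

/* Store `value` into `field` and dirty its words only when it changed.
 * `field` may be an array element such as vs.xfb_address[2]. */
#define demo_set_sysval(st, field, value)                                \
    do {                                                                 \
        if ((st)->sysvals.field != (value)) {                            \
            (st)->sysvals.field = (value);                               \
            demo_mark_dirty((st), offsetof(struct demo_sysvals, field),  \
                            sizeof((st)->sysvals.field));                \
        }                                                                \
    } while (0)
```

A call such as `demo_set_sysval(&st, vs.xfb_address[2], addr)` dirties only the word backing element 2, and repeating the call with the same value leaves the dirty mask untouched.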
## What the working shader looks like
After all passes, the vertex shader does:
```
store_global(addr = xfb_address[0] + (instance_id * num_vertices + vertex_id) * stride,
value = (vertex_id_as_float, instance_id_as_float, 4660.0, 51966.0))
```
Where `xfb_address[0]` is a 64-bit FAU sysval populated per-draw from `cmdbuf->state.gfx.xfb.buffers[0].addr + offset`.
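The address arithmetic in the pseudocode above can be checked on the CPU with a short sketch. The function name `xfb_store_address` and its parameter list are hypothetical; only the formula itself comes from the lowered shader. Here `xfb_base` is assumed to already include the bound buffer offset, as the sysval does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte address the lowered vertex shader writes to
 * for a given (instance_id, vertex_id) pair.
 *
 *   addr = xfb_base + (instance_id * num_vertices + vertex_id) * stride
 */
static uint64_t xfb_store_address(uint64_t xfb_base, uint32_t stride,
                                  uint32_t num_vertices,
                                  uint32_t instance_id, uint32_t vertex_id)
{
    assert(vertex_id < num_vertices);
    /* Widen before multiplying so large instanced draws do not wrap
     * in 32-bit arithmetic. */
    uint64_t index = (uint64_t)instance_id * num_vertices + vertex_id;
    return xfb_base + index * stride;
}
```

For the probe's 3-vertex, single-instance draw with a 16-byte `vec4` stride, the three vertices land at offsets 0, 16, and 32 from the base, which matches the tightly packed layout the readback expects.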
## Phase 4 artifact snapshot
Working state of all 7 source files captured in `iter13/applied_state/` for replication.
## Next: Phase 5
Per CLAUDE.md "Reviews are never skippable" — second-model review of the implementation.