initial seed: retrofit campaign lineage from local working trees

panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00
parent 430d0da278
commit a4e7d8ab90
124 changed files with 22551 additions and 1 deletions
@@ -0,0 +1,71 @@
# Phase 0 — substrate for iter2
Opened **2026-05-19** by mfritsche + claude-noether, immediately after iter1 closed GREEN ([phase8_iteration1_close.md](phase8_iteration1_close.md)).
## Locked research question — iter2
> **Get a minimal Vulkan image-side workload to execute end-to-end on PanVk-Bifrost (ohm / Mali-G52 r1 MC1 / `PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1`): create a 4×4 `VK_FORMAT_R8G8B8A8_UNORM` image with `TRANSFER_DST|TRANSFER_SRC` usage on device-local memory, transition UNDEFINED → TRANSFER_DST_OPTIMAL, `vkCmdClearColorImage` to color 0x11223344 (R=0x11 G=0x22 B=0x33 A=0x44), transition TRANSFER_DST_OPTIMAL → TRANSFER_SRC_OPTIMAL, `vkCmdCopyImageToBuffer` to host-visible staging, fence-wait, verify all 16 pixels read back as 0x44332211 (little-endian uint32). No GPU faults in dmesg, no validation errors.**
>
> If GREEN: lock iter3 against a first-triangle graphics pipeline (vertex + fragment shader, fullscreen triangle via `gl_VertexIndex`, render pass or dynamic rendering, single draw, readback).
>
> If RED: characterize the first failure point and fix or work around in iter2.
## Why this shape
iter1 collapsed three of four phase0 hypotheses on the compute side (device init, cmd-buffer recording, shader compilation). iter2 bridges from compute to graphics by adding **only image-handling machinery**, keeping the same submit/sync skeleton:
- `VkImage` create + bind (new)
- Image layout transitions via `VkImageMemoryBarrier` (new)
- `vkCmdClearColorImage` (new — but it's a transfer op, not a real graphics pipeline)
- `vkCmdCopyImageToBuffer` (new — but again a transfer op)
- Optimal tiling (new — Bifrost arranges tiles differently from Valhall in some cases)
Notably **not** in iter2:
- Render pass / dynamic rendering
- Vertex + fragment shaders
- Graphics pipeline state (rasterizer, viewport, blend, depth)
- Vertex buffers / index buffers
- Framebuffer
So if iter2 fails, the failure points to **image/layout/transfer machinery**, not "graphics pipeline" in general. That's a usefully narrow target.
## Hypothesis space (where iter2 may fail)
1. **Image creation + memory binding.** Bifrost has specific tiling layouts (e.g. block-based tiling). `vkGetImageMemoryRequirements` may report a size and alignment Mesa's PanVk-Bifrost path can't satisfy, or the allocator may pick a memory type that's not actually usable for an optimal-tiled image.
2. **Layout transitions via image barriers.** The Bifrost cache / tiler invalidation hooks may not be wired into the JM submit path consistently. Specifically: UNDEFINED → TRANSFER_DST and TRANSFER_DST → TRANSFER_SRC transitions need to flush L2 / invalidate tile caches, and that's per-arch code that may have rotted in the v6/v7 paths.
3. **`vkCmdClearColorImage` lowering.** PanVk may lower `vkCmdClearColorImage` to either a real hardware clear (tile-level) or a compute shader (meta clear). Bifrost-specific meta paths exist: the lone `bifrost/panvk_vX_meta_desc_copy.c` is a descriptor-copy meta, and a similar clear-meta path may or may not work.
4. **`vkCmdCopyImageToBuffer` + tiling decode.** Bifrost optimal tiling is non-linear, so a copy from an optimal-tiled image to a linear buffer needs the copy path to detile correctly. If detile is wrong, the readback will show shuffled output (recognizable from the pattern of 0x11/0x22/0x33/0x44 bytes).
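Hypothesis 1 turns on which memory type backs the image. As a rough illustration, the usual application-side selection loop can be sketched without `vulkan.h`. The `PROBE_*` constants and `probe_pick_memory_type` are hypothetical names for this sketch. The constants mirror the values of `VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT` and `VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT`, and `type_bits` stands in for `VkMemoryRequirements::memoryTypeBits`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Vulkan memory property bits (assumption: used here
 * only so the sketch builds without vulkan.h). */
#define PROBE_MEM_DEVICE_LOCAL 0x1u
#define PROBE_MEM_HOST_VISIBLE 0x2u

/* Hypothetical helper: return the first memory type index that is allowed
 * by type_bits and carries every flag in `wanted`, or -1 if none fits. */
static int probe_pick_memory_type(uint32_t type_bits,
                                  const uint32_t *type_flags,
                                  uint32_t type_count,
                                  uint32_t wanted)
{
    assert(type_flags != NULL);
    for (uint32_t i = 0; i < type_count && i < 32; i++) {
        if (!(type_bits & (1u << i)))
            continue;               /* image may not live in this type */
        if ((type_flags[i] & wanted) == wanted)
            return (int)i;
    }
    return -1;
}
```

A return of -1 for the device-local request would point at hypothesis 1 before any command is recorded.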
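The clear color 0x11223344 was chosen deliberately: each of the four bytes in a pixel is distinct, so a byte-level shuffle (a channel swizzle or a mis-addressed copy) shows up as wrong byte order rather than as all zeros, which would mean the clear never fired. Because every pixel holds the same value, a permutation of whole pixels would not be visible in this readback.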
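The failure signatures described above can be told apart mechanically. The following is a minimal sketch, not the campaign's probe code: the `probe_*` names and the verdict enum are hypothetical, and it assumes the staging buffer is read as `uint32_t` words on a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_EXPECTED 0x44332211u /* bytes 11 22 33 44 in memory order */

enum probe_verdict {
    PROBE_OK,             /* pixel matches the expected word */
    PROBE_NOT_CLEARED,    /* all zeros: read as "clear did not fire" */
    PROBE_BYTES_SHUFFLED, /* the four expected bytes, in the wrong order */
    PROBE_CORRUPT         /* anything else */
};

static enum probe_verdict probe_classify_pixel(uint32_t px)
{
    if (px == PROBE_EXPECTED)
        return PROBE_OK;
    if (px == 0)
        return PROBE_NOT_CLEARED;

    /* Track which of the four expected bytes appear in this pixel. */
    unsigned seen = 0;
    for (int i = 0; i < 4; i++) {
        switch ((px >> (8 * i)) & 0xffu) {
        case 0x11: seen |= 1u; break;
        case 0x22: seen |= 2u; break;
        case 0x33: seen |= 4u; break;
        case 0x44: seen |= 8u; break;
        default:   return PROBE_CORRUPT;
        }
    }
    return seen == 0xfu ? PROBE_BYTES_SHUFFLED : PROBE_CORRUPT;
}

/* Return the verdict of the first pixel that is not PROBE_OK. */
static enum probe_verdict probe_classify_image(const uint32_t *px, size_t count)
{
    assert(px != NULL);
    for (size_t i = 0; i < count; i++) {
        enum probe_verdict v = probe_classify_pixel(px[i]);
        if (v != PROBE_OK)
            return v;
    }
    return PROBE_OK;
}
```

For example, a readback word of 0x11223344 classifies as `PROBE_BYTES_SHUFFLED`, while 0 classifies as `PROBE_NOT_CLEARED`.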
## Phase 0 deliverables
This document. iter2's substrate is lighter than iter1's because iter1 already proved out the broader environment.
## In-scope (LOCKED 2026-05-19 for iter2)
- Hardware: ohm only.
- Format: R8G8B8A8_UNORM, optimal tiling, 4×4, 1 mip, 1 layer.
- Operations: image create + bind + 2 layout transitions + clear + image-to-buffer copy.
- Verification: all 16 pixels = 0x44332211.
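The expected word 0x44332211 follows from the format: `VkClearColorValue` carries normalized floats for a UNORM format, and R8G8B8A8 stores R at the lowest byte address. A minimal sketch of that derivation and of the 16-pixel check follows. The `probe_*` helpers are hypothetical, and the float-to-UNORM step assumes round-to-nearest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* float -> 8-bit UNORM: clamp to [0,1], scale by 255, round to nearest
 * (assumption: the implementation rounds to nearest). */
static uint8_t probe_unorm8(float f)
{
    if (f < 0.0f) f = 0.0f;
    if (f > 1.0f) f = 1.0f;
    return (uint8_t)(f * 255.0f + 0.5f);
}

/* Pack RGBA in memory order (R first), then view it as a little-endian
 * 32-bit word: R lands in the low byte, A in the high byte. */
static uint32_t probe_pack_rgba8(float r, float g, float b, float a)
{
    return (uint32_t)probe_unorm8(r)
         | (uint32_t)probe_unorm8(g) << 8
         | (uint32_t)probe_unorm8(b) << 16
         | (uint32_t)probe_unorm8(a) << 24;
}

/* Count pixels in the mapped staging buffer that differ from `expected`.
 * Bytes are assembled explicitly, so host endianness does not matter. */
static size_t probe_count_bad_pixels(const uint8_t *mapped,
                                     size_t pixel_count,
                                     uint32_t expected)
{
    assert(mapped != NULL);
    size_t bad = 0;
    for (size_t i = 0; i < pixel_count; i++) {
        const uint8_t *p = mapped + 4 * i;
        uint32_t got = (uint32_t)p[0]
                     | (uint32_t)p[1] << 8
                     | (uint32_t)p[2] << 16
                     | (uint32_t)p[3] << 24;
        if (got != expected)
            bad++;
    }
    return bad;
}
```

With clear values `0x11 / 255.0f`, `0x22 / 255.0f`, `0x33 / 255.0f`, `0x44 / 255.0f`, the packed word is 0x44332211, and the in-scope check passes when `probe_count_bad_pixels` returns 0 for all 16 pixels.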
## Out-of-scope (LOCKED 2026-05-19 for iter2)
- Render pass / dynamic rendering (iter3).
- Vertex / fragment shaders (iter3).
- Graphics pipeline state (iter3).
- WSI / swapchain (iter4+).
- Larger / multi-mip / multi-layer / multi-sample images.
- Other formats (R5G6B5, R32G32B32A32_SFLOAT, depth/stencil, ASTC). Add later if it makes sense to exercise per-format codepaths.
- Sub-region clears, scissored copies.
- Compute path (proven in iter1; not revisited).
- Upstreaming.
## Reference
- [phase0_findings.md](phase0_findings.md) — campaign substrate.
- [phase8_iteration1_close.md](phase8_iteration1_close.md) — iter1 close.
- [iter1/](iter1/) — compute probe (reusable skeleton for iter2).