initial seed: retrofit campaign lineage from local working trees

panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/ — r1..r4 era phase docs (iter1..iter18)
  (libmali stub blobs at iter18/blob/ excluded
  — 109MB of RE artifacts replaced with a README
  pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/ — frozen .tgz source snapshots at each milestone
  (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,71 @@
# Phase 0 — substrate for iter2

Opened **2026-05-19** by mfritsche + claude-noether, immediately after iter1 closed GREEN ([phase8_iteration1_close.md](phase8_iteration1_close.md)).

## Locked research question — iter2

> **Get a minimal Vulkan image-side workload to execute end-to-end on PanVk-Bifrost (ohm / Mali-G52 r1 MC1 / `PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1`): create a 4×4 `VK_FORMAT_R8G8B8A8_UNORM` image with `TRANSFER_DST|TRANSFER_SRC` usage on device-local memory, transition UNDEFINED → TRANSFER_DST_OPTIMAL, `vkCmdClearColorImage` to color 0x11223344 (R=0x11 G=0x22 B=0x33 A=0x44), transition TRANSFER_DST_OPTIMAL → TRANSFER_SRC_OPTIMAL, `vkCmdCopyImageToBuffer` to host-visible staging, fence-wait, verify all 16 pixels read back as 0x44332211 (little-endian uint32). No GPU faults in dmesg, no validation errors.**
>
> If GREEN: lock iter3 against a first-triangle graphics pipeline (vertex + fragment shader, fullscreen triangle via `gl_VertexIndex`, render pass or dynamic rendering, single draw, readback).
>
> If RED: characterize the first failure point and fix or work around in iter2.

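
`vkCmdClearColorImage` takes its color as a `VkClearColorValue`, which for a UNORM format holds four 32-bit floats rather than a packed integer. The sketch below is one way to derive those floats from the target bytes and confirm they convert back to the expected readback word. It assumes the usual clamp, scale by 255, round-to-nearest conversion; the helper names `unorm8_to_float32`, `float_to_unorm8`, and `packed_le_uint32` are hypothetical and do not come from the iter2 probe.

```python
import struct

# Target bytes from the locked research question, in R, G, B, A order.
CLEAR_RGBA = (0x11, 0x22, 0x33, 0x44)


def unorm8_to_float32(byte):
    """Return byte/255 rounded to float32 precision, as a float32 clear-color entry would hold it."""
    return struct.unpack("<f", struct.pack("<f", byte / 255.0))[0]


def float_to_unorm8(value):
    """Convert a float to 8-bit UNORM: clamp to [0, 1], scale by 255, round to nearest."""
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)


def packed_le_uint32(rgba):
    """Read four bytes stored in R, G, B, A memory order as one little-endian uint32."""
    return struct.unpack("<I", bytes(rgba))[0]


clear_floats = [unorm8_to_float32(b) for b in CLEAR_RGBA]
round_trip = tuple(float_to_unorm8(f) for f in clear_floats)
expected_word = packed_le_uint32(round_trip)

print([f"{f:.6f}" for f in clear_floats], hex(expected_word))
```

The round trip recovers the four original bytes, and reading them as a little-endian uint32 gives 0x44332211, the value the readback check compares against.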
## Why this shape

iter1 collapsed three of four phase0 hypotheses on the compute side (device init, cmd-buffer recording, shader compilation). iter2 bridges from compute to graphics by adding **only image-handling machinery**, keeping the same submit/sync skeleton:

- `VkImage` create + bind (new)
- Image layout transitions via `VkImageMemoryBarrier` (new)
- `vkCmdClearColorImage` (new — but it's a transfer op, not a real graphics pipeline)
- `vkCmdCopyImageToBuffer` (new — but again a transfer op)
- Optimal tiling (new — Bifrost arranges tiles differently from Valhall in some cases)

Notably **not** in iter2:

- Render pass / dynamic rendering
- Vertex + fragment shaders
- Graphics pipeline state (rasterizer, viewport, blend, depth)
- Vertex buffers / index buffers
- Framebuffer

So if iter2 fails, the failure points to **image/layout/transfer machinery**, not "graphics pipeline" in general. That's a usefully narrow target.

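
The two layout transitions can be written down as plain barrier parameters before any Vulkan code exists. This is a minimal sketch, assuming the conventional access masks and pipeline stages for transfer-only work. The `Barrier` tuple and `check_chain` helper are hypothetical, the enum names are carried as strings rather than bound to a Vulkan library, and the mask and stage choices are an assumption, not values taken from the iter2 probe.

```python
from typing import NamedTuple


class Barrier(NamedTuple):
    """One VkImageMemoryBarrier, reduced to the fields that matter for iter2."""
    old_layout: str
    new_layout: str
    src_access: str
    dst_access: str
    src_stage: str
    dst_stage: str


# Assumed parameters for the two transitions in the research question.
BARRIERS = [
    Barrier("VK_IMAGE_LAYOUT_UNDEFINED",
            "VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL",
            "0",
            "VK_ACCESS_TRANSFER_WRITE_BIT",
            "VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT",
            "VK_PIPELINE_STAGE_TRANSFER_BIT"),
    Barrier("VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL",
            "VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL",
            "VK_ACCESS_TRANSFER_WRITE_BIT",
            "VK_ACCESS_TRANSFER_READ_BIT",
            "VK_PIPELINE_STAGE_TRANSFER_BIT",
            "VK_PIPELINE_STAGE_TRANSFER_BIT"),
]


def check_chain(barriers, initial_layout="VK_IMAGE_LAYOUT_UNDEFINED"):
    """Return the final layout, raising if a barrier's old layout does not
    match the layout left behind by the previous barrier."""
    current = initial_layout
    for index, barrier in enumerate(barriers):
        if barrier.old_layout != current:
            raise ValueError(
                f"barrier {index}: expected {current}, got {barrier.old_layout}")
        current = barrier.new_layout
    return current


final_layout = check_chain(BARRIERS)
print(final_layout)
```

Keeping the chain as data makes it easy to see which transition precedes the clear and which precedes the copy when a failure has to be attributed to one of them.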
## Hypothesis space (where iter2 may fail)

1. **Image creation + memory binding.** Bifrost has specific tiling layouts (e.g. block-based tiling). `vkGetImageMemoryRequirements` may report a size and alignment Mesa's PanVk-Bifrost path can't satisfy, or the allocator may pick a memory type that's not actually usable for an optimal-tiled image.

2. **Layout transitions via image barriers.** The Bifrost cache / tiler invalidation hooks may not be wired into the JM submit path consistently. Specifically: UNDEFINED → TRANSFER_DST and TRANSFER_DST → TRANSFER_SRC transitions need to flush L2 / invalidate tile caches, and that's per-arch code that may have rotted in the v6/v7 paths.

3. **`vkCmdClearColorImage` lowering.** PanVk may lower `vkCmdClearColorImage` to either a real hardware clear (tile-level) or a compute shader (meta clear). Bifrost-specific paths exist (the lone `bifrost/panvk_vX_meta_desc_copy.c` is descriptor-copy meta — a similar clear-meta path may or may not work).

4. **`vkCmdCopyImageToBuffer` + tiling decode.** Bifrost optimal tiling is non-linear, so a copy from an optimal-tiled image to a linear buffer needs the copy path to detile correctly. If detiling is wrong, the readback will show shuffled output (recognizable from the pattern of 0x11/0x22/0x33/0x44 bytes).

The clear color 0x11223344 was chosen specifically: each byte within a pixel is distinct, so a shuffle at byte granularity (wrong channel order, misaddressed bytes) will show up as wrong byte order rather than all zeros (which would mean the clear didn't fire at all). Because every pixel carries the same value, a bug that only permutes whole pixels would not be visible in this readback.

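
The failure signatures described above can be told apart mechanically from the 64 readback bytes. This is a minimal sketch, assuming a tightly packed 4×4 staging buffer; the function name `classify_readback` and its category labels are hypothetical, not part of the iter2 probe.

```python
import struct

WIDTH, HEIGHT, BYTES_PER_PIXEL = 4, 4, 4
EXPECTED_BYTES = bytes([0x11, 0x22, 0x33, 0x44])  # R, G, B, A in memory order
EXPECTED_WORD = 0x44332211                        # same bytes as little-endian uint32


def classify_readback(data):
    """Classify a tightly packed 4x4 RGBA8 readback buffer."""
    if len(data) != WIDTH * HEIGHT * BYTES_PER_PIXEL:
        return "wrong-size"
    words = struct.unpack("<16I", data)
    if all(word == EXPECTED_WORD for word in words):
        return "green"
    if not any(data):
        # No byte was ever written with the clear color.
        return "all-zero"
    pixels = [data[i:i + BYTES_PER_PIXEL]
              for i in range(0, len(data), BYTES_PER_PIXEL)]
    if all(sorted(p) == sorted(EXPECTED_BYTES) for p in pixels):
        # Right bytes in every pixel, wrong order: swizzle or byte-shuffle.
        return "byte-order"
    return "other"


uniform = EXPECTED_BYTES * (WIDTH * HEIGHT)
pixels = [uniform[i:i + 4] for i in range(0, len(uniform), 4)]
permuted = b"".join(reversed(pixels))  # whole pixels reordered

# Both classify as "green": a uniform clear cannot expose a
# permutation of whole pixels.
print(classify_readback(uniform), classify_readback(permuted))
```

A per-pixel pattern (for example, encoding the pixel index in one channel) would be needed to catch whole-pixel permutations; that is outside what the uniform clear in iter2 can show.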
## Phase 0 deliverables

This document. iter2's substrate is lighter than iter1's because iter1 already proved out the broader environment.

## In-scope (LOCKED 2026-05-19 for iter2)

- Hardware: ohm only.
- Format: R8G8B8A8_UNORM, optimal tiling, 4×4, 1 mip, 1 layer.
- Operations: image create + bind + 2 layout transitions + clear + image-to-buffer copy.
- Verification: all 16 pixels = 0x44332211.

## Out-of-scope (LOCKED 2026-05-19 for iter2)

- Render pass / dynamic rendering (iter3).
- Vertex / fragment shaders (iter3).
- Graphics pipeline state (iter3).
- WSI / swapchain (iter4+).
- Larger / multi-mip / multi-layer / multi-sample images.
- Other formats (R5G6B5, R32G32B32A32_SFLOAT, depth/stencil, ASTC). Add later if it makes sense to exercise per-format codepaths.
- Sub-region clears, scissored copies.
- Compute path (proven in iter1; not revisited).
- Upstreaming.

## Reference

- [phase0_findings.md](phase0_findings.md) — campaign substrate.
- [phase8_iteration1_close.md](phase8_iteration1_close.md) — iter1 close.
- [iter1/](iter1/) — compute probe (reusable skeleton for iter2).