a4e7d8ab90
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.
This retrofit imports:
- mesa-panvk-bifrost/ — r1..r4 era phase docs (iter1..iter18)
  (libmali stub blobs at iter18/blob/ excluded — 109MB of RE
  artifacts replaced with a README pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/ — frozen .tgz source snapshots at each milestone
  (basis for the 0005 patch diff generation)
Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.
Total: 1.9 MB across 124 files.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
72 lines
4.7 KiB
Markdown
# Iteration 1 close — GREEN

Closed **2026-05-19** by mfritsche + claude-noether.

## Locked question

(From [phase0_findings.md](phase0_findings.md))

> Get a minimal Vulkan compute workload to execute end-to-end on PanVk-Bifrost on ohm (PineTab2, Mali-G52 r1 MC1) with `PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1`: write a known value to a host-visible storage buffer from a single-invocation compute shader, fence-wait, read back, verify. No GPU faults in dmesg, no validation errors with `VK_LAYER_KHRONOS_validation` if installable, no submit timeout.

## Result: GREEN

PanVk-Bifrost on Mali-G52 r1 MC1 (RK3566, kernel 7.0.0-danctnix1-6, Mesa 26.0.6) executed the minimal compute probe **end-to-end on the first try**, 6/6 runs in a row, including 1 run with `VK_LAYER_KHRONOS_validation` active. Every step in the probe trace passed:

- `vkCreateInstance` (Vulkan 1.0 core, no extensions)
- `vkEnumeratePhysicalDevices` → "Mali-G52 r1 MC1"
- `vkCreateDevice` (1 queue from family 0, flags=`GRAPHICS|COMPUTE|TRANSFER`)
- `vkCreateBuffer` + `vkAllocateMemory` (memoryType 1: DEVICE_LOCAL|HOST_VISIBLE|HOST_CACHED) + `vkBindBufferMemory`
- `vkMapMemory` (pre-fill 0xDEADBEEF sentinel)
- Descriptor set layout + pool + allocate + update (1 STORAGE_BUFFER binding)
- `vkCreateShaderModule` (560-byte SPV from `glslangValidator -V`)
- `vkCreatePipelineLayout` + `vkCreateComputePipelines`
- Command buffer record: bind pipeline + bind descriptor sets + `vkCmdDispatch(1,1,1)` + memory barrier (SHADER_WRITE → HOST_READ)
- `vkQueueSubmit` + `vkWaitForFences` (5s timeout, completes immediately)
- `vkInvalidateMappedMemoryRanges` + readback

**`buffer[0] = 0xcafebabe` (expected 0xcafebabe)** — no sentinel left behind, no zero, no garbage. The GPU executed the shader and wrote correctly.

Evidence: [`phase0_evidence/iter1_compute_probe_run.txt`](phase0_evidence/iter1_compute_probe_run.txt).

No GPU faults, no MMU faults, no kernel-side panfrost messages logged across 6 runs.

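A minimal sketch of the readback check described in this section, not the code in `iter1/probe_compute.c`. It sorts the value read from `buffer[0]` into the four outcomes named in the result line: expected value, sentinel left behind, zero, garbage. The enum, the function names, and the interpretation comments are hypothetical; only the two constants come from this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the probe description in this doc. */
#define PROBE_SENTINEL 0xDEADBEEFu  /* host pre-fill before submit */
#define PROBE_EXPECTED 0xCAFEBABEu  /* value the shader should store */

/* Hypothetical result codes; the real probe may report differently. */
typedef enum {
    READBACK_OK,        /* expected value present */
    READBACK_SENTINEL,  /* pre-fill still there: nothing wrote to the buffer */
    READBACK_ZERO,      /* value is zero: neither sentinel nor expected */
    READBACK_GARBAGE    /* some other value was written */
} readback_status;

/* Classify a single 32-bit word read back from the storage buffer. */
readback_status classify_readback(uint32_t value)
{
    if (value == PROBE_EXPECTED) return READBACK_OK;
    if (value == PROBE_SENTINEL) return READBACK_SENTINEL;
    if (value == 0u)             return READBACK_ZERO;
    return READBACK_GARBAGE;
}

/* Convenience wrapper for a mapped pointer. The caller is assumed to have
 * already waited on the fence and invalidated the mapped range, since the
 * memory type used in iter1 is HOST_CACHED. */
readback_status classify_mapped(const uint32_t *mapped)
{
    assert(mapped != NULL);
    return classify_readback(mapped[0]);
}
```

In a probe this would run right after the invalidate step, with anything other than `READBACK_OK` turning the run red.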
## What the close tells us

Three of the four hypotheses in [phase0_findings.md](phase0_findings.md) are **not blockers** for the minimal compute path:

| Hypothesis | Status at iter1 |
|---|---|
| H1: vkCreateDevice / queue / sync init gap | ✗ no — device + queue + fence + barrier all work |
| H2: Command buffer recording / cmd_dispatch | ✗ no — single dispatch records + submits cleanly |
| H3: Shader compilation / NIR lowering | ✗ no — trivial compute shader compiles and runs correctly |
| H4: WSI / swapchain (deferred out-of-scope) | unchanged — iter1 didn't touch it |

This **doesn't** mean PanVk-Bifrost is universally functional — it means the *minimum-viable compute path* works. Failures are still expected when we add complexity (multiple workgroups, larger buffers, complex shaders, descriptor indexing, real graphics, WSI).

The Mesa upstream "not well-tested on v7" gate (panvk_physical_device.c:413–425) reads as **conservative** rather than reflecting hard breakage on this minimal path.

## iter1 in-tree artifacts

- [`iter1/probe_compute.c`](iter1/probe_compute.c) — pure Vulkan 1.0 core probe (~270 LoC)
- [`iter1/probe_compute.comp`](iter1/probe_compute.comp) — 4-line GLSL shader
- [`iter1/Makefile`](iter1/Makefile) — `make` builds, `make run` / `make run-validation` runs

## Deferred to iter2+ (not in iter1 scope)

- **Multi-workgroup / multi-invocation compute.** iter1 was 1 workgroup × 1 invocation.
- **Real graphics workload.** iter1 was compute-only. iter2 lock will pivot to graphics.
- **WSI / swapchain.** iter1 used host-visible readback, no display.
- **Larger buffers.** iter1 was 16 bytes nominal / 64 bytes allocated (memReq alignment).
- **Complex shaders.** iter1's shader was a single store; no math, no math+atomic, no nontrivial control flow.
- **TuxRacer / Zink-on-PanVk.** Still the end-goal, still many iters away.

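The "16 bytes nominal / 64 bytes allocated" figure in the list above is what rounding a requested size up to an alignment boundary looks like. The sketch below shows that arithmetic only. The 64-byte alignment is an assumption inferred from the 16 → 64 numbers, not a value read from the driver, and `align_up` is a hypothetical helper; the actual probe presumably just uses the size reported in the buffer's memory requirements.

```c
#include <assert.h>
#include <stdint.h>

/* Round `size` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two. */
uint64_t align_up(uint64_t size, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Under the assumed alignment, `align_up(16, 64)` gives 64, matching the iter1 numbers. A larger-buffer iter would need to size its readback loop from the nominal size, not the padded one.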
## Next iter — iter2 lock proposal

Smallest viable graphics workload that exercises the **non-compute** pipeline parts on PanVk-Bifrost. Proposed pattern:

> **Allocate a 4×4 `VK_FORMAT_R8G8B8A8_UNORM` image (COLOR_ATTACHMENT | TRANSFER_SRC | TRANSFER_DST, since `vkCmdClearColorImage` requires TRANSFER_DST usage), transition UNDEFINED → TRANSFER_DST, `vkCmdClearColorImage` to a known color (0x11223344), transition TRANSFER_DST → TRANSFER_SRC, `vkCmdCopyImageToBuffer` to a host-visible buffer, fence-wait, verify all 16 pixels match. No rasterizer, no vertex/fragment shaders, no render pass — just exercise image creation + layout transitions + clear + image-to-buffer copy.**

If that passes, iter3 adds: render pass + vertex/fragment pipeline + a single full-screen triangle that paints a constant color (still no vertex data — fullscreen triangle via `gl_VertexIndex`).

Pacing: iter cadence follows the libva-multiplanar 8-phase loop. The iter2 phase 0 substrate lock happens when the operator opens the next iter.

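A sketch of the host-side verify step for the proposed iter2 probe, under stated assumptions; none of this exists in the tree yet. It assumes that 0x11223344 names the channels in R, G, B, A order (so bytes `11 22 33 44` in memory for `R8G8B8A8_UNORM`) and that the image-to-buffer copy is tightly packed with no row padding. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { IMG_W = 4, IMG_H = 4, BYTES_PER_PIXEL = 4 };

/* ASSUMPTION: 0x11223344 is read as R=0x11, G=0x22, B=0x33, A=0x44. */
static const uint8_t EXPECTED_RGBA[BYTES_PER_PIXEL] = { 0x11, 0x22, 0x33, 0x44 };

/* Convert the expected bytes to the normalized floats a UNORM clear color
 * would be specified in (one float per channel, value / 255). */
void clear_color_as_float(float out[4])
{
    for (int c = 0; c < 4; ++c)
        out[c] = (float)EXPECTED_RGBA[c] / 255.0f;
}

/* Count pixels in a tightly packed 4x4 RGBA8 readback that differ from the
 * expected color. Zero means all 16 pixels match. */
int count_mismatched_pixels(const uint8_t *pixels)
{
    int mismatches = 0;
    assert(pixels != NULL);
    for (size_t i = 0; i < (size_t)IMG_W * IMG_H; ++i) {
        if (memcmp(pixels + i * BYTES_PER_PIXEL, EXPECTED_RGBA,
                   BYTES_PER_PIXEL) != 0)
            ++mismatches;
    }
    return mismatches;
}
```

Returning a mismatch count rather than a boolean is a deliberate choice for this sketch: a partial clear (some pixels right, some not) is a different failure from no clear at all, and the count makes that visible in the probe log.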