panvk-bifrost/mesa-panvk-bifrost/phase8_iteration1_close.md
marfrit a4e7d8ab90 initial seed: retrofit campaign lineage from local working trees
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00

# Iteration 1 close — GREEN
Closed **2026-05-19** by mfritsche + claude-noether.
## Locked question
(From [phase0_findings.md](phase0_findings.md))
> Get a minimal Vulkan compute workload to execute end-to-end on PanVk-Bifrost on ohm (PineTab2, Mali-G52 r1 MC1) with `PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1`: write a known value to a host-visible storage buffer from a single-invocation compute shader, fence-wait, read back, verify. No GPU faults in dmesg, no validation errors with `VK_LAYER_KHRONOS_validation` if installable, no submit timeout.
## Result: GREEN
PanVk-Bifrost on Mali-G52 r1 MC1 (RK3566, kernel 7.0.0-danctnix1-6, Mesa 26.0.6) executed the minimal compute probe **end-to-end on the first try**, 6/6 runs in a row, including 1 run with `VK_LAYER_KHRONOS_validation` active. Every step in the probe trace passed:
- `vkCreateInstance` (Vulkan 1.0 core, no extensions)
- `vkEnumeratePhysicalDevices` → "Mali-G52 r1 MC1"
- `vkCreateDevice` (1 queue from family 0, flags=`GRAPHICS|COMPUTE|TRANSFER`)
- `vkCreateBuffer` + `vkAllocateMemory` (memoryType 1: DEVICE_LOCAL|HOST_VISIBLE|HOST_CACHED) + `vkBindBufferMemory`
- `vkMapMemory` (pre-fill 0xDEADBEEF sentinel)
- Descriptor set layout + pool + allocate + update (1 STORAGE_BUFFER binding)
- `vkCreateShaderModule` (560-byte SPV from `glslangValidator -V`)
- `vkCreatePipelineLayout` + `vkCreateComputePipelines`
- Command buffer record: bind pipeline + bind descriptor sets + `vkCmdDispatch(1,1,1)` + memory barrier (SHADER_WRITE → HOST_READ)
- `vkQueueSubmit` + `vkWaitForFences` (5s timeout, completes immediately)
- `vkInvalidateMappedMemoryRanges` + readback
**`buffer[0] = 0xcafebabe` (expected 0xcafebabe)** — no sentinel left behind, no zero, no garbage. The GPU executed the shader and wrote correctly.
Evidence: [`phase0_evidence/iter1_compute_probe_run.txt`](phase0_evidence/iter1_compute_probe_run.txt).
No GPU faults, no MMU faults, no kernel-side panfrost messages logged across 6 runs.
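
The readback check above separates four outcomes: the expected value, the untouched sentinel, a zeroed buffer, and anything else. A minimal sketch of that classification follows. The enum and function names are hypothetical and not taken from `iter1/probe_compute.c`; only the two constants come from the probe trace.

```c
#include <stdint.h>

#define SENTINEL 0xDEADBEEFu  /* host pre-fill value, from the probe trace */
#define EXPECTED 0xCAFEBABEu  /* value the compute shader is expected to store */

/* Hypothetical outcome labels; the real probe may report these differently. */
typedef enum {
    READBACK_OK,        /* shader wrote the expected value */
    READBACK_SENTINEL,  /* pre-fill still in place: the GPU never wrote */
    READBACK_ZERO,      /* buffer reads back as zero */
    READBACK_GARBAGE    /* any other value */
} readback_status;

/* Classify the first 32-bit word read back from the mapped buffer. */
static readback_status classify_readback(uint32_t value)
{
    if (value == EXPECTED) return READBACK_OK;
    if (value == SENTINEL) return READBACK_SENTINEL;
    if (value == 0u)       return READBACK_ZERO;
    return READBACK_GARBAGE;
}
```

Pre-filling with a sentinel that differs from both zero and the expected value is what makes the "GPU never wrote" case distinguishable from the "GPU wrote zero" case.
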
## What the close tells us
Three of the four hypotheses in [phase0_findings.md](phase0_findings.md) are **not blockers** for the minimal compute path:
| Hypothesis | Status at iter1 |
|---|---|
| H1: vkCreateDevice / queue / sync init gap | ✗ no — device + queue + fence + barrier all work |
| H2: Command buffer recording / cmd_dispatch | ✗ no — single dispatch records + submits cleanly |
| H3: Shader compilation / NIR lowering | ✗ no — trivial compute shader compiles and runs correctly |
| H4: WSI / swapchain (deferred out-of-scope) | unchanged — iter1 didn't touch it |
This **doesn't** mean PanVk-Bifrost is universally functional; it means only that the *minimum-viable compute path* works. Failures are still expected when we add complexity (multiple workgroups, larger buffers, complex shaders, descriptor indexing, real graphics, WSI).
The Mesa upstream "not well-tested on v7" gate (panvk_physical_device.c:413425) reads as **conservative** rather than reflecting hard breakage on this minimal path.
## iter1 in-tree artifacts
- [`iter1/probe_compute.c`](iter1/probe_compute.c) — pure Vulkan 1.0 core probe (~270 LoC)
- [`iter1/probe_compute.comp`](iter1/probe_compute.comp) — 4-line GLSL shader
- [`iter1/Makefile`](iter1/Makefile) — `make` builds, `make run` / `make run-validation` runs
## Deferred to iter2+ (not in iter1 scope)
- **Multi-workgroup / multi-invocation compute.** iter1 was 1 workgroup × 1 invocation.
- **Real graphics workload.** iter1 was compute-only. iter2 lock will pivot to graphics.
- **WSI / swapchain.** iter1 used host-visible readback, no display.
- **Larger buffers.** iter1 was 16 bytes nominal / 64 bytes allocated (memReq alignment).
- **Complex shaders.** iter1's shader was a single store: no math, no atomics, no nontrivial control flow.
- **TuxRacer / Zink-on-PanVk.** Still the end-goal, still many iters away.
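
The "16 bytes nominal / 64 bytes allocated" figure in the larger-buffers item is consistent with rounding the requested size up to an alignment boundary. The sketch below shows that arithmetic, assuming a 64-byte alignment; the source says only "memReq alignment", so the exact value is an assumption. `align_up` is a hypothetical helper, not a Vulkan or Mesa function.

```c
#include <stdint.h>

/*
 * Round size up to the next multiple of alignment.
 * alignment must be a nonzero power of two, which the Vulkan spec
 * guarantees for VkMemoryRequirements::alignment.
 */
static uint64_t align_up(uint64_t size, uint64_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Under that assumption, `align_up(16, 64)` gives 64, matching the allocated size reported for iter1. Larger buffers in later iters would cross several alignment multiples rather than fitting inside one.
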
## Next iter — iter2 lock proposal
Smallest viable graphics workload that exercises the **non-compute** pipeline parts on PanVk-Bifrost. Proposed pattern:
> **Allocate a 4×4 `VK_FORMAT_R8G8B8A8_UNORM` image (COLOR_ATTACHMENT | TRANSFER_SRC | TRANSFER_DST; `vkCmdClearColorImage` requires TRANSFER_DST usage), transition UNDEFINED → TRANSFER_DST, `vkCmdClearColorImage` to a known color (0x11223344), transition TRANSFER_DST → TRANSFER_SRC, `vkCmdCopyImageToBuffer` to a host-visible buffer, fence-wait, verify all 16 pixels match. No rasterizer, no vertex/fragment shaders, no render pass; just exercise image creation + layout transitions + clear + image-to-buffer copy.**
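
The host-side half of this proposal (choosing the clear color and verifying the 16 pixels) can be sketched without any Vulkan calls. This is a hedged illustration, not the iter2 probe. It assumes 0x11223344 means bytes R=0x11, G=0x22, B=0x33, A=0x44 in memory order, which the proposal does not specify, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PIXEL_COUNT 16  /* 4x4 image, 4 bytes per pixel */

/* Assumed in-memory channel order for R8G8B8A8_UNORM: R, G, B, A. */
static const uint8_t expected_rgba[4] = { 0x11, 0x22, 0x33, 0x44 };

/*
 * Convert an 8-bit UNORM channel to the float that would go into
 * VkClearColorValue.float32[] for a UNORM image.
 */
static float unorm8_to_float(uint8_t c)
{
    return (float)c / 255.0f;
}

/* Return the index of the first mismatching pixel, or -1 if all match. */
static int first_mismatch(const uint8_t *pixels, size_t pixel_count,
                          const uint8_t expected[4])
{
    for (size_t i = 0; i < pixel_count; i++) {
        if (memcmp(pixels + 4 * i, expected, 4) != 0)
            return (int)i;
    }
    return -1;
}
```

Reporting the first mismatching index, rather than a bare pass/fail, would make it easier to tell a wholly failed clear from a partial or mis-strided copy.
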
If that passes, iter3 adds: render pass + vertex/fragment pipeline + a single full-screen triangle that paints a constant color (still no vertex data — fullscreen triangle via `gl_VertexIndex`).
Pacing: iter cadence follows the libva-multiplanar 8-phase loop. The iter2 phase 0 substrate lock happens when the operator opens the next iter.