panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.
This retrofit imports:
- mesa-panvk-bifrost/: r1..r4 era phase docs (iter1..iter18). The libmali
  stub blobs at iter18/blob/ are excluded: 109 MB of RE artifacts,
  replaced with a README pointer.
- mesa-panvk-bifrost-video/: sibling campaign phase docs + probe
- evidence/: frozen .tgz source snapshots at each milestone (basis for
  the 0005 patch diff generation)
Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.
Total: 1.9 MB across 124 files.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Iteration 5 close — GREEN
Closed 2026-05-19, same session as iter1+2+3+4.
Locked question
(From phase0_findings_iter5.md)
Render a non-fullscreen triangle into a 64×64 RGBA8 attachment using vertex buffer (interleaved pos+color, 32-byte stride) + UBO (mat4, scale 0.8). Verify center pixel has interpolated mix, corners are clear, coverage in expected range.
Result: GREEN
7/7 runs PASS (after verification-range fix — see "Process note" below). Center pixel (32, 28) consistently 0xff5d564c (R=0x4c G=0x56 B=0x5d, all 3 channels contributing). 338 covered pixels per run, deterministic. No GPU faults, no validation warnings.
Evidence: phase0_evidence/iter5_vbo_ubo_run.txt.
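The center-pixel word quoted above can be split into channels with a few shifts. This is a minimal sketch, not code from the probe: it assumes the readback treats each R8G8B8A8 texel as a little-endian 32-bit word (R in the low byte, A in the high byte), which is consistent with the R/G/B values listed in the result. The function name `unpack_rgba8` is hypothetical.

```python
def unpack_rgba8(word: int) -> tuple[int, int, int, int]:
    """Split a 32-bit readback word into (R, G, B, A) bytes.

    Assumption: R8G8B8A8 memory order read as a little-endian uint32,
    so R sits in the low byte and A in the high byte.
    """
    r = word & 0xFF
    g = (word >> 8) & 0xFF
    b = (word >> 16) & 0xFF
    a = (word >> 24) & 0xFF
    return r, g, b, a


center = unpack_rgba8(0xFF5D564C)
print([hex(c) for c in center])  # ['0x4c', '0x56', '0x5d', '0xff']

# "All 3 channels contributing" means every color byte is non-zero.
print(all(c > 0 for c in center[:3]))  # True
```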
Process note
First run reported "coverage out of range". That was my (claude-noether's) arithmetic error in the verification range, not a driver issue. I initially expected 800..1600 covered pixels; the correct expectation is ~328: the triangle's area of 0.32 sq NDC divided by the viewport's 4 sq NDC is 8% of the attachment, and 8% × 4096 pixels ≈ 328. The driver produced 338, well within edge-rule tolerance. I fixed the range to [200, 500] and reran: 7/7 PASS. The substantive checks (center color, TL/TR clear) were correct from the start.
Memory worth saving: when writing future probes that bound coverage area, the math is (triangle_area_in_ndc / 4.0) * total_pixels, not triangle_area_as_fraction_of_bbox * total_pixels. Double-check.
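The coverage rule above can be sketched as a small helper so that future probes derive their bounds instead of hand-computing them. The vertex coordinates below are hypothetical: they are not taken from probe_vbo_ubo.c and were chosen only so that the scaled triangle has the 0.32 sq NDC area quoted in the process note. The helper names are also illustrative.

```python
def triangle_area(verts):
    """Shoelace formula: area of a 2D triangle from three (x, y) vertices."""
    (x0, y0), (x1, y1), (x2, y2) = verts
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0


def expected_coverage(verts_ndc, width, height):
    """Expected covered-pixel count for a triangle given in NDC.

    The NDC viewport spans [-1, 1] on both axes, so its area is 4.0.
    """
    return triangle_area(verts_ndc) / 4.0 * (width * height)


# Hypothetical geometry (assumption, not the probe's real vertex data).
SCALE = 0.8  # the UBO scale factor from the locked question
base = [(-0.5, -0.5), (0.5, -0.5), (0.0, 0.5)]
scaled = [(x * SCALE, y * SCALE) for x, y in base]

estimate = expected_coverage(scaled, 64, 64)
print(round(estimate))  # 328

# The fixed verification range from the process note brackets the estimate.
print(200 <= estimate <= 500)  # True
```

The estimate ignores rasterization edge rules, so a real run can differ by a few pixels, as the observed 338 does.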
What the close tells us
None of the six hypotheses in phase0_findings_iter5.md materialized.
| Hypothesis | Status |
|---|---|
| H1: Vertex input bindings on Bifrost | ✗ works (2 attrs from interleaved buffer) |
| H2: UBO descriptor binding for vertex stage | ✗ works (mat4 read + applied) |
| H3: Vertex-stage descriptor NIR lowering | ✗ works |
| H4: Varying interpolation | ✗ works (barycentric R/G/B at center matches expected) |
| H5: UBO data fetch from GPU memory | ✗ works (triangle scaled to 0.8 ⇒ coverage matches scaled area) |
| H6: Non-fullscreen rasterization edge cases | ✗ works (edges + corners clean) |
Cumulative state (iter1–5, ~33 runs, zero failures from the driver): PanVk-Bifrost on Mali-G52 r1 v7 handles all of:
- Compute pipeline (dispatch + storage buffer)
- Graphics pipeline (vert + frag + rasterizer + tile binner)
- Dynamic rendering
- All barrier flavors + all layout transitions
- Transfer ops (CopyBufferToImage, CopyImageToBuffer, ClearColorImage)
- COMBINED_IMAGE_SAMPLER descriptors (frag stage)
- UNIFORM_BUFFER descriptors (vertex stage)
- Vertex input bindings + attributes
- texelFetch from sampled image
- Varying interpolation
- UBO data flow into the vertex shader
The "well-tested on v7? NO" gate at panvk_physical_device.c:413 has so far proven to be purely defensive: over five iters, nothing it guards against has surfaced. PanVk-Bifrost on this hardware fundamentally works for everything we've thrown at it.
iter5 in-tree artifacts
- iter5/probe_vbo_ubo.c: vertex+UBO probe
- iter5/probe_vbo_ubo.vert
- iter5/probe_vbo_ubo.frag
- iter5/Makefile
Next iter — iter6 lock proposal
The campaign has blown through 5 minimal probes without finding a single driver bug. Time to either (a) stress-test with a more complex synthetic workload or (b) jump to a real off-the-shelf app and see what breaks.
Going with (a), the synthetic stress test, first because it's more diagnostically useful: if a real app breaks at iter7, we want to know whether the failure involves something we already tested in isolation.
iter6 lock proposal: depth-tested multi-draw scene. 128×128 RGBA8 color attachment + 128×128 D32_SFLOAT depth attachment. Two triangles drawn in sequence: a "back" red triangle at z=0.7, a "front" green triangle at z=0.3, partially overlapping. Verify: (a) in the overlap region, only green is visible (depth test works), (b) in red-only region, red is visible, (c) in clear region, clear color, (d) coverage counts plausible for both individual triangles.
This adds: depth attachment, depth-stencil image format, two separate draws within one render pass, depth state in graphics pipeline, z-coordinate handling in vertex shader.
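A CPU-side reference can pin down what checks (a) through (d) should see before the probe is written. The sketch below is a simplified model under stated assumptions, not the planned probe: the triangle coordinates are hypothetical, each triangle has constant depth, the depth compare is assumed to be LESS against a clear depth of 1.0, and samples are taken at pixel centers without the hardware's exact edge rules.

```python
def edge(a, b, p):
    """Signed area term; its sign tells which side of the line a->b holds p."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside(tri, p):
    """Point-in-triangle test that ignores winding order."""
    e0 = edge(tri[0], tri[1], p)
    e1 = edge(tri[1], tri[2], p)
    e2 = edge(tri[2], tri[0], p)
    return (e0 >= 0 and e1 >= 0 and e2 >= 0) or (e0 <= 0 and e1 <= 0 and e2 <= 0)


def render(draws, size=128, clear_color="clear", clear_depth=1.0):
    """Rasterize constant-depth triangles in order with a LESS depth test."""
    color = [[clear_color] * size for _ in range(size)]
    depth = [[clear_depth] * size for _ in range(size)]
    for tri, z, name in draws:
        for y in range(size):
            for x in range(size):
                # Sample at the pixel center, mapped to NDC [-1, 1].
                p = ((x + 0.5) / size * 2.0 - 1.0, (y + 0.5) / size * 2.0 - 1.0)
                if inside(tri, p) and z < depth[y][x]:
                    depth[y][x] = z
                    color[y][x] = name
    return color


# Hypothetical geometry (assumption): two partially overlapping triangles.
RED = ([(-0.8, -0.6), (0.4, -0.6), (-0.2, 0.6)], 0.7, "red")      # back
GREEN = ([(-0.4, -0.6), (0.8, -0.6), (0.2, 0.6)], 0.3, "green")   # front

image = render([RED, GREEN])
counts = {}
for row in image:
    for px in row:
        counts[px] = counts.get(px, 0) + 1
print(counts)

# With a working depth test, the result does not depend on draw order.
print(render([GREEN, RED]) == image)  # True
```

Comparing the two draw orders is a cheap way to tell a depth-test failure from a geometry mistake: if the depth test were skipped, the overlap region would simply show whichever triangle was drawn last.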
Then iter7 = real-app test (vkcube headless or via display). Then iter8 = Zink-on-PanVk smoke (GL → Vulkan via Mesa Zink, run glmark2 or es2gears). Then iter9+ = TuxRacer.
Pacing per the 8-phase loop.