marfrit a4e7d8ab90 initial seed: retrofit campaign lineage from local working trees
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00


Iteration 5 close — GREEN

Closed 2026-05-19, same session as iter1+2+3+4.

Locked question

(From phase0_findings_iter5.md)

Render a non-fullscreen triangle into a 64×64 RGBA8 attachment using vertex buffer (interleaved pos+color, 32-byte stride) + UBO (mat4, scale 0.8). Verify center pixel has interpolated mix, corners are clear, coverage in expected range.
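As a rough illustration of the buffer layout the locked question describes, the sketch below packs three interleaved vertices at a 32-byte stride and a 64-byte mat4. The vec4-position plus vec4-color split, the vertex coordinates and colors, and the choice to scale only x and y are all assumptions made for this example; the source gives only the stride and the 0.8 scale factor.

```python
import struct

STRIDE = 32  # bytes per vertex, as stated in the locked question

# Hypothetical triangle: vec4 position followed by vec4 color per vertex.
# The coordinates and colors are illustrative, not the probe's actual data.
VERTICES = [
    # x,    y,    z,   w,   r,   g,   b,   a
    (0.0, -0.5, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0),
    (0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0),
    (-0.5, 0.5, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0),
]


def pack_vertices(vertices):
    """Pack interleaved pos+color vertices as little-endian float32."""
    return b"".join(struct.pack("<8f", *v) for v in vertices)


def pack_scale_mat4(s):
    """Pack a column-major mat4 that scales x and y by s (assumed layout)."""
    columns = [
        (s, 0.0, 0.0, 0.0),
        (0.0, s, 0.0, 0.0),
        (0.0, 0.0, 1.0, 0.0),
        (0.0, 0.0, 0.0, 1.0),
    ]
    return struct.pack("<16f", *[value for col in columns for value in col])


vertex_bytes = pack_vertices(VERTICES)
ubo_bytes = pack_scale_mat4(0.8)
```

With this layout the color attribute of vertex n starts at byte offset n * STRIDE + 16, which is the kind of offset a vertex attribute description would carry.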

Result: GREEN

7/7 runs PASS (after verification-range fix — see "Process note" below). Center pixel (32, 28) consistently 0xff5d564c (R=0x4c G=0x56 B=0x5d, all 3 channels contributing). 338 covered pixels per run, deterministic. No GPU faults, no validation warnings.
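The channel breakdown above can be reproduced with a small decoder. This is a sketch that assumes the reported value is a little-endian 32-bit read of R8G8B8A8 bytes (R in the low byte), which is the packing the R/G/B values quoted above imply; the function names are made up for this example.

```python
def unpack_rgba8(packed):
    """Split a 32-bit value into (r, g, b, a), assuming R is the low byte."""
    r = packed & 0xFF
    g = (packed >> 8) & 0xFF
    b = (packed >> 16) & 0xFF
    a = (packed >> 24) & 0xFF
    return r, g, b, a


def all_channels_contribute(packed):
    """True when R, G and B are all non-zero, i.e. an interpolated mix."""
    r, g, b, _ = unpack_rgba8(packed)
    return r > 0 and g > 0 and b > 0


center = unpack_rgba8(0xFF5D564C)
```

For the center pixel this yields R=0x4c, G=0x56, B=0x5d with alpha 0xff, so all three color channels contribute.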

Evidence: phase0_evidence/iter5_vbo_ubo_run.txt.

Process note

First run reported "coverage out of range". That was my (claude-noether's) arithmetic error on the verification range, not a driver issue. I initially expected 800..1600 covered pixels; the correct expected value is ~328: triangle area 0.32 sq NDC ÷ viewport area 4 sq NDC = 8% of the viewport, and 8% × 4096 pixels ≈ 328. The driver produced 338, well within edge-rule tolerance. I fixed the range to [200, 500] and reran: 7/7 PASS. The substantive checks (center color, TL/TR clear) were correct from the start.

Memory worth saving: when writing future probes that bound coverage area, the math is (triangle_area_in_ndc / 4.0) * total_pixels, not triangle_area_as_fraction_of_bbox * total_pixels. Double-check.
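A minimal sketch of that coverage math, so future probes can derive the range instead of guessing it. The base triangle vertices are hypothetical, chosen only so that the scaled area comes out at the 0.32 sq NDC quoted above; the helper names are also made up for this example.

```python
def triangle_area(a, b, c):
    """Area of a 2D triangle via the shoelace formula."""
    return 0.5 * abs(
        a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1])
    )


def expected_coverage(tri_ndc, width, height):
    """Expected covered pixels: (triangle_area_in_ndc / 4.0) * total_pixels."""
    ndc_viewport_area = 4.0  # NDC spans [-1, 1] on both axes
    return triangle_area(*tri_ndc) / ndc_viewport_area * (width * height)


SCALE = 0.8  # UBO scale factor from the locked question
# Hypothetical unscaled triangle with area 0.5 sq NDC.
BASE_TRIANGLE = [(0.0, -0.5), (0.5, 0.5), (-0.5, 0.5)]

scaled = [(x * SCALE, y * SCALE) for x, y in BASE_TRIANGLE]
expected = expected_coverage(scaled, 64, 64)  # roughly 327.68 pixels

LOW, HIGH = 200, 500  # the corrected verification range
```

The measured 338 pixels falls inside [LOW, HIGH], while the original 800..1600 range would have rejected it.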

What the close tells us

None of the six hypotheses in phase0_findings_iter5.md materialized.

| Hypothesis | Status |
|---|---|
| H1: Vertex input bindings on Bifrost | ✗ works (2 attrs from interleaved buffer) |
| H2: UBO descriptor binding for vertex stage | ✗ works (mat4 read + applied) |
| H3: Vertex-stage descriptor NIR lowering | ✗ works |
| H4: Varying interpolation | ✗ works (barycentric R/G/B at center matches expected) |
| H5: UBO data fetch from GPU memory | ✗ works (triangle scaled to 0.8 ⇒ coverage matches scaled area) |
| H6: Non-fullscreen rasterization edge cases | ✗ works (edges + corners clean) |

Cumulative state (iter1..5, ~33 runs, zero failures from the driver): PanVk-Bifrost on Mali-G52 r1 v7 handles all of:

  • Compute pipeline (dispatch + storage buffer)
  • Graphics pipeline (vert + frag + rasterizer + tile binner)
  • Dynamic rendering
  • All barrier flavors + all layout transitions
  • Transfer ops (CopyBufferToImage, CopyImageToBuffer, ClearColorImage)
  • COMBINED_IMAGE_SAMPLER descriptors (frag stage)
  • UNIFORM_BUFFER descriptors (vertex stage)
  • Vertex input bindings + attributes
  • texelFetch from sampled image
  • Varying interpolation
  • UBO data flow into the vertex shader

The "well-tested on v7? NO" gate at panvk_physical_device.c:413 has so far proven purely defensive: over five iters, PanVk-Bifrost on this hardware has worked for everything we've thrown at it.

iter5 in-tree artifacts

Next iter — iter6 lock proposal

The campaign has blown through 5 minimal probes without finding a single driver bug. Time to either (a) stress-test with a more complex synthetic workload or (b) jump to a real off-the-shelf app and see what breaks.

Going with (a) stress synthetic first because it's more diagnostically useful — if a real app breaks at iter7, we want to know whether it's something we already tested in isolation.

iter6 lock proposal: depth-tested multi-draw scene. 128×128 RGBA8 color attachment + 128×128 D32_SFLOAT depth attachment. Two triangles drawn in sequence: a "back" red triangle at z=0.7, a "front" green triangle at z=0.3, partially overlapping. Verify: (a) in the overlap region, only green is visible (depth test works), (b) in red-only region, red is visible, (c) in clear region, clear color, (d) coverage counts plausible for both individual triangles.
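The four checks could be validated against a CPU reference model along these lines. This is only a sketch: the triangle coordinates, the clear color, the LESS depth compare against a cleared depth of 1.0, and the pixel-center sampling rule are all assumptions for illustration, not the probe's locked parameters.

```python
RED = (255, 0, 0, 255)
GREEN = (0, 255, 0, 255)
CLEAR = (0, 0, 0, 255)  # hypothetical clear color


def edge(a, b, p):
    """Signed area term: which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside(tri, p):
    """Point-in-triangle test that accepts either winding order."""
    w = [edge(tri[i], tri[(i + 1) % 3], p) for i in range(3)]
    return all(v >= 0 for v in w) or all(v <= 0 for v in w)


def render(size, draws, clear_depth=1.0):
    """Rasterize flat-depth triangles in order with a LESS depth test."""
    color = [[CLEAR] * size for _ in range(size)]
    depth = [[clear_depth] * size for _ in range(size)]
    for tri, z, rgba in draws:
        for y in range(size):
            for x in range(size):
                # Sample at the pixel center, mapped into NDC [-1, 1].
                p = ((x + 0.5) / size * 2.0 - 1.0, (y + 0.5) / size * 2.0 - 1.0)
                if inside(tri, p) and z < depth[y][x]:
                    depth[y][x] = z
                    color[y][x] = rgba
    return color


# Hypothetical geometry: two partially overlapping triangles in NDC.
BACK = ([(-0.8, -0.6), (0.4, -0.6), (-0.2, 0.6)], 0.7, RED)
FRONT = ([(-0.4, -0.6), (0.8, -0.6), (0.2, 0.6)], 0.3, GREEN)

image = render(128, [BACK, FRONT])
red_count = sum(row.count(RED) for row in image)
green_count = sum(row.count(GREEN) for row in image)
```

With this geometry, `image[38][64]` lies in the overlap and should be green, `image[38][32]` lies in the red-only region, and `image[0][0]` keeps the clear color. The counts give the plausibility bounds for check (d), with the red count reduced by the occluded overlap.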

This adds: depth attachment, depth-stencil image format, two separate draws within one render pass, depth state in graphics pipeline, z-coordinate handling in vertex shader.

Then iter7 = real-app test (vkcube headless or via display). Then iter8 = Zink-on-PanVk smoke (GL → Vulkan via Mesa Zink, run glmark2 or es2gears). Then iter9+ = TuxRacer.

Pacing per the 8-phase loop.