panvk-bifrost/mesa-panvk-bifrost/phase0_findings_iter3.md
marfrit a4e7d8ab90 initial seed: retrofit campaign lineage from local working trees
panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan
video decode) shipped before this repo existed; the deliverable
patches live in marfrit-packages, but the reasoning chain, phase docs,
and source-state evidence lived only in local working trees on the
development host.

This retrofit imports:
- mesa-panvk-bifrost/   — r1..r4 era phase docs (iter1..iter18)
                          (libmali stub blobs at iter18/blob/ excluded
                          — 109MB of RE artifacts replaced with a README
                          pointer)
- mesa-panvk-bifrost-video/ — sibling campaign phase docs + probe
- evidence/             — frozen .tgz source snapshots at each milestone
                          (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is
a commit rather than a snapshot. See [[feedback-session-local-process-pins]]
for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00


Phase 0 — substrate for iter3

Opened 2026-05-19, after iter2 closed GREEN.

Locked research question — iter3

Render a single fullscreen triangle into a 64×64 VK_FORMAT_R8G8B8A8_UNORM color attachment via VK_KHR_dynamic_rendering on PanVk-Bifrost (ohm / Mali-G52 r1 MC1 / PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1), using:

  • a trivial vertex shader that emits 3 positions from gl_VertexIndex covering NDC (-1,-1)/(3,-1)/(-1,3) — no vertex buffer
  • a trivial fragment shader that writes a gl_FragCoord-encoded color: R = floor(gl_FragCoord.x) / 255.0 and G = floor(gl_FragCoord.y) / 255.0, so that the stored UNORM bytes equal the pixel column and row; B = 0x80 sentinel, A = 0xff
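
A minimal CPU-side sketch of what the two shaders are expected to compute, in plain C rather than GLSL so it can be checked on the host. All names are hypothetical. It assumes the fragment shader divides the floored coordinate by 255.0 and that the UNORM8 store rounds to nearest; the bit trick for the vertex positions is one common way to get the three corners, not necessarily the one the probe uses.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { float x, y; } ndc_pos;
typedef struct { uint8_t r, g, b, a; } rgba8;

/* Vertex stage model: position derived from the vertex index alone.
 * index 0 -> (-1,-1), 1 -> (3,-1), 2 -> (-1,3); no vertex buffer involved. */
static ndc_pos fullscreen_vertex(uint32_t vertex_index)
{
    ndc_pos p;
    assert(vertex_index < 3u);
    p.x = (float)((vertex_index << 1) & 2u) * 2.0f - 1.0f;
    p.y = (float)(vertex_index & 2u) * 2.0f - 1.0f;
    return p;
}

/* Assumed UNORM8 store: clamp to [0,1], scale by 255, round to nearest. */
static uint8_t float_to_unorm8(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f + 0.5f);
}

/* Fragment stage model. frag_x / frag_y play the role of gl_FragCoord.xy
 * (pixel + 0.5). They are non-negative, so truncation equals floor(). */
static rgba8 fragment_color(float frag_x, float frag_y)
{
    rgba8 c;
    c.r = float_to_unorm8((float)(uint32_t)frag_x / 255.0f);
    c.g = float_to_unorm8((float)(uint32_t)frag_y / 255.0f);
    c.b = float_to_unorm8(128.0f / 255.0f); /* 0x80 sentinel */
    c.a = float_to_unorm8(1.0f);            /* 0xff */
    return c;
}
```

With these assumptions, the fragment at pixel (63, 0) has gl_FragCoord.xy = (63.5, 0.5) and stores the bytes 0x3f, 0x00, 0x80, 0xff.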

Copy the attachment to a host-visible buffer and verify that every pixel at (col, row) reads back as 0xff80(row)(col), i.e. 0xff800000 | (row << 8) | col as a little-endian uint32 (e.g. pixel[0,0] = 0xff800000, pixel[63,63] = 0xff803f3f). No GPU faults, no validation errors.
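
The readback check can be sketched as follows. This is a hedged host-side sketch, not the probe's code: function names are hypothetical, and it assumes the copy lands in the buffer tightly packed (64 pixels per row, no padding) and that the host is little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_WIDTH  64u
#define PROBE_HEIGHT 64u

/* Expected little-endian uint32 for the pixel at (col, row):
 * byte 0 = R = col, byte 1 = G = row, byte 2 = B = 0x80, byte 3 = A = 0xff. */
static uint32_t expected_pixel(uint32_t col, uint32_t row)
{
    assert(col < PROBE_WIDTH && row < PROBE_HEIGHT);
    return 0xff800000u | (row << 8) | col;
}

/* Fill a buffer with the expected image; useful for self-testing the checker. */
static void fill_expected(uint32_t *pixels)
{
    for (uint32_t row = 0; row < PROBE_HEIGHT; row++)
        for (uint32_t col = 0; col < PROBE_WIDTH; col++)
            pixels[row * PROBE_WIDTH + col] = expected_pixel(col, row);
}

/* Returns the number of mismatching pixels. If first_bad is non-NULL and
 * at least one pixel mismatches, it receives the index of the first one. */
static size_t verify_readback(const uint32_t *pixels, size_t *first_bad)
{
    size_t mismatches = 0;
    for (uint32_t row = 0; row < PROBE_HEIGHT; row++) {
        for (uint32_t col = 0; col < PROBE_WIDTH; col++) {
            size_t i = (size_t)row * PROBE_WIDTH + col;
            if (pixels[i] == expected_pixel(col, row))
                continue;
            if (mismatches == 0 && first_bad != NULL)
                *first_bad = i;
            mismatches++;
        }
    }
    return mismatches;
}
```

A GREEN result corresponds to verify_readback returning 0; on RED, the first bad index gives the (col, row) at which to start characterizing the failure.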

If GREEN → iter4 adds vertex buffer + UBO + texture sample. If RED → characterize the first failure point in the graphics path.

Why this shape

iter1 (compute) + iter2 (image clear + copy) collapsed most non-graphics hypotheses. iter3 introduces only the graphics pipeline machinery:

  • Image with COLOR_ATTACHMENT_BIT usage (new — iter2 used TRANSFER_DST only)
  • VkImageView (new — first time)
  • VK_KHR_dynamic_rendering extension enabled, with the dynamicRendering feature set to VK_TRUE
  • vkCmdBeginRenderingKHR / vkCmdEndRenderingKHR
  • Graphics pipeline with vertex + fragment shaders
  • Rasterizer + viewport + scissor + blend state (static, no dynamic state)
  • vkCmdDraw(3, 1, 0, 0) — triangle list, no vertex buffer

Not in iter3: render pass (VkRenderPass/VkFramebuffer legacy API), dynamic pipeline state, multiple subpasses, multiple attachments, depth/stencil, MSAA, vertex buffers, descriptors (no UBO/SSBO/sampler), push constants.

Why 64×64 (not 4×4)

Bifrost is a tile-based rasterizer, and the Mali tile size for RGBA8 is 16×16 pixels. A 4×4 image fits inside a single tile, so the tile binning path is barely exercised. A 64×64 image spans 16 tiles (a 4×4 grid of 16×16 tiles), so the binner does meaningful work and any per-tile bug that a single-tile workload would hide has a chance to surface.
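
The tile arithmetic, as a small sketch. It takes the 16×16 tile size stated above as an assumption. The row-major tile index is only a convenient way to report which tile a bad pixel belongs to, not a claim about the order in which the hardware walks tiles; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the text: 16x16-pixel tiles for RGBA8. */
#define TILE_SIZE 16u

/* Number of tiles needed to cover `pixels` along one axis (rounds up). */
static uint32_t tiles_along(uint32_t pixels)
{
    assert(pixels > 0u);
    return (pixels + TILE_SIZE - 1u) / TILE_SIZE;
}

/* Total number of tiles touched by a width x height attachment. */
static uint32_t tile_count(uint32_t width, uint32_t height)
{
    return tiles_along(width) * tiles_along(height);
}

/* Row-major index of the tile containing pixel (col, row); for reporting only. */
static uint32_t tile_index(uint32_t col, uint32_t row, uint32_t width)
{
    return (row / TILE_SIZE) * tiles_along(width) + (col / TILE_SIZE);
}
```

Under this assumption, tile_count(4, 4) is 1 and tile_count(64, 64) is 16, which is the difference the 64×64 choice is meant to buy.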

Why gl_FragCoord-encoded color

A plain constant-color fragment shader passes even if rasterization is wildly wrong (every pixel gets the same value). An encoded color exposes:

  • Off-by-half-pixel: gl_FragCoord in Vulkan is pixel + 0.5, so floor(gl_FragCoord.x) = pixel_x. Wrong drivers might emit pixel_x + 1 or pixel_x - 1.
  • Y-axis flip: Vulkan's NDC y points down, OpenGL's points up. A driver that gets this backwards encodes (63 - row) instead of row.
  • Partial rasterization: missing tiles will retain the clear value (black) instead of the encoded value.
  • Coverage off-by-one at edges: pixels right at the fullscreen-triangle boundary should still be covered.
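
The failure modes listed above can be told apart mechanically from a single mismatching pixel. The sketch below is one hypothetical way to do it; the verdict names, the check order, and passing the clear value as a parameter are assumptions of this example. Near the vertical centre a Y flip and a one-row shift look identical for one pixel, so a verdict should be confirmed across several rows.

```c
#include <stdint.h>

#define PROBE_SIZE 64u

typedef enum {
    PIXEL_OK,
    PIXEL_UNCOVERED,    /* still holds the clear value: tile or pixel not rasterized */
    PIXEL_Y_FLIPPED,    /* G encodes (63 - row) instead of row */
    PIXEL_X_OFF_BY_ONE, /* R encodes col + 1 or col - 1 */
    PIXEL_OTHER         /* anything else, including a damaged B/A sentinel */
} pixel_verdict;

/* Classify one little-endian RGBA8 pixel read back from position (col, row). */
static pixel_verdict classify_pixel(uint32_t pixel, uint32_t col, uint32_t row,
                                    uint32_t clear_value)
{
    uint32_t r = pixel & 0xffu;
    uint32_t g = (pixel >> 8) & 0xffu;
    uint32_t expected = 0xff800000u | (row << 8) | col;

    if (pixel == expected)
        return PIXEL_OK;
    if (pixel == clear_value)
        return PIXEL_UNCOVERED;
    if ((pixel >> 16) != 0xff80u)
        return PIXEL_OTHER;
    if (r == col && g == PROBE_SIZE - 1u - row)
        return PIXEL_Y_FLIPPED;
    if (g == row && (r == col + 1u || r + 1u == col))
        return PIXEL_X_OFF_BY_ONE;
    return PIXEL_OTHER;
}
```

The 0x80 sentinel in B is what keeps PIXEL_UNCOVERED unambiguous: a black clear, opaque or not, can never equal a correctly encoded pixel, not even pixel[0,0].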

Hypothesis space — where iter3 may fail first

  1. Pipeline creation / shader compilation. PanVk-Bifrost's NIR lowering for vertex + fragment shaders may produce shaders that fail to link. iter1 proved compute shader compilation works; vert+frag is a different code path. Specifically: the vertex shader output → fragment shader input interface. On Bifrost, varyings are passed through buffers in memory rather than through tile memory; this probe declares no user varyings, so the interface reduces to the position output.

  2. Dynamic rendering plumbing. PanVk historically supported render passes first; VK_KHR_dynamic_rendering may be a thin shim with bugs on the v6/v7 path. The pColorAttachmentFormats field in VkPipelineRenderingCreateInfoKHR must match the actual attachment image format — if Mesa's PanVk-Bifrost doesn't propagate this correctly to the JM tiler descriptors, we'll get garbage or a fault.

  3. Rasterizer state plumbing. Viewport, scissor, cull mode, polygon mode, blend → tile descriptors. Bifrost's tile descriptor layout differs from Valhall's; any field that's been Valhall-shifted will produce wrong output.

  4. Tile binner / draw submission. The job manager (JM) submit path for a graphics draw fills the binning job + tiler job + frag job descriptors. The single triangle should generate one binning job that covers 16 tiles. Per-tile fragment job emission may fail or emit wrong tile coordinates.

  5. Fragment shader output → tilebuffer → image memory. The shader writes through Mali's tile-resident render target, then the tile gets flushed to the bound image. Any cache-flushing or per-tile detiling bug could show as wrong-but-consistent pixel values.

Phase 0 deliverables

  • This document.
  • Next phase (in scope for iter3): the probe.

In-scope (LOCKED 2026-05-19 for iter3)

  • Hardware: ohm only.
  • Image: 64×64 R8G8B8A8_UNORM, optimal tiling, COLOR_ATTACHMENT | TRANSFER_SRC.
  • Pipeline: vert + frag, no vertex input, TRIANGLE_LIST, static viewport+scissor, no blend, no depth.
  • Render: dynamic rendering only.
  • Verify: every pixel matches encoded position.

Out-of-scope (LOCKED 2026-05-19 for iter3)

  • VkRenderPass / VkFramebuffer (legacy API).
  • Vertex buffers / vertex input bindings.
  • Descriptors (UBO, SSBO, sampler, texture).
  • Push constants.
  • Multiple draws, instancing, indexed draws.
  • Depth / stencil buffer.
  • MSAA.
  • Dynamic pipeline state.
  • WSI / present.
  • Per-tile coverage variation (alpha, partial pixels); keep the clear color fully opaque.
  • Other formats.

Reference