From: marfrit-packages noether Subject: [PATCH] panvk-bifrost: fix XFB store channel-extract for packed varyings iter19 — fixes a reliable SIGSEGV during vkCreateGraphicsPipeline on any shader that uses XFB-bound varyings declared with non-zero `layout (component=N)` qualifiers. Surfaced by dEQP-VK.transform_feedback.simple.holes_vert; backtrace lands 11 frames into libvulkan_panfrost.so called from `vkt::TransformFeedback:: TransformFeedbackHolesInstance::iterate`. Root cause: `lower_xfb_output_iter17` (and upstream `lower_xfb_output`, which carries a `// TODO` on the same assertion) computes the source- channel mask as `mask << channel_idx`, where `channel_idx` is the varying-location component (0..3) but `src` only contains channels for the source-side range starting at `nir_intrinsic_component(intr)`. For `flat out float vegeta` declared with `component=2`, NIR emits `store_output src=, component=2`, and the lowering computes `mask << 2` against a single-component src — out-of-range; the resulting malformed nir_def then segfaults inside downstream NIR constant-folding (`nir_constant_expressions.c::evaluate_*`). The assertion `assert(nir_intrinsic_component(intr) == 0)` was inherited from upstream `pan_nir_lower_xfb.c` as a documented `// TODO`; release builds (-DNDEBUG) elide it. The fix translates `channel_idx` to the source-channel space by subtracting `nir_intrinsic_component(intr)` before shifting the mask, and replaces the elided asserts with explicit release-mode guards (the patch closes the same release-mode-elision class as the original bug). Verified on PineTab2 (Mali-G52 r1 MC1, PAN_ARCH 7) against vulkan-cts 1.3.10.0: - holes_vert / holes_extra_draw_vert no longer SIGSEGV (now Fail on color-check; that is a separate iter20 finding — the rasterized varying gets removed alongside the XFB-bound one). - basic_*: 36/36 Pass. depth_clip_*: 1 Pass + 4 NotSupported. lines_or_triangles*: 16 NotSupported. 0 Fail across the full set. - holes_geom / holes_extra_draw_geom remain NotSupported (geometryShader not on G52) — unchanged. Caveat: max_output_components_64/_128/_256 were never reached on the r5 sweep (watchdog killed transform_feedback after the holes_vert crash). With this fix in place, those tests now run and surface *their own pre-existing* coredumps — confirmed on shipped r6 baseline too. They are NOT regressions from this patch; they are latent crashes unmasked by it. iter20+ territory. Phase 5 (2nd-model) review: APPROVE WITH CHANGES (non-blocking). Changes applied: release-mode defensive guards on both preconditions plus a dispatcher-side comment clarifying the i*2+j semantics. Cross-refs: - iter19/phase{0,1,2,3}_holes_vert*.md in panvk-bifrost repo --- src/panfrost/vulkan/panvk_vX_xfb_lower.c | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/src/panfrost/vulkan/panvk_vX_xfb_lower.c b/src/panfrost/vulkan/panvk_vX_xfb_lower.c @@ -339,7 +339,20 @@ unsigned buffer, unsigned offset_words) { assert(buffer < MAX_XFB_BUFFERS); - assert(nir_intrinsic_component(intr) == 0); + + /* iter19: nir_intrinsic_component(intr) is the source-channel base — + * for a packed varying like `layout (location=0, component=2) flat out + * float vegeta`, NIR emits store_output with component=2 and a single- + * component src. The XFB iteration index `channel_idx` (0..3) is the + * varying-location component, not the source channel. Translate by + * subtracting the base before shifting the mask. Fixes the long- + * standing `assert(nir_intrinsic_component(intr) == 0) // TODO` in + * upstream pan_nir_lower_xfb that surfaces on holes_vert. */ + const unsigned base_comp = nir_intrinsic_component(intr); + /* Defensive against release-build elision: this is precisely the + * bug class the patch is fixing, so don't re-introduce it. */ + if (channel_idx < base_comp) + return; uint16_t stride = b->shader->info.xfb_stride[buffer] * 4; assert(stride != 0); @@ -357,7 +370,11 @@ nir_def *src = intr->src[0].ssa; nir_component_mask_t mask = nir_component_mask(num_components); - nir_def *value = nir_channels(b, src, mask << channel_idx); + const unsigned src_channel = channel_idx - base_comp; + /* Same defensive class as the channel_idx >= base_comp guard above. */ + if (src_channel + num_components > src->num_components) + return; + nir_def *value = nir_channels(b, src, mask << src_channel); /* Topology dispatch ladder. LIST first (fast path). */ nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_LIST)); @@ -465,6 +482,9 @@ for (unsigned j = 0; j < 2; ++j) { if (!xfb.out[j].num_components) continue; + /* `i*2+j` is the varying-location component (0..3) — io_xfb covers + * slots 0..1, io_xfb2 covers 2..3. The leaf translates this into + * a source-channel index by subtracting nir_intrinsic_component(intr). */ lower_xfb_output_iter17(b, intr, i * 2 + j, xfb.out[j].num_components, xfb.out[j].buffer, xfb.out[j].offset); progress = true;