c14c22f942
Adds 0007-panvk-bifrost-xfb-component-base-fix.patch — eliminates a
reliable SIGSEGV in vkCreateGraphicsPipeline whenever an XFB-bound
vertex output is declared with non-zero `layout (component=N)`.
Surfaced by dEQP-VK.transform_feedback.simple.holes_vert (Mali-G52 r1
MC1, PAN_ARCH 7). Backtrace lands 11 frames into libvulkan_panfrost.so
called from vkt::TransformFeedback::TransformFeedbackHolesInstance::
iterate.
Root cause: iter17's lower_xfb_output_iter17 (and upstream
pan_nir_lower_xfb, which has the identical `// TODO`) computes the
source-channel mask as `mask << channel_idx`, where channel_idx is
the varying-location component (0..3) but src only contains channels
starting at nir_intrinsic_component(intr). For a scalar declared
component=2, the lowering computed `mask << 2` against a 1-component
src — out-of-range; the malformed nir_def segfaulted in downstream
NIR constant-folding (nir_constant_expressions.c::evaluate_*).
Fix translates channel_idx to source-channel space by subtracting
nir_intrinsic_component(intr) before shifting the mask, and replaces
the elided release-mode asserts with explicit release-mode guards
(closes the same elision class as the original bug).
Verified on PineTab2 (Mali-G52 r1 MC1, PAN_ARCH 7) against vulkan-cts
1.3.10.0:
- holes_vert / holes_extra_draw_vert no longer SIGSEGV (now Fail
on color-check; that is a separate iter20 finding).
- basic_*: 36/36 Pass. depth_clip_*: 1 Pass + 4 NotSupported.
lines_or_triangles*: 16 NotSupported. 0 Fail across the set.
Caveat (not regressions): max_output_components_64/_128/_256 were
never reached on the r5 sweep — watchdog killed transform_feedback
after the holes_vert crash. With this fix in place, they now run
and surface their own pre-existing coredumps, confirmed on shipped
r6 baseline too. iter20+ territory.
Phase 5 (2nd-model) review: APPROVE WITH CHANGES (non-blocking).
Changes applied: release-mode defensive guards on both preconditions
plus a dispatcher-side comment clarifying the i*2+j semantics.
Cross-refs:
- ~/src/panvk-bifrost/iter19/phase{0,1,2,3}_holes_vert*.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
104 lines
5.1 KiB
Diff
104 lines
5.1 KiB
Diff
From: marfrit-packages noether <claude-noether@reauktion.de>
|
|
Subject: [PATCH] panvk-bifrost: fix XFB store channel-extract for packed varyings
|
|
|
|
iter19 — fixes a reliable SIGSEGV during vkCreateGraphicsPipeline on any
|
|
shader that uses XFB-bound varyings declared with non-zero `layout
|
|
(component=N)` qualifiers. Surfaced by
|
|
dEQP-VK.transform_feedback.simple.holes_vert; backtrace lands 11 frames
|
|
into libvulkan_panfrost.so called from `vkt::TransformFeedback::
|
|
TransformFeedbackHolesInstance::iterate`.
|
|
|
|
Root cause: `lower_xfb_output_iter17` (and upstream `lower_xfb_output`,
|
|
which carries a `// TODO` on the same assertion) computes the source-
|
|
channel mask as `mask << channel_idx`, where `channel_idx` is the
|
|
varying-location component (0..3) but `src` only contains channels for
|
|
the source-side range starting at `nir_intrinsic_component(intr)`. For
|
|
`flat out float vegeta` declared with `component=2`, NIR emits
|
|
`store_output src=<vec1>, component=2`, and the lowering computes
|
|
`mask << 2` against a single-component src — out-of-range; the
|
|
resulting malformed nir_def then segfaults inside downstream NIR
|
|
constant-folding (`nir_constant_expressions.c::evaluate_*`).
|
|
|
|
The assertion `assert(nir_intrinsic_component(intr) == 0)` was inherited
|
|
from upstream `pan_nir_lower_xfb.c` as a documented `// TODO`; release
|
|
builds (-DNDEBUG) elide it. The fix translates `channel_idx` to the
|
|
source-channel space by subtracting `nir_intrinsic_component(intr)`
|
|
before shifting the mask, and replaces the elided asserts with explicit
|
|
release-mode guards (the patch closes the same release-mode-elision
|
|
class as the original bug).
|
|
|
|
Verified on PineTab2 (Mali-G52 r1 MC1, PAN_ARCH 7) against vulkan-cts
|
|
1.3.10.0:
|
|
- holes_vert / holes_extra_draw_vert no longer SIGSEGV (now Fail on
|
|
color-check; that is a separate iter20 finding — the rasterized
|
|
varying gets removed alongside the XFB-bound one).
|
|
- basic_*: 36/36 Pass. depth_clip_*: 1 Pass + 4 NotSupported.
|
|
lines_or_triangles*: 16 NotSupported. 0 Fail across the full set.
|
|
- holes_geom / holes_extra_draw_geom remain NotSupported
|
|
(geometryShader not on G52) — unchanged.
|
|
|
|
Caveat: max_output_components_64/_128/_256 were never reached on the
|
|
r5 sweep (watchdog killed transform_feedback after the holes_vert
|
|
crash). With this fix in place, those tests now run and surface
|
|
*their own pre-existing* coredumps — confirmed on shipped r6 baseline
|
|
too. They are NOT regressions from this patch; they are latent crashes
|
|
unmasked by it. iter20+ territory.
|
|
|
|
Phase 5 (2nd-model) review: APPROVE WITH CHANGES (non-blocking).
|
|
Changes applied: release-mode defensive guards on both preconditions
|
|
plus a dispatcher-side comment clarifying the i*2+j semantics.
|
|
|
|
Cross-refs:
|
|
- iter19/phase{0,1,2,3}_holes_vert*.md in panvk-bifrost repo
|
|
|
|
---
|
|
src/panfrost/vulkan/panvk_vX_xfb_lower.c | 24 +++++++++++++++++++++---
|
|
1 file changed, 21 insertions(+), 3 deletions(-)
|
|
|
|
diff --git a/src/panfrost/vulkan/panvk_vX_xfb_lower.c b/src/panfrost/vulkan/panvk_vX_xfb_lower.c
|
|
@@ -339,7 +339,20 @@
|
|
unsigned buffer, unsigned offset_words)
|
|
{
|
|
assert(buffer < MAX_XFB_BUFFERS);
|
|
- assert(nir_intrinsic_component(intr) == 0);
|
|
+
|
|
+ /* iter19: nir_intrinsic_component(intr) is the source-channel base —
|
|
+ * for a packed varying like `layout (location=0, component=2) flat out
|
|
+ * float vegeta`, NIR emits store_output with component=2 and a single-
|
|
+ * component src. The XFB iteration index `channel_idx` (0..3) is the
|
|
+ * varying-location component, not the source channel. Translate by
|
|
+ * subtracting the base before shifting the mask. Fixes the long-
|
|
+ * standing `assert(nir_intrinsic_component(intr) == 0) // TODO` in
|
|
+ * upstream pan_nir_lower_xfb that surfaces on holes_vert. */
|
|
+ const unsigned base_comp = nir_intrinsic_component(intr);
|
|
+ /* Defensive against release-build elision: this is precisely the
|
|
+ * bug class the patch is fixing, so don't re-introduce it. */
|
|
+ if (channel_idx < base_comp)
|
|
+ return;
|
|
|
|
uint16_t stride = b->shader->info.xfb_stride[buffer] * 4;
|
|
assert(stride != 0);
|
|
@@ -357,7 +370,11 @@
|
|
|
|
nir_def *src = intr->src[0].ssa;
|
|
nir_component_mask_t mask = nir_component_mask(num_components);
|
|
- nir_def *value = nir_channels(b, src, mask << channel_idx);
|
|
+ const unsigned src_channel = channel_idx - base_comp;
|
|
+ /* Same defensive class as the channel_idx >= base_comp guard above. */
|
|
+ if (src_channel + num_components > src->num_components)
|
|
+ return;
|
|
+ nir_def *value = nir_channels(b, src, mask << src_channel);
|
|
|
|
/* Topology dispatch ladder. LIST first (fast path). */
|
|
nir_push_if(b, nir_ieq_imm(b, topology, PANVK_XFB_TOPO_LIST));
|
|
@@ -465,6 +482,9 @@
|
|
for (unsigned j = 0; j < 2; ++j) {
|
|
if (!xfb.out[j].num_components)
|
|
continue;
|
|
+ /* `i*2+j` is the varying-location component (0..3) — io_xfb covers
|
|
+ * slots 0..1, io_xfb2 covers 2..3. The leaf translates this into
|
|
+ * a source-channel index by subtracting nir_intrinsic_component(intr). */
|
|
lower_xfb_output_iter17(b, intr, i * 2 + j, xfb.out[j].num_components,
|
|
xfb.out[j].buffer, xfb.out[j].offset);
|
|
progress = true;
|