panvk-bifrost/cts-results/r5_full_sweep_2026-05-25.md
marfrit 02b212b29b iter19 close + r5 full sweep scoreboard
Adds iter19 campaign-close doc (mesa-panvk-bifrost r7 ship, XFB
packed-varying channel-extract fix). Also lands the r5 full
dEQP-VK sweep scoreboard (2.26M tests, 97.65% runnable pass rate)
that surfaced the holes_vert crash in the first place.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 11:55:53 +02:00


r5 full dEQP-VK sweep — 2026-05-25

  • Driver under test: mesa-panvk-bifrost-26.0.6.r5-1 (fragmentStoresAndAtomics flip)
  • Hardware: PineTab2 / RK3566 / Mali-G52 r1 MC1 (PAN_ARCH 7)
  • CTS: vulkan-cts-1.3.10.0 (commit 7aa3551)
  • Run host: ohm
  • Run window: 2026-05-24 ~22:00 → 2026-05-25 02:48 CEST (~5h wall)

Corrected scoreboard

| Metric | Count | % of total |
|---|---|---|
| Total tests | 2,258,378 | 100.00 |
| Passed | 553,447 | 24.51 |
| Failed | 13,318 | 0.59 |
| NotSupported | 1,691,613 | 74.90 |
| Pass rate (runnable) | - | 97.65% |

| Group status | Count |
|---|---|
| Clean groups (0 failures) | 41/53 |
| Groups with failures | 12/53 |
| Watchdog-killed (data recovered) | 3/53 |
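
The percentages in the scoreboard can be re-derived from the three raw counts. The sketch below assumes that "runnable" means Passed + Failed (NotSupported excluded), which is the reading that reproduces the 97.65% figure; the variable and function names are illustrative and not taken from the sweep's own aggregation script.

```python
# Counts copied from the scoreboard.
passed = 553_447
failed = 13_318
not_supported = 1_691_613

total = passed + failed + not_supported
# Assumption: "runnable" excludes NotSupported tests, which never execute.
runnable = passed + failed


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


print("total tests:      ", total)
print("passed %:         ", pct(passed, total))
print("failed %:         ", pct(failed, total))
print("not supported %:  ", pct(not_supported, total))
print("runnable pass %:  ", pct(passed, runnable))
```

Keeping the runnable pass rate separate from the "% of total" column avoids reading 24.51% as the driver's pass rate, since roughly three quarters of the suite is NotSupported on this hardware.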

The 3 watchdog-killed groups (api, subgroups, transform_feedback) hit dEQP's internal watchdog timer and got SIGKILL'd (exit=137), but the per-group .qpa files contain all completed test results — totals were re-derived from those qpa files directly. The wrapper's elapsed=3s field is misleading; the real runs went the full distance.
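
As a rough illustration of re-deriving totals from a truncated .qpa file, the sketch below tallies result status codes and reports the case that was still open when the log ended. It assumes the usual dEQP log layout (a `#beginTestCaseResult <case>` marker followed by an XML `<Result StatusCode="...">` element); it is not the aggregation script used for this sweep, and the case names in the sample are hypothetical.

```python
import re
from collections import Counter

# Assumed qpa markers; verify against a real log before relying on them.
BEGIN = "#beginTestCaseResult "
STATUS_RE = re.compile(r'<Result\s+StatusCode="([^"]+)"')


def tally_qpa(text):
    """Return (status counts, name of the case left without a result, or None)."""
    counts = Counter()
    open_case = None
    for line in text.splitlines():
        if line.startswith(BEGIN):
            # A new case starts; remember it until its result shows up.
            open_case = line[len(BEGIN):].strip()
        elif open_case is not None:
            match = STATUS_RE.search(line)
            if match:
                counts[match.group(1)] += 1
                open_case = None
    return counts, open_case


# Hypothetical sample: two finished cases, then a case cut off by a kill.
SAMPLE = "\n".join([
    "#beginTestCaseResult dEQP-VK.example.a",
    '<TestCaseResult CasePath="dEQP-VK.example.a">',
    ' <Result StatusCode="Pass">ok</Result>',
    "</TestCaseResult>",
    "#endTestCaseResult",
    "#beginTestCaseResult dEQP-VK.example.b",
    '<TestCaseResult CasePath="dEQP-VK.example.b">',
    ' <Result StatusCode="Fail">mismatch</Result>',
    "</TestCaseResult>",
    "#endTestCaseResult",
    "#beginTestCaseResult dEQP-VK.example.c",
    '<TestCaseResult CasePath="dEQP-VK.example.c">',
])

counts, unfinished = tally_qpa(SAMPLE)
print(dict(counts), unfinished)
```

Because the tally only depends on cases that reached a result element, a log that stops mid-test still yields correct counts for everything before it, and the unfinished case is the natural candidate for the "last test attempted" entry.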

Failed groups (sorted by fail count)

| Group | Fails | Notes |
|---|---|---|
| image | 10,445 | 78% of all fails. Single biggest target. |
| subgroups | 2,342 | Bifrost uses 4- or 8-wide warps; many fails are likely an architecture gap |
| glsl | 306 | |
| memory_model | 106 | Bifrost memory ordering; partial coverage expected |
| draw | 68 | |
| transform_feedback | 27 | All resume_*; documented iter17 by-design gap |
| compute | 8 | |
| pipeline | 5 | |
| descriptor_indexing | 4 | |
| api | 3 | |
| spirv_assembly | 3 | |
| info | 1 | |
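
The per-group counts can be cross-checked against the scoreboard's Failed total, and the share of each group computed, with a few lines. This is a minimal sketch using only the numbers in the table; the dictionary and variable names are illustrative.

```python
# Fail counts per group, copied from the table.
fails = {
    "image": 10_445,
    "subgroups": 2_342,
    "glsl": 306,
    "memory_model": 106,
    "draw": 68,
    "transform_feedback": 27,
    "compute": 8,
    "pipeline": 5,
    "descriptor_indexing": 4,
    "api": 3,
    "spirv_assembly": 3,
    "info": 1,
}

total_fails = sum(fails.values())

# Share of all fails per group, in percent, largest first.
share = {
    group: round(100.0 * count / total_fails, 1)
    for group, count in sorted(fails.items(), key=lambda kv: kv[1], reverse=True)
}

for group, percent in share.items():
    print(f"{group:20s} {fails[group]:>6d} {percent:5.1f}%")
print("total", total_fails)
```

The sum matching the scoreboard's Failed count is a cheap consistency check on the recovered data from the watchdog-killed groups.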

Watchdog-stop tests (last test attempted in each group)

| Group | Last test | Verdict |
|---|---|---|
| api | dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic | watchdog (total time) |
| subgroups | dEQP-VK.subgroups.clustered.graphics.subgroupclusteredadd_ivec3 | watchdog (touch interval) |
| transform_feedback | dEQP-VK.transform_feedback.simple.holes_vert | userspace coredump, reproducible; see #107 |

transform_feedback 27 fails — all known by-design

Every fail has the shape dEQP-VK.transform_feedback.simple.resume_*: 3 stream-count cells × {resume, resume_beginqueryindexed_streamid_0, resume_endqueryindexed_streamid_0} × {256, 512, 131072}, which gives 3 × 3 × 3 = 27 cases. All fail at vktTransformFeedbackSimpleTests.cpp:949 with received:N expected:0. These match the iter17 closeout note ("remaining 81 fails are by-design resume_* tests, transformFeedbackDraw=false"). The set here is a strict subset of those 81, because this sweep configuration exercised only 27 of the cells. No r5 regression on iter17 XFB.
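
The 27-case count and the "all fails are resume_*" claim can both be checked mechanically. In the sketch below the three stream-count cells are given placeholder labels because the report does not name them, and the failed-test names passed to the filter are hypothetical; only the variant names, the sizes, and the name prefix come from the text.

```python
from fnmatch import fnmatchcase
from itertools import product

# Placeholder labels: the report only says there are three stream-count cells.
stream_cells = ["cell_0", "cell_1", "cell_2"]
variants = [
    "resume",
    "resume_beginqueryindexed_streamid_0",
    "resume_endqueryindexed_streamid_0",
]
sizes = [256, 512, 131072]

cells = list(product(stream_cells, variants, sizes))
print(len(cells))  # 3 * 3 * 3 = 27

BY_DESIGN = "dEQP-VK.transform_feedback.simple.resume_*"


def unexpected_fails(failed_names):
    """Return the failed tests that do NOT match the by-design resume_* pattern."""
    return [name for name in failed_names if not fnmatchcase(name, BY_DESIGN)]


# Hypothetical input: one by-design fail and one that would need triage.
example = [
    "dEQP-VK.transform_feedback.simple.resume_example",
    "dEQP-VK.transform_feedback.simple.other_example",
]
print(unexpected_fails(example))
```

An empty result from `unexpected_fails` on the real fail list would support the "no regression" reading; any entry it returns is a fail outside the documented iter17 gap.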

Next actions

  • Phase 0 on holes_vert coredump — userspace-only (dmesg clean), reproducible in isolation against r6 too. Tracked as task #107.
  • r7 candidate sort: the image group's 10,445 fails are the largest mass; it needs a Phase 0 evidence sort to determine whether they stem from one root cause replicated across format/dim cells or from many.

Provenance

  • Raw per-group qpa + run.log: ohm:/home/mfritsche/cts-results/r5_full/
  • Summary log: ohm:/home/mfritsche/cts-results/r5_full/summary.log
  • Driver: ohm:/usr/lib/panvk-bifrost/libvulkan_panfrost.so (r5 then r6)

Aggregation script: see git history of this file.