iter19 close + r5 full sweep scoreboard

Adds iter19 campaign-close doc (mesa-panvk-bifrost r7 ship, XFB packed-varying channel-extract fix). Also lands the r5 full dEQP-VK sweep scoreboard (2.26M tests, 97.65% runnable pass rate) that surfaced the holes_vert crash in the first place.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 11:55:53 +02:00
parent 6019cce7d1
commit 02b212b29b
2 changed files with 176 additions and 0 deletions
@@ -0,0 +1,78 @@
+# r5 full dEQP-VK sweep — 2026-05-25
+
+Driver under test: `mesa-panvk-bifrost-26.0.6.r5-1` (fragmentStoresAndAtomics flip)
+Hardware: PineTab2 / RK3566 / Mali-G52 r1 MC1 (PAN_ARCH 7)
+CTS: vulkan-cts-1.3.10.0 (commit 7aa3551)
+Run host: ohm
+Run window: 2026-05-24 ~22:00 → 2026-05-25 02:48 CEST (~5h wall)
+
+## Corrected scoreboard
+
+| Metric                | Count        | % of total |
+|-----------------------|--------------|------------|
+| Total tests           | 2,258,378    | 100.00     |
+| Passed                |   553,447    |  24.51     |
+| Failed                |    13,318    |   0.59     |
+| NotSupported          | 1,691,613    |  74.90     |
+| **Pass rate (runnable = Passed + Failed)** | - | **97.65%** |
+
+| Group status                       | Count |
+|------------------------------------|-------|
+| Clean groups (0 failures)          | 41/53 |
+| Groups with failures               | 12/53 |
+| Watchdog-killed (data recovered)   |  3/53 |
+
+The 3 watchdog-killed groups (`api`, `subgroups`, `transform_feedback`) hit dEQP's
+internal watchdog timer and were SIGKILLed (exit=137), but the per-group `.qpa`
+files contain all completed test results, so totals were re-derived from those
+qpa files directly. The wrapper's `elapsed=3s` field is misleading; the real
+runs went the full distance.
+
+## Failed groups (sorted by fail count)
+
+| Group               | Fails  | Notes                                                        |
+|---------------------|--------|--------------------------------------------------------------|
+| image               | 10,445 | 78% of all fails. Single biggest target.                     |
+| subgroups           |  2,342 | Bifrost uses 4- or 8-wide warps; many fails likely arch-gap  |
+| glsl                |    306 |                                                              |
+| memory_model        |    106 | Bifrost memory ordering — expected partial coverage          |
+| draw                |     68 |                                                              |
+| transform_feedback  |     27 | All `resume_*` — **documented iter17 by-design gap**         |
+| compute             |      8 |                                                              |
+| pipeline            |      5 |                                                              |
+| descriptor_indexing |      4 |                                                              |
+| api                 |      3 |                                                              |
+| spirv_assembly      |      3 |                                                              |
+| info                |      1 |                                                              |
+
+## Watchdog-stop tests (last test attempted in each group)
+
+| Group              | Last test                                                                   | Verdict          |
+|--------------------|-----------------------------------------------------------------------------|------------------|
+| api                | `dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic` | watchdog (total time) |
+| subgroups          | `dEQP-VK.subgroups.clustered.graphics.subgroupclusteredadd_ivec3`              | watchdog (touch interval) |
+| transform_feedback | `dEQP-VK.transform_feedback.simple.holes_vert`                                 | **userspace coredump, reproducible** — see #107 |
+
+## transform_feedback 27 fails — all known by-design
+
+Every fail has the shape `dEQP-VK.transform_feedback.simple.resume_*`:
+3 stream-count cells × {`resume`, `resume_beginqueryindexed_streamid_0`,
+`resume_endqueryindexed_streamid_0`} × {256, 512, 131072}, i.e. 3 × 3 × 3 = 27.
+All fail at `vktTransformFeedbackSimpleTests.cpp:949` with `received:N expected:0`.
+These match the iter17 closeout note ("remaining 81 fails are by-design
+resume_* tests, transformFeedbackDraw=false"). The set here is a strict
+subset of the 81; this run only hit the 27 cells that this sweep configuration
+exercised. **No r5 regression on iter17 XFB.**
+
+## Next actions
+
+- **Phase 0 on `holes_vert` coredump** — userspace-only (dmesg clean), reproducible in isolation against r6 too. Tracked as task #107.
+- **r7 candidate sort** — `image` group's 10,445 fails are the largest mass; needs Phase 0 evidence sort to see if it's one root cause replicated across format/dim cells or many.
+
+## Provenance
+
+- Raw per-group qpa + run.log: `ohm:/home/mfritsche/cts-results/r5_full/`
+- Summary log: `ohm:/home/mfritsche/cts-results/r5_full/summary.log`
+- Driver: `ohm:/usr/lib/panvk-bifrost/libvulkan_panfrost.so` (r5 then r6)
+
+Aggregation script: see git history of this file.
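
The re-derivation from `.qpa` files and the runnable pass rate in the scoreboard can be sketched as follows. This is not the aggregation script referenced above. It is a minimal illustration that assumes each test case opens with a `#beginTestCaseResult <name>` line and records its verdict in a `<Result StatusCode="...">` element. The function names, the `Incomplete` bucket, and the `dEQP-VK.example.*` test names are hypothetical.

```python
import re
from collections import Counter

# Assumed qpa layout: "#beginTestCaseResult <name>" opens a case and a
# <Result StatusCode="..."> element records its verdict. Check real logs first.
BEGIN_CASE = re.compile(r"^#beginTestCaseResult\s+(\S+)")
RESULT = re.compile(r'<Result\s+StatusCode="([^"]+)"')


def tally_qpa(text):
    """Count verdicts per status code in the text of one .qpa file."""
    counts = Counter()
    in_case = False
    status = None
    for line in text.splitlines():
        if BEGIN_CASE.match(line):
            # Close the previous case; one with no verdict was cut short.
            if in_case:
                counts[status or "Incomplete"] += 1
            in_case, status = True, None
        elif in_case and status is None:
            found = RESULT.search(line)
            if found:
                status = found.group(1)
    # The final case may have been interrupted by the watchdog kill.
    if in_case:
        counts[status or "Incomplete"] += 1
    return counts


def runnable_pass_rate(passed, failed):
    """Pass rate over tests that actually ran (NotSupported excluded)."""
    return 100.0 * passed / (passed + failed)


# Hypothetical three-case log; the last case never reached a verdict.
SAMPLE = """#beginSession
#beginTestCaseResult dEQP-VK.example.a
<TestCaseResult CasePath="dEQP-VK.example.a">
 <Result StatusCode="Pass">Pass</Result>
</TestCaseResult>
#endTestCaseResult
#beginTestCaseResult dEQP-VK.example.b
<TestCaseResult CasePath="dEQP-VK.example.b">
 <Result StatusCode="NotSupported">n/a</Result>
</TestCaseResult>
#endTestCaseResult
#beginTestCaseResult dEQP-VK.example.c
"""

if __name__ == "__main__":
    print(dict(tally_qpa(SAMPLE)))
    print(round(runnable_pass_rate(553447, 13318), 2))
```

Summing the per-group counters would give sweep-wide totals. How the original aggregation treated the last, unfinished test of each watchdog-killed group is not stated here, so the separate `Incomplete` bucket is a guess rather than a description of that script.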