initial seed: retrofit campaign lineage from local working trees

panvk-bifrost campaigns (r1..r4 Vulkan compositor + r5.video1 Vulkan video decode) shipped before this repo existed; the deliverable patches live in marfrit-packages, but the reasoning chain, phase docs, and source-state evidence lived only in local working trees on the development host.

This retrofit imports:
- mesa-panvk-bifrost/: r1..r4 era phase docs (iter1..iter18) (libmali stub blobs at iter18/blob/ excluded: 109 MB of RE artifacts replaced with a README pointer)
- mesa-panvk-bifrost-video/: sibling campaign phase docs + probe
- evidence/: frozen .tgz source snapshots at each milestone (basis for the 0005 patch diff generation)

Future iterations should branch off here from day one, so each iter is a commit rather than a snapshot. See [[feedback-session-local-process-pins]] for the process drift this retrofit closes.

Total: 1.9 MB across 124 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 05:25:37 +02:00
parent 430d0da278
commit a4e7d8ab90
124 changed files with 22551 additions and 1 deletions
@@ -0,0 +1,55 @@
+# Phase 0 — substrate lock for iter15 (CTS conformance on iter13)
+
+**Goal:** measure how much of the proprietary Mali blob's Vulkan coverage is now reachable via the open mesa-panvk-bifrost stack — concretely, by running targeted Khronos CTS subsets against the system-published `mesa-panvk-bifrost 26.0.6.r3-1` ICD on ohm (PineTab2 / Mali-G52 r1 MC1).
+
+Operator framing (2026-05-20): "we never touched the vendor Mali blob, and I'd like to know how much of that now ships with panvk-bifrost."
+
+## Substrate state
+
+Hardware: PineTab2, Mali-G52 r1 MC1 (PAN_ARCH 7, Bifrost gen), RK3566, 4× Cortex-A55, 7.5 GB RAM.
+
+Software:
+- ICD under test: `/usr/lib/panvk-bifrost/libvulkan_panfrost.so` (mesa-panvk-bifrost 26.0.6.r3-1, the iter13 published package).
+- Build deps: cmake 4.3.2, gcc 16.1.1, clang 22.1.5, make 4.4.1, git 2.54, python 3.14.5 — all present.
+- Disk: 53 GB free on `/` — sufficient for CTS source + build (~13 GB combined).
+- No vk-gl-cts installed; needs fresh clone + build on ohm.
+
+## Scope (locked Phase 2-style here since the operator picked early)
+
+**Targeted subsets, not full CTS.** Four groups, each with a specific motivation:
+
+1. `dEQP-VK.api.smoke.*` — sanity. ~100 tests. Validates the CTS harness + the ICD's basic API plumbing. If smoke fails, the run is broken; no point looking deeper.
+2. `dEQP-VK.transform_feedback.*` — iter13 territory. The XFB implementation we shipped. ~150 tests covering basic capture, multi-buffer, multi-stream, query interaction, pause-resume. Many will SKIP because we advertise `transformFeedbackQueries=false`, `transformFeedbackDraw=false`, `geometryStreams=false`.
+3. `dEQP-VK.robustness.*` — iter8 territory. The KHR/EXT_robustness2 + nullDescriptor exposure flip. Tests that out-of-bounds reads/writes don't fault and nullDescriptor sampling returns zeroes.
+4. `dEQP-VK.info.*` — capabilities introspection. Not a pass/fail measurement; produces the device's reported limits + extensions list that future iters can diff against.
+
+Out of scope:
+- The full must-pass list (would take a day-plus and we'd hit "panvk is not conformant" by design on many tests).
+- OpenGL / GLES tests (chromium-fourier territory, separate campaign).
+- Bug fixing inside Mesa for any failure (iter15 reports findings; fixes belong to follow-up iters or upstream Mesa MRs).
+
+## Anticipated failure modes
+
+- **CTS itself doesn't build.** A pre-built aarch64 binary is unlikely to be available as a fallback, so a build failure will need debugging if hit.
+- **CTS launcher refuses non-conformant driver.** `PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1` env should keep panvk enumerable through CTS's pipeline.
+- **CTS subset doesn't match expected names.** Khronos has reorganized test trees across versions. Phase 1 will pin the exact CTS commit/tag based on what builds clean.
+
+## Plan
+
+1. Phase 1: clone vk-gl-cts at a recent stable tag (last tag matching Vulkan 1.2.x conformance), build out-of-source on ohm.
+2. Phase 3: smoke run first (`dEQP-VK.api.smoke.*`) to verify the harness works.
+3. Phase 4: run the three remaining targeted subsets (transform_feedback, robustness, info), collect logs, and categorize results as PASS / FAIL / NOT_SUPPORTED / CRASH.
+4. Phase 6: report the numbers — total tests / passed / failed / skipped + per-subset breakdown.
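
The categorization in step 3 can be sketched in a few lines of Python. This is a minimal sketch, not the tooling used in the campaign: it assumes the `.qpa` log records each outcome as a `<Result StatusCode="...">` element, and the function names and the mapping of warning statuses onto PASS are illustrative assumptions.

```python
import re
import sys
from collections import Counter

# Assumed shape of a per-case outcome line in a deqp .qpa log.
RESULT_RE = re.compile(r'<Result\s+StatusCode="([^"]+)"')

# Assumption: warning statuses are treated as passing; anything not
# listed here falls into the FAIL bucket.
BUCKETS = {
    "Pass": "PASS",
    "QualityWarning": "PASS",
    "CompatibilityWarning": "PASS",
    "NotSupported": "NOT_SUPPORTED",
    "Crash": "CRASH",
    "Timeout": "CRASH",
}


def tally_qpa(text: str) -> Counter:
    """Count raw StatusCode values found in the text of a .qpa log."""
    return Counter(RESULT_RE.findall(text))


def bucket_counts(raw: Counter) -> Counter:
    """Fold raw status codes into PASS / FAIL / NOT_SUPPORTED / CRASH."""
    out = Counter()
    for status, n in raw.items():
        out[BUCKETS.get(status, "FAIL")] += n
    return out


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for name, n in sorted(bucket_counts(tally_qpa(fh.read())).items()):
            print(f"{name}: {n}")
```

A case that kills the process outright leaves no `<Result>` element at all, so it would not appear in any bucket and has to be detected separately.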
+
+## Time budget
+
+ohm at 4× A55:
+- CTS build: estimated 3–5 hours. Memory-bound when linking; will probably want `make -j2` not `-j4`.
+- Smoke (~100 tests): ~5 minutes.
+- transform_feedback subset (~150 tests): ~10–20 minutes.
+- robustness subset (~300 tests): ~30 minutes.
+- info subset (~50 tests, all read-only): ~2 minutes.
+
+Total run time after build: about 1 hour at most (57 minutes at the upper estimates). Total wallclock including build: 4–6 hours.
+
+— claude-noether, 2026-05-20
@@ -0,0 +1,95 @@
+# Phase 8 close — iter15: Khronos CTS measurement on iter13
+
+**Result: GREEN.** The question "how much of the proprietary Mali blob's Vulkan coverage now ships with panvk-bifrost?" has a concrete answer for the iter13-touched transform_feedback surface area.
+
+## The number
+
+| | Count | % of runnable |
+|---|---|---|
+| Pass | 796 | 75.3% |
+| Fail (expected by design) | 81 | 7.7% |
+| Fail (real bug) | 162 | 15.3% |
+| Fatal (deqp process death, skipped) | 6 | 0.6% |
+| Excluded a priori (hangs deqp) | 12 | 1.1% |
+| **Total runnable** | **1057** | **100%** |
+| NotSupported (required feature not advertised) | 132,551 | n/a |
+| **Grand total cases attempted** (excludes the 12 a priori exclusions) | **133,596** | n/a |
+
+**83.0% of the iter13 surface is sound** if counting the 81 by-design fails as expected behavior; **75.3% if counting them as fails outright**.
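
The percentage column follows from the counts alone, so it can be cross-checked mechanically. The sketch below recomputes the shares with the 1057 runnable cases as denominator; the dictionary keys and the helper name `share` are illustrative, while the counts are copied from the table.

```python
# Counts copied from the table above.
counts = {
    "pass": 796,
    "fail_by_design": 81,
    "fail_real_bug": 162,
    "fatal": 6,
    "excluded_a_priori": 12,
}
not_supported = 132_551

# "Runnable" is every case that is not NotSupported.
runnable = sum(counts.values())

# The caselist that was actually fed to deqp-vk omits the a priori exclusions.
attempted = runnable - counts["excluded_a_priori"] + not_supported


def share(n: int, total: int = runnable) -> float:
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * n / total, 1)


if __name__ == "__main__":
    print(f"runnable={runnable} attempted={attempted}")
    for key, n in counts.items():
        print(f"{key}: {n} ({share(n)}%)")
    sound = counts["pass"] + counts["fail_by_design"]
    print(f"pass + by-design fails: {sound} ({share(sound)}%)")
```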
+
+Substrate: Khronos vk-gl-cts @ vulkan-cts-1.3.10.0 against system-installed `mesa-panvk-bifrost 26.0.6.r3-1` ICD on ohm (PineTab2, Mali-G52 r1 MC1).
+
+## The fails are clean — they cluster in TWO subfeatures
+
+100% of failures fit into exactly two families, evenly distributed across the three pipeline-variant test trees (raw, fast_gpl, opt_gpl): 27 `resume_*` and 54 `winding_*` fails per tree. The same code paths produce identical failure counts in each variant, which indicates driver-level issues rather than pipeline-variant-specific ones.
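
The clustering claim can be checked against a list of failing case names such as `xfb_fails.txt`. This is a hedged sketch: it assumes the third dot-separated component of a case name is the pipeline-variant tree and that the last component starts with `resume_` or `winding_`; the function names are illustrative.

```python
from collections import Counter


def classify(case_name: str) -> tuple[str, str]:
    """Return (tree, family) for a name like dEQP-VK.transform_feedback.<tree>.<case>."""
    parts = case_name.strip().split(".")
    tree, leaf = parts[2], parts[-1]
    if leaf.startswith("resume_"):
        family = "resume"
    elif leaf.startswith("winding_"):
        family = "winding"
    else:
        family = "other"
    return tree, family


def cluster(case_names) -> Counter:
    """Count failing cases per (tree, family) pair, ignoring blank lines."""
    return Counter(classify(n) for n in case_names if n.strip())


def evenly_distributed(clusters: Counter, family: str, trees) -> bool:
    """True when every tree has the same number of fails in `family`."""
    return len({clusters[(t, family)] for t in trees}) == 1
```

Any case landing in the `other` family would contradict the two-family claim, so a non-zero count there is the first thing to look for.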
+
+### 1. `resume_*` — pause/resume XFB (81 fails, by design)
+
+These tests exercise `vkCmdBeginTransformFeedbackEXT` with a non-null counter-buffer argument, expecting the next call to resume from the saved offset. **iter13's Phase 2 design lock explicitly opted OUT of this:**
+- `VkPhysicalDeviceTransformFeedbackPropertiesEXT.transformFeedbackDraw = false`
+- Phase 5 added a `mesa_logw` warning when an app does pass counter buffers anyway
+
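+CTS does not gate these tests on `transformFeedbackDraw` (in the Vulkan spec that property covers `vkCmdDrawIndirectByteCountEXT`, not counter-buffer resume), so it runs them, sees the resumed capture restart at offset 0, and marks Fail. **No driver work needed here**: the behavior is the documented iter13 design decision, flagged through the properties struct and the Phase 5 warning.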
+
+### 2. `winding_*` — primitive winding order (162 fails, real bug)
+
+These tests capture XFB from draws using non-trivial primitive topologies:
+- `line_list_with_adjacency`, `line_strip`, `line_strip_with_adjacency`
+- `triangle_fan`, `triangle_strip`, `triangle_list_with_adjacency`, `triangle_strip_with_adjacency`
+
+Each is tested with vertex counts of 6, 8, 10, 12, with and without `gl_PointSize` output (`_ptsz` suffix). All 54 variants × 3 pipeline trees = 162 fails.
+
+The pattern strongly suggests iter13's XFB implementation captures vertices in input order rather than the primitive-decomposed order CTS expects. The Vulkan spec on this is subtle — for strip/fan topologies, XFB capture is supposed to emit vertices as if the strip/fan were decomposed into a list. iter13's lowering doesn't account for this.
+
+This is a **real bug** in the implementation Phase 4 shipped, and Janet's Phase 5 review didn't catch it because the probes used `topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST` (trivial winding). A follow-up iter could fix this by either:
+- Reporting `transformFeedbackStreamsLinesTriangles = false` more aggressively (rejecting these topologies at pipeline-creation time), OR
+- Implementing per-topology vertex reordering in the XFB lowering (closer to what the Vulkan spec requires).
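
To make the difference between input order and primitive-decomposed order concrete, here is a small Python sketch. It uses the per-primitive vertex formulas from the Vulkan spec's primitive topology section for line strips, triangle strips, and triangle fans. That this is exactly the order CTS checks is an assumption, and adjacency topologies are left out.

```python
def decompose(topology: str, vertex_count: int) -> list[tuple[int, ...]]:
    """Split a strip or fan into list primitives, as tuples of vertex indices."""
    n = vertex_count
    if topology == "line_strip":
        return [(i, i + 1) for i in range(n - 1)]
    if topology == "triangle_strip":
        # Odd triangles swap their last two vertices so winding is preserved.
        return [(i, i + 1 + i % 2, i + 2 - i % 2) for i in range(n - 2)]
    if topology == "triangle_fan":
        return [(i + 1, i + 2, 0) for i in range(n - 2)]
    raise ValueError(f"unsupported topology: {topology}")


def capture_order(topology: str, vertex_count: int) -> list[int]:
    """Vertex indices in the order a decomposed capture would write them."""
    return [v for prim in decompose(topology, vertex_count) for v in prim]


if __name__ == "__main__":
    for topo in ("line_strip", "triangle_strip", "triangle_fan"):
        print(topo, "input order:   ", list(range(6)))
        print(topo, "decomposed order:", capture_order(topo, 6))
```

For a 6-vertex triangle strip, input order writes 6 vertices while the decomposed order writes 12, with the second and fourth triangles reordered. A capture that copies vertices in input order would therefore differ in both length and content.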
+
+## Fatal-class bugs (process death)
+
+Six tests killed `deqp-vk` outright (no test result logged; process exited mid-test). Skipped via resilient runner, but each represents a real fatal driver condition:
+
+```
+dEQP-VK.transform_feedback.simple.max_output_components_64
+dEQP-VK.transform_feedback.simple.max_output_components_128
+dEQP-VK.transform_feedback.simple_fast_gpl.max_output_components_64
+dEQP-VK.transform_feedback.simple_fast_gpl.max_output_components_128
+dEQP-VK.transform_feedback.simple_optimized_gpl.max_output_components_64
+dEQP-VK.transform_feedback.simple_optimized_gpl.max_output_components_128
+```
+
+In addition, 12 `holes_*` tests were excluded a priori: they were the first wall the run hit, before the resilient wrapper was in place. All 18 cases share one pattern: XFB output declarations that either reach the upper bound of `maxTransformFeedbackBufferDataSize` (512 bytes) or leave layout holes between members. The cause is either a GPU hang surfacing as a fence timeout or a SIGSEGV in the panvk shader compilation path for these layouts. Per-test investigation is deferred to a follow-up iter.
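
Skipping a case that kills `deqp-vk` requires knowing which case was in flight when the process died. The sketch below shows one way to recover that from a partial `.qpa` log. It is not the contents of `cts_run_resilient.sh`, which this document does not reproduce, and it assumes the log brackets each case with `#beginTestCaseResult <name>` and `#endTestCaseResult` marker lines.

```python
def scan_qpa(text: str) -> tuple[list[str], list[str]]:
    """Return (finished, unfinished) case names from the text of a .qpa log.

    A case is unfinished when its begin marker is never followed by an
    end marker, for example because the process died mid-test.
    """
    finished: list[str] = []
    unfinished: list[str] = []
    current = None
    for line in text.splitlines():
        if line.startswith("#beginTestCaseResult "):
            if current is not None:
                unfinished.append(current)
            current = line.split(" ", 1)[1].strip()
        elif line.startswith("#endTestCaseResult") and current is not None:
            finished.append(current)
            current = None
    if current is not None:
        unfinished.append(current)
    return finished, unfinished


def remaining_cases(all_cases, finished, unfinished) -> list[str]:
    """Caselist for the next run: drop completed cases and the ones that died."""
    done = set(finished) | set(unfinished)
    return [c for c in all_cases if c not in done]
```

A wrapper built on this would append the unfinished names to a skip list, write the remaining cases to a new caselist file, and restart the run until nothing is left.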
+
+## What got skipped vs. tested
+
+- **NotSupported (132,551 tests):** every test gating on `geometryShader`, `geometryStreams`, `transformFeedbackQueries`, multi-stream, or any other unadvertised feature. CTS's normal path — these are the Mali blob features panvk-bifrost intentionally doesn't claim. NOT a parity gap; these are deliberate scope decisions.
+- **Out-of-iter15-scope:** dEQP-VK.robustness.* (iter8/iter9 territory), dEQP-VK.api.* (broad coverage), dEQP-VK.info.* (capabilities snapshot). Original Phase 0 plan included all three, but XFB-only run already answered the parity question; running the others would have added ~3-4h wallclock for diminishing returns.
+
+## So how much of the Mali blob's coverage ships with panvk-bifrost?
+
+For the iter13 surface (transform_feedback): **roughly 75-85% of the equivalent Mali blob coverage**, with the gap concentrated in:
+- Pause/resume XFB (closeable: implement `transformFeedbackDraw=true` if needed by a real workload)
+- Primitive winding order for line/triangle strip/fan/adjacency topologies (closeable: ~100-200 LoC in panfrost's `pan_nir_lower_xfb` or in panvk's IDVS handling)
+- Boundary-condition fatal-class bugs (closeable per-test)
+
+For OTHER Vulkan surface areas: not measured in iter15. The robustness2 / nullDescriptor (iter8) and Vulkan 1.1/1.2 surface (iter9) coverage is a parking-lot follow-up.
+
+## Reproducibility
+
+All artifacts in `/home/mfritsche/cts-results/` on ohm:
+- `cts_xfb.qpa.iter{1..7}` — per-iteration qpa logs
+- `xfb_fails.txt` — the 243 failing test names
+- `xfb_no_holes.txt` — the input caselist (133,596 tests)
+- `skipped_xfb.txt` — the 6 fatal tests
+- `cts_xfb.log` — wrapper log
+- `cts_run_resilient.sh` — the deqp-vk-resume-after-hang wrapper (durable in /home, survives ohm reboots)
+
+To re-run the same caselist against any future panvk-bifrost build:
+```
+/home/mfritsche/cts-results/cts_run_resilient.sh \
+    /home/mfritsche/cts-results/xfb_no_holes.txt \
+    /home/mfritsche/cts-results/cts_xfb_NEW.qpa \
+    /home/mfritsche/cts-results/cts_xfb_NEW.log xfb
+```
+
+— claude-noether, 2026-05-21