Campaign-contained data discipline + repo notice

Operator directive 2026-05-03: this campaign acquires all its own measurement data from scratch. Predecessor numbers are documented for context but never imported as binding cells, comparison targets, or success thresholds. The lesson the predecessor (kwin_overlay_subsurface) closed-without-patch on is exactly that: phase 1 cells anchored to a single historical ohm_gl_fix Phase 0 measurement, three weeks of phase planning on a baseline that didn't reproduce in-session. The strongest version of feedback_replicate_baseline_first.md: "don't import predecessor data, acquire it fresh." The discipline is now documented as a governing rule in three places: - README.md § "Campaign-contained data discipline" - phase0_findings.md § "Campaign-contained data discipline (governing rule)" - worklist.md § "Governing rule (every phase)" Concrete consequences: - A1 baseline (Phase 0 task) is now mandatory at N=3 reps. Single-rep wasn't enough to surface session variance in the predecessor; doing 3 up front makes the baseline robust to the same kind of session-state drift that ate the predecessor's premise. - Phase 1 thresholds are drawn against the A1 baseline measured in this campaign, not against any predecessor number. - metrics.csv (when it lands) only carries data from this campaign's reps. No predecessor rows imported. README.md additionally: - Adds the predecessor chain (ohm_gl_fix -> kwin_overlay_subsurface -> this campaign) with explicit "what stays valid for source- reading" vs "numbers that don't" separation. - Calls out durable substrate available from predecessors: KWin scanout-promotion archaeology, measurement-protocol template, WAYLAND_DEBUG parser. All structural; none measurement-numerical. - Carry-over predecessor system state on ohm (governor pin, baloo disabled, fourier packages) is explicitly distinguished from measurement data. System state inherits; data does not. - Repository line points to the gitea remote ssh://gitea@git.reauktion.de:2222/marfrit/x11-session-research.git phase0_findings.md additionally: - Reframes the predecessor-close-out summary section header to "(context, not data)" and rephrases past-tense numbers so none are stated as "the baseline." - Adds the discipline lesson narrative in-line before the predecessor close-out: a 30-minute N=3 same-session baseline on day 1 of the predecessor would have made the campaign different — and that's the move this campaign starts with. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 06:59:06 +00:00
parent ac301b4e48
commit e8bae670d3
3 changed files with 215 additions and 57 deletions
@@ -1,24 +1,68 @@
-# Phase 0 — substrate and provisional research question
+# Phase 0 — substrate and locked research question

-This is the campaign's Phase 0 substrate doc: what we already
-know from the predecessor `kwin_overlay_subsurface` close-out,
-what's open, and what the candidate research question looks
-like. **The research question is provisional and awaits
-operator confirmation before Phase 1 lock.**
+This is the campaign's Phase 0 doc: the locked research
+question (see § "Research question (LOCKED 2026-05-03)"
+below), the substrate inherited as *context* from the
+predecessor `kwin_overlay_subsurface`, and the open questions
+this campaign answers in-session before Phase 1 binding cells
+lock.

-## Predecessor close-out summary
+## Campaign-contained data discipline (governing rule)
+
+**This campaign acquires all its own measurement data from
+scratch in this session.** Predecessor measurement numbers
+(drop counts, perf percentages, Δ_present medians, kwin %CPU
+values, threshold values) are documented for *context* but
+**never imported as binding cells, comparison targets, or
+success criteria**.
+
+Concretely, in this Phase 0 doc:
+
+- Numbers from `kwin_overlay_subsurface` Phase 3 / Phase 3-prime
+  (e.g. "C5' had 32 drops total / 22 post-warmup"; "median
+  Δ_present 45.85 ms"; "kwin %CPU median 36.90") are quoted
+  **only as evidence of past measurements that may or may not
+  reproduce**. They establish the SHAPE of what's measurable
+  and the fact that measurement infrastructure works on this
+  hardware. They do **not** establish "this is the baseline
+  the X11 cells will be compared against."
+- The Phase 0 A1 baseline rep (worklist.md) is the
+  in-session anchor for any with-KWin Wayland measurement
+  this campaign references.
+- No `metrics.csv` row in this campaign is populated from
+  predecessor data, even if the predecessor measured an
+  identical condition.
+- If an X11 cell appears to "match the cage Phase 0 number,"
+  that's incidental — the cage Phase 0 number is not the test
+  this campaign is running. The test is "X11 vs in-session
+  Wayland baseline."
+
+The discipline lesson is concrete: the predecessor's Phase 1
+binding cell `drops_post_warmup == 0` was anchored to a single
+ohm_gl_fix Phase 0 cage measurement (7 drops total / 0
+post-warmup). Three weeks of phase planning ran on the
+assumption that floor existed. At N=3 in-session replication
+on the closing day of the predecessor, that floor was missing
+(cage today: 22 / 26 / 56 post-warmup). The campaign closed
+without patch. **A 30-minute N=3 same-session baseline check
+on day 1 would have made the campaign different — or made it
+honestly close earlier.** This campaign acts on that lesson.
+
+## Predecessor close-out summary (context, not data)

 [`../kwin_overlay_subsurface/phase8_handover.md`](../kwin_overlay_subsurface/phase8_handover.md)
 (closed 2026-05-03 without patch). Three independent reasons
-no patch landed:
+no patch landed (numbers below are quoted as historical
+context per the governing rule above):

-1. The campaign's locked Phase 1 reference floor
-   (`drops_post_warmup == 0` from cage) is unreachable at N=3
-   today. Today's median is 26 post-warmup with the same
+1. The predecessor's locked Phase 1 reference floor
+   (`drops_post_warmup == 0` from cage) was unreachable in the
+   predecessor's closing N=3 measurement session — same
   chromium-fourier binary, same hardware, same kernel, same
-   Mesa, same kwin-fourier — KWin direct reproduces Phase 0's
-   29 post-warmup, but cage now also drops ~22-56 post-warmup
-   instead of Phase 0's 0.
+   Mesa, same kwin-fourier as the original Phase 0 measurement.
+   KWin direct's number reproduced; cage's 0-floor did not.
+   Numbers quoted in the predecessor's `phase3_prime_findings.md`;
+   not used here as a baseline for any cell.
 2. The campaign's surface-of-investigation
   (`wp_subsurface` overlay route) is not engaged by
   `brave_drops_test.html`. Chromium-fourier renders the video