Campaign-contained data discipline + repo notice

Operator directive 2026-05-03: this campaign acquires all its
own measurement data from scratch. Predecessor numbers are
documented for context but never imported as binding cells,
comparison targets, or success thresholds. The lesson the
predecessor (kwin_overlay_subsurface) closed-without-patch on
is exactly that: phase 1 cells anchored to a single historical
ohm_gl_fix Phase 0 measurement, three weeks of phase planning
on a baseline that didn't reproduce in-session.

The strongest version of feedback_replicate_baseline_first.md:
"don't import predecessor data, acquire it fresh." The discipline
is now documented as a governing rule in three places:

- README.md § "Campaign-contained data discipline"
- phase0_findings.md § "Campaign-contained data discipline
  (governing rule)"
- worklist.md § "Governing rule (every phase)"

Concrete consequences:
- A1 baseline (Phase 0 task) is now mandatory at N=3 reps.
  Single-rep wasn't enough to surface session variance in the
  predecessor; doing 3 up front makes the baseline robust to
  the same kind of session-state drift that ate the
  predecessor's premise.
- Phase 1 thresholds are drawn against the A1 baseline measured
  in this campaign, not against any predecessor number.
- metrics.csv (when it lands) only carries data from this
  campaign's reps. No predecessor rows imported.

README.md additionally:
- Adds the predecessor chain (ohm_gl_fix -> kwin_overlay_subsurface
  -> this campaign) with explicit "what stays valid for source-
  reading" vs "numbers that don't" separation.
- Calls out durable substrate available from predecessors:
  KWin scanout-promotion archaeology, measurement-protocol
  template, WAYLAND_DEBUG parser. All structural; none
  measurement-numerical.
- Carry-over predecessor system state on ohm (governor pin,
  baloo disabled, fourier packages) is explicitly distinguished
  from measurement data. System state inherits; data does not.
- Repository line points to the gitea remote
  ssh://gitea@git.reauktion.de:2222/marfrit/x11-session-research.git

phase0_findings.md additionally:
- Reframes the predecessor-close-out summary section header to
  "(context, not data)" and rephrases past-tense numbers so
  none are stated as "the baseline."
- Adds the discipline lesson narrative in-line before the
  predecessor close-out: a 30-minute N=3 same-session baseline
  on day 1 of the predecessor would have made the campaign
  different — and that's the move this campaign starts with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-03 06:59:06 +00:00
parent ac301b4e48
commit e8bae670d3
3 changed files with 215 additions and 57 deletions
+58 -14
View File
@@ -1,24 +1,68 @@
# Phase 0 — substrate and provisional research question
# Phase 0 — substrate and locked research question
This is the campaign's Phase 0 substrate doc: what we already
know from the predecessor `kwin_overlay_subsurface` close-out,
what's open, and what the candidate research question looks
like. **The research question is provisional and awaits
operator confirmation before Phase 1 lock.**
This is the campaign's Phase 0 doc: the locked research
question (see § "Research question (LOCKED 2026-05-03)"
below), the substrate inherited as *context* from the
predecessor `kwin_overlay_subsurface`, and the open questions
this campaign answers in-session before Phase 1 binding cells
lock.
## Predecessor close-out summary
## Campaign-contained data discipline (governing rule)
**This campaign acquires all its own measurement data from
scratch in this session.** Predecessor measurement numbers
(drop counts, perf percentages, Δ_present medians, kwin %CPU
values, threshold values) are documented for *context* but
**never imported as binding cells, comparison targets, or
success criteria**.
Concretely, in this Phase 0 doc:
- Numbers from `kwin_overlay_subsurface` Phase 3 / Phase 3-prime
(e.g. "C5' had 32 drops total / 22 post-warmup"; "median
Δ_present 45.85 ms"; "kwin %CPU median 36.90") are quoted
**only as evidence of past measurements that may or may not
reproduce**. They establish the SHAPE of what's measurable
and the fact that measurement infrastructure works on this
hardware. They do **not** establish "this is the baseline
the X11 cells will be compared against."
- The Phase 0 A1 baseline rep (worklist.md) is the
in-session anchor for any with-KWin Wayland measurement
this campaign references.
- No `metrics.csv` row in this campaign is populated from
predecessor data, even if the predecessor measured an
identical condition.
- If an X11 cell appears to "match the cage Phase 0 number,"
that's incidental — the cage Phase 0 number is not the test
this campaign is running. The test is "X11 vs in-session
Wayland baseline."
The discipline lesson is concrete: the predecessor's Phase 1
binding cell `drops_post_warmup == 0` was anchored to a single
ohm_gl_fix Phase 0 cage measurement (7 drops total / 0
post-warmup). Three weeks of phase planning ran on the
assumption that floor existed. At N=3 in-session replication
on the closing day of the predecessor, that floor was missing
(cage today: 22 / 26 / 56 post-warmup). The campaign closed
without patch. **A 30-minute N=3 same-session baseline check
on day 1 would have made the campaign different — or made it
honestly close earlier.** This campaign acts on that lesson.
## Predecessor close-out summary (context, not data)
[`../kwin_overlay_subsurface/phase8_handover.md`](../kwin_overlay_subsurface/phase8_handover.md)
(closed 2026-05-03 without patch). Three independent reasons
no patch landed:
no patch landed (numbers below are quoted as historical
context per the governing rule above):
1. The campaign's locked Phase 1 reference floor
(`drops_post_warmup == 0` from cage) is unreachable at N=3
today. Today's median is 26 post-warmup with the same
1. The predecessor's locked Phase 1 reference floor
(`drops_post_warmup == 0` from cage) was unreachable in the
predecessor's closing N=3 measurement session — same
chromium-fourier binary, same hardware, same kernel, same
Mesa, same kwin-fourier — KWin direct reproduces Phase 0's
29 post-warmup, but cage now also drops ~22-56 post-warmup
instead of Phase 0's 0.
Mesa, same kwin-fourier as the original Phase 0 measurement.
KWin direct's number reproduced; cage's 0-floor did not.
Numbers quoted in the predecessor's `phase3_prime_findings.md`;
not used here as a baseline for any cell.
2. The campaign's surface-of-investigation
(`wp_subsurface` overlay route) is not engaged by
`brave_drops_test.html`. Chromium-fourier renders the video