From eff6fb5b29a3df0dc3cb3b63a5ad9945e9708cf7 Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Sun, 3 May 2026 06:38:24 +0000
Subject: [PATCH] Phase 0: campaign skeleton, research question pending
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Spun off 2026-05-03 from the closed-without-patch
kwin_overlay_subsurface campaign (its phase8_handover.md is the
predecessor). The candidate research question is whether running an
X11 session on PineTab2 reproduces, eliminates, or transforms the
drop-inversion phenomenon that motivated the predecessor, but the
framing is provisional and awaits operator confirmation before
Phase 1 lock.

phase0_findings.md is the substrate doc:
- Predecessor close-out summary (three reasons no patch landed;
  replicate-baseline-first lesson).
- What stays valid from the predecessor (Phase 1 scanout archaeology,
  Phase 2-prime KWin Wayland source-read which does NOT transfer to
  X11, Δ_present-46ms reproducible side-finding which is directly
  testable under X11, measurement infrastructure with
  WAYLAND_DEBUG-specific parts that don't transfer).
- Current ohm state (carry-over predecessor tooling, governor pin,
  baloo disabled, kwin-fourier still installed).
- Provisional research question with three plausible outcomes
  (α/β/γ) and four alternate framings the operator may have in mind
  that this question doesn't cover.
- Working-assumption out-of-scope list (no patches, no MRs, no
  Δ_present chase yet).
- Four pre-question Phase 0 deliverables that are unblocked
  regardless of framing: ohm state snapshot, X11-path inventory, X11
  measurement-tool inventory, A1 Wayland-baseline rep on this
  campaign's session for future comparison anchor.

worklist.md tracks Phase 0 only. Phase 1 lock awaits the research
question. Discipline carry-overs from kwin_overlay_subsurface listed
(replicate baseline first; phase discipline; non-upstreaming default;
memory persistence at close).
README.md status banner: Phase 0 in progress, research question
pending operator confirmation.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .gitignore         |   8 ++
 README.md          |  73 ++++++++++++++++
 phase0_findings.md | 213 +++++++++++++++++++++++++++++++++++++++++++++
 worklist.md        |  55 ++++++++++++
 4 files changed, 349 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 README.md
 create mode 100644 phase0_findings.md
 create mode 100644 worklist.md

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..7f8a700
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,8 @@
+upstream/
+phase*_evidence/**/perf.data
+phase*_evidence/**/wayland_debug.log
+phase*_evidence/**/x11_trace.log
+phase*_evidence/**/top_full.txt
+phase*_evidence/**/stderr.log
+__pycache__/
+*.pyc
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..0419da1
--- /dev/null
+++ b/README.md
@@ -0,0 +1,73 @@
+# x11-session-research
+
+> **Status: PHASE 0, research question pending operator
+> confirmation.** See [`phase0_findings.md`](phase0_findings.md)
+> § "Research question (provisional)" for the candidate framing
+> drafted from the predecessor campaign's close-out. Phase 1
+> lock will not happen before the operator confirms or
+> redirects.
+
+Campaign to investigate X11 session behaviour on PineTab2
+RK3568, in the context of the just-closed
+[`kwin_overlay_subsurface`](../kwin_overlay_subsurface/)
+campaign. Whether running an X11 (Xorg) session on the same
+hardware would (a) reproduce the drop-inversion phenomenon
+that motivated the predecessor, (b) bypass it entirely, or
+(c) introduce a different set of constraints, is the most
+likely campaign question, but this is a working assumption,
+not a locked goal. See `phase0_findings.md`.
+
+## Predecessor
+
+This campaign exists because
+[`../kwin_overlay_subsurface/`](../kwin_overlay_subsurface/)
+closed 2026-05-03 without patch (`phase8_handover.md`). Its
+diagnostic loop terminated at "Phase 0 cage = 0 post-warmup
+drops floor not reproducible at N=3." The natural next move
+across the design surface is to vary the display server (X11
+instead of Wayland) on the same hardware and the same client
+binary, but the operator has not yet confirmed that as the
+campaign's specific question.
+
+## Hardware target (provisional, same as predecessor)
+
+ohm: PineTab2, Rockchip RK3568 (4× Cortex-A55, Mali-G52 MP2,
+hantro G1/G2 VPU). Kernel `6.19.10-danctnix1-1-pinetab2`.
+Mesa 26.0.5. Currently runs KDE Plasma 6.6.4 Wayland.
+
+For an X11-session campaign, ohm needs an Xorg + Plasma X11 (or
+similar X11 desktop) install path verified. As of 2026-05-03,
+the only confirmed display path on ohm is
+`startplasma-wayland`. **Whether Plasma X11, an alternate X11
+desktop (XFCE, openbox, lightweight WM), or Plasma running
+under a Wayland-Xorg shim is in scope is part of the research
+question to be locked.**
+
+## Carry-overs from predecessor (still active on ohm)
+
+Per `kwin_overlay_subsurface/phase1_evidence/ohm_tooling_revert_log.md`:
+
+- `qt6-base-fourier 1:6.11.0-3` installed.
+- `kwin-fourier 1:6.6.4-3` installed.
+- CPU governor pinned to `performance` (was `conservative`).
+- Baloo permanently disabled
+  (`Indexing-Enabled=false` in `~/.config/baloofilerc`).
+- `drm-info 2.9.0-1` installed.
+
+These were not reverted at the predecessor's close-out. This
+campaign inherits them unless an explicit revert is part of
+the design.
+
+## Non-upstreaming default
+
+Inherited from the predecessor and from `ohm_gl_fix`. Bug
+reports + MRs are explicit operator-tasked decisions, not
+background process steps.
+
+## File map (will grow)
+
+| File | What it is |
+|---|---|
+| `README.md` | This file. |
+| `phase0_findings.md` | Substrate from the predecessor + the candidate research question. **Awaits operator confirmation/redirect on the question itself.** |
+| `worklist.md` | Phase-by-phase task list. Phase 0 only as of campaign start. |
diff --git a/phase0_findings.md b/phase0_findings.md
new file mode 100644
index 0000000..8d2f35e
--- /dev/null
+++ b/phase0_findings.md
@@ -0,0 +1,213 @@
+# Phase 0: substrate and provisional research question
+
+This is the campaign's Phase 0 substrate doc: what we already
+know from the predecessor `kwin_overlay_subsurface` close-out,
+what's open, and what the candidate research question looks
+like. **The research question is provisional and awaits
+operator confirmation before Phase 1 lock.**
+
+## Predecessor close-out summary
+
+[`../kwin_overlay_subsurface/phase8_handover.md`](../kwin_overlay_subsurface/phase8_handover.md)
+(closed 2026-05-03 without patch). Three independent reasons
+no patch landed:
+
+1. The campaign's locked Phase 1 reference floor
+   (`drops_post_warmup == 0` from cage) is unreachable at N=3
+   today. Today's median is 26 post-warmup with the same
+   chromium-fourier binary, same hardware, same kernel, same
+   Mesa, same kwin-fourier; KWin direct reproduces Phase 0's
+   29 post-warmup, but cage now also drops ~22-56 post-warmup
+   instead of Phase 0's 0.
+2. The campaign's surface-of-investigation
+   (`wp_subsurface` overlay route) is not engaged by
+   `brave_drops_test.html`. Chromium-fourier renders the video
+   element via internal compositing into its main browser
+   window surface, a single-surface case.
+3. The Phase 2 hot-path hypothesis
+   (`glEGLImageTargetTexture2DOES` dominates `kwin_wayland`'s
+   per-frame cost) was rejected by Phase 3 perf measurement
+   with a 100× margin on the wrong side of the threshold.
+
+The diagnostic loop terminated at "the campaign's premise was
+N=1 to begin with, and the N=3 in-session re-measurement
+doesn't replicate it." This is filed as a feedback memory:
+*replicate the N=1 baseline at N=3 in the same session BEFORE
+building multi-phase infrastructure around it*.
+
+## What stays valid from the predecessor
+
+Durable substrate listed in
+`kwin_overlay_subsurface/phase8_handover.md` § "What's left
+for a future session to pick up":
+
+- **Phase 1 scanout-promotion archaeology** (rockchip-drm
+  RK3568 plane format/modifier table, KWin v6.6.4 promotion
+  predicate). Plane 39 (Primary, NV12 LINEAR) is the GL
+  framebuffer; Plane 45 (Overlay) does not advertise NV12 in
+  any modifier. Both KWin scanout-promotion paths are
+  structurally rejected for windowed Brave on this DRM driver.
+  This holds regardless of display server.
+- **Phase 2 H1 file:line** in
+  `kwin_overlay_subsurface/phase2_source_findings.md`. Cold
+  per Phase 3 measurement; informational only.
+- **Phase 2-prime Shape C source-read** of
+  `Display::dispatchEvents` and `TransactionFence` in KWin's
+  `src/wayland/`. Specific to the Wayland path; **not relevant
+  to an X11-session campaign**. The X11 path uses different
+  KWin surface plumbing (`kwin_x11`) and a different per-frame
+  protocol (X11 Composite extension + Damage + XPresent), not
+  Wayland protocol dispatch.
+- **Δ_present-46 ms reproducible side-finding** under Plasma
+  Wayland. Across all measured conditions (chromium-fourier on
+  KWin, chromium-fourier in cage, stock Brave on KWin), median
+  Δ_present was 41-46 ms on a 60 Hz panel, a stable queue
+  depth of ~2.5-2.8 vsyncs (16.7 ms each). This finding is
+  independent of the cage breakdown and **directly testable
+  under X11** as a comparison point.
+- **Measurement infrastructure**:
+  `kwin_overlay_subsurface/scripts/wayland_debug_to_csv.py`
+  (libwayland 1.21+ format, 17 unit tests passing) +
+  `phase3_prime_runs/run_browser.sh` orchestrator on ohm
+  (handles `WAYLAND_DEBUG=1` capture, perf record, top
+  sampling, drops trajectory extraction, kill-cleanly). **The
+  WAYLAND_DEBUG portion does not apply under X11**; an X11
+  equivalent would be different tooling (`xtrace`, `xev`, or
+  XCB-debug instrumentation if the client emits any). The
+  perf+top+drops capture portion remains usable under X11
+  unchanged.
+
+## Current ohm state (carry-over from predecessor)
+
+Per `kwin_overlay_subsurface/phase1_evidence/ohm_tooling_revert_log.md`,
+not reverted at predecessor close-out:
+
+- `qt6-base-fourier 1:6.11.0-3`
+- `kwin-fourier 1:6.6.4-3` (Wayland-side compositor; not in
+  the hot path under an X11 session)
+- `mesa 1:26.0.5-1`
+- CPU governor pinned to `performance`
+- Baloo permanently disabled
+- `drm-info 2.9.0-1`
+- Active session: `startplasma-wayland` on tty2,
+  `kwin_wayland` PID 3927 (as of 2026-05-03 03:05 UTC).
+- Browser binaries available: `/tmp/chromium-ohm-gl-fix-step2/chrome`
+  (chromium-fourier, Step 1 + Step 2 patches, 149.0.7812.0),
+  `/usr/bin/brave` (`brave-bin 1:1.89.145-1`).
+
+If this campaign needs to switch ohm to an X11 session, that
+is a session-level operator action (logout, switch via SDDM,
+log back in). It cannot be done unattended.
+
+## Research question (provisional; awaits operator confirmation)
+
+**Candidate framing**, not locked:
+
+> *"On PineTab2 RK3568 with the same chromium-fourier binary,
+> the same `bbb_1080p30_h264.mp4` 30 fps source clip, and the
+> same `brave_drops_test.html` instrumented page, does running
+> an X11 session (Plasma X11, or an alternative X11 desktop)
+> reproduce the drop-inversion phenomenon that motivated
+> `kwin_overlay_subsurface`, eliminate it, or introduce a
+> different drop characteristic?"*
+
+This is the most narrowly relevant question given the
+predecessor's close-out. Three plausible outcomes:
+
+- **(α)** X11 reproduces low post-warmup drops (matches Phase 0
+  cage = 0 floor): isolates the dropped-frames mechanism to
+  the Wayland compositor stack on this hardware. The original
+  campaign's framing was correct in spirit but the cage
+  comparison was confounded; X11 becomes the better
+  comparator.
+- **(β)** X11 has comparable or higher post-warmup drops: the
+  drop phenomenon is hardware/kernel/Mesa-bound and does not
+  localise to the display-server stack at all. Predecessor's
+  Phase 8 closure stands; the X11 measurement is the
+  decisive cross-check.
+- **(γ)** X11 has a different failure mode entirely (different
+  drop pattern, different perf hot symbols, different effective
+  fps): each finding is its own characterisation; the
+  campaign becomes "what does running X11 on this hardware
+  look like end-to-end."
+
+**Alternate framings** the operator may have in mind that
+this provisional question doesn't cover:
+
+- *Daily-driver fitness*: "Can I use X11 instead of Wayland on
+  this device for everyday browser/video/desktop work, and
+  what works/breaks?" Different scope; less measurement-heavy,
+  more workflow-oriented.
+- *Specific X11-only feature investigation*: composite
+  redirection, XRender, GLAMOR, Xinerama on a single-display
+  device, etc.
+- *XWayland behaviour*: many Linux desktops run X11 clients
+  under Wayland via XWayland. If an "X11 session" really means
+  "test under XWayland to compare with native Wayland", the
+  measurement is fundamentally different.
+- *Power consumption / thermal*: X11 vs Wayland on a passively
+  cooled tablet may differ in idle CPU and thermal envelope.
+  Different metric set.
+
+**Operator decision needed before Phase 1**:
+
+1. Which question is in scope (drop phenomenon, daily-driver,
+   feature-specific, XWayland-vs-native, power, or something
+   else)?
+2. What "X11 session" means specifically: native Xorg + Plasma
+   X11; native Xorg + lightweight WM (e.g. openbox / i3 / xfwm);
+   XWayland under the existing Plasma Wayland session; or
+   another configuration.
+3. What the success/failure criteria look like (binding cells,
+   `metrics.csv` shape).
+
+Until those are answered, Phase 0 documents the question space
+and Phase 1 does not lock.
+
+## What's NOT in scope (working assumption)
+
+Until the research question is confirmed, the following are
+treated as out of scope so they don't slip into Phase 1
+prematurely:
+
+- Patches to KWin, Xorg, kwin-fourier, qt6-base-fourier, or any
+  other component on ohm. This is **research**, not
+  patch-development. Per non-upstreaming default, MR/bug-report
+  filing is explicitly tasked and not scheduled here.
+- The Δ_present-46 ms finding's investigation. It's a known
+  hook from the predecessor; whether this campaign chases it
+  depends on the locked research question.
+- Reverting predecessor tooling state. Governor, baloo,
+  `qt6-base-fourier`, `kwin-fourier` stay as-is unless the
+  operator decides otherwise.
+- Filing a bug for any of the predecessor's three documented
+  candidate findings. Same non-upstreaming default applies.
+
+## What Phase 0 will deliver, regardless of framing
+
+Even before the research question is locked, the following are
+useful Phase 0 deliverables that don't depend on the specific
+question:
+
+1. **State snapshot of ohm under current Plasma Wayland**
+   captured at campaign start. This is the *before* photo for
+   any future X11 vs Wayland comparison. Unattended-tractable
+   (just scripted SSH).
+2. **Inventory of available X11 paths on ohm**: what packages
+   are installed, what session candidates SDDM advertises,
+   what would need to be installed to enable a Plasma X11
+   session, what alternate WMs are available. Read-only,
+   unattended-tractable.
+3. **Inventory of measurement instruments that work under
+   X11**: `xtrace`, `xprop`, `xrandr --verbose --query`, perf
+   on `Xorg` PID, frame-timing extraction options. Read-only.
+4. **A1 baseline** under current Plasma Wayland: re-run a
+   single rep of the predecessor's `kwin_timing_nodebug`
+   condition immediately at the start of this campaign, so
+   the Wayland-vs-X11 comparison has a same-session anchor.
+   This is the "set the baseline before instrument changes"
+   discipline from `feedback_replicate_baseline_first.md`.
+
+These steps are unblocked. They don't commit to a specific
+research question and they produce evidence that's useful
+under any of the candidate framings.
diff --git a/worklist.md b/worklist.md
new file mode 100644
index 0000000..0a0b36a
--- /dev/null
+++ b/worklist.md
@@ -0,0 +1,55 @@
+# Work items: x11-session-research
+
+## Phase 0: substrate + research question framing
+
+**Status: IN PROGRESS.** Substrate doc landed
+(`phase0_findings.md`). Research question is provisional,
+awaits operator confirmation. Pre-question Phase 0 deliverables
+listed below are unblocked.
+
+- [x] Predecessor close-out summarised. Substrate doc
+      (`phase0_findings.md`) lists what stays valid from
+      `kwin_overlay_subsurface`, what's specific to Wayland and
+      doesn't transfer, and three plausible outcome shapes (α/β/γ)
+      for the candidate research question.
+- [ ] **Operator confirms the research question.** One candidate
+      framing (outcomes α/β/γ) + four alternates are listed in
+      `phase0_findings.md` § "Research question (provisional)".
+      Pick one (or correct the framing) before Phase 1.
+- [ ] State snapshot of ohm under current Plasma Wayland:
+      the campaign-start *before* photo. Unattended-tractable.
+- [ ] Inventory of available X11 paths on ohm: installed
+      packages, SDDM-advertised sessions, alternate WMs,
+      XWayland availability. Read-only.
+- [ ] Inventory of measurement instruments that work under
+      X11. `xtrace`, frame-timing tooling, perf on Xorg PID, etc.
+      Read-only.
+- [ ] A1 baseline: 1× `kwin_timing_nodebug`-equivalent run
+      on current Plasma Wayland session, captured into
+      `phase0_evidence/wayland_baseline_rep1/`. Same-session
+      anchor for any future X11 comparison.
+
+## Phase 1: locked research question + binding cells
+
+**Pending operator confirmation of the Phase 0 question.**
+Phase 1 lock will produce `phase1_lock.md` with binding cells
+specific to whichever framing is locked.
+
+## Phase 2-onwards
+
+Pending.
+
+## Discipline carry-overs from `kwin_overlay_subsurface`
+
+- *Replicate the baseline first*: per
+  `feedback_replicate_baseline_first.md`. Phase 0 task "A1
+  baseline" exists specifically because of this lesson; do not
+  skip it.
+- *Phase discipline*: no patches before source-read is
+  documented. Re-scoping must be honest about deferral target.
+- *Non-upstreaming default*: bug reports + MRs are explicit
+  operator-tasked decisions.
+- *Memory persistence rule*: when this campaign reaches its
+  diagnostic terminal state (success or honest closure), update
+  `project_campaign_overview.md` and add any new feedback
+  memory worth carrying forward to the next campaign.
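A minimal sketch of the "replicate the baseline first" check named in the discipline list above, assuming post-warmup drop counts are available as one integer per run. The function name `baseline_replicates`, the `tolerance` and `min_runs` parameters, and the sample values are hypothetical illustrations and not part of the campaign's tooling. The sample values are only chosen to be consistent with the median of 26 and the ~22-56 range quoted in `phase0_findings.md`; they are not the recorded per-run data.

```python
from statistics import median


def baseline_replicates(drops_per_run, reference_floor, tolerance=0, min_runs=3):
    """Check whether a same-session re-measurement reproduces a reference floor.

    drops_per_run   -- post-warmup dropped-frame counts, one per run
    reference_floor -- the value the earlier single-run measurement reported
    tolerance       -- hypothetical slack allowed above the floor
    min_runs        -- refuse to judge on fewer runs than this (N=3 here)

    Returns a tuple (replicated, observed_median).
    """
    if len(drops_per_run) < min_runs:
        raise ValueError(f"need at least {min_runs} runs, got {len(drops_per_run)}")
    observed = median(drops_per_run)
    return observed <= reference_floor + tolerance, observed


if __name__ == "__main__":
    # Illustrative values only (assumption): not the campaign's recorded runs.
    print(baseline_replicates([22, 26, 56], reference_floor=0))
```

With these illustrative inputs the check reports that a floor of 0 is not reproduced, since the median of the three runs is 26. Requiring a minimum run count before returning a verdict is one way to keep a single-run result from being locked in as a reference; the threshold and the use of the median are design assumptions here, not criteria taken from the campaign.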