3 in-session reps of chromium-fourier 149 / brave_drops_test.html /
Plasma Wayland 6.6.4 (kwin-fourier 6.6.4-3 + qt6-base-fourier
6.11.0-3 carry-overs intact). Tight cluster IQR=0:
drops_total=0, drops_post_warmup=0, frames_total=1685, kwin %CPU
median=0.00, mean=0.04. Perf samples on kwin (~30 over 70s) show
zero composite/dmabuf/GL symbols — only event-loop bookkeeping.
Most likely mechanism: KWin direct-scanout fast-path engaged for
the single-visible-client video case. The campaign's load-bearing
hypothesis ("X11 + non-compositing WM avoids per-frame GL composite
of NV12") is structurally weakened — KWin already avoids that work
under Wayland for this workload. Phase 1 needs to add a
multi-window A1' variant and drm_info-during-playback to confirm
direct-scanout, then revisit matrix cell design.
revert.log entry 6: SDDM autologin + state.conf swap that landed
the Plasma Wayland session for the A1 reps. Backup of original
state.conf preserved at /var/lib/sddm/state.conf.x11-research-bak;
single-command revert documented.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A1 baseline protocol — in-session Plasma Wayland anchor
Goal: acquire 3 in-session reps of chromium-fourier video playback under Plasma Wayland with KWin, so that the X11 cells of the matrix have a same-session Wayland reference to compare against. Per the campaign-contained-data discipline, this is the only Wayland baseline this campaign uses; predecessor numbers are reference history only.
Cell
- Browser: /tmp/chromium-ohm-gl-fix-step2/chrome (chromium-fourier 149.0.7812.0, the existing predecessor build).
- Page: file:///home/mfritsche/fourier-test/brave_drops_test.html (a 30 fps H.264 video element with autoplay + drops trajectory emitted to console at 1 Hz; used by the predecessor for all Phase 3 reps).
- Session: Plasma Wayland tty1 / session 433 (the live one, autologin'd via revert.log entry 6).
- Window: windowed (default chromium behavior, no fullscreen).
- Decode: chromium-fourier's default decode path. With the Step 1 + Step 2 patches present, this is libva via the libva-v4l2-request-fourier driver (V4L2 stateless on hantro).
- Capture window: 70 s starting at autoplay-detected.
- Instrumentation: top -p kwin_wayland (1 Hz), top (system, 1 Hz), sudo perf record -F 99 -g --call-graph dwarf -p kwin_wayland, browser stderr (catches the page's 1 Hz log, DROPS_TRAJECTORY: t=Xs tot=Y drop=Z). No WAYLAND_DEBUG=1: this is the nodebug variant, so the kwin %CPU and drop measurements aren't perturbed by WAYLAND_DEBUG's per-message overhead.
Bound metrics per rep
Each rep's evidence dir contains:
- start.txt / end.txt / capture_start.txt: wall-clock timestamps of phases.
- temp_pre.txt / temp_post.txt: thermal_zone0 (cpu) at phase boundaries.
- top_kwin.txt: kwin_wayland %CPU samples (70 × 1 Hz).
- top_full.txt: system-wide top (70 × 1 Hz).
- perf.data: perf record at 99 Hz on kwin_wayland.
- perf_report_self.txt: perf report (sorted by overhead).
- perf_report_top50.txt: first 50 lines of perf report.
- stderr.log: full chromium stderr.
- drops_trajectory.txt: extracted DROPS_TRAJECTORY lines.
- kwin_cpu_summary.txt: kwin %CPU samples / median / mean / min / max.
- drops_summary.txt: frames_total, drops_total, drops_post_warmup (drops accumulated after t=10 s).
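As a rough illustration of how the three drops_summary.txt values could be derived from the trajectory lines, here is a minimal Python sketch. It is not the orchestrator's actual code. The function name summarize_drops is hypothetical, the line shape is taken from the "DROPS_TRAJECTORY: t=Xs tot=Y drop=Z" description, and it assumes that tot and drop are cumulative counters, which should be checked against the test page.

```python
import re

# Assumed line shape, from the protocol text: "DROPS_TRAJECTORY: t=Xs tot=Y drop=Z".
LINE_RE = re.compile(
    r"DROPS_TRAJECTORY:\s*t=(\d+(?:\.\d+)?)s\s+tot=(\d+)\s+drop=(\d+)"
)

WARMUP_S = 10.0  # drops_post_warmup counts drops accumulated after t=10 s


def summarize_drops(lines):
    """Return frames_total, drops_total and drops_post_warmup for one rep.

    Assumes tot and drop are cumulative counters (an assumption, not confirmed
    by the protocol). Returns None when no trajectory line is found, which
    corresponds to the "n/a" case.
    """
    samples = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            samples.append((float(m.group(1)), int(m.group(2)), int(m.group(3))))
    if not samples:
        return None
    samples.sort(key=lambda s: s[0])
    _, frames_total, drops_total = samples[-1]
    # Drops already on the counter at the end of warmup (0 if no sample that early).
    warmup_drops = 0
    for t, _, drop in samples:
        if t <= WARMUP_S:
            warmup_drops = drop
    return {
        "frames_total": frames_total,
        "drops_total": drops_total,
        "drops_post_warmup": drops_total - warmup_drops,
    }
```

Feeding it the lines of stderr.log or drops_trajectory.txt would yield one dict per rep; lines that do not match the pattern are ignored, so chromium's other stderr output does not need to be filtered first.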
Protocol
Three reps back-to-back with ≥ 30 s idle between them to let thermals settle. The whole three-rep sequence takes just under 6 minutes of wall time:
T+0:00 rep 1: launch + 70s capture + cleanup (~95s)
T+1:35 30s idle (thermal settle)
T+2:05 rep 2: same (~95s)
T+3:40 30s idle
T+4:10 rep 3: same (~95s)
T+5:45 done — pull evidence
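The offsets in the timeline above follow from two numbers, ~95 s per rep and 30 s of idle between reps. The sketch below recomputes them, which may be handy if the rep count or idle time changes (for example 5 reps per cell). The function names are hypothetical and the durations are the approximate ones quoted in the timeline.

```python
REP_S = 95   # launch + 70 s capture + cleanup, per the timeline
IDLE_S = 30  # thermal settle between reps
REPS = 3


def fmt(seconds):
    """Format an offset in seconds as T+M:SS."""
    return f"T+{seconds // 60}:{seconds % 60:02d}"


def schedule(reps=REPS, rep_s=REP_S, idle_s=IDLE_S):
    """Return (offset, event) pairs for back-to-back reps with idle gaps."""
    events = []
    t = 0
    for i in range(1, reps + 1):
        events.append((fmt(t), f"rep {i}"))
        t += rep_s
        if i < reps:  # no idle period after the last rep
            events.append((fmt(t), "idle"))
            t += idle_s
    events.append((fmt(t), "done"))
    return events


if __name__ == "__main__":
    for offset, event in schedule():
        print(offset, event)
```

With the defaults this reproduces the six offsets listed above, ending at T+5:45 (3 × 95 s + 2 × 30 s = 345 s).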
SSH-driven: the orchestrator
/home/mfritsche/phase3_prime_runs/run_browser_nodebug.sh $RUN_ID chromium-fourier-kwin runs end-to-end from a single
SSH command. Operator-side, a chrome window will appear on
the screen for ~80 s per rep; the only operator action is
not interacting with that window (no clicks, no typing in
the chrome window, no pulling focus). The orchestrator kills
the chrome process cleanly at end of capture.
After the 3 reps complete, this campaign's evidence
sub-directory phase0_evidence/wayland_baseline_2026-05-03/
will contain:
a1_rep1/ (moved from /home/mfritsche/phase3_prime_runs/x11research_a1_rep1/)
a1_rep2/
a1_rep3/
a1_summary.md (this campaign's interpretation of the 3 reps)
The original predecessor evidence at
/home/mfritsche/phase3_prime_runs/kwin_timing_nodebug_rep[1-3]
is untouched.
Exit conditions
- Per-rep success = drops_summary.txt exists with non-n/a values, kwin_cpu_summary.txt exists with samples > 0, perf report has > 1000 samples.
- Per-rep failure causes:
  - autoplay not detected within 30 s → script aborts, evidence dir is partial; rep marked failed.
  - workload exits before autoplay → script aborts.
  - perf record fails (e.g. paranoid > 1) → script continues but perf.data is empty; we'd see this in perf_record_stderr.txt.
If a rep fails, surface the cause and re-run that rep before moving on.
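The per-rep success criteria can be expressed as a small check that also reports which criterion failed, so the cause can be surfaced before re-running. This is a sketch only. The function name rep_succeeded and the in-memory form of the inputs (a dict for drops_summary.txt, plain integers for the sample counts) are assumptions; the on-disk format of the summary files is not specified here.

```python
def rep_succeeded(drops_summary, kwin_samples, perf_samples):
    """Apply the three per-rep success criteria.

    drops_summary: dict of metric name -> value read from drops_summary.txt,
                   or None if the file is missing (hypothetical in-memory form).
    kwin_samples:  sample count from kwin_cpu_summary.txt, or None if missing.
    perf_samples:  sample count from the perf report.
    Returns (ok, reasons), where reasons lists every criterion that failed.
    """
    reasons = []
    required = ("frames_total", "drops_total", "drops_post_warmup")
    if drops_summary is None:
        reasons.append("drops_summary.txt missing")
    else:
        for key in required:
            # A missing key is treated the same as an explicit "n/a".
            if str(drops_summary.get(key, "n/a")).strip().lower() == "n/a":
                reasons.append(f"{key} is n/a")
    if not kwin_samples:
        reasons.append("kwin_cpu_summary.txt missing or has 0 samples")
    if perf_samples <= 1000:
        reasons.append(f"perf report has {perf_samples} samples (need > 1000)")
    return (not reasons, reasons)
```

Collecting all failed criteria, rather than stopping at the first, keeps the failure report complete when several things go wrong in the same rep.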
Decision after 3 reps
Compute median + IQR of drops_post_warmup, frames_total,
drops_total, and kwin %CPU across the three reps. Two
possible verdict shapes:
- Tight cluster (IQR / median ≤ 0.3): baseline is stable; Phase 1 binding cells can use the median as the anchor with the IQR as the tolerance band.
- High variance (IQR / median > 0.3): baseline is noisy; Phase 1 needs ≥ 5 reps per cell, not 3, and binding-cell thresholds need IQR-based formulation rather than fixed numbers. This is the predecessor lesson built into the worklist's "3 reps minimum (variance is a real concern)".
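A minimal sketch of the verdict computation follows. Two points are assumptions rather than part of the protocol. First, quartiles are taken with Python's statistics.quantiles using method="inclusive"; with only 3 points, other quartile conventions give a different IQR. Second, IQR / median is undefined when the median is 0 (which is the case for a drop counter that stays at 0), so the sketch treats identical reps as tight and any spread around a zero median as noisy.

```python
import statistics

TIGHT_RATIO = 0.3  # threshold from the decision rule


def verdict(values):
    """Classify one metric across reps as 'tight' or 'noisy'.

    Quartile convention (inclusive) and the zero-median handling are
    assumptions; the protocol does not specify either.
    """
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if median == 0:
        # Ratio undefined: identical reps count as tight, any spread as noisy.
        label = "tight" if iqr == 0 else "noisy"
    else:
        label = "tight" if iqr / abs(median) <= TIGHT_RATIO else "noisy"
    return {"median": median, "iqr": iqr, "verdict": label}
```

Calling verdict once per metric (drops_post_warmup, frames_total, drops_total, kwin %CPU median) gives the median to use as the anchor and the IQR to use as the tolerance band in the tight case.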
Operator green-light request
Before I fire the 3 reps:
- Confirm you're OK with a chrome window popping up on the screen for ~80 s per rep × 3 reps, and during that time not interacting with it (mouse stays still, no key presses).
- Confirm the current Plasma Wayland session is in a "clean measurement state" — i.e. nothing else is doing significant CPU work (ideally close any active terminals/browsers/IDE windows you don't need; the predecessor's 36-37 % kwin %CPU baseline assumes a quiescent desktop with just the test chrome window plus normal Plasma services running).
- (Optional) Decide whether to also include other variants in this turn's measurements, e.g. add a rep of Brave 147 or Firefox 150 under Plasma Wayland to start populating the full matrix. Default scope: just the 3 chromium-fourier reps; matrix-fill cells go into Phase 1 proper.
When green-lit, I fire run_browser_nodebug.sh x11research_a1_rep1 chromium-fourier-kwin first as a smoke
test, surface its drops_summary.txt + kwin_cpu_summary.txt
output, then on confirmation fire reps 2 and 3 back-to-back.