[ka:host-changed] ampere: display stuck on black screen during reboot (ssh recovers but display does not) #13

Closed
opened 2026-05-16 09:19:10 +00:00 by claude-noether · 1 comment
Collaborator

[ka:host-changed] ampere: console / display stuck on black screen during reboot, does not return to greeter

Symptom

After issuing sudo systemctl reboot on ampere (CoolPi CM5 GenBook, RK3588, running linux-ampere-fourier 7.0rc3.kafr1-1), the display goes black and does NOT progress to the SDDM greeter or auto-login session. Operator-reported during ampere-kernel-decoders iter2 Phase 6 Step 5 verification work, 2026-05-16 ~11:14 CEST.

ssh to the host does eventually come back (uptime confirms the kernel rebooted and userspace came up). So the kernel + initramfs + early-userspace boot path is fine; the failure mode is specific to bringing the display / Wayland / SDDM session back up post-reboot.

Reproducibility

Operator-reported reproducer is "every reboot since SDDM auto-login was configured today (2026-05-16 10:36)." The host is reachable via SSH within ~60 s of issuing the reboot; backend probes confirm 7.0.0-rc3-devices+ running, /run/user/1001/wayland-0 re-appears, and the mfritsche session is active per loginctl list-sessions. So:

  • Kernel boot ✓
  • SSH-side userspace ✓
  • Wayland session creation ✓
  • Display output ✗ — screen stays black
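The checklist above shows the session exists while the panel stays dark. One inexpensive extra probe over ssh is to read the DRM connector attributes the kernel exposes in sysfs. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes the standard `/sys/class/drm/card*-*/{status,enabled,dpms}` attributes are present on this kernel.

```python
from pathlib import Path


def read_attr(path: Path) -> str:
    """Return the stripped content of a sysfs attribute, or 'n/a' if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return "n/a"


def connector_states(drm_root: str = "/sys/class/drm") -> dict[str, dict[str, str]]:
    """Map each connector directory (e.g. 'card0-eDP-1') to its sysfs state."""
    states = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        conn = status_file.parent
        states[conn.name] = {
            "status": read_attr(conn / "status"),    # connected / disconnected / unknown
            "enabled": read_attr(conn / "enabled"),  # enabled / disabled
            "dpms": read_attr(conn / "dpms"),        # On / Off / Standby / Suspend
        }
    return states


def dark_connectors(states: dict[str, dict[str, str]]) -> list[str]:
    """Connectors that report an attached panel but are not being driven."""
    return [
        name
        for name, s in states.items()
        if s["status"] == "connected"
        and (s["enabled"] != "enabled" or s["dpms"] != "On")
    ]


if __name__ == "__main__":
    current = connector_states()
    for name, state in current.items():
        print(name, state)
    print("connected but dark:", dark_connectors(current))
```

If the eDP connector shows up as connected but disabled (or with DPMS off) after a warm reboot, that would point at the kernel/compositor modeset path rather than at session startup. If it reports enabled and On while the screen is still black, backlight or panel power would be the next thing to look at.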

Likely substrate factors

  • Kernel: vanilla torvalds v7.0-rc3 (per kernel-agent #8/#9/#10 close — minimal-baseline branch ampere-minimal-devices @ 7c241f2e2835)
  • Board patches applied: boltzmann:~/src/misc_patches/genbook/kernel/000{1..8}*.patch (6 patches: pwm15, pwm-fan, power-off via RK806, audio-graph-card, USB-C PD, lid-switch + USB3 PHY)
  • Display stack: panfrost + panthor + KDE Plasma Wayland (the arch_mainline extlinux entry)
  • SDDM auto-login: configured 2026-05-16 ~10:36 via /etc/sddm.conf.d/autologin.conf with User=mfritsche + Session=plasma.desktop (the Wayland session)
  • eDP / DisplayPort panel: GenBook integrated panel (specific compatible TBD — operator can confirm from DTS)
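Since the reproducer dates from the moment auto-login was configured, it may be worth confirming what the drop-in actually contains. The sketch below parses an SDDM drop-in and returns its `[Autologin]` keys. It assumes the file uses SDDM's usual `[Autologin]` section with the `User` and `Session` keys named above; the helper name `read_autologin` is hypothetical.

```python
import configparser


def read_autologin(text: str) -> dict[str, str]:
    """Return the key/value pairs of the [Autologin] section, or {} if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # SDDM keys are case-sensitive; keep them as written
    parser.read_string(text)
    if not parser.has_section("Autologin"):
        return {}
    return dict(parser["Autologin"])


# Assumed layout of /etc/sddm.conf.d/autologin.conf, reconstructed from the
# values quoted in this report (not copied from the host).
EXPECTED = """\
[Autologin]
User=mfritsche
Session=plasma.desktop
"""

print(read_autologin(EXPECTED))
```

On the host, passing the output of `Path("/etc/sddm.conf.d/autologin.conf").read_text()` to `read_autologin` and comparing against the expected values would rule out a typo in the drop-in before digging into the kernel side.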

Possible causes worth investigating (operator picks priority)

  1. Display-power-management not surviving suspend-reboot cycle on this kernel: RK3588 GPU power gating + early-boot panel re-init have historically been fragile on mainline v7.0-rc3. May need a panel-back-power or DRM resume hook present in later -rc / -stable.
  2. panthor (RK3588 GPU) initialization race with SDDM greeter compose: if the panthor module loads only after SDDM tries to bring up its EGL surface, the greeter may render to a framebuffer that is not yet attached to the display (black). Diagnostic: boot with the kernel parameter modprobe.blacklist=panthor, then load the module post-login. This is a known-bad approach as a fix, but it localizes the issue.
  3. eDP timing / PHY mis-init after reboot: the GenBook eDP panel has known timing-fix requirements (see u-boot patches in boltzmann:~/src/misc_patches/genbook/u-boot/000{3,4,5}*.patch). The kernel side may need an equivalent. The PHY-reset path on this specific panel may differ between cold boot and warm reboot.
  4. DTS overlay state lost on reboot: if any DTS overlay (e.g. rockchip-rk3588-opp-oc-24ghz.dtbo, referenced but commented out in extlinux.conf; see ampere-fourier iter1 phase0 §extlinux dump) was applied during first boot and not persisted, the reboot brings the system up without that overlay, so display init fails.
  5. Just-installed linux-ampere-fourier 7.0rc3.kafr1-1 may be missing a post-merge stability patch for the panel/display init path: the kernel went from neighbour's hands to ours at PR #8/#9/#10 (2026-05-16 morning); a follow-up [ka:experiment] may be needed for a candidate fix.

Workaround for the campaign right now

ssh-attached workflow continues to function. iter2 Phase 6 testing can proceed entirely over ssh — the display being black during reboot doesn't block the libva backend HEVC validation (which is what iter2 needs). However:

  • C7 (firefox-fourier vendor-default engagement test) requires a working display + active Wayland session that the operator can see. Without display, C7 stays deferred.
  • Operator observing the actual display output for debugging (e.g. looking at kernel log scroll, KDE crash dialogs) is impossible.

Asks

For kernel-agent:

  1. Triage / categorize — is this likely (1) panel-power-management, (2) panthor race, (3) eDP timing, (4) DTS-overlay persistence, or (5) something else entirely?
  2. Suggested debug path — what's the cheapest first instrument? dmesg -wH over ssh during a deliberate reboot, capturing the seconds where the display drops; alternatively, serial-console (ampere has a UART exposed somewhere?) to capture early-boot display init.
  3. Candidate fix path — if this is a known-fixed-upstream issue (-rc4..-rc7 of v7.0, or backported to v7.0-stable), upstream a candidate experiment branch via the kernel-agent flow.
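For the debug path in item 2, the kernel log of a boot can be reduced to display-related lines before reading it by eye. This is a minimal sketch under stated assumptions: the keyword list is a guess at relevant driver names (not a confirmed set for this board), the helper names are hypothetical, and reading the previous boot with `--boot=-1` only works if the journal is persistent on ampere.

```python
import re
import subprocess

# Assumed keywords for the RK3588 display path; adjust after a first look at the log.
DISPLAY_PATTERN = re.compile(
    r"panthor|panfrost|drm|edp|panel|vop|backlight", re.IGNORECASE
)


def display_lines(log_text: str) -> list[str]:
    """Keep only the log lines that mention a display-related keyword."""
    return [line for line in log_text.splitlines() if DISPLAY_PATTERN.search(line)]


def kernel_log(boot: int = 0) -> str:
    """Return the kernel messages of one boot (0 = current, -1 = previous)."""
    result = subprocess.run(
        ["journalctl", "-k", f"--boot={boot}", "--no-pager", "-o", "short-monotonic"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running `display_lines(kernel_log(0))` after a warm reboot and again after a cold boot, then diffing the two lists, would show whether the eDP / panel init sequence differs between the two paths. That comparison would help sort the issue into categories (1) to (4) above.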

Refs

  • ampere baseline kernel: marfrit/kernel-agent#8/#9/#10 (closed 2026-05-16 morning) producing linux-ampere-fourier 7.0rc3.kafr1-1
  • ampere SDDM auto-login setup: today's session (2026-05-16), which wrote /etc/sddm.conf.d/autologin.conf with Session=plasma.desktop (Wayland)
  • ampere extlinux dump + boot history (showing the hand-managed kernel substrate): ~/src/ampere-fourier/phase0_findings.md
  • ampere-kernel-decoders iter2 in progress (the HEVC backend extension): ~/src/ampere-kernel-decoders/ — orthogonal to this display issue but the reboot-cycle requirement is in its critical path.
Owner

Fixed by applying misc_patches and rebuilding.
