Public bug reported:

Summary
-------
On a Lenovo ThinkPad T14s Gen 6 (Snapdragon X Elite / X1E80100) the machine spontaneously hard-resets while in normal use, not during suspend and not during USB-C hotplug. The reset is firmware/SoC-level: there is no kernel panic, no oops, no soft/hard-lockup trace, nothing in pstore/ramoops, and the systemd journal of the killed boot is left corrupted (the kernel never got to log anything). The machine reboots on its own after a hard cut.

This is a different failure from LP #2127013, which is suspend/resume-specific (immediate resume after s2idle). This report is specifically about resets that occur while the machine is awake and in use.

Hardware
--------
- Model: Lenovo ThinkPad T14s Gen 6
- Machine type / product: 21N10001US (MT 21N1)
- SoC: Qualcomm Snapdragon X Elite, X1E80100 (aarch64)
- BIOS: LENOVO N42ET97W (2.27), date 2026-02-24

Software
--------
- Ubuntu 26.04 LTS
- Kernel: 7.0.0-27-generic (#27-Ubuntu SMP PREEMPT_DYNAMIC aarch64)
- Suspend mode: s2idle
- Kernel command line: ro arm64.nopauth clk_ignore_unused pd_ignore_unused cma=128M efi=noruntime quiet splash console=tty0 mem_sleep_default=s2idle crashkernel=2G-4G:320M,...
  (clk_ignore_unused / pd_ignore_unused / efi=noruntime are the documented required X1E params; arm64.nopauth was added as a speculative mitigation and made no observable difference.)

Impact
------
Unpredictable loss of all unsaved work; filesystem orphan-cleanup on every recovery boot. Because the reset is below the OS, kdump never fires and no crash artifact is produced, making it very hard to diagnose.

What happens
------------
The system is running normally (light desktop + containers), then without warning the screen cuts and the machine resets and reboots. It is not correlated with suspend or with plugging/unplugging USB-C.

Failure signature (forensics from one captured instance)
--------------------------------------------------------
- The boot that died ran ~3h10m entirely in-use. It performed ZERO suspend cycles before the reset (so this is not the s2idle path).
- The last kernel-ring message preceded the reset by ~2h18m; there is no kernel activity logged at the moment of the cut.
- No "panic", "oops", "BUG:", soft/hard-lockup, RCU stall, MCE, or thermal-trip message anywhere near the reset.
- /sys/fs/pstore is empty after the reset (kdump-tools active, crashkernel reserved); nothing was captured.
- On the recovery boot: "EXT4-fs (nvmeXn1pY): orphan cleanup on readonly fs" and "system.journal corrupted or uncleanly shut down", i.e. a hard power cut, not a graceful reboot.
- Note: the platform reports "watchdog: NMI not fully supported" / "Hard watchdog permanently disabled", so a CPU hard lockup would not be caught by an NMI watchdog.

Reproducibility
---------------
Intermittent: occurs roughly every few hours of uptime, not on demand. I have installed a small boot-flag service (writes a flag on boot, removes it on clean shutdown) so each reset is unambiguously recorded with a timestamp and whether the prior boot had suspended; I can attach this log over time to characterise frequency.

Related observation (may point at the layer involved)
-----------------------------------------------------
After some of these resets, the next Linux boot comes up with the display(s) black (both internal eDP and external). A full cold power-off + drain does NOT clear it; only booting Windows once and then back into Linux restores the display. This strongly suggests a Qualcomm subsystem / display-PHY / firmware state that the proprietary Windows driver stack tears down but Linux does not, which is consistent with the reset itself being a firmware/SoC-level event rather than a kernel fault.

What I have tried
-----------------
- Upgraded 25.10 (6.17) -> 26.04 (7.0.0-27): the lenovo-thinkpad-t14s EC driver is now loaded; it did not stop the in-use resets.
- Added arm64.nopauth: no observable change.
- Confirmed it is not the suspend path (#2127013) and not USB-C hotplug.

Request
-------
1. Is there any known X1E80100 SoC watchdog / PMIC power-collapse / PDR-SSR path that can trigger a full SoC reset without a kernel trace, and any way to surface it (e.g. enabling a Qualcomm-side log, ramoops backend that survives this reset type, or a debug build)?
2. Guidance on capturing anything at all from a reset that leaves pstore empty would be very welcome.
3. Happy to test debug kernels / patches and to provide the reset-frequency log and any apport data.

ProblemType: Bug
DistroRelease: Ubuntu 26.04
Package: linux-image-7.0.0-27-generic 7.0.0-27.27
ProcVersionSignature: Ubuntu 7.0.0-27.27-generic 7.0.6
Uname: Linux 7.0.0-27-generic aarch64
ApportVersion: 2.34.0-0ubuntu2
Architecture: arm64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC0:  devop      4290 F.... pipewire
                      devop      4573 F.... wireplumber
 /dev/snd/seq:        devop      4290 F.... pipewire
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Sat Jun 27 16:35:57 2026
InstallationDate: Installed on 2026-01-07 (171 days ago)
InstallationMedia: Ubuntu 25.10 "Questing Quokka" - Release arm64 (20251007)
Lspci-vt:
 -[0004:00]---00.0-[01-ff]----00.0  Qualcomm Technologies, Inc WCN785x Wi-Fi 7(802.11be) 320MHz 2x2 [FastConnect 7800]
 -[0005:00]---00.0-[01-ff]--
 -[0006:00]---00.0-[01-ff]----00.0  Sandisk Corp WD PC SN740 NVMe SSD 512GB (DRAM-less)
MachineType: LENOVO 21N10001US
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/usr/bin/bash
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
ProcFB: 0 msmdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-27-generic root=UUID=17fb0394-4f34-4e43-bb6a-ae0867401e98 ro arm64.nopauth clk_ignore_unused pd_ignore_unused cma=128M efi=noruntime quiet splash console=tty0 mem_sleep_default=s2idle crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
SourcePackage: linux
UpgradeStatus: Upgraded to resolute on 2026-06-27 (0 days ago)
acpidump:
dmi.bios.date: 02/24/2026
dmi.bios.release: 2.27
dmi.bios.vendor: LENOVO
dmi.bios.version: N42ET97W (2.27 )
dmi.board.asset.tag: Not Available
dmi.board.name: 21N10001US
dmi.board.vendor: LENOVO
dmi.board.version: SDK0T76576 WIN ptal8
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.32
dmi.modalias: dmi:bvnLENOVO:bvrN42ET97W(2.27):bd02/24/2026:br2.27:efr1.32:svnLENOVO:pn21N10001US:pvrThinkPadT14sGen6:rvnLENOVO:rn21N10001US:rvrSDK0T76576WINptal8:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_21N1_BU_Think_FM_ThinkPadT14sGen6:pfaThinkPadT14sGen6:
dmi.product.family: ThinkPad T14s Gen 6
dmi.product.name: 21N10001US
dmi.product.sku: LENOVO_MT_21N1_BU_Think_FM_ThinkPad T14s Gen 6
dmi.product.version: ThinkPad T14s Gen 6
dmi.sys.vendor: LENOVO

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: apport-bug arm64 qualcomm resolute snapdragon thinkpad wayland-session x1e80100

** Attachment added: "launchpad-evidence.txt"
   https://bugs.launchpad.net/bugs/2158523/+attachment/5979220/+files/launchpad-evidence.txt

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2158523

Title:
  [Snapdragon X Elite / X1E80100] ThinkPad T14s Gen 6 (21N1): spontaneous in-use hard reset - no kernel panic, no pstore dump, corrupted journal (distinct from suspend bug #2127013)

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158523/+subscriptions
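
The boot-flag mechanism mentioned under Reproducibility (write a flag on boot, remove it on clean shutdown, note whether the boot had suspended) could look roughly like the sketch below. This is not the reporter's actual service, which is not included in the report. The file names, the JSON layout, and the function names `on_boot`, `on_suspend` and `on_clean_shutdown` are all assumptions made for illustration, and the wiring to boot, suspend and shutdown hooks (for example systemd units) is left out.

```python
import json
import os
import time
from pathlib import Path

# Hypothetical file names; the report does not say what the real service uses.
FLAG_NAME = "boot.flag"
LOG_NAME = "unclean-resets.log"


def _write_flag(flag, data):
    """Write the flag and fsync it so it survives a later hard power cut."""
    with flag.open("w") as fh:
        fh.write(json.dumps(data))
        fh.flush()
        os.fsync(fh.fileno())


def on_boot(state_dir, now=None):
    """Run once early at boot. Returns True if the previous boot ended uncleanly."""
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    flag = state_dir / FLAG_NAME
    now = time.time() if now is None else now
    unclean = flag.exists()
    if unclean:
        # The flag survived, so the previous boot never reached clean shutdown.
        previous = json.loads(flag.read_text())
        record = {
            "detected_at": now,
            "previous_boot_started": previous["boot_started"],
            "previous_boot_suspended": previous["suspend_count"] > 0,
        }
        with (state_dir / LOG_NAME).open("a") as log:
            log.write(json.dumps(record) + "\n")
    # Start tracking the current boot.
    _write_flag(flag, {"boot_started": now, "suspend_count": 0})
    return unclean


def on_suspend(state_dir):
    """Run before each suspend so a later reset can be attributed correctly."""
    flag = Path(state_dir) / FLAG_NAME
    data = json.loads(flag.read_text())
    data["suspend_count"] += 1
    _write_flag(flag, data)


def on_clean_shutdown(state_dir):
    """Run late in an orderly shutdown or reboot; removes the flag."""
    (Path(state_dir) / FLAG_NAME).unlink(missing_ok=True)
```

Each line appended to the log is one reset, with a timestamp and whether the killed boot had suspended at least once, which is the information needed to separate in-use resets from the s2idle path of #2127013. A flag file truncated by the reset itself is not handled in this sketch.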