Hi jmorete,

Thanks for taking the time to submit a bug report and helping to improve Ubuntu!

Please try the latest mainline build and see if you are able to reproduce the same issue while running that kernel: https://kernel.ubuntu.com/mainline/v7.1.2/.

For instructions on how to use mainline builds, refer to this wiki page: https://wiki.ubuntu.com/Kernel/MainlineBuilds.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2157924

Title:
  iscsi_tcp 40-50% sequential read performance regression (5.15 vs 6.8/7.0) due to release_sock serialization bottleneck

Status in linux package in Ubuntu:
  Incomplete

Bug description:

1) === System Metadata ===
OS Release:
Description: Ubuntu 26.04 LTS
Release: 26.04
Kernel Version: 7.0.0-22-generic
CPU Model: Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz
Loaded iscsi modules: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
Network controller:
5e:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
5e:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

2) # apt-cache policy linux-image-generic
linux-image-generic:
  Installed: 7.0.0-22.22

3) Expected same or better performance over iSCSI LUN volume mounts.

4) Performance is 40% to 50% worse than the baseline test on Ubuntu 22.04.5 LTS with kernel 5.15.

# Kernel Bug Report: iscsi_tcp Sequential Throughput Regression (5.15 → 6.8 / 7.0)

## Summary

A significant sequential I/O throughput regression has been identified in the `iscsi_tcp` kernel module when comparing Ubuntu kernel 5.15 (Ubuntu 22.04) against kernels 6.8 (Ubuntu 24.04) and 7.0 (Ubuntu 26.04). Sequential read throughput drops by approximately 40-50% on the newer kernels under identical hardware, network, and storage backend conditions. All externally-tunable parameters have been exhaustively tested and eliminated as the cause.

## Affected Versions

| Distribution | Kernel Version | Status |
|-------------|---------------|--------|
| Ubuntu 22.04 LTS | 5.15.0-143-generic | Working (baseline) |
| Ubuntu 24.04 LTS | 6.8.0-88-generic | **Regressed** |
| Ubuntu 26.04 LTS | 7.0.0-22-generic | **Regressed** |

## Hardware Configuration (identical across all hosts)

- **CPU**: 96 cores (dual-socket Intel/AMD server)
- **Memory**: 750 GiB
- **NIC**: Mellanox ConnectX (25 GbE, dual-port, LACP bond, mlx5 driver)
- **Storage Backend**: NetApp ONTAP SAN (iSCSI, ontap-san-economy driver via Trident CSI 25.06)
- **Network**: Jumbo frames (MTU 9000), VLAN-tagged storage network
- **Multipath**: dm-multipath with ALUA, service-time path selector, 2 paths per LUN

## Test Environment

- **Volume**: 100 GiB iSCSI LUN provisioned via Trident CSI (PVC with `Filesystem` volumeMode)
- **Test Pod**: Kubernetes pod with fio 3.6, volume mounted at `/mnt/data`
- **fio Parameters**: `--direct=1 --ioengine=libaio --iodepth=64 --numjobs=1 --runtime=30 --time_based --group_reporting --directory=/mnt/data`
- **Block Sizes Tested**: 128K, 256K, 1M

## Results

### Sequential Read Throughput (256K block size)

| Kernel | Bandwidth | IOPS | Avg Latency |
|--------|-----------|------|-------------|
| 5.15.0-143 | **1542 MiB/s** | 6168 | 10.2 ms |
| 6.8.0-88 | **740 MiB/s** | 2958 | 21.3 ms |
| 7.0.0-22 | **1034 MiB/s** | 4136 | 15.4 ms |

### Sequential Read Throughput (1M block size)

| Kernel | Bandwidth | IOPS | Avg Latency |
|--------|-----------|------|-------------|
| 5.15.0-143 | **1444 MiB/s** | 1444 | 43.6 ms |
| 6.8.0-88 | **815 MiB/s** | 814 | 77.3 ms |
| 7.0.0-22 | **851 MiB/s** | 851 | 73.9 ms |

### Regression Magnitude

- **Kernel 6.8 vs 5.15**: 44-52% throughput reduction (sequential reads)
- **Kernel 7.0 vs 5.15**: 33-41% throughput reduction (sequential reads)

## iSCSI / SCSI Parameters (verified identical across all kernels)
| Parameter | Value |
|-----------|-------|
| `can_queue` (scsi_host) | 113 |
| `cmd_per_lun` (scsi_host) | 32 |
| `sg_tablesize` (scsi_host) | 4096 |
| `queue_depth` (per LUN) | 32 |
| `max_hw_sectors_kb` | 32767 |
| iSCSI `MaxRecvDataSegmentLength` | 262144 |
| iSCSI `FirstBurstLength` | 65536 (negotiated by target) |
| iSCSI `MaxBurstLength` | 1048576 |
| iSCSI `MaxOutstandingR2T` | 1 |
| iSCSI `ImmediateData` | Yes |
| iSCSI `InitialR2T` | Yes (negotiated by target) |
| TCP congestion control | BBR |
| MTU | 9000 |
| TCP rmem_max / wmem_max | 134217728 |

## Eliminated Causes

The following parameters/settings were systematically tuned on the 7.0 kernel with no measurable impact on throughput:

| Tuning Attempted | Result |
|-----------------|--------|
| Disable WBT (`wbt_lat_usec=0`) | No change |
| Increase read-ahead (`read_ahead_kb=16384`) | No change (expected: direct IO) |
| IO scheduler: `none` (passthrough) | No change |
| IO scheduler: `mq-deadline` (match 5.15 default) | No change |
| Reduce `max_sectors_kb` to 64 (match 5.15 value) | No change |
| Increase `nr_requests` to 512 | No change |
| Enable `recv_from_iscsi_q=Y` (kernel 7.0 parameter) | No change |
| Increase `netdev_budget` (1200→4800) and `netdev_budget_usecs` (8000→32000) | No change |
| Renice `iscsid` process to -20 | No change |
| Enable RPS on storage VLAN interface (`rps_cpus=ffffffff`) | No change |
| Enable RFS (`rps_sock_flow_entries=32768`) | No change |
| Enable `quickack` on iSCSI storage routes | No change |
| Set `tcp_low_latency=1` | No change |
| Increase `gro_max_size` / `gso_max_size` | Failed (not supported on VLAN interface) |
| Multiple fio jobs (numjobs=2) | No change / slightly worse |

## Analysis

### Per-IO Latency Comparison

With `queue_depth=32` and `iodepth=64` (saturating the device queue), throughput is governed by:

```
throughput = queue_depth / avg_latency_per_IO
```

For 256K sequential reads:

- **Kernel 5.15**: 32 / 0.0052s = ~6150 IOPS → 1538 MiB/s (matches observed)
- **Kernel 6.8**: 32 / 0.0108s = ~2962 IOPS → 741 MiB/s (matches observed)
- **Kernel 7.0**: 32 / 0.0077s = ~4156 IOPS → 1039 MiB/s (matches observed)

The per-IO latency for the same 256K read operation is:

- **5.15**: ~5.2 ms average
- **6.8**: ~10.8 ms average (2.1x higher)
- **7.0**: ~7.7 ms average (1.5x higher)

### TCP Connection Health (not the bottleneck)

TCP socket statistics captured during testing confirm the network path is not limiting:

- All connections show healthy `cwnd`, full-speed `delivery_rate`, and sub-0.2ms RTT
- The 25 GbE NIC is operating well below capacity (~6-12 Gbps observed vs 25 Gbps available)
- No retransmissions or congestion events during testing

### CPU Utilization (not the bottleneck)

- Softirq CPU usage remains low during testing
- RPS/multi-queue distribution does not improve throughput
- The `iscsi_tcp` workqueue threads run at nice -20 (highest priority)
- Switching between softirq and workqueue processing (`recv_from_iscsi_q`) has no effect

### Conclusion

The regression is internal to the `iscsi_tcp` / `libiscsi` kernel module data path. The per-SCSI-command processing latency is 50-110% higher on kernels 6.8 and 7.0 compared to 5.15, for identical iSCSI PDU sizes, network conditions, and queue depths. This suggests changes in the iSCSI receive/transmit path, SCSI mid-layer command completion, or interaction with the block layer's multi-queue infrastructure introduced between 5.15 and 6.8 are adding overhead per I/O operation.
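
The queue-depth arithmetic in the Analysis section can be re-checked with a few lines of Python. This is only a sketch that replays the figures quoted in the report (queue depth 32, 256 KiB blocks, the per-IO latencies and observed bandwidths); the function and variable names are illustrative and are not part of any tool used in the report.

```python
MIB = 1024 * 1024


def littles_law(queue_depth, latency_s, block_bytes):
    """Return (IOPS, MiB/s) for a saturated queue: IOPS = depth / latency."""
    iops = queue_depth / latency_s
    return iops, iops * block_bytes / MIB


def reduction_pct(baseline, observed):
    """Throughput reduction relative to the baseline, in percent."""
    return (1 - observed / baseline) * 100


# Figures copied from the report (256K sequential reads).
per_io_latency_s = {"5.15": 0.0052, "6.8": 0.0108, "7.0": 0.0077}
observed_mib_s = {"5.15": 1542, "6.8": 740, "7.0": 1034}

results = {
    kernel: littles_law(32, latency, 256 * 1024)
    for kernel, latency in per_io_latency_s.items()
}

for kernel, (iops, mib_s) in results.items():
    drop = reduction_pct(observed_mib_s["5.15"], observed_mib_s[kernel])
    print(f"{kernel}: {iops:.0f} IOPS, {mib_s:.0f} MiB/s predicted, "
          f"{observed_mib_s[kernel]} MiB/s observed, {drop:.0f}% below 5.15")
```

The predicted bandwidths land within a few MiB/s of the observed values, which is the consistency the report relies on when it attributes the throughput gap to per-IO latency.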
## Steps to Reproduce

### Prerequisites

- Two bare-metal hosts: one running kernel 5.15 (Ubuntu 22.04), one running kernel 6.8+ (Ubuntu 24.04 or 26.04)
- iSCSI target (e.g., NetApp ONTAP, LIO, or targetcli) accessible via 10GbE+ network with jumbo frames
- `open-iscsi` package installed with default configuration
- `dm-multipath` configured (or single-path is sufficient to reproduce)
- Kubernetes with Trident CSI is NOT required; direct iSCSI LUN attachment reproduces the issue

### Reproduction Steps

1. **Provision a 100 GiB iSCSI LUN** on the target and present it to both hosts.

2. **Discover and login** on both hosts:

```bash
iscsiadm -m discovery -t sendtargets -p <target_ip>:3260
iscsiadm -m node --login
```

3. **Identify the device**:

```bash
# For multipath:
multipath -ll  # Note the dm-X device

# For single path:
lsblk --scsi
```

4. **Create a filesystem and mount**:

```bash
mkfs.ext4 /dev/dm-0  # or /dev/sdX for single path
mkdir /mnt/iscsi-test
mount /dev/dm-0 /mnt/iscsi-test
```

5. **Run fio benchmark** (identical on both hosts):

```bash
# 256K Sequential Read
fio --name=seq-read-256k \
    --rw=read \
    --bs=256k \
    --size=1G \
    --numjobs=1 \
    --iodepth=64 \
    --direct=1 \
    --ioengine=libaio \
    --runtime=30 \
    --time_based \
    --group_reporting \
    --directory=/mnt/iscsi-test

# 1M Sequential Read
fio --name=seq-read-1M \
    --rw=read \
    --bs=1M \
    --size=1G \
    --numjobs=1 \
    --iodepth=64 \
    --direct=1 \
    --ioengine=libaio \
    --runtime=30 \
    --time_based \
    --group_reporting \
    --directory=/mnt/iscsi-test
```

6. **Compare results**: The host running kernel 6.8+ will show 40-50% lower sequential read bandwidth compared to the host running kernel 5.15.

### Verification

Confirm the test is hitting the iSCSI device (not page cache or overlay):

```bash
# During the test, verify disk utilization:
iostat -x 1 | grep dm-0
# Should show ~99% util

# Verify the mount is on iSCSI:
lsblk -o NAME,TYPE,TRAN,SIZE,MOUNTPOINT | grep -A5 dm-0
```

## Additional Diagnostics

### 1. Transparent Huge Pages (THP)

**Default state**: `enabled=madvise`, `defrag=madvise`

| THP Setting | 256K Seq Read (kernel 7.0) | Change |
|-------------|---------------------------|--------|
| madvise (default) | 1034 MiB/s | baseline |
| never | 882 MiB/s | -15% (worse) |

**Conclusion**: THP is not contributing to the regression. Disabling it actually slightly reduces throughput, likely due to losing THP benefits for fio's memory allocations.

### 2. NUMA Locality Impact

**Hardware topology**:

- 96 cores: node 0 = even CPUs (0,2,4,...,94), node 1 = odd CPUs (1,3,5,...,95)
- NIC `ens3f0np0` (active iSCSI bond slave): **NUMA node 0**
- NIC `ens5f1np1` (secondary bond slave): **NUMA node 1**

**NUMA-pinned fio results** (256K sequential read, direct on `/dev/dm-0`):

| CPU Pinning | Throughput | Avg Latency | Delta |
|-------------|-----------|-------------|-------|
| Node 0 (NIC-local) | **1148 MiB/s** | 13.9 ms | baseline |
| Node 1 (cross-socket) | **943 MiB/s** | 16.9 ms | -18% |

**Finding**: The iSCSI MSI-X interrupt (IRQ 338, `mlx5_comp62`) is pinned to **CPU29 (NUMA node 1)** despite the NIC residing on **NUMA node 0**. The server has two 25 GbE Mellanox NICs, one per NUMA node, bonded via 802.3ad LACP (`xmit_hash_policy=layer2`). The storage VLAN runs on top of this bond. Due to LACP hashing, iSCSI traffic flows through the node-0 NIC but the receive interrupt is processed on a node-1 CPU, adding ~3ms per-IO latency from cross-socket memory access.

However, even the optimally-pinned node-0 result (1148 MiB/s) is still **25% below kernel 5.15** (1542 MiB/s), confirming the regression is not solely NUMA-related.

### 3. perf Profiling (Kernel CPU Time Breakdown)

Captured via `perf record -a -g -- sleep 10` during 256K sequential reads at iodepth=64:

```
Top-level call stack (kernel 7.0.0-22-generic):

47.97% ret_from_fork_asm
└─ 47.96% kthread
   └─ 46.05% worker_thread
      └─ 44.97% process_one_work
         ├─ 42.77% iscsi_xmitworker                          ← TX path
         │  └─ 42.74% iscsi_data_xmit
         │     └─ 42.24% iscsi_xmit_task
         │        └─ 41.79% iscsi_tcp_task_xmit
         │           └─ 41.74% iscsi_sw_tcp_pdu_xmit
         │              └─ 41.62% iscsi_sw_tcp_xmit_segment
         │                 └─ 41.34% sock_sendmsg
         │                    └─ 41.08% tcp_sendmsg
         │                       ├─ 33.73% release_sock      ← CRITICAL
         │                       │  └─ 33.57% __release_sock
         │                       │     └─ 33.42% tcp_v4_do_rcv
         │                       │        └─ 33.29% tcp_rcv_established
         │                       │           └─ 32.24% tcp_data_queue
         │                       │              └─ 31.84% tcp_data_ready
         │                       │                 └─ 31.78% iscsi_sw_tcp_data_ready
         │                       │                    ├─ 28.03% tcp_read_sock
         │                       │                    │  └─ 23.45% iscsi_sw_tcp_recv
         │                       │                    │     └─ 23.15% iscsi_tcp_recv_skb
         │                       │                    │        └─ 2.35% iscsi_tcp_segment_recv
         │                       │                    └─ 4.32% native_queued_spin_lock_slowpath  ← LOCK CONTENTION
         │                       └─ (tcp_sendmsg_locked, etc.)
         └─ 1.78% blk_mq_run_work_fn
```

**Critical Findings**:

1. **TX/RX path serialization via `release_sock`**: 33.73% of total CPU time is spent inside `release_sock()` during the transmit path. When the xmit worker sends a SCSI command via `tcp_sendmsg`, the socket lock release triggers processing of all queued incoming data in the **same thread context**; this includes `iscsi_sw_tcp_data_ready` → `iscsi_tcp_recv_skb` (23.15% of CPU).

2. **Spinlock contention**: `native_queued_spin_lock_slowpath` accounts for **4.32%** of total CPU time, indicating measurable contention on the socket lock between the transmit workqueue and network softirq receive path.

3. **Single-threaded bottleneck**: The entire iSCSI IO path (TX command + RX data) is serialized through a single `iscsi_xmitworker` workqueue thread. Data receive happens as a side-effect of the transmit path's lock release, not in parallel.

### 4. Interrupt Footprint

During a 5-second sample of active 256K sequential reads:

| IRQ | Queue | CPU | Delta (5s) | Rate |
|-----|-------|-----|-----------|------|
| 338 | `mlx5_comp62@0000:5e:00.0` | CPU29 | +15,979 | 3,196/s |
| 325 | `mlx5_comp49@0000:5e:00.0` | CPU3 | +8 | ~0/s |

**Finding**: All iSCSI receive traffic is concentrated on a single MSI-X vector (IRQ 338 → CPU29). This is expected for a single TCP flow (RSS hashes to one queue), but confirms that the entire iSCSI data path is single-CPU-bound. The interrupt is NOT bottlenecking on CPU0; it is correctly distributed via MSI-X, but still limited to one core.

### 5. Lock Contention (lockstat)

`CONFIG_LOCK_STAT` is **not enabled** in the Ubuntu 7.0.0-22-generic kernel. This data point is not available without a custom kernel build with `CONFIG_LOCK_STAT=y`.

### 6. blktrace / iostat Latency Decomposition

Captured via `blktrace` and `iostat -xmt` on dm-0 and underlying sdb during active 256K sequential reads:

**Raw blktrace** (10s capture on `/dev/dm-0`): Completions arrive at 30-40 µs intervals on the SCSI device, confirming fast backend response.

**iostat latency breakdown** (steady-state averages over 10 samples):

| Device | r_await (ms) | aqu-sz | Throughput |
|--------|-------------|--------|-----------|
| dm-0 (multipath) | **13.3 ms** | 243 | ~1.15 GB/s |
| sdb (SCSI/iSCSI) | **1.7 ms** | 31 | ~1.15 GB/s |

**Latency decomposition**:

| Component | Time | % of Total |
|-----------|------|-----------|
| DM/block queue wait (Q2D) | **11.6 ms** | 87% |
| SCSI device service time (D2C) | **1.7 ms** | 13% |
| **Total r_await** | **13.3 ms** | 100% |

**Interpretation**: The actual iSCSI network + storage backend time is only 1.7 ms per 64KB read. The remaining 87% of per-IO latency is spent **waiting in the device-mapper queue** because the SCSI device queue depth is limited to 32 (`cmd_per_lun=32`). With `iodepth=64` from fio, the DM layer queues 243 outstanding IOs but can only dispatch 32 at a time to the underlying SCSI device.

On kernel 5.15, the same `cmd_per_lun=32` limit exists but achieves 1542 MiB/s at 256K, implying either:

- The DM/blk-mq queue management has higher overhead in 6.8+ (longer Q2D time per IO)
- The SCSI device service time was lower on 5.15 (different TCP/socket handling in `release_sock`)
- Or both, as the perf profile showing 33% of time in `release_sock` correlates with the serialized TX/RX pattern adding per-IO overhead

**Comparative iostat from kernel 5.15 (U22) under identical test**:

| Metric | U22 (kernel 5.15) | U26 (kernel 7.0) | Ratio |
|--------|-------------------|-------------------|-------|
| IOPS (256K) | **~49,000** | ~18,200 | U22 2.7× more |
| Throughput | **~3,050 MiB/s** | ~1,150 MiB/s | U22 2.7× faster |
| r_await (dm) | **5.5 ms** | 13.3 ms | U26 2.4× slower per IO |
| aqu-sz (dm) | 268-286 | 243 | Similar queue depth |

The per-IO latency through the DM layer is **2.4× higher on kernel 7.0** (13.3 ms vs 5.5 ms), directly explaining the throughput difference. Both kernels saturate at 100% device utilization with similar application queue depths (~260-280), confirming the bottleneck is per-IO processing efficiency in the kernel iSCSI/SCSI stack, not device or network capacity.

---

## Suggested Investigation Areas

1. **`release_sock()` in `tcp_sendmsg` processing RX data inline (kernel 7.0)**: The perf profile shows that 33.73% of time is spent in `release_sock` → `tcp_v4_do_rcv` → `iscsi_sw_tcp_data_ready` during the **transmit** path. This serializes TX and RX. Investigate whether kernel 5.15 handled the receive callback differently (e.g., via softirq/tasklet rather than inline in `release_sock`).

2. **Socket lock contention in `iscsi_sw_tcp_data_ready`**: The 4.32% `native_queued_spin_lock_slowpath` indicates the socket lock is contended. In kernel 5.15, the receive path may have used `sk->sk_data_ready` differently or with less contention.

3. **`iscsi_xmitworker` single-threaded design**: All SCSI command dispatch and completion happens through one workqueue worker. If kernel 6.8+ changed workqueue scheduling (e.g., unbound → bound, or different WQ flags), this would add latency.

4. **SCSI mid-layer `blk-mq` tag allocation on NUMA**: On 96-core dual-socket systems, cross-NUMA blk-mq tag allocation adds measurable latency. The 18% NUMA penalty observed may be amplified by changes in how the SCSI mid-layer allocates tags in 6.8+.

5. **TCP small-queue (TSQ) or pacing changes**: `tcp_sendmsg` in newer kernels may hold the socket lock longer due to TSQ or pacing changes, increasing the window during which `release_sock` processes RX data inline.

## Environment Details

```
# Modules involved:
iscsi_tcp 24576
libiscsi_tcp 32768
libiscsi 81920
scsi_transport_iscsi 176128

# Module parameters (kernel 7.0):
iscsi_tcp: max_lun, recv_from_iscsi_q, debug_iscsi_tcp
```

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2157924/+subscriptions
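
As a cross-check on the iostat latency decomposition in section 6 of the report above, the following Python sketch recomputes the queue-wait share and the request rate implied by the reported `aqu-sz` and `r_await` values. It assumes, as the report's interpretation does, that each request seen at the block layer is 64 KiB; the helper names are illustrative only.

```python
def decompose(r_await_dm_ms, r_await_dev_ms):
    """Split dm-level r_await into queue wait and device service time.

    Returns (queue_wait_ms, queue_wait_share_pct).
    """
    queue_wait_ms = r_await_dm_ms - r_await_dev_ms
    return queue_wait_ms, 100 * queue_wait_ms / r_await_dm_ms


def iops_from_queue(aqu_sz, r_await_ms):
    """Little's law: request rate = average queue size / average wait."""
    return aqu_sz / (r_await_ms / 1000)


# Kernel 7.0 figures from the iostat tables in the report.
q2d_ms, q2d_pct = decompose(13.3, 1.7)
sdb_iops = iops_from_queue(31, 1.7)
dm_iops = iops_from_queue(243, 13.3)

# Assumption taken from the report's interpretation: 64 KiB per request.
sdb_mib_s = sdb_iops * 64 / 1024

print(f"Q2D wait: {q2d_ms:.1f} ms ({q2d_pct:.0f}% of r_await)")
print(f"sdb: {sdb_iops:.0f} IOPS, about {sdb_mib_s:.0f} MiB/s")
print(f"dm-0: {dm_iops:.0f} IOPS")
```

Both devices come out at roughly 18,200 requests per second, in line with the IOPS figure in the comparative table, so the reported queue sizes, latencies, and throughput are mutually consistent.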
[Bug 2158263] [NEW] i915 GPU HANG on Sandy Bridge
You have been subscribed to a public bug:

Not running X, but apport doesn't have other Display bug options.

Since upgrading to Ubuntu 26.04, the display (running Wayland) seems to freeze at random times and in random applications. This requires a hard reboot. It happens several times a day. I cannot figure out what might be the cause. I haven't found other similar reports that are recent.

$ lsb_release -rd
Description: Ubuntu 26.04 LTS
Release: 26.04

ProblemType: Bug
DistroRelease: Ubuntu 26.04
Package: xorg 1:7.7+26ubuntu1
ProcVersionSignature: Ubuntu 7.0.0-22.22-generic 7.0.0
Uname: Linux 7.0.0-22-generic x86_64
ApportVersion: 2.34.0-0ubuntu2
Architecture: amd64
BootLog: Error: [Errno 13] Permission denied: '/var/log/boot.log'
CasperMD5CheckResult: pass
CompositorRunning: None
CurrentDesktop: ubuntu:GNOME
Date: Thu Jun 25 07:01:15 2026
DistUpgraded: Fresh install
DistroCodename: resolute
DistroVariant: ubuntu
DkmsStatus: virtualbox/7.2.6, 7.0.0-22-generic, x86_64: installed
ExtraDebuggingInterest: Yes
GpuHangFrequency: Several times a day
GpuHangReproducibility: Seems to happen randomly
GpuHangStarted: Immediately after installing this version of Ubuntu
GraphicsCard:
 Intel Corporation Xeon E3-1200 Processor Family Integrated Graphics Controller [8086:010a] (rev 09) (prog-if 00 [VGA controller])
   Subsystem: Hewlett-Packard Company Device [103c:1588]
InstallationDate: Installed on 2026-06-22 (3 days ago)
InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1)
MachineType: Hewlett-Packard HP Z210 Workstation
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-22-generic root=UUID=7a92cc69-74e3-4fa5-b8eb-06e09219410c ro quiet splash resume=UUID=40d8bc57-9b57-4a75-ba47-897fe70055dc crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
SourcePackage: xorg
Symptom: display
Title: Xorg freeze
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/13/2018
dmi.bios.release: 1.55
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: J51 v01.55
dmi.board.asset.tag: 2UA1520PVQ
dmi.board.name: 1588h
dmi.board.vendor: Hewlett-Packard
dmi.chassis.asset.tag: 2UA1520PVQ
dmi.chassis.type: 6
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvrJ51v01.55:bd06/13/2018:br1.55:svnHewlett-Packard:pnHPZ210Workstation:pvr:rvnHewlett-Packard:rn1588h:rvr:cvnHewlett-Packard:ct6:cvr:skuSN780UC#ABA:pfa103C_53335XG=D:
dmi.product.family: 103C_53335X G=D
dmi.product.name: HP Z210 Workstation
dmi.product.sku: SN780UC#ABA
dmi.sys.vendor: Hewlett-Packard
version.compiz: compiz N/A
version.libdrm2: libdrm2 2.4.131-1
version.libgl1-mesa-dri: libgl1-mesa-dri 26.0.3-1ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx N/A
version.xserver-xorg-core: xserver-xorg-core 2:21.1.22-1ubuntu1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:22.0.0-1build2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20210115-1build2
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.18-1build1

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug freeze resolute ubuntu wayland-session

--
i915 GPU HANG on Sandy Bridge
https://bugs.launchpad.net/bugs/2158263
You received this bug notification because you are subscribed to linux in Ubuntu.
[Bug 2158539] Re: 7.0.0-27-generic hard freezes with NVIDIA 595 open module; __warn_thunk warning in nvidia_init_module
Update after additional testing and off-host netconsole capture.

The original report was filed while the freezes were silent in the local journal, and the best visible clue was the NVIDIA `__warn_thunk` warning during `nvidia_init_module`. I have since reproduced the hard-freeze symptom on both kernel versions and captured the final kernel messages from two later freezes using netconsole to a second host.

New findings:

- The system also hard-froze on `7.0.0-22-generic`, so `7.0.0-22` is not a clean workaround.
- The NVIDIA warning from the original report remains unexplained: `Unpatched return thunk in use. This should not happen!`
- However, the final messages captured immediately before two later freezes are xHCI resume failures on AMD USB controllers, not NVIDIA NVRM/Xid messages.

Crash 1:

- Failed boot: 2026-06-28 15:12:35 to 2026-06-28 16:37:45
- Kernel: `7.0.0-22-generic`
- Local journal stopped at 16:37:45.
- Netconsole captured these final lines at 16:37:46:

```text
xhci_hcd 0000:12:00.3: Controller not ready at resume -19
xhci_hcd 0000:12:00.3: PCI post-resume error -19!
xhci_hcd 0000:12:00.3: HC died; cleaning up
```

Crash 2:

- Failed boot: 2026-06-28 16:41:15 to 2026-06-28 18:51:33
- Kernel: 7.0.0-27-generic
- Local journal stopped at 18:51:33.
- Netconsole captured these final lines at 18:52:54:

    2026-06-28T18:52:54-0400 xhci_hcd 0000:12:00.4: Controller not ready at resume -19
    2026-06-28T18:52:54-0400 xhci_hcd 0000:12:00.4: PCI post-resume error -19!
    2026-06-28T18:52:54-0400 xhci_hcd 0000:12:00.4: HC died; cleaning up

Relevant PCI devices:

    12:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI [1022:15b6]
    12:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI [1022:15b7]
    0e:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset USB 3.2 Controller [1022:43f7] (rev 01)

Current interpretation:

- The strongest final-event evidence now points to AMD CPU-side xHCI runtime-resume failure, possibly involving runtime PM, firmware/AGESA, or platform PCIe power management.
- I am not claiming the NVIDIA __warn_thunk warning is unrelated; it remains unresolved. But netconsole did not capture NVIDIA NVRM/Xid output at the freeze points it caught.
- Because the same symptom reproduced on both 7.0.0-22-generic and 7.0.0-27-generic, this may not be specific to 7.0.0-27 despite the original title.

Mitigation currently under test:

I disabled runtime PM for all PCI xhci_hcd controllers for the current boot by setting each controller's power/control to on.

Observed state after applying that mitigation:

    0000:0e:00.0 control=on runtime=active
    0000:10:00.0 control=on runtime=active
    0000:12:00.3 control=on runtime=active
    0000:12:00.4 control=on runtime=active
    0000:13:00.0 control=on runtime=active

If the system remains stable with xHCI runtime PM disabled, that will further support the xHCI runtime-resume hypothesis. If it freezes again, netconsole is still running and should capture whether the final failure remains xHCI-related or moves to another subsystem.

I would welcome any other theories, or logging or testing recommendations. This seems hard to reproduce, but it is leading to frequent issues.

** Summary changed:

- 7.0.0-27-generic hard freezes with NVIDIA 595 open module; __warn_thunk warning in nvidia_init_module
+ Hard freezes on 7.0.0-22 and -27 with X670E/Ryzen; AMD xHCI controller not ready at resume (-19)

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2158539

Title:
  Hard freezes on 7.0.0-22 and -27 with X670E/Ryzen; AMD xHCI controller not ready at resume (-19)

Status in linux package in Ubuntu:
  New

Bug description:

After unattended-upgrade installed linux-image-7.0.0-27-generic on 2026-06-27, this system began hard-freezing with no clean shutdown and no panic in the journal. The prior kernel, 7.0.0-22-generic, is currently being tested as a workaround.

Hardware:

- ASUS ROG STRIX X670E-A GAMING WIFI, BIOS 3603 03/09/2026
- AMD Ryzen platform
- Dual NVIDIA GeForce RTX 4090
- GNOME Wayland session

Kernel/driver:

- Bad kernel: 7.0.0-27-generic
- Previously stable kernel: 7.0.0-22-generic
- NVIDIA driver: 595.71.05, nvidia-driver-595-open
- Kernel cmdline: pcie_aspm=off quiet splash

Timeline:

- 2026-06-27 06:14-06:15: unattended-upgrade installed 7.0.0-27
- 2026-06-27 08:06:35: first post-update boot froze; journal stopped abruptly
- 2026-06-27 16:13:40: second boot froze; journal stopped abruptly
- No systemd shutdown records for the failed boots

Relevant warning on affected boot:

    Unpatched return thunk in use. This should not happen!
    WARNING: arch/x86/kernel/cpu/bugs.c:3736 at __warn_thunk

Call trace includes:

    warn_thunk_thunk
    nvidia_init_module+0x29/0x740 [nvidia]

Negative evidence:

- No OOM, kernel panic, MCE, NVMe I/O error, SMART media error, thermal trip, or watchdog trace found.
- NVMe SMART passed: media errors 0, error log entries 0.
- Temperatures after reboot were not alarming.

A later, separate GNOME Shell crash after reboot produced repeated:

    NVRM: VM: invalid mmap context

ProblemType: Bug
DistroRelease: Ubuntu 26.04
Package: linux-image-7.0.0-27-generic 7.0.0-27.27
ProcVersionSignature: Ubuntu 7.0.0-22.22-generic 7.0.0
Uname: Linux 7.0.0-22-generic x86_64
ApportVersion: 2.34.0-0ubuntu2
Architecture: amd64
CRDA: N/A
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Sat Jun 27 17:12:43 2026
InstallationDate: Installed on 2024-07-15 (713 days ago)
InstallationMedia: Ubuntu 24.04 LTS "Noble Numbat" - Release amd64 (20240424)
MachineType: ASUS System Product Name
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/usr/bin/zsh
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
ProcFB:
 0 nvidia-drmdrmfb
 1 nvidia-drmdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-22-generic root=UUID=5e50224a-2fb6-4607-84e3-9d99dee45bcb ro pcie_aspm=off quiet splash
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
SourcePackage: linux
UpgradeStatus: Upgraded to resolute on 2026-04-24 (65 days ago)
dmi.bios.date: 03/09/2026
dmi.bios.release: 36.3
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3603
dmi.board.asset.tag: Default string
dmi.board.name: ROG STRIX X670E-A GAMING WIFI
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3603:bd03/09/2026:br36.3:svnASUS:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGSTRIXX670E-AGAMINGWIFI:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:skuSKU:pfaTobefilledbyO.E.M.:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: ASUS

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158539/+subscriptions
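
The mitigation described in the update above (setting each xHCI controller's `power/control` to `on` for the current boot) could be scripted roughly as follows. This is a sketch, not the reporter's actual procedure: it assumes the controllers are bound to the `xhci_hcd` PCI driver and exposed under `/sys/bus/pci/drivers/xhci_hcd`, and the function name `xhci_runtime_pm` is hypothetical. Writing to sysfs requires root, and the setting does not persist across reboots.

```python
from pathlib import Path

# Assumed sysfs location of PCI devices bound to the xhci_hcd driver.
XHCI_DRIVER_DIR = Path("/sys/bus/pci/drivers/xhci_hcd")


def xhci_runtime_pm(driver_dir=XHCI_DRIVER_DIR, apply=False):
    """Report runtime-PM state per controller; disable runtime PM if apply=True.

    Returns {pci_address: (control, runtime_status)}.
    """
    driver_dir = Path(driver_dir)
    state = {}
    if not driver_dir.is_dir():
        return state
    # Device entries look like 0000:12:00.3; other driver files do not match.
    for dev in sorted(driver_dir.glob("*:*:*.*")):
        control = dev / "power" / "control"
        status = dev / "power" / "runtime_status"
        if not control.is_file():
            continue
        if apply:
            # "on" keeps the device powered; "auto" allows runtime suspend.
            control.write_text("on\n")
        runtime = status.read_text().strip() if status.is_file() else "unknown"
        state[dev.name] = (control.read_text().strip(), runtime)
    return state


if __name__ == "__main__":
    # Read-only by default; pass apply=True (as root) to change the setting.
    for bdf, (control, runtime) in xhci_runtime_pm().items():
        print(f"{bdf} control={control} runtime={runtime}")
```

Run read-only, it prints one line per controller in the same `control=... runtime=...` form as the observed state listed in the update, which makes it easy to confirm the setting after each boot.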
[Bug 2144577] Re: BUG: kernel NULL pointer dereference in amdgpu
** Tags removed: verification-needed-noble-linux-xilinx
** Tags added: verification-done-noble-linux-xilinx

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2144577

Title:
  BUG: kernel NULL pointer dereference in amdgpu

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Noble:
  Fix Released
Status in linux source package in Questing:
  Fix Released
Status in linux source package in Resolute:
  Fix Released

Bug description:

SRU Justification

[Impact]

System freezes during boot on machines with AMD Southern Islands (SI) GPUs using the amdgpu driver.

The amdgpu driver calls flush_gpu_tlb_pasid() in a workqueue, but on SI hardware this function pointer is NULL. The kernel hits a NULL pointer dereference in amdgpu_gmc_flush_gpu_tlb_pasid() and crashes.

Error log:

    kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
    kernel: Workqueue: events amdgpu_tlb_fence_work [amdgpu]
    kernel: RIP: 0010:0x0
    kernel: Call Trace:
    kernel: amdgpu_gmc_flush_gpu_tlb_pasid+0xfd/0x480 [amdgpu]
    kernel: amdgpu_tlb_fence_work+0x77/0x110 [amdgpu]

Hits every boot on affected hardware. Regression from 6.17.0-14 to 6.17.0-19.

[Fix]

Two patches fix this together:

1. f4db9913e4d3 ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
   Adds a NULL check for flush_gpu_tlb_pasid before calling it. Upstream in v7.0-rc1.

2. e3a6eff92bbd ("drm/amdgpu: Fix validating flush_gpu_tlb_pasid()")
   Fixes the first patch: the early return skipped the unlock, causing a deadlock. Changes the bare return to a goto that unlocks first. Upstream in v7.0-rc1.
   Fixes: f4db9913e4d3

[Test Plan]

On a machine with an AMD SI GPU (Tahiti, Pitcairn, Verde, Oland, Hainan) booted with amdgpu.si_support=1:

    $ sudo reboot

Without patches: kernel NULL pointer dereference during boot, system freezes.
With patches: system boots normally, no crash or error in dmesg.

Check dmesg after boot:

    $ dmesg | grep -i "BUG\|NULL pointer\|amdgpu"

Without patches: "BUG: kernel NULL pointer dereference" present.
With patches: no BUG or NULL pointer lines.

[Where problems could occur]

Could break TLB flushing on amdgpu. If the NULL check gates too broadly, TLB flushes could be skipped on GPUs that do have flush_gpu_tlb_pasid. This would cause stale TLB entries and GPU page faults or rendering corruption.

The unlock path change in the second patch touches the reset/lock logic in amdgpu_gmc_flush_gpu_tlb_pasid(). A wrong goto target could leave the reset domain lock held, deadlocking the GPU.

[Other Info]

Both patches are upstream in v7.0-rc1.

===========================================================

Ubuntu 25.10 with kernel 6.17.0-19-generic doesn't boot on my PC. It freezes on the booting screen, and the kernel logs show a bug:

    kernel: Linux version 6.17.0-19-generic (buildd@lcy02-amd64-084) (x86_64-linux-gnu-gcc (Ubuntu 15.2.0-4ubuntu4) 15.2.0, GNU ld (GNU Binutils for Ubuntu) 2.45) #19-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 6 14:02:58 UTC 2026 (Ubuntu 6.17.0-19.19-generic 6.17.13)
    kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.17.0-19-generic root=UUID=354e3c09-bfde-4e47-850f-fe872a882ae5 ro quiet splash radeon.si_support=0 amdgpu.si_support=1 crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M vt.handoff=7
    # ...
    kernel: [drm] Initialized amdgpu 3.64.0 for 0000:01:00.0 on minor 1
    kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
    kernel: #PF: supervisor instruction fetch in kernel mode
    kernel: #PF: error_code(0x0010) - not-present page
    kernel: PGD 0 P4D 0
    kernel: Oops: Oops: 0010 [#1] SMP PTI
    kernel: CPU: 3 UID: 0 PID: 109 Comm: kworker/3:1 Not tainted 6.17.0-19-generic #19-Ubuntu PREEMPT(voluntary)
    kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro3, BIOS P1.10 04/10/2012
    kernel: Workqueue: events amdgpu_tlb_fence_work [amdgpu]
    kernel: RIP: 0010:0x0
    kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
    kernel: RSP: 0018:ffffce560061fdb0 EFLAGS: 00010246
    kernel: RAX: 0000000000000000 RBX: 0000000000008000 RCX: 0000000000000001
    kernel: RDX: 0000000000000002 RSI: 0000000000008000 RDI: ffff8a4a6d180000
    kernel: RBP: ffffce560061fe08 R08: 0000000000000000 R09: 0000000000000000
    kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
    kernel: R13: 0000000000000000 R14: ffff8a4a6d180000 R15: 0000000000000000
    kernel: FS: 0000000000000000(0000) GS:ffff8a4da87ff000(0000) knlGS:0000000000000000
    kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    kernel: CR2: ffffffffffffffd6 CR3: 00000003de040002 CR4: 00000000001726f0
    kernel: Call Trace:
    kernel: <TASK>
    kernel: amdgpu_gmc_flush_gpu_tlb_pasid+0xfd/0x480 [amdgpu]
    kernel: amdgpu_tlb_fence_work+0x77/0x110 [amdgpu]
    kernel: process_one_work+0x18e/0x370
    kernel: worker_thread+0x317/0x450
    kernel: ? _raw_spin_lock_irqsave+0xe/0x20
    kernel: ? __pfx_worker_thread+0x10/0x10
    kernel: kthread+0x10b/0x220
    kernel: ? __pfx_kthread+0x10/0x10
    kernel: ret_from_fork+0x134/0x150
    kernel: ? __pfx_kthread+0x10/0x10
    kernel: ret_from_fork_asm+0x1a/0x30
    kernel: </TASK>
    kernel: Modules linked in: rfcomm cmac algif_hash algif_skcipher af_alg bnep ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_mac xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat binfmt_misc nf_tables amdgpu(+) usblp intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel amdxcp at24 mei_hdcp mei_pxp kvm snd_hda_codec_atihdmi drm_panel_backlight_quirks gpu_sched irqbypass snd_hda_codec_hdmi drm_buddy snd_hda_codec_alc662 rapl btusb snd_hda_codec_realtek_lib intel_cstate snd_hda_codec_generic radeon btrtl snd_hda_intel btintel i2c_i801 btbcm snd_hda_codec btmtk i2c_smbus drm_ttm_helper i2c_mux ttm bluetooth snd_seq_midi snd_hda_core snd_seq_midi_event drm_exec snd_intel_dspcfg snd_rawmidi drm_suballoc_helper snd_intel_sdw_acpi drm_display_helper lpc_ich snd_hwdep snd_seq snd_pcm snd_seq_device cec snd_timer rc_core snd i2c_algo_bit soundcore mei_me mei intel_smartconnect joydev
    kernel: input_leds mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 dm_crypt wacom uas usb_storage hid_generic usbhid hid r8169 polyval_clmulni ghash_clmulni_intel psmouse ahci realtek serio_raw libahci video wmi aesni_intel
    kernel: CR2: 0000000000000000
    kernel: ---[ end trace 0000000000000000 ]---
    kernel: RIP: 0010:0x0
    kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
kernel: RSP: 0018:ffffce560061fdb0 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: 0000000000008000 RCX: 0000000000000001 kernel: RDX: 0000000000000002 RSI: 0000000000008000 RDI: ffff8a4a6d180000 kernel: RBP: ffffce560061fe08 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 kernel: R13: 0000000000000000 R14: ffff8a4a6d180000 R15: 0000000000000000 kernel: FS: 0000000000000000(0000) GS:ffff8a4da87ff000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: ffffffffffffffd6 CR3: 00000003de040002 CR4: 00000000001726f0 kernel: note: kworker/3:1[109] exited with irqs disabled kernel: loop50: detected capacity change from 0 to 8 kernel: fbcon: amdgpudrmfb (fb0) is primary device kernel: fbcon: Deferring console take-over kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device kernel: NET: Registered PF_QIPCRTR protocol family kernel: sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 > kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000 kernel: #PF: supervisor instruction fetch in kernel mode kernel: #PF: error_code(0x0010) - not-present page kernel: PGD 0 P4D 0 kernel: Oops: Oops: 0010 [#2] SMP PTI kernel: CPU: 1 UID: 0 PID: 91 Comm: kworker/1:1 Tainted: G D 6.17.0-19-generic #19-Ubuntu PREEMPT(voluntary) kernel: Tainted: [D]=DIE kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro3, BIOS P1.10 04/10/2012 kernel: Workqueue: events amdgpu_tlb_fence_work [amdgpu] kernel: RIP: 0010:0x0 kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
kernel: RSP: 0000:ffffce5600477db0 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: 0000000000008001 RCX: 0000000000000001 kernel: RDX: 0000000000000002 RSI: 0000000000008001 RDI: ffff8a4a6d180000 kernel: RBP: ffffce5600477e08 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 kernel: R13: 0000000000000000 R14: ffff8a4a6d180000 R15: 0000000000000000 kernel: FS: 0000000000000000(0000) GS:ffff8a4da86ff000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: ffffffffffffffd6 CR3: 0000000101242006 CR4: 00000000001726f0 kernel: Call Trace: kernel: <TASK> kernel: amdgpu_gmc_flush_gpu_tlb_pasid+0xfd/0x480 [amdgpu] kernel: amdgpu_tlb_fence_work+0x77/0x110 [amdgpu] kernel: process_one_work+0x18e/0x370 kernel: worker_thread+0x317/0x450 kernel: ? _raw_spin_lock_irqsave+0xe/0x20 kernel: ? __pfx_worker_thread+0x10/0x10 kernel: kthread+0x10b/0x220 kernel: ? __pfx_kthread+0x10/0x10 kernel: ret_from_fork+0x134/0x150 kernel: ? 
__pfx_kthread+0x10/0x10 kernel: ret_from_fork_asm+0x1a/0x30 kernel: </TASK> kernel: Modules linked in: qrtr rfcomm cmac algif_hash algif_skcipher af_alg bnep ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_mac xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat binfmt_misc nf_tables amdgpu usblp intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel amdxcp at24 mei_hdcp mei_pxp kvm snd_hda_codec_atihdmi drm_panel_backlight_quirks gpu_sched irqbypass snd_hda_codec_hdmi drm_buddy snd_hda_codec_alc662 rapl btusb snd_hda_codec_realtek_lib intel_cstate snd_hda_codec_generic radeon btrtl snd_hda_intel btintel i2c_i801 btbcm snd_hda_codec btmtk i2c_smbus drm_ttm_helper i2c_mux ttm bluetooth snd_seq_midi snd_hda_core snd_seq_midi_event drm_exec snd_intel_dspcfg snd_rawmidi drm_suballoc_helper snd_intel_sdw_acpi drm_display_helper lpc_ich snd_hwdep snd_seq snd_pcm snd_seq_device cec snd_timer rc_core snd i2c_algo_bit soundcore mei_me mei intel_smartconnect joydev kernel: input_leds mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 dm_crypt wacom uas usb_storage hid_generic usbhid hid r8169 polyval_clmulni ghash_clmulni_intel psmouse ahci realtek serio_raw libahci video wmi aesni_intel kernel: CR2: 0000000000000000 kernel: ---[ end trace 0000000000000000 ]--- kernel: RIP: 0010:0x0 kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
kernel: RSP: 0018:ffffce560061fdb0 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: 0000000000008000 RCX: 0000000000000001 kernel: RDX: 0000000000000002 RSI: 0000000000008000 RDI: ffff8a4a6d180000 kernel: RBP: ffffce560061fe08 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 kernel: R13: 0000000000000000 R14: ffff8a4a6d180000 R15: 0000000000000000 kernel: FS: 0000000000000000(0000) GS:ffff8a4da86ff000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: ffffffffffffffd6 CR3: 0000000101242006 CR4: 00000000001726f0 kernel: note: kworker/1:1[91] exited with irqs disabled The previous kernel 6.17.0-14-generic boots without any issues. I'll try to attach the required information using `apport-collect -p linux BUG#`, but it'll be collected after successfully booting with 6.17.0-14, whereas the bug occurs with 6.17.0-19. --- ProblemType: Bug ApportVersion: 2.33.1-0ubuntu3 Architecture: amd64 AudioDevicesInUse: USER PID ACCESS COMMAND /dev/snd/controlC0: mateusz 3017 F.... wireplumber /dev/snd/controlC1: mateusz 3017 F.... wireplumber /dev/snd/seq: mateusz 2999 F.... pipewire CasperMD5CheckResult: unknown CurrentDesktop: ubuntu:GNOME DistroRelease: Ubuntu 25.10 InstallationDate: Installed on 2020-10-14 (1979 days ago) InstallationMedia: Ubuntu 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731) MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M. 
Package: linux (not installed) ProcEnviron: LANG=pl_PL.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color ProcFB: 0 amdgpudrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.17.0-14-generic root=UUID=354e3c09-bfde-4e47-850f-fe872a882ae5 ro quiet splash radeon.si_support=0 amdgpu.si_support=1 crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M vt.handoff=7 ProcVersionSignature: Ubuntu 6.17.0-14.14-generic 6.17.9 RelatedPackageVersions: firmware-sof N/A linux-firmware 20250901.git993ff19b-0ubuntu1.9 RfKill: 0: hci0: Bluetooth Soft blocked: yes Hard blocked: no Tags: questing Uname: Linux 6.17.0-14-generic x86_64 UpgradeStatus: Upgraded to questing on 2026-01-10 (65 days ago) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 04/10/2012 dmi.bios.release: 4.6 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: P1.10 dmi.board.name: Z77 Pro3 dmi.board.vendor: ASRock dmi.chassis.asset.tag: To Be Filled By O.E.M. dmi.chassis.type: 3 dmi.chassis.vendor: To Be Filled By O.E.M. dmi.chassis.version: To Be Filled By O.E.M. dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrP1.10:bd04/10/2012:br4.6:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASRock:rnZ77Pro3:rvr:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:skuToBeFilledByO.E.M.: dmi.product.family: To Be Filled By O.E.M. dmi.product.name: To Be Filled By O.E.M. dmi.product.sku: To Be Filled By O.E.M. dmi.product.version: To Be Filled By O.E.M. dmi.sys.vendor: To Be Filled By O.E.M. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2144577/+subscriptions
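The dmesg check in the test plan for bug 2144577 could also be scripted. The following is a minimal sketch, not part of the SRU: the function name `has_tlb_flush_oops` is hypothetical, and the two signatures it looks for are simply the strings quoted in the report's error log. It assumes the kernel log has already been captured as text.

```python
import re

# Signatures copied from the error log quoted in the report.
OOPS_BANNER = re.compile(r"BUG: kernel NULL pointer dereference")
TLB_FLUSH_FRAME = re.compile(r"amdgpu_gmc_flush_gpu_tlb_pasid")


def has_tlb_flush_oops(log_text: str) -> bool:
    """Return True if the log shows the NULL-dereference banner together
    with the amdgpu TLB-flush frame seen in the reported call trace."""
    return bool(OOPS_BANNER.search(log_text)) and bool(
        TLB_FLUSH_FRAME.search(log_text)
    )


# Illustrative inputs built from lines quoted in the report.
UNPATCHED_SAMPLE = """\
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
kernel: Workqueue: events amdgpu_tlb_fence_work [amdgpu]
kernel: RIP: 0010:0x0
kernel: Call Trace:
kernel: amdgpu_gmc_flush_gpu_tlb_pasid+0xfd/0x480 [amdgpu]
kernel: amdgpu_tlb_fence_work+0x77/0x110 [amdgpu]
"""

CLEAN_SAMPLE = "kernel: [drm] Initialized amdgpu 3.64.0 for 0000:01:00.0 on minor 1\n"
```

Requiring both signatures avoids flagging unrelated NULL dereferences, at the cost of missing a crash whose trace was truncated before the amdgpu frame. In practice the text would come from `dmesg` or `journalctl -k -b` on the machine under test.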
[Bug 2144060] Re: ADATA SU680 causes repeated SATA resets and I/O errors on Ubuntu unless link power management is forced to max_performance
** Tags removed: verification-needed-jammy-linux-xilinx-zynqmp verification-needed-noble-linux-xilinx
** Tags added: verification-done-jammy-linux-xilinx-zynqmp verification-done-noble-linux-xilinx

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2144060

Title: ADATA SU680 causes repeated SATA resets and I/O errors on Ubuntu unless link power management is forced to max_performance

Status in linux package in Ubuntu: In Progress
Status in linux source package in Jammy: Fix Released
Status in linux source package in Noble: Fix Released
Status in linux source package in Questing: Fix Released
Status in linux source package in Resolute: In Progress

Bug description:

A Dell Vostro 15 3510 running Ubuntu from an ADATA SU680 1TB SATA SSD (/dev/sda, firmware VE1R910D) experienced repeated storage instability that appears tied to Linux SATA link power management.

The detailed evidence is attached in the archive file: sata_adata_bugreport_attachments.zip

The attached archive contains the main incident report and supporting evidence files, including SMART output, filtered kernel logs from affected and clean boots, hardware/system identification output, storage-layout output, and SATA power-policy evidence.

Observed symptoms included:
- repeated WRITE/READ FPDMA QUEUED failures
- Emask 0x4 (timeout) and Emask 0x10 (ATA bus error)
- irq_stat 0x08000000, interface fatal error
- SError: { PHYRdyChg CommWake }
- SError: { UnrecovData CommWake Handshk }
- ata1: hard resetting link
- Sense Key : Aborted Command
- I/O error, dev sda, sector ...
- abnormal SMART Power_Cycle_Count / Lifetime Power-On Resets growth during normal Ubuntu runtime
- one Windows-side symptom where the SATA disk appeared as if GPT/partition metadata might need initialization
- user-visible effects including OS slowdowns, temporary hangs/freezes, and application instability

This did not look like classic SSD media failure:
- SMART overall health stayed PASSED
- reallocated sectors stayed 0
- reported uncorrectables stayed 0
- UDMA CRC stayed 0
- reserve space stayed 100
- endurance used stayed 0
- SMART error log stayed empty
- a full raw-disk read succeeded
- a later extended SMART self-test completed successfully

Important power-management evidence:

Before the workaround, the observed Linux SATA power-related state was:

/sys/class/scsi_host/host0/link_power_management_policy:min_power_with_partial
/sys/class/scsi_host/host1/link_power_management_policy:min_power_with_partial
/sys/class/block/sda/device/power/control:on

The strongest evidence is that the issue stopped immediately after forcing:

for f in /sys/class/scsi_host/host*/link_power_management_policy; do
  echo max_performance | sudo tee "$f"
done

After that:
- SMART power/reset counter stabilized
- later long boots became clean
- extended SMART self-test passed
- system responsiveness improved
- earlier freeze/hang behavior stopped being observed

This strongly suggests a Linux SATA ALPM / DIPM / AHCI / libata incompatibility with this SSD/platform combination rather than classic SSD NAND/media failure.

Expected result:
Ubuntu should run normally from the ADATA SU680 without repeated SATA transport resets, SMART reset-counter inflation, transient I/O errors, GPT/metadata instability, or broader runtime instability.

Actual result:
Under the default Linux storage-power behavior on this system, repeated SATA transport resets and I/O errors occurred across multiple boots, and the SMART reset counter increased during runtime without real reboot/shutdown.
ProblemType: Bug DistroRelease: Ubuntu 24.04 Package: linux-image-6.17.0-14-generic 6.17.0-14.14~24.04.1 ProcVersionSignature: Ubuntu 6.17.0-14.14~24.04.1-generic 6.17.9 Uname: Linux 6.17.0-14-generic x86_64 NonfreeKernelModules: nvidia_modeset nvidia ApportVersion: 2.28.1-0ubuntu3.8 Architecture: amd64 AudioDevicesInUse: USER PID ACCESS COMMAND /dev/snd/controlC0: khaled 2391 F.... wireplumber /dev/snd/seq: khaled 2388 F.... pipewire CRDA: N/A CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Fri Mar 13 00:55:34 2026 InstallationDate: Installed on 2026-03-07 (5 days ago) InstallationMedia: Ubuntu 24.04.4 LTS "Noble Numbat" - Release amd64 (20260210) Lsusb: Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 001 Device 002: ID 17ef:60ac Lenovo Lenovo 300 Wireless Compact Mouse Bus 001 Device 003: ID 0c45:6730 Microdia Integrated_Webcam_HD Bus 001 Device 004: ID 8087:0aaa Intel Corp. Bluetooth 9460/9560 Jefferson Peak (JfP) Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub MachineType: Dell Inc. Vostro 15 3510 ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color XDG_RUNTIME_DIR=<set> ProcFB: 0 i915drmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.17.0-14-generic root=UUID=06fa0712-9df2-4153-9d2f-abd46d850c25 ro quiet splash vt.handoff=7 RelatedPackageVersions: linux-restricted-modules-6.17.0-14-generic N/A linux-backports-modules-6.17.0-14-generic N/A linux-firmware 20240318.git3b128b60-0ubuntu2.25 SourcePackage: linux-hwe-6.17 UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 01/05/2026 dmi.bios.release: 1.44 dmi.bios.vendor: Dell Inc. dmi.bios.version: 1.44.0 dmi.board.name: 00NFD7 dmi.board.vendor: Dell Inc. dmi.board.version: A00 dmi.chassis.type: 10 dmi.chassis.vendor: Dell Inc. 
dmi.modalias: dmi:bvnDellInc.:bvr1.44.0:bd01/05/2026:br1.44:svnDellInc.:pnVostro153510:pvr:rvnDellInc.:rn00NFD7:rvrA00:cvnDellInc.:ct10:cvr:sku0ADB: dmi.product.family: Vostro dmi.product.name: Vostro 15 3510 dmi.product.sku: 0ADB dmi.sys.vendor: Dell Inc. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2144060/+subscriptions
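To confirm whether the workaround from bug 2144060 is in effect on a given boot, the per-host policy files named in the report can be read back. This is a minimal read-only sketch: the helper names `read_lpm_policies` and `hosts_not_at_max_performance` are hypothetical, and the `base` parameter exists only so the functions can be pointed at a test directory instead of the real sysfs tree.

```python
from pathlib import Path


def read_lpm_policies(base="/sys/class/scsi_host"):
    """Map each hostN directory under `base` to the current contents of
    its link_power_management_policy file."""
    policies = {}
    pattern = "host*/link_power_management_policy"
    for policy_file in sorted(Path(base).glob(pattern)):
        policies[policy_file.parent.name] = policy_file.read_text().strip()
    return policies


def hosts_not_at_max_performance(policies):
    """List hosts whose policy differs from the workaround value."""
    return [
        host for host, policy in policies.items()
        if policy != "max_performance"
    ]
```

With the pre-workaround state quoted in the report, both host0 and host1 would be listed, since each reported min_power_with_partial. The sketch only reads the files; writing the policy still requires root, as in the report's shell loop, and the setting does not persist across reboots unless it is applied again at boot.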
[Bug 2158549] Re: HP 15-ef1013dx ELAN071A touchpad not working - hid_descr_cmd failed
** Tags added: kernel-daily-bug

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2158549

Title: HP 15-ef1013dx ELAN071A touchpad not working - hid_descr_cmd failed

Status in linux package in Ubuntu: New

Bug description:

Hardware: HP Laptop 15-ef1013dx
CPU: AMD Ryzen 7 4700U
Kernel: 7.0.0-27-generic
Ubuntu: 26.04 LTS

The ELAN071A touchpad is detected on the I2C bus but fails to initialize.

dmesg shows:
i2c_hid i2c-ELAN071A:00: supply vdd not found, using dummy regulator
i2c_hid i2c-ELAN071A:00: hid_descr_cmd failed
i2c_hid i2c-ELAN071A:00: Failed to fetch the HID Descriptor

The device is visible at /sys/bus/i2c/devices/i2c-ELAN071A:00
uevent shows: waiting_for_supplier

The touchpad works in Windows 11.

To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158549/+subscriptions
[Bug 2158569] Re: 7.0.0 kernel spends ~63s in initramfs retrying a non-enumerable USB port while 6.19.10 doesn't (same hardware)
** Tags added: kernel-daily-bug

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2158569

Title: 7.0.0 kernel spends ~63s in initramfs retrying a non-enumerable USB port while 6.19.10 doesn't (same hardware)

Status in linux package in Ubuntu: New

Bug description:

On Ubuntu 26.04, the 7.0.0 kernel frequently adds ~63 seconds to boot because the kernel repeatedly tries and fails to enumerate a USB port during the initramfs phase, proceeding only after exhausting all retries. The 6.19.10 mainline kernel doesn't stall on the same hardware with the same broken port present.

Reproduction / behavior:
- Once the stall occurs, it then occurs on EVERY subsequent boot - the long ~60+ seconds loading screen happens every time - until I fully drain power (shut down, unplug the power cable, hold the power button ~30s, then plug back in and boot). A normal reboot or power-off does not clear it; only the full power drain does.
- After a power drain, the next boot may be clean, but once the stall reappears even once, it again persists on every boot until the next power drain.
- This is easy to hit on 7.0.0. On 6.19.10, on the same hardware, I do not see this behavior at all.
- I have not isolated exactly what first triggers it on a given session; I am reporting the observed pattern: it latches on and persists across reboots until a full power drain resets it.

Hardware:
- CPU: AMD Ryzen 7 7700X
- GPU: AMD Radeon RX 7900 XT / Navi 31
- Motherboard: ASRock B650M Pro RS WiFi, BIOS 4.20
- Root: NVMe (Lexar SSD NM710 2TB), ext4, no USB needed for root

The failing port:
- usb 1-8 on controller 0000:0d:00.0 (xhci_hcd).
- It never enumerates: it cycles through "device descriptor read/64, error -110", "Device not responding to setup address", and "device not accepting address N, error -71", then "unable to enumerate USB device".
- physical_location reports back / left / lower. It does not correspond to any usable socket I can find: my actual front ports enumerate as 1-6 and 1-5, and my working rear ports are on a different controller (0000:10:00.4). So 1-8 appears to be a phantom/dead endpoint on this controller.

Boot time comparison (systemd-analyze), same machine, same disk, same broken port physically present:

6.19.10-061910-generic: initrd 2.160s (total 19.820s)
7.0.0-27-generic: initrd ~65-67s (total ~1min 38s)

On 7.0.0, the kernel log shows the full retry cycle on usb 1-8 consuming the window from ~1.4s to ~65s, and dracut-initqueue finishes the instant the kernel gives up on the port:

[ 1.408] usb 1-8: new high-speed USB device number 3 using xhci_hcd
[ 6.880] usb 1-8: device descriptor read/64, error -110
[22.752] usb 1-8: device descriptor read/64, error -110
[28.897] usb 1-8: device descriptor read/64, error -110
[44.768] usb 1-8: device descriptor read/64, error -110
[44.877] usb usb1-port8: attempt power cycle
[50.118] usb 1-8: Device not responding to setup address.
[55.331] usb 1-8: device not accepting address 5, error -71
[60.510] usb 1-8: Device not responding to setup address.
[65.723] usb 1-8: device not accepting address 6, error -71
[65.725] usb usb1-port8: unable to enumerate USB device
[65.739] systemd[1]: Finished dracut-initqueue.service - dracut initqueue hook.

During this whole window the boot splash stays at a low fallback resolution (simpledrm), because amdgpu does not load until after the initramfs completes.

What differs between the kernels:
- 6.19.10 reaches the root filesystem and proceeds in ~2s of initrd; it does not sit through a 60+ seconds USB retry cycle during boot, and the slow state does not keep recurring.
- 7.0.0 blocks the initramfs for the full retry/timeout budget on this single failed port. Root is on NVMe and does not depend on USB at all, so nothing about reaching the root filesystem requires waiting on this port.

Runtime note (not a boot fix):
- Once booted, writing 1 to /sys/bus/usb/devices/usb1/1-0:1.0/usb1-port8/disable stops the port retrying, but only after userspace is up, so it does not help the initramfs stall. Per-port quirks (quirks=0x01) and dracut cmdline/pre-trigger hooks did not prevent the boot-time stall in my testing.

Expected: the kernel should not block boot for ~63s on a single USB port that fails to enumerate, especially when root is on NVMe and does not depend on USB. 6.19.10 demonstrates the faster behavior on identical hardware.

ProblemType: Bug DistroRelease: Ubuntu 26.04 Package: linux-image-7.0.0-27-generic 7.0.0-27.27 ProcVersionSignature: Ubuntu 7.0.0-27.27-generic 7.0.6 Uname: Linux 7.0.0-27-generic x86_64 ApportVersion: 2.34.0-0ubuntu2 Architecture: amd64 CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Sun Jun 28 19:10:17 2026 InstallationDate: Installed on 2026-05-01 (58 days ago) InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1) IwDevWlp7s0Link: Not connected. MachineType: ASRock B650M Pro RS WiFi ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color XDG_RUNTIME_DIR=<set> ProcFB: 0 amdgpudrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-27-generic root=UUID=b624231d-8e1e-4b2a-887d-4d219254360d ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M RfKill: 0: phy0: Wireless LAN Soft blocked: no Hard blocked: no SourcePackage: linux UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 04/16/2026 dmi.bios.release: 5.41 dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 4.20 dmi.board.asset.tag: Default string dmi.board.name: B650M Pro RS WiFi dmi.board.vendor: ASRock dmi.board.version: Default string dmi.chassis.asset.tag: Default string dmi.chassis.type: 3 dmi.chassis.vendor: Default string dmi.chassis.version: Default string dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr4.20:bd04/16/2026:br5.41:svnASRock:pnB650MProRSWiFi:pvrDefaultstring:rvnASRock:rnB650MProRSWiFi:rvrDefaultstring:cvnDefaultstring:ct3:cvrDefaultstring:skuDefaultstring:pfaDefaultstring: dmi.product.family: Default string dmi.product.name: B650M Pro RS WiFi dmi.product.sku: Default string dmi.product.version: Default string dmi.sys.vendor: ASRock To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158569/+subscriptions
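The length of the enumeration stall in bug 2158569 can be derived directly from the timestamps in the quoted kernel log. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the log has one entry per line with the timestamp in square brackets, as in the report's excerpt.

```python
import re

# Matches lines such as "[ 1.408] usb 1-8: new high-speed USB device ..."
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")

# Excerpt copied from the kernel log quoted in the report.
REPORTED_LOG = """\
[ 1.408] usb 1-8: new high-speed USB device number 3 using xhci_hcd
[ 6.880] usb 1-8: device descriptor read/64, error -110
[22.752] usb 1-8: device descriptor read/64, error -110
[28.897] usb 1-8: device descriptor read/64, error -110
[44.768] usb 1-8: device descriptor read/64, error -110
[44.877] usb usb1-port8: attempt power cycle
[50.118] usb 1-8: Device not responding to setup address.
[55.331] usb 1-8: device not accepting address 5, error -71
[60.510] usb 1-8: Device not responding to setup address.
[65.723] usb 1-8: device not accepting address 6, error -71
[65.725] usb usb1-port8: unable to enumerate USB device
[65.739] systemd[1]: Finished dracut-initqueue.service - dracut initqueue hook.
"""


def parse_kernel_log(text):
    """Return a list of (seconds, message) tuples for timestamped lines."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def enumeration_stall(entries, device="usb 1-8", port="usb usb1-port8"):
    """Seconds between the first enumeration attempt on `device` and the
    point where the hub gives up on `port`."""
    start = next(t for t, msg in entries if msg.startswith(device + ": new "))
    end = next(
        t for t, msg in entries
        if msg.startswith(port + ": unable to enumerate")
    )
    return end - start


def count_descriptor_timeouts(entries):
    """Number of 'device descriptor read/64, error -110' retries."""
    return sum(1 for _, msg in entries if msg.endswith("error -110"))
```

Applied to the excerpt, the window from the first attempt at 1.408 to the give-up message at 65.725 comes to roughly 64.3 seconds, consistent with the ~63s figure in the title and the ~65-67s initrd time reported by systemd-analyze.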
[Bug 2158569] [NEW] 7.0.0 kernel spends ~63s in initramfs retrying a non-enumerable USB port while 6.19.10 doesn't (same hardware)
Public bug reported: On Ubuntu 26.04, the 7.0.0 kernel frequently adds ~63 seconds to boot because the kernel repeatedly tries and fails to enumerate a USB port during the initramfs phase, proceeding only after exhausting all retries. The 6.19.10 mainline kernel doesn't stall on the same hardware with the same broken port present. Reproduction / behavior: - Once the stall occurs, it then occurs on EVERY subsequent boot - the long ~60+ seconds loading screen happens every time - until I fully drain power (shut down, unplug the power cable, hold the power button ~30s, then plug back in and boot). A normal reboot or power-off does not clear it; only the full power drain does. - After a power drain, the next boot may be clean, but once the stall reappears even once, it again persists on every boot until the next power drain. - This is easy to hit on 7.0.0. On 6.19.10, on the same hardware, I do not see this behavior at all. - I have not isolated exactly what first triggers it on a given session; I am reporting the observed pattern: it latches on and persists across reboots until a full power drain resets it. Hardware: - CPU: AMD Ryzen 7 7700X - GPU: AMD Radeon RX 7900 XT / Navi 31 - Motherboard: ASRock B650M Pro RS WiFi, BIOS 4.20 - Root: NVMe (Lexar SSD NM710 2TB), ext4, no USB needed for root The failing port: - usb 1-8 on controller 0000:0d:00.0 (xhci_hcd). - It never enumerates: it cycles through "device descriptor read/64, error -110", "Device not responding to setup address", and "device not accepting address N, error -71", then "unable to enumerate USB device". - physical_location reports back / left / lower. It does not correspond to any usable socket I can find: my actual front ports enumerate as 1-6 and 1-5, and my working rear ports are on a different controller (0000:10:00.4). So 1-8 appears to be a phantom/dead endpoint on this controller. 
Boot time comparison (systemd-analyze), same machine, same disk, same broken port physically present: 6.19.10-061910-generic: initrd 2.160s (total 19.820s) 7.0.0-27-generic: initrd ~65-67s (total ~1min 38s) On 7.0.0, the kernel log shows the full retry cycle on usb 1-8 consuming the window from ~1.4s to ~65s, and dracut-initqueue finishes the instant the kernel gives up on the port: [ 1.408] usb 1-8: new high-speed USB device number 3 using xhci_hcd [ 6.880] usb 1-8: device descriptor read/64, error -110 [22.752] usb 1-8: device descriptor read/64, error -110 [28.897] usb 1-8: device descriptor read/64, error -110 [44.768] usb 1-8: device descriptor read/64, error -110 [44.877] usb usb1-port8: attempt power cycle [50.118] usb 1-8: Device not responding to setup address. [55.331] usb 1-8: device not accepting address 5, error -71 [60.510] usb 1-8: Device not responding to setup address. [65.723] usb 1-8: device not accepting address 6, error -71 [65.725] usb usb1-port8: unable to enumerate USB device [65.739] systemd[1]: Finished dracut-initqueue.service - dracut initqueue hook. During this whole window the boot splash stays at a low fallback resolution (simpledrm), because amdgpu does not load until after the initramfs completes. What differs between the kernels: - 6.19.10 reaches the root filesystem and proceeds in ~2s of initrd; it does not sit through a 60+ seconds USB retry cycle during boot, and the slow state does not keep recurring. - 7.0.0 blocks the initramfs for the full retry/timeout budget on this single failed port. Root is on NVMe and does not depend on USB at all, so nothing about reaching the root filesystem requires waiting on this port. Runtime note (not a boot fix): - Once booted, writing 1 to /sys/bus/usb/devices/usb1/1-0:1.0/usb1-port8/disable stops the port retrying, but only after userspace is up, so it does not help the initramfs stall. 
Per-port quirks (quirks=0x01) and dracut cmdline/pre-trigger hooks did not prevent the boot-time stall in my testing. Expected: the kernel should not block boot for ~63s on a single USB port that fails to enumerate, especially when root is on NVMe and does not depend on USB. 6.19.10 demonstrates the faster behavior on identical hardware. ProblemType: Bug DistroRelease: Ubuntu 26.04 Package: linux-image-7.0.0-27-generic 7.0.0-27.27 ProcVersionSignature: Ubuntu 7.0.0-27.27-generic 7.0.6 Uname: Linux 7.0.0-27-generic x86_64 ApportVersion: 2.34.0-0ubuntu2 Architecture: amd64 CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Sun Jun 28 19:10:17 2026 InstallationDate: Installed on 2026-05-01 (58 days ago) InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1) IwDevWlp7s0Link: Not connected. MachineType: ASRock B650M Pro RS WiFi ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color XDG_RUNTIME_DIR=<set> ProcFB: 0 amdgpudrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-27-generic root=UUID=b624231d-8e1e-4b2a-887d-4d219254360d ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M RfKill: 0: phy0: Wireless LAN Soft blocked: no Hard blocked: no SourcePackage: linux UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 04/16/2026 dmi.bios.release: 5.41 dmi.bios.vendor: American Megatrends International, LLC. 
dmi.bios.version: 4.20 dmi.board.asset.tag: Default string dmi.board.name: B650M Pro RS WiFi dmi.board.vendor: ASRock dmi.board.version: Default string dmi.chassis.asset.tag: Default string dmi.chassis.type: 3 dmi.chassis.vendor: Default string dmi.chassis.version: Default string dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr4.20:bd04/16/2026:br5.41:svnASRock:pnB650MProRSWiFi:pvrDefaultstring:rvnASRock:rnB650MProRSWiFi:rvrDefaultstring:cvnDefaultstring:ct3:cvrDefaultstring:skuDefaultstring:pfaDefaultstring: dmi.product.family: Default string dmi.product.name: B650M Pro RS WiFi dmi.product.sku: Default string dmi.product.version: Default string dmi.sys.vendor: ASRock ** Affects: linux (Ubuntu) Importance: Undecided Status: New ** Tags: amd64 apport-bug resolute wayland-session -- You received this bug notification because you are subscribed to linux in Ubuntu. Matching subscriptions: Bgg, Bmail, Nb https://bugs.launchpad.net/bugs/2158569 Title: 7.0.0 kernel spends ~63s in initramfs retrying a non-enumerable USB port while 6.19.10 doesn't (same hardware) Status in linux package in Ubuntu: New Bug description: On Ubuntu 26.04, the 7.0.0 kernel frequently adds ~63 seconds to boot because the kernel repeatedly tries and fails to enumerate a USB port during the initramfs phase, proceeding only after exhausting all retries. The 6.19.10 mainline kernel doesn't stall on the same hardware with the same broken port present. Reproduction / behavior: - Once the stall occurs, it then occurs on EVERY subsequent boot - the long ~60+ seconds loading screen happens every time - until I fully drain power (shut down, unplug the power cable, hold the power button ~30s, then plug back in and boot). A normal reboot or power-off does not clear it; only the full power drain does. - After a power drain, the next boot may be clean, but once the stall reappears even once, it again persists on every boot until the next power drain. - This is easy to hit on 7.0.0. 
  On 6.19.10, on the same hardware, I do not see this behavior at all.
- I have not isolated exactly what first triggers it on a given session; I am reporting the observed pattern: it latches on and persists across reboots until a full power drain resets it.

Hardware:
- CPU: AMD Ryzen 7 7700X
- GPU: AMD Radeon RX 7900 XT / Navi 31
- Motherboard: ASRock B650M Pro RS WiFi, BIOS 4.20
- Root: NVMe (Lexar SSD NM710 2TB), ext4, no USB needed for root

The failing port:
- usb 1-8 on controller 0000:0d:00.0 (xhci_hcd).
- It never enumerates: it cycles through "device descriptor read/64, error -110", "Device not responding to setup address", and "device not accepting address N, error -71", then "unable to enumerate USB device".
- physical_location reports back / left / lower. It does not correspond to any usable socket I can find: my actual front ports enumerate as 1-6 and 1-5, and my working rear ports are on a different controller (0000:10:00.4). So 1-8 appears to be a phantom/dead endpoint on this controller.

Boot time comparison (systemd-analyze), same machine, same disk, same broken port physically present:
  6.19.10-061910-generic: initrd 2.160s (total 19.820s)
  7.0.0-27-generic: initrd ~65-67s (total ~1min 38s)

On 7.0.0, the kernel log shows the full retry cycle on usb 1-8 consuming the window from ~1.4s to ~65s, and dracut-initqueue finishes the instant the kernel gives up on the port:
  [ 1.408] usb 1-8: new high-speed USB device number 3 using xhci_hcd
  [ 6.880] usb 1-8: device descriptor read/64, error -110
  [22.752] usb 1-8: device descriptor read/64, error -110
  [28.897] usb 1-8: device descriptor read/64, error -110
  [44.768] usb 1-8: device descriptor read/64, error -110
  [44.877] usb usb1-port8: attempt power cycle
  [50.118] usb 1-8: Device not responding to setup address.
  [55.331] usb 1-8: device not accepting address 5, error -71
  [60.510] usb 1-8: Device not responding to setup address.
  [65.723] usb 1-8: device not accepting address 6, error -71
  [65.725] usb usb1-port8: unable to enumerate USB device
  [65.739] systemd[1]: Finished dracut-initqueue.service - dracut initqueue hook.

During this whole window the boot splash stays at a low fallback resolution (simpledrm), because amdgpu does not load until after the initramfs completes.

What differs between the kernels:
- 6.19.10 reaches the root filesystem and proceeds in ~2s of initrd; it does not sit through a 60+ seconds USB retry cycle during boot, and the slow state does not keep recurring.
- 7.0.0 blocks the initramfs for the full retry/timeout budget on this single failed port. Root is on NVMe and does not depend on USB at all, so nothing about reaching the root filesystem requires waiting on this port.

Runtime note (not a boot fix):
- Once booted, writing 1 to /sys/bus/usb/devices/usb1/1-0:1.0/usb1-port8/disable stops the port retrying, but only after userspace is up, so it does not help the initramfs stall. Per-port quirks (quirks=0x01) and dracut cmdline/pre-trigger hooks did not prevent the boot-time stall in my testing.

Expected: the kernel should not block boot for ~63s on a single USB port that fails to enumerate, especially when root is on NVMe and does not depend on USB. 6.19.10 demonstrates the faster behavior on identical hardware.

ProblemType: Bug
DistroRelease: Ubuntu 26.04
Package: linux-image-7.0.0-27-generic 7.0.0-27.27
ProcVersionSignature: Ubuntu 7.0.0-27.27-generic 7.0.6
Uname: Linux 7.0.0-27-generic x86_64
ApportVersion: 2.34.0-0ubuntu2
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Sun Jun 28 19:10:17 2026
InstallationDate: Installed on 2026-05-01 (58 days ago)
InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1)
IwDevWlp7s0Link: Not connected.
MachineType: ASRock B650M Pro RS WiFi
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/bash
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-27-generic root=UUID=b624231d-8e1e-4b2a-887d-4d219254360d ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/16/2026
dmi.bios.release: 5.41
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 4.20
dmi.board.asset.tag: Default string
dmi.board.name: B650M Pro RS WiFi
dmi.board.vendor: ASRock
dmi.board.version: Default string
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr4.20:bd04/16/2026:br5.41:svnASRock:pnB650MProRSWiFi:pvrDefaultstring:rvnASRock:rnB650MProRSWiFi:rvrDefaultstring:cvnDefaultstring:ct3:cvrDefaultstring:skuDefaultstring:pfaDefaultstring:
dmi.product.family: Default string
dmi.product.name: B650M Pro RS WiFi
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: ASRock

To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158569/+subscriptions
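The retry window quoted in the report can be checked mechanically from the log excerpt. The following is a minimal Python sketch, not part of the original report: the helper name `stall_window` and the regular expression are illustrative assumptions, and the log text is copied from the excerpt above. It only measures the span between the first and last kernel messages that mention the failing port.

```python
import re

# Kernel log excerpt copied from the report (timestamps are seconds since boot).
LOG = """\
[ 1.408] usb 1-8: new high-speed USB device number 3 using xhci_hcd
[ 6.880] usb 1-8: device descriptor read/64, error -110
[22.752] usb 1-8: device descriptor read/64, error -110
[28.897] usb 1-8: device descriptor read/64, error -110
[44.768] usb 1-8: device descriptor read/64, error -110
[44.877] usb usb1-port8: attempt power cycle
[50.118] usb 1-8: Device not responding to setup address.
[55.331] usb 1-8: device not accepting address 5, error -71
[60.510] usb 1-8: Device not responding to setup address.
[65.723] usb 1-8: device not accepting address 6, error -71
[65.725] usb usb1-port8: unable to enumerate USB device
"""

# Hypothetical pattern for this excerpt's format: "[ time] usb <name>: <message>".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+usb (\S+): (.*)$")


def stall_window(log, port_names):
    """Return (first, last, duration) in seconds for lines naming one of port_names."""
    stamps = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(2) in port_names:
            stamps.append(float(match.group(1)))
    if not stamps:
        return None
    return min(stamps), max(stamps), round(max(stamps) - min(stamps), 3)


first, last, duration = stall_window(LOG, {"1-8", "usb1-port8"})
print(f"usb 1-8 retry window: {first:.3f}s -> {last:.3f}s ({duration:.3f}s)")
```

On this excerpt the window runs from 1.408s to 65.725s, which is in line with the roughly 63s stall and the ~65-67s initrd time the reporter measured with systemd-analyze.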
Saturday
[Bug 2158523] Re: [Snapdragon X Elite / X1E80100] ThinkPad T14s Gen 6 (21N1): spontaneous in-use hard reset — no kernel panic, no pstore dump, corrupted journal (distinct from suspend bug #2127013)
Status changed to 'Confirmed' because the bug affects multiple users. ** Changed in: linux (Ubuntu) Status: New => Confirmed -- You received this bug notification because you are subscribed to linux in Ubuntu. Matching subscriptions: Bgg, Bmail, Nb https://bugs.launchpad.net/bugs/2158523 Title: [Snapdragon X Elite / X1E80100] ThinkPad T14s Gen 6 (21N1): spontaneous in-use hard reset — no kernel panic, no pstore dump, corrupted journal (distinct from suspend bug #2127013) Status in linux package in Ubuntu: Confirmed Bug description: Summary ------- On a Lenovo ThinkPad T14s Gen 6 (Snapdragon X Elite / X1E80100) the machine spontaneously hard-resets while in normal use — not during suspend, not during USB-C hotplug. The reset is firmware/SoC-level: there is no kernel panic, no oops, no soft/hard-lockup trace, nothing in pstore/ramoops, and the systemd journal of the killed boot is left corrupted (the kernel never got to log anything). The machine reboots on its own after a hard cut. This is a different failure from LP #2127013, which is suspend/resume- specific (immediate resume after s2idle). This report is specifically about resets that occur while the machine is awake and in use. Hardware -------- - Model: Lenovo ThinkPad T14s Gen 6 - Machine type / product: 21N10001US (MT 21N1) - SoC: Qualcomm Snapdragon X Elite, X1E80100 (aarch64) - BIOS: LENOVO N42ET97W (2.27), date 2026-02-24 Software -------- - Ubuntu 26.04 LTS - Kernel: 7.0.0-27-generic (#27-Ubuntu SMP PREEMPT_DYNAMIC aarch64) - Suspend mode: s2idle - Kernel command line: ro arm64.nopauth clk_ignore_unused pd_ignore_unused cma=128M efi=noruntime quiet splash console=tty0 mem_sleep_default=s2idle crashkernel=2G-4G:320M,... (clk_ignore_unused / pd_ignore_unused / efi=noruntime are the documented required X1E params; arm64.nopauth was added as a speculative mitigation and made no observable difference.) Impact ------ Unpredictable loss of all unsaved work; filesystem orphan-cleanup on every recovery boot. 
Because the reset is below the OS, kdump never fires and no crash artifact is produced, making it very hard to diagnose. What happens ------------ The system is running normally (light desktop + containers), then without warning the screen cuts and the machine resets and reboots. It is not correlated with suspend or with plugging/unplugging USB-C. Failure signature (forensics from one captured instance) -------------------------------------------------------- - The boot that died ran ~3h10m entirely in-use. It performed ZERO suspend cycles before the reset (so this is not the s2idle path). - The last kernel-ring message preceded the reset by ~2h18m; there is no kernel activity logged at the moment of the cut. - No "panic", "oops", "BUG:", soft/hard-lockup, RCU stall, MCE, or thermal-trip message anywhere near the reset. - /sys/fs/pstore is empty after the reset (kdump-tools active, crashkernel reserved) — nothing was captured. - On the recovery boot: "EXT4-fs (nvmeXn1pY): orphan cleanup on readonly fs" and "system.journal corrupted or uncleanly shut down" — i.e. a hard power cut, not a graceful reboot. - Note: the platform reports "watchdog: NMI not fully supported" / "Hard watchdog permanently disabled", so a CPU soft-lockup would not be caught by an NMI watchdog. Reproducibility --------------- Intermittent — occurs roughly every few hours of uptime, not on demand. I have installed a small boot-flag service (writes a flag on boot, removes it on clean shutdown) so each reset is unambiguously recorded with a timestamp and whether the prior boot had suspended; I can attach this log over time to characterise frequency. Related observation (may point at the layer involved) ---------------------------------------------------- After some of these resets, the next Linux boot comes up with the display(s) black (both internal eDP and external). A full cold power-off + drain does NOT clear it; only booting Windows once and then back into Linux restores the display. 
This strongly suggests a Qualcomm subsystem / display-PHY / firmware state that the proprietary Windows driver stack tears down but Linux does not — consistent with the reset itself being a firmware/SoC-level event rather than a kernel fault. What I have tried ----------------- - Upgraded 25.10 (6.17) -> 26.04 (7.0.0-27): the lenovo-thinkpad-t14s EC driver is now loaded; it did not stop the in-use resets. - Added arm64.nopauth: no observable change. - Confirmed it is not the suspend path (#2127013) and not USB-C hotplug. Request ------- 1. Is there any known X1E80100 SoC watchdog / PMIC power-collapse / PDR-SSR path that can trigger a full SoC reset without a kernel trace, and any way to surface it (e.g. enabling a Qualcomm-side log, ramoops backend that survives this reset type, or a debug build)? 2. Guidance on capturing anything at all from a reset that leaves pstore empty would be very welcome. 3. Happy to test debug kernels / patches and to provide the reset-frequency log and any apport data. ProblemType: Bug DistroRelease: Ubuntu 26.04 Package: linux-image-7.0.0-27-generic 7.0.0-27.27 ProcVersionSignature: Ubuntu 7.0.0-27.27-generic 7.0.6 Uname: Linux 7.0.0-27-generic aarch64 ApportVersion: 2.34.0-0ubuntu2 Architecture: arm64 AudioDevicesInUse: USER PID ACCESS COMMAND /dev/snd/controlC0: devop 4290 F.... pipewire devop 4573 F.... wireplumber /dev/snd/seq: devop 4290 F.... 
pipewire CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Sat Jun 27 16:35:57 2026 InstallationDate: Installed on 2026-01-07 (171 days ago) InstallationMedia: Ubuntu 25.10 "Questing Quokka" - Release arm64 (20251007) Lspci-vt: -[0004:00]---00.0-[01-ff]----00.0 Qualcomm Technologies, Inc WCN785x Wi-Fi 7(802.11be) 320MHz 2x2 [FastConnect 7800] -[0005:00]---00.0-[01-ff]-- -[0006:00]---00.0-[01-ff]----00.0 Sandisk Corp WD PC SN740 NVMe SSD 512GB (DRAM-less) MachineType: LENOVO 21N10001US ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/usr/bin/bash TERM=xterm-256color XDG_RUNTIME_DIR=<set> ProcFB: 0 msmdrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-27-generic root=UUID=17fb0394-4f34-4e43-bb6a-ae0867401e98 ro arm64.nopauth clk_ignore_unused pd_ignore_unused cma=128M efi=noruntime quiet splash console=tty0 mem_sleep_default=s2idle crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M SourcePackage: linux UpgradeStatus: Upgraded to resolute on 2026-06-27 (0 days ago) acpidump: dmi.bios.date: 02/24/2026 dmi.bios.release: 2.27 dmi.bios.vendor: LENOVO dmi.bios.version: N42ET97W (2.27 ) dmi.board.asset.tag: Not Available dmi.board.name: 21N10001US dmi.board.vendor: LENOVO dmi.board.version: SDK0T76576 WIN ptal����8 dmi.chassis.asset.tag: No Asset Information dmi.chassis.type: 10 dmi.chassis.vendor: LENOVO dmi.chassis.version: None dmi.ec.firmware.release: 1.32 dmi.modalias: dmi:bvnLENOVO:bvrN42ET97W(2.27):bd02/24/2026:br2.27:efr1.32:svnLENOVO:pn21N10001US:pvrThinkPadT14sGen6:rvnLENOVO:rn21N10001US:rvrSDK0T76576WINptal8:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_21N1_BU_Think_FM_ThinkPadT14sGen6:pfaThinkPadT14sGen6: dmi.product.family: ThinkPad T14s Gen 6 dmi.product.name: 21N10001US dmi.product.sku: LENOVO_MT_21N1_BU_Think_FM_ThinkPad T14s Gen 6 dmi.product.version: ThinkPad T14s Gen 6 dmi.sys.vendor: LENOVO To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158523/+subscriptions
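The boot-flag service mentioned under "Reproducibility" is not included in the report. The following Python sketch shows one way such a flag could work, under stated assumptions: the function names, the JSON layout, and the file locations are all hypothetical, and the reporter's actual service may differ. A flag written at boot and removed on clean shutdown means that a flag still present at the next boot marks an unclean reset.

```python
import json
import tempfile
import time
from pathlib import Path


def on_boot(flag, log, now=None):
    """Log an unclean reset if the previous boot's flag survived, then arm a new flag."""
    now = time.time() if now is None else now
    unclean = flag.exists()
    if unclean:
        prior = json.loads(flag.read_text())
        entry = {
            "detected_at": now,
            "prior_boot_at": prior["booted_at"],
            "prior_boot_suspended": prior["suspended"],
        }
        with log.open("a") as fh:
            fh.write(json.dumps(entry) + "\n")
    flag.write_text(json.dumps({"booted_at": now, "suspended": False}))
    return unclean


def on_suspend(flag):
    """Mark the current boot as having gone through at least one suspend cycle."""
    state = json.loads(flag.read_text())
    state["suspended"] = True
    flag.write_text(json.dumps(state))


def on_clean_shutdown(flag):
    """Remove the flag so the next boot is not counted as a reset."""
    flag.unlink()


# Demo in a temporary directory (hypothetical paths): clean cycle, then a hard reset.
with tempfile.TemporaryDirectory() as tmp:
    flag, log = Path(tmp) / "boot.flag", Path(tmp) / "resets.log"
    first = on_boot(flag, log, now=100.0)    # no stale flag on the first boot
    on_clean_shutdown(flag)
    second = on_boot(flag, log, now=200.0)   # previous shutdown was clean
    # The machine hard-resets here, so on_clean_shutdown never runs.
    third = on_boot(flag, log, now=300.0)    # stale flag found, reset recorded
    entries = [json.loads(line) for line in log.read_text().splitlines()]
    print(first, second, third, entries)
```

In a real deployment the three functions would presumably be wired to boot, suspend, and shutdown hooks, and the flag write would need to be flushed to disk so that it survives a hard power cut; neither detail is described in the report.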
[Bug 2142891] Re: NUC10I7FNK hangs on reboot > 6.8.0-94 & 6.17.0-14
Tried reboot= with acpi, efi and pci without any change. Tried disabling PCIe ASPM in the BIOS, still hanging. However, this works: GRUB_CMDLINE_LINUX_DEFAULT="pcie_aspm=off" -- You received this bug notification because you are subscribed to linux in Ubuntu. Matching subscriptions: Bgg, Bmail, Nb https://bugs.launchpad.net/bugs/2142891 Title: NUC10I7FNK hangs on reboot > 6.8.0-94 & 6.17.0-14 Status in linux package in Ubuntu: New Bug description: System hangs after "rebooting system" with a black screen on the kernels mentioned in the subject; downgrading to 6.8.0-94-generic fixes the problem. Have tested reboot= settings on new kernels with no effect. Let me know if you need details from the logs of the failing kernel; the attached reports are from the working kernel. ProblemType: Bug DistroRelease: Ubuntu 24.04 Package: linux-image-6.8.0-94-generic 6.8.0-94.96 ProcVersionSignature: Ubuntu 6.8.0-94.96-generic 6.8.12 Uname: Linux 6.8.0-94-generic x86_64 AlsaVersion: Advanced Linux Sound Architecture Driver Version k6.8.0-94-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay' ApportVersion: 2.28.1-0ubuntu3.8 Architecture: amd64 ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord' AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D2', '/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: CRDA: N/A Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer' Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer' CasperMD5CheckResult: unknown Date: Fri Feb 27 19:46:25 2026 InstallationDate: Installed on 2025-09-17 (163 days ago) InstallationMedia: Ubuntu-Server 24.04.3 LTS "Noble Numbat" - Release amd64 (20250805.1) IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig' MachineType: Intel(R) Client Systems NUC10i7FNH ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color ProcFB: 0 i915drmfb ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.8.0-94-generic root=/dev/mapper/vg0-ubuntu ro reboot=acpi RelatedPackageVersions: linux-restricted-modules-6.8.0-94-generic N/A linux-backports-modules-6.8.0-94-generic N/A linux-firmware 20240318.git3b128b60-0ubuntu2.25 RfKill: Error: [Errno 2] No such file or directory: 'rfkill' SourcePackage: linux UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 07/20/2022 dmi.bios.release: 5.16 dmi.bios.vendor: Intel Corp. 
dmi.bios.version: FNCML357.0058.2022.0720.1011 dmi.board.asset.tag: Default string dmi.board.name: NUC10i7FNB dmi.board.vendor: Intel Corporation dmi.board.version: M38062-307 dmi.chassis.asset.tag: Default string dmi.chassis.type: 35 dmi.chassis.vendor: Intel Corporation dmi.chassis.version: 2.0 dmi.ec.firmware.release: 3.12 dmi.modalias: dmi:bvnIntelCorp.:bvrFNCML357.0058.2022.0720.1011:bd07/20/2022:br5.16:efr3.12:svnIntel(R)ClientSystems:pnNUC10i7FNH:pvrM38010-308:rvnIntelCorporation:rnNUC10i7FNB:rvrM38062-307:cvnIntelCorporation:ct35:cvr2.0:skuBXNUC10i7FNHN: dmi.product.family: FN dmi.product.name: NUC10i7FNH dmi.product.sku: BXNUC10i7FNHN dmi.product.version: M38010-308 dmi.sys.vendor: Intel(R) Client Systems To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2142891/+subscriptions
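Whether the pcie_aspm=off workaround from the comment above is active on a given boot can be checked against the kernel command line. This is a small Python sketch under stated assumptions: the helper names are hypothetical, and the parser splits on whitespace only, so it does not handle quoted values that contain spaces. The first sample string is the ProcKernelCmdLine value from the apport data in this report.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def aspm_disabled(cmdline):
    """True if the command line carries pcie_aspm=off."""
    return parse_cmdline(cmdline).get("pcie_aspm") == "off"


# Command line from the report's apport data (working 6.8.0-94 kernel).
REPORTED = "BOOT_IMAGE=/vmlinuz-6.8.0-94-generic root=/dev/mapper/vg0-ubuntu ro reboot=acpi"
# The same line with the workaround appended, for comparison.
WITH_WORKAROUND = REPORTED + " pcie_aspm=off"

print(aspm_disabled(REPORTED), aspm_disabled(WITH_WORKAROUND))
```

On a running system the same check could be applied to the contents of /proc/cmdline after updating GRUB and rebooting.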
[Bug 2158074] Re: Dynabook RA/ZYB (AMD ACP 6.3, subsystem 3100:f07b): internal speakers silent, headphones have periodic dropouts
Closing as Invalid. The audio issue was caused by warm reboot from Windows. A full power cycle (shutdown from Windows, then boot Ubuntu) resolves the issue. No kernel changes are needed. ** Changed in: linux (Ubuntu) Status: New => Invalid -- You received this bug notification because you are subscribed to linux in Ubuntu. Matching subscriptions: Bgg, Bmail, Nb https://bugs.launchpad.net/bugs/2158074 Title: Dynabook RA/ZYB (AMD ACP 6.3, subsystem 3100:f07b): internal speakers silent, headphones have periodic dropouts Status in linux package in Ubuntu: Invalid Bug description: On Dynabook RA/ZYB laptop running Ubuntu 26.04 LTS (kernel 7.0.0-22-generic), audio is not working correctly. Hardware: - Audio Coprocessor: AMD ACP 6.3 [1022:15e2] rev 63, subsystem Dynabook Inc. [3100:f07b] - SoundWire codec: Realtek rt722-sdca (sdw:0:0:025d:0722:01) - Driver in use: snd_pci_ps (legacy path, snd_acp_sdw_legacy_mach) Issue 1: Internal speakers completely silent - aplay -l detects Card 1 (amd-soundwire) with devices 0 (SimpleJack) and 2 (SmartAmp) - wpctl status shows "Audio Coprocessor Speaker" as default sink - ALSA control "Speaker Switch" is on, volumes at max (87/87) - speaker-test on hw:1,2 produces no audio even with PipeWire stopped - Suspected cause: subsystem ID 3100:f07b (Dynabook RA/ZYB) is not registered in snd_soc_acpi_amd_sdca_quirks, so the SmartAmp function of the rt722 SDCA codec is not properly initialized for this machine. 
Issue 2: Headphone jack has periodic dropouts (~1 per second) - Audio signal is present but interrupted approximately once per second - dmesg shows: "workqueue: acpi_os_execute_deferred hogged CPU for >10000us" occurring repeatedly, causing PipeWire buffer underruns - No ALSA/audio-specific errors in dmesg Expected: internal speakers and headphones work correctly Actual: internal speakers completely silent, headphones have periodic dropouts A quirk entry for subsystem 3100:f07b is likely needed in sound/soc/amd/ps/acp63-sdca-quirks.c or equivalent. ProblemType: Bug DistroRelease: Ubuntu 26.04 Package: linux-image-7.0.0-22-generic 7.0.0-22.22 ProcVersionSignature: Ubuntu 7.0.0-22.22-generic 7.0.0 Uname: Linux 7.0.0-22-generic x86_64 ApportVersion: 2.34.0-0ubuntu2 Architecture: amd64 AudioDevicesInUse: USER PID ACCESS COMMAND /dev/snd/controlC1: ryuji 15355 F.... pipewire ryuji 15356 F.... wireplumber /dev/snd/controlC0: ryuji 15356 F.... wireplumber /dev/snd/seq: ryuji 15355 F.... pipewire CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Wed Jun 24 18:38:02 2026 InstallationDate: Installed on 2026-06-22 (2 days ago) InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1) MachineType: Dynabook Inc. dynabook RA/ZYB ProcEnviron: LANG=ja_JP.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color ProcFB: 0 amdgpudrmfb ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-22-generic root=UUID=002a38ae-4aa4-4550-918d-24be59ef4a54 ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon. SourcePackage: linux UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 03/19/2026 dmi.bios.release: 1.80 dmi.bios.vendor: Dynabook Inc. dmi.bios.version: Version 1.80 dmi.board.asset.tag: 0000000000 dmi.board.name: OK012C/0000 dmi.board.vendor: Dynabook Inc. 
dmi.board.version: Version A0 dmi.chassis.asset.tag: 0000000000 dmi.chassis.type: 10 dmi.chassis.vendor: Dynabook Inc. dmi.chassis.version: Version 1.0 dmi.ec.firmware.release: 1.40 dmi.modalias: dmi:bvnDynabookInc.:bvrVersion1.80:bd03/19/2026:br1.80:efr1.40:svnDynabookInc.:pndynabookRA/ZYB:pvrW6RAZY7BAH:rvnDynabookInc.:rnOK012C/0000:rvrVersionA0:cvnDynabookInc.:ct10:cvrVersion1.0:skuPGA10N:pfadynabook: dmi.product.family: dynabook dmi.product.name: dynabook RA/ZYB dmi.product.sku: PGA10N dmi.product.version: W6RAZY7BAH dmi.sys.vendor: Dynabook Inc. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158074/+subscriptions
[Bug 2142891] Re: NUC10I7FNK hangs on reboot > 6.8.0-94 & 6.17.0-14
Checked linux-image-6.8.0-124-generic today, and the problem is still present. Stuck on linux-image-6.8.0-94-generic, as that is the last kernel I have found that does not hang with a black screen before the system reboots to UEFI log. -- You received this bug notification because you are subscribed to linux in Ubuntu. Matching subscriptions: Bgg, Bmail, Nb https://bugs.launchpad.net/bugs/2142891 Title: NUC10I7FNK hangs on reboot > 6.8.0-94 & 6.17.0-14 Status in linux package in Ubuntu: New To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2142891/+subscriptions
[Bug 2154075] Re: [REGRESSION] ASUS Zenbook S16 UM5606GA (Ryzen AI 9 HX470) graphical freeze before login with Ubuntu 26.04 kernel 7.0.x, fixed by kernel 6.17
Does sound function well with 6.18.7-760061807? While the display functions well with Ubuntu 24.04 kernel (on Ubuntu 26.04), sound does not. I cannot issue a bug on that, because Ubuntu's bug reporting tool refuses to run if the kernel does not match the distribution. Thanks! -- You received this bug notification because you are subscribed to linux in Ubuntu. Matching subscriptions: Bgg, Bmail, Nb https://bugs.launchpad.net/bugs/2154075 Title: [REGRESSION] ASUS Zenbook S16 UM5606GA (Ryzen AI 9 HX470) graphical freeze before login with Ubuntu 26.04 kernel 7.0.x, fixed by kernel 6.17 Status in linux package in Ubuntu: Confirmed Bug description: System experiences a reproducible graphical freeze during boot on Ubuntu 26.04 when using Ubuntu kernel 7.0.x. Hardware: * ASUS Zenbook S16 UM5606GA * AMD Ryzen AI 9 HX470 * integrated AMD graphics * BIOS version: UM5606GA.305 * BIOS date: 2026-02-25 * Secure Boot enabled Affected kernels: * 7.0.0-15-generic (fails) Known-good kernels: * 6.17.0-29-generic (works correctly) The issue appears shortly before the graphical login prompt should appear. Symptoms: * display freezes before login screen * mouse cursor may briefly move after resume from suspend * GUI remains visually frozen * keyboard and GUI actions occasionally appear after subsequent wake events * SSH access continues to function normally * system does not panic or reboot This suggests the kernel remains operational while the graphics/display stack hangs. Environment: * Ubuntu 26.04 userspace * GNOME * both Wayland and Xorg tested * same Mesa userspace for both working and failing kernels Regression evidence: The following configuration changes were tested and reverted unless otherwise noted: 1. Booting Ubuntu 26.04 live environment using "Safe graphics" Result: * works correctly 2. Booting installed system with: nomodeset Result: * works correctly * accelerated graphics unavailable 3. 
GRUB: set gfxpayload=keep Result: * Ubuntu live environment boots correctly in safe graphics mode 4. Kernel parameter: amdgpu.dcdebugmask=0x12 Result: * no improvement 5. Kernel parameter: amdgpu.mes=0 Result: * no improvement 6. Kernel parameter: amdgpu.dc=0 Result: * system freezes earlier during splash screen * SSH remains operational 7. Kernel parameter: initcall_blacklist=simpledrm_platform_driver_init Result: * no improvement 8. Kernel parameter: pcie_aspm=off Result: * no improvement 9. Forced Xorg instead of Wayland Result: * same behavior 10. Replaced /lib/firmware/amdgpu firmware files with Ubuntu 24.04 versions Result: * no improvement * reverted afterward 11. Ubuntu 24.04 live environment Result: * boots correctly without safe graphics 12. Installed Ubuntu kernel: 6.17.0-29-generic Packages installed: * linux-headers-6.17.0-29-generic * linux-image-6.17.0-29-generic * linux-modules-6.17.0-29-generic * linux-tools-6.17.0-29-generic Result: * fully stable system * accelerated graphics functional * graphical login works correctly * suspend/resume functional Current working kernel: Linux aeon 6.17.0-29-generic #29-Ubuntu SMP PREEMPT_DYNAMIC Tue May 5 19:42:34 UTC 2026 x86_64 GNU/Linux Conclusion: This appears to be a regression introduced in Ubuntu kernel 7.0.x affecting AMDGPU graphics initialization or display handling on Ryzen AI 9 HX470 / Strix Point hardware. The issue does not appear specific to: * GNOME * Wayland * Xorg * userspace Mesa because the same userspace environment works correctly under kernel 6.17. Suspected subsystem: * amdgpu * DRM/DCN * DMUB/DMCUB * modesetting * display core * Ryzen AI / Strix Point graphics enablement Potential duplicates / related reports: 1. Bug #2148753 linux-firmware-amd-graphics graphical boot regression on Ryzen AI ASUS Zenbook systems https://bugs.launchpad.net/ubuntu/+source/linux-firmware-amd-graphics/+bug/2148753 Mirror: https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg6269039.html 2. 
Bug #2147541 Ryzen AI platform regression under Ubuntu 26.04 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2147541 ProblemType: Bug DistroRelease: Ubuntu 26.04 Package: linux-image-7.0.0-15-generic 7.0.0-15.15 ProcVersionSignature: Ubuntu 7.0.0-15.15-generic 7.0.0 Uname: Linux 7.0.0-15-generic x86_64 ApportVersion: 2.34.0-0ubuntu2 Architecture: amd64 AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: CasperMD5CheckResult: pass Date: Sat May 23 17:32:03 2026 InstallationDate: Installed on 2026-05-15 (8 days ago) InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1) MachineType: ASUS Zenbook S16 UM5606GA ProcEnviron: LANG=en_US.UTF-8 PATH=(custom, no user) SHELL=/bin/bash TERM=xterm-256color XDG_RUNTIME_DIR=<set> ProcFB: 0 amdgpudrmfb ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-7.0.0-15-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon. SourcePackage: linux UpgradeStatus: No upgrade log present (probably fresh install) dmi.bios.date: 02/25/2026 dmi.bios.release: 5.35 dmi.bios.vendor: American Megatrends International, LLC. dmi.bios.version: UM5606GA.305 dmi.board.asset.tag: ATN12345678901234567 dmi.board.name: UM5606GA dmi.board.vendor: ASUSTeK COMPUTER INC. dmi.board.version: 1.0 dmi.chassis.asset.tag: No Asset Tag dmi.chassis.type: 10 dmi.chassis.vendor: ASUSTeK COMPUTER INC. 
dmi.chassis.version: 1.0 dmi.ec.firmware.release: 3.3 dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvrUM5606GA.305:bd02/25/2026:br5.35:efr3.3:svnASUS:pnZenbookS16UM5606GA:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnUM5606GA:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:sku:pfaZenbookS16: dmi.product.family: Zenbook S16 dmi.product.name: Zenbook S16 UM5606GA dmi.product.version: 1.0 dmi.sys.vendor: ASUS To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2154075/+subscriptions