** Tags added: kernel-daily-bug

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2158614

Title:
  amdgpu: Granite Ridge iGPU gfx ring timeout after Chrome GPU page fault causes gnome-shell SIGABRT and session loss

Status in linux package in Ubuntu:
  New

Bug description:

What happened
=============

During a Microsoft Teams video call in Google Chrome, the entire GNOME session was terminated at approximately 09:54 CEST on 2026-06-29. The system did not reboot: the kernel remained running and GDM presented the login screen after the session was lost.

Failure sequence (from journalctl -b 0)
=======================================

*** 09:53:56 - Chrome triggers AMD GPU page fault (PERMISSION_FAULTS: read+write denied)

kernel: amdgpu 0000:70:00.0: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:75)
kernel: amdgpu 0000:70:00.0: Process chrome pid 14088 thread chrome:cs0 pid 14114
kernel: amdgpu 0000:70:00.0: in page starting at address 0x000000003f919000 from client 0x1b (UTCL2)
kernel: amdgpu 0000:70:00.0: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501430
kernel: amdgpu 0000:70:00.0: Faulty UTCL2 client ID: SQC (data) (0xa)
kernel: amdgpu 0000:70:00.0: PERMISSION_FAULTS: 0x3

*** 09:53:58 - gfx_0.1.0 ring times out; gnome-shell is a victim on the same ring; ring reset fails; full GPU reset triggered

kernel: amdgpu 0000:70:00.0: ring gfx_0.1.0 timeout, signaled seq=93148, emitted seq=93149
kernel: amdgpu 0000:70:00.0: Process gnome-shell pid 8975 thread gnome-shel:cs0 pid 8992
kernel: amdgpu 0000:70:00.0: Starting gfx_0.1.0 ring reset
kernel: amdgpu 0000:70:00.0: Ring gfx_0.1.0 reset failed
kernel: amdgpu 0000:70:00.0: GPU reset begin!. Source: 1

Note: Vulkan was already reporting VK_SUBOPTIMAL_KHR on the swapchain from gnome-text-editor at 09:52:19 and 09:53:12, suggesting the GPU was under stress for ~2 minutes before the crash.
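As a cross-check on the 09:53:56 page fault, the GCVM_L2_PROTECTION_FAULT_STATUS value can be split into its bit fields. The sketch below is illustrative only: the field layout (shifts and widths) is an assumption based on the amdgpu GC register headers and is not stated anywhere in this report, and `decode_fault_status` is a hypothetical helper name.

```python
# Minimal sketch: split the logged fault status into bit fields.
# ASSUMPTION: the (name, shift, width) layout below follows the amdgpu GC
# register headers; it is not taken from this bug report.
FIELDS = [
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),
    ("RW", 18, 1),
    ("ATOMIC", 19, 1),
    ("VMID", 20, 4),
]


def decode_fault_status(status):
    """Return a dict mapping each assumed field name to its value."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


if __name__ == "__main__":
    # Value copied from the kernel log line in this report.
    for name, value in decode_fault_status(0x00501430).items():
        print(f"{name}: {value:#x}")
```

Under this assumed layout, the CID, PERMISSION_FAULTS and VMID fields decode to 0xa, 0x3 and 5, which agrees with the "Faulty UTCL2 client ID", "PERMISSION_FAULTS" and "vmid:5" values the kernel printed.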
*** 09:53:59-09:54:00 - GPU MODE2 reset completes, hardware recovers, but gnome-shell loses its rendering context

kernel: amdgpu 0000:70:00.0: MODE2 reset
kernel: amdgpu 0000:70:00.0: GPU reset succeeded, trying to resume
kernel: amdgpu 0000:70:00.0: GPU reset(1) succeeded!
kernel: amdgpu 0000:70:00.0: [drm] device wedged, but recovered through reset
gnome-shell[8975]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.

*** 09:54:04 - gnome-shell crashes (SIGABRT / core dump), taking the Wayland compositor down

systemd[8328]: org.gnome.Shell@ubuntu.service: Main process exited, code=dumped, status=6/ABRT

All Wayland clients lose their connection simultaneously:

gnome-text-editor[62307]: Lost connection to Wayland compositor.
google-chrome.desktop[14032]: Gdk-Message: Lost connection to Wayland compositor.
ptyxis[10929]: Lost connection to Wayland compositor.
gnome-calendar[11623]: Lost connection to Wayland compositor.
gnome-shell[10351]: (EE) failed to read Wayland events: Broken pipe

*** 09:54:05 - GNOME session is torn down; GDM login screen appears

gdm-password][7929]: pam_unix(gdm-password:session): session closed for user albert
systemd-logind[3000]: Removed session 2.
gdm-launch-environment][62675]: pam_unix(gdm-launch-environment:session): session opened for user gdm-greeter

A GPU coredump was created by the kernel:

/sys/class/drm/card1/device/devcoredump/data

Root cause analysis
===================

1. Chrome (pid 14088) submitted a GPU command buffer that caused a UTCL2 page fault (bad virtual address mapping, read+write permission violation).
2. This stalled the gfx_0.1.0 command ring, which gnome-shell (pid 8975) was also submitting work to.
3. The ring-level reset failed, requiring a full MODE2 GPU reset.
4. The hardware reset succeeded, but gnome-shell (the Wayland compositor) cannot survive losing its Vulkan/GL context mid-frame and aborted.
5. With the Wayland compositor gone, the entire user session was terminated.
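The sequence in the analysis above can be checked mechanically against a saved journal. The following is a minimal sketch, assuming the journal has been exported as plain text lines; the stage names, the `find_stages` helper and the `SAMPLE` list are hypothetical, and the patterns are derived only from the messages quoted in this report, so other kernel or driver versions may word them differently.

```python
import re

# Stage patterns derived from the log lines quoted in this report.
# ASSUMPTION: message wording is as shown here; it may differ elsewhere.
STAGES = [
    ("page_fault", re.compile(r"amdgpu .*\[gfxhub\] page fault")),
    ("ring_timeout", re.compile(r"amdgpu .*ring (\S+) timeout")),
    ("ring_reset_failed", re.compile(r"amdgpu .*Ring (\S+) reset failed")),
    ("gpu_reset", re.compile(r"amdgpu .*GPU reset begin")),
    ("context_lost", re.compile(r"The CS has cancelled because the context is lost")),
    ("compositor_abort", re.compile(r"org\.gnome\.Shell@.*status=6/ABRT")),
]

# A few lines copied from the failure sequence, in journal order.
SAMPLE = [
    "kernel: amdgpu 0000:70:00.0: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:75)",
    "kernel: amdgpu 0000:70:00.0: ring gfx_0.1.0 timeout, signaled seq=93148, emitted seq=93149",
    "kernel: amdgpu 0000:70:00.0: Starting gfx_0.1.0 ring reset",
    "kernel: amdgpu 0000:70:00.0: Ring gfx_0.1.0 reset failed",
    "kernel: amdgpu 0000:70:00.0: GPU reset begin!. Source: 1",
    "gnome-shell[8975]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.",
    "systemd[8328]: org.gnome.Shell@ubuntu.service: Main process exited, code=dumped, status=6/ABRT",
]


def find_stages(lines):
    """Return stage names in the order each one first appears in lines."""
    seen = []
    for line in lines:
        for name, pattern in STAGES:
            if name not in seen and pattern.search(line):
                seen.append(name)
    return seen


if __name__ == "__main__":
    print(find_stages(SAMPLE))
```

Running the same scan over the output of `journalctl -b 0` from another affected boot would show whether that boot followed the same order of events or stopped at an earlier stage, for example a ring reset that succeeded.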
The core issue is that one process (Chrome) triggering a GPU fault can destroy the GPU context of an unrelated innocent process (gnome-shell), resulting in total session loss. Ideally either:

* the ring reset should succeed without a full GPU reset, or
* gnome-shell should be able to recover from context loss after a GPU reset without crashing.

Steps to reproduce
==================

Reliably reproducible conditions are not yet confirmed, but the crash occurred during:

* Active Google Chrome usage with Teams video call (hardware-accelerated video decode active)
* gnome-shell simultaneously rendering the desktop via the same gfx ring

Additional notes
================

* No system reboot occurred; the kernel and all non-GNOME processes survived.
* The AMD GPU firmware is from 2026-03-19 (20260319.git217ca6e4). Newer upstream firmware from AMD may address the page fault or ring reset failure.
* GPU device ID: 1002:13c0 (Granite Ridge / Ryzen 9000 series iGPU)

ProblemType: Bug
DistroRelease: Ubuntu 26.04
Package: linux-image-7.0.0-27-generic 7.0.0-27.27
ProcVersionSignature: Ubuntu 7.0.0-27.27-generic 7.0.6
Uname: Linux 7.0.0-27-generic x86_64
ApportVersion: 2.34.0-0ubuntu2
Architecture: amd64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC1:  albert     66499 F.... wireplumber
 /dev/snd/controlC0:  albert     66499 F.... wireplumber
 /dev/snd/seq:        albert     66498 F.... pipewire
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Mon Jun 29 12:50:22 2026
InstallationDate: Installed on 2026-06-22 (7 days ago)
InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Release amd64 (20260423.1)
MachineType: Gigabyte Technology Co., Ltd. X870M AORUS ELITE WIFI7
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/bash
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-7.0.0-27-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro quiet splash crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/09/2026
dmi.bios.release: 5.41
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: F10a
dmi.board.asset.tag: Default string
dmi.board.name: X870M AORUS ELITE WIFI7
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvrF10a:bd06/09/2026:br5.41:svnGigabyteTechnologyCo.,Ltd.:pnX870MAORUSELITEWIFI7:pvrDefaultstring-WCP-ADO:rvnGigabyteTechnologyCo.,Ltd.:rnX870MAORUSELITEWIFI7:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:skuDefaultstring:pfaX870MB:
dmi.product.family: X870 MB
dmi.product.name: X870M AORUS ELITE WIFI7
dmi.product.sku: Default string
dmi.product.version: Default string-WCP-ADO
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2158614/+subscriptions