Sunday

[Bug 2150605] Re: `i915 Arrow Lake-S: PHY A / C10 DPLL state mismatch on resume from long s2idle dwell — slow wake (5-10s) with retry storm`

This appears to affect Meteor Lake as well, not just Arrow Lake-S; I'm seeing the same issue on a Dell Pro Max 16. It appears to be a regression since I didn't see this issue on 6.17.0-29, but I do see it on 7.0.0-15-generic.

## System

- **Device**: Dell Pro Max 16 Premium MA16250
- **Motherboard**: Dell 085XPR
- **CPU/iGPU**: Intel Core Ultra 7 265H (Meteor Lake)
- **BIOS**: 1.9.0 (2026-03-31)
- **Kernel**: 7.0.0-15-generic (Ubuntu 26.04)
- **Display**: Internal eDP panel (DDI A / pipe A)

## Hardware IDs

```
DMI: Dell Inc. Dell Pro Max 16 Premium MA16250/085XPR, BIOS 1.9.0 03/31/2026
SMBIOS 3.8 present
iGPU: pci 0000:00:02.0 [8086:7d51] - Meteor Lake integrated display version 14.00 stepping D0
dGPU: pci 0000:01:00.0 [10de:2d39] - NVIDIA (hybrid graphics, PCIe x8 @ 32 GT/s)
```

## i915 Driver & Firmware

```
i915 version: 1.6.0
DMC firmware: i915/mtl_dmc.bin v2.23
GuC firmware: i915/mtl_guc_70.bin v70.53.0
HuC firmware: i915/mtl_huc_gsc.bin v8.5.4
GSC firmware: i915/mtl_gsc_1.bin (cv1.0, r102.1.15.1926)
```

## Symptoms

Appear to match the reported issue very closely:

- Short s2idle dwells (minutes): instant wake, no errors
- Long s2idle dwells (4+ hours): ~30-60 second resume delay with PHY/DPLL error storm
- Display eventually recovers
- Appeared to be a hard hang on display powersave before setting i915.enable_psr=0

## Timing

```
[65004.726228] PM: suspend entry (s2idle)
[80437.248708] PM: suspend exit
```

Sleep duration: ~4.3 hours
Resume delay: ~57 seconds (from first resume activity at 80380 to PM: suspend exit)

## Log Excerpt (resume sequence)

```
[80392.232107] i915 0000:00:02.0: [drm] *ERROR* Failed to bring PHY A to idle.
[80392.240108] i915 0000:00:02.0: [drm] *ERROR* PHY A Read 0c70 failed after 3 retries.
[80392.249106] i915 0000:00:02.0: [drm] *ERROR* PHY A Write 0c70 failed after 3 retries.
[80392.474869] i915 0000:00:02.0: [drm] *ERROR* Timeout waiting for DDI BUF A to get active
[80393.428861] i915 0000:00:02.0: [drm] *ERROR* Timed out waiting for DP idle patterns
[80403.757549] i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] flip_done timed out
[80403.967091] i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] mismatch in pixel_rate (expected 317492, found 36124)
[80403.967102] i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] mismatch in dpll_hw_state
[80403.967106] i915 0000:00:02.0: [drm] *ERROR* cx0pll_hw_state: lane_count: 2, ssc_enabled: no, use_c10: yes, tbt_mode: no
[80403.967109] i915 0000:00:02.0: [drm] *ERROR* c10pll_hw_state: clock: 540000, fracen: yes,
[80403.967111] i915 0000:00:02.0: [drm] *ERROR* quot: 40960, rem: 0, den: 1,
[80403.967112] i915 0000:00:02.0: [drm] *ERROR* multiplier: 140, tx_clk_div: 0.
[80403.967127] i915 0000:00:02.0: [drm] *ERROR* found:
[80403.967128] i915 0000:00:02.0: [drm] *ERROR* cx0pll_hw_state: lane_count: 2, ssc_enabled: no, use_c10: yes, tbt_mode: no
[80403.967130] i915 0000:00:02.0: [drm] *ERROR* c10pll_hw_state: clock: 61440, fracen: no,
[80403.967132] i915 0000:00:02.0: [drm] *ERROR* multiplier: 16, tx_clk_div: 0.
[80403.967146] i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] mismatch in port_clock (expected 540000, found 61440)
[80403.967159] i915 0000:00:02.0: [drm] pipe state doesn't match!
[80404.177686] i915 0000:00:02.0: [drm] DPLL 0: pll hw state mismatch
```

Note the same C10 PLL state mismatch pattern:

- **Expected**: clock 540000, multiplier 140, PLL registers populated
- **Found**: clock 61440, multiplier 16, PLL registers all zeros

The ~10 second gap between `[80393] Timed out waiting for DP idle patterns` and `[80403] flip_done timed out` matches the flip_done timeout constant.

Issue reproduced on two separate occasions (2026-05-24 and 2026-05-25) with identical symptoms.

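The timing figures quoted in this comment can be re-derived from the kernel timestamps. The following is a minimal sketch, assuming dmesg-style lines with a `[seconds.microseconds]` prefix; the helper name `stamp` is hypothetical, and the `80380` first-resume value is the approximate figure from the comment text, not a quoted log line.

```python
import re

# Matches the "[80392.232107]" prefix that dmesg puts in front of each line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def stamp(line: str) -> float:
    """Return the kernel timestamp (seconds since boot) of a dmesg line."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError(f"no timestamp in: {line!r}")
    return float(match.group(1))


entry = stamp("[65004.726228] PM: suspend entry (s2idle)")
exit_ = stamp("[80437.248708] PM: suspend exit")
idle_timeout = stamp(
    "[80393.428861] i915 0000:00:02.0: [drm] *ERROR* "
    "Timed out waiting for DP idle patterns"
)
flip_timeout = stamp(
    "[80403.757549] i915 0000:00:02.0: [drm] *ERROR* "
    "[CRTC:150:pipe A] flip_done timed out"
)

# Assumption: approximate "first resume activity" value quoted in the comment.
first_resume_activity = 80380.0

sleep_hours = (exit_ - entry) / 3600        # suspend entry to suspend exit
resume_delay = exit_ - first_resume_activity
flip_gap = flip_timeout - idle_timeout      # gap attributed to flip_done timeout

print(f"suspend entry to exit: {sleep_hours:.2f} h")
print(f"resume delay: {resume_delay:.0f} s")
print(f"idle-pattern timeout to flip_done timeout: {flip_gap:.1f} s")
```

Note that the entry-to-exit span includes the slow resume itself, so the actual dwell is slightly shorter than the printed figure.
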
## Kernel Command Line

```
BOOT_IMAGE=/vmlinuz-7.0.0-15-generic root=/dev/mapper/vgubuntu-root ro quiet splash intel_iommu=on iommu=pt i915.enable_psr=0
```

PSR is already disabled (`i915.enable_psr=0`), so this is not a PSR-related issue. Before disabling PSR I was seeing failure to recover from display suspend, another regression since 6.17.0-29.

--
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2150605

Title:
`i915 Arrow Lake-S: PHY A / C10 DPLL state mismatch on resume from long s2idle dwell — slow wake (5-10s) with retry storm`

Status in linux package in Ubuntu:
Confirmed

Bug description:

This is what an investigation using Claude Code yielded regarding a wake-up from sleep issue:

On HP ZBook Fury G1i 16 (Arrow Lake-S, integrated display engine identifying as meteorlake D0, PCI 8086:7d67), every resume from s2idle after a multi-hour dwell produces a stack of i915 *ERROR* messages from the C10 PHY / DPLL state-restore path. The driver retries and eventually recovers, so the display *does* come back, but the retry loop takes ~5-10 seconds, long enough that users report "the display didn't wake up." Short cycles (seconds-to-minutes of dwell) wake instantly with no errors.

The error stack is byte-for-byte identical across two reproductions and across `i915.enable_psr=0`, `i915.enable_dc=0`, and `i915.enable_fbc=0`. Those flags do not reach the code path that's racing.

Earlier, before `i915.enable_psr=0` was applied, the same regime caused hard hangs (`Atomic update failure on pipe A`) requiring power-cycle. PSR-disable converted the failure mode from "hang" to "slow recover" but did not eliminate it.

`xe.force_probe=7d67` was also tested as a workaround. xe binds cleanly on this device but suffers a different bug: `Tile0: GT0: Engine reset engine_class=rcs guc_id=48 state=0x289` repeating across each suspend cycle (`drm_WARN_ON_ONCE(ret == -110)`), eventually wedging the display. Not viable as a workaround on this kernel.

NVIDIA dGPU is fully exonerated: `nvidia-suspend.service` and `nvidia-resume.service` `Finished` cleanly across every cycle.

## System

* Distro: Ubuntu 26.04 LTS (resolute)
* Kernel: 7.0.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Mon Apr 13 11:09:53 UTC 2026 x86_64
* Package: linux-image-7.0.0-14-generic 7.0.0-14.14
* Firmware pkg: linux-firmware 20260319.git217ca6e4.1ubuntu
* HW: HP ZBook Fury G1i 16 inch Mobile Workstation PC, BIOS X96 Ver. 01.01.19 (2025-11-22) (latest per fwupdmgr)
* iGPU: Intel Corporation Arrow Lake-S [Intel Graphics] [8086:7d67] (rev 06), driver: i915
* dGPU: NVIDIA Corporation GB205GLM [RTX PRO 3000 Blackwell Generation Laptop GPU] [10de:2f38] (rev a1), driver: nvidia 580.142 (open kernel modules)
* Loaded i915 firmware: mtl_dmc.bin (v2.23), mtl_guc_70.bin v70.53.0, mtl_huc_gsc.bin v8.5.4
* Sleep mode: s2idle only (`ACPI: PM: (supports S0 S4 S5)`; firmware does not expose S3)
* Kernel cmdline (current): `quiet splash i915.enable_psr=0 i915.enable_dc=0 zswap.enabled=1 zswap.compressor=zstd zswap.zpool=zsmalloc zswap.max_pool_percent=20 i915.enable_fbc=0`
* Session: KDE Plasma on Wayland (sddm)

## Reproduction

1. Boot, log in, do normal work (browser, IDE, terminals).
2. Close laptop lid (or `systemctl suspend`) for ≥2 hours.
3. Open lid / press a key.

Expected: panel relights within ~500 ms, no kernel ERRORs.
Actual: panel relights after ~5-10 s, journal contains the error stack below.

## Journal trace (boot 2026-04-29 05:48, dwell 05:51:11 → 07:59:21 = 2 h 8 min)

```
PM: suspend entry (s2idle)
PM: Some devices failed to suspend, or early wake event detected
PM: suspend exit
PM: suspend entry (s2idle)
[2 h 8 min later]
PM: suspend exit
i915 0000:00:02.0: [drm] *ERROR* Failed to bring PHY A to idle.
i915 0000:00:02.0: [drm] *ERROR* PHY A Read 0c70 failed after 3 retries.
i915 0000:00:02.0: [drm] *ERROR* PHY A Write 0c70 failed after 3 retries.
i915 0000:00:02.0: [drm] *ERROR* Timeout waiting for DDI BUF A to get active
i915 0000:00:02.0: [drm] *ERROR* Timed out waiting for DP idle patterns
i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] flip_done timed out
i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] mismatch in pixel_rate (expected 1220171, found 92553)
i915 0000:00:02.0: [drm] *ERROR* [CRTC:150:pipe A] mismatch in dpll_hw_state
i915 0000:00:02.0: [drm] *ERROR* expected:
i915 0000:00:02.0: [drm] *ERROR* cx0pll_hw_state: lane_count: 4, ssc_enabled: no, use_c10: yes, tbt_mode: no
i915 0000:00:02.0: [drm] *ERROR* c10pll_hw_state: clock: 810000, fracen: yes,
i915 0000:00:02.0: [drm] *ERROR* quot: 61440, rem: 0, den: 1,
i915 0000:00:02.0: [drm] *ERROR* multiplier: 210, tx_clk_div: 0.
i915 0000:00:02.0: [drm] *ERROR* found:
i915 0000:00:02.0: [drm] *ERROR* c10pll_hw_state: clock: 61440, fracen: no,
i915 0000:00:02.0: [drm] *ERROR* multiplier: 16, tx_clk_div: 0.
```

The expected vs found C10 PLL clock (810 MHz HBR3 vs 61 MHz fallback) is the key signal: the panel's eDP link comes back at the wrong rate, the driver retries and eventually re-locks at the correct rate. This is in `drivers/gpu/drm/i915/display/intel_cx0_phy.c` / DPLL state restore, below the layers reachable by `enable_psr` / `enable_dc` / `enable_fbc`.

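To make the expected-versus-found comparison mechanical, the `key: value` pairs in the state dump can be extracted and diffed. This is a rough sketch that assumes the message layout seen in the journal trace (an `expected:` marker and a `found:` marker, each followed by `*_hw_state` lines); `parse_pll_dump` is a hypothetical helper, not part of any i915 tooling.

```python
import re

TRACE = """\
i915 0000:00:02.0: [drm] *ERROR* expected:
i915 0000:00:02.0: [drm] *ERROR* cx0pll_hw_state: lane_count: 4, ssc_enabled: no, use_c10: yes, tbt_mode: no
i915 0000:00:02.0: [drm] *ERROR* c10pll_hw_state: clock: 810000, fracen: yes,
i915 0000:00:02.0: [drm] *ERROR* quot: 61440, rem: 0, den: 1,
i915 0000:00:02.0: [drm] *ERROR* multiplier: 210, tx_clk_div: 0.
i915 0000:00:02.0: [drm] *ERROR* found:
i915 0000:00:02.0: [drm] *ERROR* c10pll_hw_state: clock: 61440, fracen: no,
i915 0000:00:02.0: [drm] *ERROR* multiplier: 16, tx_clk_div: 0.
"""

PAIR = re.compile(r"(\w+): (\w+)")          # one "key: value" field
STRUCT_PREFIX = re.compile(r"^\w+_hw_state: ")  # e.g. "c10pll_hw_state: "


def parse_pll_dump(text):
    """Collect key/value fields under the 'expected:' and 'found:' markers."""
    states = {"expected": {}, "found": {}}
    section = None
    for line in text.splitlines():
        # Keep only the message part after the *ERROR* tag.
        body = line.split("*ERROR*", 1)[-1].strip()
        marker = body.rstrip(":")
        if marker in states:
            section = marker
            continue
        if section is None:
            continue  # ignore anything before the first marker
        body = STRUCT_PREFIX.sub("", body)
        for key, value in PAIR.findall(body):
            states[section][key] = value
    return states


states = parse_pll_dump(TRACE)
# Fields present on both sides whose values differ.
mismatches = {
    key: (value, states["found"][key])
    for key, value in states["expected"].items()
    if key in states["found"] and states["found"][key] != value
}

for key, (expected, found) in mismatches.items():
    print(f"{key}: expected {expected}, found {found}")
```

Values are compared as strings, which is enough to flag differing fields; fields that appear only on one side of the dump are left out of the comparison.
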
## What I have already tried

| change | wake outcome on long dwell |
|---|---|
| (default) | hard hang, `Atomic update failure on pipe A`, requires power-cycle |
| `i915.enable_psr=0` (only) | clean wake on short cycles, untested at long dwell |
| `i915.enable_psr=0 i915.enable_dc=0` | slow recover (~5-10 s) at 6 h 41 min dwell, full error stack |
| `... i915.enable_fbc=0` (added) | same: slow recover, identical error stack at 2 h 8 min dwell |
| `i915.force_probe=!7d67 xe.force_probe=7d67` | xe binds cleanly, but `GT0: Engine reset` storm during cycles, eventual wedge |

linux-firmware is at the latest candidate for 26.04 (20260319.git217ca6e4.1ubuntu). linux-image-oem-26.04 does not exist yet.

## Why this matters

This is a clean reproduction of a likely-upstream Arrow Lake-S regression in the cx0/c10pll DPLL state-restore path during s2idle resume. The error signature is highly diagnostic and consistent across runs. Fix has not landed in 7.0.0-14.14; would benefit from being picked into the 26.04 kernel from upstream once available.

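Because the error signature is consistent across runs, other affected users could check their own kernel log for it. The sketch below is only an illustration: the `SIGNATURE` list is copied from the messages in this report, `matches_signature` is a hypothetical helper, and the input is assumed to be plain text such as the output of `journalctl -k -b`.

```python
# Messages from this report, in the order they appear in the resume trace.
SIGNATURE = [
    "Failed to bring PHY A to idle",
    "PHY A Read 0c70 failed after 3 retries",
    "PHY A Write 0c70 failed after 3 retries",
    "Timeout waiting for DDI BUF A to get active",
    "Timed out waiting for DP idle patterns",
    "flip_done timed out",
    "mismatch in dpll_hw_state",
]


def matches_signature(log_text, signature=SIGNATURE):
    """Return True if every signature message occurs in log_text, in order."""
    position = 0
    for needle in signature:
        found = log_text.find(needle, position)
        if found == -1:
            return False
        position = found + len(needle)
    return True


sample = "\n".join(
    f"i915 0000:00:02.0: [drm] *ERROR* {message}" for message in SIGNATURE
)
print(matches_signature(sample))               # full stack, in order
print(matches_signature("PM: suspend exit"))   # clean resume
```

Unrelated lines between the signature messages do not affect the match, so the full kernel log of a boot can be passed in unfiltered.
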
ProblemType: Bug
DistroRelease: Ubuntu 26.04
Package: linux-image-7.0.0-14-generic 7.0.0-14.14
ProcVersionSignature: Ubuntu 7.0.0-14.14-generic 7.0.0
Uname: Linux 7.0.0-14-generic x86_64
ApportVersion: 2.34.0-0ubuntu2
Architecture: amd64
CasperMD5CheckMismatches: ./boot/grub/i386-pc/eltorito.img
CasperMD5CheckResult: fail
CurrentDesktop: KDE
Date: Wed Apr 29 08:18:09 2026
InstallationDate: Installed on 2026-03-27 (33 days ago)
InstallationMedia: Ubuntu 26.04 "Resolute Raccoon" - Daily amd64 (20260325)
MachineType: HP HP ZBook Fury G1i 16 inch Mobile Workstation PC
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-7.0.0-14-generic root=UUID=2b2b39cc-2151-49f5-9205-74e3f9f1f999 ro quiet splash i915.enable_psr=0 i915.enable_dc=0 zswap.enabled=1 zswap.compressor=zstd zswap.zpool=zsmalloc zswap.max_pool_percent=20 i915.enable_fbc=0 crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/22/2025
dmi.bios.release: 1.19
dmi.bios.vendor: HP
dmi.bios.version: X96 Ver. 01.01.19
dmi.board.name: 8DE2
dmi.board.vendor: HP
dmi.board.version: KBC Version 55.35.00
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.ec.firmware.release: 85.53
dmi.modalias: dmi:bvnHP:bvrX96Ver.01.01.19:bd11/22/2025:br1.19:efr85.53:svnHP:pnHPZBookFuryG1i16inchMobileWorkstationPC:pvrSBKPFV3:rvnHP:rn8DE2:rvrKBCVersion55.35.00:cvnHP:ct10:cvr:skuB14E7AV:pfa103C_5336ANHPZBook:
dmi.product.family: 103C_5336AN HP ZBook
dmi.product.name: HP ZBook Fury G1i 16 inch Mobile Workstation PC
dmi.product.sku: B14E7AV
dmi.product.version: SBKPFV3
dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2150605/+subscriptions
