Thursday

[Bug 2152281] Re: net: wwan: t7xx: soft lockup in dpmaif_tx_hw_push_thread during system suspend

Update patch file

** Patch added: "0001-net-wwan-t7xx-fix-race-between-TX-thread-and-system-.patch"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2152281/+attachment/5970003/+files/0001-net-wwan-t7xx-fix-race-between-TX-thread-and-system-.patch

-- 
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2152281

Title:
  net: wwan: t7xx: soft lockup in dpmaif_tx_hw_push_thread during system suspend

Status in linux package in Ubuntu:
  New

Bug description:
  == Summary ==

  A soft lockup (system freeze) occurs in the DPMAIF TX kernel thread
  when system suspend (s2idle) is triggered while the thread is active.

  == Environment ==

  Kernel: 6.17.0-1017-oem (Ubuntu OEM)
  Device: MediaTek MT6880 (FM350-GL) 0000:56:00.0
  ASPM:   L1 Enabled on endpoint (LnkCtl: ASPM L1 Enabled; ClockPM+)
  Sleep:  s2idle

  == Reproduction ==

  Trigger: SIM registered + ASPM L1 enabled + repeated suspend/resume
  (hundreds of cycles)

  Reproduction script (60s cycle):

    while true; do
        systemctl suspend
        sleep 60
    done

  == Kernel Log ==

    watchdog: BUG: soft lockup - CPU#10 stuck for 26s! [dpmaif_tx_hw_pu:625]
    RIP: 0010:ktime_get_mono_fast_ns+0x67/0xd0
    ...
    watchdog: BUG: soft lockup - CPU#10 stuck for 52s! [dpmaif_tx_hw_pu:625]
    RIP: 0010:_raw_spin_unlock_irqrestore+0x3d/0x60
    Call Trace:
      __pm_runtime_resume+0x5b/0x80
      t7xx_dpmaif_tx_hw_push_thread+0xc4/0x4e0 [mtk_t7xx]

  == Root Cause ==

  t7xx_dpmaif_suspend() stops the TX work queue via t7xx_dpmaif_tx_stop()
  but does NOT signal the TX kthread or update dpmaif_ctrl->state. The
  kthread can pass the state guard and call pm_runtime_resume_and_get()
  concurrently with the system PM suspend path, causing a spinlock
  deadlock.

  == Fix ==

  See attached patch. Three changes:

  1. t7xx_dpmaif_suspend(): set state=PWROFF + wake_up() to signal kthread
  2. t7xx_dpmaif_resume(): restore state=PWRON (symmetric)
  3. t7xx_dpmaif_tx_hw_push_thread(): add state guard before pm_runtime call

  == Workaround ==

  Disable ASPM L1 on endpoint:

    LNKCTL=$(setpci -s 56:00.0 CAP_EXP+10.w)
    setpci -s 56:00.0 CAP_EXP+10.w=$(printf "%04x" $((16#${LNKCTL} & ~0x2)))

  (reduces probability but does not fully prevent the race)

  == Testing ==

  Patched module installed to /lib/modules/.../extra/mtk_t7xx.ko

  Running suspend/resume loop with SIM registered: no lockup observed
  (testing in progress, over 4000 cycles completed without regression).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2152281/+subscriptions
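
Note on the workaround arithmetic: in the PCIe Link Control register (CAP_EXP+10.w), bit 1 is the ASPM L1 enable bit, so masking with ~0x2 clears L1 and leaves the other bits alone. The sketch below only illustrates that masking step on sample values and never touches hardware. The helper name clear_aspm_l1 is hypothetical, and it uses the POSIX 0x prefix form instead of the bash-only 16# form from the report; both should give the same result for the 4-digit hex string that setpci prints.

```sh
#!/bin/sh
# clear_aspm_l1 (hypothetical helper, not part of setpci or pciutils):
# takes a Link Control value as hex digits without the 0x prefix,
# the form setpci prints, and echoes it with bit 1 (ASPM L1) cleared.
clear_aspm_l1() {
    printf '%04x\n' $(( 0x$1 & ~0x2 ))
}

# Sample values only; nothing is written to any device here.
clear_aspm_l1 0042   # L1 enabled        -> prints 0040
clear_aspm_l1 0143   # L0s+L1 enabled    -> prints 0141
clear_aspm_l1 0040   # L1 already clear  -> prints 0040
```

On a real system the input would come from the setpci read shown in the workaround, and the output would be written back with the second setpci command. As the report states, this only reduces the probability of the race.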
