You have been subscribed to a public bug:
I am observing random memory corruption in the Ubuntu 24.04.1 kdump kernel
on Dell R760 servers with the IOMMU enabled. At times the corruption occurs in
the mpi3mr driver (Broadcom's storage controller, which hosts the OS) and
causes crash dump collection to fail. At other times the issue shows up in
other drivers, such as tg3.
The tg3 traces look like this:
[ 33.189809] DMAR: [DMA Write NO_PASID] Request device [01:00.1] fault addr 0xfffa0000 [fault reason 0x71] SM: Present bit in first-level paging entry is clear
[ 33.204800] DMAR: [DMA Write NO_PASID] Request device [01:00.1] fault addr 0xfffa0000 [fault reason 0x71] SM: Present bit in first-level paging entry is clear
[ 33.374041] DMAR: DRHD: handling fault status reg 2
[ 33.379349] DMAR: [DMA Write NO_PASID] Request device [01:00.1] fault addr 0xfffa0000 [fault reason 0x71] SM: Present bit in first-level paging entry is clear
[ 33.755334] tg3 0000:01:00.1 eno8403: NETDEV WATCHDOG: CPU: 0: transmit queue 0 timed out 5996 ms
[ 33.764622] tg3 0000:01:00.1 eno8403: transmit timed out, resetting
[ 35.043552] tg3 0000:01:00.1 eno8403: 0x00000000: 0x165f14e4, 0x20100406, 0x02000000, 0x00800000
[ 35.052650] tg3 0000:01:00.1 eno8403: 0x00000010: 0x9590000c, 0x00000000, 0x9591000c, 0x00000000
[ 35.061739] tg3 0000:01:00.1 eno8403: 0x00000020: 0x9592000c, 0x00000000, 0x00000000, 0x0a6b1028
[ 35.070829] tg3 0000:01:00.1 eno8403: 0x00000030: 0xfffc0000, 0x00000048, 0x00000000, 0x000002ff
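The DMAR fault lines in the trace above all name device [01:00.1], the same PCI function as tg3 0000:01:00.1, and the same fault address. When a longer dmesg capture is available, a small script can tally the faults per device and address to make that pattern easier to see. The following is a minimal sketch, not part of the original report: the function name summarize_dmar_faults and the regular expression are illustrative assumptions based only on the line format shown above, and other kernel versions may format these messages differently.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the DMAR fault lines quoted in this report.
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)[^\]]*\] "
    r"Request device \[(?P<bdf>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>0x[0-9a-f]+) "
    r"\[fault reason (?P<reason>0x[0-9a-f]+)\]"
)


def summarize_dmar_faults(log_text):
    """Count DMAR faults per (device, operation, fault address, fault reason)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DMAR_FAULT.search(line)
        if match:
            key = (match["bdf"], match["op"], match["addr"], match["reason"])
            counts[key] += 1
    return counts


# Sample input copied from the trace in this report.
sample = """\
[ 33.189809] DMAR: [DMA Write NO_PASID] Request device [01:00.1] fault addr 0xfffa0000 [fault reason 0x71] SM: Present bit in first-level paging entry is clear
[ 33.204800] DMAR: [DMA Write NO_PASID] Request device [01:00.1] fault addr 0xfffa0000 [fault reason 0x71] SM: Present bit in first-level paging entry is clear
[ 33.374041] DMAR: DRHD: handling fault status reg 2
[ 33.379349] DMAR: [DMA Write NO_PASID] Request device [01:00.1] fault addr 0xfffa0000 [fault reason 0x71] SM: Present bit in first-level paging entry is clear
"""

if __name__ == "__main__":
    for (bdf, op, addr, reason), n in summarize_dmar_faults(sample).items():
        print(f"{bdf} DMA {op} addr={addr} reason={reason} count={n}")
```

To run it against a full log, pass the log text to the function, for example summarize_dmar_faults(open("dmesg.txt").read()), where dmesg.txt is a hypothetical file name.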
Notes:
1. The issue does not occur when the IOMMU is turned off.
2. The issue does not occur with the latest upstream kernel, 6.12.
3. The issue also occurs with the upstream 6.8 kernel release.
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
** Tags: bot-comment
--
Ubuntu 24.04.1: memory corruption in kdump kernel
https://bugs.launchpad.net/bugs/2086188
You received this bug notification because you are subscribed to linux in Ubuntu.