[SOLVED] Error No. ... : [Bug 2157584] [NEW] Kernel NULL pointer dereference in __queue_work via delayed_work_timer

Public bug reported:

Package: linux
Affects: Ubuntu 26.04 (resolute)

Description

A QEMU/KVM virtio guest running Ubuntu 26.04 with kernel 7.0.0-22-generic froze completely. The guest agent stopped responding, SSH timed out, and the VM required a virsh reset to recover.

Investigation of the previous boot's journal revealed a kernel NULL pointer dereference in __queue_work, preceded by a WARNING at the same location on a separate occasion earlier in the same boot.

This matches a known upstream workqueue use-after-free / NULL-deref pattern where delayed work is queued to a workqueue that has been destroyed (wq->cpu_pwq nullified), causing a NULL pointer dereference when the delayed timer fires and calls delayed_work_timer_fn -> __queue_work.

Steps to reproduce

Not reliably reproducible. The crash occurs when a delayed_work timer fires after the owning workqueue has been destroyed. The VM had been running for ~2 days under mixed workload (GNOME desktop, Docker containers with nftables/wireguard networking, JetBrains IDEs).

Kernel version

  Linux ubuntu-dev-2024 7.0.0-22-generic #22-Ubuntu SMP PREEMPT_DYNAMIC Mon May 25 15:54:34 UTC 2026 x86_64 GNU/Linux
  Package: linux-image-7.0.0-22-generic 7.0.0-22.22

Hardware

  QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.11-3ubuntu8 04/09/2026
  16 vCPUs, 20GB RAM, virtio-blk + virtio-net + virtiofs + virtio-gpu

Kernel command line

  BOOT_IMAGE=/boot/vmlinuz-7.0.0-22-generic root=UUID=25a32e36-0866-4e02-b7ef-0703c8b6d784 ro zswap.enabled=1 zswap.compressor=zstd zswap.zpool=zsmalloc zswap.max_pool_percent=30 splash plymouth crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M

Timeline

1. 17:44:49 - WARNING at kernel/workqueue.c:2350 on CPUs 4, 11, 8 (swapper/idle). VM continued running.
2. 19:06:42 - BUG: kernel NULL pointer dereference on CPUs 9 and 1 (swapper/idle). VM froze.
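The timeline above was put together from the previous boot's journal. As a rough illustration of that triage step, here is a minimal Python sketch that picks the WARNING and BUG signature lines out of kernel log text. The function name `find_events`, the dictionary keys, and the `kernel: ` prefix in the sample are assumptions made for this example. The message formats are copied from the traces in this report.

```python
import re

# Patterns modelled on the two message formats quoted in this report.
WARN_RE = re.compile(r"WARNING: (\S+):(\d+) at (\S+), CPU#(\d+):")
BUG_RE = re.compile(r"BUG: kernel NULL pointer dereference, address: ([0-9a-f]+)")


def find_events(log_lines):
    """Return WARNING/BUG events found in log_lines, in log order."""
    events = []
    for line in log_lines:
        m = WARN_RE.search(line)
        if m:
            events.append({
                "kind": "WARNING",
                "file": m.group(1),
                "line": int(m.group(2)),
                "symbol": m.group(3),
                "cpu": int(m.group(4)),
            })
            continue
        m = BUG_RE.search(line)
        if m:
            events.append({"kind": "BUG", "address": int(m.group(1), 16)})
    return events


# Sample lines; the "kernel: " prefix is illustrative only.
SAMPLE_LOG = [
    "kernel: WARNING: kernel/workqueue.c:2350 at __queue_work.part.0+0x190/0x390, CPU#4: swapper/4/0",
    "kernel: WARNING: kernel/workqueue.c:2350 at __queue_work.part.0+0x190/0x390, CPU#11: swapper/11/0",
    "kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000",
]

if __name__ == "__main__":
    for event in find_events(SAMPLE_LOG):
        print(event)
```

In practice the input lines could come from `journalctl -k -b -1`, which prints the kernel messages of the previous boot. The patterns use `search`, so a timestamp or host prefix in front of each message does not matter.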
Oops trace (19:06:42 event)

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  CPU: 9 UID: 0 PID: 0 Comm: swapper/9 Tainted: G W 7.0.0-22-generic #22-Ubuntu PREEMPT(lazy)
  Tainted: [W]=WARN
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.11-3ubuntu8 04/09/2026
  RIP: 0010:__queue_work.part.0+0x190/0x390
  Code: ... <0f> 0b e9 65 ff ff ff ...
  RSP: 0018:ffffcce10016cdd8 EFLAGS: 00010003
  RAX: ffff8c35a542f2c0 RBX: ffff8c35a542f2b8 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffffcce10016ce10 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c392bab2680
  R13: 0000000000002000 R14: ffff8c34c0386c00 R15: ffff8c34c0389200
  FS: 0000000000000000(0000) GS:ffff8c3980880000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000072ea901a3fb8 CR3: 0000000199a02000 CR4: 0000000000750ef0
  PKRU: 55555554
  Call Trace:
   <IRQ>
   __queue_work+0x39/0xc0
   ? __pfx_delayed_work_timer_fn+0x10/0x10
   delayed_work_timer_fn+0x19/0x30
   call_timer_fn+0x30/0x170
   ? __pfx_delayed_work_timer_fn+0x10/0x10
   __run_timers+0x1af/0x2c0
   run_timer_softirq+0x8a/0x100
   handle_softirqs+0xe1/0x360
   __irq_exit_rcu+0x100/0x120
   irq_exit_rcu+0xe/0x20
   sysvec_apic_timer_interrupt+0x9f/0xd0
   </IRQ>
   <TASK>
   asm_sysvec_apic_timer_interrupt+0x1b/0x20
  RIP: 0010:pv_native_safe_halt+0xb/0x10
  ...
   arch_cpu_idle+0x9/0x10
   default_idle_call+0x2f/0x130
   cpuidle_idle_call+0x114/0x1f0
   do_idle+0x94/0xf0
   cpu_startup_entry+0x29/0x30
   start_secondary+0x125/0x180
   ? soft_restart_cpu+0x14/0x14
   common_startup_64+0x13e/0x141
   </TASK>
  ---[ end trace 0000000000000000 ]---

A second identical oops was logged simultaneously on CPU 1.
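The two `#PF:` lines in the oops are a readable rendering of the x86 page-fault error code. As a small aid for reading such traces, the sketch below decodes the low five bits of that code. The function name and dictionary keys are made up for this example. The bit meanings follow the x86 page-fault error code layout (present, write, user, reserved-bit, instruction fetch), and higher bits are deliberately ignored here.

```python
def decode_pf_error_code(code):
    """Decode the low five bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x01),            # 0 = not-present page, 1 = protection fault
        "write": bool(code & 0x02),              # 0 = read access, 1 = write access
        "user": bool(code & 0x04),               # 0 = supervisor (kernel) mode, 1 = user mode
        "reserved_bit": bool(code & 0x08),       # reserved bit set in a paging structure
        "instruction_fetch": bool(code & 0x10),  # fault caused by an instruction fetch
    }


if __name__ == "__main__":
    # Value taken from the oops above: error_code(0x0002)
    for name, value in decode_pf_error_code(0x0002).items():
        print(f"{name}: {value}")
```

For `0x0002` only the write bit is set, which agrees with the kernel's own summary in the trace: a supervisor-mode write to a not-present page.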
WARNING trace (17:44:49 event, same boot, earlier)

  WARNING: kernel/workqueue.c:2350 at __queue_work.part.0+0x190/0x390, CPU#4: swapper/4/0
  WARNING: kernel/workqueue.c:2350 at __queue_work.part.0+0x190/0x390, CPU#11: swapper/11/0
  WARNING: kernel/workqueue.c:2350 at __queue_work.part.0+0x190/0x390, CPU#8: swapper/8/0

Modules linked in

  nft_ct, wireguard, libcurve25519, ip6_udp_tunnel, udp_tunnel, nf_conntrack_netlink, xt_nat, xt_tcpudp, veth, xt_multiport, xt_conntrack, xt_MASQUERADE, xfrm_user, xfrm_algo, xt_set, ip_set, nft_chain_nat, nf_nat, nf_conntrack, nf_defrag_ipv6, nf_defrag_ipv4, nft_compat, nf_tables, virtiofs, serio_raw, vmw_vsock_virtio_transport, virtio_dma_buf, vsock, virtio_rng, autofs4, libahci, netconsole, virtio_gpu, psmouse, hid_generic, ahci, usbhid, hid

Analysis

The crash occurs in the timer interrupt path during CPU idle:

1. CPU is idle (pv_native_safe_halt / do_idle)
2. APIC timer interrupt fires
3. __run_timers expires a delayed_work timer
4. delayed_work_timer_fn calls __queue_work
5. __queue_work dereferences a NULL pointer (pwq/pool is NULL)

The Tainted [W] flag reflects the earlier WARN, which was raised at the same code location (__queue_work.part.0+0x190). The simultaneous occurrence on two CPUs suggests a race condition during workqueue teardown: a delayed_work timer fires after the workqueue's cpu_pwq has been nullified by destroy_workqueue().

This matches the upstream pattern described in:

- Tim Van Patten's patch "workqueue: Prevent delayed work UAF kernel panic" (June 2024): LKML. Adds a NULL check for pwq/pool in __queue_work.
- Tejun Heo's response and removal of WARN_ON_ONCE(!wq) (March 2026): GitHub mirror

Related bugs

- Launchpad #2068103: same pattern on kernel 6.8.0-35 (workqueue.c:1790)
- bugzilla.kernel.org #218288: same pattern on kernel 6.6 (workqueue.c:1638)

Availability of fix

Kernel 7.0.0-26.26 is in resolute-proposed and includes upstream stable releases v7.0.1 through v7.0.6. It is unclear whether the workqueue UAF fix is included in those stable releases.
The guest currently has 7.0.0-22.22 installed with no upgradable kernel in updates/security.

Workaround

None known. Rebooting after a virsh reset recovers the guest.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2157584

Title:
  Kernel NULL pointer dereference in __queue_work via delayed_work_timer_fn on 7.0.0-22-generic (Ubuntu 26.04)

Status in linux package in Ubuntu:
  New
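The teardown race described in the Analysis section can be illustrated with a toy model. The Python sketch below is not kernel code and does not reproduce the actual crash. `ToyWorkqueue` and `ToyDelayedWork` are hypothetical classes that only mimic the ordering problem: a timer still armed when the per-CPU table is dropped, versus a timer cancelled before teardown. In real kernel code the matching calls would be cancel_delayed_work_sync() followed by destroy_workqueue(). The report does not identify which subsystem owns the offending work item.

```python
class ToyWorkqueue:
    """Toy stand-in for a workqueue with one pending list per CPU."""

    def __init__(self, ncpus):
        self.cpu_pwq = [[] for _ in range(ncpus)]

    def destroy(self):
        # Models the per-CPU table being nullified during teardown.
        self.cpu_pwq = None


class ToyDelayedWork:
    """Toy stand-in for a delayed work item bound to one CPU."""

    def __init__(self, wq, cpu, name):
        self.wq = wq
        self.cpu = cpu
        self.name = name
        self.timer_armed = False

    def schedule(self):
        self.timer_armed = True

    def cancel_sync(self):
        self.timer_armed = False

    def timer_fires(self):
        # Models the timer callback handing the work to the queue.
        if not self.timer_armed:
            return "cancelled"
        self.timer_armed = False
        self.wq.cpu_pwq[self.cpu].append(self.name)  # fails if the table is gone
        return "queued"


def buggy_teardown():
    wq = ToyWorkqueue(ncpus=2)
    work = ToyDelayedWork(wq, cpu=1, name="refresh")
    work.schedule()
    wq.destroy()               # queue torn down while the timer is still armed
    return work.timer_fires()  # raises TypeError: the table is None


def safe_teardown():
    wq = ToyWorkqueue(ncpus=2)
    work = ToyDelayedWork(wq, cpu=1, name="refresh")
    work.schedule()
    work.cancel_sync()         # stop the timer first
    wq.destroy()
    return work.timer_fires()  # timer no longer armed, nothing is touched


if __name__ == "__main__":
    print("safe order:", safe_teardown())
    try:
        buggy_teardown()
    except TypeError as exc:
        print("buggy order failed:", exc)
```

The model only shows the ordering. The real fix has to be applied by whichever driver or subsystem owns the delayed work.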


Friday

