Hi Matthew,
Thanks for getting this fix together so quickly. I've taken it for a spin on some of my machines. I can confirm that the tests that were getting stuck in the 1021 kernel now run to completion in 1022.
Thanks again,
-K
** Tags added: verification-done-noble-linux-azure
** Tags removed: verification-done-noble-linux-azure
** Tags added: verification-done-noble-linux-azure-nvidia
--
https://bugs.launchpad.net/bugs/2120330
Title:
Incorrect backport for CVE-2025-21861 causes kernel hangs
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Noble:
Fix Committed
Bug description:
BugLink: https://bugs.launchpad.net/bugs/2120330
[Impact]
The patch for CVE-2025-21861 was incorrectly backported to the noble 6.8
kernel, leading to hangs when freeing device memory.
commit 41cddf83d8b00f29fd105e7a0777366edc69a5cf
Author: David Hildenbrand <david@redhat.com>
Date: Mon Feb 10 17:13:17 2025 +0100
Subject: mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41cddf83d8b00f29fd105e7a0777366edc69a5cf
ubuntu-noble: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?id=3858edb1146374f3240d1ec769ba857186531b17
An incorrect backport was performed, causing the old page to be placed back
on the LRU instead of the new page, e.g.:
src = page_folio(page);
dst = page_folio(newpage);
+ if (!is_zone_device_page(page))
+ putback_lru_page(page);
when in 41cddf83d8b00f29fd105e7a0777366edc69a5cf we have:
+ if (!folio_is_zone_device(dst))
+ folio_add_lru(dst);
so the backport should really have been:
+ if (!folio_is_zone_device(newpage))
+ folio_add_lru(newpage);
This keeps references to the old memory pages alive, preventing them from being
released and freed.
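As a rough illustration only, here is a minimal sketch of the intended behaviour, assuming the pre-folio, page-based code paths of the noble 6.8 kernel. The helper name is made up and the unlock/put calls merely stand in for however the real migrate_device_finalize() drops the old page's last reference; this is not the actual source.

#include <linux/mm.h>
#include <linux/memremap.h>
#include <linux/pagemap.h>
#include <linux/swap.h>

/* Hypothetical helper for illustration; not the real migrate_device_finalize(). */
static void finalize_one_page(struct page *page, struct page *newpage)
{
	/*
	 * Broken backport: the *old* page was placed back on the LRU,
	 * which kept a reference to it and prevented it from being freed:
	 *
	 *	if (!is_zone_device_page(page))
	 *		putback_lru_page(page);
	 */

	/* Corrected intent: only the page that survives migration goes
	 * back on the LRU. */
	if (!is_zone_device_page(newpage))
		putback_lru_page(newpage);

	/* The old page is unlocked and released so that its refcount can
	 * drop and the memory can actually be freed. */
	unlock_page(page);
	put_page(page);
}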
Stack traces of stuck processes:
PID: 871438 TASK: ffff007d4d668200 CPU: 95 COMMAND: "nvbandwidth"
#0 [ffff80010e8ef840] __switch_to at ffffc0f22798c550
#1 [ffff80010e8ef8a0] __schedule at ffffc0f22798c89c
#2 [ffff80010e8ef900] schedule at ffffc0f22798cd40
#3 [ffff80010e8ef930] schedule_preempt_disabled at ffffc0f22798d388
#4 [ffff80010e8ef9c0] rwsem_down_write_slowpath at ffffc0f227990dc8
#5 [ffff80010e8efa20] down_write at ffffc0f2279912d0
#6 [ffff80010e8efaa0] uvm_va_space_mm_shutdown at ffffc0f1c2a451ec [nvidia_uvm]
#7 [ffff80010e8efb00] uvm_va_space_mm_unregister at ffffc0f1c2a457a0 [nvidia_uvm]
#8 [ffff80010e8efb30] uvm_release at ffffc0f1c2a226d4 [nvidia_uvm]
#9 [ffff80010e8efc00] uvm_release_entry.part.0 at ffffc0f1c2a227dc [nvidia_uvm]
#10 [ffff80010e8efc20] uvm_release_entry at ffffc0f1c2a22850 [nvidia_uvm]
#11 [ffff80010e8efc30] __fput at ffffc0f2269a5760
#12 [ffff80010e8efc70] ____fput at ffffc0f2269a5a80
#13 [ffff80010e8efc80] task_work_run at ffffc0f2265ceedc
#14 [ffff80010e8efcc0] do_exit at ffffc0f2265a0bc8
#15 [ffff80010e8efcf0] do_group_exit at ffffc0f2265a0fec
#16 [ffff80010e8efd50] get_signal at ffffc0f2265b8750
#17 [ffff80010e8efe10] do_signal at ffffc0f22650166c
#18 [ffff80010e8efe40] do_notify_resume at ffffc0f2265018f0
#19 [ffff80010e8efe70] el0_interrupt at ffffc0f227985564
#20 [ffff80010e8efe90] __el0_irq_handler_common at ffffc0f2279855f0
#21 [ffff80010e8efea0] el0t_64_irq_handler at ffffc0f227986080
#22 [ffff80010e8effe0] el0t_64_irq at ffffc0f2264f17fc
PID: 871467 TASK: ffff007f6aa66000 CPU: 66 COMMAND: "UVM GPU4 BH"
#0 [ffff80015ddef580] __switch_to at ffffc0f22798c550
#1 [ffff80015ddef5e0] __schedule at ffffc0f22798c89c
#2 [ffff80015ddef640] schedule at ffffc0f22798cd40
#3 [ffff80015ddef670] io_schedule at ffffc0f22798cec4
#4 [ffff80015ddef6e0] migration_entry_wait_on_locked at ffffc0f22686e3f0
#5 [ffff80015ddef740] migration_entry_wait at ffffc0f22695a6d4
#6 [ffff80015ddef750] do_swap_page at ffffc0f2268d6378
#7 [ffff80015ddef7d0] handle_pte_fault at ffffc0f2268da688
#8 [ffff80015ddef870] __handle_mm_fault at ffffc0f2268da7f8
#9 [ffff80015ddef8b0] handle_mm_fault at ffffc0f2268dab48
#10 [ffff80015ddef910] handle_fault at ffffc0f1c2aace18 [nvidia_uvm]
#11 [ffff80015ddef950] uvm_populate_pageable_vma at ffffc0f1c2aacf24 [nvidia_uvm]
#12 [ffff80015ddef990] migrate_pageable_vma_populate_mask at ffffc0f1c2aad8c0 [nvidia_uvm]
#13 [ffff80015ddefab0] uvm_migrate_pageable at ffffc0f1c2ab0294 [nvidia_uvm]
#14 [ffff80015ddefb90] service_ats_requests at ffffc0f1c2abf828 [nvidia_uvm]
#15 [ffff80015ddefbb0] uvm_ats_service_faults at ffffc0f1c2ac02f0 [nvidia_uvm]
#16 [ffff80015ddefd40] uvm_parent_gpu_service_non_replayable_fault_buffer at ffffc0f1c2a82e00 [nvidia_uvm]
#17 [ffff80015ddefda0] non_replayable_faults_isr_bottom_half at ffffc0f1c2a3c3e4 [nvidia_uvm]
#18 [ffff80015ddefe00] non_replayable_faults_isr_bottom_half_entry at ffffc0f1c2a3c590 [nvidia_uvm]
#19 [ffff80015ddefe20] _main_loop at ffffc0f1c2a207c8 [nvidia_uvm]
#20 [ffff80015ddefe70] kthread at ffffc0f2265d40dc
There is no workaround.
[Fix]
To make things less confusing, revert the incorrect backport, backport
"mm: migrate_device: use more folio in migrate_device_finalize()" so the code
matches the upstream folio conventions, and then correctly backport
"mm/migrate_device: don't add folio to be freed to LRU in
migrate_device_finalize()". This approach was suggested and tested by Krister
Johansen, and I think it is reasonable.
commit 58bf8c2bf47550bc94fea9cafd2bc7304d97102c
Author: Kefeng Wang <wangkefeng.wang@huawei.com>
Date: Mon Aug 26 14:58:12 2024 +0800
Subject: mm: migrate_device: use more folio in migrate_device_finalize()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=58bf8c2bf47550bc94fea9cafd2bc7304d97102c
commit 41cddf83d8b00f29fd105e7a0777366edc69a5cf
Author: David Hildenbrand <david@redhat.com>
Date: Mon Feb 10 17:13:17 2025 +0100
Subject: mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41cddf83d8b00f29fd105e7a0777366edc69a5cf
The first patch landed in 6.12-rc1 and the second patch in 6.14-rc4. Both are
in plucky.
[Testcase]
There are a few ways to trigger the issue.
You can run the hmm selftests. Note that you first need to build a kernel with
CONFIG_TEST_HMM=m set.
1) Check out a kernel git tree
2) cd tools/testing/selftests/mm/
3) make
4) sudo ./test_hmm.sh
You can also run NVIDIA tests such as nvbandwidth, if your system has an NVIDIA GPU:
https://github.com/NVIDIA/nvbandwidth
$ git clone https://github.com/NVIDIA/nvbandwidth.git
$ cd nvbandwidth
$ sudo ./debian_install.sh
$ sudo ./nvbandwidth
A test package is available in the following PPA:
https://launchpad.net/~mruffell/+archive/ubuntu/sf416039-test
If you install it and run the hmm selftests, they should no longer hang.
[Where problems can occur]
This changes some core mm code for device memory from using standard pages to
using folios, which carries some additional risk.
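To make the page-to-folio change concrete, here is a small sketch, illustrative only and with made-up wrapper names, of the two spellings of the LRU re-add step quoted earlier: first with the page API used by the pre-conversion 6.8 code, then with the folio API used upstream.

#include <linux/mm.h>
#include <linux/memremap.h>
#include <linux/swap.h>

/* Page-based form, as in the pre-conversion 6.8 code. */
static void lru_readd_page_api(struct page *newpage)
{
	if (!is_zone_device_page(newpage))
		putback_lru_page(newpage);
}

/* Folio-based form, as in the upstream folio conversion. */
static void lru_readd_folio_api(struct page *newpage)
{
	struct folio *dst = page_folio(newpage);

	if (!folio_is_zone_device(dst))
		folio_add_lru(dst);
}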
If a regression were to occur, it would primarily affect users of devices with
internal memory, such as graphics cards, and quite possibly high-end network
cards.
The largest userbase affected by this regression is NVIDIA users, so it really
would be a bad idea to release with the broken implementation; we should instead
respin and release with the fixed implementation.