Public bug reported:

SRU Justification:

[IMPACT]

A kernel NULL pointer dereference occurs on Nvidia BlueField DPUs running Ubuntu 24.04 (Noble) with linux-bluefield-6.8. The crash is triggered when closing a dma_buf file descriptor associated with a vfio_pci device.

The root cause is that vfio_pci_dma_buf_release() and vfio_pci_dma_buf_cleanup() call vfio_put_device() to release a reference, but the reference was acquired with vfio_device_get(), which uses a separate refcount (device->refcount). Using vfio_put_device() incorrectly decrements the kobject refcount, triggering a refcount underflow and kernel crash.

[FIX]

Two custom patches are submitted:

- UBUNTU: SAUCE: vfio: Export vfio device get and put registration helpers
  Exports vfio_device_try_get_registration() and vfio_device_put_registration() via EXPORT_SYMBOL_GPL in vfio_main.c and adds their declarations to vfio.h, making them available to other VFIO modules.

- UBUNTU: SAUCE: vfio/pci: Use the correct ref count
  Fixes vfio_pci_dma_buf_release() and vfio_pci_dma_buf_cleanup() to call vfio_device_put_registration() instead of vfio_put_device(), correctly matching the reference acquisition done via vfio_device_get().

[TEST CASE]

Compile tested on linux-bluefield-6.8 on the master-next branch. Functionally verified: the NULL pointer dereference no longer reproduces after applying the fix.

[Regression Potential]

Low. The change corrects an incorrect function call in the release path, and the fix has been functionally verified on the affected setup.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2148554

Title:
  Ubuntu 24.04, Noble, linux-bluefield-6.8: Fix refcount mishandling in vfio_pci_dma_buf functions

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2148554/+subscriptions
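
For illustration only, the mismatch described under [IMPACT] can be modelled in userspace as two independent counters on one object. This is a toy sketch, not the kernel code or the submitted patches: every `toy_*` name is hypothetical, and plain integers stand in for the kernel's refcount types. It only shows why releasing through the wrong helper leaves both counters wrong, while the matching helper restores them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one object, two counters. */
struct toy_device {
    int kobj_refs; /* stands in for the kobject refcount */
    int reg_refs;  /* stands in for device->refcount (registration) */
};

static void toy_init(struct toy_device *d)
{
    d->kobj_refs = 1;
    d->reg_refs = 1;
}

/* Models taking a registration reference; fails once the count is zero. */
static bool toy_try_get_registration(struct toy_device *d)
{
    if (d->reg_refs == 0)
        return false;
    d->reg_refs++;
    return true;
}

/* Models the matching release: drops the registration counter. */
static void toy_put_registration(struct toy_device *d)
{
    assert(d->reg_refs > 0);
    d->reg_refs--;
}

/* Models the mismatched release: drops the *other* counter. */
static void toy_put_device(struct toy_device *d)
{
    d->kobj_refs--;
}

/* True when both counters are back at their initial value of 1. */
static bool toy_balanced(const struct toy_device *d)
{
    return d->kobj_refs == 1 && d->reg_refs == 1;
}

/* Release path before the fix: get on one counter, put on the other. */
static bool toy_release_buggy(void)
{
    struct toy_device d;

    toy_init(&d);
    toy_try_get_registration(&d); /* reg_refs: 1 -> 2 */
    toy_put_device(&d);           /* kobj_refs: 1 -> 0, reg_refs leaked */
    return toy_balanced(&d);
}

/* Release path after the fix: get and put act on the same counter. */
static bool toy_release_fixed(void)
{
    struct toy_device d;

    toy_init(&d);
    toy_try_get_registration(&d); /* reg_refs: 1 -> 2 */
    toy_put_registration(&d);     /* reg_refs: 2 -> 1 */
    return toy_balanced(&d);
}
```

In this model, toy_release_buggy() leaves the kobject-like counter at zero while the registration counter is still elevated, and toy_release_fixed() returns both counters to their starting values.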