воскресенье

[Bug 2078659] Re: Hold IOPOLL locks when triggering io_uring's deferred work

** Description changed:

[Impact]
io_commit_cqring() writes the CQ ring tail to make it visible and also triggers any deferred work.
When a ring is set up with IOPOLL, it doesn't require locking around the CQ ring updates.
However, if there is deferred work that needs processing, io_queue_deferred() assumes that the completion_lock is held.
The io_uring subsystem does not properly handle locking for rings with IOPOLL, leading to a double-free vulnerability, which can be exploited as CVE-2023-21400.

[Fix]
There is a commit that fixed this issue.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fb348857e7b67eefe365052f1423427b66dedbf3

- There is no direct upstream commit for this issue, and the patch needs
- to be reworked to apply to version 5.4.
+ There is no direct upstream commit addressing this issue, so the patch
+ needs to be reworked to apply to version 5.4.
+
+ The patch introduces the io_commit_needs_flush function to determine if flushing the cqring while holding ctx->completion_lock is necessary.
+ However, implementing this logic in Focal requires backporting commits that introduce missing io_ring_ctx struct member variables.
+ This approach is not ideal, as those commits are unrelated to the CVE.
+ Additionally, introducing new member variables raises concerns, particularly since Focal is nearing the end of its standard support period.
+ To address this tradeoff, the SUSE commit [1] serves as a suitable reference.
+
+ To address this issue in the simplest way, we can wrap the ctx->completion_lock spinlock around io_commit_cqring in the io_iopoll_complete function.
+ However, this approach has surfaced an existing issue related to invoking kill_fasync under completion_lock as highlighted in commit deadlock (4aa84f2ffa81 io_uring: don't kill fasync under completion_lock).
+ The following patches can help us align the code more closely with the upstream flow while also addressing the deadlock issue.
+ 0791015837f1 io_uring: remove extra check in __io_commit_cqring
+ 4aa84f2ffa81 io_uring: dont kill fasync under completion_lock
+
+ After backporting these two commits, we can adopt SUSE's approach to
+ align the code more closely with the Linux stable branch.

[Test Plan]
This is a timing issue that can be triggered by using the liburing library to implement a test program.
First, set the io_uring_params flag to IORING_SETUP_IOPOLL and open an XFS file with the O_RDWR | O_DIRECT flags, as the XFS filesystem implements the iopoll function hook.
After setting up io_uring, create two threads in the process: one thread will wait for completion queue events, and the other will continuously send readv and writev requests in sequence.
The writev requests should include the IOSQE_IO_DRAIN flag to ensure that previous submission queue events are completed first.

The issue arises when writev requests add entries into the defer_list,
but the io_iopoll_complete function consumes entries from defer_list
without holding the appropriate lock.

[Where problems could occur]
The problematic call path can be triggered under specific usage scenarios and only affects io_uring functionality.
If the patch contains any issues, it may lead to a deadlock.
+
+ [1]
+ https://github.com/SUSE/kernel/commit/a16c5d2daa1cf45f8694ff72bd2665d474a763a7

--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2078659

Title:
Hold IOPOLL locks when triggering io_uring's deferred work

Status in linux package in Ubuntu:
New
Status in linux source package in Focal:
In Progress

Bug description:
[Impact]
io_commit_cqring() writes the CQ ring tail to make it visible and also triggers any deferred work.
When a ring is set up with IOPOLL, it doesn't require locking around the CQ ring updates.
However, if there is deferred work that needs processing, io_queue_deferred() assumes that the completion_lock is held.
The io_uring subsystem does not properly handle locking for rings with IOPOLL, leading to a double-free vulnerability, which can be exploited as CVE-2023-21400.

[Fix]
There is a commit that fixed this issue.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fb348857e7b67eefe365052f1423427b66dedbf3

There is no direct upstream commit addressing this issue, so the patch
needs to be reworked to apply to version 5.4.

The patch introduces the io_commit_needs_flush function to determine if flushing the cqring while holding ctx->completion_lock is necessary.
However, implementing this logic in Focal requires backporting commits that introduce missing io_ring_ctx struct member variables.
This approach is not ideal, as those commits are unrelated to the CVE.
Additionally, introducing new member variables raises concerns, particularly since Focal is nearing the end of its standard support period.
To address this tradeoff, the SUSE commit [1] serves as a suitable reference.

To address this issue in the simplest way, we can wrap the ctx->completion_lock spinlock around io_commit_cqring in the io_iopoll_complete function.
However, this approach has surfaced an existing issue related to invoking kill_fasync under completion_lock as highlighted in commit deadlock (4aa84f2ffa81 io_uring: don't kill fasync under completion_lock).
The following patches can help us align the code more closely with the upstream flow while also addressing the deadlock issue.
0791015837f1 io_uring: remove extra check in __io_commit_cqring
4aa84f2ffa81 io_uring: dont kill fasync under completion_lock

After backporting these two commits, we can adopt SUSE's approach to
align the code more closely with the Linux stable branch.

[Test Plan]
This is a timing issue that can be triggered by using the liburing library to implement a test program.
First, set the io_uring_params flag to IORING_SETUP_IOPOLL and open an XFS file with the O_RDWR | O_DIRECT flags, as the XFS filesystem implements the iopoll function hook.
After setting up io_uring, create two threads in the process: one thread will wait for completion queue events, and the other will continuously send readv and writev requests in sequence.
The writev requests should include the IOSQE_IO_DRAIN flag to ensure that previous submission queue events are completed first.

The issue arises when writev requests add entries into the defer_list,
but the io_iopoll_complete function consumes entries from defer_list
without holding the appropriate lock.

[Where problems could occur]
The problematic call path can be triggered under specific usage scenarios and only affects io_uring functionality.
If the patch contains any issues, it may lead to a deadlock.

[1]
https://github.com/SUSE/kernel/commit/a16c5d2daa1cf45f8694ff72bd2665d474a763a7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2078659/+subscriptions

Комментариев нет:

Отправить комментарий