Performing verification for oracular.
Firstly, I started with 6.11.0-21-generic with the following commit
reverted:
commit ab99a87542f194f28e2364a42afbf9fb48b1c724
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Fri Jun 7 10:27:44 2024 +0300
Subject: md/md-bitmap: fix writing non bitmap pages
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab99a87542f194f28e2364a42afbf9fb48b1c724
The revert is needed because both bugs share the same testcase; the bug fixed
by that commit was only discovered because this bug exists.
I ran the reproducer:
sudo[1244]: mruffell : TTY=pts/0 ; PWD=/home/mruffell/blktests ; USER=root ; COMMAND=./check md/001
sudo[1244]: pam_unix(sudo:session): session opened for user root(uid=0) by mruffell(uid=1008)
unknown: run blktests md/001 at 2025-04-07 04:17:58
root[1280]: run blktests md/001
brd: module loaded
(udev-worker)[1270]: dm-0: Process '/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/dm-0' failed with exit code 1.
Key type psk registered
nvmet: adding nsid 1 to subsystem blktests-subsystem-1
nvmet_tcp: enabling port 0 (127.0.0.1:4420)
nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
nvme nvme1: creating 2 I/O queues.
nvme nvme1: mapped 2/0/0 default/read/poll queues.
nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
(udev-worker)[1270]: nvme1n1: Process '/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/nvme1n1' failed with exit code 1.
(udev-worker)[1270]: md127: Process '/usr/bin/unshare -m /usr/bin/snap auto-import --mount=/dev/md127' failed with exit code 1.
md/raid1:md127: active with 1 out of 2 mirrors
------------[ cut here ]------------
WARNING: CPU: 0 PID: 60 at net/core/skbuff.c:7140 skb_splice_from_iter+0x1b5/0x370
Modules linked in: nvme_tcp nvmet_tcp nvmet nvme_keyring brd raid1 tls cfg80211 8021q garp mrp stp llc binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rapl pvpanic_mmio pvpanic psmouse i2c_piix4 i2c_smbus nvme input_leds mac_hid serio_raw sch_fq_codel nvme_fabrics nvme_core nvme_auth efi_pstore dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci dmi_sysfs virtio_rng ip_tables x_tables autofs4
CPU: 0 UID: 0 PID: 60 Comm: kworker/0:1H Not tainted 6.11.0-21-generic #20+TEST404844v20250403b1-Ubuntu
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
RIP: 0010:skb_splice_from_iter+0x1b5/0x370
Code: cc 49 89 cd f6 c2 01 0f 85 20 01 00 00 66 90 48 89 da 8b 52 30 81 e2 00 00 00 82 81 fa 00 00 00 80 0f 85 41 ff ff ff 4d 89 fe <0f> 0b 49 c7 c5 fb ff ff ff 48 8b 85 68 ff ff ff 41 01 46 70 41 01
RSP: 0018:ffffac5c0020ba18 EFLAGS: 00010246
RAX: 0000000000000000 RBX: fffff04bc4c1edc0 RCX: 0000000000001000
RDX: 0000000080000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffac5c0020bac0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000004000
R13: 0000000000001000 R14: ffffa094c37d6c00 R15: ffffa094c37d6c00
FS: 0000000000000000(0000) GS:ffffa095f7c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007e272064c6c0 CR3: 0000000104e8c005 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? show_trace_log_lvl+0x1be/0x310
? show_trace_log_lvl+0x1be/0x310
? tcp_sendmsg_locked+0x36e/0xe40
? show_regs.part.0+0x22/0x30
? show_regs.cold+0x8/0x10
? skb_splice_from_iter+0x1b5/0x370
? __warn.cold+0xa7/0x101
? skb_splice_from_iter+0x1b5/0x370
? report_bug+0x114/0x160
? handle_bug+0x51/0xa0
? exc_invalid_op+0x18/0x80
? asm_exc_invalid_op+0x1b/0x20
? skb_splice_from_iter+0x1b5/0x370
? skb_splice_from_iter+0x12a/0x370
tcp_sendmsg_locked+0x36e/0xe40
? tcp_push+0x12d/0x170
? tcp_sendmsg_locked+0xac7/0xe40
? tcp_cleanup_rbuf+0x42/0x90
tcp_sendmsg+0x2c/0x50
inet_sendmsg+0x42/0x80
sock_sendmsg+0x118/0x140
nvme_tcp_try_send_data+0x181/0x4c0 [nvme_tcp]
nvme_tcp_try_send+0x1a6/0x230 [nvme_tcp]
nvme_tcp_io_work+0x6c/0x110 [nvme_tcp]
process_one_work+0x174/0x350
worker_thread+0x32a/0x460
? _raw_spin_unlock_irqrestore+0x11/0x60
? __pfx_worker_thread+0x10/0x10
kthread+0xe1/0x110
? __pfx_kthread+0x10/0x10
ret_from_fork+0x44/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
---[ end trace 0000000000000000 ]---
Okay, we can reproduce the issue.
I then enabled -proposed and installed 6.11.0-24-generic, which
contains:
commit ab99a87542f194f28e2364a42afbf9fb48b1c724
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Fri Jun 7 10:27:44 2024 +0300
Subject: md/md-bitmap: fix writing non bitmap pages
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab99a87542f194f28e2364a42afbf9fb48b1c724
and the three fixes we are testing:
commit 23a55f4492fcf868d068da31a2cd30c15f46207d
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Thu Jul 18 11:45:12 2024 +0300
Subject: net: introduce helper sendpages_ok()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23a55f4492fcf868d068da31a2cd30c15f46207d
commit 6af7331a70b4888df43ec1d7e1803ae2c43b6981
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Thu Jul 18 11:45:13 2024 +0300
Subject: nvme-tcp: use sendpages_ok() instead of sendpage_ok()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6af7331a70b4888df43ec1d7e1803ae2c43b6981
commit 7960af373ade3b39e10106ef415e43a1d2aa48c6
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Thu Jul 18 11:45:14 2024 +0300
Subject: drbd: use sendpages_ok() instead of sendpage_ok()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7960af373ade3b39e10106ef415e43a1d2aa48c6
I rebooted and ran the reproducer:
$ uname -rv
6.11.0-24-generic #24-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 14 18:13:56 UTC 2025
$ sudo ./check md/001
md/001 (Raid with bitmap on tcp nvmet with opt-io-size over bitmap size) [passed]
runtime ... 0.451
Just to make sure that the three commits really did fix the issue, I then
built and booted into 6.11.0-24-generic, again with
commit ab99a87542f194f28e2364a42afbf9fb48b1c724
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Fri Jun 7 10:27:44 2024 +0300
Subject: md/md-bitmap: fix writing non bitmap pages
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab99a87542f194f28e2364a42afbf9fb48b1c724
reverted, so that we test the new fixes only.
$ uname -rv
6.11.0-24-generic #24+TEST404844v20250328b1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 28
$ sudo ./check md/001
md/001 (Raid with bitmap on tcp nvmet with opt-io-size over bitmap size) [passed]
runtime 0.451s ... 0.449s
We continue to pass with the new patches only.
The kernel in -proposed fixes the issue, happy to mark verified for
oracular.
** Tags removed: verification-needed-oracular-linux
** Tags added: verification-done-oracular-linux
--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2093871
Title:
Introduce and use sendpages_ok() instead of sendpage_ok() in nvme-tcp
and drbd
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Noble:
Fix Committed
Status in linux source package in Oracular:
Fix Committed
Bug description:
BugLink: https://bugs.launchpad.net/bugs/2093871
[Impact]
Currently the nvme-tcp and drbd subsystems try to set the MSG_SPLICE_PAGES
flag on pages to be written. When MSG_SPLICE_PAGES is set, the send path
eventually calls skb_splice_from_iter(), which checks every page with
sendpage_ok() to see whether all the pages are sendable.
At the moment, both subsystems only check whether the first page in a
potentially contiguous block of pages is sendpage_ok(). If it is, they assume
all the rest are sendpage_ok() too, and send the I/O off, leaving any bad page
to be found later by skb_splice_from_iter(). If one or more of the pages in the
contiguous block is not sendpage_ok(), a warning is printed and the data
transfer is aborted. In the nvme-tcp case, I/O then hangs.
This patchset introduces sendpages_ok() which iterates over each page in a
contiguous block, checks if it is sendpage_ok(), and only returns true if all
of them are.
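To illustrate the difference between the two checks, here is a minimal
userspace sketch. It is not the kernel code: struct model_page,
MODEL_PAGE_SIZE and the model_* function names are hypothetical stand-ins,
and the per-page rule (not slab-backed, at least one reference) is an
assumption modelled on sendpage_ok().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this model */

/* Hypothetical stand-in for the kernel's struct page. */
struct model_page {
	bool is_slab;  /* assumption: slab-backed pages are not sendable */
	int refcount;  /* assumption: unreferenced pages are not sendable */
};

/* Model of the per-page check. */
static bool model_sendpage_ok(const struct model_page *page)
{
	return !page->is_slab && page->refcount >= 1;
}

/* Old behaviour: only the first page of the block is inspected. */
static bool first_page_only_ok(const struct model_page *pages, size_t len)
{
	assert(pages != NULL && len > 0);
	return model_sendpage_ok(&pages[0]);
}

/* New behaviour: every page covered by len bytes must pass the check. */
static bool model_sendpages_ok(const struct model_page *pages, size_t len)
{
	assert(pages != NULL);
	for (size_t done = 0, i = 0; done < len; done += MODEL_PAGE_SIZE, i++) {
		if (!model_sendpage_ok(&pages[i]))
			return false;
	}
	return true;
}
```

With a four-page block whose third page is slab-backed, first_page_only_ok()
accepts the block while model_sendpages_ok() rejects it, which is the case
this patchset is meant to catch before the pages reach
skb_splice_from_iter().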
This makes the decision to set the MSG_SPLICE_PAGES flag reliable, since
callers can now depend on the result of sendpages_ok(), instead of just
assuming everything is okay.
This issue is what caused bug 2075110 [0] to be discovered in the first place,
since that bug produced contiguous blocks of pages where the first was
sendpage_ok(), but pages further into the block were not.
[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2075110
Even with "md/md-bitmap: fix writing non bitmap pages" applied, the issue can
still happen, e.g. with merged IO pages, so this fix is still needed to
eliminate the issue.
[Fix]
The fixes landed in mainline 6.12-rc1:
commit 23a55f4492fcf868d068da31a2cd30c15f46207d
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Thu Jul 18 11:45:12 2024 +0300
Subject: net: introduce helper sendpages_ok()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23a55f4492fcf868d068da31a2cd30c15f46207d
commit 6af7331a70b4888df43ec1d7e1803ae2c43b6981
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Thu Jul 18 11:45:13 2024 +0300
Subject: nvme-tcp: use sendpages_ok() instead of sendpage_ok()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6af7331a70b4888df43ec1d7e1803ae2c43b6981
commit 7960af373ade3b39e10106ef415e43a1d2aa48c6
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Thu Jul 18 11:45:14 2024 +0300
Subject: drbd: use sendpages_ok() instead of sendpage_ok()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7960af373ade3b39e10106ef415e43a1d2aa48c6
They are needed for noble and oracular.
[Testcase]
This is the same testcase as the original bug 2075110 [0], as the fix is
designed to prevent it or similar other bugs from happening again.
[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2075110
Because of this, the fix:
commit ab99a87542f194f28e2364a42afbf9fb48b1c724
Author: Ofir Gal <ofir.gal@volumez.com>
Date: Fri Jun 7 10:27:44 2024 +0300
Subject: md/md-bitmap: fix writing non bitmap pages
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab99a87542f194f28e2364a42afbf9fb48b1c724
needs to be reverted during your test runs, or you won't see the issue
reproduce.
You can use this ppa for updated kernels with the revert to trigger
the issue:
https://launchpad.net/~mruffell/+archive/ubuntu/sf404844-revert
This can be reproduced by running blktests md/001 [1], which the
author of the fix created to act as a regression test for this issue.
[1]
https://github.com/osandov/blktests/commit/a24a7b462816fbad7dc6c175e53fcc764ad0a822
Deploy a fresh Noble VM that has a scratch NVMe disk.
$ sudo apt install build-essential fio
$ git clone https://github.com/osandov/blktests.git
$ cd blktests
$ make
$ echo "TEST_DEVS=(/dev/nvme0n1)" > config
$ sudo ./check md/001
The md/001 test will hang an affected system, and the above oops
message will be visible in dmesg.
A test kernel is available in the following ppa:
https://launchpad.net/~mruffell/+archive/ubuntu/sf404844-test
This has both the fixes for this bug, and also bug 2075110. The issue will not
reproduce.
There is also a test kernel available with the fix for this bug present, and the
fix for bug 2075110 reverted, so you can see the impact of these patches only:
https://launchpad.net/~mruffell/+archive/ubuntu/sf404844-repro
This will also not reproduce the issue anymore.
[Where problems could occur]
What we are changing is rather simple. Instead of checking the first page and
assuming all the rest in the contiguous block are sendpage_ok(), we now
check each page in the contiguous block to see if all of them are sendpage_ok().
If any aren't, then we abort the write to the driver, and try again later. This
saves us time.
However, it does take longer to call sendpage_ok() on each of the pages in the
contiguous block, so there will be a minor performance hit.
A small performance hit in exchange for correctness should be acceptable.
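As a rough sense of scale for the extra work, the number of per-page checks
grows with the length of the block. The sketch below assumes a 4096-byte page
and one check per page-sized step of the length; MODEL_PAGE_SIZE and
checks_for_len() are hypothetical names used only for this illustration.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this model */

/* Number of per-page checks for a block of len bytes, rounding up. */
static size_t checks_for_len(size_t len)
{
	return (len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}
```

Under these assumptions a 16 KiB block costs four checks instead of one, so
the overhead is linear in the number of pages and small next to the I/O
itself.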
Currently we are only applying to nvme-tcp and drbd subsystems. If a regression
were to occur, it would affect users of those subsystems only.
[Other info]
Upstream mailing list:
https://lore.kernel.org/all/20240718084515.3833733-1-ofir.gal@volumez.com/T/#u
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2093871/+subscriptions