This bug is awaiting verification that the linux/5.15.0-132.143 kernel
in -proposed solves the problem. Please test the kernel and update this
bug with the results. If the problem is solved, change the tag
'verification-needed-jammy-linux' to 'verification-done-jammy-linux'. If
the problem still exists, change the tag
'verification-needed-jammy-linux' to 'verification-failed-jammy-linux'.
If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!
** Tags added: kernel-spammed-jammy-linux-v2 verification-needed-jammy-linux
--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2089373
Title:
WARN in trc_wait_for_one_reader about failed IPIs
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Jammy:
Fix Committed
Bug description:
[Impact]
When ending bpf tracing, 5.15 kernels now report a warning in
trc_wait_for_one_reader() on platforms that support hot-plugging CPUs,
but that do not have all of their hotplug slots populated. In this
submitter's environment, it reproduces on Xen EC2 instances, but not
Nitro ones.
The warning looks like this:
kernel: [ 6416.920266] ------------[ cut here ]------------
kernel: [ 6416.920272] trc_wait_for_one_reader(): smp_call_function_single() failed for CPU: 64
kernel: [ 6416.920289] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:1044 trc_wait_for_one_reader+0x2b8/0x300
kernel: [ 6416.920299] Modules linked in: xt_state xt_connmark nf_conntrack_netlink nfnetlink xt_addrtype xt_statistic xt_nat xt_tcpudp ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nvidia_uvm(POE) nvidia_drm(POE) drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt nvidia_modeset(POE) nvidia(POE) iptable_mangle ip6table_mangle ip6table_filter ip6table_nat ip6_tables xt_MASQUERADE xt_conntrack xt_comment iptable_filter xt_mark iptable_nat nf_nat bpfilter aufs overlay udp_diag tcp_diag inet_diag binfmt_misc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel input_leds psmouse crypto_simd cryptd serio_raw floppy sch_fq_codel nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ena drm efi_pstore ip_tables x_tables autofs4
kernel: [ 6416.920368] CPU: 0 PID: 13 Comm: rcu_tasks_trace Tainted: P OE 5.15.0-1071-aws #77~20.04.1-Ubuntu
kernel: [ 6416.920372] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
kernel: [ 6416.920374] RIP: 0010:trc_wait_for_one_reader+0x2b8/0x300
kernel: [ 6416.920376] Code: 00 00 00 4c 89 ef e8 37 ac 4e 00 eb 9f 44 89 fa 48 c7 c6 00 63 e2 b8 48 c7 c7 a0 9a 1e b9 c6 05 2f 2e 09 02 01 e8 15 2e b9 00 <0f> 0b e9 31 ff ff ff 4c 89 ee 48 c7 c7 20 df b7 b9 e8 a2 99 52 00
kernel: [ 6416.920380] RSP: 0018:ffff9e048c4efe00 EFLAGS: 00010286
kernel: [ 6416.920382] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
kernel: [ 6416.920384] RDX: 0000000000000027 RSI: 0000000000000003 RDI: ffff93074ae20588
kernel: [ 6416.920385] RBP: ffff9e048c4efe28 R08: ffff93074ae20580 R09: 0000000000000001
kernel: [ 6416.920387] R10: 0000000000ffff0a R11: ffff93463feb2c7f R12: ffff92cbc6a1e600
kernel: [ 6416.920389] R13: 0000000000000040 R14: 00000000000205a4 R15: 0000000000000040
kernel: [ 6416.920390] FS: 0000000000000000(0000) GS:ffff93074ae00000(0000) knlGS:0000000000000000
kernel: [ 6416.920393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [ 6416.920394] CR2: 00007f4a72b04098 CR3: 00000046c8964001 CR4: 00000000001706f0
kernel: [ 6416.920399] Call Trace:
kernel: [ 6416.920401] <TASK>
kernel: [ 6416.920404] ? show_regs.cold+0x1a/0x1f
kernel: [ 6416.920410] ? trc_wait_for_one_reader+0x2b8/0x300
kernel: [ 6416.920412] ? __warn+0x8b/0xe0
kernel: [ 6416.920418] ? trc_wait_for_one_reader+0x2b8/0x300
kernel: [ 6416.920421] ? report_bug+0xd5/0x110
kernel: [ 6416.920427] ? handle_bug+0x39/0x90
kernel: [ 6416.920431] ? exc_invalid_op+0x19/0x70
kernel: [ 6416.920434] ? asm_exc_invalid_op+0x1b/0x20
kernel: [ 6416.920442] ? trc_wait_for_one_reader+0x2b8/0x300
kernel: [ 6416.920446] rcu_tasks_trace_postscan+0x47/0x80
kernel: [ 6416.920449] rcu_tasks_wait_gp+0x108/0x210
kernel: [ 6416.920453] rcu_tasks_kthread+0x10f/0x1c0
kernel: [ 6416.920456] ? wait_woken+0x60/0x60
kernel: [ 6416.920462] ? show_rcu_tasks_trace_gp_kthread+0x80/0x80
kernel: [ 6416.920464] kthread+0x12a/0x150
kernel: [ 6416.920471] ? set_kthread_struct+0x50/0x50
kernel: [ 6416.920476] ret_from_fork+0x22/0x30
kernel: [ 6416.920485] </TASK>
kernel: [ 6416.920486] ---[ end trace 0500611ddaff33a7 ]---
The problem appears when:
- The system is performing an rcu_tasks_trace grace period wait
- The system has more hot-plug CPU slots available than are populated
- The rcu_tasks postscan detects a holdout
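To check whether a machine meets the second condition, one option is to compare the CPU lists that sysfs exposes in /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online. The following is a minimal userspace sketch, not part of the fix. The function names are made up for illustration, and it assumes the usual cpulist format such as "0-63" or "0-3,8-11".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Count the CPUs named by a cpulist string such as "0-63" or "0-3,8-11".
 * A trailing newline is tolerated. Returns -1 on malformed input. */
int count_cpus_in_list(const char *list)
{
    int total = 0;
    const char *p = list;

    assert(list != NULL);
    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        long last = first;

        if (end == p || first < 0)
            return -1;              /* no digits, or negative CPU id */
        if (*end == '-') {          /* a range "first-last" */
            p = end + 1;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
        }
        total += (int)(last - first + 1);
        if (*end == ',')
            end++;                  /* move on to the next entry */
        else if (*end != '\0' && *end != '\n')
            return -1;
        p = end;
    }
    return total;
}

/* Read the first line of a sysfs cpulist file and count its CPUs. */
int count_cpus_in_file(const char *path)
{
    char buf[4096];
    FILE *f = fopen(path, "r");
    int n;

    if (f == NULL)
        return -1;
    n = fgets(buf, sizeof(buf), f) ? count_cpus_in_list(buf) : 0;
    fclose(f);
    return n;
}

/* 1 if more CPUs are possible than online, 0 if not, -1 on error. */
int has_unpopulated_cpu_slots(void)
{
    int possible = count_cpus_in_file("/sys/devices/system/cpu/possible");
    int online = count_cpus_in_file("/sys/devices/system/cpu/online");

    if (possible < 0 || online < 0)
        return -1;
    return possible > online;
}
```

A result of 1 from has_unpopulated_cpu_slots() only says the machine has possible CPUs that are not online. It does not by itself show that the warning will fire.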
The problem is actually caused by a mismerge of 9b3c4ab304 ("sched,rcu:
Rework try_invoke_on_locked_down_task()"). When that patch was
applied, a conflict around task nesting was improperly resolved, which
led to quiescent tasks getting flagged as holdouts. This in turn
results in more IPIs than necessary to idle CPUs, as well as WARNs
about failing to send IPIs to CPUs that aren't running.
The fix is a twofer: 1) manually correct the mismerge in the same way
that mainline resolved the conflict, and 2) backport an additional RCU
patch that confines the rcu_tasks postscan to only CPUs that are
running.
[Backport]
The upstream merge that shows the correct manual resolution of the
merge conflicts is in this commit:
commit 6fedc28076bbbb32edb722e80f9406a3d1d668a8
Merge tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
specifically:
> @@ -951,18 +942,18 @@ static int trc_inspect_reader(struct task_struct *t, void *arg)
> n_heavy_reader_updates++;
> if (ofl)
> n_heavy_reader_ofl_updates++;
> - in_qs = true;
> + nesting = 0;
> } else {
> // The task is not running, so C-language access is safe.
> - in_qs = likely(!t->trc_reader_nesting);
> + nesting = t->trc_reader_nesting;
> }
>
> - // Mark as checked so that the grace-period kthread will
> - // remove it from the holdout list.
> - t->trc_reader_checked = true;
> -
> - if (in_qs)
> - return 0; // Already in quiescent state, done!!!
> + // If not exiting a read-side critical section, mark as checked
> + // so that the grace-period kthread will remove it from the
> + // holdout list.
> + t->trc_reader_checked = nesting >= 0;
> + if (nesting <= 0)
> + return nesting ? -EINVAL : 0; // If in QS, done, otherwise try again later.
The additional rcu_tasks patch for only running postscan on online
cpus is:
commit 5c9a9ca44fda41c5e82f50efced5297a9c19760d
rcu-tasks: Idle tasks on offline CPUs are in quiescent
I've additionally reached out to upstream about including this in
stable:
https://lore.kernel.org/stable/c56243da5c8b4451097b39468166248790f9a1de.1732237776.git.kjlx@templeofstupid.com/T/#t
[Test]
A trivial reproducer for this problem is to use an up-to-date version
of bpftrace to run a kfunc probe, which when destroyed uses the
rcu_tasks_trace facility to clean up:
bpftrace -e 'kfunc:tcp_reset {@a = count();}'
^C
That is all that's necessary to reproduce the problem on a Xen EC2 system.
I've run with and without the patches applied and can confirm that
either patch alone, or both together, is sufficient to resolve the
problem. Correcting the nesting ensures that idling CPUs don't get
flagged as holdouts, and confining the scan to just online CPUs ensures
that even if we incorrectly flag a CPU as a holdout the warning won't
trigger because sending the IPI won't fail.
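The reasoning above can be illustrated with a toy model. This is not kernel code, and model_failed_ipis() and its parameters are invented for the example. It assumes that an IPI fails exactly when it is aimed at a CPU that is flagged as a holdout but is not online.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count the IPI attempts that would fail during a scan over nr_possible
 * CPU slots. holdout[cpu] says whether that CPU's idle task was flagged
 * as a holdout, online[cpu] whether the CPU is running. With online_only
 * set, the scan skips CPUs that are not online, as the second patch is
 * described as doing.
 */
size_t model_failed_ipis(const bool *online, const bool *holdout,
                         size_t nr_possible, bool online_only)
{
    size_t failed = 0;

    assert(online != NULL && holdout != NULL);
    for (size_t cpu = 0; cpu < nr_possible; cpu++) {
        if (online_only && !online[cpu])
            continue;       /* offline CPU never scanned */
        if (holdout[cpu] && !online[cpu])
            failed++;       /* IPI sent to a CPU that is not running */
    }
    return failed;
}
```

With two online and two offline slots, flagging every idle task as a holdout and scanning all slots yields two failures in this model. Clearing the holdout flags (the nesting fix) or setting online_only (the postscan fix) each brings the count to zero on its own.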
[Potential Regression]
The regression potential is low. The corrected commit has been
present in mainline since 2021 and the fix to only run postscan on
online CPUs has been present since 2022.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2089373/+subscriptions