Tuesday

[Bug 2070329] Re: KOP L2 guest fails to boot with 1 core - SMT8 topology

Many thanks, Gautam, for the successful verification!
(I'm updating the tags accordingly ...)

** Tags removed: verification-needed-noble-linux
** Tags added: verification-done-noble-linux

--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2070329

Title:
KOP L2 guest fails to boot with 1 core - SMT8 topology

Status in The Ubuntu-power-systems project:
Fix Committed
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Noble:
Fix Committed
Status in linux source package in Oracular:
Fix Released

Bug description:
SRU Justification:

[ Impact ]

* On a P10 system with SMT-8 configured
a level 2 guest (VM) fails to boot in case
it only has one core assigned.

[ Test Plan ]

* Set up an IBM Power 10 system (which supports up to SMT-8)
with firmware 1060, which offers support for KVM,
using Ubuntu Server 24.04 for ppc64el.

* Set up qemu/KVM on this system.

* Configure a KVM guest (e.g. using virtinst or
qemu-system-ppc64 directly) with SMT-8,
but only one virtual core (cores=1, threads=8).

* Try to boot this specific guest:
qemu-system-ppc64 \
-drive file=rhel.qcow2,format=qcow2 \
-m 20G \
-smp 8,cores=1,threads=8 \
-cpu host \
-nographic \
-machine pseries,ic-mode=xics -accel kvm

* It will fail to boot with a kernel that does not
have the two patches in place.

* Since this setup requires a special firmware level,
the verification will be done by the IBM Power team.
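
The -smp value in the test plan follows from the topology: the number of
vCPUs is cores times threads per core. A minimal sketch of that arithmetic,
assuming a POSIX shell; smp_arg is a hypothetical helper name and is not
part of qemu or of the original report:

```shell
#!/bin/sh
# Hypothetical helper (not part of qemu): build the value for the -smp
# option from a core count and a threads-per-core count.
# vCPUs = cores * threads
smp_arg() {
    cores=$1
    threads=$2
    vcpus=$((cores * threads))
    printf '%s,cores=%s,threads=%s\n' "$vcpus" "$cores" "$threads"
}

# One core with eight threads, the topology used in the test plan.
echo "-smp $(smp_arg 1 8)"
```

For one core and eight threads this prints the same -smp argument that the
test plan passes to qemu-system-ppc64.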

[ Where problems could occur ]

* Primarily support for using DPDES (register) is required,
since it's needed for enabling usage of doorbells in L2 guests.
This is mainly done by adding DEFINEs, stubs and case.
If the definitions are not correct or if the code executed by
the new case (KVMPPC_GSID_DPDES) is done wrong,
the guest state could be incorrect, harming the L2 guest doorbell.
(DPDES is to provide the means for the hypervisor to save a
[sub-]processor's Directed Privileged Doorbell exception state
when the set of programs running on the [sub-]processor is
swapped out or moved from one [sub-]processor to another.)

* The missing doorbell emulation got added by a four-line if statement
in powerpc/kvm/book3s_hv.c, which is relatively traceable.

* The main issue I can think of is that kvmppc_set_dpdes is called
with wrong arguments.

* And kvmppc_set_dpdes will not work (at all) if the above DPDES
support (and commit/patch) is missing.

[ Other Info ]

* Since (nested) KVM support is new on P10,
this does not affect older Power generations
(P9 is the only other hw generation that is supported by 24.04,
but it only supports native virtualization).

* Both patches have been accepted upstream since v6.11(-rc1),
hence will be in oracular,
and are also tagged upstream as stable updates.

* Since the required firmware FW1060 is relatively new,
we can assume that not many users have run into this issue yet.
__________

== Comment: #0 - SEETEENA THOUFEEK <sthoufee@in.ibm.com> - 2024-06-25 01:24:11 ==
+++ This bug was initially created as a clone of Bug #205277 +++

---Problem Description---
KOP L2 guest fails to boot with 1 core - SMT8 topology

---Additional Hardware Info---
na

---Debugger Data---
na

---Steps to Reproduce---
 KOP L2 guest fails to boot when we set the CPU topology as 1 core - SMT 8

command line used to verify the issue:
#!/bin/sh

QEMU="/home/mgautam/qemu"
qemu-system-ppc64 -s \
-drive file=/root/debian-12-nocloud-ppc64el.qcow2,format=qcow2 \
-m 20G \
-smp 8,cores=1,sockets=1,threads=8 \
-cpu host \
-nographic \
-machine pseries,ic-mode=xics -accel kvm \
-net nic,model=virtio \
-net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10022-:22

NOTE: L2 boots fine when doorbells are turned off in L1 kernel

As per the investigation so far, the doorbell exception is not getting
fired inside L2 guest. At L1 level, if we set DPDES=1 in the GSB for
L2, the guest never receives the doorbell and also it is never cleared
from the GSB. We are discussing this behaviour with phyp team.

The root cause of this issue is lack of DPDES support at L1. I've
posted the fix upstream:
https://lore.kernel.org/linuxppc-dev/20240522084949.123148-1-gautam@linux.ibm.com/T/#u

The fix has been accepted upstream and will be backported for kernels
>= 6.7

https://lore.kernel.org/linuxppc-dev/20240605113913.83715-1-gautam@linux.ibm.com/
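
To see whether a running L1 kernel falls into the ">= 6.7" range named
above, a version comparison along these lines may help. This is only a
sketch: version_ge is a hypothetical helper, it assumes a sort that
supports -V (as GNU sort does), and a matching version does not by itself
show that the fix is present in that kernel build.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
# Assumes a sort that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the local suffix from the release string,
# e.g. 6.8.0-31-generic becomes 6.8.0
kver=$(uname -r | cut -d- -f1)

if version_ge "$kver" 6.7; then
    echo "$kver is in the range named for the backport (>= 6.7)"
else
    echo "$kver predates 6.7"
fi
```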

---Patches Installed---
na

---System Hang---
 na

---uname output---
na

Contact Information = na

Machine Type = na

Userspace rpm: na

Userspace tool common name: na

The userspace tool has the following bit modes: na

Userspace tool obtained from project website: na

*Additional Instructions for na:
-Post a private note with access information to the machine that is currently in the debugger.
-Attach ltrace and strace of userspace application.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/2070329/+subscriptions
