Public bug reported:
BugLink: https://bugs.launchpad.net/bugs/2093146
[Impact]
The CPUID instruction is particularly slow on newer generation processors,
with Intel's Emerald Rapids processor taking significantly longer to execute
CPUID than Skylake or Icelake.
This introduces significant latency into the KVM subsystem, which frequently
executes CPUID when recomputing XSTATE offsets, and especially when recomputing
XSAVE values, as CPUID must be called twice for each XSAVE call.
The values of CPUID.0xD.[1..n] are constant at runtime because they don't
depend on XCR0 or XSS, so they can be read once and cached for future use.
By caching CPUID.0xD.[1..n] at kvm.ko module load, latency improves by a
factor of nearly 4.
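To illustrate the idea, here is a toy Python model (not the kernel patch): the
CPUID.0xD sub-leaves are queried once at init, and later size computations read
only the cached table. The function names, the call counter and the sub-leaf
values are assumptions chosen for illustration; real hardware may report
different values.

```python
# Toy model of the caching idea; not kernel code. All names are hypothetical.
XSAVE_LEGACY_AND_HEADER = 512 + 64  # legacy region plus XSAVE header, in bytes

# Illustrative CPUID.0xD sub-leaf outputs: index -> (size EAX, offset EBX, flags ECX).
# Values are typical for AVX and AVX-512 state; real hardware may differ.
FAKE_HARDWARE = {
    2: (256, 576, 0),    # AVX (YMM) state
    5: (64, 1088, 0),    # AVX-512 opmask
    6: (512, 1152, 0),   # AVX-512 ZMM_Hi256
    7: (1024, 1664, 0),  # AVX-512 Hi16_ZMM
}

cpuid_calls = 0


def slow_cpuid_0xd(index):
    """Stand-in for executing CPUID.0xD.<index>; counts how often it runs."""
    global cpuid_calls
    cpuid_calls += 1
    return FAKE_HARDWARE.get(index, (0, 0, 0))


def init_xstate_cache(max_index=8):
    """Query every sub-leaf once, as would happen at module load."""
    return {i: slow_cpuid_0xd(i) for i in range(2, max_index)}


def required_size(xstate_bv, cache, compacted=False):
    """Compute the XSAVE area size from cached values only."""
    size = XSAVE_LEGACY_AND_HEADER
    for index, (eax, ebx, ecx) in cache.items():
        if not (xstate_bv >> index) & 1:
            continue
        if compacted:
            # ECX bit 1 requests 64-byte alignment in the compacted format.
            offset = (size + 63) & ~63 if ecx & 0x2 else size
        else:
            offset = ebx
        size = max(size, offset + eax)
    return size


cache = init_xstate_cache()
calls_after_init = cpuid_calls
sizes = [required_size(0x7, cache), required_size(0xE7, cache)]
calls_after_use = cpuid_calls
```

In this model the call counter stays the same after init, however many times
required_size() runs. That is the property the fix relies on.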
For a round-trip transition between VM-Enter and VM-Exit, the figures from the
commit log are:
Skylake 11650
Icelake 22350
Emerald 28850
With the caching applied:
Skylake 6850
Icelake 9000
Emerald 7900
That's a speedup of about 1.70x for Skylake, 2.48x for Icelake and 3.65x for
Emerald Rapids, i.e. latency reductions of roughly 41%, 60% and 73%.
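The figures above can be read two ways: as a speedup factor (before divided by
after) or as a percentage reduction in latency. A minimal sketch of that
arithmetic, using only the numbers quoted from the commit log (the helper name
is made up for illustration):

```python
# Round-trip figures copied from the bug description: (before, after).
figures = {
    "Skylake": (11650, 6850),
    "Icelake": (22350, 9000),
    "Emerald": (28850, 7900),
}


def summarize(before, after):
    """Return (speedup factor, percent reduction in latency)."""
    speedup = before / after
    reduction_pct = (before - after) / before * 100
    return round(speedup, 2), round(reduction_pct, 1)


results = {cpu: summarize(b, a) for cpu, (b, a) in figures.items()}
for cpu, (speedup, reduction) in results.items():
    print(f"{cpu}: {speedup}x faster, {reduction}% lower latency")
```

Note that a latency reduction can never exceed 100%, so the larger numbers are
best read as speedup factors.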
[Fix]
The fix is part of a 5-patch series. For now we will only SRU patch 1: it is
the only one in mainline and does the bulk of the work, providing a latency
improvement of nearly 4x. Patches 2-5 are refactors and smaller performance
improvements that are not yet mainlined because they need rework, and they
account for only about a 2.5% latency improvement, small compared to patch 1.
The fix is:
commit 1201f226c863b7da739f7420ddba818cedf372fc
Author: Sean Christopherson <seanjc@google.com>
Date: Tue Dec 10 17:32:58 2024 -0800
Subject: KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1201f226c863b7da739f7420ddba818cedf372fc
This applies cleanly to noble and oracular. For jammy, it requires the
dependency below, plus a small backport to fix some minor context mismatches.
commit cc04b6a21d431359eceeec0d812b492088b04af5
Author: Jing Liu <jing2.liu@intel.com>
Date: Wed Jan 5 04:35:14 2022 -0800
Subject: kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc04b6a21d431359eceeec0d812b492088b04af5
[Testcase]
1) Install the KVM stack on the bare-metal host
$ sudo apt-get install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils
2) Enable nested virt
$ sudo vim /etc/modprobe.d/kvm.conf
options kvm-intel nested=1
3) Start a VM.
4) Install KVM Stack in Guest
$ sudo apt-get install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils
5) Start a VM.
6) Install kvm-unit-tests and run x86/vmexit/cpuid testcase.
https://gitlab.com/kvm-unit-tests/kvm-unit-tests
(To be expanded upon).
[Where problems could occur]
This fix is related to nested virtualisation in the KVM subsystem. We are adding
a new function, called on KVM module load, which caches the CPUID.0xD output
instead of executing CPUID every time XSAVE needs to be recomputed, which can
happen multiple times on VM-Enter and VM-Exit for nested guests.
The cached CPUID values are static and should never change, so there should be
no issues in saving a value and reusing it later.
If a regression were to occur, it would affect all KVM users, and there would
be no workarounds.
[Other info]
Full mailing list series:
https://lore.kernel.org/kvm/20241211013302.1347853-1-seanjc@google.com/T/#u
** Affects: linux (Ubuntu)
Importance: Undecided
Status: Fix Released
** Affects: linux (Ubuntu Jammy)
Importance: Medium
Assignee: Matthew Ruffell (mruffell)
Status: In Progress
** Affects: linux (Ubuntu Noble)
Importance: Medium
Assignee: Matthew Ruffell (mruffell)
Status: In Progress
** Affects: linux (Ubuntu Oracular)
Importance: Medium
Assignee: Matthew Ruffell (mruffell)
Status: In Progress
** Tags: sts
** Also affects: linux (Ubuntu Noble)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Jammy)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Oracular)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu)
Status: New => Fix Released
** Changed in: linux (Ubuntu Jammy)
Status: New => In Progress
** Changed in: linux (Ubuntu Noble)
Status: New => In Progress
** Changed in: linux (Ubuntu Oracular)
Status: New => In Progress
** Changed in: linux (Ubuntu Jammy)
Importance: Undecided => Medium
** Changed in: linux (Ubuntu Jammy)
Assignee: (unassigned) => Matthew Ruffell (mruffell)
** Changed in: linux (Ubuntu Oracular)
Assignee: (unassigned) => Matthew Ruffell (mruffell)
** Changed in: linux (Ubuntu Noble)
Assignee: (unassigned) => Matthew Ruffell (mruffell)
** Changed in: linux (Ubuntu Noble)
Importance: Undecided => Medium
** Changed in: linux (Ubuntu Oracular)
Importance: Undecided => Medium
** Description changed:
- BugLink: https://bugs.launchpad.net/bugs/
+ BugLink: https://bugs.launchpad.net/bugs/2093146
** Tags added: sts
https://bugs.launchpad.net/bugs/2093146
Title:
KVM: Cache CPUID at KVM.ko module init to reduce latency of VM-Enter
and VM-Exit
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Jammy:
In Progress
Status in linux source package in Noble:
In Progress
Status in linux source package in Oracular:
In Progress