
[Bug 2111599] Comment bridged from LTC Bugzilla

------- Comment From Niklas.Schnelle@ibm.com 2025-07-02 11:00 EDT-------
Verified this is working as intended on 6.8.0-64-generic from noble-proposed.

--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2111599

Title:
[UBUNTU 24.04] s390/pci: Fix zpci_bus_is_isolated_vf() for non-VF

Status in Ubuntu on IBM z Systems:
Fix Committed
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Noble:
Fix Committed
Status in linux source package in Oracular:
Won't Fix
Status in linux source package in Plucky:
Fix Released
Status in linux source package in Questing:
Invalid

Bug description:
SRU Justification:

[ Impact ]

 * For non-VFs, function zpci_bus_is_isolated_vf() should return false,
   because they aren't VFs.
   While zpci_iov_find_parent_pf() specifically checks if a function is a VF,
   it then simply returns that there is no parent.

 * The simplistic check for a parent then leads to these functions being
   confused with isolated VFs, and isolating them on their own domain even
   if sibling PFs should share the domain.

[ Fix ]

 * This is fixed by explicitly checking if a function is not a VF.
   (Notice that at this point the case where RIDs are ignored is already
    handled - and in this case all PCI functions get isolated by being
    properly detected in zpci_bus_is_multifunction_root().)

 * 8691abd3afaa "s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs"
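
   The difference between the two checks can be modelled outside the kernel.
   The following is only a toy sketch in Python, not the kernel code: the
   class ZpciDev, its fields and the helper names are assumptions made up
   for illustration. It mimics the described behavior, where the parent
   lookup reports "no parent" for a non-VF, and shows how an explicit
   "is this a VF at all?" check changes the result for PFs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ZpciDev:
    """Toy stand-in for a zPCI function (hypothetical, not struct zpci_dev)."""
    name: str
    is_vf: bool
    parent_pf: Optional[str] = None  # parent PF visible in this LPAR, if any


def find_parent_pf(dev: ZpciDev) -> Optional[str]:
    """Mimics the described lookup: a non-VF simply has no parent."""
    if not dev.is_vf:
        return None
    return dev.parent_pf


def is_isolated_vf_before_fix(dev: ZpciDev) -> bool:
    """Simplistic check: 'no parent found' is taken to mean 'isolated VF'."""
    return find_parent_pf(dev) is None


def is_isolated_vf_after_fix(dev: ZpciDev) -> bool:
    """Explicitly rule out non-VFs before looking for a parent."""
    if not dev.is_vf:
        return False
    return find_parent_pf(dev) is None


pf = ZpciDev("pf0", is_vf=False)
lone_vf = ZpciDev("vf0", is_vf=True)
vf_with_parent = ZpciDev("vf1", is_vf=True, parent_pf="pf0")

for dev in (pf, lone_vf, vf_with_parent):
    print(dev.name, is_isolated_vf_before_fix(dev), is_isolated_vf_after_fix(dev))
```

   In this model only the PF changes its result between the two variants;
   both kinds of VF are classified the same way before and after.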

[ Test Plan ]

 * Setup Ubuntu Server (24.04/24.10) for s390x on an IBM z17
   or LinuxONE 5 LPAR.

 * Have at least two PCIe-adapter-based PCHIDs (physical channel IDs)
   that support PF/VF (physical/virtual functions) available in this LPAR
   (like RoCE Express aka ConnectX-6, in NETD mode).

 * Attach multiple PFs of a PF access mode device (notice that this is only
   possible with z17 and L1-5 hardware), such as the two PFs of a NETD
   to the same LPAR.

 * Observe that they are put into separate PCI domains
   instead of sharing the same domain as expected by drivers.

* A PCI ID has the format "DDDD:BB:ff.d",
where "DDDD" is the PCI domain and "ff.d" the PCI function.

* With a fixed kernel, "lspci -vvt" shows
a common PCI domain (here "[0180:00]") in the PCI ID
that has both PCI functions attached (here "00.0" and "00.1"):
# lspci -vvt
-[0180:00]-+-00.0 Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
           \-00.1 Mellanox Technologies MT2894 Family [ConnectX-6 Lx]

* However, a kernel without the fix would show
two separate PCI domains here instead of a common one.

 * Due to lack of hardware, the verification will be conducted by IBM.

 * The fix was discussed upstream and flagged for stable kernel updates.
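
   The domain comparison from the test plan can also be done on a list of
   PCI IDs instead of reading the lspci tree by eye. The following is a
   minimal sketch in Python; the function names are hypothetical, and it
   assumes IDs in the "DDDD:BB:ff.d" form described above (for example as
   printed by "lspci -D" or as found in the entry names under
   /sys/bus/pci/devices).

```python
import re
from collections import defaultdict

# "DDDD:BB:ff.d": hex domain, hex bus, then the function part as named above.
PCI_ID_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2}\.[0-7])$"
)


def group_by_domain(pci_ids):
    """Map each PCI domain to the list of functions found in it."""
    domains = defaultdict(list)
    for pci_id in pci_ids:
        match = PCI_ID_RE.match(pci_id)
        if match is None:
            raise ValueError(f"not a PCI ID of the form DDDD:BB:ff.d: {pci_id!r}")
        domain, _bus, function = match.groups()
        domains[domain].append(function)
    return dict(domains)


def share_one_domain(pci_ids):
    """True if all given functions sit in exactly one PCI domain."""
    return len(group_by_domain(pci_ids)) == 1


# The two functions from the lspci output above share domain 0180.
fixed = ["0180:00:00.0", "0180:00:00.1"]
print(group_by_domain(fixed), share_one_domain(fixed))
```

   With the two IDs from the example, both functions land in domain 0180.
   On a kernel without the fix, the same two PFs would be expected to
   appear under two different domains, so share_one_domain() would be
   false for them.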

[ Where problems could occur ]

 * The modification is limited to one additional if statement
   (across two lines) in function zpci_bus_is_isolated_vf()
   in file arch/s390/pci/pci_bus.c.

 * Hence the modification will be limited to the s390x-specific parts of
   the PCI code in the kernel (sometimes referred to as zPCI),
   and will NOT impact any other architecture!

 * If the additional if statement is not correct and a wrong bool is returned,
   function zpci_bus_is_isolated_vf() might report an incorrect zpci_bus
   status: either isolated when it's not, or not isolated when it really is.

 * And since the new if statement got inserted before the already existing
   'if (!pdev)', the later code in function zpci_bus_is_isolated_vf()
   might be accidentally skipped.

 * All this might lead to similar confusion about whether functions are
   isolated VFs and whether they should be isolated on their own domain
   or not, hence proper testing is needed.

[ Other Info ]

 * Since the commit was accepted upstream with v6.15-rc1,
   'Questing' is not affected.

 * Plucky got fixed with
   https://bugs.launchpad.net/bugs/2108854
   (cherry-picked as https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/plucky/commit/?id=58412b2aa12aa43c90d5414e22898a041ec05e8a)

 * Hence the only remaining affected releases, that contain the offending
   commit ("s390/pci: Fix handling of isolated VFs"), are oracular and noble.

 * Since this issue does not happen (thus cannot be recreated nor tested)
   on the IBM Z hardware we have in Canonical, the verification(s) will be
   done by IBM on most recent hardware.
__________

s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs

Commit:
8691abd3afaadd816a298503ec1a759df1305d2e
-------------------------

              For non-VFs, zpci_bus_is_isolated_vf() should return false because they
              aren't VFs. While zpci_iov_find_parent_pf() specifically checks if
              a function is a VF, it then simply returns that there is no parent. The
              simplistic check for a parent then leads to these functions being
              confused with isolated VFs and isolating them on their own domain even
              if sibling PFs should share the domain.

              Fix this by explicitly checking if a function is not a VF. Note also
              that at this point the case where RIDs are ignored is already handled
              and in this case all PCI functions get isolated by being detected in
              zpci_bus_is_multifunction_root().

              Cc: stable@vger.kernel.org
              Fixes: 2844ddbd540f ("s390/pci: Fix handling of isolated VFs")
              Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
              Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
              Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2111599/+subscriptions
