Wednesday

[Bug 2065369] Re: veth.sh from ubuntu_kselftests_net failed on J-5.15 / N-6.8 (with xdp attached - gro flag)

This bug is awaiting verification that the linux-azure/5.15.0-1109.118
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-azure' to
'verification-done-jammy-linux-azure'. If the problem still exists,
change the tag 'verification-needed-jammy-linux-azure' to
'verification-failed-jammy-linux-azure'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-azure-v2 verification-needed-jammy-linux-azure

--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2065369

Title:
veth.sh from ubuntu_kselftests_net failed on J-5.15 / N-6.8 (with xdp
attached - gro flag)

Status in ubuntu-kernel-tests:
In Progress
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Jammy:
Fix Released
Status in linux source package in Noble:
Fix Committed

Bug description:
[ Impact ]

The test veth.sh from ubuntu_kselftests_net fails on both Jammy and Noble.
...
bad setting: reducing RX nr below peer TX with XDP set ok
with xdp attached - gro flag fail - expected on found off
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation fail - got 10 packets, expected 1
        - after dev off, flag fail - expected on found off
        - peer flag ok
...

The failure is consistent: with an XDP program attached to a veth
interface, the test expects the GRO flag to be on but finds it off,
and the aggregation check then receives 10 packets instead of 1.

Commit d7db7775ea2e (net: veth: do not manipulate GRO when using XDP)
changed the veth driver's behavior: the driver no longer toggles GRO
automatically when an XDP program is attached or detached.
Both Noble and Jammy include this behavioral change,
but the kselftest net:veth.sh has not been updated accordingly.
Specifically, commit ba5a6476e386 (selftests: net: veth: test the ability
to independently manipulate GRO and XDP) is missing.
This creates a mismatch between actual kernel behavior and test expectations.
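
The flag the test compares can be inspected by hand. The following is a
minimal sketch, not part of veth.sh: feature_state is a hypothetical
helper that reads `ethtool -k` output on stdin and prints the state of
one feature. The sample line stands in for real output so the snippet
runs without root; on a test machine one would pipe in
`ethtool -k <interface>` instead.

```shell
# Hypothetical helper (not from veth.sh): print "on" or "off" for the
# feature named in $1, given `ethtool -k` output on stdin.
feature_state() {
    awk -v feat="$1" '{
        name = $1
        sub(/:$/, "", name)          # "generic-receive-offload:" -> name
        if (name == feat) { print $2; exit }
    }'
}

# Sample input standing in for `ethtool -k <interface>`.
printf 'generic-receive-offload: off\n' | feature_state generic-receive-offload
```

With the sample input above the helper prints "off", which is the state
the unpatched test reports as "expected on found off".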

[ Fix ]

Backport commit ba5a6476e386 (selftests: net: veth: test the ability
to independently manipulate GRO and XDP) from mainline.

[ Test ]

Execute net:veth.sh on both Noble and Jammy.

In Noble:
$ uname -a
Linux ubuntu-noble-amd64-server 6.8.0-91-generic #92-Ubuntu SMP PREEMPT_DYNAMIC Fri Nov 28 16:26:35 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
$ sudo apt install -y build-essential docutils-common ethtool iptables jq kernel-wedge libfuse-dev libnuma-dev libssl-dev net-tools pkg-config tcpdump uuid-runtime socat netsniff-ng libcap-dev libelf-dev clang llvm
$ fakeroot debian/rules clean
$ make -j$(nproc) headers
$ sudo make run_tests -C tools/testing/selftests/net TEST_PROGS=veth.sh
# selftests: net: veth.sh
# default - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - aggregation ok
# - aggregation with TSO off ok
# with gro on - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - aggregation with TSO off ok
# gro vs xdp while down - gro flag off ok
# - after down ok
# - after xdp off ok
# - after up ok
# - after peer xdp ok
# gro vs xdp while down - gro flag on ok
# - after down ok
# - after xdp off ok
# - after up ok
# - after peer xdp ok
# default channels ok
# with gro enabled on link down - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - aggregation with TSO off ok
# setting tx channels ok
# setting both rx and tx channels ok
# bad setting: combined channels ok
# setting invalid channels nr ok
# bad setting: XDP with RX nr less than TX ok
# bad setting: reducing RX nr below peer TX with XDP set ok
# bad setting: increasing peer TX nr above RX with XDP set ok
# setting invalid channels nr ok
# with xdp attached - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - no aggregation ok
# - gro flag with GRO on ok
# - aggregation ok
# - after dev off, flag ok
# - peer flag ok
# - after gro on xdp off, gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# decreasing tx channels with device down ok
# - aggregation ok
# increasing tx channels with device down ok
# aggregation again with default and TSO off ok
ok 14 selftests: net: veth.sh

In Jammy:

$ uname -a
Linux ubuntu-jammy-amd64-server 5.15.0-163-generic #173-Ubuntu SMP Tue Oct 14 17:51:00 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
$ sudo apt install -y build-essential docutils-common ethtool iptables jq kernel-wedge libfuse-dev libnuma-dev libssl-dev net-tools pkg-config tcpdump uuid-runtime socat netsniff-ng libcap-dev libelf-dev clang llvm
$ fakeroot debian/rules clean
$ make -j$(nproc) headers
$ make -j$(nproc) -C tools/testing/selftests TARGETS=bpf SKIP_TARGETS= KDIR=/usr/src/linux-headers-5.15.0-163-generic
$ sudo make run_tests -C tools/testing/selftests/net TEST_PROGS=veth.sh
# selftests: net: veth.sh
# default - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - aggregation ok
# - aggregation with TSO off ok
# with gro on - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - aggregation with TSO off ok
# gro vs xdp while down - gro flag on ok
# - after down ok
# - after xdp off ok
# - after up ok
# - after peer xdp ok
# default channels ok
# with gro enabled on link down - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - aggregation with TSO off ok
# setting tx channels ok
# setting both rx and tx channels ok
# bad setting: combined channels ok
# setting invalid channels nr ok
# bad setting: XDP with RX nr less than TX ok
# bad setting: reducing RX nr below peer TX with XDP set ok
# bad setting: increasing peer TX nr above RX with XDP set ok
# setting invalid channels nr ok
# with xdp attached - gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# - no aggregation ok
# - gro flag with GRO on ok
# - aggregation ok
# - after dev off, flag ok
# - peer flag ok
# - after gro on xdp off, gro flag ok
# - peer gro flag ok
# - tso flag ok
# - peer tso flag ok
# decreasing tx channels with device down ok
# - aggregation ok
# increasing tx channels with device down ok
# aggregation again with default and TSO off ok
ok 7 selftests: net: veth.sh

[ Regression Potential ]

The fix affects only scripts in kselftest.
No regression potential for the kernel.

---

Issue found with Jammy 5.15.0-111.121 in sru-20240429

The reproduction rate is 100% across different architectures on the OpenStack cloud.

Test log:
ubuntu@kt-j-l-gen-5-15-bc2r4d20-u-kselftests-net-amd64:~/autotest/client/tmp/ubuntu_kselftests_net/src/linux/tools/testing/selftests/net$ sudo ./veth.sh
default - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation ok
        - aggregation with TSO off ok
with gro on - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation with TSO off ok
default channels ok
with gro enabled on link down - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation with TSO off ok
setting tx channels ok
bad setting: combined channels ok
setting invalid channels nr ok
bad setting: XDP with RX nr less than TX ok
bad setting: reducing RX nr below peer TX with XDP set ok
with xdp attached - gro flag fail - expected on found off
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation fail - got 10 packets, expected 1
        - after dev off, flag fail - expected on found off
        - peer flag ok
        - after gro on xdp off, gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
decreasing tx channels with device down ok
        - aggregation ok
increasing tx channels with device down ok
aggregation again with default and TSO off ok
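
To pull only the failing checks out of a log like the one above, a small
filter is enough. This is a sketch under the assumption that failing
checks contain the word "fail" followed by a reason, as in this log;
failed_checks is a hypothetical name, and the sample lines are copied
from the log.

```shell
# Hypothetical helper: print the lines of a veth.sh log (files given as
# arguments, or stdin) that report a failed check.
failed_checks() {
    grep -E ' fail( |$)' "$@"
}

# Sample lines copied from the log above.
printf '%s\n' \
    'bad setting: reducing RX nr below peer TX with XDP set ok' \
    'with xdp attached - gro flag fail - expected on found off' \
    '        - peer gro flag ok' \
    '        - aggregation fail - got 10 packets, expected 1' |
    failed_checks
```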

This failure is different from our known issue with this test (LP:
#1949569, with gro on/aggregation with TSO off), and we did not see this
failure on the OpenStack cloud in previous cycles.

I have also verified the following combinations:
* 105 kernel + 106 source code - GOOD
* 106 kernel + 106 source code - GOOD
* 111 kernel + 106 source code - BAD
* 111 kernel + 111 source code - BAD
* 106 kernel + 111 source code - GOOD

To me, this looks like a possible regression in the kernel.
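
The combinations listed above can be grouped by kernel to show that the
result follows the kernel version and not the selftest source version.
This is only an illustrative sketch: the three columns (kernel, source,
result) are transcribed from the list, and results_by_kernel is a
hypothetical helper name.

```shell
# Combinations from the report: kernel, selftest source, result.
matrix='105 106 GOOD
106 106 GOOD
111 106 BAD
111 111 BAD
106 111 GOOD'

# For each kernel, print the distinct results seen across source versions.
results_by_kernel() {
    awk '{
        if (!seen[$1 SUBSEP $3]++)
            res[$1] = res[$1] (res[$1] ? "," : "") $3
    }
    END { for (k in res) print k, res[k] }' | sort -n
}

printf '%s\n' "$matrix" | results_by_kernel
```

Each kernel maps to a single result regardless of which source tree the
test was built from; only the 111 kernel is BAD.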

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2065369/+subscriptions
