** Tags added: kernel-daily-bug
--
You received this bug notification because you are subscribed to linux
in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2144592
Title:
Punching hole through CephFS hosted file causes corruption when
crossing 4MB RADOS object boundary
Status in linux package in Ubuntu:
New
Bug description:
Running CephFS on Ubuntu 24.04 (6.8 kernel): Ubuntu
6.8.0-100.100-generic 6.8.12
The enclosed script reproduce-ceph-punch-hole-corruption.py exposes an
issue we have found on recent kernels: CephFS silently corrupts the
16KB of data immediately before the requested hole when punching a
hole through a file (the test uses fallocate()). The corruption only
occurs when the hole ends at or crosses a 4MB RADOS object boundary
(4MB is the default stripe size); a hole that starts exactly at the
boundary is not affected.
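For readers without the attachment, the kind of check described above can be sketched roughly as follows. This is not the enclosed script: the helper names punch_hole and guard_region_intact are hypothetical, the code assumes 64-bit Linux (64-bit off_t), and it calls fallocate() through ctypes because Python's os module does not expose the mode argument.

```python
import ctypes
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Assumption: 64-bit Linux, where off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def punch_hole(fd, offset, length):
    """Deallocate [offset, offset + length) without changing the file size."""
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if _libc.fallocate(fd, mode, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def guard_region_intact(path, hole_start, hole_len, guard=16384, fill=0xFF):
    """Hypothetical helper: fill a file with `fill`, punch a hole, then
    verify that the `guard` bytes before the hole still hold `fill` and
    that the hole itself reads back as zeros."""
    total = hole_start + hole_len + guard
    with open(path, "w+b") as f:
        f.write(bytes([fill]) * total)
        f.flush()
        os.fsync(f.fileno())
        punch_hole(f.fileno(), hole_start, hole_len)
        f.seek(hole_start - guard)
        before = f.read(guard)
        hole = f.read(hole_len)
    return before == bytes([fill]) * guard and hole == bytes(hole_len)
```

A call such as guard_region_intact("/Shared_DataStore/probe.bin", 4190208, 8192) (the file name is made up) mirrors the "1 page before boundary, 2 pages" case. Whether this simplified sequence is enough to trigger the bug has not been confirmed; the enclosed script remains the authoritative reproducer.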
Execution shows the corruption:
# python3 ./reproduce-ceph-punch-hole-corruption.py /Shared_DataStore/
CephFS PUNCH_HOLE data corruption reproducer
============================================================
Mount point: /Shared_DataStore/
Object size: 4194304 (4 MiB)
Tests crossing 4MB object boundary (expect FAIL on buggy kernels):
------------------------------------------------------------
FAIL 1 page before boundary, 2 pages
hole=[4190208, 4198400) checked=[4173824, 4190208)
16384/16384 bytes read as 0x00 (expected 0xFF)
FAIL 2 pages before boundary, 4 pages
hole=[4186112, 4202496) checked=[4169728, 4186112)
16384/16384 bytes read as 0x00 (expected 0xFF)
FAIL 4 pages before boundary, 8 pages
hole=[4177920, 4210688) checked=[4161536, 4177920)
16384/16384 bytes read as 0x00 (expected 0xFF)
FAIL ends at boundary, 2 pages
hole=[4186112, 4194304) checked=[4169728, 4186112)
16384/16384 bytes read as 0x00 (expected 0xFF)
FAIL ends at boundary, 1 page
hole=[4190208, 4194304) checked=[4173824, 4190208)
16384/16384 bytes read as 0x00 (expected 0xFF)
Tests NOT crossing boundary (should always PASS):
------------------------------------------------------------
PASS within object 0
hole=[4161536, 4169728) checked=[4145152, 4161536)
PASS mid object 0
hole=[1048576, 1056768) checked=[1032192, 1048576)
PASS start of object 1
hole=[4194304, 4202496) checked=[4177920, 4194304)
PASS within object 1
hole=[5242880, 5251072) checked=[5226496, 5242880)
============================================================
Results: 4 passed, 5 failed out of 9
BUG CONFIRMED: This kernel has the CephFS PUNCH_HOLE corruption bug.
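The hole= and checked= ranges in the output above follow simple arithmetic, sketched below. The function name describe and the at_risk rule are illustrative assumptions: the rule merely reproduces the FAIL/PASS split of the nine cases shown and is not derived from the kernel code.

```python
OBJECT_SIZE = 4 * 1024 * 1024   # 4194304, the object size printed by the test
GUARD = 16384                   # bytes checked immediately before the hole


def describe(hole_start, hole_len, object_size=OBJECT_SIZE, guard=GUARD):
    """Return the half-open hole range, the checked range before it, and
    whether the hole ends at or crosses an object boundary."""
    hole_end = hole_start + hole_len
    return {
        "hole": (hole_start, hole_end),
        "checked": (hole_start - guard, hole_start),
        # True when the exclusive end falls in a later object than the
        # start; this covers both "crosses" and "ends exactly at".
        "at_risk": hole_start // object_size != hole_end // object_size,
    }


# "1 page before boundary, 2 pages" from the output above.
case = describe(OBJECT_SIZE - 4096, 2 * 4096)
print(case["hole"], case["checked"], case["at_risk"])
```

With this rule, a hole starting exactly at 4194304 is classed as not at risk, which matches the "start of object 1" PASS line.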
Enclosed is a patch submission detailing the issue (AI created):
0001-ceph-fix-data-corruption-from-short-read-on-punch-hole.patch
With the patch applied, the test script now passes:
# python3 /home/eceuser/reproduce-ceph-punch-hole-corruption.py /Shared_DataStore/
CephFS PUNCH_HOLE data corruption reproducer
============================================================
Mount point: /Shared_DataStore/
Object size: 4194304 (4 MiB)
Tests crossing 4MB object boundary (expect FAIL on buggy kernels):
------------------------------------------------------------
PASS 1 page before boundary, 2 pages
hole=[4190208, 4198400) checked=[4173824, 4190208)
PASS 2 pages before boundary, 4 pages
hole=[4186112, 4202496) checked=[4169728, 4186112)
PASS 4 pages before boundary, 8 pages
hole=[4177920, 4210688) checked=[4161536, 4177920)
PASS ends at boundary, 2 pages
hole=[4186112, 4194304) checked=[4169728, 4186112)
PASS ends at boundary, 1 page
hole=[4190208, 4194304) checked=[4173824, 4190208)
Tests NOT crossing boundary (should always PASS):
------------------------------------------------------------
PASS within object 0
hole=[4161536, 4169728) checked=[4145152, 4161536)
PASS mid object 0
hole=[1048576, 1056768) checked=[1032192, 1048576)
PASS start of object 1
hole=[4194304, 4202496) checked=[4177920, 4194304)
PASS within object 1
hole=[5242880, 5251072) checked=[5226496, 5242880)
============================================================
Results: 9 passed, 0 failed out of 9
All tests passed. This kernel is not affected (or the fix is applied).
The following commit appears to cause the issue:
92b6cc5d1e7c ("netfs: Add iov_iters to (sub)requests to describe
various buffers") by David Howells, authored 2023-09-27, committed
2023-12-24, merged in v6.8-rc1.
The bug is only present in 6.8 and 6.9 kernels. 6.10 rewrote this
code path in ee4cdf7ba857 ("netfs: Speed up buffered reading") by
David Howells, 2024-07-02, merged in v6.10, and no longer has this
issue.
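As a quick triage aid, a host can be checked against the version range stated above. This is only a sketch: the function name kernel_in_suspect_range is hypothetical, the 6.8 to 6.9 range is taken from this report and not independently verified, and distribution kernels carry backports, so the result is a hint at best. Running the reproducer is the real test.

```python
import platform
import re


def kernel_in_suspect_range(release):
    """Return True if the major.minor version in `release` falls in the
    range this report names as affected (6.8 and 6.9)."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = (int(m.group(1)), int(m.group(2)))
    return (6, 8) <= version < (6, 10)


if __name__ == "__main__":
    rel = platform.release()
    print(rel, "suspect" if kernel_in_suspect_range(rel) else "outside range")
```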
We are asking either for the enclosed patch to be reviewed for
inclusion in Stable, or for advice on another/better way to fix this.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2144592/+subscriptions