You received this bug notification because you are subscribed to linux in Ubuntu.
Matching subscriptions: Bgg, Bmail, Nb
https://bugs.launchpad.net/bugs/2150318

Title:
  NFS: wsize/rsize regression in 6.8.0-110 causes EIO on NFSv4.1/4.2 writes (commit ae0cf4493dd3)

Status in linux package in Ubuntu:
  New

Bug description:

NFS: wsize/rsize regression in 6.8.0-110 causes EIO on NFSv4.1/4.2 writes (commit ae0cf4493dd3)
=================================================================================================

Package:       linux (source)
Component:     fs/nfs/client.c - nfs_server_copy_userdata()
Guilty commit: ae0cf4493dd3 - NFS: Fix inheritance of the block sizes when automounting
Severity:      High - data write failure, EIO returned to applications
Affects:       Ubuntu 24.04 LTS (Noble) - kernel 6.8.0-110+, kernel 6.17.x (HWE)
Not affected:  Ubuntu 24.04 LTS kernel 6.8.0-107 and earlier
Mitigated in:  Ubuntu 26.04 kernel 7.0 (retry on error, but root cause still present)
Protocol:      NFSv4.1 and NFSv4.2 only (NFSv4.0 and NFSv3 not affected)
Impact:        Identified on over 100 production Ubuntu 24.04 LTS workstations in an enterprise environment


SUMMARY
-------

Commit ae0cf4493dd3 ("NFS: Fix inheritance of the block sizes when automounting"),
backported in kernel 6.8.0-110.110, introduced a regression in
nfs_server_copy_userdata() (fs/nfs/client.c). The patch made the copy of
wsize/rsize conditional on NFS_AUTOMOUNT_INHERIT_WSIZE/RSIZE flags. These flags
are only set when the user explicitly specifies wsize/rsize at mount time. When
mounting with default options (the standard case), the session-negotiated
wsize/rsize values (with COMPOUND overhead correctly subtracted by
nfs4_session_limit_rwsize()) are no longer propagated. The client ends up using
the raw ca_maxrequestsize value, generating oversized RPC WRITE requests that
the server rejects with NFS4ERR_REQ_TOO_BIG.
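
The flag-gated copy described in the summary can be illustrated with a small
shell model. This is only a sketch, not kernel code: the function name
copy_wsize, the variable INHERIT_WSIZE, and its bit value are illustrative
assumptions, and the sizes are the ones quoted in this report.

```shell
#!/bin/sh
# Toy model of the wsize handling in nfs_server_copy_userdata().
# Assumption: INHERIT_WSIZE stands in for NFS_AUTOMOUNT_INHERIT_WSIZE;
# the bit value 4 is illustrative only.
INHERIT_WSIZE=4

# copy_wsize FLAGS SOURCE_WSIZE TARGET_WSIZE
# Prints the wsize the target holds after the copy step.
copy_wsize() {
    if [ $(( $1 & INHERIT_WSIZE )) -ne 0 ]; then
        echo "$2"    # flag set: the source value is copied
    else
        echo "$3"    # flag clear: the target keeps its own value
    fi
}

# Default mount (no wsize= option, flag clear): the limited value is not copied.
copy_wsize 0 1047532 1048576
# Explicit wsize=524288 (flag set): the value is copied.
copy_wsize "$INHERIT_WSIZE" 524288 1048576
```

In this model the default-mount call leaves the target at the raw 1,048,576
value, while the explicit-option call propagates 524,288, which is the
behavior the report attributes to the regression and to Workaround 1.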


IMPACT
------

Any application performing writes of 1 MiB or larger on an NFSv4.1/4.2 mount
where the server's Write Transfer Max Size equals the negotiated wsize will
experience data write failures (EIO on fsync). This affects dd, cp, rsync, and
any application using large write buffers. The issue impacts any enterprise NAS
platform where ca_maxrequestsize equals the maximum write payload (common with
Dell PowerScale, NetApp ONTAP, and other platforms).


ROOT CAUSE ANALYSIS
-------------------

### The guilty commit

Commit:   ae0cf4493dd3
Title:    NFS: Fix inheritance of the block sizes when automounting
File:     fs/nfs/client.c
Function: nfs_server_copy_userdata()


### Function BEFORE the regression (kernel 6.8.0-107, working)

    void nfs_server_copy_userdata(struct nfs_server *target,
                                  struct nfs_server *source)
    {
        target->flags = source->flags;
        target->rsize = source->rsize;
        target->wsize = source->wsize;
        target->acregmin = source->acregmin;
        target->acregmax = source->acregmax;
        target->acdirmin = source->acdirmin;
        target->acdirmax = source->acdirmax;
        target->options = source->options;
        target->auth_info = source->auth_info;
        target->port = source->port;
    }

wsize and rsize are always copied unconditionally. Values computed by
nfs4_session_limit_rwsize() (subtracting nfs41_maxwrite_overhead from
ca_maxrequestsize) are correctly propagated.
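
As a sanity check on the numbers in this report, the overhead subtraction
mentioned above can be sketched as plain shell arithmetic. The function names
are hypothetical, and 1044 is the approximate write overhead quoted in this
report rather than a value read from the kernel.

```shell
#!/bin/sh
# Values taken from this report.
MAX_REQUEST=1048576     # ca_maxrequestsize negotiated by CREATE_SESSION
WRITE_OVERHEAD=1044     # approximate COMPOUND overhead of a WRITE

# limit_wsize MAX_REQUEST OVERHEAD: largest payload that still fits.
limit_wsize() {
    echo $(( $1 - $2 ))
}

# request_fits WSIZE OVERHEAD MAX_REQUEST: succeeds if payload plus
# overhead stays within the session limit.
request_fits() {
    [ $(( $1 + $2 )) -le "$3" ]
}

good=$(limit_wsize "$MAX_REQUEST" "$WRITE_OVERHEAD")
echo "limited wsize: $good"
request_fits "$good" "$WRITE_OVERHEAD" "$MAX_REQUEST" && echo "limited wsize fits"
request_fits "$MAX_REQUEST" "$WRITE_OVERHEAD" "$MAX_REQUEST" || echo "raw wsize is too big"
```

With these inputs the limited wsize is 1,047,532, whose request totals exactly
1,048,576 bytes, while a payload of 1,048,576 totals about 1,049,620 bytes and
exceeds the session limit.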


### Function AFTER the regression (kernel 6.8.0-110, broken)

    void nfs_server_copy_userdata(struct nfs_server *target,
                                  struct nfs_server *source)
    {
        target->flags = source->flags;
        target->automount_inherit = source->automount_inherit;
        if (source->automount_inherit & NFS_AUTOMOUNT_INHERIT_BSIZE)
            target->bsize = source->bsize;
        if (source->automount_inherit & NFS_AUTOMOUNT_INHERIT_RSIZE) /* <-- BUG */
            target->rsize = source->rsize;
        if (source->automount_inherit & NFS_AUTOMOUNT_INHERIT_WSIZE) /* <-- BUG */
            target->wsize = source->wsize;
        target->acregmin = source->acregmin;
        target->acregmax = source->acregmax;
        target->acdirmin = source->acdirmin;
        target->acdirmax = source->acdirmax;
        target->options = source->options;
        target->auth_info = source->auth_info;
        target->port = source->port;
    }

Copy of wsize/rsize is now conditional on INHERIT flags. These flags are only
set when the user explicitly passes wsize=/rsize= at mount time (ctx->wsize != 0).
In the default mount case, the flags are never set and the negotiated values are
silently dropped.


### Proposed fix (tested and validated)

    void nfs_server_copy_userdata(struct nfs_server *target,
                                  struct nfs_server *source)
    {
        target->flags = source->flags;
        target->automount_inherit = source->automount_inherit;
        if (source->automount_inherit & NFS_AUTOMOUNT_INHERIT_BSIZE)
            target->bsize = source->bsize;
        target->rsize = source->rsize; /* FIXED */
        target->wsize = source->wsize; /* FIXED */
        target->acregmin = source->acregmin;
        target->acregmax = source->acregmax;
        target->acdirmin = source->acdirmin;
        target->acdirmax = source->acdirmax;
        target->options = source->options;
        target->auth_info = source->auth_info;
        target->port = source->port;
    }

Restores unconditional copy of wsize/rsize while keeping the new
automount_inherit field and the conditional copy of bsize (the original intent
of the commit).
This fix was compiled as a replacement nfs.ko module, installed
on a test machine running kernel 6.8.0-110-generic, and validated: wsize
reverted from 1,048,576 to 1,047,532 and all write operations succeeded.


### The chain of events

1. Client mounts NFSv4.2 without explicit wsize (default behavior).
2. CREATE_SESSION negotiates ca_maxrequestsize = 1,048,576 bytes.
3. nfs4_session_limit_rwsize() correctly computes wsize = 1,048,576 - 1,044 = 1,047,532.
4. NFS_AUTOMOUNT_INHERIT_WSIZE is NOT set (ctx->wsize was 0).
5. nfs_server_copy_userdata() skips copying the negotiated wsize.
6. The server object ends up with wsize = 1,048,576 (raw ca_maxrequestsize).
7. WRITE RPC: 1,048,576 payload + ~1,044 overhead = ~1,049,620 bytes total.
8. Server rejects with NFS4ERR_REQ_TOO_BIG (RFC 8881 Section 18.36).
9. Client enters a retry loop, sequence IDs desynchronize, returns EIO.


### Why NFSv4.0 is not affected

NFSv4.0 does not use sessions (no CREATE_SESSION, no ca_maxrequestsize, no
nfs4_session_limit_rwsize()). Confirmed by testing: NFSv4.0 works correctly
on all kernel versions.


NOTE ON KERNEL 7.0 (UBUNTU 26.04) BEHAVIOR
------------------------------------------

While kernel 7.0 does not exhibit the EIO failure, the underlying bug is still
present. Kernel 7.0 negotiates the same incorrect wsize=1,048,576 (verified via
nfsstat -m), and the server still rejects the oversized RPC with
NFS4ERR_REQ_TOO_BIG (verified via tcpdump). The difference is that kernel 7.0
handles the error by reducing the write size and retrying.

This approach has several drawbacks:

- Every first WRITE at 1 MiB is rejected, generating unnecessary network
  round-trips and latency on every large write operation.
- NFS4ERR_SEQ_MISORDERED errors following each rejection indicate sequence
  ID desynchronization, impacting session slot efficiency.
- The retry mechanism masks the negotiation defect without eliminating it:
  the client systematically sends packets the server has to reject.
- On high-throughput workloads (backups, large file transfers), the
  cumulative cost of rejected-then-retried RPCs can impact performance.

The proper fix is to restore correct wsize/rsize negotiation so that RPC WRITE
requests never exceed ca_maxrequestsize in the first place. The retry handling
in kernel 7.0 is a valuable defense-in-depth mechanism and should be backported,
but it should not be considered a substitute for fixing the negotiation logic.


BISECTION RESULTS
-----------------

Kernel bisection on the same machine, same NFS export, same default mount
options. Only the kernel changed between reboots.

Kernel       Date         wsize negotiated  dd bs=1M  Status
-----------  -----------  ----------------  --------  ---------
6.8.0-100    Feb 2026     1,047,532         OK        OK
6.8.0-107    13 Mar 2026  1,047,532         OK        Last good
6.8.0-110    19 Mar 2026  1,048,576         EIO       First bad

Git source confirmation: diff between tags Ubuntu-6.8.0-107.107 and
Ubuntu-6.8.0-110.110 on git.launchpad.net confirms that commit ae0cf4493dd3
is the only change to fs/nfs/client.c affecting wsize/rsize handling.


FULL CLIENT TEST MATRIX
-----------------------

OS            Kernel       wsize      dd bs=1M  Result  Note
------------  -----------  ---------  --------  ------  ---------------
Ubuntu 22.04  5.15.0-176   1,047,532  OK        OK      Reference
Ubuntu 24.04  6.8.0-100    1,047,532  OK        OK
Ubuntu 24.04  6.8.0-107    1,047,532  OK        OK      Last good
Ubuntu 24.04  6.8.0-110    1,048,576  EIO       BUG     First bad
Ubuntu 24.04  6.8.0-110*   1,047,532  OK        FIXED   Patched nfs.ko
Ubuntu 24.04  6.17.0-22    1,048,576  EIO       BUG     HWE
Ubuntu 26.04  7.0          1,048,576  OK**      Mitig.  Retry masks bug

*  Patched nfs.ko with unconditional wsize/rsize copy restored.
** OK via retry after NFS4ERR_REQ_TOO_BIG. Negotiation bug still present.

NFSv4.0: NOT affected (no sessions).
NFSv3:   NOT affected.
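
To check many clients against the wsize column above, the mount options
reported by the kernel can be parsed. A minimal sketch follows; the helper
names are hypothetical, the sample option string is made up for illustration,
and the 1048576 signature only applies to servers that advertise a 1 MiB
maximum, as in this report.

```shell
#!/bin/sh
# opt_value OPTIONS KEY: print the value of KEY in a comma-separated
# mount option string, or nothing if KEY is absent.
opt_value() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/^$2=//p" | head -n 1
}

# looks_affected OPTIONS: succeeds when the options show NFSv4.1/4.2
# with wsize equal to the raw 1 MiB session limit seen in this report.
looks_affected() {
    vers=$(opt_value "$1" vers)
    wsize=$(opt_value "$1" wsize)
    case "$vers" in
        4.1|4.2) [ "$wsize" = "1048576" ] ;;
        *) return 1 ;;
    esac
}

# Hypothetical option string in the style of /proc/mounts.
sample="rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,hard,proto=tcp"
if looks_affected "$sample"; then
    echo "suspect: wsize=$(opt_value "$sample" wsize)"
fi
```

On a live client the same helpers could be fed the options field (fourth
column) of each nfs4 entry in /proc/mounts. A match is only a hint that the
mount deserves a closer look, not proof of the regression.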


EVIDENCE
--------

### tcpdump on kernel 6.8.0-110 (failing)

  1222  2.970s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_REQ_TOO_BIG
  2289  2.979s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_REQ_TOO_BIG
  6622  3.015s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_SEQ_MISORDERED
  [... 20+ repetitions, client never recovers, returns EIO ...]

### tcpdump on kernel 7.0 (mitigated, not fixed)

  135  30.593s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_REQ_TOO_BIG
  137  30.593s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_SEQ_MISORDERED
  296  44.304s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_REQ_TOO_BIG
  298  44.304s  nas -> client  V4 Reply SEQUENCE: NFS4ERR_SEQ_MISORDERED

  Write completes after retry. Errors still occur on every large write.

### nfsstat -m comparison

  Kernel              wsize                    rsize
  ------------------  -----------------------  -----------------------
  5.15 / 6.8.0-107    1,047,532 (correct)      1,047,672 (correct)
  6.8.0-110           1,048,576 (regression)   1,048,576 (regression)
  6.8.0-110 patched   1,047,532 (fixed)        1,047,672 (fixed)
  7.0 (26.04)         1,048,576 (still wrong)  1,048,576 (still wrong)

### Dichotomy (binary search) tests - exact failure thresholds

  Kernel 6.8.0-110:
    bs = 1,044,480 (1020 KiB)  -> OK  (last passing)
    bs = 1,048,576 (1024 KiB)  -> EIO (= negotiated wsize)

  Kernel 6.17.0-22 (HWE):
    bs = 1,048,328  -> OK  (last passing)
    bs = 1,048,329  -> EIO (first failing)
    Delta: 248 bytes (vs ~1,044 on working kernels). Partial overhead
    subtraction.


STEPS TO REPRODUCE
------------------

1. Install Ubuntu 24.04 with kernel 6.8.0-110-generic (or later).

2. Mount an NFSv4.2 export from a server with Write Transfer Max Size = 1 MiB
   without specifying wsize:

     mount -t nfs4 -o vers=4.2 server:/export /mnt/nfs

3. Verify the negotiated wsize:

     nfsstat -m | grep wsize

   Result on -110: wsize=1048576 (BUG - raw value, no overhead subtracted)
   Result on -107: wsize=1047532 (correct - overhead subtracted)

4. Attempt a write:

     dd if=/dev/zero of=/mnt/nfs/test bs=1M count=1 conv=fsync

5. On -110: dd returns "Input/output error". On -107: dd succeeds.

6. A tcpdump capture on port 2049 shows NFS4ERR_REQ_TOO_BIG from the server.


ENVIRONMENT
-----------

  Server platform:              Dell PowerScale OneFS 9.7
  Protocol:                     NFSv4.2 (also reproduces on NFSv4.1)
  Write Transfer Max Size:      1 MiB (1,048,576 bytes)
  Write Transfer Size (pref.):  512 KiB
  Write Transfer Multiple:      512 bytes
  Client mount options:         Default (no explicit wsize/rsize)


WORKAROUNDS
-----------

Option 1 - Force wsize/rsize at mount time (recommended):

  mount -o vers=4.2,wsize=524288,rsize=524288 server:/export /mnt

  Sets the NFS_AUTOMOUNT_INHERIT_WSIZE/RSIZE flags. Configurable in
  /etc/fstab, autofs maps, or systemd mount units.

Option 2 - Boot kernel 6.8.0-107 or earlier:

  Not a long-term solution (missing security patches from -110+).

Option 3 - Use NFSv4.0:

  mount -o vers=4.0 server:/export /mnt

  Not affected (no sessions). Loses NFSv4.1+ features.


REQUESTED ACTION
----------------

Two complementary fixes are recommended:

1. Fix the regression in nfs_server_copy_userdata(): restore the unconditional
   copy of wsize/rsize while keeping the conditional copy of bsize. The
   proposed fix has been compiled, tested, and validated on kernel
   6.8.0-110-generic.

2. Backport the NFS4ERR_REQ_TOO_BIG retry logic from kernel 7.0 as
   defense-in-depth. This should not be considered a substitute for correct
   negotiation, as it generates unnecessary rejected RPCs, sequence
   desynchronization, and added latency on every large write operation.

Ubuntu 24.04 LTS is supported until 2029. This regression affects any
NFSv4.1/4.2 deployment where the server's ca_maxrequestsize equals the maximum
write payload size.
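
The wsize check from step 3 of STEPS TO REPRODUCE can be automated when many
clients need auditing, before or after a workaround is applied. The following
is a rough sketch in Python, not an existing tool. It assumes the usual
/proc/mounts line layout (device, mount point, filesystem type, comma-separated
options). The function names, the sample lines, and the 1 MiB default threshold
are illustrative assumptions taken from the environment described above.

```python
def parse_options(option_field: str) -> dict:
    """Turn 'rw,vers=4.2,wsize=1048576' into {'rw': '', 'vers': '4.2', ...}."""
    opts = {}
    for item in option_field.split(","):
        key, _, value = item.partition("=")
        opts[key] = value
    return opts


def looks_affected(mount_line: str, server_max: int = 1_048_576) -> bool:
    """Flag NFSv4.1/4.2 mounts whose wsize is not below the server maximum.

    server_max is an assumption: the server's maximum write size (1 MiB here).
    """
    fields = mount_line.split()
    if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
        return False
    opts = parse_options(fields[3])
    if opts.get("vers") not in ("4.1", "4.2"):
        return False  # NFSv3 and NFSv4.0 are reported as not affected
    wsize = opts.get("wsize", "")
    return wsize.isdigit() and int(wsize) >= server_max


# Hypothetical sample lines modelled on the values reported above.
bad = "server:/export /mnt/nfs nfs4 rw,vers=4.2,rsize=1048576,wsize=1048576 0 0"
good = "server:/export /mnt/nfs nfs4 rw,vers=4.2,rsize=1047672,wsize=1047532 0 0"
forced = "server:/export /mnt nfs4 rw,vers=4.2,rsize=524288,wsize=524288 0 0"

for line in (bad, good, forced):
    print(looks_affected(line), line.split()[3])
```

On a real client the same function could be applied to each line read from
/proc/mounts. A flagged mount only indicates that the wsize matches the raw
limit; the dd test in step 4 remains the actual confirmation.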


PACKAGE VERSIONS
----------------

  linux-image-6.8.0-110-generic  6.8.0-110.110  (first bad)
  linux-image-6.8.0-107-generic  6.8.0-107.107  (last good)
  nfs-common                     1:2.6.4-3ubuntu5.1
  libnfsidmap1                   1:2.6.4-3ubuntu5.1
  libtirpc3t64                   1.3.4+ds-1.1build1
  rpcbind                        1.2.6-7ubuntu2


GIT SOURCE REFERENCE
--------------------

  Repository:     https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble
  Tags compared:  Ubuntu-6.8.0-107.107 .. Ubuntu-6.8.0-110.110
  Guilty commit:  ae0cf4493dd3
                  NFS: Fix inheritance of the block sizes when automounting
  Files changed:  fs/nfs/client.c (nfs_init_server, nfs_server_copy_userdata)


RELATED REFERENCES
------------------

- Upstream commit 943cff67b842: "NFSv4.1: Fix the r/wsize checking"
  by Trond Myklebust.
- RFC 8881, Section 18.36: CREATE_SESSION - defines ca_maxrequestsize and
  NFS4ERR_REQ_TOO_BIG.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2150318/+subscriptions