CVE-2025-39756
Published: Sep 11, 2025
Modified: May 12, 2026
Description
In the Linux kernel, the following vulnerability has been resolved: fs: Prevent file descriptor table allocations exceeding INT_MAX When sysctl_nr_open is set to a very high value (for example, 1073741816 as set by systemd), processes attempting to use file descriptors near the limit can trigger massive memory allocation attempts that exceed INT_MAX, resulting in a WARNING in mm/slub.c: WARNING: CPU: 0 PID: 44 at mm/slub.c:5027 __kvmalloc_node_noprof+0x21a/0x288 This happens because kvmalloc_array() and kvmalloc() check if the requested size exceeds INT_MAX and emit a warning when the allocation is not flagged with __GFP_NOWARN. Specifically, when nr_open is set to 1073741816 (0x3ffffff8) and a process calls dup2(oldfd, 1073741880), the kernel attempts to allocate: - File descriptor array: 1073741880 * 8 bytes = 8,589,935,040 bytes - Multiple bitmaps: ~400MB - Total allocation size: > 8GB (exceeding INT_MAX = 2,147,483,647) Reproducer: 1. Set /proc/sys/fs/nr_open to 1073741816: # echo 1073741816 > /proc/sys/fs/nr_open 2. Run a program that uses a high file descriptor: #include <unistd.h> #include <sys/resource.h> int main() { struct rlimit rlim = {1073741824, 1073741824}; setrlimit(RLIMIT_NOFILE, &rlim); dup2(2, 1073741880); // Triggers the warning return 0; } 3. Observe WARNING in dmesg at mm/slub.c:5027 systemd commit a8b627a introduced automatic bumping of fs.nr_open to the maximum possible value. The rationale was that systems with memory control groups (memcg) no longer need separate file descriptor limits since memory is properly accounted. However, this change overlooked that: 1. The kernel's allocation functions still enforce INT_MAX as a maximum size regardless of memcg accounting 2. Programs and tests that legitimately test file descriptor limits can inadvertently trigger massive allocations 3. The resulting allocations (>8GB) are impractical and will always fail systemd's algorithm starts with INT_MAX and keeps halving the value until the kernel accepts it. On most systems, this results in nr_open being set to 1073741816 (0x3ffffff8), which is just under 1GB of file descriptors. While processes rarely use file descriptors near this limit in normal operation, certain selftests (like tools/testing/selftests/core/unshare_test.c) and programs that test file descriptor limits can trigger this issue. Fix this by adding a check in alloc_fdtable() to ensure the requested allocation size does not exceed INT_MAX. This causes the operation to fail with -EMFILE instead of triggering a kernel warning and avoids the impractical >8GB memory allocation request.
| Vendor | Product | Versions |
|---|---|---|
Linux | Linux | affected 9cfe015aa424b3c003baba3841a60dd9b5ad319b - < b4159c5a90c03f8acd3de345a7f5fc63b0909818affected 9cfe015aa424b3c003baba3841a60dd9b5ad319b - < f95638a8f22eba307dceddf5aef9ae2326bbcf98affected 9cfe015aa424b3c003baba3841a60dd9b5ad319b - < 749528086620f8012b83ae032a80f6ffa80c45cdaffected 9cfe015aa424b3c003baba3841a60dd9b5ad319b - < 628fc28f42d979f36dbf75a6129ac7730e30c04eaffected 9cfe015aa424b3c003baba3841a60dd9b5ad319b - < 237e416eb62101f21b28c9e6e564d10efe1ecc6f+4 more versions |
Linux | Linux | affected 2.6.25unaffected 0 - < 2.6.25unaffected 5.4.297 - <= 5.4.*unaffected 5.10.241 - <= 5.10.*unaffected 5.15.190 - <= 5.15.*+6 more versions |
References
Security Training
Train your team to recognize and prevent security threats with our comprehensive security awareness program.
Start TrainingVulnerability Scanning
Discover vulnerabilities in your applications and infrastructure before attackers do.
Scan Now