Add support for reading symlinks longer than PATH_MAX to readlink and readlinkat#1231
Conversation
asomers
left a comment
There was a problem hiding this comment.
I just checked, and I can't create a too-long symlink on FreeBSD, NetBSD, OpenBSD, or Solaris. Is there any operating system that does allow it?
| Ok(len) => { | ||
| unsafe { v.set_len(len as usize) } | ||
| v.shrink_to_fit(); | ||
| Ok(OsString::from_vec(v.to_vec())) |
There was a problem hiding this comment.
Not your fault, but I notice that this line is doing a data copy. Could you eliminate the data copy by taking the v argument by value instead of by reference?
| unsafe { libc::readlink(cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t) } | ||
| unsafe { | ||
| match dirfd { | ||
| Some(dirfd) => libc::readlinkat(dirfd, cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t), |
There was a problem hiding this comment.
What is this project's wrap width? (I normally wrap to 80 columns, but this project clearly has a wider convention.)
Edit: Apparently the rest of the file is almost all wrapped to 80 columns, and only the code I was editing wasn't. Wrapped all the code I touched to 80 columns.
| // Uh oh, the result is too long... | ||
| // Let's try to ask lstat how many bytes to allocate. | ||
| let reported_size = super::sys::stat::lstat(path).and_then(|x| Ok(x.st_size)).unwrap_or(0); | ||
| let mut try_size = if reported_size > 0 { reported_size as usize + 1 } |
There was a problem hiding this comment.
As a matter of style, I prefer
if condition {
something
} else {
something_else
}rather than cramming everything onto two lines
| // If lstat doesn't cooperate, or reports an error, be a little less precise. | ||
| else { (libc::PATH_MAX as usize).max(128) << 1 }; | ||
| loop { | ||
| v.reserve_exact(try_size); |
There was a problem hiding this comment.
Interesting. I think this may be the first good use of reserve_exact I've seen.
| } | ||
| else { | ||
| // Ugh! Still not big enough! | ||
| let next_size = try_size << 1; |
There was a problem hiding this comment.
I don't think we need to gracefully handle symlinks with paths greater than 32 bits. A panic would suffice. You can do
try_size = try_size.checked_mul(2).unwrap();There was a problem hiding this comment.
I think it's better to return an error in this case (as below), but if you ask me to I'll replace that logic with a checked multiply.
There was a problem hiding this comment.
In fact, I think this code will still panic if the size overflows 32 bits. I think << , like most operators, will panic on overflow in debug builds (but wrap on release builds). If you really want to handle the overflow without panicing, you need to use checked_shl.
| else { (libc::PATH_MAX as usize).max(128) << 1 }; | ||
| loop { | ||
| v.reserve_exact(try_size); | ||
| let res = path.with_nix_path(|cstr| { |
There was a problem hiding this comment.
These 13 lines are duped. Do you think you could deduplicate them?
There was a problem hiding this comment.
I think the wrong lines got selected. Do you mean L197-L204 and L218-L225?
| // Uh oh, the result is too long... | ||
| // Let's try to ask lstat how many bytes to allocate. | ||
| let reported_size = super::sys::stat::lstat(path).and_then(|x| Ok(x.st_size)).unwrap_or(0); | ||
| let mut try_size = if reported_size > 0 { reported_size as usize + 1 } |
There was a problem hiding this comment.
Instead of comparing reported_size to 0, you may as well compare it to PATH_MAX. After all, lstat reports less than what we already tried, then we shouldn't trust lstat's output.
There was a problem hiding this comment.
Trusting lstat after a failed readlink handles a race condition where the link size changes between the first readlink and the first lstat.
On Linux, as far as I can tell, it's up to the particular filesystem. None of [ext*, XFS, btrfs] support very long symlinks, though. |
|
Thank you for the thorough review. I have implemented every requested change except the panic on absurdly long symlinks. I'll implement that too while I'm "in the code" if you like. (I do agree that a >4GiB symlink is absolutely, without a doubt useless, but I also subscribe to the GNU convention of avoiding arbitrary limits where possible, even reasonable ones.) |
| fn wrap_readlink_result(v: &mut Vec<u8>, res: ssize_t) -> Result<OsString> { | ||
| fn wrap_readlink_result(mut v: Vec<u8>, res: ssize_t) -> Result<OsString> { | ||
| match Errno::result(res) { | ||
| Err(err) => Err(err), |
There was a problem hiding this comment.
This line is dead code. We only ever call wrap_readlink_result when res is >= 0, so you may as well eliminate the Errno::result part. I think that issued predated your changes, but since you're here would you please fix it?
| // Uh oh, the result is too long... | ||
| // Let's try to ask lstat how many bytes to allocate. | ||
| let reported_size = super::sys::stat::lstat(path) | ||
| .and_then(|x| Ok(x.st_size)).unwrap_or(0); |
There was a problem hiding this comment.
You could simply this line into .map_or(0, |x| x.st_size)
| } | ||
| else { | ||
| // Ugh! Still not big enough! | ||
| let next_size = try_size << 1; |
There was a problem hiding this comment.
In fact, I think this code will still panic if the size overflows 32 bits. I think << , like most operators, will panic on overflow in debug builds (but wrap on release builds). If you really want to handle the overflow without panicing, you need to use checked_shl.
It seems that, barring obviously foolish cases caught by the compiler, bit shifting wraps in both debug mode and release mode. Guess Rust's designers figured that overflow in bit shifting is likely to be intentional. Regardless, I replaced that code with Thanks again for the feedback. I've pushed the requested changes. If you think of any other tweaks you'd like me to make, let me know. |
|
Huh. Using |
Yes. Nix's MSRV is currently 1.36.0. We're pretty conservative with MSRV because we're consumed by platforms that are slow to update, like OpenBSD and embedded systems. Could you please remove |
|
Done, and now all the checks pass again. |
|
Great! Now all it needs is a squash. |
|
Should I be squashing it on my end? I assumed this would be a case for the "Squash and merge" button. |
|
Yes you should squash it. Because we use the bors merge bot, we cannot squash on our end. |
… and `readlinkat`
6fdd54d to
dfee424
Compare
|
There. Hopefully I did that properly. |
|
Build succeeded: |
This is in response to issue #1178.
The new logic uses the following approach.
readlinkreturns an error, or a value ≥ 0 and < (not ≤!) the buffer size, we're done.readlinkinto aPATH_MAXsized buffer. (This will almost always succeed, and saves a system call over callinglstatfirst.)lstatthe link. If it succeeds and returns a sane value, allocate the buffer to be that large plus one byte. Otherwise, allocate the buffer to bePATH_MAX.max(128) << 1bytes.readlink. Any time its result is ≥ (not >!) the buffer size, double the buffer size and try again.While testing this, I discovered that ext4 doesn't allow creation of a symlink > 4095 (Linux's
PATH_MAXminus one) bytes long. This is in spite of Linux happily allowing paths in other contexts to be longer than this—including on ext4! This was probably instated to avoid breaking programs that assumePATH_MAXwill always be enough, but ironically hindered my attempt to test support for not assuming. I tested the code using an artificially smallPATH_MAXand (separately) a wired-to-faillstat.straceshowed the code behaving precisely as expected. Unfortunately, I can't add an automatic test for this.Other changes made by this PR:
wrap_readlink_resultnow callsshrink_to_fiton the buffer before returning, potentially reclaiming kilobytes of memory per call. This could be very important if the returned buffer is long-lived.readlinkandreadlink_atnow both call aninner_readlinkfunction that contains the bulk of the logic, avoiding copy-pasting of code. (This is much more important now that the logic is more than a few lines long.)Notably, this PR does not add support for systems that don't define
PATH_MAXat all. As far as I know, I don't have access to any POSIX-ish OS that doesn't havePATH_MAX, and I suspect it would have other compatibility issues withnixanyway.