[nextest-runner] correctly dup the fd with combined output #2316
Merged
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2316 +/- ##
==========================================
- Coverage 78.49% 78.47% -0.02%
==========================================
Files 105 105
Lines 24480 24484 +4
==========================================
- Hits 19216 19215 -1
- Misses 5264 5269 +5
Previously, for combined output, we were using the same file descriptor and storing it three times:

1. In `.stdout`
2. In `.stderr`
3. In `state.theirs`

`Stdio` consumes ownership of the fd, so this effectively meant that three `OwnedFd` instances were present, and fds would be dropped three times.

Now in a single-threaded program this is relatively harmless -- Rust just gets `EBADF` for the latter two calls. But in multithreaded programs this was completely broken, with a fun failure mode on Linux. Since Tokio uses pidfds for process event handling on Linux, this caused missed wakeups in this scenario:

1. Let's say we allocated fd 20 as the `theirs` end provided to the child process `Stdio`.
2. In thread 1, we dropped the child, closing fd 20 the first time.
3. In thread 2, Tokio created a new pidfd and the kernel allocated fd 20 to this pidfd.
4. In thread 1, we continued cleaning up the child, closing fd 20 two more times.
5. The second fd 20 close caused epoll to no longer dispatch events to it.
6. Nextest then assumed the process was hung.

To fix this:

1. Don't store `theirs` in `State` -- we don't need to.
2. Call `try_clone` to ensure that stdout and stderr get two separate fd numbers that refer to the same pipe.
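The `try_clone` part of the fix can be illustrated with a minimal standalone sketch. This is not the nextest-runner code: `split_combined` and `demo` are hypothetical names, and a `UnixStream` pair stands in for the pipe so that the example only needs the standard library. It shows that `OwnedFd::try_clone` yields a second fd number for the same underlying stream, so each owner can close its own descriptor exactly once.

```rust
use std::fs::File;
use std::io::{self, Read, Write};
use std::os::fd::{AsRawFd, OwnedFd};
use std::os::unix::net::UnixStream;

/// Hypothetical helper: give stdout and stderr their own descriptors.
/// Both returned fds refer to the same underlying stream, but each
/// `OwnedFd` owns a distinct fd number and closes only that number.
fn split_combined(fd: OwnedFd) -> io::Result<(OwnedFd, OwnedFd)> {
    let dup = fd.try_clone()?;
    Ok((fd, dup))
}

/// Returns whether the two fd numbers differ, plus everything the
/// reading end received.
fn demo() -> io::Result<(bool, String)> {
    // Assumption: a socket pair stands in for the combined-output pipe.
    let (mut reader, writer) = UnixStream::pair()?;
    let (out_fd, err_fd) = split_combined(OwnedFd::from(writer))?;
    let distinct = out_fd.as_raw_fd() != err_fd.as_raw_fd();

    // In a real spawn these would be converted into `Stdio` values;
    // here we write through them directly to show they share one stream.
    let mut out = File::from(out_fd);
    let mut err = File::from(err_fd);
    out.write_all(b"out;")?;
    err.write_all(b"err;")?;

    // Each descriptor is closed once, by its own owner.
    drop(out);
    drop(err);

    // With every writing descriptor closed, the reader reaches EOF.
    let mut combined = String::new();
    reader.read_to_string(&mut combined)?;
    Ok((distinct, combined))
}

fn main() -> io::Result<()> {
    let (distinct, combined) = demo()?;
    println!("distinct fd numbers: {distinct}, combined output: {combined:?}");
    Ok(())
}
```

Because each `OwnedFd` owns a different number, dropping both never closes the same fd twice, which is the property the fix relies on.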
sunshowers added two commits that referenced this pull request on Apr 29, 2025.
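The bug
Previously, for combined output, we were using the same file descriptor and storing it three times:

1. In `.stdout`
2. In `.stderr`
3. In `state.theirs`

`Stdio` consumes ownership of the fd, so this effectively meant that three `OwnedFd` instances were present, and fds would be dropped three times.

Now in a single-threaded program this is relatively harmless -- Rust just gets `EBADF` for the latter two calls. But in multithreaded programs this was completely broken, with a bit of a horrific race condition on Linux. Since Tokio uses pidfds for process event handling on Linux, this caused missed wakeups in this scenario:

1. Let's say we allocated fd 20 as the `theirs` end provided to the child process `Stdio`.
2. In thread 1, we dropped the child, closing fd 20 the first time.
3. In thread 2, Tokio created a new pidfd and the kernel allocated fd 20 to this pidfd.
4. In thread 1, we continued cleaning up the child, closing fd 20 two more times.
5. The second fd 20 close caused epoll to no longer dispatch events to it.
6. Nextest then assumed the process was hung.

The fix
To fix the bug:

1. Don't store `theirs` in `State` -- we don't need it, and in fact had an `expect(dead_code)` on it.
2. Call `try_clone` to ensure that stdout and stderr get two separate fd numbers that refer to the same pipe.

Notes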
Fixes #2295.
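
The fd-number reuse at the heart of the race described under "The bug" can be observed with a small sketch. It deliberately does not perform the double close; it only shows that once an fd is closed, the next open in the same process can receive the same number, so any stale second close would hit an unrelated resource. `observe_fd_reuse` is a hypothetical name, and opening `/dev/null` is an assumed stand-in for Tokio creating a pidfd on another thread.

```rust
use std::fs::File;
use std::io;
use std::os::fd::{AsRawFd, RawFd};

/// Opens a file, records its fd number, closes it, then opens another
/// file. Returns (closed fd number, fd number of the new file).
fn observe_fd_reuse() -> io::Result<(RawFd, RawFd)> {
    let first = File::open("/dev/null")?;
    let stale = first.as_raw_fd();
    drop(first); // the one legitimate close of this fd

    // Assumption: stands in for another thread allocating a new fd
    // (such as a pidfd) right after the first close.
    let second = File::open("/dev/null")?;
    let fresh = second.as_raw_fd();

    // A second close of `stale` at this point would close `second`
    // if the numbers match, which is the failure mode described above.
    Ok((stale, fresh))
}

fn main() -> io::Result<()> {
    let (stale, fresh) = observe_fd_reuse()?;
    println!("closed fd {stale}; next open returned fd {fresh}");
    Ok(())
}
```

POSIX requires `open` to return the lowest-numbered available descriptor, so with no other fd activity in the process the two numbers match; in a multithreaded program the new owner may be any thread.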