Use Rust S3 SDK to upload self-profiles to S3 instead of an external `aws` script #2410

Merged

Kobzol merged 1 commit into rust-lang:master on Feb 13, 2026
Conversation
Force-pushed from 28cb85a to 923084f
Member

Will people running the collector or the website, or building rustc-perf in the rust-lang/rust repo, have to build this huge new set of dependencies?
Member (Author)

No :) There is no need for this for local benchmarking at all, nor for the website. Otherwise I wouldn't do it, the dependency set is really massive :)
Member

Thank god.
Mark-Simulacrum approved these changes on Feb 13, 2026
Member

Not seeing anything particularly iffy about it, though timeouts obviously might need tweaking.
Member (Author)

Ok, let's try :) Thanks!
Right now, the time to run all benchmarks on a single collector is ~40 minutes. From the logs, 5-8 minutes of that (depending on the collector) are spent just on uploading the self-profiles to S3!
That is quite ludicrous and wasteful.
I think that we could remove this latency completely by overlapping the upload of the previous benchmark with the preparation of the next one. However, that is not trivial, and I think that we can also optimize the upload itself.
This PR moves the S3 upload from using an external Python `aws` script to using the Rust S3 SDK directly. We now also don't have to write the compressed self-profiles to disk. However, reading and writing the self-profiles was actually super fast, so I don't think that will help much.

What should help the most is caching the client. I did some local benchmarks and found that initiating the S3 connection costs a roughly constant 1-2 s of latency. So even uploading e.g. 50 KiB of data took 1.5 s, which is very slow and adds up quickly.
By caching the client in memory, that same upload now took just 0.2s, after the initial connection was made!
Hopefully keeping the client in memory won't cause connection issues, but I expect that the S3 SDK has that solved, with retries/reconnections when needed, etc.
This also removes another (maybe the last? except for `perf`, of course) external thing that has to exist to run the collectors. Since this adds ~100 dependencies to the build graph, it is hidden behind a Cargo feature that is only enabled on the actual collectors. Note that to read the S3 data we just use normal HTTP requests, so the website does not use this feature.

CC @Mark-Simulacrum to check if I'm holding the SDK right :) I've never used it before.
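The feature gate could look roughly like the following `Cargo.toml` fragment. This is a hedged sketch, not the PR's actual diff: the feature name, crate versions, and the exact set of optional AWS crates are assumptions.

```toml
# Sketch only: gate the SDK behind an opt-in feature so that local
# benchmarking and the website build never pull in the ~100 new deps.
[features]
# Feature name is illustrative; the real name is in the PR diff.
s3-upload = ["dep:aws-config", "dep:aws-sdk-s3"]

[dependencies]
# Versions are illustrative.
aws-config = { version = "1", optional = true }
aws-sdk-s3 = { version = "1", optional = true }
```

Only the collector binaries would then be built with `--features s3-upload`, while the website and local benchmarking builds keep the default (lean) dependency set.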
(The actual code change is quite small, most of the diff are lockfile changes)