Replace custom rate limit header parsing with rate-limits crate#2135
- Remove internal header parsing logic in favor of the rate-limits crate
- Update dependencies and remove unused httpdate crate
- Add integration tests for rate limit handling
```toml
rust-version = "1.88.0"

[dependencies]
rate-limits = { git = "https://github.com/mre/rate-limits", branch = "parser" }
```
Will replace this with the version from crates.io once mre/rate-limits#11 is released.
Sounds good. Should I review the PR over there?
Well, you can, but beware that it's a big diff. So it's optional, but highly appreciated. 😆
```rust
    ],
    &["/1", "/2"],
    // Give 100ms leeway for scheduler flakiness in CI environments
    Duration::from_millis(1900),
```
I don't understand these numbers here. reset_time indicates 3 seconds, yet here we test with 2 seconds. If this is not a mistake, it should be documented better. The same is true for the other tests.
The 100ms could be moved into a constant in the test function, hidden away for better abstraction.
Note that there is a test called test_retry_rate_limit_headers in cli.rs. Did you see it? Maybe it can be incorporated?
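The constant-plus-computed-minimum idea above could look roughly like this. A minimal sketch only: SCHEDULER_LEEWAY and expected_min_duration are hypothetical names, not identifiers from the lychee test suite.

```rust
use std::time::Duration;

// Hypothetical constant: the 100ms leeway from the test, hoisted out
// and named so the magic number carries its own explanation.
const SCHEDULER_LEEWAY: Duration = Duration::from_millis(100);

/// Derive the minimum duration a rate-limited run should take:
/// the advertised reset time minus a leeway for CI scheduler flakiness.
fn expected_min_duration(reset: Duration) -> Duration {
    reset.saturating_sub(SCHEDULER_LEEWAY)
}

fn main() {
    // A 2s reset yields the 1900ms currently hardcoded in the test.
    assert_eq!(
        expected_min_duration(Duration::from_secs(2)),
        Duration::from_millis(1900)
    );
    println!("ok");
}
```

This way the 1900ms is derived from the reset time rather than hardcoded, which also addresses the later comment about computing the expected number.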
Remember that this includes the time to spawn lychee as a completely new process. If we set the reset time to 2s, the actual sleep duration inside lychee would be ~900ms and the test could randomly fail. 🤔 I wonder if I can isolate that part somehow to make the tests clearer...
Hmm, I still don't really understand.
reset_time is in seconds and is computed with + 3, yet now we assert against 1900ms. I don't understand the relation between these numbers; I think it should be explained somewhere. It would also be nice if the expected number (1900ms) were calculated instead of hardcoded, if that is possible.
```rust
    let elapsed = start.elapsed();

    assert!(
        elapsed >= expected_min_duration,
```
I think we should also assert the other direction (<), similarly to test_retry_rate_limit_headers, and use some sort of tolerance. Currently we wouldn't notice if the duration is much bigger than expected unless the test gets really slow at some point.
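A two-sided bound could be sketched like this. TOLERANCE and the helper name are illustrative, not taken from the lychee test suite:

```rust
use std::time::Duration;

// Hypothetical tolerance around the predicted duration.
const TOLERANCE: Duration = Duration::from_millis(500);

/// Assert that the measured duration is within TOLERANCE of the
/// expected one, in both directions.
fn assert_duration_close(elapsed: Duration, expected: Duration) {
    assert!(
        elapsed >= expected.saturating_sub(TOLERANCE),
        "finished too early: {elapsed:?} < {expected:?} - {TOLERANCE:?}"
    );
    assert!(
        elapsed <= expected + TOLERANCE,
        "took too long: {elapsed:?} > {expected:?} + {TOLERANCE:?}"
    );
}

fn main() {
    // 2.1s measured against a 2s prediction passes with 500ms tolerance.
    assert_duration_close(Duration::from_millis(2100), Duration::from_secs(2));
    println!("ok");
}
```

The upper bound would catch the "much bigger than expected" case without making the test flaky.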
```rust
if rate_limit.is_limited() {
    let duration = rate_limit.reset().duration();
    if !duration.is_zero() {
        self.increase_backoff(duration);
```
Code here looks reasonable. I also took a squiz at mre/rate-limits#11.
```rust
    200,
    &[("Retry-After", "2")],
    &["/1", "/2"],
    Duration::from_millis(1900),
```
1900 is a bit magical. Could run_rate_limit_test take the predicted ideal duration and apply the threshold in its own code? Edit: oh, @thomas-zahner already raised this in his comment.
This migrates our rate-limit handling to the rate-limits crate.
This PR is currently in draft, but I'd love to get some feedback on the code and on the rewrite of the rate-limits crate in mre/rate-limits#11. I rewrote it from scratch to be more efficient and to use a state machine under the hood for more reliable vendor detection.
I tested it locally as best as I could and so far I haven't seen any regressions.
Once all looks good, I will release a new version of the rate-limits crate based on this new approach and will remove the "draft" setting on this PR. We could also move the crate into the lycheeverse org, which I believe will make shared maintenance / bugfix releases easier in the future. wdyt?
Edit: while refactoring the code, I also decided to remove the 80% threshold for throttling requests, which we had before. I don't think it's helpful. Instead, I now believe we should go up to the maximum number of allowed requests and get throttled. AFAICT, there is no penalty for that, and we could squeeze out some more performance this way. Once we get throttled, we just wait until we get unblocked. From what I understand, this is how to work with those headers correctly.
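To make the policy change above concrete, here is a minimal model of "run at full speed, back off only once the server reports we are limited". All names are illustrative, not lychee's actual API:

```rust
use std::time::Duration;

/// Hypothetical sketch of the no-pre-emptive-throttling policy:
/// requests go out with zero delay until the server says we are
/// limited, then we wait for the advertised reset duration.
struct Limiter {
    backoff: Duration,
}

impl Limiter {
    fn new() -> Self {
        Self { backoff: Duration::ZERO }
    }

    /// Called after each response with the parsed reset duration
    /// (e.g. from a Retry-After or X-RateLimit-Reset header).
    fn on_response(&mut self, limited: bool, reset: Duration) {
        if limited && !reset.is_zero() {
            // No 80% threshold: we only wait when actually throttled.
            self.backoff = reset;
        } else {
            self.backoff = Duration::ZERO;
        }
    }

    fn delay_before_next_request(&self) -> Duration {
        self.backoff
    }
}

fn main() {
    let mut limiter = Limiter::new();
    limiter.on_response(false, Duration::ZERO);
    assert_eq!(limiter.delay_before_next_request(), Duration::ZERO);
    limiter.on_response(true, Duration::from_secs(2));
    assert_eq!(limiter.delay_before_next_request(), Duration::from_secs(2));
    println!("ok");
}
```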
Closes #2027