feat: retry with exponential backoff for DynamoDb interaction #1975
rtyler merged 2 commits into delta-io:main
I decided against using the existing retry logic from
@rtyler: We're currently only retrying
In theory, this indicates that the error is also transient, but I'm not enough of an AWS expert to know whether it's sensible to handle this as well.
rtyler left a comment
I think pulling the backoff crate into deltalake-aws is reasonable. The nice thing about the subcrates is that this new dependency only affects AWS users 😄
I am assuming the max elapsed interval of 60 is just a number that you picked?
I think this is something that will likely need to be configurable since different systems would have a different tolerance here. Based on how retry is being invoked I don't see a clear path to pulling configuration down into this call 🤔
Yes, these numbers would ideally be configurable. It seems other aspects of delta-rs, including the locking provider choice itself, are being configured via environment variables, so I guess that's the way to go.
My question would mostly be whether we want all five possible parameters to be configurable:
- max elapsed time
- max interval
- multiplier
- initial interval (using defaults in this PR)
- randomization factor (using defaults in this PR)
The most obvious one is probably the one you singled out, max elapsed time, but there's probably a user out there for any of these :).
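To make the five parameters concrete, here is a minimal standard-library sketch of how they could interact. It is not the `backoff` crate's implementation: `BackoffParams`, `base_delay`, `jittered_delay`, and `schedule` are hypothetical names, and the random sample is passed in explicitly so the arithmetic stays deterministic.

```rust
use std::time::Duration;

/// Hypothetical parameter set mirroring the five knobs listed above.
#[derive(Debug, Clone, Copy)]
struct BackoffParams {
    initial_interval: Duration,
    multiplier: f64,
    max_interval: Duration,
    max_elapsed_time: Duration,
    randomization_factor: f64,
}

/// Un-jittered delay before retry `attempt` (0-based):
/// initial_interval * multiplier^attempt, capped at max_interval.
fn base_delay(p: &BackoffParams, attempt: u32) -> Duration {
    let secs = p.initial_interval.as_secs_f64() * p.multiplier.powi(attempt as i32);
    Duration::from_secs_f64(secs.min(p.max_interval.as_secs_f64()))
}

/// Scales the base delay by a factor in [1 - r, 1 + r], where r is the
/// randomization factor and `sample` is a caller-supplied value in [0, 1].
fn jittered_delay(p: &BackoffParams, attempt: u32, sample: f64) -> Duration {
    let factor = 1.0 + p.randomization_factor * (2.0 * sample - 1.0);
    base_delay(p, attempt).mul_f64(factor)
}

/// Un-jittered delays a caller would sleep through; stops as soon as the
/// next delay would push the total wait past max_elapsed_time.
fn schedule(p: &BackoffParams) -> Vec<Duration> {
    let mut delays = Vec::new();
    let mut waited = Duration::ZERO;
    // The upper bound is only a safety cap for degenerate parameters.
    for attempt in 0..1_000u32 {
        let delay = base_delay(p, attempt);
        if waited + delay > p.max_elapsed_time {
            break;
        }
        waited += delay;
        delays.push(delay);
    }
    delays
}
```

With an initial interval of 1 s, a multiplier of 2, a max interval of 8 s, and a max elapsed time of 30 s (illustrative values only), `schedule` yields waits of 1, 2, 4, 8, and 8 seconds. Max elapsed time is the knob that bounds the total wait, which is why it is the most obvious one to expose.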
@dispanser I agree there are likely folks who would want to tune every single one of these... maybe.
I think an environment variable for max elapsed time is the only one that's really important to implement before merging this. People may or may not want the others, but they should make themselves known in the future 😄
I added an environment variable for the max elapsed time setting.
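For illustration, reading such a setting could look roughly like the sketch below. The variable name `EXAMPLE_LOCK_MAX_ELAPSED_SECONDS` is a placeholder invented for this example, not the name used in the PR, and the 60-second fallback assumes that the "60" discussed above is in seconds.

```rust
use std::env;
use std::time::Duration;

/// Hypothetical variable name, chosen for this example only.
const MAX_ELAPSED_ENV: &str = "EXAMPLE_LOCK_MAX_ELAPSED_SECONDS";

/// Fallback when the variable is unset or invalid (assumed to be 60 seconds).
const DEFAULT_MAX_ELAPSED: Duration = Duration::from_secs(60);

/// Parses a raw value as whole seconds, falling back to the default
/// when the value is missing or not a non-negative integer.
fn parse_max_elapsed(raw: Option<&str>) -> Duration {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_MAX_ELAPSED)
}

/// Reads the setting from the process environment.
fn max_elapsed_from_env() -> Duration {
    parse_max_elapsed(env::var(MAX_ELAPSED_ENV).ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the fallback behavior easy to test without mutating process-wide state.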
I'd address the diverging docs in a separate PR, they are already way off anyways :)
I think the failing test (macos, |
@dispanser please rebase this on the latest main with the mega refactor 😄
We use an external crate, [backoff](https://crates.io/crates/backoff), to retry DynamoDb read and write operations when the error in the response is `ProvisionedThroughPutExceeded`, which indicates that DynamoDb is overloaded with respect to the configured read and write capacity.
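The retry policy described here, retrying only the throttling error and returning everything else immediately, can be sketched with the standard library alone. This is a simplified stand-in and not the code from this PR: `DynamoError`, `retry_transient`, the injected `sleep` callback, and the 100 ms / 5 s intervals are all assumptions made for the example, and the elapsed-time budget only counts time spent waiting.

```rust
use std::time::Duration;

/// Hypothetical error type: one transient variant, one permanent.
#[derive(Debug, PartialEq)]
enum DynamoError {
    ThroughputExceeded,
    Other(String),
}

/// Runs `op`, retrying with doubling delays only while it reports
/// `ThroughputExceeded` and the accumulated wait stays within `max_elapsed`.
/// `sleep` is injected so callers decide how to wait (and tests need not).
fn retry_transient<T>(
    max_elapsed: Duration,
    mut op: impl FnMut() -> Result<T, DynamoError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, DynamoError> {
    let mut delay = Duration::from_millis(100);
    let mut waited = Duration::ZERO;
    loop {
        match op() {
            // Transient: wait and try again if the budget allows it.
            Err(DynamoError::ThroughputExceeded) if waited + delay <= max_elapsed => {
                sleep(delay);
                waited += delay;
                delay = (delay * 2).min(Duration::from_secs(5));
            }
            // Success, a permanent error, or an exhausted budget.
            other => return other,
        }
    }
}
```

In real use, `sleep` would be `std::thread::sleep` or an async equivalent, and `op` would wrap the DynamoDb call.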
@rtyler rebase is done, all tests pass locally.