Enable fail-fast behavior for health checks #933
Merged
stefanprodan merged 1 commit into main on Aug 2, 2023
Fail the health check as soon as a resource becomes stalled without waiting for the timeout to expire. This behavior can be disabled using the `DisableFailFastBehavior` feature flag.

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
hiddeco
reviewed
Jul 31, 2023
Member
hiddeco
left a comment
While this pull request by itself looks good to me, I am wondering if there is any thought and/or reasoning around this being a global flag versus a field in the API?
Member
Author
@hiddeco I think failing fast is how health checking should behave. The feature flag is temporary; if no issues arise, we'll remove it in a future minor version.
hiddeco
approved these changes
Jul 31, 2023
makkes
approved these changes
Aug 1, 2023
Member
makkes
left a comment
🎉 Would we be able to add a test for this feature?
Member
Author
@makkes fail-fast tests were added in the ssa package.
Fail the health check as soon as a resource becomes stalled without waiting for the timeout to expire.
For example, if `wait: true` and `timeout: 20m` are set and a Deployment reaches its progress deadline after 5m, the controller will not wait another 15m; it fails the reconciliation as soon as the Deployment rollout has stalled.

Note that the fail-fast behavior does not currently work with HelmReleases, as these don't have a stalled condition. We expect to ship stalled conditions in the HelmRelease API v2beta2.
Fix: fluxcd/flux2#3980
This behavior can be disabled using the `DisableFailFastBehavior` feature flag.