
tests: better resilience for concurrent node starts #3011

Merged
prasannavl merged 1 commit into master from pvl/fix-flaky-tests-cc
Aug 27, 2024

Conversation

@prasannavl

Contributor

Summary

  • Fixes most node start failures caused by high concurrency during normal operation.
  • Rationale: When concurrency is high, nodes can fail to start because the check for RPC startup is strict and flaky. This change is not bulletproof, and other failures can still occur if the system runs out of resources, but it makes the on-start check significantly more resilient.
    • For most normal operation (e.g. MAKE_JOBS=<no-of-logical-CPUs>), this provides a good default and avoids most false failures caused by slightly delayed node starts.
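
To illustrate the general idea of a more tolerant on-start check, here is a minimal sketch in Python. It is not the code from this PR: `wait_until_ready`, `probe`, and all parameter names and default values are hypothetical, chosen only for illustration. The sketch polls a readiness probe until a deadline, treats connection errors as "not ready yet", and backs off between attempts so that a slow start under load is not reported as a failure.

```python
import time


def wait_until_ready(probe, timeout=60.0, initial_delay=0.05, max_delay=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `probe` until it returns True or `timeout` seconds elapse.

    `probe` is a hypothetical zero-argument callable standing in for
    "is the node's RPC interface answering yet?". An OSError (which
    includes refused connections) is treated as "not ready yet".
    Returns the number of attempts made; raises TimeoutError on expiry.
    """
    deadline = clock() + timeout
    delay = initial_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            if probe():
                return attempts
        except OSError:
            pass  # node not listening yet; retry until the deadline
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(
                f"not ready after {timeout}s ({attempts} attempts)")
        # Exponential backoff, capped, and never sleeping past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

A caller would pass a probe that attempts one RPC call and returns True on success. The design choice is to bound the total wait time rather than the number of attempts, so a node that starts late on a busy machine still passes while a node that never starts fails within a predictable window.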

Implications

  • Storage

    • Database reindex required
    • Database reindex optional
    • Database reindex not required
    • None
  • Consensus

    • Network upgrade required
    • Includes backward compatible changes
    • Includes consensus workarounds
    • Includes consensus refactors
    • None

@prasannavl prasannavl merged commit 94d1aab into master Aug 27, 2024
@prasannavl prasannavl deleted the pvl/fix-flaky-tests-cc branch August 27, 2024 04:25
