fix: add timeout to dns check#6977
Conversation
WalkthroughThis update simplifies the interval tick handling logic within the Tari Pulse service. It changes the tick behavior for DNS and health check intervals to skip missed ticks rather than delay them, removes logic related to tracking and skipping DNS ticks, and adds a timeout to the checkpoint fetching process. If fetching checkpoints times out, the service now logs a warning and assumes the node is not behind. Additionally, notification sending errors are now logged as warnings instead of causing a panic. No public interfaces or exported function signatures were altered. Changes
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
base_layer/core/src/base_node/tari_pulse_service/mod.rs (1)
195-206: Good implementation of timeout for DNS checkpoint fetchingAdding a 1-second timeout prevents the DNS check from hanging indefinitely when the DNS service is unresponsive. The error handling is comprehensive with appropriate fallbacks.
Consider making the timeout duration configurable through the
TariPulseConfiginstead of hardcoding it:pub struct TariPulseConfig { pub dns_check_interval: Duration, pub liveness_interval: Duration, pub network: Network, + pub dns_timeout: Duration, } impl Default for TariPulseConfig { fn default() -> Self { Self { dns_check_interval: Duration::from_secs(120), liveness_interval: Duration::from_secs(60 * 10), network: Network::default(), + dns_timeout: Duration::from_secs(1), } } }Then use it in the
passed_checkpointsmethod:- let dns_checkpoints = match timeout(Duration::from_secs(1), self.fetch_checkpoints()).await { + let dns_checkpoints = match timeout(self.config.dns_timeout, self.fetch_checkpoints()).await {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
base_layer/core/src/base_node/tari_pulse_service/mod.rs(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: ci
🔇 Additional comments (3)
base_layer/core/src/base_node/tari_pulse_service/mod.rs (3)
44-44: Good addition of the timeout importAdding the
timeoutimport alongsideMissedTickBehaviorkeeps the imports organized and focused on what's needed.
127-127: Improvement in missed tick handlingChanging from
DelaytoSkipfor both DNS and health check intervals is a good improvement. When the system is under load and misses interval ticks, it will now skip to the next scheduled tick rather than trying to catch up with delayed ticks, which prevents work backlogs.Also applies to: 131-131
175-178: Better error handling for notification sendingThe improved error handling with
inspect_erris more robust than the previous implementation. By logging a warning instead of potentially panicking when notification sending fails, the service can continue operating even when the receiver disconnects.
Adds a timeout to pulse dns check and replaces skipped interval with skipped instead of delay
Summary by CodeRabbit