feat: Conditionally emit metrics based on enablement#19903
Conversation
WalkthroughWalkthroughThe changes focus on optimizing the telemetry system in the Changes
Assessment against linked issues
Recent Review DetailsConfiguration used: .coderabbit.yml Files selected for processing (1)
Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Actionable comments posted: 4
Review Status
Configuration used: .coderabbit.yml
Files selected for processing (15)
- CHANGELOG.md (1 hunks)
- baseapp/abci.go (1 hunks)
- telemetry/metrics.go (3 hunks)
- telemetry/wrapper.go (3 hunks)
- telemetry/wrapper_test.go (1 hunks)
- x/circuit/module.go (2 hunks)
- x/crisis/abci.go (2 hunks)
- x/crisis/module.go (2 hunks)
- x/distribution/keeper/abci.go (2 hunks)
- x/evidence/keeper/abci.go (2 hunks)
- x/gov/keeper/abci.go (1 hunks)
- x/mint/keeper/abci.go (2 hunks)
- x/slashing/abci.go (2 hunks)
- x/staking/keeper/abci.go (2 hunks)
- x/upgrade/keeper/abci.go (2 hunks)
Additional Context Used
Path-based Instructions (15)
x/crisis/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/staking/keeper/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/slashing/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/evidence/keeper/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/distribution/keeper/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.telemetry/wrapper_test.go (2)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
**/*_test.go:
Assess the unit test code assessing sufficient code coverage for the changes associated in the pull requestx/mint/keeper/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.telemetry/wrapper.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/circuit/module.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/upgrade/keeper/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/crisis/module.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.telemetry/metrics.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.x/gov/keeper/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.baseapp/abci.go (1)
**/*.go:
Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.CHANGELOG.md (1)
**/*.md:
Assess the documentation for misspellings, grammatical errors, missing documentation and correctness
Additional comments not posted (16)
x/crisis/abci.go (1)
14-14: The replacement oftime.Now()withtelemetry.Now()in theEndBlockerfunction aligns with the PR's objective to conditionally measure time based on telemetry enablement. This change ensures that time is only measured when telemetry is active, optimizing performance.x/staking/keeper/abci.go (2)
15-15: The use oftelemetry.Now()at the beginning of theBeginBlockerfunction is a good practice for conditionally measuring execution time based on telemetry enablement, contributing to the PR's efficiency improvements.
22-22: Similarly, in theEndBlockerfunction, replacingtime.Now()withtelemetry.Now()ensures that time measurements are only taken when telemetry is enabled, aligning with the PR's objectives for performance optimization.x/slashing/abci.go (1)
16-16: Replacingtime.Now()withtelemetry.Now()in theBeginBlockerfunction is a strategic change that ensures time measurements are only performed when telemetry is enabled, which is in line with the PR's goal to enhance telemetry efficiency.x/evidence/keeper/abci.go (1)
17-17: Utilizingtelemetry.Now()in theBeginBlockerfunction to measure execution time only when telemetry is enabled is a thoughtful approach that aligns with the PR's objectives of optimizing telemetry operations.x/distribution/keeper/abci.go (1)
14-14: The introduction oftelemetry.Now()in theBeginBlockerfunction to conditionally measure time based on telemetry enablement is a positive change that supports the PR's goal of reducing unnecessary performance costs.telemetry/wrapper_test.go (2)
25-36: The unit test for theNowfunction correctly tests both scenarios: when telemetry is enabled and when it is disabled. This ensures that theNowfunction behaves as expected under both conditions.
39-48: The unit test forisTelemetryEnabledfunction accurately checks the function's return value based on the global telemetry enablement state, ensuring the function's correctness.x/mint/keeper/abci.go (1)
15-15: In theBeginBlockerfunction, replacingtime.Now()withtelemetry.Now()is a strategic change that ensures time measurements are only performed when telemetry is enabled, aligning with the PR's objectives for optimizing telemetry operations.telemetry/wrapper.go (2)
27-29: Adding conditional checks for telemetry enablement inModuleMeasureSinceand other telemetry functions is a crucial improvement that ensures telemetry operations are only performed when necessary, aligning with the PR's efficiency goals.
103-109: The implementation of theNowfunction, which returns the current time if telemetry is enabled or a zero time otherwise, is a key addition that supports conditional time measurements across the SDK, in line with the PR's objectives.telemetry/metrics.go (2)
18-31: The introduction of theglobalTelemetryEnabledvariable and theinitTelemetryandisTelemetryEnabledfunctions are well-implemented for managing telemetry enablement state in a thread-safe manner.
114-114: The call toinitTelemetrywithin theNewfunction correctly initializes the telemetry enablement state, ensuring subsequent telemetry operations behave as expected.baseapp/abci.go (1)
180-181: The adjustment to measure the query processing time by storing the start time before the defer statement is a good practice for accurate timing measurements. This change ensures that the time measurement includes the entire scope of theQuerymethod execution.CHANGELOG.md (2)
62-65: The summary of changes related to the telemetry module is clear and accurately reflects the PR objectives. It provides a concise overview of the enhancements made, including the introduction of theNowfunction, the implementation of an atomic global variable, and the conditional emission of telemetry metrics. Well done.
65-65: The entries regarding unrelated changes to thetypesandservermodules are also well-documented. They provide clear information about the changes made, contributing to the overall completeness of the changelog. Good job on maintaining clarity and relevance in documenting these changes.
| // InitGenesis performs genesis initialization for the circuit module. | ||
| func (am AppModule) InitGenesis(ctx context.Context, data json.RawMessage) error { | ||
| start := time.Now() | ||
| start := telemetry.Now() |
There was a problem hiding this comment.
The change from using time.Now() to telemetry.Now() in the InitGenesis function aligns with the PR's objective to optimize telemetry operations. However, it would be beneficial to add a comment explaining the reason for this change, enhancing code maintainability.
+ // Using telemetry.Now() to optimize performance when telemetry is disabled
start := telemetry.Now()Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
| start := telemetry.Now() | |
| // Using telemetry.Now() to optimize performance when telemetry is disabled | |
| start := telemetry.Now() |
| // skipUpgradeHeightArray is a set of block heights for which the upgrade must be skipped | ||
| func (k Keeper) PreBlocker(ctx context.Context) error { | ||
| defer telemetry.ModuleMeasureSince(types.ModuleName, time.Now(), telemetry.MetricKeyBeginBlocker) | ||
| start := telemetry.Now() |
There was a problem hiding this comment.
The update to use telemetry.Now() instead of time.Now() in the PreBlocker function is a good optimization when telemetry is disabled. Consider adding a comment to explain this choice for future code maintainers.
+ // Using telemetry.Now() to optimize performance when telemetry is disabled
start := telemetry.Now()Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
| start := telemetry.Now() | |
| // Using telemetry.Now() to optimize performance when telemetry is disabled | |
| start := telemetry.Now() |
| // InitGenesis performs genesis initialization for the crisis module. | ||
| func (am AppModule) InitGenesis(ctx context.Context, data json.RawMessage) error { | ||
| start := time.Now() | ||
| start := telemetry.Now() |
There was a problem hiding this comment.
Replacing time.Now() with telemetry.Now() in the InitGenesis function aligns with the PR's goal of optimizing telemetry operations. Adding a comment to explain this choice would enhance code maintainability.
+ // Using telemetry.Now() to optimize performance when telemetry is disabled
start := telemetry.Now()Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
| start := telemetry.Now() | |
| // Using telemetry.Now() to optimize performance when telemetry is disabled | |
| start := telemetry.Now() |
| // EndBlocker is called every block. | ||
| func (k Keeper) EndBlocker(ctx context.Context) error { | ||
| defer telemetry.ModuleMeasureSince(types.ModuleName, time.Now(), telemetry.MetricKeyEndBlocker) | ||
| start := telemetry.Now() |
There was a problem hiding this comment.
The adjustment to use telemetry.Now() at the start of the EndBlocker function is a good optimization for telemetry performance. Adding a comment to explain this choice would be helpful for future code maintainers.
+ // Using telemetry.Now() to optimize performance when telemetry is disabled
start := telemetry.Now()Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
| start := telemetry.Now() | |
| // Using telemetry.Now() to optimize performance when telemetry is disabled | |
| start := telemetry.Now() |
| // check all registered invariants | ||
| func EndBlocker(ctx context.Context, k keeper.Keeper) { | ||
| defer telemetry.ModuleMeasureSince(types.ModuleName, time.Now(), telemetry.MetricKeyEndBlocker) | ||
| start := telemetry.Now() |
There was a problem hiding this comment.
this would take the time here and pass it into the defer later on, is that expected?
There was a problem hiding this comment.
thanks for the comment! if time.Now() is called within the defer statement, it takes the timestamp at the point where the defer is declared, not when the function exits. This leads to an inaccurate duration measurement because it doesn't reflect the actual time span of the operations we're interested in measuring, I realized that from @alexanderbez comment (here). By assigning time.Now() to start outside of defer, we ensure that we're measuring the duration of the function's operations accurately
There was a problem hiding this comment.
The statement is not correct. The parameters are evaluated when the function is deferred already and not on execution. See demo or stackoverflow
Inlined or not is personal preference.
There was a problem hiding this comment.
💯 the parameters are evaluated and closed over when the function is deferred, not executed, I prefer the prior syntax (not using a local var) since it's fewer LoC.
There was a problem hiding this comment.
Thanks you both! So I'll use the prior syntax!
alpe
left a comment
There was a problem hiding this comment.
Good work on the telemetry problem. Especially avoiding downstream calls to `metrics.* when deactivated.
I was wondering if the atomic Bool can be avoided. Other than this, no blockers but minor notes or nits.
|
|
||
| // globalTelemetryEnabled is a private variable that stores the telemetry enabled state. | ||
| // It is set on initialization and does not change for the lifetime of the program. | ||
| var globalTelemetryEnabled atomic.Bool |
There was a problem hiding this comment.
Good use of global var but why do you use an atomic bool when it is not modified after init? Could be a simple bool.
|
|
||
| // initTelemetry sets the global variable based on the configuration. | ||
| // It is called only once, at startup, to set the telemetry enabled state. | ||
| func initTelemetry(enabled bool) { |
There was a problem hiding this comment.
Nit: this is called only once (beside tests). No need for a function IMHO
|
|
||
| // New creates a new instance of Metrics | ||
| func New(cfg Config) (_ *Metrics, rerr error) { | ||
| initTelemetry(cfg.Enabled) |
| } | ||
|
|
||
| // isTelemetryEnabled provides controlled access to check if telemetry is enabled. | ||
| func isTelemetryEnabled() bool { |
There was a problem hiding this comment.
personal preference: with a standard bool, this method can be inlined. If you want to provide the information to other packages or modules, it can make sense to make this public though.
| // It includes a mutex to ensure that tests that depend on the global state | ||
| // do not run in parallel, which can cause race conditions and unpredictable results. | ||
| type TelemetrySuite struct { | ||
| suite.Suite |
There was a problem hiding this comment.
personal preference: the suite type adds a lot of complexity and boiler plate code to the test cases compared to vanilla go + testify asserts.
There was a problem hiding this comment.
I share this preference.
| // check all registered invariants | ||
| func EndBlocker(ctx context.Context, k keeper.Keeper) { | ||
| defer telemetry.ModuleMeasureSince(types.ModuleName, time.Now(), telemetry.MetricKeyEndBlocker) | ||
| start := telemetry.Now() |
There was a problem hiding this comment.
The statement is not correct. The parameters are evaluated when the function is deferred already and not on execution. See demo or stackoverflow
Inlined or not is personal preference.
|
|
||
| // New creates a new instance of Metrics | ||
| func New(cfg Config) (_ *Metrics, rerr error) { | ||
| initTelemetry(cfg.Enabled) |
There was a problem hiding this comment.
Please note: the constructor is called by the server/start.go only when cfg.Telemetry is enabled. Any solution must work without this constructor being called or the file must be edited.
There was a problem hiding this comment.
Actionable comments posted: 1
Actionable comments outside the diff hunks (1)
server/start.go (1)
Line range hint
1000-1000: Removing the conditional check for telemetry enablement instartTelemetryfunction may lead to telemetry being initialized even when it is disabled in the configuration. This could result in unnecessary resource usage and potential privacy concerns if telemetry data is collected without explicit enablement.- if cfg.Telemetry.Enabled { + if true {
| // globalTelemetryEnabled is a private variable that stores the telemetry enabled state. | ||
| // It is set on initialization and does not change for the lifetime of the program. | ||
| var globalTelemetryEnabled bool | ||
|
|
||
| // IsTelemetryEnabled provides controlled access to check if telemetry is enabled. | ||
| func IsTelemetryEnabled() bool { | ||
| return globalTelemetryEnabled | ||
| } |
There was a problem hiding this comment.
The introduction of globalTelemetryEnabled and the IsTelemetryEnabled() function are well-implemented. However, consider changing globalTelemetryEnabled from an atomic type to a regular boolean since it is not modified after initialization.
- var globalTelemetryEnabled bool
+ var globalTelemetryEnabled atomic.BoolCommittable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
| // globalTelemetryEnabled is a private variable that stores the telemetry enabled state. | |
| // It is set on initialization and does not change for the lifetime of the program. | |
| var globalTelemetryEnabled bool | |
| // IsTelemetryEnabled provides controlled access to check if telemetry is enabled. | |
| func IsTelemetryEnabled() bool { | |
| return globalTelemetryEnabled | |
| } | |
| // globalTelemetryEnabled is a private variable that stores the telemetry enabled state. | |
| // It is set on initialization and does not change for the lifetime of the program. | |
| var globalTelemetryEnabled atomic.Bool | |
| // IsTelemetryEnabled provides controlled access to check if telemetry is enabled. | |
| func IsTelemetryEnabled() bool { | |
| return globalTelemetryEnabled | |
| } |
alexanderbez
left a comment
There was a problem hiding this comment.
Whoops I meant to approve this a while ago 😆
Description
Closes: #10245
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!in the type prefix if API or client breaking changeCHANGELOG.mdReviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
Summary by CodeRabbit
New Features
Nowfunction in the telemetry module to conditionally emit metrics, enhancing performance by reducing unnecessary operations.BaseApp,circuit,crisis,distribution,evidence,gov,mint,slashing,staking,upgrade) for more accurate tracking of query processing and module execution times.Refactor
timepackage with the custom telemetry package for time-related operations in several modules, ensuring consistent and accurate performance metrics.Tests
TelemetrySuiteto validate the functionality of telemetry-related functions, ensuring correct behavior when telemetry is enabled and disabled.