Refactor GaugeBucketCollector metrics to be homeserver-scoped#18715
Refactor GaugeBucketCollector metrics to be homeserver-scoped#18715MadLittleMods merged 17 commits intodevelopfrom
GaugeBucketCollector metrics to be homeserver-scoped#18715Conversation
devonh
left a comment
There was a problem hiding this comment.
What if people add metrics in the future that use the upstream GaugeHistogram type and we don't catch that it doesn't have the server_name label?
| def add_metric( | ||
| self, | ||
| labelvalues: StrSequence, | ||
| buckets: Sequence[Tuple[str, float]], | ||
| gsum_value: float, | ||
| timestamp: Optional[Union[float, Timestamp]] = None, | ||
| ) -> None: |
There was a problem hiding this comment.
Is this function exactly the same as from the upstream code?
It looks like it is.
If so, could we not just inherit from GaugeHistogramMetricFamily directly (instead of Metric) to not need to duplicate this code?
It's already not ideal to need to almost duplicate the __init__ function.
There was a problem hiding this comment.
I've updated to inherit from GaugeHistogramMetricFamily. It seems a bit brittle as we're just relying on setting self._labelnames = tuple(labelnames) and hoping the super implementation does something with it.
I think I'd rather just have our own full-custom version but will go forward with your preference.
There was a problem hiding this comment.
I see your point. Both cases have a level of fragility. Since if the upstream version changes something else about how it is adding metrics, then we have no way of knowing we should update our copied code snippet.
But either way, I think this PR is good to go. If we see this being too fragile the way it is, we can always revisit. Since it's just a metric label we aren't dealing with a potential Synapse functionality breakage so the stakes are low.
| @@ -0,0 +1 @@ | |||
| Refactor `GaugeBucketCollector` metrics to be homeserver-scoped. | |||
There was a problem hiding this comment.
What if people add metrics in the future that use the upstream GaugeHistogram type and we don't catch that it doesn't have the server_name label?
The linting for this is coming as part of #18733
It doesn't really matter whether lints are introduced along with the changes as when we introduce the lints, it will catch whatever is leftover. To be clear, I think this PR is exhaustive but I'm just pointing out that it isn't a hard requirement.
devonh
left a comment
There was a problem hiding this comment.
Small request to comment the risk of setting labelname.
Otherwise this looks good! Thanks for bearing with me :)
| if len(labelvalues) != len(labelnames): | ||
| raise ValueError("Incorrect label count") | ||
|
|
||
| self._labelnames = tuple(labelnames) |
There was a problem hiding this comment.
Since we are relying on the inheritance now to apply the labels, we should add a comment here to describe that risk / fragility.
There was a problem hiding this comment.
As an alternative, I've updated to use the super() constructor to set the internal label names field. This feels a bit better to use their stable API to get everything done.
| def add_metric( | ||
| self, | ||
| labelvalues: StrSequence, | ||
| buckets: Sequence[Tuple[str, float]], | ||
| gsum_value: float, | ||
| timestamp: Optional[Union[float, Timestamp]] = None, | ||
| ) -> None: |
There was a problem hiding this comment.
I see your point. Both cases have a level of fragility. Since if the upstream version changes something else about how it is adding metrics, then we have no way of knowing we should update our copied code snippet.
But either way, I think this PR is good to go. If we see this being too fragile the way it is, we can always revisit. Since it's just a metric label we aren't dealing with a potential Synapse functionality breakage so the stakes are low.
|
Thanks for the review @devonh 🦚 |
Refactor
GaugeBucketCollectormetrics to be homeserver-scopedPart of #18592
Testing strategy
metricslistener in yourhomeserver.yamlpoetry run synapse_homeserver --config-path homeserver.yamlhttp://localhost:9322/_synapse/metricsand/orhttp://localhost:9323/metricsmsecsin thelooping_callso that_read_forward_extremitiesruns immediately instead of after an hour.synapse_forward_extremitiesandsynapse_excess_extremity_eventsmetrics with theserver_namelabelPull Request Checklist
EventStoretoEventWorkerStore.".code blocks.