Add timestamp to metrics, implement Azure Monitor Metrics Exporter and Console Metrics Exporter #186
lzchen wants to merge 1 commit into open-telemetry:master from
Conversation
# Metric tuples is a sequence of metric to label values pairs
envelopes = tuple(map(self.metric_tuple_to_envelope, metric_tuples))

try:
There was a problem hiding this comment.
This piece of code is >95% duplicated with […].
I moved the transport logic to a separate class TransportMixin and validation of the ikey into utils.py. More sophisticated validation of the ikey can be added there once we implement that.
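For reference, a minimal sketch of that refactor (class, module, and attribute names here are illustrative, not necessarily what the PR ends up with):

    # Hypothetical sketch only: shared transport logic plus ikey validation
    # factored out of the individual exporters.
    import json
    import logging

    import requests

    logger = logging.getLogger(__name__)


    def validate_instrumentation_key(ikey):
        # Minimal check for now; more sophisticated validation can be added later.
        if not ikey:
            raise ValueError("Instrumentation key cannot be none or empty.")


    class TransportMixin:
        # Assumes the concrete exporter provides `self.options` with
        # `endpoint` and `timeout` attributes and JSON-serializable envelopes.
        def _transmit(self, envelopes):
            try:
                response = requests.post(
                    url=self.options.endpoint,
                    data=json.dumps(envelopes),
                    headers={"Content-Type": "application/json; charset=utf-8"},
                    timeout=self.options.timeout,
                )
            except requests.RequestException:
                logger.exception("Transient client side error.")
                return False
            return response.status_code == 200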
toumorokoshi left a comment
I'll approve if this is something we want delivered by alpha, but I think there's some valuable feedback to act on.
pass

if response.status_code == 200:
    logger.info("Transmission succeeded: %s.", text)
I think a lot of people use basicConfig, which would print logs for things like this.
Thoughts about either turning off explicit propagation of this logger, or moving this to debug? Might get noisy by default.
This is a good point. I like basicConfig, but once we start implementing logging using log handlers this might disrupt that process (it does nothing if there are already handlers configured).
What do you mean by "explicit propagation"?
It means doing logging.getLogger('opentelemetry.ext.azure_monitor.transport').propagate = False so the opentelemetry (or opentelemetry...) logger doesn't handle messages from this logger.
Muting the whole logger sounds like overkill, this seems like a clear case for a debug log.
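For reference, the two options discussed above look roughly like this (the logger name is the one mentioned; everything else is illustrative):

    import logging

    # Option 1: stop this logger from propagating to ancestor loggers
    # (e.g. the root logger configured via logging.basicConfig()).
    logging.getLogger("opentelemetry.ext.azure_monitor.transport").propagate = False

    # Option 2 (preferred here): keep propagation but demote the message to
    # DEBUG, so users who run basicConfig(level=logging.INFO) don't see it
    # unless they explicitly enable debug logging.
    logger = logging.getLogger("opentelemetry.ext.azure_monitor.transport")
    logger.debug("Transmission succeeded: %s.", "<response text>")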
    FAILED_NOT_RETRYABLE = 2


class MetricsExporter:
I'm generally not clear on the strategy to put exporters in the SDK. This effectively means that probably everyone who uses opentelemetry-python will consume both the API and SDK, as I doubt vendors will go write their own.
But if there's a use case I'm not considering, I'd love an example.
Are you referring to the case where vendors use their own SDK but want to use one of our exporters? In that case they would have to have a dependency on our SDK (which is wrong). Maybe we need the common exporter behavior in the API.
When vendors implement their own SDK chances are they hard-code a vendor-specific "exporter" equivalent too (though the vendor SDK might not have the concept of an exporter, as there may be only one vendor-specific data format emitted by it).
There's some conversation in open-telemetry/opentelemetry-specification#186, but no real justification for keeping exporters out of the API altogether.
As I understand it, the reasoning is that applications should take a dependency on some SDK, and only applications need to deal with exporting spans/metrics. Libraries should only depend on the API. They can generate spans/metrics, but should leave it up to the application to export them.
If the purpose of the API is to give libraries a lightweight way to generate spans/metrics then there's no reason to include an export API. If it's to give implementers (read: vendors) a set of common interfaces and data types then there's a good reason to include it.
    ) -> "MetricsExportResult":
        for metric_tuple in metric_tuples:
            handle = metric_tuple[0].get_handle(metric_tuple[1])
            print(
Could we have this be emitted to a logger vs printed to stdout? I'm thinking it would be nice to redirect this stream to something else (e.g. some streaming service), and log handlers are a great way to redirect that stream.
Conversely, I could imagine maybe passing a file object and then writing to it?
I believe the main idea of the console exporter is simple debugging, so printing to the console could be enough. Although there may be cases where the output gets overloaded, in which case redirecting to a file would make debugging easier. I think we can revisit this in a separate issue.
IIRC we talked about this and decided it made sense to have separate console and logging exporters so users would have an impossible-to-screw-up way to test that telemetry is working. We considered implementing the console logger as a special case of the logging exporter but decided against it because it would be possible to modify the logger (or handler directing logs to stdout) via config.
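If the file-object idea is revisited, a sketch could look like this (the `out` parameter and the record formatting are hypothetical; only the class name and the `get_handle` call mirror the PR):

    import sys


    class ConsoleMetricsExporter:
        """Prints metric records; `out` can be any writable text stream."""

        def __init__(self, out=sys.stdout):
            self.out = out

        def export(self, metric_tuples):
            for metric, label_values in metric_tuples:
                handle = metric.get_handle(label_values)
                self.out.write("{}(data={})\n".format(type(self).__name__, handle))


    # Default is stdout; redirect by passing any file-like object instead:
    # with open("metrics.out", "w") as f:
    #     exporter = ConsoleMetricsExporter(out=f)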
Would it be valuable to add some unit tests? Specifically around the ConsoleMetricsExporter at least.

@lzchen Sorry to put an unrelated question in the PR, but: are there any updates on metrics processors? Really curious how we're gonna hook this all up in practice.

Yes, most definitely. I'll open an issue to track this. [#187]

@toumorokoshi
Oberon00 left a comment
Requesting changes here as IMHO the PR does too many things that do not belong under the title. It should be split up.
Feel free to dismiss my review though if the PR would otherwise make it into the release today.
    logger.warning("Monotonic counter cannot descend.")
    return
self.data += value
self.timestamp = time_ns()
If this is really required, I'd name it "last_update_timestamp" and also take the timestamp as early as possible, as it should probably represent the timestamp the value is from, not the timestamp it was recorded in the metric (although maybe it should, I don't know 😄)
This is a good idea. I'll rename the field.
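Roughly what that could look like in the counter handle (sketch only; the surrounding class in the PR has more fields):

    import logging
    from time import time_ns  # Python 3.7+ stdlib; stands in for the PR's time_ns helper

    logger = logging.getLogger(__name__)


    class CounterHandle:
        def __init__(self, monotonic=True):
            self.monotonic = monotonic
            self.data = 0
            self.last_update_timestamp = time_ns()

        def add(self, value):
            # Take the timestamp as early as possible so it reflects when the
            # value was produced rather than when bookkeeping finished.
            timestamp = time_ns()
            if self.monotonic and value < 0:
                logger.warning("Monotonic counter cannot descend.")
                return
            self.data += value
            self.last_update_timestamp = timestamp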
| """ | ||
|
|
||
|
|
||
| class ConsoleMetricsExporter(MetricsExporter): |
This class also does not seem to belong under this PR title.
Normally I would split this up into multiple PRs. However, for the sake of the Alpha, I will update the PR title and add more of a description to include both changes.
Title updated, OK for alpha (didn't review most of the code though)
c24t left a comment
Looks good overall, but I agree that that log line should be debug-level.
console_exporter.export([(counter, label_values), (gauge, label_values2)])
exporter.export([(counter, label_values), (gauge, label_values2)])
Not sure if this is what we want. I guess this is not what we want the user to use in order to export metrics?
I do not think we will use exporter.export explicitly to export metrics. With that being said, this is in the examples and exposed to the users, so it might be confusing in the future. However, this is the only way currently to export metrics. If we do not include it in the examples, then creating a metrics exporter would have been pointless, as users would not know how to use it (even if improperly).
Then we probably don't want to release this piece?
Makes sense. In that case, I will add a TODO for the rest of the export pipeline and remove the example for now until we decide on implementation details. These should be decided very soon. As for this PR, since it is not crucial for the Alpha release anymore, I will split this into separate PRs: Azure Monitor Metrics Exporter and Console Metrics Exporter.
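For context, the deferred pipeline would presumably put a controller between the meter and the exporter, so applications never call export() directly. A purely hypothetical sketch (nothing below is spec'd yet, and get_metric_tuples() is an invented hook):

    import threading


    class PushController:
        """Hypothetical: periodically collects metrics and pushes them to an exporter."""

        def __init__(self, meter, exporter, interval=60.0):
            self.meter = meter
            self.exporter = exporter
            self.interval = interval
            self._timer = None

        def start(self):
            self._tick()

        def _tick(self):
            # Assumes the meter can report what was recorded since the last tick.
            self.exporter.export(self.meter.get_metric_tuples())
            self._timer = threading.Timer(self.interval, self._tick)
            self._timer.daemon = True
            self._timer.start()

        def shutdown(self):
            if self._timer is not None:
                self._timer.cancel()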
try:
    text = response.text
except Exception:  # noqa pylint: disable=broad-except
    logger.exception("Error while reading response body %s.")
The format string is wrong.
Why do we want logger.exception here?
Does it not make sense to log at an exception level when an exception is raised?
I don't think so. We choose log level based on what we expect the users to see.
To clarify, is it because we do not expect users to configure their logging levels to expose exception level logs?
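Concretely, the format string above has a %s placeholder but no argument. A fixed, lower-severity version of that block (shown in the context of the transport code quoted above, so `response` and `logger` come from there) might be:

    try:
        text = response.text
    except Exception:  # noqa pylint: disable=broad-except
        # logger.exception() always logs at ERROR and includes the traceback.
        # Dropping the dangling %s fixes the format string; using warning()
        # with exc_info=True keeps the traceback at a less alarming level.
        logger.warning("Error while reading response body.", exc_info=True)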
self.value_type = value_type
self.enabled = enabled
self.monotonic = monotonic
self.last_update_timestamp = ns_to_iso_str(time_ns())
I thought we wanted to use nanoseconds since the epoch as the time format across the SDK, and do the conversion in the exporter?
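i.e. keep raw nanoseconds in the SDK and convert only when building the exporter payload. A small illustration (the ns_to_iso_str below is a local stand-in for the helper of the same name):

    import datetime
    from time import time_ns


    def ns_to_iso_str(nanoseconds):
        # Stand-in for the SDK helper: epoch nanoseconds -> ISO 8601 (UTC).
        dt = datetime.datetime.fromtimestamp(nanoseconds / 1e9, tz=datetime.timezone.utc)
        return dt.isoformat()


    class CounterHandle:
        def __init__(self):
            self.data = 0
            # SDK side: store raw nanoseconds since the epoch.
            self.last_update_timestamp = time_ns()


    def envelope_time(handle):
        # Exporter side: convert to the wire format only at export time.
        return ns_to_iso_str(handle.last_update_timestamp)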
    )
except requests.RequestException:
    logger.exception("Transient client side error.")
    return self.export_result_type.FAILED_RETRYABLE
This seems like a dangerous approach given the result_type is coming from two different places.
I agree. SUCCESS, FAILED_RETRYABLE and FAILED_NOT_RETRYABLE seem to be generic enough results to be shared across traces and metrics. I will consolidate them into one result type in the SDK.
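A sketch of that consolidation (names and numeric values illustrative):

    from enum import Enum


    class ExportResult(Enum):
        """Single result type shared by span and metrics exporters in the SDK."""

        SUCCESS = 0
        FAILED_RETRYABLE = 1
        FAILED_NOT_RETRYABLE = 2


    class MetricsExporter:
        def export(self, metric_tuples) -> ExportResult:
            raise NotImplementedError


    class SpanExporter:
        def export(self, spans) -> ExportResult:
            raise NotImplementedError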
[WIP] Wait for export pipeline to be spec'd out.

Follows SDK spec

* chore: add Pull request template
* fix: review comment
Implementation of Metrics Exporter for Azure Monitor. Currently there is no concept of metrics processor or a polling mechanism, so export is called directly with the exporter.
Most of the code was copied from OpenCensus, with some style adjustments.
This PR also implements the Console Metrics Exporter, which takes the metrics and prints them to the console.
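For a sense of what the Azure Monitor export step does with each (metric, label_values) pair, here is a heavily simplified sketch; the envelope field names and metric attributes below are illustrative rather than taken from this PR:

    def metric_tuple_to_envelope(metric, label_values, instrumentation_key):
        # Each pair becomes one Application Insights-style envelope (simplified).
        handle = metric.get_handle(label_values)
        return {
            "iKey": instrumentation_key,
            "time": handle.last_update_timestamp,
            "data": {
                "metrics": [{"name": metric.name, "value": handle.data}],
                "properties": dict(zip(metric.label_keys, label_values)),
            },
        }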