Implement shutdown procedure for OTLP grpc exporters#3138
Implement shutdown procedure for OTLP grpc exporters#3138srikanthccv merged 14 commits intoopen-telemetry:mainfrom
Conversation
- Add `_shutdown` variable for checking if the exporter has been shutdown. - Prevent export if the `_shutdown` flag has been set. Log a warning message is exporter has been shutdown. - Use thread lock to synchronize the last export call before shutdown timeout. The `shutdown` method will wait until the `timeout_millis` if there is an ongoing export. If there is no ongiong export, set the `_shutdown` flag to prevent further exports and return. - Add unit tests for the `OTLPExporterMixIn` and the sub classes for traces and metrics.
...pentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/exporter.py
Show resolved
Hide resolved
|
@srikanthccv Any opinions on the open question in the PR description? After more investigation I found out that if the code doesn't end the exponential back off in the |
As soon as the shutdown request is received, it should do one last flush before exiting. IIRC there was no reliable way to cancel the ongoing export call and just try once before shutting down everything. I haven't looked at the implementation but if you have some ideas please suggest them. |
|
Combining the last flush with the shutdown process can make the logic a bit more complicated. If nobody is complaining then let's leave it as it is. Otherwise we would need to change the |
|
Please fix the lint |
|
Thank you! |
| return self._export(spans) | ||
|
|
||
| def shutdown(self) -> None: | ||
| OTLPExporterMixin.shutdown(self) |
There was a problem hiding this comment.
@girishc13 Can you please help me understand why you are calling shutdown on mixin class directly instead of using super()?
There was a problem hiding this comment.
I don't remember the exact details but the shutdown method is implemented by both the OTLPExporterMixin and the SpanExporter interfaces. The exporter.shutdown is handled by different logic and this pr was targeting the backend client that needs to be shutdown. You need to trace the calls for shutdown from the usage of the OTLPSpanExporter.
There was a problem hiding this comment.
I think the issue maybe due to mixin inheritance being applied incorrectly, currently we have
class OTLPSpanExporter(SpanExporter, OTLPExporterMixin[...]):but usually in python mixin should come before the base class, e.g.
class OTLPSpanExporter(OTLPExporterMixin[...], SpanExporter):this way, super().shutdown() call will correctly use shutdown method from mixin
|
@girishc13 @lzchen @srikanthccv why in grpc verision, use _export_lock, and why http.OTLPSpanExporter not need the lock flag? |
|
Hi @Liubey, either there was no HTTP exporter at the time or the author only addressed the issue for the gRPC exporter. |
Description
Partial fix for #1791. Implements the
shutdownprocedure for OTLP gRPC exporters._shutdownvariable for checking if the exporter has been shutdown._shutdownflag has been set. Log a warning message is exporter has been shutdown.shutdownmethod will wait until thetimeout_millisif there is an ongoing export. If there is no ongiong export, set the_shutdownflag to prevent further exports and return.OTLPExporterMixInand the sub classes for traces and metrics.Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
TestOTLPExporterMixintest_shutdown: assert that the further exports don't happen after calling theshutdownmethod.test_shutdown_wait_last_export: assert that the shutdown method waits for at leasttimeout_millistime if an export is in progress before setting the_shutdownflag to prevent further exports.TestOTLPMetricExporterandTestOTLPTracesExporterDoes This PR Require a Contrib Repo Change?
Answer the following question based on these examples of changes that would require a Contrib Repo Change:
The OTel specification has changed which prompted this PR to update the method interfaces of
opentelemetry-api/oropentelemetry-sdk/The method interfaces of
test/utilhave changedScripts in
scripts/that were copied over to the Contrib repo have changedConfiguration files that were copied over to the Contrib repo have changed (when consistency between repositories is applicable) such as in
pyproject.tomlisort.cfg.flake8When a new
.github/CODEOWNERis addedMajor changes to project information, such as in:
README.mdCONTRIBUTING.mdYes. - Link to PR:
No.
Checklist:
tox -e linton my machine due to some issue with thedistutilsmodule. I have manually runblackandisortcommands but there are some changes which might be questionable. I'll try to fix thedistutilspackage issues.Open Question
The OTLP gRPC exporters run in a daemon thread with an exponential backoff of max 64 seconds. This is questionable because this often leaves a hanging thread even if the main process has been shutdown. Should the export function stop after the shutdown flag has been set even if the retry is in progress?