fix(groq): align instrumentation with OTel GenAI semconv v1.40.0#4010
fix(groq): align instrumentation with OTel GenAI semconv v1.40.0#4010lenatraceloop wants to merge 6 commits intomainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughMigrates Groq instrumentation to OpenTelemetry GenAI semantic conventions: switches span naming/attributes to GenAI keys, aggregates streaming content and tool-call deltas across chunks, normalizes finish reasons, forwards tool_call data into events, and tightens span lifecycle and error handling for sync and async flows. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Instr as Groq Instrumentation
participant StreamProc as Stream Processor
participant SpanUtils as Span Utils
participant EventEm as Event Emitter
participant Exporter as Exporter/Logger
Client->>Instr: invoke chat call (sync/async/streaming)
Instr->>SpanUtils: start span (GenAI provider/operation/request.model)
alt streaming
Instr->>StreamProc: iterate chunks
StreamProc->>SpanUtils: _process_streaming_chunk -> (content, tool_calls deltas, finish_reasons)
StreamProc->>StreamProc: _accumulate_tool_calls (merge deltas by index)
StreamProc->>SpanUtils: set_streaming_response_attributes(accumulated content, tool_calls)
StreamProc->>EventEm: emit_streaming_response_events(tool_calls, finish_reasons)
else non-streaming
Instr->>SpanUtils: set_model_response_attributes(choices, finish_reasons)
Instr->>EventEm: emit_choice/message events(tool_calls)
end
SpanUtils->>Exporter: set span attributes, set status (OK/ERROR), end span
EventEm->>Exporter: emit events/logs
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/event_emitter.py (1)
57-58: Minor: Redundantor ""after_map_groq_finish_reason.The
_map_groq_finish_reasonfunction already returns""for falsy inputs (seespan_utils.pylines 33-35), so theor ""is redundant. Not a bug, just slightly redundant.🔧 Optional simplification
- finish_reason=_map_groq_finish_reason(choice.finish_reason) or "", + finish_reason=_map_groq_finish_reason(choice.finish_reason),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/event_emitter.py` around lines 57 - 58, Remove the redundant "or ''" in the finish_reason assignment in event_emitter.py: use finish_reason=_map_groq_finish_reason(choice.finish_reason) since _map_groq_finish_reason already returns an empty string for falsy inputs; this simplifies the code and avoids unnecessary fallback duplication while leaving span_utils._map_groq_finish_reason behavior unchanged.packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/span_utils.py (1)
164-175: Unusedusageparameter in function signature.The
usageparameter is accepted but not used within the function body. If this is intentional (e.g., for API consistency), consider documenting it. Otherwise, it could be removed.💡 Suggested fix to remove unused parameter
-def set_streaming_response_attributes(span, accumulated_content, finish_reason=None, usage=None, tool_calls=None): +def set_streaming_response_attributes(span, accumulated_content, finish_reason=None, tool_calls=None):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/span_utils.py` around lines 164 - 175, The function set_streaming_response_attributes declares an unused parameter usage; remove the unused parameter from the function signature (and any associated type hints/docs) and update any call sites of set_streaming_response_attributes to stop passing usage, or alternatively if usage is intended to be kept for API compatibility, add a brief docstring comment noting it is intentionally unused (e.g., "_unused usage for API compatibility") and/or reference it in a noop way; locate the symbol set_streaming_response_attributes to make the change and update tests or callers accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py`:
- Around line 204-206: The loop that merges streaming chunk finish reasons
currently deduplicates values and should instead preserve one entry per choice:
in the block handling chunk_finish_reasons, remove the conditional check "if fr
not in accumulated_finish_reasons" and always append each fr to
accumulated_finish_reasons (i.e., for each fr in chunk_finish_reasons do
accumulated_finish_reasons.append(fr)), so the streaming path matches the
non-streaming path and satisfies the OTel GenAI semconv
gen_ai.response.finish_reasons requirement.
In
`@packages/opentelemetry-instrumentation-groq/tests/traces/test_event_emitter.py`:
- Around line 173-184: The code in _emit_choice_event is currently removing
"role" whenever should_send_prompts() (TRACELOOP_TRACE_CONTENT) is False; change
it so only "content" is redacted when tracing is disabled and do not
blanket-remove "role" — preserve "role" except in the existing
assistant-deduplication branch that intentionally strips role for assistant
messages; update the logic in _emit_choice_event to stop the unconditional
deletion of message["role"] when should_send_prompts() is False so the test in
tests/traces/test_event_emitter.py
(test_no_prompts_removes_content_and_role_from_message) will only assert content
is removed and role remains.
In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py`:
- Around line 347-361: Add an async test in the TestAwrap class that mirrors the
existing suppression test but sets _SUPPRESS_INSTRUMENTATION_KEY in the context
instead of SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY; use
context_api.attach(context_api.set_value(_SUPPRESS_INSTRUMENTATION_KEY, True)),
call wrapper = _awrap(tracer, None, None, None, None, {}), await wrapper on an
AsyncMock like in the current test, then assert the result and that
tracer.start_span.assert_not_called() to confirm spans are suppressed for the
other key.
---
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/event_emitter.py`:
- Around line 57-58: Remove the redundant "or ''" in the finish_reason
assignment in event_emitter.py: use
finish_reason=_map_groq_finish_reason(choice.finish_reason) since
_map_groq_finish_reason already returns an empty string for falsy inputs; this
simplifies the code and avoids unnecessary fallback duplication while leaving
span_utils._map_groq_finish_reason behavior unchanged.
In
`@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/span_utils.py`:
- Around line 164-175: The function set_streaming_response_attributes declares
an unused parameter usage; remove the unused parameter from the function
signature (and any associated type hints/docs) and update any call sites of
set_streaming_response_attributes to stop passing usage, or alternatively if
usage is intended to be kept for API compatibility, add a brief docstring
comment noting it is intentionally unused (e.g., "_unused usage for API
compatibility") and/or reference it in a noop way; locate the symbol
set_streaming_response_attributes to make the change and update tests or callers
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6ebef7cc-769a-4131-b78c-5717f47be9d8
⛔ Files ignored due to path filters (1)
packages/opentelemetry-instrumentation-groq/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/event_emitter.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/event_models.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/span_utils.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/utils.pypackages/opentelemetry-instrumentation-groq/tests/traces/conftest.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_chat_tracing.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_event_emitter.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_finish_reasons.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_init.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_span_utils.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_utils.py
|
Hi CodeRabbit! Thanks for the detailed review — really appreciate the thoroughness 😊 Here's a summary of how we addressed each point: ✅ Addressed in the latest commit:
🟡 Intentionally kept as-is:
Thanks again for the great catch on the deduplication — that was a real correctness issue! 🙌 |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py (1)
482-506: Consider adding assertions to metrics disabled test.The
test_metrics_disabled_sets_histograms_to_nonetest runs without errors but doesn't assert that histograms are actuallyNone. While it works as a smoke test, adding an assertion would make the test name more accurate.💡 Possible enhancement
If you want to verify the histograms are None, you could patch and inspect the internal state, though this may require exposing internals. Alternatively, rename the test to
test_metrics_disabled_does_not_raiseto better reflect its current behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py` around lines 482 - 506, The test test_metrics_disabled_sets_histograms_to_none currently only exercises GroqInstrumentor.instrument/uninstrument but doesn't assert histograms are None; update the test to after instrumentor.instrument(...) assert that the instrumentor's histogram attributes (e.g., instrumentor._request_histogram and instrumentor._response_histogram or whatever internal names GroqInstrumentor uses for request/response histograms) are None, or if internals are intentionally private/unstable, rename the test to test_metrics_disabled_does_not_raise to match behavior; locate the symbols GroqInstrumentor, instrument(), and the histogram attribute names in the GroqInstrumentor implementation to pick the correct properties to assert.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py`:
- Around line 176-186: The test comment incorrectly documents
_process_streaming_chunk's return for empty choices; update the inline comment
in test_span_not_recording_skips_set_status to reflect the actual 4-tuple return
(None, [], [], None) instead of (None, None, None); locate the comment near the
test function name test_span_not_recording_skips_set_status and references to
_process_streaming_chunk and _create_stream_processor and change the comment
text accordingly.
---
Nitpick comments:
In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py`:
- Around line 482-506: The test test_metrics_disabled_sets_histograms_to_none
currently only exercises GroqInstrumentor.instrument/uninstrument but doesn't
assert histograms are None; update the test to after
instrumentor.instrument(...) assert that the instrumentor's histogram attributes
(e.g., instrumentor._request_histogram and instrumentor._response_histogram or
whatever internal names GroqInstrumentor uses for request/response histograms)
are None, or if internals are intentionally private/unstable, rename the test to
test_metrics_disabled_does_not_raise to match behavior; locate the symbols
GroqInstrumentor, instrument(), and the histogram attribute names in the
GroqInstrumentor implementation to pick the correct properties to assert.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7d56602a-a73f-4870-ab36-68f57c161f07
📒 Files selected for processing (4)
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/event_emitter.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/span_utils.pypackages/opentelemetry-instrumentation-groq/tests/traces/test_init.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py (1)
245-255: Assert the actual ERROR status on exception paths.These tests cover a key PR objective, but most only verify that
set_status()was called—or don’t verify async API exception status/end at all. A regression that setsOK, omits async status, or uses the wrong status would still pass.🧪 Tighten the exception assertions
+from opentelemetry.trace import StatusCode + @@ span.end.assert_called_once() # span must be ended even on exception span.set_status.assert_called_once() + assert span.set_status.call_args.args[0].status_code == StatusCode.ERROR @@ span.end.assert_called_once() + span.set_status.assert_called_once() + assert span.set_status.call_args.args[0].status_code == StatusCode.ERROR @@ duration_histogram.record.assert_called_once() + span.end.assert_called_once() + span.set_status.assert_called_once() + assert span.set_status.call_args.args[0].status_code == StatusCode.ERROR @@ span.end.assert_called_once() + span.set_status.assert_called_once() + assert span.set_status.call_args.args[0].status_code == StatusCode.ERRORAlso applies to: 303-321, 378-392, 436-455
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py` around lines 245 - 255, Update the exception-path assertions to verify that span.set_status was called with an actual ERROR status (not just called) — in test_api_exception_records_duration_and_reraises ensure span.set_status was asserted with a Status containing StatusCode.ERROR and keep span.end.assert_called_once(); apply the same stronger assertion to the async exception tests referenced (the other tests around lines 303-321, 378-392, 436-455) so they verify both span.end was called and span.set_status was invoked with an ERROR status rather than merely asserting it was called.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py`:
- Around line 503-507: Remove the broad try/except that swallows errors from
instrumentor.uninstrument(); instead call instrumentor.uninstrument() directly
so failures surface, or if a particular exception is expected use an explicit
assertion (e.g., with pytest.raises(ExpectedException):
instrumentor.uninstrument()). Update the test around the
instrumentor.uninstrument() call (the uninstrument invocation) to either remove
the try/except block or replace it with a targeted pytest.raises for the
specific exception.
---
Nitpick comments:
In `@packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py`:
- Around line 245-255: Update the exception-path assertions to verify that
span.set_status was called with an actual ERROR status (not just called) — in
test_api_exception_records_duration_and_reraises ensure span.set_status was
asserted with a Status containing StatusCode.ERROR and keep
span.end.assert_called_once(); apply the same stronger assertion to the async
exception tests referenced (the other tests around lines 303-321, 378-392,
436-455) so they verify both span.end was called and span.set_status was invoked
with an ERROR status rather than merely asserting it was called.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fa3e3b0c-ae07-4a90-862b-44202a26e07a
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-groq/tests/traces/test_init.py
| finish_reasons = [] | ||
| for choice in chunk.choices: | ||
| delta = choice.delta | ||
| if hasattr(delta, "content") and delta.content: |
There was a problem hiding this comment.
Isnt this hasattr(delta, "content") the same as delta.content?
| delta = choice.delta | ||
| if hasattr(delta, "content") and delta.content: | ||
| content += delta.content | ||
| if hasattr(delta, "tool_calls") and delta.tool_calls: |
| Delta objects may be Pydantic models or dicts; arguments arrive as JSON fragments. | ||
| """ | ||
| for tc in tool_calls_delta: | ||
| if isinstance(tc, dict): |
There was a problem hiding this comment.
When is the tool calls bit a dict?
| return accumulated | ||
|
|
||
|
|
||
| def _handle_streaming_response(span, accumulated_content, tool_calls, finish_reasons, usage, event_logger): |
There was a problem hiding this comment.
You should state the type of each param in this function
| if chunk_finish_reason: | ||
| finish_reason = chunk_finish_reason | ||
| if tool_calls_delta: | ||
| _accumulate_tool_calls(accumulated_tool_calls, tool_calls_delta) |
There was a problem hiding this comment.
You do not use the return value here..
There was a problem hiding this comment.
Fixed: _accumulate_tool_calls now returns None — modifies the dict in-place, no return value needed.
| if chunk_finish_reason: | ||
| finish_reason = chunk_finish_reason | ||
| if tool_calls_delta: | ||
| _accumulate_tool_calls(accumulated_tool_calls, tool_calls_delta) |
There was a problem hiding this comment.
Same here - you do not use the return file
There was a problem hiding this comment.
Fixed: _accumulate_tool_calls now returns None — modifies the dict in-place, no return value needed.
| _handle_response(span, response, token_histogram, event_logger) | ||
|
|
||
| _handle_response(span, response, token_histogram, event_logger) | ||
| except Exception as ex: # pylint: disable=broad-except |
There was a problem hiding this comment.
Why do you need th pylint comment here?
There was a problem hiding this comment.
not needed, removed
Summary
Fixes TLP-1986
Data:
https://app.traceloop.dev/projects/lena-bedrock-anthropic-js-tests/trace?projectSlug=lena-bedrock-anthropic-js-tests&tab=traces&date_setting=%7B%22preset%22%3A%22Last+7+days%22%7D
Changes
Bug fixes:
_wrapand_awrap) —span.end()+ERRORstatus now always called_awrapnon-streaming response handling wrapped intry/except(symmetric with_wrap)delta.tool_callsnever accumulated — fixed via_accumulate_tool_callshelperemit_choice_eventsdroppedtool_callsfromgen_ai.choiceeventsawaiton async generator causedTypeErrorin async streaming pathSemconv compliance:
gen_ai.tool.definitionsnow Opt-In (gated byshould_send_prompts()) per OTel specfinish_reasondefault changed from"unknown"→""per OutputMessage JSON schemafinish_reasonsaccumulated as list (correct forn > 1)_tool_calls_to_partshandles both dict and Pydantic object representationsTest plan
uv run pytest tests/traces/— all 114 tests passuv run ruff check .— no linting errorsSummary by CodeRabbit
New Features
Bug Fixes
Tests