feat(instrumentation-openai): Migrate OpenAI instrumentation to OTel 1.40 semantic conventions #909
Conversation

> Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings, or use the review commands to manage reviews.
📝 Walkthrough

This PR migrates OpenAI instrumentation to the standardized OpenTelemetry GenAI semantic conventions (v1.40.0), replacing legacy prompt/completion attributes with structured input/output messages. It updates OpenAI SDK support to v6, introduces new message-building helpers in instrumentation-utils, and updates all test assertions accordingly.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client Code
    participant Instr as OpenAI Instrumentation
    participant Helpers as Message Helpers
    participant OTel as OTel Attributes
    participant SDK as OpenAI SDK
    Client->>Instr: Chat/Image Request
    Instr->>Helpers: buildOpenAIInputMessages(messages)
    Helpers->>Helpers: mapOpenAIContentBlock(parts)
    Helpers->>OTel: gen_ai.input.messages
    Instr->>SDK: Execute Request
    SDK->>SDK: Process (model inference)
    SDK-->>Instr: Response (choices, usage)
    Instr->>Helpers: buildOpenAIOutputMessage(choice)
    Helpers->>Helpers: Map finish_reason via map
    Helpers->>Helpers: Build tool_calls/content parts
    Helpers->>OTel: gen_ai.output.messages
    Instr->>OTel: gen_ai.provider_name<br/>gen_ai.operation_name<br/>gen_ai.usage.input_tokens<br/>gen_ai.usage.output_tokens<br/>gen_ai.response.finish_reasons
    Instr-->>Client: Return Response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🧹 Nitpick comments (5)
packages/instrumentation-openai/src/instrumentation.ts (1)

872-874: OpenRouter provider uses a raw string instead of a constant.

Unlike other providers, which use `GEN_AI_PROVIDER_NAME_VALUE_*` constants, OpenRouter returns a hardcoded `"openrouter"` string. This is acceptable since there's no OTel standard constant for OpenRouter. Consider defining a local constant for consistency if more non-standard providers are added.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/instrumentation.ts` around lines 872 - 874: the return for the OpenRouter detection currently uses a hardcoded string in the conditional that checks baseURL.includes("openrouter") and returns { provider: "openrouter" }; define a local constant (e.g., GEN_AI_PROVIDER_NAME_VALUE_OPENROUTER) near the other GEN_AI_PROVIDER_NAME_VALUE_* constants and replace the literal with that constant in the detection branch inside instrumentation.ts so the provider name follows the same constant-based style as other providers (update the conditional/return that uses baseURL.includes("openrouter")).

packages/instrumentation-openai/src/message-helpers.ts (1)
270-277: Audio response handling is missing `mime_type`.

The blob part for audio responses includes `modality: "audio"` but no `mime_type`. OpenAI's audio response format isn't captured here. Consider adding `mime_type` if available from `message.audio.format`, or defaulting to a common format.

♻️ Proposed fix to include mime_type

```diff
 // Audio response
 if (message.audio?.data) {
   parts.push({
     type: "blob",
     modality: "audio",
+    mime_type: `audio/${message.audio.format || "wav"}`,
     content: message.audio.data,
   });
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/message-helpers.ts` around lines 270 - 277: the audio blob part currently pushed by parts.push({ type: "blob", modality: "audio", content: message.audio.data }) lacks a mime_type; update the logic that handles message.audio?.data to include a mime_type property (prefer message.audio.format or message.audio.mime_type if present, otherwise default to a sensible value like "audio/mpeg" or "audio/wav") when constructing the blob object so the audio modality and format are both captured.

packages/instrumentation-utils/src/content-block-mappers.ts (1)
185-200: Data URI parsing may miss edge cases.

The regex `/^data:([^;]+);base64,(.+)$/` assumes:

- A semicolon always separates the MIME type from the encoding
- The encoding is always `base64`

Some valid data URIs use different formats (e.g., `data:text/plain,Hello` without encoding, or `data:image/png;charset=utf-8;base64,...`). However, for OpenAI image inputs, `base64` is the standard, so this is acceptable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-utils/src/content-block-mappers.ts` around lines 185 - 200: the data-URI handling in the "image_url" case currently uses url.match(/^data:([^;]+);base64,(.+)$/) which misses variants like charset parameters; update the logic in the image_url branch (the url variable and the match check) to only treat as a blob when the data URI actually contains a base64 segment and to extract the MIME type even if additional params exist (e.g., handle patterns with extra params before ;base64), otherwise fall back to returning the URI; implement this by loosening the match to allow optional params before the final ";base64" (case-insensitive) or by parsing the data URI components and checking for a base64 marker, then populate mime_type and content only when base64 is confirmed.

packages/instrumentation-openai/src/image-wrappers.ts (1)
229-245: JSON parse/stringify pattern is fragile but functional.

The code parses the existing `ATTR_GEN_AI_INPUT_MESSAGES` JSON, appends a new message, and re-serializes. This works but has risks:

- If `existingMessages` is malformed, `JSON.parse` throws
- Type safety is lost during the parse

Consider building the messages array in memory before serializing once, rather than parsing/modifying/re-serializing.
♻️ Suggested refactor to avoid parse/stringify cycle

```diff
 export async function setImageEditRequestAttributes(
   span: Span,
   params: ImageEditParams,
   uploadCallback?: ImageUploadCallback,
 ): Promise<void> {
   const attributes: Attributes = {};
+  const inputMessages: object[] = [];
   if (params.model) {
     attributes[ATTR_GEN_AI_REQUEST_MODEL] = params.model;
   }
   // ... other attributes ...
   if (params.prompt) {
-    attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify([
-      { role: "user", parts: [{ type: "text", content: params.prompt }] },
-    ]);
+    inputMessages.push({
+      role: "user",
+      parts: [{ type: "text", content: params.prompt }],
+    });
   }
   // Process input image if upload callback is available
   if (params.image && uploadCallback && ...) {
     // ...
     if (imageUrl) {
-      // Add the image as a second input message
-      const existingMessages = attributes[ATTR_GEN_AI_INPUT_MESSAGES];
-      if (existingMessages) {
-        const parsed = JSON.parse(existingMessages as string);
-        parsed.push({
-          role: "user",
-          parts: [{ type: "uri", modality: "image", uri: imageUrl }],
-        });
-        attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify(parsed);
-      } else {
-        attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify([
-          {
-            role: "user",
-            parts: [{ type: "uri", modality: "image", uri: imageUrl }],
-          },
-        ]);
-      }
+      inputMessages.push({
+        role: "user",
+        parts: [{ type: "uri", modality: "image", uri: imageUrl }],
+      });
     }
   }
+  if (inputMessages.length > 0) {
+    attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify(inputMessages);
+  }
   // ... set attributes on span ...
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/image-wrappers.ts` around lines 229 - 245, The current code directly JSON.parse(attributes[ATTR_GEN_AI_INPUT_MESSAGES]) which will throw on malformed JSON and loses type safety; instead, read attributes[ATTR_GEN_AI_INPUT_MESSAGES] into a local variable, attempt a guarded parse (try/catch) or validate its type, fall back to an empty array on failure, then create a newMessages array in memory, push the new message object ({ role: "user", parts: [{ type: "uri", modality: "image", uri: imageUrl }] }) into that array, and finally set attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify(newMessages); use ATTR_GEN_AI_INPUT_MESSAGES, attributes, and imageUrl to locate and replace the parse/modify/re-serialize pattern.packages/instrumentation-openai/test/instrumentation.test.ts (1)
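The guarded-parse approach described above can be sketched as follows; the helper names `safeParseMessages` and `appendImageMessage` are illustrative only, not part of the package:

```typescript
type MessagePart = { type: string; modality?: string; uri?: string; content?: string };
type InputMessage = { role: string; parts: MessagePart[] };

// Parse a previously serialized messages attribute, falling back to an
// empty array when the value is missing, not a string, or malformed JSON.
function safeParseMessages(value: unknown): InputMessage[] {
  if (typeof value !== "string") return [];
  try {
    const parsed = JSON.parse(value);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Append an image-reference message and re-serialize exactly once.
function appendImageMessage(existing: unknown, imageUrl: string): string {
  const messages = safeParseMessages(existing);
  messages.push({
    role: "user",
    parts: [{ type: "uri", modality: "image", uri: imageUrl }],
  });
  return JSON.stringify(messages);
}
```

Malformed input degrades to a single-message array instead of throwing inside the instrumentation hot path.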
packages/instrumentation-openai/test/instrumentation.test.ts (1)

182-186: Inconsistent type comparison for token values.

Token assertions mix string and number comparisons across tests:

- Lines 183-184: `"15"` (string)
- Line 231: `"8"` (string)
- Line 414: `82` (number)
- Line 492: `82` (number)

While `assert.equal` uses loose equality, so these will pass, the inconsistency suggests the implementation may return different types in different contexts, or this is a test oversight. Consider normalizing to a consistent type.

Suggested fix for consistency
```diff
 assert.equal(
   completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS],
-  "15",
+  15,
 );
```

Apply similar changes to line 231 (`"8"` → `8`).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 182 - 186, The token assertions are inconsistent: some tests expect strings ("15", "8") while others expect numbers (82); update the tests to normalize types by comparing numeric values consistently. In the assertions that reference completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS] and ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, convert the asserted expected values to numbers (e.g., 15, 8) or coerce the actual attribute with + before asserting so all tests use numeric comparisons consistently; ensure you update both occurrences (the lines using "15" and "8") to match the numeric style used elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/traceloop-sdk/package.json`:
- Line 122: The lockfile is out of sync with package.json's "openai" dependency
("openai": "^6.32.0"); run pnpm install at the repository root to regenerate the
pnpm lockfile (pnpm-lock.yaml), verify the updated lockfile includes the new
openai version, and commit the updated lockfile so CI's pnpm install
--frozen-lockfile succeeds.
---
Nitpick comments:
In `@packages/instrumentation-openai/src/image-wrappers.ts`:
- Around line 229-245: The current code directly
JSON.parse(attributes[ATTR_GEN_AI_INPUT_MESSAGES]) which will throw on malformed
JSON and loses type safety; instead, read attributes[ATTR_GEN_AI_INPUT_MESSAGES]
into a local variable, attempt a guarded parse (try/catch) or validate its type,
fall back to an empty array on failure, then create a newMessages array in
memory, push the new message object ({ role: "user", parts: [{ type: "uri",
modality: "image", uri: imageUrl }] }) into that array, and finally set
attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify(newMessages); use
ATTR_GEN_AI_INPUT_MESSAGES, attributes, and imageUrl to locate and replace the
parse/modify/re-serialize pattern.
In `@packages/instrumentation-openai/src/instrumentation.ts`:
- Around line 872-874: The return for the OpenRouter detection currently uses a
hardcoded string in the conditional that checks baseURL.includes("openrouter")
and returns { provider: "openrouter" }; define a local constant (e.g.,
GEN_AI_PROVIDER_NAME_VALUE_OPENROUTER) near the other
GEN_AI_PROVIDER_NAME_VALUE_* constants and replace the literal with that
constant in the detection branch inside instrumentation.ts so the provider name
follows the same constant-based style as other providers (update the
conditional/return that uses baseURL.includes("openrouter")).
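The constant extraction requested above might look roughly like this; this is a simplified sketch, and the real detection logic in instrumentation.ts handles many more providers:

```typescript
// Hypothetical local constant, following the OTel GEN_AI_PROVIDER_NAME_VALUE_*
// naming style used for the standard providers.
const GEN_AI_PROVIDER_NAME_VALUE_OPENROUTER = "openrouter";

// Simplified provider detection: return the constant instead of a raw literal.
function detectProvider(baseURL: string): { provider: string } | undefined {
  if (baseURL.includes("openrouter")) {
    return { provider: GEN_AI_PROVIDER_NAME_VALUE_OPENROUTER };
  }
  return undefined;
}
```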
In `@packages/instrumentation-openai/src/message-helpers.ts`:
- Around line 270-277: The audio blob part currently pushed by parts.push({
type: "blob", modality: "audio", content: message.audio.data }) lacks a
mime_type; update the logic that handles message.audio?.data to include a
mime_type property (prefer message.audio.format or message.audio.mime_type if
present, otherwise default to a sensible value like "audio/mpeg" or "audio/wav")
when constructing the blob object so the audio modality and format are both
captured.
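The suggested shape of that fix, sketched with assumed types (the `format` field and the `audio/wav` default come from the review comment, not from verified SDK typings):

```typescript
type BlobPart = { type: "blob"; modality: "audio"; mime_type: string; content: string };

// Assumed shape of the relevant slice of the SDK's audio response.
interface AudioResponse {
  data?: string;
  format?: string;
}

// Build an audio blob part, deriving mime_type from the reported format
// and defaulting to audio/wav when no format is present.
function buildAudioPart(audio: AudioResponse | undefined): BlobPart | undefined {
  if (!audio?.data) return undefined;
  return {
    type: "blob",
    modality: "audio",
    mime_type: `audio/${audio.format || "wav"}`,
    content: audio.data,
  };
}
```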
In `@packages/instrumentation-openai/test/instrumentation.test.ts`:
- Around line 182-186: The token assertions are inconsistent: some tests expect
strings ("15", "8") while others expect numbers (82); update the tests to
normalize types by comparing numeric values consistently. In the assertions that
reference completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS] and
ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, convert the asserted expected values to numbers
(e.g., 15, 8) or coerce the actual attribute with + before asserting so all
tests use numeric comparisons consistently; ensure you update both occurrences
(the lines using "15" and "8") to match the numeric style used elsewhere.
In `@packages/instrumentation-utils/src/content-block-mappers.ts`:
- Around line 185-200: The data-URI handling in the "image_url" case currently
uses url.match(/^data:([^;]+);base64,(.+)$/) which misses variants like charset
parameters; update the logic in the image_url branch (the url variable and the
match check) to only treat as a blob when the data URI actually contains a
base64 segment and to extract the MIME type even if additional params exist
(e.g., handle patterns with extra params before ;base64), otherwise fall back to
returning the URI; implement this by loosening the match to allow optional
params before the final ";base64" (case-insensitive) or by parsing the data URI
components and checking for a base64 marker, then populate mime_type and content
only when base64 is confirmed.
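One way to implement the loosened matcher described above (the function name `classifyImageUrl` is illustrative): it tolerates extra parameters such as `;charset=utf-8` before `;base64` and treats anything without a base64 segment as a plain URI:

```typescript
type UriResult =
  | { kind: "uri"; uri: string }
  | { kind: "blob"; mime_type: string; content: string };

// Treat a data URI as a blob only when it actually carries a base64 segment,
// allowing optional parameters between the MIME type and ";base64".
function classifyImageUrl(url: string): UriResult {
  const match = url.match(/^data:([^;,]+)(?:;[^;,]+)*;base64,(.+)$/i);
  if (match) {
    return { kind: "blob", mime_type: match[1], content: match[2] };
  }
  return { kind: "uri", uri: url };
}
```

Non-base64 data URIs like `data:text/plain,Hello` fall through to the URI branch rather than being mis-parsed.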
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6f4c7037-eaba-4676-8e22-a0047995420a
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (9)
- packages/instrumentation-openai/package.json
- packages/instrumentation-openai/src/image-wrappers.ts
- packages/instrumentation-openai/src/instrumentation.ts
- packages/instrumentation-openai/src/message-helpers.ts
- packages/instrumentation-openai/test/instrumentation.test.ts
- packages/instrumentation-utils/src/content-block-mappers.ts
- packages/instrumentation-utils/src/index.ts
- packages/sample-app/package.json
- packages/traceloop-sdk/package.json
```ts
attributes[`${ATTR_GEN_AI_PROMPT}.0.content`] = params.prompt;
attributes[`${ATTR_GEN_AI_PROMPT}.0.role`] = "user";
attributes[ATTR_GEN_AI_INPUT_MESSAGES] = JSON.stringify([
  { role: "user", parts: [{ type: "text", content: params.prompt }] },
```
I wonder if it would be cleaner to have a util function that creates this JSON, since it will be used all over the place.
WDYT?
Not worth extracting further — it's only 2 places, and they're building arrays they may extend with image parts. A function that returns one message object would save almost nothing and add indirection.
```ts
 * @param messages - The messages array from the OpenAI chat completion request
 * @returns Array of OTel-shaped chat messages
 */
export function buildOpenAIInputMessages(messages: any[]): OTelChatMessage[] {
```
```ts
 * @param finishReasonMap - Mapping of OpenAI finish reasons to OTel standard values
 * @returns Array with a single OTelOutputMessage
 */
export function buildOpenAIOutputMessage(
```
This is also not in use
```ts
 * Maps a single OpenAI SDK content part to its OTel-compliant part object.
 * Used for input content parts in user messages.
 */
export function mapOpenAIContentBlock(block: any): object {
```
It's in use here:
https://github.com/dvirski/openllmetry-js/blob/feat-migrate-openai-semcov-1.40/packages/instrumentation-openai/src/message-helpers.ts#L77
And in more places in this file
Force-pushed from 915c2a8 to 6fcadd6 (Compare)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/instrumentation-openai/src/image-wrappers.ts (2)
355-403: ⚠️ Potential issue | 🟠 Major

`ATTR_GEN_AI_OUTPUT_MESSAGES` only captures the first generated image.

`setImageGenerationResponseAttributes()` is shared by generate/edit/variation, but this block always reads `response.data[0]` even though token accounting uses the full `response.data.length`. For `n > 1`, the span silently drops the rest of the model output, so downstream viewers cannot reconstruct the actual response payload.

Based on learnings, "Instrumentations must extract request/response data and token usage from wrapped calls."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/image-wrappers.ts` around lines 355 - 403, setImageGenerationResponseAttributes currently only reads response.data[0] and so ATTR_GEN_AI_OUTPUT_MESSAGES drops all but the first image; iterate response.data instead of using firstImage: for each item in response.data call uploadCallback (or fallback to item.url) and build the parts array with one entry per image ({type:"uri", modality:"image", uri}). Replace the single-part payload assignment with the assembled parts list and stringify that into ATTR_GEN_AI_OUTPUT_MESSAGES; reference the setImageGenerationResponseAttributes function, ATTR_GEN_AI_OUTPUT_MESSAGES, uploadCallback, and response.data when making the change.
321-325: ⚠️ Potential issue | 🟠 Major

`gpt-image-1` still uses the fallback token cost.

`calculateImageGenerationTokens()` documents `gpt-image-1` pricing above, but the helper only branches on DALL-E models. Because this result is now written directly into the new usage attrs, any `gpt-image-1` span reports `1056` output tokens regardless of size/quality, and total tokens inherit the same error.

Based on learnings, "Instrumentations must extract request/response data and token usage from wrapped calls."

Also applies to: 344-350
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/image-wrappers.ts` around lines 321 - 325, The image token calculation is using the DALL·E fallback for all image models, causing gpt-image-1 spans to always report the same output tokens; update the logic in calculateImageGenerationTokens (or immediately before it's used where completionTokens is assigned) to detect the model name 'gpt-image-1' and apply its correct pricing/token computation (based on response size/quality as documented) instead of the DALL·E branch, and ensure the computed value is written to ATTR_GEN_AI_USAGE_OUTPUT_TOKENS and that ATTR_GEN_AI_USAGE_TOTAL_TOKENS is updated accordingly; also apply the same fix to the other usage calculation site that mirrors this code path so both output and total token attributes reflect gpt-image-1 correctly.
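A sketch of the missing model branch; the token values below are placeholders only, not OpenAI's actual gpt-image-1 accounting, which must be taken from the pricing documentation referenced in the file:

```typescript
// Placeholder table; real values must come from OpenAI's documented
// gpt-image-1 output-token costs. Keys are `${quality}:${size}`.
const GPT_IMAGE_1_OUTPUT_TOKENS: Record<string, number> = {
  "low:1024x1024": 272, // hypothetical value
  "high:1024x1024": 4160, // hypothetical value
};

const DALLE_FALLBACK_TOKENS = 1056;

// Branch on gpt-image-1 instead of always returning the DALL-E fallback cost.
function calculateImageGenerationTokens(
  model: string,
  size: string,
  quality: string,
): number {
  if (model === "gpt-image-1") {
    return GPT_IMAGE_1_OUTPUT_TOKENS[`${quality}:${size}`] ?? DALLE_FALLBACK_TOKENS;
  }
  return DALLE_FALLBACK_TOKENS; // DALL-E models keep the existing fallback
}
```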
🧹 Nitpick comments (2)
packages/instrumentation-openai/test/instrumentation.test.ts (2)
385-388: Use the shared tool-definition attribute constant here.

The runtime side already relies on `ATTR_GEN_AI_TOOL_DEFINITIONS`; keeping the raw `"gen_ai.tool.definitions"` literal in these assertions makes the tests drift-prone for no gain.

♻️ Suggested change
```diff
 import {
   ATTR_GEN_AI_REQUEST_MODEL,
   ATTR_GEN_AI_PROVIDER_NAME,
   ATTR_GEN_AI_OPERATION_NAME,
   ATTR_GEN_AI_INPUT_MESSAGES,
   ATTR_GEN_AI_OUTPUT_MESSAGES,
   ATTR_GEN_AI_USAGE_INPUT_TOKENS,
   ATTR_GEN_AI_USAGE_OUTPUT_TOKENS,
   ATTR_GEN_AI_RESPONSE_FINISH_REASONS,
+  ATTR_GEN_AI_TOOL_DEFINITIONS,
 } from "@opentelemetry/semantic-conventions/incubating";
@@
-      const toolDefs = JSON.parse(
-        completionSpan.attributes["gen_ai.tool.definitions"] as string,
-      );
+      const toolDefs = JSON.parse(
+        completionSpan.attributes[ATTR_GEN_AI_TOOL_DEFINITIONS] as string,
+      );
```

Also applies to: 467-470
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 385 - 388, Replace the hard-coded attribute key string usage completionSpan.attributes["gen_ai.tool.definitions"] with the shared constant ATTR_GEN_AI_TOOL_DEFINITIONS (import it from its module if not already) in the test, and do the same for the second occurrence around the other assertion (lines noted in the review); update both assertions to read completionSpan.attributes[ATTR_GEN_AI_TOOL_DEFINITIONS] so the tests use the runtime-shared key constant.
649-656: The image edit/variation tests still don't guard the migrated operation attribute.

`image_generation` now validates `ATTR_GEN_AI_OPERATION_NAME`, but the edit test still checks the legacy `gen_ai.request.type` key, and the variation test doesn't assert an operation at all. Once these are re-enabled, they still won't catch regressions in the semconv migration.

♻️ Suggested change
```diff
       assert.strictEqual(
-        editSpan.attributes["gen_ai.request.type"],
+        editSpan.attributes[ATTR_GEN_AI_OPERATION_NAME],
         "image_edit",
       );
@@
+      assert.strictEqual(
+        variationSpan.attributes[ATTR_GEN_AI_OPERATION_NAME],
+        "image_variation",
+      );
```

Also applies to: 690-702
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 649 - 656, Update the image edit and image variation tests to assert the migrated semantic convention attribute ATTR_GEN_AI_OPERATION_NAME instead of (or in addition to) the legacy "gen_ai.request.type" key: for the edit test (referencing editSpan and ATTR_GEN_AI_PROVIDER_NAME) assert ATTR_GEN_AI_OPERATION_NAME === "image.edit" and for the variation test assert ATTR_GEN_AI_OPERATION_NAME === "image.variation" (also ensure the provider assertion still checks ATTR_GEN_AI_PROVIDER_NAME === "openai"); replace the old "gen_ai.request.type" checks at the edit and variation locations so the tests will catch regressions in the semconv migration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@packages/instrumentation-openai/src/image-wrappers.ts`:
- Around line 355-403: setImageGenerationResponseAttributes currently only reads
response.data[0] and so ATTR_GEN_AI_OUTPUT_MESSAGES drops all but the first
image; iterate response.data instead of using firstImage: for each item in
response.data call uploadCallback (or fallback to item.url) and build the parts
array with one entry per image ({type:"uri", modality:"image", uri}). Replace
the single-part payload assignment with the assembled parts list and stringify
that into ATTR_GEN_AI_OUTPUT_MESSAGES; reference the
setImageGenerationResponseAttributes function, ATTR_GEN_AI_OUTPUT_MESSAGES,
uploadCallback, and response.data when making the change.
- Around line 321-325: The image token calculation is using the DALL·E fallback
for all image models, causing gpt-image-1 spans to always report the same output
tokens; update the logic in calculateImageGenerationTokens (or immediately
before it's used where completionTokens is assigned) to detect the model name
'gpt-image-1' and apply its correct pricing/token computation (based on response
size/quality as documented) instead of the DALL·E branch, and ensure the
computed value is written to ATTR_GEN_AI_USAGE_OUTPUT_TOKENS and that
ATTR_GEN_AI_USAGE_TOTAL_TOKENS is updated accordingly; also apply the same fix
to the other usage calculation site that mirrors this code path so both output
and total token attributes reflect gpt-image-1 correctly.
---
Nitpick comments:
In `@packages/instrumentation-openai/test/instrumentation.test.ts`:
- Around line 385-388: Replace the hard-coded attribute key string usage
completionSpan.attributes["gen_ai.tool.definitions"] with the shared constant
ATTR_GEN_AI_TOOL_DEFINITIONS (import it from its module if not already) in the
test, and do the same for the second occurrence around the other assertion
(lines noted in the review); update both assertions to read
completionSpan.attributes[ATTR_GEN_AI_TOOL_DEFINITIONS] so the tests use the
runtime-shared key constant.
- Around line 649-656: Update the image edit and image variation tests to assert
the migrated semantic convention attribute ATTR_GEN_AI_OPERATION_NAME instead of
(or in addition to) the legacy "gen_ai.request.type" key: for the edit test
(referencing editSpan and ATTR_GEN_AI_PROVIDER_NAME) assert
ATTR_GEN_AI_OPERATION_NAME === "image.edit" and for the variation test assert
ATTR_GEN_AI_OPERATION_NAME === "image.variation" (also ensure the provider
assertion still checks ATTR_GEN_AI_PROVIDER_NAME === "openai"); replace the old
"gen_ai.request.type" checks at the edit and variation locations so the tests
will catch regressions in the semconv migration.
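The multi-image fix flagged for `setImageGenerationResponseAttributes` above amounts to building one output part per item in `response.data`; a simplified sketch (the `ImageDatum` shape is an assumption, and the real code would route each image through `uploadCallback` first):

```typescript
interface ImageDatum {
  url?: string;
  b64_json?: string;
}

type ImagePart = { type: "uri"; modality: "image"; uri: string };

// Build one OTel output part per generated image instead of only data[0].
function buildImageOutputParts(data: ImageDatum[]): ImagePart[] {
  return data
    .filter((item) => item.url)
    .map((item) => ({ type: "uri" as const, modality: "image" as const, uri: item.url! }));
}

// Serialize a single assistant message carrying all generated images.
function buildOutputMessagesAttr(data: ImageDatum[]): string {
  return JSON.stringify([{ role: "assistant", parts: buildImageOutputParts(data) }]);
}
```

With `n > 1` requests, every returned image now survives into the span attribute rather than only the first.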
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 52d3fe45-dc87-4fa7-90bc-93ce84db9aa0
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (9)
- packages/instrumentation-openai/package.json
- packages/instrumentation-openai/src/image-wrappers.ts
- packages/instrumentation-openai/src/instrumentation.ts
- packages/instrumentation-openai/src/message-helpers.ts
- packages/instrumentation-openai/test/instrumentation.test.ts
- packages/instrumentation-utils/src/content-block-mappers.ts
- packages/instrumentation-utils/src/index.ts
- packages/sample-app/package.json
- packages/traceloop-sdk/package.json
✅ Files skipped from review due to trivial changes (3)
- packages/traceloop-sdk/package.json
- packages/instrumentation-openai/package.json
- packages/instrumentation-openai/src/message-helpers.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- packages/instrumentation-utils/src/index.ts
- packages/sample-app/package.json
- packages/instrumentation-utils/src/content-block-mappers.ts
- packages/instrumentation-openai/src/instrumentation.ts
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/instrumentation-openai/test/instrumentation.test.ts (1)
608-612: ⚠️ Potential issue | 🟠 Major

Add constants for image attributes to SemanticAttributes and use them throughout.
Four image attribute keys are hardcoded as strings in both the implementation (`image-wrappers.ts`) and the test file:

- `gen_ai.request.image.size`
- `gen_ai.request.image.count`
- `gen_ai.request.image.quality`
- `gen_ai.request.image.style`

Per coding guidelines, these should be defined as constants in `packages/ai-semantic-conventions/src/SemanticAttributes.ts` and imported in `packages/instrumentation-openai/src/image-wrappers.ts` and the test file rather than hardcoded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 608 - 612, Add four new constants in the SemanticAttributes export (e.g., IMAGE_SIZE, IMAGE_COUNT, IMAGE_QUALITY, IMAGE_STYLE) with values "gen_ai.request.image.size", "gen_ai.request.image.count", "gen_ai.request.image.quality", and "gen_ai.request.image.style"; then replace the hardcoded string keys in the image wrappers module (image-wrappers.ts) and the test file (instrumentation.test.ts) to import and use these new SemanticAttributes constants instead of literal strings so all references share the canonical identifiers.
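For illustration, the four constants requested above could be added roughly as follows; the export style is an assumption and should be matched to the existing conventions in SemanticAttributes.ts:

```typescript
// Hypothetical additions to packages/ai-semantic-conventions/src/SemanticAttributes.ts,
// using the names suggested in the review comment.
export const IMAGE_SIZE = "gen_ai.request.image.size";
export const IMAGE_COUNT = "gen_ai.request.image.count";
export const IMAGE_QUALITY = "gen_ai.request.image.quality";
export const IMAGE_STYLE = "gen_ai.request.image.style";
```

Call sites in image-wrappers.ts and the tests would then reference these constants instead of the string literals.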
🧹 Nitpick comments (2)
packages/instrumentation-openai/test/instrumentation.test.ts (2)
385-393: Hardcoded attribute string violates coding guidelines.

`"gen_ai.tool.definitions"` should use a constant from `@traceloop/ai-semantic-conventions` or `@opentelemetry/semantic-conventions/incubating` rather than a hardcoded string. The same issue appears at line 469.

♻️ Suggested fix
Import and use the constant:
```diff
 import {
   ATTR_GEN_AI_REQUEST_MODEL,
   ATTR_GEN_AI_PROVIDER_NAME,
   ATTR_GEN_AI_OPERATION_NAME,
   ATTR_GEN_AI_INPUT_MESSAGES,
   ATTR_GEN_AI_OUTPUT_MESSAGES,
   ATTR_GEN_AI_USAGE_INPUT_TOKENS,
   ATTR_GEN_AI_USAGE_OUTPUT_TOKENS,
   ATTR_GEN_AI_RESPONSE_FINISH_REASONS,
+  ATTR_GEN_AI_TOOL_DEFINITIONS,
 } from "@opentelemetry/semantic-conventions/incubating";
```

Then replace:

```diff
-      const toolDefs = JSON.parse(
-        completionSpan.attributes["gen_ai.tool.definitions"] as string,
-      );
+      const toolDefs = JSON.parse(
+        completionSpan.attributes[ATTR_GEN_AI_TOOL_DEFINITIONS] as string,
+      );
```

As per coding guidelines: "Import AI/LLM semantic attribute constants from `@traceloop/ai-semantic-conventions` rather than hardcoding strings".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 385 - 393, The test currently uses a hardcoded attribute key completionSpan.attributes["gen_ai.tool.definitions"]; update the test to import the AI semantic attribute constant (e.g., GEN_AI_TOOL_DEFINITIONS) from `@traceloop/ai-semantic-conventions` (or the appropriate `@opentelemetry` incubating constant) and replace the hardcoded string usages with that constant in the assertions for completionSpan.attributes; also update the other occurrence noted in the file (the similar use around line ~469) to use the same imported constant so both tests follow the coding guideline.
638-643: Type assertion bypasses type safety.

Using `(openai as any)` hides potential API mismatches. Since these tests are skipped, consider adding a comment explaining why the type assertion is needed (e.g., API method deprecated, type definitions outdated) or removing the skipped tests if they're no longer relevant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 638 - 643, The test uses a type-escaping cast "(openai as any)" when calling images.edit with mockImageFile which hides API/type mismatches; either remove the skipped test if it's obsolete or make the intent explicit by adding a one-line comment above the call explaining why the assertion is necessary (e.g., deprecated API, outdated types) and, if you want safer typing, replace the broad any with a targeted cast to the expected client interface (reference the openai variable and the images.edit invocation and mockImageFile) so the reason and scope of the bypass are clear.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/instrumentation-openai/test/instrumentation.test.ts`:
- Around line 188-192: Tests assert token attributes as strings but
instrumentation sets them as numbers; update the assertions to expect numeric
values. In the test file change assertions referencing
completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS] and
ATTR_GEN_AI_USAGE_OUTPUT_TOKENS (and the similar string-quoted checks) to
compare numeric values (e.g., remove quotes or coerce to Number) so they match
how enrichTokens/tokenCountFromString and
result.usage?.prompt_tokens/completion_tokens set attributes as numbers.
---
Outside diff comments:
In `@packages/instrumentation-openai/test/instrumentation.test.ts`:
- Around line 608-612: Add four new constants in the SemanticAttributes export
(e.g., IMAGE_SIZE, IMAGE_COUNT, IMAGE_QUALITY, IMAGE_STYLE) with values
"gen_ai.request.image.size", "gen_ai.request.image.count",
"gen_ai.request.image.quality", and "gen_ai.request.image.style"; then replace
the hardcoded string keys in the image wrappers module (image-wrappers.ts) and
the test file (instrumentation.test.ts) to import and use these new
SemanticAttributes constants instead of literal strings so all references share
the canonical identifiers.
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0796ca96-1e4c-4b39-bcf7-c1f8f93e2bc0
📒 Files selected for processing (2)
packages/instrumentation-openai/src/image-wrappers.ts
packages/instrumentation-openai/test/instrumentation.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/instrumentation-openai/src/image-wrappers.ts
```diff
 assert.equal(
-  completionSpan.attributes[`${ATTR_GEN_AI_USAGE_PROMPT_TOKENS}`],
-  "8",
-);
-assert.ok(
-  +completionSpan.attributes[`${ATTR_GEN_AI_USAGE_COMPLETION_TOKENS}`]! > 0,
+  completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS],
+  "15",
 );
+assert.ok(+completionSpan.attributes[ATTR_GEN_AI_USAGE_OUTPUT_TOKENS]! > 0);
```
Token value assertions use inconsistent types; tests expect strings but instrumentation sets numbers.
Lines 189-190 and 236-237 assert token values as strings ("15" and "8"), while lines 416 and 492 assert them as numbers (82). The instrumentation code at lines 701-703 consistently sets these attributes directly from result.usage?.prompt_tokens and result.usage?.completion_tokens, which are numbers. When enrichTokens is enabled (as configured in all tests), tokens are computed numerically via tokenCountFromString() and stored as numbers (lines 543-547). The string assertions should be changed to expect numbers.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/instrumentation-openai/src/image-wrappers.ts (2)
355-404: ⚠️ Potential issue | 🟠 Major — Export all generated images, not just `response.data[0]`. When `n > 1`, this serializes a single assistant image even though the API returned multiple results. The span then loses most of the response payload while `gen_ai.request.image.count` and token usage still reflect all images. Based on learnings, "Instrumentations must extract request/response data and token usage from wrapped calls".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/image-wrappers.ts` around lines 355 - 404, The code only serializes response.data[0]; change it to iterate over all entries in response.data, processing each item's b64_json or url similarly to the current logic (use span.spanContext().traceId/spanId and uploadCallback when provided, fetch+base64 when only url is present, fallback to the original url on error), collect a URI for each image, and then set ATTR_GEN_AI_OUTPUT_MESSAGES to a JSON string containing a single assistant message whose parts array contains one { type: "uri", modality: "image", uri: ... } entry per image; ensure per-image errors are caught and do not abort processing of remaining images.
322-326: ⚠️ Potential issue | 🟠 Major — `gpt-image-1` usage tokens still come out of the generic fallback. `calculateImageGenerationTokens()` documents `gpt-image-1` pricing, but this path still publishes the fallback `1056` value for any non-DALL·E model. That makes both `ATTR_GEN_AI_USAGE_OUTPUT_TOKENS` and `SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS` inaccurate for `gpt-image-1` requests.

Possible fix:

```diff
-  } else {
-    // Default fallback for unknown models
-    tokensPerImage = 1056;
+  } else if (model === "gpt-image-1") {
+    const gptImageCosts: Record<string, Record<string, number>> = {
+      low: {
+        "1024x1024": 272,
+        "1024x1536": 408,
+        "1536x1024": 400,
+      },
+      medium: {
+        "1024x1024": 1056,
+        "1024x1536": 1584,
+        "1536x1024": 1568,
+      },
+      high: {
+        "1024x1024": 4160,
+        "1024x1536": 6240,
+        "1536x1024": 6208,
+      },
+    };
+    tokensPerImage =
+      gptImageCosts[quality]?.[size] ?? gptImageCosts.medium["1024x1024"];
+  } else {
+    // Default fallback for unknown models
+    tokensPerImage = 1056;
```

Based on learnings, "Instrumentations must extract request/response data and token usage from wrapped calls".
Also applies to: 341-351
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/src/image-wrappers.ts` around lines 322 - 326, The image token count path currently falls back to the generic 1056 for non-DALL·E models; update the two spots that set ATTR_GEN_AI_USAGE_OUTPUT_TOKENS (the block using calculateImageGenerationTokens and the similar 341-351 block) to detect and pass the actual image model identifier (e.g., params.model or params.modelName) into calculateImageGenerationTokens so it can apply the gpt-image-1 pricing branch, and ensure you compute the response byte/length from response.data (or sum of response.data entries) for gpt-image-1 instead of using the generic fallback so SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS reflects the real token usage.
♻️ Duplicate comments (1)
packages/instrumentation-openai/test/instrumentation.test.ts (1)
188-192: ⚠️ Potential issue | 🟡 Minor — Tighten these token checks to numeric `strictEqual`. These assertions still use string literals with `assert.equal`, so a numeric attribute passes via coercion. The rest of the file already treats token attrs as numbers, so this won't catch a type regression.

Suggested change:

```diff
-  assert.equal(
-    completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS],
-    "15",
-  );
+  assert.strictEqual(
+    completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS],
+    15,
+  );
@@
-  assert.equal(
-    completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS],
-    "8",
-  );
+  assert.strictEqual(
+    completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS],
+    8,
+  );
```

Also applies to: 235-239
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 188 - 192, The assertions use string coercion and non-strict equality; convert token attributes to numbers and use strict assertions: replace checks on completionSpan.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS] with numeric comparisons via Number(...) (or unary +) and assert.strictEqual to 15, and replace the output token presence check on completionSpan.attributes[ATTR_GEN_AI_USAGE_OUTPUT_TOKENS] with a strict numeric assertion (e.g. assert.strictEqual(Number(... ) > 0, true) or assert.ok(Number(... ) > 0) if you prefer truthiness). Do the same fixes for the duplicated assertions later in the file (the other occurrences around the second block at lines ~235-239) referencing the same ATTR_GEN_AI_USAGE_INPUT_TOKENS and ATTR_GEN_AI_USAGE_OUTPUT_TOKENS symbols.
🧹 Nitpick comments (1)
packages/instrumentation-openai/test/instrumentation.test.ts (1)
629-667: The migrated image edit/variation paths still have no active regression coverage. Both cases stay `it.skip(...)`, and the skipped edit case still checks the legacy `gen_ai.request.type` field instead of the new operation attr. Right now the image-generation path is covered, but the two other image endpoints changed in this PR can still regress silently.

Also applies to: 669-706
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-openai/test/instrumentation.test.ts` around lines 629 - 667, Un-skip the image edit and image variation tests (remove it.skip) so they run, and update their assertion that checks the legacy "gen_ai.request.type" to assert the new operation attribute instead (use the new operation attribute symbol/name used in your codebase, e.g. ATTR_GEN_AI_OPERATION or the string key "gen_ai.operation") while keeping the existing checks for ATTR_GEN_AI_PROVIDER_NAME, ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS and ATTR_GEN_AI_OUTPUT_MESSAGES; ensure you locate the assertions in instrumentation.test.ts that reference it.skip, memoryExporter.getFinishedSpans(), the editSpan variable, and the legacy "gen_ai.request.type" to perform the replacement.
📒 Files selected for processing (2)
packages/instrumentation-openai/src/image-wrappers.tspackages/instrumentation-openai/test/instrumentation.test.ts
…tercept the new version http requests so tests fail
🧹 Nitpick comments (3)
packages/traceloop-sdk/test/agent_decorator.test.ts (2)
281-289: Consider using the imported constant for consistency. The comment helpfully explains that manual instrumentation uses the legacy indexed format. However, the hardcoded strings `gen_ai.prompt.0.role` and `gen_ai.prompt.0.content` could use the imported `ATTR_GEN_AI_PROMPT` constant for consistency:

```diff
-  assert.strictEqual(
-    completionSpan.attributes[`gen_ai.prompt.0.role`],
+  assert.strictEqual(
+    completionSpan.attributes[`${ATTR_GEN_AI_PROMPT}.0.role`],
```

This would require importing `ATTR_GEN_AI_PROMPT` (already imported in the other test file). As per coding guidelines, AI/LLM semantic attribute constants should be imported from the conventions package rather than hardcoding strings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/traceloop-sdk/test/agent_decorator.test.ts` around lines 281 - 289, Replace the hardcoded legacy attribute keys in the assertions with the imported ATTR_GEN_AI_PROMPT constant: import ATTR_GEN_AI_PROMPT at the top of the test file and use string concatenation (e.g., ATTR_GEN_AI_PROMPT + ".0.role" and ATTR_GEN_AI_PROMPT + ".0.content") when accessing completionSpan.attributes in the assertions so the test uses the canonical constant instead of raw strings.
110-110: Prefix matching is appropriate but could be more specific. The change from exact `"openai.chat"` to `startsWith("chat ")` aligns with the new span naming convention that includes the model (e.g., `"chat gpt-4o-mini"`). This is correct for the migration. Consider whether a more specific pattern, such as matching against a regex `^chat .+` or checking for additional attributes, would reduce false positive matches in more complex test scenarios.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/traceloop-sdk/test/agent_decorator.test.ts` at line 110, The test is using a broad prefix match for span names (spans.find(... startsWith("chat "))) which may yield false positives; update the matcher to be more specific by using a regex match like /^chat .+/ on span.name (e.g., replace the startsWith check in the spans.find callback) and/or assert an additional distinguishing attribute on the found span (e.g., check span.attributes.model or span.attributes["telemetry.span_type"]) to ensure the correct "chat <model>" span is selected in agent_decorator.test.ts.

packages/traceloop-sdk/test/decorators.test.ts (1)
571-580: Functional but fragile string search in JSON. The approach of using `.includes()` on the raw JSON string works for these tests with unique content, but is somewhat fragile. A more robust approach would parse the JSON first:

```ts
const openAI1Span = spans.find((span) => {
  const input = span.attributes[ATTR_GEN_AI_INPUT_MESSAGES] as string;
  if (!input) return false;
  return JSON.parse(input).some((msg: any) =>
    msg.parts?.some((p: any) => p.content?.includes("OpenTelemetry")),
  );
});
```

However, for test code with known unique content strings, the current approach is acceptable and more concise.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/traceloop-sdk/test/decorators.test.ts` around lines 571 - 580, The test currently looks for substrings inside the stored JSON by using spans.find with .includes on the ATTR_GEN_AI_INPUT_MESSAGES string which is fragile; change the predicate to parse the JSON string from span.attributes[ATTR_GEN_AI_INPUT_MESSAGES] (guarding for falsy), iterate the array of messages and their parts, and check message.parts[].content for the expected keyword (e.g., "OpenTelemetry" or "Typescript") so the test asserts against parsed structure rather than raw string matching.
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (3)
packages/traceloop-sdk/test/agent_decorator.test.ts
packages/traceloop-sdk/test/decorators.test.ts
packages/traceloop-sdk/test/test-setup.ts
✅ Files skipped from review due to trivial changes (1)
- packages/traceloop-sdk/test/test-setup.ts
Summary by CodeRabbit
Release Notes
New Features
Improvements
Dependencies