
feat(langchain): implement callback-based instrumentation with auto-injection#649

Merged
nirga merged 17 commits into main from feat/langchain-callback-instrumentation
Aug 18, 2025
Conversation

@nirga
Member

@nirga nirga commented Aug 17, 2025

Summary

  • Implement callback-based LangChain instrumentation using LangChain's callback system instead of method patching
  • Add automatic callback injection via CallbackManager patching for seamless auto-instrumentation
  • Improve Bedrock instrumentation to support the messages API format used by LangChain
  • Fix token usage tracking, model extraction, and vendor detection for BedrockChat

Details

This PR switches the LangChain instrumentation approach from method patching to using LangChain's native callback system, matching the Python implementation approach. Key changes:

  • New callback handler: TraceloopCallbackHandler extends BaseCallbackHandler to create both task and LLM spans
  • Automatic injection: Patches CallbackManager._configureSync to automatically inject our callback handler
  • Better Bedrock support: Enhanced Bedrock instrumentation handles both legacy prompt format and new messages API format
  • Improved metrics: Proper token usage tracking, model name extraction from Bedrock IDs, and vendor detection
  • Span naming: Follows Python convention with bedrock.chat span names and AWS vendor attribution
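
The auto-injection approach described above can be sketched as follows. This is an illustrative reduction, not the actual Traceloop source: `FakeCallbackManager`, `Handler`, and the handler names are stand-ins, and the real patch wraps `_configureSync` on `@langchain/core`'s `CallbackManager`.

```typescript
// Illustrative reduction of the auto-injection pattern (not the actual
// Traceloop source). FakeCallbackManager stands in for @langchain/core's
// CallbackManager; the real patch wraps its _configureSync static method.
type Handler = { name: string };

class FakeCallbackManager {
  handlers: Handler[] = [];
  addHandler(h: Handler) {
    this.handlers.push(h);
  }
  // Stand-in for CallbackManager._configureSync
  static _configureSync(inherited?: Handler[]): FakeCallbackManager {
    const manager = new FakeCallbackManager();
    (inherited ?? []).forEach((h) => manager.addHandler(h));
    return manager;
  }
}

const traceloopHandler: Handler = { name: "TraceloopCallbackHandler" };

function patchConfigureSync(cls: typeof FakeCallbackManager) {
  const original = cls._configureSync.bind(cls);
  cls._configureSync = (inherited?: Handler[]) => {
    const manager = original(inherited);
    // Guard against double-injection: _configureSync runs on every invocation
    if (!manager.handlers.some((h) => h.name === traceloopHandler.name)) {
      manager.addHandler(traceloopHandler);
    }
    return manager;
  };
}

patchConfigureSync(FakeCallbackManager);
const manager = FakeCallbackManager._configureSync([{ name: "user-handler" }]);
// manager.handlers now contains both the user handler and the tracing handler
console.log(manager.handlers.map((h) => h.name));
```

The double-injection guard matters because the factory runs once per LangChain run, not once per process.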

Test plan

  • All existing tests pass
  • New BedrockChat test added and passing
  • Manual testing in sample app confirms spans are created automatically
  • Token usage, model names, and vendor detection working correctly
  • Both streaming and non-streaming Bedrock calls instrumented properly

🤖 Generated with Claude Code


Important

Implements callback-based LangChain instrumentation with auto-injection and enhances Bedrock support for the messages API format.

  • LangChain Instrumentation:
    • Implements TraceloopCallbackHandler extending BaseCallbackHandler for task and LLM spans in callback_handler.ts.
    • Patches CallbackManager._configureSync for automatic callback injection in instrumentation.ts.
  • Bedrock Instrumentation:
    • Enhances support for messages API format in instrumentation.ts.
    • Fixes token usage tracking, model extraction, and vendor detection.
  • Testing:
    • Adds tests for BedrockChat and LangChain instrumentation in instrumentation.test.ts.
  • Miscellaneous:
    • Updates dependencies in package.json and sample-app/package.json.
    • Modifies sample app scripts in sample_langchain_bedrock.ts and sample_structured_output.ts.

This description was created by Ellipsis for 6af4c33. You can customize this summary. It will automatically update as commits are pushed.


Summary by CodeRabbit

  • New Features

    • Automatic LangChain tracing via a new callback-based handler and runtime patching of LangChain’s callback manager; Bedrock/Anthropic/Cohere now support message/content array formats.
  • Bug Fixes

    • Safer prompt/content handling with explicit early returns; improved finish-reason/content extraction and Cohere token-usage reporting when available.
  • Chores

    • Dependency upgrades (LangChain, OpenAI, LangChain AWS/community), simplified LangChain toggle in SDK, new sample-app script for LangChain+Bedrock.
  • Tests

    • Added BedrockChat test; updated, added, and removed HAR recording fixtures.

…njection

- Switch from method patching to LangChain callback handler approach
- Implement automatic callback injection via CallbackManager patching
- Add proper token usage tracking for Bedrock models
- Improve model name extraction and vendor detection
- Update Bedrock instrumentation to handle messages API format
- All tests passing with correct span naming (bedrock.chat)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai bot commented Aug 17, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Caution

Review failed

The pull request is closed.

Walkthrough

Runtime-patches LangChain CallbackManager to inject TraceloopCallbackHandler; extends Bedrock instrumentation to handle message/content arrays, safer prompt handling, Cohere token-usage fields, and Anthropic message formats; updates tests, HAR fixtures, sample apps, and simplifies LangChain init options to a boolean.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Bedrock instrumentation**<br>`packages/instrumentation-bedrock/src/instrumentation.ts` | Parse `requestBody.messages` and the legacy prompt; prefer `max_tokens_to_sample` |
| **LangChain callback handler & runtime patching**<br>`packages/instrumentation-langchain/src/callback_handler.ts`, `packages/instrumentation-langchain/src/instrumentation.ts`, `packages/instrumentation-langchain/src/index.ts` | Add exported TraceloopCallbackHandler (OTel spans, attributes, traceContent control); replace multi-module patching with a runtime patch of `@langchain/core/callbacks/manager` `_configureSync` to inject the handler; simplify the manuallyInstrument signature; re-export the handler. |
| **LangChain tests & recordings**<br>`packages/instrumentation-langchain/test/instrumentation.test.ts`, `packages/instrumentation-langchain/recordings/.../*.har` | Replace per-module manual instrumentation with BedrockInstrumentation in tests; add a BedrockChat-with-tools test; update AWS env handling in tests; add/update multiple HAR fixtures (new Bedrock Claude HAR, refreshed Wikipedia/OpenAI snapshots); remove two legacy HARs. |
| **LangChain package metadata**<br>`packages/instrumentation-langchain/package.json` | Bump `@langchain/core` and related packages; add devDeps for `@traceloop/instrumentation-bedrock` and `@types/node`. |
| **Sample apps**<br>`packages/sample-app/package.json`, `packages/sample-app/src/sample_langchain_bedrock.ts`, `packages/sample-app/src/sample_llama_index_openai_agent.ts`, `packages/sample-app/src/sample_structured_output.ts` | Add a run:langchain_bedrock script and workspace instrumentation dep; bump LangChain/OpenAI deps; switch the sample to BedrockChat + HumanMessage and message-array invocation; simplify LLamaOpenAI usage to apiKey; update the OpenAI chat call path and response handling. |
| **SDK interfaces & tracing init**<br>`packages/traceloop-sdk/src/lib/interfaces/initialize-options.interface.ts`, `packages/traceloop-sdk/src/lib/tracing/index.ts` | Replace the LangChain per-module typed config with a boolean flag; always instantiate LangChainInstrumentation during init and remove per-module manual wiring. |
| **Misc recordings & HARs**<br>`packages/instrumentation-langchain/recordings/.../*.har` | Add a new Bedrock Claude HAR, refresh multiple Wikipedia/OpenAI snapshots (timestamps, headers, content), and remove two older HAR fixtures. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant App
  participant LC as LangChain CallbackManager
  participant TL as TraceloopCallbackHandler
  participant OTel as OpenTelemetry Tracer

  App->>LC: start run (LLM / Chain / Tool)
  Note over LC: CallbackManager is runtime-patched to inject TL
  LC->>TL: onStart(run)
  TL->>OTel: start span(s) with attributes (model, prompts, inputs)
  LC->>App: execute
  LC->>TL: onEnd(run, outputs, usage)
  TL->>OTel: set attributes (completions, usage), end span(s)
```
```mermaid
sequenceDiagram
  participant App
  participant SDK as Bedrock SDK
  participant Inst as Bedrock Instrumentation
  participant Cloud as Bedrock Service

  App->>SDK: invoke(messages|prompt, params)
  SDK->>Inst: intercept request
  Inst->>Inst: extract prompts, max_tokens, decide traceContent
  SDK->>Cloud: HTTP POST
  Cloud-->>SDK: response (content/finish_reason/usage)
  SDK->>Inst: intercept response
  Inst->>Inst: extract content, finish_reason, usage attributes
  SDK-->>App: return result
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • galkleinman
  • doronkopit5

Poem

A rabbit hops through spans and thread,
I patch callback managers where traces spread.
Prompts parsed, tokens counted too,
Spans record models, content, and who.
Hop, trace, repeat — a tracing stew! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c0c6b94 and 54c66fc.

📒 Files selected for processing (1)
  • packages/instrumentation-langchain/src/callback_handler.ts (1 hunks)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to ed79220 in 2 minutes and 11 seconds. Click for details.
  • Reviewed 4233 lines of code in 18 files
  • Skipped 1 file when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-langchain/src/instrumentation.ts:65
  • Draft comment:
    Good job patching the CallbackManager by extending _configureSync. Consider adding some minimal logging in the else clause (line 59-60) or in the catch block of instrumentCallbackManagerDirectly for better observability when the CallbackManager isn’t found.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
2. packages/instrumentation-langchain/src/instrumentation.ts:105
  • Draft comment:
    The _shouldSendPrompts function correctly converts context value to a boolean. This explicit conversion is a good practice to avoid type issues.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
3. packages/traceloop-sdk/src/lib/tracing/index.ts:313
  • Draft comment:
    Resource fallback handling is robust; however, it might be beneficial to log a warning (or propagate the error) if creating the resource fails, rather than silently falling back.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
4. packages/traceloop-sdk/src/lib/tracing/index.ts:250
  • Draft comment:
    In manuallyInitInstrumentations, the code conditionally creates instrumentations based on provided modules. The explicit enabling of LangChain instrumentation as default is clear and consistent.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
5. packages/instrumentation-bedrock/src/instrumentation.ts:369
  • Draft comment:
    Typographical issue: In the comment "// The format is removing when we are setting span attribute", the word 'removing' seems awkward. Consider rewording it, e.g., "// The format is adjusted when setting the span attribute" or "// The format is removed when setting the span attribute".
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment does point out a real grammatical issue, it's a very minor change to documentation that doesn't affect functionality. The existing comment, while not perfectly worded, is still understandable. The rules state we should not make purely informative comments or obvious/unimportant changes. The grammar issue could make the code slightly harder to maintain if other developers are confused by the awkward wording. Documentation quality does matter. While documentation quality matters, this is an extremely minor grammatical issue in an internal comment that doesn't significantly impact code understanding. The meaning is still clear enough. Delete this comment as it suggests a trivial documentation change that doesn't meaningfully improve code quality or maintainability.
6. packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-tools-instrumentation_252498424/recording.har:208
  • Draft comment:
    Typographical issue: In the JSON string of the 'text' field, there is a double space after 'retrieval-augmented generation,' (i.e., "retrieval-augmented generation, document summarization"). Please remove the extra space.
  • Reason this comment was not posted:
    Comment was on unchanged code.
7. packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-tools-instrumentation_252498424/recording.har:366
  • Draft comment:
    Typographical note: In the JSON text content, there is a double space before "etc." ("object detection, etc."). You might want to correct it to use a single space.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and focuses on a typographical issue that doesn't affect the functionality or logic of the code. It doesn't provide a suggestion for code improvement or address a potential issue with the code's behavior.
8. packages/sample-app/src/sample_structured_output.ts:21
  • Draft comment:
    Typographical error: The model field is set to "gpt-4o", which might be a typo. Please confirm if it should be "gpt-4".
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_b3krkOrJvA7hyh0J

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

- Add eslint disable comments for required any types
- Prefix unused parameters with underscore
- Fix syntax errors in try-catch blocks
- Address TypeScript warnings

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 7759fb4 in 1 minute and 2 seconds. Click for details.
  • Reviewed 203 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-langchain/src/callback_handler.ts:107
  • Draft comment:
    Renaming optional parameters to use an underscore prefix improves clarity for unused values. Ensure that these changes align with the base callback interface and that no necessary data is inadvertently discarded.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/instrumentation-langchain/src/callback_handler.ts:218
  • Draft comment:
    Renaming the index parameter to '_idx' in the map callback clearly indicates it is unused. This enhances readability.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/instrumentation-langchain/src/instrumentation.ts:60
  • Draft comment:
    Adding diagnostic logging in the catch block for instrumenting the CallbackManager is a good practice for troubleshooting instrumentation issues.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
4. packages/instrumentation-langchain/src/instrumentation.ts:97
  • Draft comment:
    The removal of the empty else block when the CallbackManager is already patched cleans up the code. Consider adding a debug log here for improved traceability if double-patching is encountered.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
5. packages/instrumentation-langchain/test/instrumentation.test.ts:371
  • Draft comment:
    Changing the tool function parameter from '_args' to 'args' improves clarity since the argument is actively used in the function.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_kpQ9J2aIe7KgvkGg

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

- Update agent test to use AgentExecutor.workflow span name
- Fix LCEL test to handle nested JSON structure in tool input
- All tests now passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 50fcc66 in 1 minute and 8 seconds. Click for details.
  • Reviewed 50 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-langchain/src/callback_handler.ts:68
  • Draft comment:
    Use _runName (parameter) instead of runName for 'traceloop.workflow.name' to align with the method signature.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. packages/instrumentation-langchain/test/instrumentation.test.ts:154
  • Draft comment:
    Updated agent span name to 'AgentExecutor.workflow' per new naming conventions.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, as it only states what was done without providing any suggestions, questions, or issues to address. It does not align with the rules for useful comments.
3. packages/instrumentation-langchain/test/instrumentation.test.ts:160
  • Draft comment:
    Adjusted input extraction by removing '.args[0]' to match updated span entity structure.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, explaining what was changed without providing any actionable feedback or suggestions. It doesn't ask for confirmation or suggest improvements.
4. packages/instrumentation-langchain/test/instrumentation.test.ts:351
  • Draft comment:
    Refined test to parse nested JSON; now verifies the 'input' property from the tool input.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment seems to be purely informative, describing what the test now does. It doesn't provide a suggestion, ask for confirmation, or point out a potential issue.

Workflow ID: wflow_2g2LyZT4VfjQgCQt

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🔭 Outside diff range comments (6)
packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-correct-attributes-in-span-for-LCEL_182789169/recording.har (1)

328-330: PII leakage: x-client-ip header exposes a real IP. Mask it.

This should not be committed as-is. Replace with a neutral placeholder.

Apply:

             {
               "name": "x-client-ip",
-              "value": "31.154.188.106"
+              "value": "0.0.0.0"
             },

I can add a Polly redaction/sanitizer to strip or mask this header at record-time so future HARs don’t include it. Want me to open an issue and a PR?

packages/sample-app/package.json (1)

42-44: Bump Node.js engine in sample-app to ≥18

The sample app’s current "node": ">=14" is too permissive given its dependencies on OpenAI v5 and LangChain v0.3.x, both of which require Node.js 18+ for full support and to avoid runtime failures.

• File to update:

  • packages/sample-app/package.json (engines.node)

Suggested change:

   "engines": {
-    "node": ">=14"
+    "node": ">=18"
   },
packages/instrumentation-bedrock/src/instrumentation.ts (2)

209-281: Incorrect detection of streaming vs non-streaming responses (can break parsing).

result.body is checked against Object.getPrototypeOf(Uint8Array), which doesn’t detect a Uint8Array instance. This causes non-streaming bodies to be treated as async streams, leading to runtime errors and missing attributes.

Fix by reliably detecting async iterables and Uint8Array bodies.

Apply this diff:

-          if (!(result.body instanceof Object.getPrototypeOf(Uint8Array))) {
-            const rawRes = result.body as AsyncIterable<bedrock.ResponseStream>;
+          // Distinguish streaming vs non-streaming responses
+          const isAsyncIterable =
+            result.body &&
+            typeof (result.body as any)[Symbol.asyncIterator] === "function";
+          if (isAsyncIterable) {
+            const rawRes = result.body as AsyncIterable<bedrock.ResponseStream>;
@@
-          } else if (result.body instanceof Object.getPrototypeOf(Uint8Array)) {
+          } else if (result.body instanceof Uint8Array) {
             // Convert it to a JSON String
             const jsonString = new TextDecoder().decode(
               result.body as Uint8Array,
             );
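
The detection logic this diff proposes can be exercised in isolation. A minimal sketch (`isAsyncIterable` and `fakeStream` are invented for the demo): a streaming body exposes `Symbol.asyncIterator`, while a buffered body is a plain `Uint8Array`.

```typescript
// Standalone illustration of the streaming-vs-buffered check suggested
// above: an async-iterable body exposes Symbol.asyncIterator, while a
// buffered body is a plain Uint8Array. fakeStream is invented for the demo.
function isAsyncIterable(body: unknown): boolean {
  return (
    body != null &&
    typeof (body as AsyncIterable<unknown>)[Symbol.asyncIterator] === "function"
  );
}

async function* fakeStream(): AsyncIterable<Uint8Array> {
  yield new TextEncoder().encode('{"completion": "hi"}');
}

const streamingBody = fakeStream();
const bufferedBody = new TextEncoder().encode('{"completion": "hi"}');

console.log(isAsyncIterable(streamingBody)); // true
console.log(isAsyncIterable(bufferedBody)); // false
console.log(bufferedBody instanceof Uint8Array); // true
```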

244-266: Anthropic streaming chunks aren’t handled; content becomes “undefined” and concatenation is wrong.

Bedrock’s Anthropic SSE uses events like content_block_delta with delta.text. The current logic calls _setResponseAttributes(), which returns no content for these events, and then blindly appends undefined to the accumulator.

Patch to extract delta.text for Anthropic streaming and guard against undefined appends.

Apply this diff:

               let responseAttributes = this._setResponseAttributes(
                 modelVendor,
                 parsedResponse,
                 true,
               );
 
+              // Handle Anthropic streaming events (Bedrock): accumulate delta.text
+              if (
+                modelVendor === "anthropic" &&
+                parsedResponse &&
+                parsedResponse["type"] === "content_block_delta" &&
+                parsedResponse["delta"] &&
+                typeof parsedResponse["delta"]["text"] === "string"
+              ) {
+                responseAttributes = {
+                  ...responseAttributes,
+                  [`${SpanAttributes.LLM_COMPLETIONS}.0.content`]:
+                    parsedResponse["delta"]["text"],
+                };
+              }
+
               // ! NOTE: This make sure the content always have all streamed chunks
               if (this._shouldSendPrompts()) {
                 // Update local value with attribute value that was set by _setResponseAttributes
-                streamedContent +=
-                  responseAttributes[
-                    `${SpanAttributes.LLM_COMPLETIONS}.0.content`
-                  ];
+                const chunkContent =
+                  (responseAttributes[
+                    `${SpanAttributes.LLM_COMPLETIONS}.0.content`
+                  ] as string) || "";
+                streamedContent += chunkContent;
                 // re-assign the new value to responseAttributes
                 responseAttributes = {
                   ...responseAttributes,
                   [`${SpanAttributes.LLM_COMPLETIONS}.0.content`]:
                     streamedContent,
                 };
               }
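
The accumulation guard this diff adds can be illustrated standalone. In the sketch below (event shapes are simplified from Bedrock's Anthropic streaming format), only `content_block_delta` events with a string `delta.text` contribute to the accumulated completion, so `undefined` is never appended.

```typescript
// Simplified Anthropic streaming event shape (illustrative; real Bedrock
// events carry more fields). Only content_block_delta events with a string
// delta.text contribute text, so undefined chunks are never appended.
interface StreamEvent {
  type: string;
  delta?: { text?: string };
}

function accumulate(events: StreamEvent[]): string {
  let content = "";
  for (const ev of events) {
    const text = ev.delta?.text;
    if (ev.type === "content_block_delta" && typeof text === "string") {
      content += text;
    }
  }
  return content;
}

const sampleEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_delta", delta: { text: "Hello" } },
  { type: "content_block_delta", delta: { text: ", world" } },
  { type: "message_stop" },
];
console.log(accumulate(sampleEvents)); // "Hello, world"
```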
packages/traceloop-sdk/src/lib/tracing/index.ts (1)

254-279: Trace content gating omits LangChain instrumentation (privacy gap).

When shouldSendTraces() is false, most instrumentations have traceContent disabled, but LangChainInstrumentation is not updated. This can lead to prompts/content being traced via the callback handler despite the global gating.

Add the missing setConfig call.

Apply this diff:

   if (!shouldSendTraces()) {
     openAIInstrumentation?.setConfig({
       traceContent: false,
     });
+    langchainInstrumentation?.setConfig({
+      traceContent: false,
+    });
     llamaIndexInstrumentation?.setConfig({
       traceContent: false,
     });
     vertexaiInstrumentation?.setConfig({
       traceContent: false,
     });
     aiplatformInstrumentation?.setConfig({
       traceContent: false,
     });
     bedrockInstrumentation?.setConfig({
       traceContent: false,
     });
     cohereInstrumentation?.setConfig({
       traceContent: false,
     });
     chromadbInstrumentation?.setConfig({
       traceContent: false,
     });
     togetherInstrumentation?.setConfig({
       traceContent: false,
     });
   }
packages/sample-app/src/sample_structured_output.ts (1)

20-35: Use typed output with Zod parsing
Your sample-app is on OpenAI SDK ^5.12.2 (packages/sample-app/package.json), which includes chat.completions.parse(). Rather than returning an untyped string from message.content, switch to one of these patterns to preserve the structure and validation:

File: packages/sample-app/src/sample_structured_output.ts (lines 20–35)

Option A (preferred): use the built-in parse helper

-  const completion = await openai.chat.completions.create({
+  const completion = await openai.chat.completions.parse({
     model: "gpt-4o",
     messages: [
       { role: "system", content: "Extract the event information." },
       { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
     ],
     response_format: zodResponseFormat(CalendarEvent, "event"),
   });
-  const content = completion.choices[0].message.content;
-  console.log(content);
-  return content;
+  const event = completion.choices[0].message.parsed;
+  console.log(event);
+  return event;

Option B: explicitly JSON-parse then Zod-validate

   const completion = await openai.chat.completions.create({
     model: "gpt-4o",
     messages: [
       { role: "system", content: "Extract the event information." },
       { role: "user", content: "Alice and Bob are going to a science fair on Friday." },
     ],
     response_format: zodResponseFormat(CalendarEvent, "event"),
   });
-  const content = completion.choices[0].message.content;
-  console.log(content);
-  return content;
+  const raw = completion.choices[0]?.message?.content;
+  if (!raw) throw new Error("No content returned from Chat Completions");
+  const json = typeof raw === "string" ? JSON.parse(raw) : raw;
+  const event = CalendarEvent.parse(json);
+  console.log(event);
+  return event;

Note: if you keep create(), also guard for message.content being an array of parts.

🧹 Nitpick comments (23)
packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-correct-attributes-in-span-for-LCEL_182789169/recording.har (5)

79-86: Sanitize OpenAI cookies and request IDs to reduce churn and avoid leaking ephemeral identifiers.

Replace cookie/set-cookie and x-request-id values with placeholders. This won’t affect playback but will stabilize diffs.

Apply:

           "cookies": [
             {
               "domain": ".api.openai.com",
               "httpOnly": true,
               "name": "_cfuvid",
               "path": "/",
               "sameSite": "None",
               "secure": true,
-              "value": "V0OG2Nd40Whqb_raqdwmaTqKdTAH3KDWW9xjac3d90c-1755439109144-0.0.1.1-604800000"
+              "value": "REDACTED"
             }
           ],
...
             {
               "name": "set-cookie",
-              "value": "_cfuvid=V0OG2Nd40Whqb_raqdwmaTqKdTAH3KDWW9xjac3d90c-1755439109144-0.0.1.1-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
+              "value": "_cfuvid=REDACTED; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
             },
...
             {
               "name": "x-request-id",
-              "value": "req_ad99e205b14f4d118afdc11741e7168b"
+              "value": "req_redacted"
             }

Also applies to: 142-144, 187-187


196-206: Normalize timestamps/timings to reduce recording churn (optional).

startedDateTime, time, and timings.wait are inherently volatile. Consider pinning to sentinel values in HARs to avoid future re-record noise.

Example:

-        "startedDateTime": "2025-08-17T13:58:28.541Z",
-        "time": 490,
+        "startedDateTime": "1970-01-01T00:00:00.000Z",
+        "time": 0,
...
-          "wait": 490
+          "wait": 0

and similarly for the Wikipedia entry below.

Also applies to: 350-360


249-256: Sanitize Wikipedia cookies in recording.

WMF-Uniq cookies are session-scoped and volatile. Mask them to prevent secret/identifier leakage and diff churn.

Apply:

           "cookies": [
             {
               "domain": ".wikipedia.org",
               "expires": "2026-08-17T00:00:00.000Z",
               "httpOnly": true,
               "name": "WMF-Uniq",
               "path": "/",
               "sameSite": "None",
               "secure": true,
-              "value": "Mvw3UBIBsU4O-nEn5JUItgJSAAAAAFvdqrIaFwJVG8ILcwDxZFoTj75Qh4UfcxBd"
+              "value": "REDACTED"
             }
           ],
...
             {
               "name": "set-cookie",
-              "value": "WMF-Uniq=Mvw3UBIBsU4O-nEn5JUItgJSAAAAAFvdqrIaFwJVG8ILcwDxZFoTj75Qh4UfcxBd;Domain=.wikipedia.org;Path=/;HttpOnly;secure;SameSite=None;Expires=Mon, 17 Aug 2026 00:00:00 GMT"
+              "value": "WMF-Uniq=REDACTED;Domain=.wikipedia.org;Path=/;HttpOnly;secure;SameSite=None;Expires=Mon, 17 Aug 2026 00:00:00 GMT"
             },

Also applies to: 305-306


316-318: Optional: redact volatile correlation headers (x-analytics, x-search-id).

These vary per request and don’t affect test semantics. Masking them cuts noise in future diffs.

Apply:

             {
               "name": "x-analytics",
-              "value": ""
+              "value": "REDACTED"
             },
...
             {
               "name": "x-search-id",
-              "value": "5hlfn7v0swivj3w2c82fa4hvg"
+              "value": "REDACTED"
             }

Also applies to: 340-342


56-59: Filter volatile HAR fields for stable tests

The HAR fixtures in packages/instrumentation-langchain/recordings/.../recording.har include environment-dependent values that will break on every re-record:

  • Size/timing keys: headersSize, bodySize, startedDateTime
  • Runtime headers: x-stainless-runtime-version, x-stainless-package-version, x-stainless-os, x-stainless-arch, cf-ray, x-request-id, x-envoy-upstream-service-time

Introduce a preprocessing step in your test harness (e.g. in the HAR-loader utility) to strip or normalize these fields before parsing. This way tests only assert on the stable request/response attributes and won’t churn when versions or timings change.

• Update the HAR-loader in packages/instrumentation-langchain/test/utils (or equivalent)
• Remove the above keys from each entry in har.log.entries before running assertions
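A minimal sketch of such a HAR-loader sanitizer follows; the entry shape mirrors the standard HAR 1.2 structure, and the `VOLATILE_HEADERS` list is assembled from the fields called out above (adjust to the fixtures' actual noise sources):

```typescript
// Sketch: strip volatile fields from HAR entries before parsing/asserting.
// Header names and sentinel values are assumptions, not the repo's actual list.
const VOLATILE_HEADERS = new Set([
  "x-stainless-runtime-version",
  "x-stainless-package-version",
  "x-stainless-os",
  "x-stainless-arch",
  "cf-ray",
  "x-request-id",
  "x-envoy-upstream-service-time",
]);

interface HarHeader {
  name: string;
  value: string;
}

interface HarEntry {
  startedDateTime?: string;
  time?: number;
  request: { headers: HarHeader[]; headersSize?: number; bodySize?: number };
  response: { headers: HarHeader[]; headersSize?: number; bodySize?: number };
}

function sanitizeHar(har: { log: { entries: HarEntry[] } }) {
  for (const entry of har.log.entries) {
    // Pin timestamps/timings to sentinels so re-records don't churn.
    entry.startedDateTime = "1970-01-01T00:00:00.000Z";
    entry.time = 0;
    for (const side of [entry.request, entry.response]) {
      side.headersSize = -1; // -1 means "unknown" per the HAR spec
      side.bodySize = -1;
      side.headers = side.headers.filter(
        (h) => !VOLATILE_HEADERS.has(h.name.toLowerCase()),
      );
    }
  }
  return har;
}
```

Running each fixture through this before assertions keeps tests pinned to the stable request/response attributes only.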

packages/instrumentation-langchain/src/callback_handler.ts (11)

25-28: Type the stored span to OpenTelemetry’s Span instead of any

Using any loses type safety and IntelliSense. Prefer the OTEL Span type.

Apply this diff:

-import { Tracer, trace, SpanKind, SpanStatusCode, context } from "@opentelemetry/api";
+import { Tracer, trace, SpanKind, SpanStatusCode, context, type Span } from "@opentelemetry/api";
@@
 interface SpanData {
-  span: any;
+  span: Span;
   runId: string;
 }

64-77: Prefer using semantic-conventions constants for Traceloop attributes

You already import SpanAttributes; using constants for traceloop.* reduces typo risk and keeps consistency with the rest of the codebase.

Apply this diff here (and similarly in other occurrences):

-    taskSpan.setAttributes({
-      "traceloop.span.kind": "task",
-      "traceloop.workflow.name": runName || taskSpanName,
-    });
+    taskSpan.setAttributes({
+      [SpanAttributes.TRACELOOP_SPAN_KIND]: "task",
+      [SpanAttributes.TRACELOOP_WORKFLOW_NAME]: runName || taskSpanName,
+    });
@@
-      taskSpan.setAttributes({
-        "traceloop.entity.input": JSON.stringify(flatMessages.map(m => ({ 
-          role: m._getType(), 
-          content: typeof m.content === 'string' ? m.content : JSON.stringify(m.content)
-        }))),
-      });
+      taskSpan.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_INPUT]: JSON.stringify(
+          flatMessages.map((m) => ({
+            role: m._getType(),
+            content: typeof m.content === "string" ? m.content : JSON.stringify(m.content),
+          })),
+        ),
+      });

Other places to update similarly:

  • Lines 126-133, 228-229, 295-297, 301-302, 322-323, 365-367, 371-372, 391-392.

238-247: Mark handleChatModelEnd as override for consistency

Other lifecycle methods use the override keyword. Minor consistency fix.

Apply this diff:

-  async handleChatModelEnd(
+  override async handleChatModelEnd(
     output: LLMResult,
     runId: string,

220-230: Use semantic-conventions constant for entity output attribute

Small consistency and typo-proofing improvement.

Apply this diff:

-        taskSpan.setAttributes({
-          "traceloop.entity.output": JSON.stringify(completions),
-        });
+        taskSpan.setAttributes({
+          [SpanAttributes.TRACELOOP_ENTITY_OUTPUT]: JSON.stringify(completions),
+        });

294-303: Use constants for workflow/input attributes

Same rationale as above.

Apply this diff:

-    span.setAttributes({
-      "traceloop.span.kind": "workflow",
-      "traceloop.workflow.name": runName || chainName,
-    });
+    span.setAttributes({
+      [SpanAttributes.TRACELOOP_SPAN_KIND]: "workflow",
+      [SpanAttributes.TRACELOOP_WORKFLOW_NAME]: runName || chainName,
+    });
@@
-      span.setAttributes({
-        "traceloop.entity.input": JSON.stringify(inputs),
-      });
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_INPUT]: JSON.stringify(inputs),
+      });

320-327: Use constants for entity output on chain end

Apply this diff:

-    if (this.traceContent) {
-      span.setAttributes({
-        "traceloop.entity.output": JSON.stringify(outputs),
-      });
-    }
+    if (this.traceContent) {
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_OUTPUT]: JSON.stringify(outputs),
+      });
+    }

348-376: Tool input type should be unknown (not string); capture args robustly

Tool inputs in LangChain can be structured. Widen the type and avoid forcing stringification of a string-only shape.

Apply this diff:

-  override async handleToolStart(
+  override async handleToolStart(
     tool: Serialized,
-    input: string,
+    input: unknown,
     runId: string,
@@
-    span.setAttributes({
-      "traceloop.span.kind": "task",
-      "traceloop.entity.name": toolName,
-    });
+    span.setAttributes({
+      [SpanAttributes.TRACELOOP_SPAN_KIND]: "task",
+      [SpanAttributes.TRACELOOP_ENTITY_NAME]: toolName,
+    });
@@
-    if (this.traceContent) {
-      span.setAttributes({
-        "traceloop.entity.input": JSON.stringify({ args: [input] }),
-      });
-    }
+    if (this.traceContent) {
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_INPUT]: JSON.stringify({ args: [input] }),
+      });
+    }

389-397: Use constants for tool output attribute

Apply this diff:

-    if (this.traceContent) {
-      span.setAttributes({
-        "traceloop.entity.output": JSON.stringify(output),
-      });
-    }
+    if (this.traceContent) {
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_OUTPUT]: JSON.stringify(output),
+      });
+    }

439-447: Span name conversion splits consecutive capitals (e.g., ChatOpenAI -> chat.open.a.i)

The current implementation inserts a dot before every capital, splitting acronym blocks. Use a boundary-aware conversion to keep acronym runs together.

Apply this diff:

-  private convertClassNameToSpanName(className: string): string {
-    // Convert PascalCase to lowercase with dots
-    // BedrockChat -> bedrock.chat
-    // ChatOpenAI -> chat.openai
-    return className
-      .replace(/([A-Z])/g, (match, char, index) => {
-        return index === 0 ? char.toLowerCase() : `.${char.toLowerCase()}`;
-      });
-  }
+  private convertClassNameToSpanName(className: string): string {
+    // Convert PascalCase to lowercase with dots; preserve acronym runs.
+    // BedrockChat -> bedrock.chat, ChatOpenAI -> chat.openai
+    return className
+      // boundary between lower/number and upper
+      .replace(/([a-z0-9])([A-Z])/g, "$1.$2")
+      // boundary between acronym and next word (e.g., OpenAIChat -> OpenAI.Chat)
+      .replace(/([A-Z]+)([A-Z][a-z])/g, "$1.$2")
+      .toLowerCase();
+  }

420-434: Generalize Bedrock model name extraction; avoid hard-coded Anthropic suffix

The removal of "-20250219-v1" is vendor/date-specific. Prefer stripping generic version/date suffixes and Bedrock revisions.

Apply this diff:

-    if (className === "BedrockChat") {
-      // The model name might be available in kwargs - cast to any to access kwargs
-      const llmAny = llm as any;
-      const modelId = llmAny.kwargs?.model || llmAny.kwargs?.model_id;
-      if (modelId && typeof modelId === 'string') {
-        // Extract clean model name from full ID (e.g., "us.anthropic.claude-3-7-sonnet-20250219-v1:0" -> "claude-3-7-sonnet")
-        const parts = modelId.split('.');
-        if (parts.length >= 3) {
-          const modelPart = parts.slice(2).join('.').split(':')[0]; // Remove region and version
-          return modelPart.replace('-20250219-v1', ''); // Clean up version suffix
-        }
-        return modelId;
-      }
-    }
+    if (className === "BedrockChat") {
+      const llmAny = llm as any;
+      const modelId = llmAny.kwargs?.model || llmAny.kwargs?.model_id;
+      if (typeof modelId === "string" && modelId.length > 0) {
+        // Drop Bedrock revision suffix (":<n>") if present
+        const withoutRevision = modelId.split(":")[0];
+        const parts = withoutRevision.split(".");
+        // Typically: "<region>.<vendor>.<model...>"
+        const modelPart = parts.length >= 3 ? parts.slice(2).join(".") : parts[parts.length - 1];
+        // Remove common date/version suffix patterns like "-20250219-v1"
+        return modelPart.replace(/-\d{6,8}-v\d+$/i, "");
+      }
+    }

449-481: Vendor detection: consider case-insensitive checks and additional providers

Minor robustness: use case-insensitive checks to avoid missing variants; optionally include Mistral, Meta, etc.

If desired, replace includes checks with case-insensitive regex, e.g.,

  • OpenAI: /\bopenai|gpt\b/i
  • Anthropic: /\banthropic|claude\b/i
  • Bedrock: /\bbedrock\b/i
  • Google: /\bvertex|google\b/i
  • Azure: /\bazure\b/i
  • Cohere: /\bcohere\b/i
  • Etc.

Happy to provide a concrete diff if you want this standardized.
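As an illustration only (the vendor names and patterns below are examples, not the codebase's actual detection list), a table-driven version could look like this — note the ordering puts more specific providers (Azure, Bedrock) before the generic OpenAI check so names like "AzureChatOpenAI" resolve correctly:

```typescript
// Illustrative case-insensitive vendor detection; patterns are examples.
// Word-boundary anchors (\b) are deliberately avoided because they fail on
// PascalCase names like "ChatOpenAI" (no boundary between "t" and "O").
const VENDOR_PATTERNS: Array<[RegExp, string]> = [
  [/azure/i, "Azure"],
  [/bedrock/i, "AWS"],
  [/openai|gpt/i, "OpenAI"],
  [/anthropic|claude/i, "Anthropic"],
  [/vertex|google|gemini/i, "Google"],
  [/cohere/i, "Cohere"],
  [/mistral/i, "MistralAI"],
];

function detectVendor(identifier: string): string | undefined {
  for (const [pattern, vendor] of VENDOR_PATTERNS) {
    if (pattern.test(identifier)) return vendor;
  }
  return undefined; // unknown provider; caller decides the fallback
}
```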

packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-tools-instrumentation_252498424/recording.har (1)

50-58: Consider redacting dynamic/PII-like headers and cookies in HARs to reduce churn and leakage.

Headers like x-client-ip, set-cookie, server, date, x-request-id, and x-analytics, as well as cookie values (WMF-Uniq), make fixtures noisy and potentially expose sensitive data. Scrubbing them keeps recordings stable and safer.

You can sanitize via Polly.JS server hooks:

  • server.any().on("beforePersist", (_, recording) => { delete/normalize volatile headers in recording.request/response; })
  • or configure matchRequestsBy/recording options to ignore these fields.

I can provide a concrete patch to your Polly setup if you share its location.
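As a sketch of the first option, the scrubbing itself can live in a pure function that the hook delegates to, e.g. `server.any().on("beforePersist", (_req, rec) => sanitizeRecording(rec))`; the header list below is assembled from the fields called out above and the function name is hypothetical:

```typescript
// Pure sanitizer intended to be wired into Polly.JS's beforePersist hook.
// Header names are examples; tune to the fixtures' actual noise sources.
const SENSITIVE_HEADERS = new Set([
  "set-cookie",
  "x-client-ip",
  "x-request-id",
  "x-analytics",
  "server",
  "date",
]);

interface RecordingSide {
  headers?: { name: string; value: string }[];
  cookies?: unknown[];
}

function sanitizeRecording(recording: {
  request: RecordingSide;
  response: RecordingSide;
}) {
  for (const side of [recording.request, recording.response]) {
    side.headers = (side.headers ?? []).filter(
      (h) => !SENSITIVE_HEADERS.has(h.name.toLowerCase()),
    );
    side.cookies = []; // drop session cookies like WMF-Uniq entirely
  }
  return recording;
}
```

Keeping the sanitizer pure also makes it easy to unit-test independently of Polly.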

Also applies to: 102-104, 211-220, 264-266, 371-378, 422-424, 581-582

packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-agent-instrumentation_1316891278/recording.har (1)

81-86: Scrub volatile and identifying headers/cookies from recordings.

OpenAI cookies (_cfuvid), cf-ray, date, x-request-id, and x-envoy-upstream-service-time change frequently and add maintenance overhead; cookies are also sensitive. Sanitizing these in the recorder improves stability and reduces risk.

Also applies to: 138-140, 271-280, 332-334

packages/instrumentation-bedrock/src/instrumentation.ts (1)

460-490: Anthropic response handling doesn’t account for streaming event shapes.

The Anthropic branch assumes response.content (messages API) or response.completion (legacy). For Bedrock streaming, events like content_block_delta contain delta.text and no content/completion. You partly mitigate this in the loop above; consider centralizing in _setResponseAttributes to keep vendor-specific logic in one place.

I can craft a follow-up patch to add:

  • if (isStream && response.type === "content_block_delta") -> use response.delta.text
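Sketched out, the centralized extractor could look like the following; the function name is illustrative, and the field names follow Bedrock's Anthropic event shapes (messages API `content` blocks, streaming `content_block_delta` events, legacy `completion`):

```typescript
// Hedged sketch of centralizing Anthropic text extraction for Bedrock.
// Covers messages API, legacy prompt API, and streaming delta events.
function extractAnthropicText(response: any, isStream: boolean): string {
  // Streaming: content_block_delta events carry text under delta.text
  // and have neither `content` nor `completion`.
  if (isStream && response.type === "content_block_delta") {
    return response.delta?.text ?? "";
  }
  // Messages API: content is an array of blocks, e.g. [{ type: "text", text: "..." }]
  if (Array.isArray(response.content)) {
    return response.content
      .filter((block: any) => block.type === "text")
      .map((block: any) => block.text)
      .join("");
  }
  // Legacy prompt API
  return response.completion ?? "";
}
```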
packages/sample-app/src/sample_llama_index_openai_agent.ts (1)

61-62: Guard against missing OPENAI_API_KEY for clearer failures

Fail fast with a helpful error if the key isn’t set; avoids opaque auth errors later.


Add the check just before constructing LLamaOpenAI:

 async function main() {
+  if (!process.env.OPENAI_API_KEY) {
+    throw new Error("OPENAI_API_KEY is not set");
+  }
   const agent = new OpenAIAgent({
packages/sample-app/src/sample_langchain_bedrock.ts (1)

25-31: Avoid dumping full model objects to logs; log content instead

Full objects can be noisy and may include sensitive fields. Logging the content is cleaner for a sample and reduces PII/verbosity risk.

Apply this diff:

   console.log("About to invoke BedrockChat...");
   const response = await model.invoke([message]);
-  console.log("Response received:", response);
+  console.log("Response received:", response?.content);
packages/instrumentation-langchain/test/instrumentation.test.ts (2)

411-418: bindTools returns a new runnable; assign it and invoke that instance

bindTools doesn’t mutate in place. Not assigning means the tool isn’t actually bound to the model used for invoke().

Apply this diff:

-    model.bindTools([get_cities_data_by_country]);
-
-    const message = new HumanMessage({
+    const modelWithTools = model.bindTools([get_cities_data_by_country]);
+    const message = new HumanMessage({
       content: "What is a popular landmark in the most populous city in the US?",
     });
-
-    const response = await model.invoke([message]);
+    const response = await modelWithTools.invoke([message]);

419-423: Flush tracer provider before asserting on exported spans for stability

Force-flushing reduces flakiness across environments by ensuring spans are exported before you inspect memoryExporter.

Apply this diff:

-    const spans = memoryExporter.getFinishedSpans();
+    await provider.forceFlush();
+    const spans = memoryExporter.getFinishedSpans();
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between fabc504 and ed79220.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (18)
  • packages/instrumentation-bedrock/src/instrumentation.ts (3 hunks)
  • packages/instrumentation-langchain/package.json (2 hunks)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-BedrockChat-with-tools_4092244032/recording.har (1 hunks)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-agent-instrumentation_1316891278/recording.har (19 hunks)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-chain-instrumentation_2401024958/recording.har (0 hunks)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-retrieval-qa-instrumentation_860608959/recording.har (0 hunks)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-tools-instrumentation_252498424/recording.har (21 hunks)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-correct-attributes-in-span-for-LCEL_182789169/recording.har (14 hunks)
  • packages/instrumentation-langchain/src/callback_handler.ts (1 hunks)
  • packages/instrumentation-langchain/src/index.ts (1 hunks)
  • packages/instrumentation-langchain/src/instrumentation.ts (1 hunks)
  • packages/instrumentation-langchain/test/instrumentation.test.ts (8 hunks)
  • packages/sample-app/package.json (3 hunks)
  • packages/sample-app/src/sample_langchain_bedrock.ts (1 hunks)
  • packages/sample-app/src/sample_llama_index_openai_agent.ts (1 hunks)
  • packages/sample-app/src/sample_structured_output.ts (2 hunks)
  • packages/traceloop-sdk/src/lib/interfaces/initialize-options.interface.ts (1 hunks)
  • packages/traceloop-sdk/src/lib/tracing/index.ts (1 hunks)
💤 Files with no reviewable changes (2)
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-chain-instrumentation_2401024958/recording.har
  • packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-retrieval-qa-instrumentation_860608959/recording.har
🧰 Additional context used
🧬 Code Graph Analysis (6)
packages/traceloop-sdk/src/lib/tracing/index.ts (1)
packages/instrumentation-langchain/src/instrumentation.ts (1)
  • LangChainInstrumentation (25-118)
packages/instrumentation-langchain/src/callback_handler.ts (2)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/instrumentation-langchain/src/instrumentation.ts (2)
packages/instrumentation-langchain/src/types.ts (1)
  • LangChainInstrumentationConfig (3-14)
packages/instrumentation-langchain/src/callback_handler.ts (1)
  • TraceloopCallbackHandler (30-497)
packages/sample-app/src/sample_structured_output.ts (1)
packages/sample-app/src/sample_decorators.ts (1)
  • completion (29-37)
packages/instrumentation-bedrock/src/instrumentation.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
packages/instrumentation-langchain/test/instrumentation.test.ts (3)
packages/instrumentation-bedrock/src/instrumentation.ts (1)
  • BedrockInstrumentation (39-570)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
🪛 GitHub Actions: CI
packages/traceloop-sdk/src/lib/tracing/index.ts

[warning] 356-356: ESLint: Unexpected any. Specify a different type. (no-explicit-any)

🪛 ESLint
packages/instrumentation-langchain/src/instrumentation.ts

[error] 59-60: Empty block statement.

(no-empty)


[error] 61-62: Empty block statement.

(no-empty)


[error] 71-71: Unexpected aliasing of 'this' to local variable.

(@typescript-eslint/no-this-alias)


[error] 100-101: Empty block statement.

(no-empty)


[error] 101-102: Empty block statement.

(no-empty)

🔇 Additional comments (17)
packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-correct-attributes-in-span-for-LCEL_182789169/recording.har (3)

216-216: Query update looks correct and aligned with LLM output.

Including quotes and site:wikipedia.org matches the assistant message and ensures the second request replays deterministically. No action needed.

Also applies to: 230-231, 237-238


71-76: Instrumentation Consumes Token Usage Attributes

I’ve confirmed that the prompt_tokens, completion_tokens, and total_tokens fields from the OpenAI‐style usage block are read and applied as span attributes across all relevant packages. Specifically:

  • packages/traceloop-sdk/src/lib/tracing/manual.ts
  • packages/instrumentation-openai/src/instrumentation.ts
  • packages/instrumentation-together/src/instrumentation.ts
  • packages/instrumentation-langchain (recordings and instrumentation code)
  • packages/instrumentation-cohere/src/instrumentation.ts

No additional references to prompt_tokens_details or completion_tokens_details are needed for span attributes. Everything required by the PR’s token-tracking fix is in place.


240-245: LCEL span attribute tests are agnostic to search hits
I confirmed that the “should set correct attributes in span for LCEL” test only asserts the span’s kind and its traceloop.entity.input/output values (deep-equal to result) and does not reference totalhits, search, or any search result fields. The empty Wikipedia response won’t affect this test. No changes required.

packages/instrumentation-langchain/package.json (2)

50-63: devDependency addition for Bedrock instrumentation looks good

Adding "@traceloop/instrumentation-bedrock" as a workspace devDependency aligns with the new Bedrock-based tests and local wiring. No issues spotted.


41-48: Verify and align Node.js engine requirements

LangChain v0.3.70+ targets Node.js 18+, but packages/instrumentation-langchain/package.json (and several sibling packages) still declare

"engines": { "node": ">=14" }

Please confirm whether these instrumentation packages remain compatible with Node 14–17. If they aren’t, bump the engines.node field to >=18 (and consider doing so across all instrumentation packages) to prevent installations on unsupported runtimes.

  • Affected file:
    • packages/instrumentation-langchain/package.json: "node": ">=14"
packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-BedrockChat-with-tools_4092244032/recording.har (1)

1-115: HAR fixture appears sanitized and representative

The recording includes only non-secret metadata (e.g., request IDs, token counts) and a messages-style payload. Suitable for deterministic tests.

packages/sample-app/package.json (1)

30-31: New sample script is consistent with the Bedrock/LangChain path

"run:langchain_bedrock" matches the sample source file and build flow. Looks good.

packages/traceloop-sdk/src/lib/interfaces/initialize-options.interface.ts (1)

97-101: Breaking change: instrumentModules.langchain switched from object to boolean

I didn’t find any direct call sites or type references in the SDK that still use the old object shape, but absence of matches isn’t proof that downstream consumers aren’t relying on it. Please:

  • Update SDK docs, README, and add a MIGRATION.md entry explaining:
    • The new boolean flag syntax.
    • How to convert from the old { /* modules */ } shape.
  • Consider a temporary overload in initializeOptions to accept both shapes (with a deprecation warning) for a smoother upgrade path.
  • Review any sample apps or tutorials for references to the former object API.

Let me know if you’d like me to draft the migration snippet and update the docs.

packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-tools-instrumentation_252498424/recording.har (1)

42-47: HAR fixture refresh looks consistent.

Timestamps, cookies, server headers, and response bodies have been updated to a recent snapshot. Nothing functionally concerning from the test perspective.

Also applies to: 117-121, 146-162, 204-210, 279-283, 304-320, 362-370, 437-441, 462-478, 520-525, 560-566, 595-599, 620-636

packages/instrumentation-langchain/recordings/Test-Langchain-instrumentation_2353722718/should-set-attributes-in-span-for-agent-instrumentation_1316891278/recording.har (1)

55-60: HAR updates align with the new agent/tool_calls flow and streaming payloads.

The expanded SSE payloads and tool_calls entries reflect the updated instrumentation pathway. Looks good.

Also applies to: 71-76, 102-116, 122-124, 155-172, 182-186, 192-202, 205-213, 249-254, 265-270, 296-310, 316-350, 364-378, 386-396

packages/instrumentation-bedrock/src/instrumentation.ts (2)

388-390: Cohere prompt extraction fallback is a good improvement.

Using message || prompt handles both chat and legacy formats consistently.


556-569: Vendor/model parsing is simple and effective.

Splitting on the first dot works for Bedrock IDs like anthropic.claude-3-5-... and amazon.titan-.... This aligns with the rest of the file’s usage.

packages/instrumentation-langchain/src/index.ts (1)

18-18: Re-exporting the callback handler is the right call.

Exposing TraceloopCallbackHandler at the package root simplifies consumption and aligns with the new callback-based instrumentation.

packages/traceloop-sdk/src/lib/tracing/index.ts (2)

101-109: Unconditionally enabling LangChain instrumentation is aligned with the new auto-injection design.

Good move; it avoids requiring the module to be passed in and leverages the handler injection approach.


248-256: Manual-init path now always enables LangChain instrumentation — verify no double-registration.

You clear the global instrumentations array before pushing, so duplicate registration should not happen. Just ensure consumer code won’t also explicitly create a LangChainInstrumentation to avoid duplicate patching.

If you want, I can generate a quick repo scan script to confirm there are no other LangChainInstrumentation instantiations outside this file.

packages/instrumentation-langchain/src/instrumentation.ts (2)

30-33: Constructor-time patching is fine; confirm tracer provider ordering

Patching in the constructor ensures early injection, but make sure typical SDK initialization sets the tracer provider before any LangChain usage triggers _configureSync, so handler spans use the correct tracer. If not guaranteed, document ordering constraints in the README.


47-50: init() returning [] is appropriate with require-time patching

Since you’re directly patching CallbackManager at runtime, leaving init() empty avoids duplicate patches. Looks good.

Comment thread packages/instrumentation-bedrock/src/instrumentation.ts
Comment thread packages/instrumentation-langchain/src/callback_handler.ts
Comment thread packages/instrumentation-langchain/src/instrumentation.ts
Comment thread packages/instrumentation-langchain/src/instrumentation.ts
Comment thread packages/instrumentation-langchain/test/instrumentation.test.ts
- Add eslint disable for require() usage in Node.js context
- Apply prettier formatting to all source files
- Build and tests still passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 6b4a270 in 1 minute and 10 seconds. Click for details.
  • Reviewed 637 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-langchain/src/callback_handler.ts:456
  • Draft comment:
    Consider parameterizing or extracting the hard-coded version suffix ('-20250219-v1') in extractModelName so that future model ID changes can be handled without code changes.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/instrumentation-langchain/src/callback_handler.ts:195
  • Draft comment:
    The token usage extraction logic in handleLLMEnd is duplicated for different formats. Consider refactoring this into a helper function to improve clarity and maintainability.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/instrumentation-langchain/src/callback_handler.ts:473
  • Draft comment:
    The regex in convertClassNameToSpanName works for PascalCase names, but consider edge cases if new naming patterns are introduced.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
4. packages/instrumentation-langchain/src/instrumentation.ts:69
  • Draft comment:
    The patchCallbackManager implementation properly avoids double patching by using a _traceloopPatched flag. Consider adding unit tests for scenarios where _configureSync might be absent or modified in future LangChain versions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
5. packages/instrumentation-langchain/test/instrumentation.test.ts:430
  • Draft comment:
    The tests comprehensively validate span attributes. Consider adding extra assertions for edge-case scenarios (e.g., when spans are not created) to catch potential regressions in instrumentation behavior.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_5w9fYdByYnIRafdY

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

nirga and others added 3 commits August 17, 2025 21:46
- Remove redundant task span creation in handleChatModelStart
- Create only LLM completion span like Python implementation
- Update error handling for single span approach
- Update tests to expect single span
- Remove unused imports (trace, context)
- Build and tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Both methods now create single LLM completion span
- Remove task span creation from handleLLMStart
- Consistent span naming pattern across all LLM methods
- Remove unused context import
- Build and tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Set LLM_RESPONSE_MODEL in handleLLMEnd
- Now have complete OpenTelemetry LLM semantic attributes:
  - Model (request/response)
  - Prompts (role/content)
  - Completions (role/content)
  - Usage metrics (prompt/completion/total tokens)
  - System vendor and request type

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed b4003af in 2 minutes and 1 second. Click for details.
  • Reviewed 313 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-langchain/src/callback_handler.ts:63
  • Draft comment:
    Unified Span Implementation: In handleChatModelStart the previous dual spans (task and LLM) have been merged into a single span (with '.completion' suffix). Make sure that this change doesn’t lose necessary debugging granularity. Consider adding documentation on the rationale.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/instrumentation-langchain/src/callback_handler.ts:106
  • Draft comment:
    Naming Consistency: The handleLLMStart method still uses a span name ending in '.task' and sets 'traceloop.span.kind' as 'task', whereas handleChatModelStart now uses '.completion'. Consider standardizing these names unless the difference is intentional.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/instrumentation-langchain/src/callback_handler.ts:156
  • Draft comment:
    Token Usage Metrics Duplication: The extraction of token usage metrics is performed twice (once using 'usage' and again using 'tokenUsage'). If both exist, attributes might be overwritten. Consider unifying the logic or clearly documenting the fallback behavior for backward compatibility.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The code does show clear duplication in handling token metrics, with the second check potentially overwriting values from the first. However, the comment on line 176 indicates this is intentional for compatibility ("Also check for tokenUsage format (for compatibility)"). This suggests it's a deliberate fallback mechanism rather than an oversight. The current implementation allows handling both formats without complex merging logic. The comment raises a valid technical concern about attribute overwriting. Perhaps there could be a better way to handle the fallback that makes the behavior more explicit. While the concern is valid, the current implementation appears to be an intentional design choice for backward compatibility, clearly documented in the code comments. The comment should be deleted because the apparent duplication is actually an intentional fallback mechanism for compatibility, which is already documented in the code.
4. packages/instrumentation-langchain/test/instrumentation.test.ts:465
  • Draft comment:
    Test Update for BedrockChat: The test now expects a single span named 'bedrock.chat.completion'. Ensure that both streaming and non-streaming scenarios are covered as intended and that no old references to a task span remain.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_rEsPPxDhC4xAJjRy

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
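
One way to make the fallback discussed in draft comment 3 explicit is to resolve the two formats in a single pass, so neither silently overwrites the other. A minimal sketch (field and attribute names are assumptions for illustration, not taken from the PR):

```typescript
// Sketch only: unify the `usage` / `tokenUsage` fallback in one place.
// Field names below are assumed; adjust to the actual llmOutput shapes.
interface TokenUsage {
  promptTokens?: number;
  completionTokens?: number;
  totalTokens?: number;
}

function extractTokenUsage(llmOutput: Record<string, any>): TokenUsage {
  // Prefer the `usage` shape; fall back to legacy `tokenUsage` for compatibility.
  const raw = llmOutput?.usage ?? llmOutput?.tokenUsage ?? {};
  return {
    promptTokens: raw.prompt_tokens ?? raw.promptTokens,
    completionTokens: raw.completion_tokens ?? raw.completionTokens,
    totalTokens: raw.total_tokens ?? raw.totalTokens,
  };
}
```

With a single resolution point, the span attributes are set once and the precedence between the two formats is documented in one place instead of two overlapping branches.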


@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 6e6b639 in 1 minute and 23 seconds. Click for details.
  • Reviewed 25 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/instrumentation-langchain/src/callback_handler.ts:103
  • Draft comment:
    Span Naming: Previously, a span was created with ${spanBaseName}.task. Now it's using ${spanBaseName}.completion, aligning with handleChatModelStart. Ensure this change is intentional and consistent with downstream expectations.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. packages/instrumentation-langchain/src/callback_handler.ts:112
  • Draft comment:
    Attribute Removal: The attributes 'traceloop.span.kind' and 'traceloop.workflow.name' were removed. Confirm that this metadata is not needed for non-chat LLM completions or in downstream processing.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_S2inbwIPDFQ1TRh7


- Change span names from {model}.completion to {model}
- bedrock.chat.completion -> bedrock.chat
- Matches Python callback handler naming convention
- Update test to expect correct span names
- All tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed ea3b08b in 1 minute and 25 seconds. Click for details.
  • Reviewed 16 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
1. packages/instrumentation-langchain/src/callback_handler.ts:194
  • Draft comment:
    Avoid reading span.attributes directly; capture the LLM_REQUEST_MODEL in a local variable during span creation to ensure reliability.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The current approach of reading from span.attributes is actually fine. The span is guaranteed to exist since we check for spanData earlier. The LLM_REQUEST_MODEL is set when the span is created and won't change. Using the optional chaining operator handles the case where the attribute might not exist. Storing in a local variable wouldn't provide any real benefit and would require passing state between multiple handler methods. The comment raises a valid point about avoiding direct attribute access. There could be edge cases where the attribute is modified by other code between span creation and completion. In this specific codebase and usage pattern, the span attributes are controlled entirely by this class and there's no realistic way for the REQUEST_MODEL to be modified between setting and reading it. The comment should be deleted as it suggests a more complex solution without providing meaningful benefits in this specific context.

Workflow ID: wflow_eigJd5iHsHzKrBsd



@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 1bcbeb5 in 1 minute and 27 seconds. Click for details.
  • Reviewed 38 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. packages/instrumentation-langchain/src/callback_handler.ts:64
  • Draft comment:
    Updated span creation to use the base span name (e.g. 'bedrock.chat') without appending a '.completion' suffix in both handleChatModelStart and handleLLMStart. This aligns with the Python implementation. Ensure any related documentation or downstream dependencies are updated accordingly.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/instrumentation-langchain/test/instrumentation.test.ts:465
  • Draft comment:
    Updated test expectation to match the new span naming: looking for 'bedrock.chat' instead of 'bedrock.chat.completion'. This ensures the test aligns with the updated callback handler behavior.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
3. packages/instrumentation-langchain/test/instrumentation.test.ts:464
  • Draft comment:
    There's an extra space at the end of the comment which may be unintentional. Please remove the trailing whitespace.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 20% <= threshold 50% The comment is pointing out a minor formatting issue, which is not critical to the functionality of the code. It doesn't provide a code suggestion or ask for a test to be written. It also doesn't address a potential bug or issue with the logic. Therefore, it seems to be more of a nitpick rather than a necessary comment.

Workflow ID: wflow_P1Sz5IBbKsXnxs49



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
packages/instrumentation-langchain/test/instrumentation.test.ts (1)

87-89: Enable Bedrock instrumentation; otherwise AWS client patching won’t apply

You set the tracer provider but never call enable(), so patching may not occur and Bedrock LLM spans may not be emitted.

Apply this diff:

   bedrockInstrumentation = new BedrockInstrumentation();
   bedrockInstrumentation.setTracerProvider(provider);
+  bedrockInstrumentation.enable();
🧹 Nitpick comments (4)
packages/instrumentation-langchain/src/callback_handler.ts (3)

375-389: Bedrock model parsing is brittle; generalize to handle varied IDs and versions

Hard-coding “-20250219-v1” will break for other dates/versions. Normalize Bedrock model IDs by removing optional region prefixes and version suffixes via regex.

Apply this diff:

-    if (className === "BedrockChat") {
-      // The model name might be available in kwargs - cast to any to access kwargs
-      const llmAny = llm as any;
-      const modelId = llmAny.kwargs?.model || llmAny.kwargs?.model_id;
-      if (modelId && typeof modelId === "string") {
-        // Extract clean model name from full ID (e.g., "us.anthropic.claude-3-7-sonnet-20250219-v1:0" -> "claude-3-7-sonnet")
-        const parts = modelId.split(".");
-        if (parts.length >= 3) {
-          const modelPart = parts.slice(2).join(".").split(":")[0]; // Remove region and version
-          return modelPart.replace("-20250219-v1", ""); // Clean up version suffix
-        }
-        return modelId;
-      }
-    }
+    if (className === "BedrockChat") {
+      const llmAny = llm as any;
+      const modelId: string | undefined =
+        llmAny.kwargs?.model || llmAny.kwargs?.model_id;
+      if (typeof modelId === "string") {
+        // Strip optional region/vendor prefix (e.g., "us.anthropic.") and version suffixes like "-YYYYMMDD-vN"
+        // Examples:
+        // - us.anthropic.claude-3-7-sonnet-20250219-v1:0 -> claude-3-7-sonnet
+        // - anthropic.claude-3-5-sonnet-20240620-v1:0 -> claude-3-5-sonnet
+        // - cohere.command-r-plus-v1:0 -> command-r-plus
+        const afterVendor = modelId
+          .replace(/^[a-z]{2}\./i, "") // remove leading region like "us."
+          .replace(/^[^.]+\./, ""); // remove vendor like "anthropic." or "cohere."
+        const base = afterVendor.split(":")[0]; // drop ":0"
+        return base.replace(/-\d{8}-v\d+$/i, "").replace(/-v\d+$/i, "");
+      }
+    }

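Extracted as a standalone helper, the regex-based normalization in the diff above can be unit-tested directly. This is a sketch mirroring the suggested logic, not the code merged in the PR:

```typescript
// Mirrors the normalization suggested in the diff above, isolated for testing.
function normalizeBedrockModelId(modelId: string): string {
  const afterVendor = modelId
    .replace(/^[a-z]{2}\./i, "") // strip a leading region prefix like "us."
    .replace(/^[^.]+\./, ""); // strip the vendor prefix like "anthropic."
  const base = afterVendor.split(":")[0]; // drop the ":0" qualifier
  // Remove "-YYYYMMDD-vN" date/version suffixes, then any bare "-vN".
  return base.replace(/-\d{8}-v\d+$/i, "").replace(/-v\d+$/i, "");
}

console.log(
  normalizeBedrockModelId("us.anthropic.claude-3-7-sonnet-20250219-v1:0"),
); // claude-3-7-sonnet
```

Keeping the parsing in one pure function makes it trivial to add new Bedrock ID shapes to the test suite as AWS introduces them.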
142-151: Chat generations may expose content on message; add a safe fallback

For chat models, generation[0].text may be empty; prefer message.content when available.

Apply this diff:

-          span.setAttributes({
-            [`${SpanAttributes.LLM_COMPLETIONS}.${idx}.role`]: "assistant",
-            [`${SpanAttributes.LLM_COMPLETIONS}.${idx}.content`]: generation[0].text,
-          });
+          const g0: any = generation[0];
+          const content =
+            (g0.message && typeof g0.message.content === "string"
+              ? g0.message.content
+              : undefined) ?? g0.text;
+          if (content) {
+            span.setAttributes({
+              [`${SpanAttributes.LLM_COMPLETIONS}.${idx}.role`]: "assistant",
+              [`${SpanAttributes.LLM_COMPLETIONS}.${idx}.content`]: content,
+            });
+          }

249-258: Use semantic attribute constants consistently for `traceloop.*` keys

You already import SpanAttributes; prefer constants to avoid typos and ensure consistency.

Apply this diff:

-    span.setAttributes({
-      "traceloop.span.kind": "workflow",
-      "traceloop.workflow.name": runName || chainName,
-    });
+    span.setAttributes({
+      [SpanAttributes.TRACELOOP_SPAN_KIND]: "workflow",
+      [SpanAttributes.TRACELOOP_WORKFLOW_NAME]: runName || chainName,
+    });
-    if (this.traceContent) {
-      span.setAttributes({
-        "traceloop.entity.input": JSON.stringify(inputs),
-      });
-    }
+    if (this.traceContent) {
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_INPUT]: JSON.stringify(inputs),
+      });
+    }
-    span.setAttributes({
-      "traceloop.span.kind": "task",
-      "traceloop.entity.name": toolName,
-    });
+    span.setAttributes({
+      [SpanAttributes.TRACELOOP_SPAN_KIND]: "task",
+      [SpanAttributes.TRACELOOP_ENTITY_NAME]: toolName,
+    });
-    if (this.traceContent) {
-      span.setAttributes({
-        "traceloop.entity.input": JSON.stringify({ args: [input] }),
-      });
-    }
+    if (this.traceContent) {
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_INPUT]: JSON.stringify({
+          args: [input],
+        }),
+      });
+    }
-    if (this.traceContent) {
-      span.setAttributes({
-        "traceloop.entity.output": JSON.stringify(output),
-      });
-    }
+    if (this.traceContent) {
+      span.setAttributes({
+        [SpanAttributes.TRACELOOP_ENTITY_OUTPUT]: JSON.stringify(output),
+      });
+    }

Also applies to: 275-279, 319-327, 345-347

packages/instrumentation-langchain/test/instrumentation.test.ts (1)

433-494: Make the BedrockChat test deterministic by asserting the Bedrock span and removing fallback

Once BedrockInstrumentation is enabled, the test should reliably find the bedrock.completion span and its attributes. Remove the fallback to LangChain callback spans to avoid masking regressions.

Apply this diff:

-    // Look for LLM span created by Bedrock instrumentation
-    const llmSpan = spans.find(
-      (span) => span.attributes[SpanAttributes.LLM_SYSTEM] === "AWS",
-    );
-
-    if (llmSpan) {
-      // Test LLM span attributes like in amazon.test.ts
-      const attributes = llmSpan.attributes;
-      assert.strictEqual(attributes[SpanAttributes.LLM_SYSTEM], "AWS");
-      assert.strictEqual(
-        attributes[SpanAttributes.LLM_REQUEST_TYPE],
-        "completion",
-      );
-      assert.ok(attributes[SpanAttributes.LLM_REQUEST_MODEL]);
-      assert.strictEqual(
-        attributes[`${SpanAttributes.LLM_PROMPTS}.0.role`],
-        "user",
-      );
-      assert.strictEqual(
-        attributes[`${SpanAttributes.LLM_PROMPTS}.0.content`],
-        "What is a popular landmark in the most populous city in the US?",
-      );
-      assert.strictEqual(
-        attributes[`${SpanAttributes.LLM_COMPLETIONS}.0.role`],
-        "assistant",
-      );
-      assert.ok(attributes[`${SpanAttributes.LLM_COMPLETIONS}.0.content`]);
-      assert.ok(attributes[SpanAttributes.LLM_USAGE_PROMPT_TOKENS]);
-      assert.ok(attributes[SpanAttributes.LLM_USAGE_COMPLETION_TOKENS]);
-      assert.ok(attributes[SpanAttributes.LLM_USAGE_TOTAL_TOKENS]);
-    } else {
-      // Test LangChain callback handler spans - now only creates completion span  
-      const completionSpan = spans.find(
-        (span) => span.name === "bedrock.chat",
-      );
-
-      assert.ok(
-        completionSpan,
-        `No completion span found. Available spans: ${spans.map((s) => s.name).join(", ")}`,
-      );
-
-      // Test completion span attributes
-      const completionAttributes = completionSpan.attributes;
-      assert.strictEqual(
-        completionAttributes[SpanAttributes.LLM_SYSTEM],
-        "AWS",
-      );
-      assert.strictEqual(
-        completionAttributes[SpanAttributes.LLM_REQUEST_TYPE],
-        "completion",
-      );
-      assert.strictEqual(
-        completionAttributes[SpanAttributes.LLM_REQUEST_MODEL],
-        "claude-3-7-sonnet",
-      );
-      assert.ok(completionAttributes[SpanAttributes.LLM_USAGE_PROMPT_TOKENS]);
-      assert.ok(
-        completionAttributes[SpanAttributes.LLM_USAGE_COMPLETION_TOKENS],
-      );
-      assert.ok(completionAttributes[SpanAttributes.LLM_USAGE_TOTAL_TOKENS]);
-    }
+    // Look for Bedrock instrumentation span
+    const bedrockSpan = spans.find((span) => span.name === "bedrock.completion");
+    assert.ok(
+      bedrockSpan,
+      `No bedrock.completion span found. Spans: ${spans
+        .map((s) => s.name)
+        .join(", ")}`,
+    );
+    const attributes = bedrockSpan.attributes;
+    assert.strictEqual(attributes[SpanAttributes.LLM_SYSTEM], "AWS");
+    assert.strictEqual(attributes[SpanAttributes.LLM_REQUEST_TYPE], "completion");
+    assert.ok(attributes[SpanAttributes.LLM_REQUEST_MODEL]);
+    assert.strictEqual(
+      attributes[`${SpanAttributes.LLM_PROMPTS}.0.role`],
+      "user",
+    );
+    assert.strictEqual(
+      attributes[`${SpanAttributes.LLM_PROMPTS}.0.content`],
+      "What is a popular landmark in the most populous city in the US?",
+    );
+    assert.strictEqual(
+      attributes[`${SpanAttributes.LLM_COMPLETIONS}.0.role`],
+      "assistant",
+    );
+    assert.ok(attributes[`${SpanAttributes.LLM_COMPLETIONS}.0.content`]);
+    assert.ok(attributes[SpanAttributes.LLM_USAGE_PROMPT_TOKENS]);
+    assert.ok(attributes[SpanAttributes.LLM_USAGE_COMPLETION_TOKENS]);
+    assert.ok(attributes[SpanAttributes.LLM_USAGE_TOTAL_TOKENS]);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ed79220 and 1bcbeb5.

📒 Files selected for processing (3)
  • packages/instrumentation-langchain/src/callback_handler.ts (1 hunks)
  • packages/instrumentation-langchain/src/instrumentation.ts (1 hunks)
  • packages/instrumentation-langchain/test/instrumentation.test.ts (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/instrumentation-langchain/src/instrumentation.ts
🧰 Additional context used
🧬 Code Graph Analysis (2)
packages/instrumentation-langchain/test/instrumentation.test.ts (3)
packages/instrumentation-bedrock/src/instrumentation.ts (1)
  • BedrockInstrumentation (39-570)
packages/instrumentation-langchain/src/instrumentation.ts (1)
  • LangChainInstrumentation (25-124)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/instrumentation-langchain/src/callback_handler.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
🪛 GitHub Actions: CI
packages/instrumentation-langchain/test/instrumentation.test.ts

[error] 1-1: Prettier formatting check failed. 4 files have formatting issues (pnpm prettier --check .). Run 'pnpm prettier --write' to fix.

packages/instrumentation-langchain/src/callback_handler.ts

[error] 1-1: Prettier formatting check failed. 4 files have formatting issues (pnpm prettier --check .). Run 'pnpm prettier --write' to fix.

🔇 Additional comments (2)
packages/instrumentation-langchain/src/callback_handler.ts (1)

1-15: Prettier check is failing in CI; format the file

CI reports Prettier violations. Run pnpm prettier --write packages/instrumentation-langchain/src/callback_handler.ts (or pnpm prettier --write .) to resolve.

packages/instrumentation-langchain/test/instrumentation.test.ts (1)

1-1: Prettier check is failing in CI; format the tests

CI reports Prettier violations across multiple files. Run pnpm prettier --write packages/instrumentation-langchain/test/instrumentation.test.ts (or pnpm prettier --write .) to fix formatting.

Apply code formatting with prettier after linting fixes.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ellipsis-dev ellipsis-dev bot left a comment


Caution

Changes requested ❌

Reviewed 51b8557 in 1 minute and 55 seconds. Click for details.
  • Reviewed 354 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. packages/instrumentation-langchain/src/callback_handler.ts:155
  • Draft comment:
    The previous inline model extraction logic (which trimmed version suffixes) was removed. Confirm that relying solely on output.llmOutput for model name (without cleaning) meets BedrockChat expectations.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the author to confirm that a change in logic meets certain expectations. It is not making a specific code suggestion or asking for a test to be written. It is asking for confirmation, which violates the rules.
2. packages/instrumentation-langchain/src/callback_handler.ts:401
  • Draft comment:
    Vendor fallback naming is inconsistent. The final return value 'Langchain' may differ from expected capitalization ('LangChain'). Ensure consistent naming conventions.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While technically correct about the inconsistency, this seems like a very minor issue. The string is just used as an identifier for vendor attribution in traces. The capitalization difference likely has no functional impact. This feels more like a nitpick than a meaningful code quality issue. The comment correctly identifies a real inconsistency. Consistent naming can be important for maintainability and professionalism. However, this is an extremely minor cosmetic issue that doesn't affect functionality. Our rules say not to make comments that are obvious or unimportant. This comment should be removed as it addresses a trivial capitalization issue that doesn't meaningfully impact code quality or functionality.
3. packages/instrumentation-langchain/test/instrumentation.test.ts:441
  • Draft comment:
    Test assertions updated to expect 'chat' instead of 'completion'. Verify that this change accurately reflects the intended instrumentation behavior for chat models.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to verify the change, which violates the rule against asking for confirmation or verification of intention. It does not provide a specific suggestion or point out a clear issue with the code.

Workflow ID: wflow_1WUASnSxlBILaqix


_metadata?: Record<string, unknown>,
_runName?: string,
): Promise<void> {
console.log("📞 handleChatModelStart called!", {

Avoid using bare console.log for production debugging. Consider using a configurable logger or diagnostic module.

constructor(config: LangChainInstrumentationConfig = {}) {
super("@traceloop/instrumentation-langchain", version, config);

console.log("🚀 LangChain instrumentation constructor called");

Replace console.log statements with a proper logging/diagnostic mechanism to prevent unintended log output in production.

Suggested change
console.log("🚀 LangChain instrumentation constructor called");
this._diag.debug("🚀 LangChain instrumentation constructor called");

Remove temporary debug logging from instrumentation code.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@nirga nirga force-pushed the feat/langchain-callback-instrumentation branch from 8b2c52a to d4424b4 Compare August 17, 2025 21:14

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 8b2c52a in 48 seconds. Click for details.
  • Reviewed 63 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. packages/instrumentation-langchain/test/instrumentation.test.ts:439
  • Draft comment:
    Condensing the multi-line assert.strictEqual into a single line is fine, but ensure it aligns with the team's formatting preferences.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/instrumentation-langchain/test/instrumentation.test.ts:458
  • Draft comment:
    Simplified the arrow function for finding the completion span. This improves readability for such a simple callback.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
3. packages/sample-app/src/sample_langchain_bedrock.ts:5
  • Draft comment:
    Added trailing commas and adjusted whitespace for consistency. These changes enhance code style uniformity.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_5Ebf7HgkklEMsxcq


Add eslint disable comment to resolve TypeScript build error for require().

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed d4424b4 in 1 minute and 23 seconds. Click for details.
  • Reviewed 100 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
1. packages/instrumentation-langchain/src/callback_handler.ts:51
  • Draft comment:
    Removed the debug console.log from handleChatModelStart. This cleanup is good; if debug info is needed, consider using a proper diagnostic logger.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/instrumentation-langchain/src/instrumentation.ts:29
  • Draft comment:
    Removed console.log in the instrumentation constructor. This avoids noisy production logs. Use a dedicated logging/diagnostic system if debugging is needed.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/instrumentation-langchain/src/instrumentation.ts:83
  • Draft comment:
    Removed the debug log inside the patched _configureSync method. This is a clean-up to prevent polluting output; ensure any needed debug info is routed through your diagnostic tools.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
4. packages/instrumentation-langchain/test/instrumentation.test.ts:461
  • Draft comment:
    Minor formatting changes and consolidation of assertions improve readability. No functional issues detected.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
5. packages/sample-app/src/sample_langchain_bedrock.ts:16
  • Draft comment:
    Minor formatting improvements (e.g. trailing commas and consistent object literals) enhance code consistency.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_k3jdt9nCdKIEC0e4



@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed aa7ed30 in 48 seconds. Click for details.
  • Reviewed 12 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
1. packages/instrumentation-langchain/src/instrumentation.ts:55
  • Draft comment:
    Dynamic require loading is acceptable here for auto-instrumentation, but consider adding a short note explaining why a static import isn't used. This inline eslint-disable comment is okay, but additional context could help future maintainers.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_zKqTmikEh58H4BUh


Add @types/node dependency and eslint disable for require() usage to fix TypeScript compilation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 7506cc4 in 48 seconds. Click for details.
  • Reviewed 12 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
1. packages/instrumentation-langchain/package.json:60
  • Draft comment:
    Ensure the added '@types/node' dev dependency (^24.0.15) is intentional. Verify that it aligns with the Node engine requirement (>=14) and doesn't introduce type conflicts with libraries expecting older Node types.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_qnlQWcI4H7MkHVc7


Update lockfile to include @types/node dependency for TypeScript builds.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@ellipsis-dev ellipsis-dev bot left a comment


Skipped PR review on 656265d because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
packages/instrumentation-langchain/src/instrumentation.ts (2)

55-64: Emit diagnostics for missing CallbackManager and require failures; improve observability

Add debug/warn logs when patching succeeds, when CallbackManager is absent, and when require fails, so troubleshooting is easier. This also avoids empty-catch (no-empty) lint smells.

   private instrumentCallbackManagerDirectly() {
     try {
       // eslint-disable-next-line @typescript-eslint/no-var-requires
       const callbackManagerModule = require("@langchain/core/callbacks/manager");
 
-      if (callbackManagerModule?.CallbackManager) {
-        this.patchCallbackManager(callbackManagerModule.CallbackManager);
-      }
+      if (callbackManagerModule?.CallbackManager) {
+        this._diag.debug(
+          "Patching @langchain/core/callbacks/manager.CallbackManager",
+        );
+        this.patchCallbackManager(callbackManagerModule.CallbackManager);
+      } else {
+        this._diag.warn(
+          "CallbackManager not found on @langchain/core/callbacks/manager; skipping LangChain callback instrumentation",
+        );
+      }
     } catch (error) {
-      this._diag.debug("Error instrumenting callback manager:", error);
+      this._diag.warn(
+        "Failed to require @langchain/core/callbacks/manager; LangChain callback instrumentation not applied",
+        error as Error,
+      );
     }
   }

67-110: Deduplicate handler injection and remove this-alias; add diagnostics and fallback

As written, each _configureSync call appends another TraceloopCallbackHandler, causing duplicate spans and potential memory growth, and it aliases `this` to `self`. Deduplicate by checking for an existing handler, avoid the `this` alias, add diagnostics, and keep a reference to the original function for unpatching.

   private patchCallbackManager(CallbackManager: unknown) {
     const callbackManagerAny = CallbackManager as Record<string, unknown>;
 
     if (
       callbackManagerAny._configureSync &&
       !callbackManagerAny._traceloopPatched
     ) {
-      const originalConfigureSync = callbackManagerAny._configureSync;
-
-      // eslint-disable-next-line @typescript-eslint/no-this-alias
-      const self = this;
-      callbackManagerAny._configureSync = function (
+      const originalConfigureSync = callbackManagerAny._configureSync as (...args: unknown[]) => unknown;
+      const originalSym = Symbol.for("traceloop.originalConfigureSync");
+      (callbackManagerAny as any)[originalSym] = originalConfigureSync;
+      const getTracer = () => this.tracer;
+      const shouldSendPrompts = () => this._shouldSendPrompts();
+      const diag = this._diag;
+      callbackManagerAny._configureSync = function (
         inheritableHandlers?: unknown[],
         localHandlers?: unknown[],
         inheritableTags?: string[],
         localTags?: string[],
         inheritableMetadata?: Record<string, unknown>,
         localMetadata?: Record<string, unknown>,
       ) {
-
-        // Add our callback handler to inheritable handlers
-        const callbackHandler = new TraceloopCallbackHandler(
-          self.tracer,
-          self._shouldSendPrompts(),
-        );
-        const updatedInheritableHandlers =
-          inheritableHandlers && Array.isArray(inheritableHandlers)
-            ? [...inheritableHandlers, callbackHandler]
-            : [callbackHandler];
-
-        return (originalConfigureSync as (...args: unknown[]) => unknown).call(
-          this,
-          updatedInheritableHandlers,
-          localHandlers,
-          inheritableTags,
-          localTags,
-          inheritableMetadata,
-          localMetadata,
-        );
+        try {
+          const baseHandlers = Array.isArray(inheritableHandlers) ? inheritableHandlers as any[] : [];
+          const alreadyInjected = baseHandlers.some(
+            (h: any) =>
+              h?.constructor?.name === "TraceloopCallbackHandler" || h?.__traceloop === true,
+          );
+          const handlers = alreadyInjected
+            ? baseHandlers
+            : (() => {
+                const h = new TraceloopCallbackHandler(getTracer(), shouldSendPrompts()) as any;
+                h.__traceloop = true;
+                diag.debug?.("Injected TraceloopCallbackHandler into LangChain CallbackManager");
+                return [...baseHandlers, h];
+              })();
+          return (originalConfigureSync as (...args: unknown[]) => unknown).call(
+            this,
+            handlers,
+            localHandlers,
+            inheritableTags,
+            localTags,
+            inheritableMetadata,
+            localMetadata,
+          );
+        } catch (e) {
+          diag.warn?.(
+            "Traceloop CallbackManager _configureSync wrapper failed; falling back to original",
+            e as Error,
+          );
+          return (originalConfigureSync as (...args: unknown[]) => unknown).call(
+            this,
+            inheritableHandlers,
+            localHandlers,
+            inheritableTags,
+            localTags,
+            inheritableMetadata,
+            localMetadata,
+          );
+        }
       };
 
       // Mark as patched to avoid double patching
       callbackManagerAny._traceloopPatched = true;
+      this._diag.debug("Traceloop CallbackManager patch applied");
+    } else if (callbackManagerAny._traceloopPatched) {
+      this._diag.debug("Traceloop CallbackManager was already patched; skipping");
+    } else {
+      this._diag.warn("CallbackManager._configureSync not found; cannot apply Traceloop patch");
     }
   }
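The wrap-once, deduplicate, and restore pattern proposed in the diff above can be exercised independently of LangChain. The sketch below is illustrative only: `Manager` and `TracingHandler` are hypothetical stand-ins for `CallbackManager` and `TraceloopCallbackHandler`, not the real classes.

```typescript
// Hypothetical stand-ins for CallbackManager and TraceloopCallbackHandler.
class TracingHandler {
  __traceloop = true; // marker checked to avoid injecting twice
}

class Manager {
  static _configureSync(handlers: unknown[]): unknown[] {
    return handlers;
  }
}

const ORIGINAL = Symbol.for("traceloop.originalConfigureSync");

function patch(cls: any): void {
  if (cls._traceloopPatched) return; // wrap at most once
  cls[ORIGINAL] = cls._configureSync; // keep the original for unpatching
  cls._configureSync = function (this: any, handlers?: unknown[]): unknown[] {
    const base = Array.isArray(handlers) ? handlers : [];
    const injected = base.some((h: any) => h?.__traceloop === true);
    const next = injected ? base : [...base, new TracingHandler()];
    return cls[ORIGINAL].call(this, next);
  };
  cls._traceloopPatched = true;
}

function unpatch(cls: any): void {
  if (!cls._traceloopPatched) return;
  cls._configureSync = cls[ORIGINAL]; // restore the saved original
  delete cls._traceloopPatched;
  delete cls[ORIGINAL];
}

patch(Manager);
patch(Manager); // second call is a no-op thanks to the patched flag
const once = Manager._configureSync([]); // one handler injected
const twice = Manager._configureSync(once); // already present: not re-injected
unpatch(Manager);
const after = Manager._configureSync([]); // plain passthrough restored
```

The `Symbol.for` key is what makes `disable()`-style unpatching possible later without keeping module-level state.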
packages/instrumentation-langchain/test/instrumentation.test.ts (1)

87-89: Enable Bedrock instrumentation so LLM spans are created deterministically

You set the tracer provider but never call enable(), so AWS client patching won't be applied and the test silently falls back to callback spans. Enable it to validate Bedrock LLM spans when available.

   bedrockInstrumentation = new BedrockInstrumentation();
   bedrockInstrumentation.setTracerProvider(provider);
+  bedrockInstrumentation.enable();
🧹 Nitpick comments (3)
packages/sample-app/src/sample_langchain_bedrock.ts (2)

24-27: Nit: Log only the model’s message content for cleaner sample output

The full response object can be noisy. Consider logging the textual content for readability.

-  console.log("Response received:", response);
+  console.log("Response received:", response?.content);

28-30: Optional: Prefer explicit flush/shutdown over fixed sleep

If the SDK exposes a flush/shutdown API, prefer that over a fixed timeout to reduce test flakiness and speed up samples.

Is there a documented flush/shutdown method in @traceloop/node-server-sdk that ensures spans are exported without arbitrary delays?
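Whether @traceloop/node-server-sdk documents such a method is not confirmed here, but the general pattern is worth showing: await a flush promise that resolves when the export batch is drained, rather than sleeping a fixed interval. `ToyExporter` below is a hypothetical stand-in for a span exporter, not a real SDK API.

```typescript
// Hypothetical exporter standing in for the SDK's span export pipeline.
class ToyExporter {
  private pending: string[] = [];

  export(spanName: string): void {
    this.pending.push(spanName);
  }

  async forceFlush(): Promise<string[]> {
    await new Promise((resolve) => setTimeout(resolve, 10)); // simulated I/O
    const sent = this.pending; // drain the batch once the "network" settles
    this.pending = [];
    return sent;
  }
}

async function runSample(): Promise<string[]> {
  const exporter = new ToyExporter();
  exporter.export("bedrock.chat");
  // Resolves exactly when the batch is sent, instead of sleeping a
  // fixed timeout and hoping the export happened to finish in time.
  return exporter.forceFlush();
}
```

OpenTelemetry tracer providers expose a comparable `forceFlush()`; if the Traceloop SDK surfaces it, the sample could await that before exiting.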

packages/instrumentation-langchain/test/instrumentation.test.ts (1)

419-427: Use the tool-bound model instance returned from bindTools

bindTools returns a bound Runnable/model; it does not mutate in place. Use the returned instance to ensure tools are actually bound for the invocation.

-    model.bindTools([get_cities_data_by_country]);
+    const modelWithTools = model.bindTools([get_cities_data_by_country]);
@@
-    const response = await model.invoke([message]);
+    const response = await modelWithTools.invoke([message]);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 51b8557 and 656265d.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
  • packages/instrumentation-langchain/package.json (2 hunks)
  • packages/instrumentation-langchain/src/callback_handler.ts (1 hunks)
  • packages/instrumentation-langchain/src/instrumentation.ts (1 hunks)
  • packages/instrumentation-langchain/test/instrumentation.test.ts (10 hunks)
  • packages/sample-app/src/sample_langchain_bedrock.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/instrumentation-langchain/package.json
  • packages/instrumentation-langchain/src/callback_handler.ts
🧰 Additional context used
🧬 Code Graph Analysis (2)
packages/instrumentation-langchain/test/instrumentation.test.ts (4)
packages/instrumentation-bedrock/src/instrumentation.ts (1)
  • BedrockInstrumentation (39-570)
packages/instrumentation-langchain/src/instrumentation.ts (1)
  • LangChainInstrumentation (25-126)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
packages/instrumentation-langchain/src/instrumentation.ts (2)
packages/instrumentation-langchain/src/types.ts (1)
  • LangChainInstrumentationConfig (3-14)
packages/instrumentation-langchain/src/callback_handler.ts (1)
  • TraceloopCallbackHandler (31-526)
🪛 GitHub Actions: CI
packages/instrumentation-langchain/src/instrumentation.ts

[warning] 1-1: Prettier formatting issues detected in this file. Run 'pnpm prettier --write' to fix.

🔇 Additional comments (4)
packages/sample-app/src/sample_langchain_bedrock.ts (1)

11-13: LGTM: Messages-based BedrockChat usage is correct

Importing BedrockChat and HumanMessage and invoking with an array of messages aligns with LangChain’s current chat model API.

packages/instrumentation-langchain/src/instrumentation.ts (1)

1-1: Formatting issues resolved

packages/instrumentation-langchain/src/instrumentation.ts now passes Prettier’s check and CI formatting errors are cleared.

packages/instrumentation-langchain/test/instrumentation.test.ts (2)

153-156: LGTM: AgentExecutor migration is clean

Switching to AgentExecutor usage looks correct and the workflow span assertions align with the new callback-based instrumentation.


430-489: Good fallback assertions for either Bedrock LLM spans or callback completion spans

The test robustly validates either path (AWS LLM span via Bedrock instrumentation or bedrock.chat via callback handler) and checks key attributes including usage. Nice.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 06abd38 in 1 minute and 1 second.
  • Reviewed 20 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. packages/instrumentation-langchain/src/instrumentation.ts:28
  • Draft comment:
    Extraneous blank line removed in the constructor. Confirm that this spacing change aligns with the project's formatting guidelines.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/instrumentation-langchain/src/instrumentation.ts:83
  • Draft comment:
    Extraneous blank line removed before the callback injection code. Verify consistency with the overall code style.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_6mp4z7orPasCx2Ts

Fix Cohere response attribute extraction to access nested generations array
for finish_reason and text fields.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@nirga nirga force-pushed the feat/langchain-callback-instrumentation branch from 06abd38 to 6af4c33 Compare August 17, 2025 22:03
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 6af4c33 in 59 seconds.
  • Reviewed 92 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
1. packages/instrumentation-bedrock/src/instrumentation.ts:351
  • Draft comment:
    Formatting improvement: The refactor for assigning promptAttributes improves readability. Ensure consistent indentation for the ternary operator across similar blocks.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/instrumentation-bedrock/src/instrumentation.ts:491
  • Draft comment:
    Logic change in cohere response extraction: Using optional chaining (response["generations"]?.[0]?.["finish_reason"]) assumes the new structure. Verify that this change covers legacy responses or add a fallback if necessary.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/instrumentation-bedrock/src/instrumentation.ts:506
  • Draft comment:
    Minor formatting update for token usage attributes: The split across multiple lines is clear and maintains functionality.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
4. packages/instrumentation-langchain/src/instrumentation.ts:66
  • Draft comment:
    Patching the internal _configureSync method with a _traceloopPatched flag is a practical workaround. Keep in mind that relying on non-public APIs (like _configureSync) may break with future LangChain updates.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. packages/instrumentation-langchain/src/instrumentation.ts:111
  • Draft comment:
    The helper _shouldSendPrompts method is consistent with its usage across modules. No issues noted.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_P3VAmtjzHBuOkbJz

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (3)
packages/instrumentation-langchain/src/instrumentation.ts (3)

53-64: Add diagnostics for missing export and require failure; avoid silent skips

Currently, a missing CallbackManager results in a silent no-op and require errors are logged only at debug level. Emit clear debug/warn logs to aid support.

Apply this diff:

   private instrumentCallbackManagerDirectly() {
     try {
       // eslint-disable-next-line @typescript-eslint/no-var-requires
       const callbackManagerModule = require("@langchain/core/callbacks/manager");
 
-      if (callbackManagerModule?.CallbackManager) {
-        this.patchCallbackManager(callbackManagerModule.CallbackManager);
-      }
+      if (callbackManagerModule?.CallbackManager) {
+        this._diag.debug(
+          "Patching @langchain/core/callbacks/manager.CallbackManager",
+        );
+        this.patchCallbackManager(callbackManagerModule.CallbackManager);
+      } else {
+        this._diag.warn(
+          "CallbackManager not found on @langchain/core/callbacks/manager; skipping LangChain callback instrumentation",
+        );
+      }
     } catch (error) {
-      this._diag.debug("Error instrumenting callback manager:", error);
+      this._diag.warn(
+        "Failed to require @langchain/core/callbacks/manager; LangChain callback instrumentation not applied",
+        error as Error,
+      );
     }
   }

31-33: Avoid constructor-time patching; respect enable/disable lifecycle (add diagnostics if you keep it)

Patching in the constructor makes the instrumentation always-on once the class is constructed and prevents proper unpatching on disable(). At minimum, add diagnostics here; ideally move patching to enable() and implement disable() to unpatch.

Apply this diff to add diagnostics now:

-    // Manually instrument CallbackManager immediately since module detection doesn't work
-    this.instrumentCallbackManagerDirectly();
+    // Manually instrument CallbackManager immediately since module detection doesn't work
+    this._diag.debug(
+      "LangChain instrumentation: attempting to patch CallbackManager from constructor",
+    );
+    this.instrumentCallbackManagerDirectly();

Additionally, implement proper lifecycle methods (outside this range) so users/tests can enable/disable cleanly:

// Add inside LangChainInstrumentation
public override enable(): void {
  super.enable();
  this._diag.debug(
    "LangChain instrumentation: enable() called; ensuring CallbackManager is patched",
  );
  this.instrumentCallbackManagerDirectly();
}

public override disable(): void {
  try {
    // eslint-disable-next-line @typescript-eslint/no-var-requires
    const callbackManagerModule = require("@langchain/core/callbacks/manager");
    const CallbackManager: any = callbackManagerModule?.CallbackManager;
    if (CallbackManager && CallbackManager._traceloopPatched) {
      const originalSym = Symbol.for("traceloop.originalConfigureSync");
      if (CallbackManager[originalSym]) {
        CallbackManager._configureSync = CallbackManager[originalSym];
      }
      delete CallbackManager._traceloopPatched;
      delete CallbackManager[originalSym];
      this._diag.debug(
        "LangChain instrumentation: unpatched CallbackManager._configureSync",
      );
    } else {
      this._diag.debug(
        "LangChain instrumentation: disable() called but CallbackManager was not patched",
      );
    }
  } catch (e) {
    this._diag.debug(
      "LangChain instrumentation: failed to unpatch CallbackManager on disable()",
      e as Error,
    );
  } finally {
    super.disable();
  }
}

66-109: Fix duplicate handler injection, remove this-alias, and add diagnostics/unpatch support

As written, every _configureSync call unconditionally appends a new TraceloopCallbackHandler, which can lead to duplicate handlers and duplicated spans for nested or LangChain-internal calls. It also aliases `this` (flagged by eslint @typescript-eslint/no-this-alias). Add de-duplication, store the original method for unpatching, and log patch status.

Apply this diff:

   private patchCallbackManager(CallbackManager: unknown) {
     const callbackManagerAny = CallbackManager as Record<string, unknown>;
 
     if (
       callbackManagerAny._configureSync &&
       !callbackManagerAny._traceloopPatched
     ) {
-      const originalConfigureSync = callbackManagerAny._configureSync;
-
-      // eslint-disable-next-line @typescript-eslint/no-this-alias
-      const self = this;
-      callbackManagerAny._configureSync = function (
+      const originalConfigureSync = callbackManagerAny._configureSync as (...args: unknown[]) => unknown;
+      const diag = this._diag;
+      const getTracer = () => this.tracer;
+      const shouldSendPrompts = () => this._shouldSendPrompts();
+      // Preserve original for unpatching later
+      const originalSym = Symbol.for("traceloop.originalConfigureSync");
+      if (!callbackManagerAny[originalSym]) {
+        callbackManagerAny[originalSym] = originalConfigureSync;
+      }
+      callbackManagerAny._configureSync = function (
         inheritableHandlers?: unknown[],
         localHandlers?: unknown[],
         inheritableTags?: string[],
         localTags?: string[],
         inheritableMetadata?: Record<string, unknown>,
         localMetadata?: Record<string, unknown>,
       ) {
-        // Add our callback handler to inheritable handlers
-        const callbackHandler = new TraceloopCallbackHandler(
-          self.tracer,
-          self._shouldSendPrompts(),
-        );
-        const updatedInheritableHandlers =
-          inheritableHandlers && Array.isArray(inheritableHandlers)
-            ? [...inheritableHandlers, callbackHandler]
-            : [callbackHandler];
-
-        return (originalConfigureSync as (...args: unknown[]) => unknown).call(
-          this,
-          updatedInheritableHandlers,
-          localHandlers,
-          inheritableTags,
-          localTags,
-          inheritableMetadata,
-          localMetadata,
-        );
+        try {
+          const baseHandlers = Array.isArray(inheritableHandlers) ? (inheritableHandlers as any[]) : [];
+          const locals = Array.isArray(localHandlers) ? (localHandlers as any[]) : [];
+          const alreadyInjected =
+            baseHandlers.some(
+              (h: any) =>
+                h?.constructor?.name === "TraceloopCallbackHandler" || h?.__traceloop === true,
+            ) ||
+            locals.some(
+              (h: any) =>
+                h?.constructor?.name === "TraceloopCallbackHandler" || h?.__traceloop === true,
+            );
+          const handlers = alreadyInjected
+            ? baseHandlers
+            : (() => {
+                const h: any = new TraceloopCallbackHandler(getTracer(), shouldSendPrompts());
+                h.__traceloop = true;
+                return [...baseHandlers, h];
+              })();
+          return (originalConfigureSync as (...args: unknown[]) => unknown).call(
+            this,
+            handlers,
+            localHandlers,
+            inheritableTags,
+            localTags,
+            inheritableMetadata,
+            localMetadata,
+          );
+        } catch (e) {
+          diag.warn(
+            "Traceloop CallbackManager _configureSync wrapper failed; falling back to original",
+            e as Error,
+          );
+          return (originalConfigureSync as (...args: unknown[]) => unknown).call(
+            this,
+            inheritableHandlers,
+            localHandlers,
+            inheritableTags,
+            localTags,
+            inheritableMetadata,
+            localMetadata,
+          );
+        }
       };
 
       // Mark as patched to avoid double patching
       callbackManagerAny._traceloopPatched = true;
-    }
+      this._diag.debug("Traceloop CallbackManager patch applied");
+    } else if (callbackManagerAny._traceloopPatched) {
+      this._diag.debug("Traceloop CallbackManager was already patched; skipping");
+    } else {
+      this._diag.warn("CallbackManager._configureSync not found; cannot apply Traceloop patch");
+    }
   }
🧹 Nitpick comments (3)
packages/instrumentation-langchain/src/instrumentation.ts (3)

36-46: Harden manualInstrumentation with guards and diagnostics

If the provided module lacks CallbackManager, we silently skip. Add a guard and warn to aid troubleshooting.

Apply this diff:

-  }: {
-    callbackManagerModule?: any;
-  }) {
-    if (callbackManagerModule) {
-      this._diag.debug(
-        "Manually instrumenting @langchain/core/callbacks/manager",
-      );
-      this.patchCallbackManager(callbackManagerModule.CallbackManager);
-    }
+  }: { callbackManagerModule?: any }) {
+    const cm = callbackManagerModule?.CallbackManager;
+    if (cm) {
+      this._diag.debug(
+        "Manually instrumenting @langchain/core/callbacks/manager",
+      );
+      this.patchCallbackManager(cm);
+    } else {
+      this._diag.warn(
+        "manualInstrument: CallbackManager not found on provided module; skipping",
+      );
+    }

49-51: init() returning [] undermines module-based auto-patching; prefer lifecycle-driven patching

Returning an empty module list forces constructor-based patching and bypasses the normal instrumentation lifecycle. Move patching into enable() and support unpatch in disable() (see constructor comment), or consider supplying an InstrumentationModuleDefinition for @langchain/core/callbacks/manager if feasible.

I can sketch a minimal module definition using onRequire if you want to try restoring module-based patching.


86-89: Trace-content flag is captured at config-time; consider evaluating per-event

Calling _shouldSendPrompts() when constructing the handler captures a single boolean for the lifetime of that handler. If you intend CONTEXT_KEY_ALLOW_TRACE_CONTENT to toggle per-run, the evaluation should happen inside the handler’s event methods rather than once here.

If helpful, I can propose a minimal change to TraceloopCallbackHandler to query context.active() at each handle* instead of relying on a constructor boolean.
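The construction-time versus per-event distinction can be shown with a toy handler. Everything below is hypothetical: `allowTraceContent` stands in for a context-scoped setting like CONTEXT_KEY_ALLOW_TRACE_CONTENT, and neither handler class is the real TraceloopCallbackHandler.

```typescript
// Mutable stand-in for a context-scoped trace-content setting.
let allowTraceContent = true;

class FrozenHandler {
  // Captured once at construction; never re-read afterwards.
  private readonly send: boolean = allowTraceContent;

  onLLMStart(prompt: string): string | undefined {
    return this.send ? prompt : undefined;
  }
}

class LiveHandler {
  // Holds a getter instead of a value, so the flag is read per event.
  constructor(private readonly shouldSend: () => boolean) {}

  onLLMStart(prompt: string): string | undefined {
    return this.shouldSend() ? prompt : undefined;
  }
}

const frozen = new FrozenHandler();
const live = new LiveHandler(() => allowTraceContent);

allowTraceContent = false; // toggled after both handlers were constructed
const frozenResult = frozen.onLLMStart("user prompt"); // stale capture: still recorded
const liveResult = live.onLLMStart("user prompt"); // respects the toggle: dropped
```

Passing a `() => boolean` (or querying `context.active()` inside each handle* method) is the minimal change that makes the flag honor per-run toggling.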

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 656265d and 06abd38.

📒 Files selected for processing (1)
  • packages/instrumentation-langchain/src/instrumentation.ts (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/instrumentation-langchain/src/instrumentation.ts (2)
packages/instrumentation-langchain/src/types.ts (1)
  • LangChainInstrumentationConfig (3-14)
packages/instrumentation-langchain/src/callback_handler.ts (1)
  • TraceloopCallbackHandler (31-526)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and test
🔇 Additional comments (1)
packages/instrumentation-langchain/src/instrumentation.ts (1)

22-22: Import of TraceloopCallbackHandler is appropriate

Good addition. This aligns the instrumentation with the callback-based approach and centralizes span logic in one handler.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
packages/instrumentation-bedrock/src/instrumentation.ts (1)

209-281: Critical: Incorrect body type check breaks stream vs non-stream parsing

instanceof Object.getPrototypeOf(Uint8Array) tests against the hidden %TypedArray% intrinsic rather than Uint8Array itself; this obscure, fragile check can misclassify stream vs non-stream bodies, leading to runtime errors when iterating. Detect an AsyncIterable via Symbol.asyncIterator and a byte payload via instanceof Uint8Array directly.

Apply this diff:

-          if (!(result.body instanceof Object.getPrototypeOf(Uint8Array))) {
-            const rawRes = result.body as AsyncIterable<bedrock.ResponseStream>;
+          // Streaming response: AsyncIterable (InvokeModelWithResponseStream)
+          if (
+            result.body &&
+            typeof (result.body as any)[Symbol.asyncIterator] === "function"
+          ) {
+            const rawRes =
+              result.body as AsyncIterable<bedrock.ResponseStream>;
             let streamedContent = "";
             for await (const value of rawRes) {
               // Convert it to a JSON String
               const jsonString = new TextDecoder().decode(value.chunk?.bytes);
               // Parse the JSON string
               const parsedResponse = JSON.parse(jsonString);
@@
               span.setAttributes(responseAttributes);
             }
-          } else if (result.body instanceof Object.getPrototypeOf(Uint8Array)) {
+          // Non-streaming response: Uint8Array (InvokeModel)
+          } else if (result.body instanceof Uint8Array) {
             // Convert it to a JSON String
             const jsonString = new TextDecoder().decode(
               result.body as Uint8Array,
             );
             // Parse the JSON string
             const parsedResponse = JSON.parse(jsonString);
@@
-            span.setAttributes(responseAttributes);
+            span.setAttributes(responseAttributes);
+          } else if (typeof result.body === "string") {
+            // Fallback: body is already a string
+            const parsedResponse = JSON.parse(result.body);
+            const responseAttributes = this._setResponseAttributes(
+              modelVendor,
+              parsedResponse,
+            );
+            span.setAttributes(responseAttributes);
           }
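The recommended checks can be exercised in isolation. The classifier and the fake stream below are illustrative sketches, not AWS SDK types; a real InvokeModelWithResponseStream body is an SDK async iterable of chunk events, and a real InvokeModel body is a Uint8Array subclass.

```typescript
function classifyBody(
  body: unknown,
): "stream" | "bytes" | "string" | "unknown" {
  // Streaming: anything exposing Symbol.asyncIterator is iterable with for-await.
  if (body && typeof (body as any)[Symbol.asyncIterator] === "function") {
    return "stream";
  }
  // Non-streaming: a raw byte payload (Uint8Array subclasses also match).
  if (body instanceof Uint8Array) {
    return "bytes";
  }
  // Fallback: an already-decoded JSON string.
  if (typeof body === "string") {
    return "string";
  }
  return "unknown";
}

// Hypothetical stream: a plain async generator, not an AWS SDK type.
async function* fakeStream(): AsyncGenerator<Uint8Array> {
  yield new Uint8Array([123, 125]); // "{}"
}

const streamKind = classifyBody(fakeStream());
const bytesKind = classifyBody(new Uint8Array([123, 125]));
const stringKind = classifyBody("{}");
```

Ordering matters: the async-iterator check must come first, since a byte array is (synchronously) iterable but never async-iterable, while some SDK stream wrappers are not typed arrays at all.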
♻️ Duplicate comments (1)
packages/instrumentation-bedrock/src/instrumentation.ts (1)

353-360: Remove explicit any in messages loop (keeps types and fixes lint).

Apply this diff:

-          requestBody["messages"].forEach((message: any, index: number) => {
-            promptAttributes[`${SpanAttributes.LLM_PROMPTS}.${index}.role`] =
-              message.role;
-            promptAttributes[`${SpanAttributes.LLM_PROMPTS}.${index}.content`] =
-              typeof message.content === "string"
-                ? message.content
-                : JSON.stringify(message.content);
-          });
+          (requestBody["messages"] as Array<{
+            role: string;
+            content: unknown;
+          }>).forEach((message, index) => {
+            promptAttributes[`${SpanAttributes.LLM_PROMPTS}.${index}.role`] =
+              message.role;
+            const content =
+              typeof message.content === "string"
+                ? message.content
+                : JSON.stringify(message.content);
+            promptAttributes[`${SpanAttributes.LLM_PROMPTS}.${index}.content`] =
+              content;
+          });
🧹 Nitpick comments (3)
packages/instrumentation-bedrock/src/instrumentation.ts (3)

151-183: Optional: Use chat span name for message-based requests

When Anthropic “messages” are used, prefer span name bedrock.chat for parity with Python and clearer semantics.

Apply this diff:

-    let attributes: Attributes = {};
+    let attributes: Attributes = {};
+    let spanName = "bedrock.completion";
@@
-      if (typeof input.body === "string") {
+      if (typeof input.body === "string") {
         const requestBody = JSON.parse(input.body);
+        if (
+          modelVendor === "anthropic" &&
+          Array.isArray(requestBody["messages"])
+        ) {
+          spanName = "bedrock.chat";
+          // Optionally, also set LLM_REQUEST_TYPE to CHAT if available
+          // attributes[SpanAttributes.LLM_REQUEST_TYPE] = LLMRequestTypeValues.CHAT;
+        }
@@
-    return this.tracer.startSpan(`bedrock.completion`, {
+    return this.tracer.startSpan(spanName, {
       kind: SpanKind.CLIENT,
       attributes,
     });

370-373: Nit: use global replacements when stripping Human/Assistant markers

Current replace only removes the first occurrence.

Apply this diff:

-              .replace("\n\nHuman:", "")
-              .replace("\n\nAssistant:", ""),
+              .replace(/\n\nHuman:/g, "")
+              .replace(/\n\nAssistant:/g, ""),

473-476: Avoid any in response content mapping; handle union safely

Type the content items and avoid relying on truthiness of c.text.

Apply this diff:

-          const content = Array.isArray(response["content"])
-            ? response["content"].map((c: any) => c.text || c).join("")
-            : response["content"];
+          const content = Array.isArray(response["content"])
+            ? (response["content"] as Array<{ text?: string } | string>)
+                .map((c) => (typeof c === "string" ? c : c.text ?? ""))
+                .join("")
+            : (response["content"] as string);
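The proposed mapping can be exercised in isolation. This is a sketch with a standalone joinContent helper (a hypothetical name), not the instrumentation code itself:

```typescript
// Content items may be plain strings or objects carrying an optional text field.
type ContentItem = { text?: string } | string;

// String items pass through, object items contribute their text, and a missing
// text becomes "" instead of leaking "[object Object]" into the joined string.
function joinContent(content: ContentItem[] | string): string {
  return Array.isArray(content)
    ? content.map((c) => (typeof c === "string" ? c : c.text ?? "")).join("")
    : content;
}
```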
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06abd38 and 6af4c33.

📒 Files selected for processing (2)
  • packages/instrumentation-bedrock/src/instrumentation.ts (3 hunks)
  • packages/instrumentation-langchain/src/instrumentation.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/instrumentation-langchain/src/instrumentation.ts
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/instrumentation-bedrock/src/instrumentation.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
🔇 Additional comments (3)
packages/instrumentation-bedrock/src/instrumentation.ts (3)

342-344: Good fallback for Anthropic max_tokens

Using max_tokens_to_sample || max_tokens covers both legacy and messages APIs.
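As a sketch, the fallback reads as follows (the body interface is an assumption for illustration, not the SDK's types):

```typescript
// Legacy text-completions bodies use max_tokens_to_sample; the newer
// messages API uses max_tokens. Illustrative shape only.
interface AnthropicBody {
  max_tokens_to_sample?: number;
  max_tokens?: number;
}

function extractMaxTokens(body: AnthropicBody): number | undefined {
  // Same fallback as the reviewed code: prefer the legacy field when present.
  return body.max_tokens_to_sample || body.max_tokens;
}
```

Note that || treats an explicit 0 as absent (?? would preserve it), though 0 is not a meaningful max-token value in practice.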


390-391: Cohere prompt fallback looks good

Using message || prompt covers both input styles safely.


494-522: Cohere: token usage and safer field access LGTM

  • finish_reason via generations[0].finish_reason
  • content via generations[0].text
  • billed_units surfaced as usage tokens

This aligns with provider conventions.
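A sketch of the shape being read, with field names taken from the bullet list above (the response interface is an assumption, not Cohere's published type):

```typescript
// Illustrative Cohere response shape; only the fields the review names.
interface CohereResponse {
  generations?: Array<{ text?: string; finish_reason?: string }>;
  meta?: { billed_units?: { input_tokens?: number; output_tokens?: number } };
}

// Pulls content, finish reason, and billed-unit token counts with optional
// chaining so partially populated responses do not throw.
function extractCohereUsage(response: CohereResponse) {
  return {
    content: response.generations?.[0]?.text,
    finishReason: response.generations?.[0]?.finish_reason,
    promptTokens: response.meta?.billed_units?.input_tokens,
    completionTokens: response.meta?.billed_units?.output_tokens,
  };
}
```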

Contributor

@avivhalfon avivhalfon left a comment


Hope it's working.
Some functionality in the prompt and completion logic could be DRYed up, but it's fine.

Also, consider handling cache tokens, maybe in a different PR.

Comment thread packages/instrumentation-langchain/src/callback_handler.ts Outdated
});

// Add usage metrics if available
if (output.llmOutput?.usage) {

Maybe in a different PR, but we should also handle cache tokens, I guess.
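If a follow-up PR picks this up, one possible shape is to fold cache reads and writes into the input-token total. The cache field names here are assumptions for illustration, not confirmed LangChain output keys:

```typescript
// Hypothetical usage shape; cacheReadTokens/cacheWriteTokens are assumed names.
interface LlmUsage {
  promptTokens?: number;
  completionTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Cache reads and writes are still input to the model, so count them toward
// the prompt total; the raw numbers remain available on the usage object.
function totalInputTokens(usage: LlmUsage): number {
  return (
    (usage.promptTokens ?? 0) +
    (usage.cacheReadTokens ?? 0) +
    (usage.cacheWriteTokens ?? 0)
  );
}
```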

nirga and others added 2 commits August 18, 2025 10:29
Co-authored-by: avivhalfon <aviv@traceloop.com>
Co-authored-by: avivhalfon <aviv@traceloop.com>
@nirga nirga merged commit fe78b1b into main Aug 18, 2025
4 of 5 checks passed
@nirga nirga deleted the feat/langchain-callback-instrumentation branch August 18, 2025 07:32
2 participants