fix(anthropic): add support for Claude thinking API by nirga · Pull Request #671 · traceloop/openllmetry-js

nirga · 2025-08-21T12:07:19Z

Summary

Add instrumentation for Claude's extended thinking API via anthropic.beta.messages.create()
Capture thinking parameters (type and budget_tokens) in span attributes
Handle response content containing both thinking and text blocks
Add comprehensive test coverage for thinking API functionality

Background

Customer reported that Claude thinking API calls (using anthropic.beta.messages.create() with thinking parameters) were not being instrumented. The existing instrumentation only covered regular anthropic.messages.create() calls.

Changes

Instrumentation: Added patching for Anthropic.Beta.Messages.prototype.create
Span attributes: Capture llm.request.thinking.type and llm.request.thinking.budget_tokens
Response handling: Parse content blocks containing both thinking reasoning and final text
Testing: Added test using Claude Opus 4.1 model with thinking enabled
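
The response-handling item can be pictured with a small, self-contained sketch. The `ContentBlock` union and the `splitBlocks` helper are hypothetical names chosen for illustration and are not code from this PR; the block shapes (a `thinking` block carrying a `thinking` string, a `text` block carrying a `text` string) are assumed from the description above.

```typescript
// Hypothetical, minimal shapes for the two block kinds mentioned in this PR.
type ContentBlock =
  | { type: "thinking"; thinking: string; signature?: string }
  | { type: "text"; text: string };

// Separate the model's reasoning from its final answer.
function splitBlocks(content: ContentBlock[]): { thinking: string; text: string } {
  let thinking = "";
  let text = "";
  for (const block of content) {
    if (block.type === "thinking") {
      thinking += block.thinking;
    } else {
      text += block.text;
    }
  }
  return { thinking, text };
}

// Example response content with one block of each kind.
const exampleSplit = splitBlocks([
  { type: "thinking", thinking: "2 + 2 is 4." },
  { type: "text", text: "The answer is 4." },
]);
```

Keeping the two kinds apart lets telemetry record the reasoning and the final text without mixing them.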

Test plan

New thinking API test passes
All existing Anthropic instrumentation tests still pass
Verified test fails without instrumentation changes
Verified test passes with instrumentation changes

🤖 Generated with Claude Code

Important

Add support for Anthropic Claude thinking API by instrumenting anthropic.beta.messages.create() and capturing thinking parameters in spans.

Instrumentation:
- Add support for anthropic.beta.messages.create() in instrumentation.ts to handle Claude thinking API.
- Capture llm.request.thinking.type and llm.request.thinking.budget_tokens as span attributes.
Response Handling:
- Parse response content with both thinking and text blocks in instrumentation.ts.
Testing:
- Add test for thinking API in instrumentation.test.ts to validate span attributes and content structure.
- Include HAR recording fixture for testing in recording.har.


Summary by CodeRabbit

New Features
- Added instrumentation for Anthropic beta messages, capturing thinking parameters (type, budget) in spans.
- Enhanced span data for chat: records model, prompts, completions (thinking/text blocks), and token usage metrics.
Tests
- Added tests validating span attributes for beta thinking flows, including content structure and token accounting.
Chores
- Added a HAR recording fixture for Anthropic beta message interactions.

Adds instrumentation support for Claude's extended thinking feature via the beta messages API.

Changes:
- Add instrumentation for anthropic.beta.messages.create() calls
- Capture thinking parameters (type and budget_tokens) in span attributes
- Handle response content with both thinking and text blocks
- Add comprehensive test for thinking API functionality

This enables proper observability for Claude thinking API calls, capturing both the reasoning process and final response in telemetry data.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai · 2025-08-21T12:07:26Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds beta “thinking” support to Anthropic chat instrumentation by patching Beta.Messages.create, extracting thinking params into span attributes, and updating unpatch flow. Introduces a HAR fixture capturing a beta message with thinking, and adds tests verifying span attributes, prompts, completions parsing, and token usage for beta messages.

Changes

Cohort / File(s)	Summary
Instrumentation patching and thinking attributes `packages/instrumentation-anthropic/src/instrumentation.ts`	Patches/unpatches `Anthropic.Beta.Messages.prototype.create`; adds BetaMessageCreateParamsNonStreaming typing; records thinking attributes (type, budget_tokens) for beta chat; integrates into startSpan; no public API change.
Test coverage for beta-thinking `packages/instrumentation-anthropic/test/instrumentation.test.ts`	Adds tests (duplicated) asserting beta-thinking span attributes, prompts, parsed completion blocks, model, max_tokens, and token usage; sets 30s timeout.
Recording fixture for tests `packages/instrumentation-anthropic/recordings/Test-Anthropic-instrumentation_.../should-set-attributes-in-span-for-beta-messages-with-thinking_.../recording.har`	New HAR capturing a beta=true v1/messages exchange with thinking and text blocks, headers, params, and usage metadata for test replay.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App as Application Code
  participant Inst as Instrumentation (patched)
  participant SDK as Anthropic SDK (Beta.Messages)
  participant API as Anthropic API

  App->>Inst: beta.messages.create(params)
  activate Inst
  note over Inst: startSpan()<br/>- extract model, max_tokens<br/>- record thinking.type/budget_tokens
  Inst->>SDK: invoke original create(params)
  activate SDK
  SDK->>API: POST /v1/messages (beta, thinking)
  API-->>SDK: response (blocks, usage)
  deactivate SDK
  SDK-->>Inst: result
  note over Inst: finishSpan()<br/>- record prompts, completions (blocks)<br/>- record usage tokens
  Inst-->>App: result
  deactivate Inst
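
The flow in the diagram can be sketched as a wrapper around the original method. This is a simplified illustration under stated assumptions, not the instrumentation's actual code: `FakeSpan`, `CreateParams`, and `wrapCreate` are hypothetical names, the `request.model` and `request.max_tokens` keys are placeholders, and only the two `llm.request.thinking.*` keys come from this PR.

```typescript
// Hypothetical span record; a real tracer would come from OpenTelemetry.
interface FakeSpan {
  attributes: Record<string, string | number>;
  ended: boolean;
}

// Hypothetical, trimmed-down request shape.
interface CreateParams {
  model: string;
  max_tokens: number;
  thinking?: { type: string; budget_tokens?: number };
}

// Wrap `original` so a span starts before the call and ends once it settles.
function wrapCreate<R>(
  original: (params: CreateParams) => Promise<R>,
  spans: FakeSpan[],
): (params: CreateParams) => Promise<R> {
  return (params) => {
    // startSpan(): record request attributes up front (placeholder keys).
    const span: FakeSpan = {
      attributes: {
        "request.model": params.model,
        "request.max_tokens": params.max_tokens,
      },
      ended: false,
    };
    if (params.thinking) {
      span.attributes["llm.request.thinking.type"] = params.thinking.type;
      if (params.thinking.budget_tokens !== undefined) {
        span.attributes["llm.request.thinking.budget_tokens"] =
          params.thinking.budget_tokens;
      }
    }
    spans.push(span);
    // Invoke the original, then finishSpan() on success or failure.
    return original(params).then(
      (result) => {
        span.ended = true;
        return result;
      },
      (error) => {
        span.ended = true;
        throw error;
      },
    );
  };
}
```

The caller still receives the original promise's result or rejection; the wrapper only observes it.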

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

fix(anthropic): add system prompt to span attributes #612 — Also changes Anthropic chat startSpan handling (system prompt ordering) in the same code paths, potentially interacting with the new beta-thinking attribute logic.

Poem

A rabbit taps the tracing tree,
Beta thoughts now hop with glee.
Spans record the mindful beat,
Tokens counted, neat and sweet.
Blocks of thinking, text in tow—
Burrows mapped from prompt to flow.
Thump! The metrics start to grow.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e45441b and 2004e2d.

📒 Files selected for processing (2)

packages/instrumentation-anthropic/src/instrumentation.ts (5 hunks)
packages/instrumentation-anthropic/test/instrumentation.test.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

packages/instrumentation-anthropic/src/instrumentation.ts
packages/instrumentation-anthropic/test/instrumentation.test.ts

ellipsis-dev

Caution

Changes requested ❌

Reviewed everything up to d97bd7e in 1 minute and 56 seconds. Click for details.

Reviewed 370 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 3 draft comments. View those below.

1. packages/instrumentation-anthropic/src/instrumentation.ts:72

Draft comment:
Good to see Anthropic.Beta.Messages.create is wrapped consistently in both manual and module patch implementations.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

2. packages/instrumentation-anthropic/test/instrumentation.test.ts:332

Draft comment:
The new beta messages test with thinking parameters is comprehensive and clearly validates both attribute capture and response content parsing.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

3. packages/instrumentation-anthropic/recordings/Test-Anthropic-instrumentation_3769946143/should-set-attributes-in-span-for-beta-messages-with-thinking_2020128547/recording.har:1

Draft comment:
The HAR recording file for beta messages with thinking is auto-generated. Ensure that it does not include any sensitive data.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_y16hkV9gKKr92cAG


ellipsis-dev · 2025-08-21T12:09:20Z

+        if (thinking.type) {
+          attributes["llm.request.thinking.type"] = thinking.type;
+        }
+        if (thinking.budget_tokens) {


Consider explicitly checking for undefined for thinking.budget_tokens (e.g., use if (thinking.budget_tokens !== undefined)) to avoid skipping valid 0 values.

Suggested change

-        if (thinking.budget_tokens) {
+        if (thinking.budget_tokens !== undefined) {

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/instrumentation-anthropic/src/instrumentation.ts (1)

351-365: Accumulate thinking deltas in streaming and add beta stream tests

Streaming currently only merges text_delta chunks, dropping deltas for “thinking” blocks and causing incomplete thinking content. We need to:

In packages/instrumentation-anthropic/src/instrumentation.ts (around lines 351–365), extend the content_block_delta handler to append thinking deltas alongside text_delta chunks.
In packages/instrumentation-anthropic/test/instrumentation.test.ts (around line 246), add a new streaming test for anthropic.beta.messages.stream that verifies thinking-content accumulation, mirroring the existing anthropic.messages.stream test.

Proposed diff for content_block_delta:

                 case "content_block_delta":
                   if (chunk.index < result.content.length) {
                     const current = result.content[chunk.index];
                     if (
                       current.type === "text" &&
                       chunk.delta.type === "text_delta"
                     ) {
                       result.content[chunk.index] = {
                         type: "text",
                         text: current.text + chunk.delta.text,
                         citations: current.citations,
                       };
-                    }
+                    } else if (current.type === "thinking") {
+                      // Accumulate thinking deltas as they stream in
+                      const curThinking = (current as any).thinking ?? "";
+                      const deltaAny = chunk.delta as any;
+                      const deltaText =
+                        typeof deltaAny?.thinking === "string"
+                          ? deltaAny.thinking
+                          : (typeof deltaAny?.text === "string" ? deltaAny.text : "");
+                      result.content[chunk.index] = {
+                        type: "thinking",
+                        thinking: curThinking + deltaText,
+                        signature: (current as any).signature,
+                      } as any;
+                    }
                   }
                   break;

Example test stub to add in instrumentation.test.ts:

it("streams reasoning/thinking deltas in beta AI", async () => {
  const stream = anthropic.beta.messages.stream({ /* config with thinking enabled */ });
  let accumulated = "";
  for await (const event of stream) {
    // thinking text arrives in content_block_delta events as a thinking_delta
    if (
      event.type === "content_block_delta" &&
      event.delta.type === "thinking_delta"
    ) {
      accumulated += event.delta.thinking;
    }
  }
  expect(accumulated).toContain("expected reasoning fragment");
});
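
The delta-accumulation idea in the proposed diff can be tried in isolation. The sketch below uses trimmed-down, hypothetical event types (`SketchBlock`, `SketchDelta`, `SketchEvent`) rather than the SDK's own, and assumes thinking text arrives as `thinking_delta` payloads inside `content_block_delta` events; the real SDK types carry more fields, such as signatures and citations.

```typescript
// Hypothetical, trimmed-down block and event shapes.
type SketchBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string };

type SketchDelta =
  | { type: "text_delta"; text: string }
  | { type: "thinking_delta"; thinking: string };

type SketchEvent =
  | { type: "content_block_start"; index: number; content_block: SketchBlock }
  | { type: "content_block_delta"; index: number; delta: SketchDelta };

// Rebuild the final content array from a sequence of stream events.
function accumulate(events: SketchEvent[]): SketchBlock[] {
  const content: SketchBlock[] = [];
  for (const event of events) {
    if (event.type === "content_block_start") {
      content[event.index] = { ...event.content_block };
      continue;
    }
    const current = content[event.index];
    if (!current) continue; // delta for a block we never saw: ignore it
    if (current.type === "text" && event.delta.type === "text_delta") {
      current.text += event.delta.text;
    } else if (
      current.type === "thinking" &&
      event.delta.type === "thinking_delta"
    ) {
      current.thinking += event.delta.thinking;
    }
  }
  return content;
}

// One thinking block followed by one text block, each streamed in pieces.
const streamed = accumulate([
  { type: "content_block_start", index: 0, content_block: { type: "thinking", thinking: "" } },
  { type: "content_block_delta", index: 0, delta: { type: "thinking_delta", thinking: "Let me " } },
  { type: "content_block_delta", index: 0, delta: { type: "thinking_delta", thinking: "add." } },
  { type: "content_block_start", index: 1, content_block: { type: "text", text: "" } },
  { type: "content_block_delta", index: 1, delta: { type: "text_delta", text: "4" } },
]);
```

Matching on both the block type and the delta type keeps a mismatched delta from being appended to the wrong block.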

🧹 Nitpick comments (4)

packages/instrumentation-anthropic/recordings/Test-Anthropic-instrumentation_3769946143/should-set-attributes-in-span-for-beta-messages-with-thinking_2020128547/recording.har (1)
95-103: Sanitize organization identifier to avoid leaking real IDs in fixtures.

The response contains an anthropic-organization-id header populated with a GUID. Prefer replacing it with a dummy value to avoid accidentally leaking or normalizing tests against environment-specific IDs.

Apply this diff in the HAR to replace the value with a placeholder:
       {
         "name": "anthropic-organization-id",
-        "value": "617d109c-a187-4902-889d-689223d134aa"
+        "value": "00000000-0000-0000-0000-000000000000"
       },
packages/instrumentation-anthropic/src/instrumentation.ts (1)

205-214: Optional: promote custom thinking keys into semantic conventions.

You’re using literal strings for llm.request.thinking.*. Consider adding these to @traceloop/ai-semantic-conventions (e.g., SpanAttributes.LLM_REQUEST_THINKING_TYPE, SpanAttributes.LLM_REQUEST_THINKING_BUDGET_TOKENS) to avoid drift and enable discoverability.
packages/instrumentation-anthropic/test/instrumentation.test.ts (2)
332-408: Add a streaming test for beta thinking to catch delta handling regressions.

We currently validate only the non-streaming beta path. A streaming test (beta.messages.stream + finalMessage) would ensure thinking deltas are captured fully.

I can add:

Polly HAR for beta=true with interleaved-thinking streaming events (content_block_delta for thinking).

A test asserting that the resulting completion content contains a populated thinking block matching the streamed deltas.
Would you like me to draft that?

37-59: Test harness setup looks solid; small nit: skip headers matching to reduce fixture churn.

You already disable matching on headers globally, which is good. Consider also normalizing x-stainless-* in recordings via Polly’s beforePersist hook to reduce diff noise on re-records.

Example tweak (outside this hunk) to the existing hook:
server.any().on("beforePersist", (_req, recording) => {
  recording.request.headers = recording.request.headers.filter(
    ({ name }: { name: string }) =>
      !/^x-api-key$/i.test(name) && !/^x-stainless-/i.test(name)
  );
});
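
The filtering step inside that hook can be checked on its own as a pure function. This is a sketch only: `HarHeader`, `DROPPED_HEADERS`, and `scrubHeaders` are hypothetical names, and the header list simply mirrors the two patterns in the example above.

```typescript
// Hypothetical minimal header shape, as stored in a HAR entry.
interface HarHeader {
  name: string;
  value: string;
}

// Patterns for headers that should never be persisted in a recording.
const DROPPED_HEADERS: RegExp[] = [/^x-api-key$/i, /^x-stainless-/i];

// Return only the headers that match none of the dropped patterns.
function scrubHeaders(headers: HarHeader[]): HarHeader[] {
  return headers.filter(
    ({ name }) => !DROPPED_HEADERS.some((pattern) => pattern.test(name)),
  );
}

const scrubbed = scrubHeaders([
  { name: "X-Api-Key", value: "secret" },
  { name: "x-stainless-runtime", value: "node" },
  { name: "content-type", value: "application/json" },
]);
```

Keeping the patterns in one list makes it easier to add further volatile headers later without touching the hook itself.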

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db60e4b and d97bd7e.

📒 Files selected for processing (3)

packages/instrumentation-anthropic/recordings/Test-Anthropic-instrumentation_3769946143/should-set-attributes-in-span-for-beta-messages-with-thinking_2020128547/recording.har (1 hunks)
packages/instrumentation-anthropic/src/instrumentation.ts (4 hunks)
packages/instrumentation-anthropic/test/instrumentation.test.ts (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

packages/instrumentation-anthropic/test/instrumentation.test.ts (1)

packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)

SpanAttributes (17-59)

🪛 GitHub Actions: CI

packages/instrumentation-anthropic/test/instrumentation.test.ts

[warning] 1-1: Prettier formatting issue detected. Command 'pnpm prettier --check .' reported issues. Run 'pnpm prettier --write .' to fix.

packages/instrumentation-anthropic/src/instrumentation.ts

[warning] 1-1: Prettier formatting issue detected. Command 'pnpm prettier --check .' reported issues. Run 'pnpm prettier --write .' to fix.

🔇 Additional comments (6)

packages/instrumentation-anthropic/recordings/Test-Anthropic-instrumentation_3769946143/should-set-attributes-in-span-for-beta-messages-with-thinking_2020128547/recording.har (2)

1-10: HAR fixture addition looks good and aligns with the new test scenario.

The request/response pair captures the beta header, thinking params, and interleaved-thinking response with both thinking and text blocks. This is exactly what we need to validate instrumentation.

73-86: Fixture stability verified: no unintended diffs in HAR recordings
The Anthropic HAR fixture in packages/instrumentation-anthropic/recordings remains unchanged after running with RECORD_MODE=NEW. The request still uses model claude-opus-4-1-20250805 and the query parameter beta=true, ensuring Polly playback will continue to succeed.

packages/instrumentation-anthropic/src/instrumentation.ts (1)

1-547: Commit Prettier formatting changes

The formatter has been run and all files now pass Prettier’s checks, but there are unstaged modifications and one untracked file:

• Modified files to commit:

packages/instrumentation-anthropic/src/instrumentation.ts

packages/instrumentation-anthropic/test/instrumentation.test.ts

• Untracked file detected:

biome.json (please commit it if it’s intentional, or add it to .gitignore)

Please stage and commit these formatting updates to unblock CI.

packages/instrumentation-anthropic/test/instrumentation.test.ts (3)

332-408: New beta thinking test validates all critical attributes. LGTM.

Good coverage: request model/max_tokens, prompts, thinking attribute capture, content array shape, token usage invariants. This confirms the new instrumentation path.

332-408: No duplicate beta thinking test detected

The search for the test title "should set attributes in span for beta messages with thinking" returned a single occurrence at line 332, confirming there’s no accidental duplication. No further action is needed.

1-1: Formatting check passed—no action needed

Ran pnpm prettier --write packages/instrumentation-anthropic/test/instrumentation.test.ts and pnpm prettier --check .; all files, including this test, already conform to Prettier style. You can ignore the previous formatting request.

Likely an incorrect or invalid review comment.

coderabbitai · 2025-08-21T12:14:26Z

+    this._wrap(
+      module.Anthropic.Beta.Messages.prototype,
+      "create",
+      this.patchAnthropic("chat", module),
+    );


🛠️ Refactor suggestion

Guard Beta.Messages patching for SDK versions without Beta API.

Accessing module.Anthropic.Beta.Messages.prototype unconditionally can throw on older @anthropic-ai/sdk versions. Add a structural guard to avoid runtime errors and log that Beta patching is skipped.

Apply this diff:

-    this._wrap(
-      module.Anthropic.Beta.Messages.prototype,
-      "create",
-      this.patchAnthropic("chat", module),
-    );
+    if (module?.Anthropic?.Beta?.Messages?.prototype?.create) {
+      this._wrap(
+        module.Anthropic.Beta.Messages.prototype,
+        "create",
+        this.patchAnthropic("chat", module),
+      );
+    } else {
+      this._diag.debug(
+        "Anthropic Beta.Messages.create not found; skipping manual beta patch"
+      );
+    }

🤖 Prompt for AI Agents

In packages/instrumentation-anthropic/src/instrumentation.ts around lines 75 to 79, the code unconditionally accesses module.Anthropic.Beta.Messages.prototype which throws on older @anthropic-ai/sdk; modify this section to check that module.Anthropic, module.Anthropic.Beta, and module.Anthropic.Beta.Messages (and its prototype) exist before calling this._wrap, and if any are missing call the logger to record that Beta patching was skipped so runtime errors are avoided.
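
The structural guard described here can be exercised without the SDK. The sketch below is illustrative only: `hasBetaMessagesCreate`, `FakeBetaMessages`, `newerSdk`, and `olderSdk` are hypothetical stand-ins, not names from the instrumentation or from `@anthropic-ai/sdk`.

```typescript
// True only when every level down to `create` exists and is callable.
function hasBetaMessagesCreate(mod: unknown): boolean {
  const candidate = mod as
    | { Anthropic?: { Beta?: { Messages?: { prototype?: { create?: unknown } } } } }
    | null
    | undefined;
  return (
    typeof candidate?.Anthropic?.Beta?.Messages?.prototype?.create === "function"
  );
}

// Stand-in for an SDK build that ships the Beta API.
class FakeBetaMessages {
  create(): string {
    return "ok";
  }
}
const newerSdk = { Anthropic: { Beta: { Messages: FakeBetaMessages } } };

// Stand-in for an older SDK build with no Beta namespace.
const olderSdk = { Anthropic: {} };

const canPatchNewer = hasBetaMessagesCreate(newerSdk);
const canPatchOlder = hasBetaMessagesCreate(olderSdk);
```

Using the same predicate before both wrapping and unwrapping keeps the two paths symmetric, so an SDK without the Beta API is skipped in both directions.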

coderabbitai · 2025-08-21T12:14:26Z

+    this._wrap(
+      moduleExports.Anthropic.Beta.Messages.prototype,
+      "create",
+      this.patchAnthropic("chat", moduleExports),
+    );


🛠️ Refactor suggestion

Same guard needed in automatic patch() path.

Mirror the safety check here to prevent TypeError when the installed SDK lacks the Beta API.

-    this._wrap(
-      moduleExports.Anthropic.Beta.Messages.prototype,
-      "create",
-      this.patchAnthropic("chat", moduleExports),
-    );
+    if (moduleExports?.Anthropic?.Beta?.Messages?.prototype?.create) {
+      this._wrap(
+        moduleExports.Anthropic.Beta.Messages.prototype,
+        "create",
+        this.patchAnthropic("chat", moduleExports),
+      );
+    } else {
+      this._diag.debug(
+        "Anthropic Beta.Messages.create not found; skipping beta patch"
+      );
+    }

🤖 Prompt for AI Agents

In packages/instrumentation-anthropic/src/instrumentation.ts around lines 105 to 109, the automatic patch path unconditionally attempts to wrap moduleExports.Anthropic.Beta.Messages.prototype.create which throws a TypeError when the installed SDK lacks the Beta API; add the same safety guard used elsewhere to first verify moduleExports.Anthropic, moduleExports.Anthropic.Beta and moduleExports.Anthropic.Beta.Messages and that prototype.create exists before calling this._wrap, and return/skip the wrap when any of those are missing so the patch is safe for SDKs without Beta.

coderabbitai · 2025-08-21T12:14:26Z


    this._unwrap(moduleExports.Anthropic.Completions.prototype, "create");
    this._unwrap(moduleExports.Anthropic.Messages.prototype, "create");
+    this._unwrap(moduleExports.Anthropic.Beta.Messages.prototype, "create");


🛠️ Refactor suggestion

Guard unwrap to match conditional wrapping.

Unwrapping an undefined prototype will also throw. Protect the unwrap with the same shape check.

-    this._unwrap(moduleExports.Anthropic.Beta.Messages.prototype, "create");
+    if (moduleExports?.Anthropic?.Beta?.Messages?.prototype) {
+      this._unwrap(moduleExports.Anthropic.Beta.Messages.prototype, "create");
+    }

🤖 Prompt for AI Agents

In packages/instrumentation-anthropic/src/instrumentation.ts around line 121, the call this._unwrap(moduleExports.Anthropic.Beta.Messages.prototype, "create") can throw if the prototype is undefined; guard the unwrap with the same conditional shape check used when wrapping (check that moduleExports.Anthropic, Beta, Messages and Messages.prototype all exist) and only call this._unwrap when that prototype is present so unwrapping mirrors the conditional wrapping logic.

coderabbitai · 2025-08-21T12:14:26Z

+      // Handle thinking parameters
+      if ((params as any).thinking) {
+        const thinking = (params as any).thinking;
+        if (thinking.type) {
+          attributes["llm.request.thinking.type"] = thinking.type;
+        }
+        if (thinking.budget_tokens) {
+          attributes["llm.request.thinking.budget_tokens"] = thinking.budget_tokens;
+        }
+      }


🛠️ Refactor suggestion

Capture zero-value budget tokens; don’t skip falsy values.

if (thinking.budget_tokens) drops valid 0 budgets. Use an existence check instead.

-      if ((params as any).thinking) {
-        const thinking = (params as any).thinking;
-        if (thinking.type) {
-          attributes["llm.request.thinking.type"] = thinking.type;
-        }
-        if (thinking.budget_tokens) {
-          attributes["llm.request.thinking.budget_tokens"] = thinking.budget_tokens;
-        }
-      }
+      const thinking = (params as any).thinking;
+      if (thinking && typeof thinking === "object") {
+        if ("type" in thinking) {
+          attributes["llm.request.thinking.type"] = (thinking as any).type;
+        }
+        if ("budget_tokens" in thinking) {
+          attributes["llm.request.thinking.budget_tokens"] = (thinking as any).budget_tokens;
+        }
+      }

🤖 Prompt for AI Agents

In packages/instrumentation-anthropic/src/instrumentation.ts around lines 216 to 225, the current guard `if (thinking.budget_tokens)` skips valid zero budgets; change the existence check to detect defined values (e.g., check `'budget_tokens' in thinking` or `thinking.budget_tokens !== undefined`/`thinking.budget_tokens != null`) so that 0 is captured and then assign attributes["llm.request.thinking.budget_tokens"] = thinking.budget_tokens accordingly.
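
The difference between a truthiness check and an existence check is easy to see in a standalone sketch. `SketchThinkingParams` and `thinkingAttributes` are hypothetical names used only for illustration; the attribute keys are the two introduced in this PR.

```typescript
// Hypothetical request fragment; budget_tokens is optional in this sketch.
interface SketchThinkingParams {
  type: string;
  budget_tokens?: number;
}

// Build span attributes, keeping a budget of 0 instead of dropping it.
function thinkingAttributes(
  thinking?: SketchThinkingParams,
): Record<string, string | number> {
  const attributes: Record<string, string | number> = {};
  if (!thinking) return attributes;
  attributes["llm.request.thinking.type"] = thinking.type;
  // `!= null` rejects only undefined and null, so 0 still passes.
  if (thinking.budget_tokens != null) {
    attributes["llm.request.thinking.budget_tokens"] = thinking.budget_tokens;
  }
  return attributes;
}

const zeroBudget = thinkingAttributes({ type: "enabled", budget_tokens: 0 });
const noBudget = thinkingAttributes({ type: "enabled" });
```

With `if (thinking.budget_tokens)` the first call would lose its budget attribute; with the existence check only the second call omits it.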

- Add proper TypeScript interfaces for thinking parameters
- Replace 'any' types with specific typed interfaces
- Reduce lint warnings from 25 to 21
- Improve code maintainability and type checking

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

ellipsis-dev

Caution

Changes requested ❌

Reviewed 8efeee6 in 1 minute and 27 seconds. Click for details.

Reviewed 76 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 3 draft comments. View those below.

1. packages/instrumentation-anthropic/src/instrumentation.ts:216

Draft comment:
Good use of the explicit type (MessageCreateParamsWithThinking) for handling 'thinking' parameters. Consider an explicit null/undefined check if needed to further safeguard against unexpected values.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

2. packages/instrumentation-anthropic/src/types.ts:10

Draft comment:
Fix the comment wording: change 'span created' to 'span creation' for clarity.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

3. packages/instrumentation-anthropic/test/instrumentation.test.ts:388

Draft comment:
The inline ContentBlock interface enhances type safety for parsing response blocks. Consider extracting it if it becomes reused to improve maintainability.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_bFyXIqKiRPkpuQU7


ellipsis-dev · 2025-08-21T13:08:25Z


  /**
-   * A custom logger to log any exceptions that happen during span creation.
+   * A custom logger to log any exceptions that happen during span created.


Typo in the comment: 'span created' appears to be a mistake. It might be intended to read 'span creation' or 'spans are created'.

Suggested change

-   * A custom logger to log any exceptions that happen during span created.
+   * A custom logger to log any exceptions that happen during span creation.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

packages/instrumentation-anthropic/src/types.ts (2)
10-13: JSDoc grammar: “during span created” → “during span creation”.

Tiny wording fix for clarity.
   /**
-   * A custom logger to log any exceptions that happen during span created.
+   * A custom logger to log any exceptions that happen during span creation.
    */
16-19: Make budget_tokens optional to reflect Anthropic API behavior.

In practice, thinking can be enabled without specifying a budget; the server may choose a default. Treating budget_tokens as required can mask the possibility of undefined at runtime and forces downstream code to assume presence.
 export interface ThinkingParams {
   type: "enabled";
-  budget_tokens: number;
+  budget_tokens?: number;
 }
Would you like me to open a follow-up to add a guard in the instrumentation code so we only set the span attribute when budget_tokens != null?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d97bd7e and 8efeee6.

📒 Files selected for processing (3)

packages/instrumentation-anthropic/src/instrumentation.ts (5 hunks)
packages/instrumentation-anthropic/src/types.ts (1 hunks)
packages/instrumentation-anthropic/test/instrumentation.test.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

packages/instrumentation-anthropic/test/instrumentation.test.ts
packages/instrumentation-anthropic/src/instrumentation.ts

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build and test

coderabbitai · 2025-08-21T13:12:55Z

+export interface ThinkingParams {
+  type: "enabled";
+  budget_tokens: number;
+}
+
+export interface MessageCreateParamsWithThinking {
+  thinking?: ThinkingParams;
+}


💡 Verification agent

🧩 Analysis chain

Consider aligning these types with upstream Anthropic SDK to avoid drift.

If the repo already depends on the official SDK, prefer reusing its request types so new fields/variants (if Anthropic extends the shape) flow through automatically. If that’s not feasible, keeping this minimal structural type is fine.

Option A (preferred when SDK types include thinking):

// types.ts
import type { MessageCreateParams } from "@anthropic-ai/sdk/resources/messages";

// Narrow to just what we need for instrumentation:
export type MessageCreateParamsWithThinking = Pick<MessageCreateParams, "thinking">;

Option B (keep structural typing) — add a brief doc to clarify intent:

/**
 * Minimal shape of the Anthropic `thinking` request param used by instrumentation.
 * Keep in sync with the Anthropic SDK to avoid type drift.
 */
export interface ThinkingParams {
  type: "enabled";
  budget_tokens?: number;
}

export interface MessageCreateParamsWithThinking {
  thinking?: ThinkingParams;
}

To confirm the latest allowed values for thinking.type and whether budget_tokens is optional in the beta API, please run this query:

🌐 Web query:

What are the current valid fields and allowed values for the Anthropic Messages API `thinking` request parameter (including the beta `anthropic.beta.messages.create`), and is `budget_tokens` optional?

💡 Result:

Short answer

The Messages API's thinking object accepts:

type — enum: "enabled" or "disabled".

budget_tokens — integer (when thinking is enabled).

budget_tokens is not optional when you turn thinking on — you must supply a token budget (minimum 1,024). The budget must normally be less than your max_tokens, except when using interleaved-thinking (beta) in which case the thinking budget can exceed max_tokens. (docs.anthropic.com)

Details / authoritative points

Field: thinking (object) — optional overall (only include it when you want extended thinking). The object form and behavior are documented in the Messages API reference and the Extended Thinking guide. (docs.anthropic.com)

thinking.type — required inside the thinking object; allowed values: "enabled" or "disabled". When "enabled" the model will emit thinking content blocks (summarized for Claude 4 models). (docs.anthropic.com)

thinking.budget_tokens — integer token budget for the model’s internal reasoning:

Minimum: 1,024 tokens.

Normally must be set to a value less than your request's max_tokens.

With interleaved thinking (a beta mode that lets the model interleave tool calls and thinking), the budget may exceed max_tokens because the effective token window becomes the larger context window (see the interleaved-thinking beta header). (docs.anthropic.com)

Behavior across models / summary visibility: Claude 3.7 returns full thinking output; Claude 4 models return summarized thinking by default (billing is for full thinking tokens even when you see the summary). (docs.anthropic.com)

Beta SDK / messages.create: the beta messages.create path (sdk/ beta namespace or using the anthropic-beta header) supports the same thinking parameter; you enable beta features via the anthropic-beta header or via the SDK beta namespace. Example SDK beta usages show passing thinking in the create call. (docs.anthropic.com, icodebroker.com)

Practical examples

Enable thinking (required budget):

{ "thinking": { "type": "enabled", "budget_tokens": 10000 } } — valid (10000 ≥ 1024, and you must ensure < max_tokens unless using interleaved thinking). (docs.anthropic.com)

Disable thinking:

{ "thinking": { "type": "disabled" } } — no budget_tokens needed/used. (docs.anthropic.com)

If you want I can:

Show a minimal curl / JSON example for either the stable endpoint or the beta SDK call, or

Look up any model-specific constraints on max_tokens vs. thinking budgets for a particular Claude model you plan to use.

Citations:

1: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking?utm_source=chatgpt.com

2: https://docs.anthropic.com/fr/api/messages?utm_source=chatgpt.com

3: https://docs.anthropic.com/fr/api/messages?utm_source=chatgpt.com

4: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking?utm_source=chatgpt.com

5: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking?utm_source=chatgpt.com

6: https://docs.anthropic.com/en/api/beta-headers?utm_source=chatgpt.com

7: https://icodebroker.com/archives/114355?utm_source=chatgpt.com

8: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking?utm_source=chatgpt.com

9: https://docs.anthropic.com/fr/api/messages?utm_source=chatgpt.com

Update ThinkingParams to match Anthropic’s current thinking spec

The Anthropic Messages API now supports two thinking.type values—"enabled" and "disabled"—and requires budget_tokens only when "enabled" (minimum 1,024). To keep your types in sync and prevent drift:

• File: packages/instrumentation-anthropic/src/types.ts
Lines: 16–23

Suggested refactor (recommended):

-export interface ThinkingParams {
-  type: "enabled";
-  budget_tokens: number;
-}
+/**
+ * Corresponds to Anthropic's `thinking` parameter:
+ * - `"enabled"` requires a `budget_tokens` ≥ 1,024.
+ * - `"disabled"` omits `budget_tokens`.
+ */
+export type ThinkingParams =
+  | { type: "enabled"; budget_tokens: number }
+  | { type: "disabled" };

-export interface MessageCreateParamsWithThinking {
-  thinking?: ThinkingParams;
-}
+export interface MessageCreateParamsWithThinking {
+  thinking?: ThinkingParams;
+}

Alternatively, if you depend on @anthropic-ai/sdk, you can reuse its types:

import type { MessageCreateParams } from "@anthropic-ai/sdk/resources/messages";

export type MessageCreateParamsWithThinking = Pick<
  MessageCreateParams,
  "thinking"
>;

— Ensure budget_tokens meets the minimum 1,024 token requirement when type is "enabled" and leave it out for "disabled".

🤖 Prompt for AI Agents

In packages/instrumentation-anthropic/src/types.ts around lines 16–23, update the ThinkingParams/type definition so it accepts both "enabled" and "disabled" for thinking.type and only requires budget_tokens when type is "enabled" (with a minimum value of 1024); implement this by replacing the single-interface with a discriminated union (one branch for { type: "enabled"; budget_tokens: number } and one for { type: "disabled" }) or by importing and reusing the MessageCreateParams typing from @anthropic-ai/sdk to pick the thinking field, and ensure any runtime/validation code enforces budget_tokens >= 1024 when type === "enabled".
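
As a rough sketch of the runtime check mentioned in this prompt, the discriminated union can be paired with a small validator. `validateThinking` and `MIN_THINKING_BUDGET` are hypothetical names introduced here; the 1,024 minimum and the "less than max_tokens" rule are taken from the query result quoted above, and the interleaved-thinking exception is deliberately not modeled.

```typescript
// Discriminated union: budget_tokens exists only on the "enabled" branch.
type ThinkingParams =
  | { type: "enabled"; budget_tokens: number }
  | { type: "disabled" };

// Minimum budget as reported in the query result above (assumption).
const MIN_THINKING_BUDGET = 1024;

// Returns a list of human-readable problems; an empty list means valid.
// Does not model the interleaved-thinking beta, where the budget may
// exceed max_tokens.
function validateThinking(
  thinking: ThinkingParams,
  maxTokens: number,
): string[] {
  const problems: string[] = [];
  // Narrowing on `type` gives typed access to budget_tokens.
  if (thinking.type === "enabled") {
    if (!Number.isInteger(thinking.budget_tokens)) {
      problems.push("budget_tokens must be an integer");
    }
    if (thinking.budget_tokens < MIN_THINKING_BUDGET) {
      problems.push(`budget_tokens must be >= ${MIN_THINKING_BUDGET}`);
    }
    if (thinking.budget_tokens >= maxTokens) {
      problems.push("budget_tokens must be less than max_tokens");
    }
  }
  return problems;
}
```

For example, `validateThinking({ type: "enabled", budget_tokens: 512 }, 16000)` reports the minimum-budget problem, while a `"disabled"` value always yields an empty list.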

- Replace non-null assertions with proper type guards
- Reduce lint warnings back to baseline (17 warnings, same as main)
- Maintain test functionality while improving code quality

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

ellipsis-dev

Important

Looks good to me! 👍

Reviewed e45441b in 41 seconds. Click for details.

Reviewed 28 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. packages/instrumentation-anthropic/test/instrumentation.test.ts:403

Draft comment:
Nice refactoring for readability. Consider explicitly coercing 'totalTokens' to a number (e.g., using '+totalTokens') in the final assertion for consistency with the other numeric conversions.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_xFne3UZa07NN7eBj


- Replace custom ThinkingParams with official BetaThinkingConfigParam
- Use BetaMessageCreateParamsNonStreaming from @anthropic-ai/sdk
- Remove redundant custom type definitions from types.ts
- Maintain same functionality with better type safety and consistency

This follows best practices by using the official SDK types rather than recreating them, ensuring compatibility and reducing maintenance overhead.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Format import statement in instrumentation.ts
- Improve formatting in test file for better readability
- Consistent code style throughout the project

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 55e294e in 2 minutes and 7 seconds. Click for details.

Reviewed 63 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 2 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. packages/instrumentation-anthropic/src/instrumentation.ts:220

Draft comment:
The new logic for handling thinking parameters via the betaParams cast is clear; however, the strict check for betaParams.thinking.type === "enabled" could limit future extensions if other values are introduced. Also, ensure that casting params to BetaMessageCreateParamsNonStreaming is safe when non-beta calls are made.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The comment makes speculative points about future extensibility and type safety. It doesn't point to any current bugs or issues, just hypothetical future concerns. The cast appears intentional since this is specifically for beta messages. The type check seems reasonable given the current API. The comment could be identifying a real type safety issue if the cast is unsafe. The extensibility concern about thinking.type values could be valid if the API documentation specifies other valid values. Without access to the Anthropic API documentation or knowledge of the full type definitions, we can't verify if these are real issues or just speculation. The code appears to be handling a specific beta feature as intended. The comment should be deleted as it is speculative and asks for verification rather than pointing out clear issues. It violates the rules against speculative comments and asking authors to verify things.
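
For context on the casting concern raised in this draft, one alternative to a cast is a structural type guard that inspects the params at runtime. This is only a sketch: `getEnabledThinking` and `EnabledThinking` are hypothetical names, and the merged code uses the SDK's `BetaMessageCreateParamsNonStreaming` type instead.

```typescript
// Hypothetical local shape for an enabled thinking config.
interface EnabledThinking {
  type: "enabled";
  budget_tokens: number;
}

// Extract an enabled thinking config from arbitrary request params.
// Returns undefined for non-beta calls, disabled thinking, or malformed input.
function getEnabledThinking(params: unknown): EnabledThinking | undefined {
  if (typeof params !== "object" || params === null) {
    return undefined;
  }
  const thinking = (params as { thinking?: unknown }).thinking;
  if (typeof thinking !== "object" || thinking === null) {
    return undefined;
  }
  const { type, budget_tokens } = thinking as {
    type?: unknown;
    budget_tokens?: unknown;
  };
  // Both checks narrow the unknown fields to the expected types.
  if (type === "enabled" && typeof budget_tokens === "number") {
    return { type, budget_tokens };
  }
  return undefined;
}
```

Because the function accepts `unknown`, it behaves the same whether the call came through the beta namespace or the regular one.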

2. packages/instrumentation-anthropic/src/types.ts:10

Draft comment:
Removed ThinkingParams and MessageCreateParamsWithThinking interfaces. Confirm that no downstream code depends on these removed types.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_2Jd0heCjJcZT7Veb


ellipsis-dev

Important

Looks good to me! 👍

Reviewed 2004e2d in 57 seconds. Click for details.

Reviewed 107 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 3 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. packages/instrumentation-anthropic/src/instrumentation.ts:221

Draft comment:
Reformatted the assignment for thinking.budget_tokens for readability. No logic change.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

2. packages/instrumentation-anthropic/test/instrumentation.test.ts:332

Draft comment:
Reformatted beta message test input (e.g. 'betas' array and message object) for consistency and readability.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

3. packages/instrumentation-anthropic/test/instrumentation.test.ts:388

Draft comment:
Improved readability by reformatting the JSON.parse call into a multi-line structure.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_FUEFXTQHO6d3iu65


Co-authored-by: Claude <noreply@anthropic.com>

nirga changed the title ~~feat(anthropic): add support for Claude thinking API~~ fix(anthropic): add support for Claude thinking API Aug 21, 2025

ellipsis-dev bot reviewed Aug 21, 2025

coderabbitai bot reviewed Aug 21, 2025

avivhalfon approved these changes Aug 21, 2025

ellipsis-dev bot reviewed Aug 21, 2025

coderabbitai bot reviewed Aug 21, 2025

ellipsis-dev bot reviewed Aug 21, 2025

nirga and others added 2 commits August 21, 2025 16:56

ellipsis-dev bot reviewed Aug 21, 2025

nirga merged commit 088986a into main Aug 21, 2025
4 of 5 checks passed

nirga deleted the feat/claude-thinking-api-support branch August 21, 2025 14:02

galkleinman pushed a commit that referenced this pull request Sep 17, 2025

fix(anthropic): add support for Claude thinking API (#671)

291fecb

Co-authored-by: Claude <noreply@anthropic.com>

-	if (thinking.budget_tokens) {
+	if (thinking.budget_tokens !== undefined) {

