Commit c07a5cd

Improved Usage Tracking (#274)

* Create Normalize Usage
* Improve Instrumentation for usage tracking
* Fix Instrumentation properties
1 parent 4a3774f commit c07a5cd

40 files changed

Lines changed: 7024 additions & 722 deletions

CHANGELOG.md

Lines changed: 30 additions & 0 deletions
@@ -148,6 +148,35 @@ response = MyAgent.embed(inputs: ["Text 1", "Text 2"]).embed_now
 vectors = response.data.map { |d| d[:embedding] }
 ```

+**Normalized Usage Statistics**
+```ruby
+response = MyAgent.prompt("Hello").generate_now
+
+# Works across all providers
+response.usage.input_tokens
+response.usage.output_tokens
+response.usage.total_tokens
+
+# Provider-specific fields when available
+response.usage.cached_tokens # OpenAI, Anthropic
+response.usage.reasoning_tokens # OpenAI o1 models
+response.usage.service_tier # Anthropic
+```
+
+**Enhanced Instrumentation for APM Integration**
+- Unified event structure: `prompt.active_agent` and `embed.active_agent` (top-level) plus `prompt.provider.active_agent` and `embed.provider.active_agent` (per-API-call)
+- Event payloads include comprehensive data for monitoring tools (New Relic, DataDog, etc.):
+  - Request parameters: `model`, `temperature`, `max_tokens`, `top_p`, `stream`, `message_count`, `has_tools`
+  - Usage data: `input_tokens`, `output_tokens`, `total_tokens`, `cached_tokens`, `reasoning_tokens`, `audio_tokens`, `cache_creation_tokens` (critical for cost tracking)
+  - Response metadata: `finish_reason`, `response_model`, `response_id`, `embedding_count`
+- Top-level events report cumulative usage across all API calls in multi-turn conversations
+- Provider-level events report per-call usage for granular tracking
+
+**Multi-Turn Usage Tracking**
+- `response.usage` now returns cumulative token counts across all API calls during tool calling
+- New `response.usages` array contains individual usage objects from each API call
+- `Usage` objects support addition: `usage1 + usage2` for combining statistics
+
 **Provider Enhancements**
 - OpenAI Responses API: `api: :responses` or `api: :chat`
 - Anthropic JSON object mode with automatic extraction
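
Reviewer sketch (not part of the commit): the "Multi-Turn Usage Tracking" entry says that `Usage` objects can be added together and that `response.usage` is the cumulative value over `response.usages`. The snippet below illustrates that relationship with a hypothetical `UsageSketch` struct that models only the input and output counts; the gem's real `Usage` class is not shown in this hunk and may carry more fields and different internals.

```ruby
# Hypothetical stand-in for the gem's Usage object (assumption: only the
# common input/output counts are modeled here).
UsageSketch = Struct.new(:input_tokens, :output_tokens, keyword_init: true) do
  def total_tokens
    input_tokens + output_tokens
  end

  # Field-wise addition, mirroring the `usage1 + usage2` behavior the
  # changelog describes.
  def +(other)
    self.class.new(
      input_tokens: input_tokens + other.input_tokens,
      output_tokens: output_tokens + other.output_tokens
    )
  end
end

# One entry per API call, as `response.usages` is described to hold.
usages = [
  UsageSketch.new(input_tokens: 100, output_tokens: 20), # call that requests a tool
  UsageSketch.new(input_tokens: 150, output_tokens: 40)  # follow-up call with the tool result
]

# Cumulative usage, as `response.usage` is described to return.
cumulative = usages.reduce(:+)
puts cumulative.total_tokens # prints 310
```

Summing with `reduce(:+)` keeps the per-call objects intact, so both the granular and the cumulative view stay available.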
@@ -195,6 +224,7 @@ vectors = response.data.map { |d| d[:embedding] }
 - Template rendering without blocks
 - Schema generator key symbolization
 - Rails 8.0 and 8.1 compatibility
+- Usage extraction across OpenAI/Anthropic response formats

 ### Removed
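
Reviewer sketch (not part of the commit): the CHANGELOG entry above names the event `prompt.provider.active_agent` and lists usage fields in its payload. The snippet below shows one way a subscriber might tally tokens per model. The `TokenTally` class is hypothetical, the model name is illustrative, and the payload layout (usage fields nested under a `:usage` key) is an assumption, since the diff lists the field names but not their structure.

```ruby
# Hypothetical aggregator for per-call instrumentation payloads.
# ASSUMPTION: usage fields live in a nested :usage hash; the real payload
# layout may differ.
class TokenTally
  attr_reader :totals

  def initialize
    @totals = Hash.new(0)
  end

  def record(payload)
    usage = payload[:usage] || {}
    model = payload[:model] || "unknown"
    @totals[model] += usage.fetch(:input_tokens, 0) + usage.fetch(:output_tokens, 0)
  end
end

tally = TokenTally.new

# In a Rails app this could be wired up through the standard
# ActiveSupport::Notifications API, for example:
#   ActiveSupport::Notifications.subscribe("prompt.provider.active_agent") do |event|
#     tally.record(event.payload)
#   end
# Two hand-written payloads stand in for real events here.
tally.record({ model: "gpt-4o", usage: { input_tokens: 100, output_tokens: 25 } })
tally.record({ model: "gpt-4o", usage: { input_tokens: 50, output_tokens: 10 } })

puts tally.totals["gpt-4o"] # prints 185
```

Subscribing to the provider-level event rather than the top-level one avoids double counting, because the top-level events are described as already cumulative.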

docs/.vitepress/config.mts

Lines changed: 1 addition & 0 deletions
@@ -100,6 +100,7 @@ export default defineConfig({
 { text: 'Embeddings', link: '/actions/embeddings' },
 { text: 'Tools', link: '/actions/tools' },
 { text: 'Structured Output', link: '/actions/structured_output' },
+{ text: 'Usage', link: '/actions/usage' },
 ]
 },
 {

docs/actions.md

Lines changed: 9 additions & 0 deletions
@@ -43,6 +43,15 @@ Generate vectors for semantic search:

 <<< @/../test/docs/actions_examples_test.rb#embeddings_vectorize{ruby:line-numbers}

+### [Usage Statistics](/actions/usage)
+
+Track token consumption and costs:
+
+```ruby
+response = agent.summarize.generate_now
+response.usage.total_tokens #=> 125
+```
+
 ## Common Patterns

 ### Multi-Capability Actions
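
Reviewer sketch (not part of the commit): the new "Usage Statistics" section above mentions tracking costs, but the snippet it adds only reads `total_tokens`. A cost estimate needs input and output counts separately, because providers usually price them differently. The prices and the `estimated_cost` helper below are hypothetical; in real code the counts would come from `response.usage.input_tokens` and `response.usage.output_tokens`.

```ruby
# Hypothetical prices in USD per million tokens; real prices vary by model
# and provider.
PRICE_PER_MILLION = { input: 3.00, output: 15.00 }.freeze

# Estimate the cost of one response from its token counts.
def estimated_cost(input_tokens:, output_tokens:, prices: PRICE_PER_MILLION)
  (input_tokens * prices[:input] + output_tokens * prices[:output]) / 1_000_000.0
end

# Counts chosen to match the 125-token example above (100 input + 25 output).
cost = estimated_cost(input_tokens: 100, output_tokens: 25)
puts format("$%.6f", cost) # prints $0.000675
```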

docs/actions/usage.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+---
+title: Usage Statistics
+description: Track token usage and performance metrics across all AI providers with normalized usage objects.
+---
+# {{ $frontmatter.title }}
+
+Track token consumption and performance metrics from AI provider responses. All providers return normalized usage statistics for consistent cost tracking and monitoring.
+
+::: tip Monitor Usage in Production
+See [Instrumentation](/framework/instrumentation) to monitor usage statistics in real-time using ActiveSupport::Notifications.
+:::
+
+## Accessing Usage
+
+Get usage statistics from any response:
+
+<<< @/../test/docs/actions/usage_examples_test.rb#accessing_usage{ruby:line-numbers}
+
+## Common Fields
+
+These fields work across all providers:
+
+<<< @/../test/docs/actions/usage_examples_test.rb#common_fields{ruby:line-numbers}
+
+## Provider-Specific Fields
+
+Access advanced metrics when available:
+
+::: code-group
+<<< @/../test/docs/actions/usage_examples_test.rb#provider_specific_openai{ruby:line-numbers} [OpenAI]
+<<< @/../test/docs/actions/usage_examples_test.rb#provider_specific_anthropic{ruby:line-numbers} [Anthropic]
+<<< @/../test/docs/actions/usage_examples_test.rb#provider_specific_ollama{ruby:line-numbers} [Ollama]
+:::
+
+## Provider Details
+
+Raw provider data preserved in `provider_details`:
+
+::: code-group
+<<< @/../test/docs/actions/usage_examples_test.rb#provider_details_openai{ruby:line-numbers} [OpenAI]
+<<< @/../test/docs/actions/usage_examples_test.rb#provider_details_ollama{ruby:line-numbers} [Ollama]
+:::
+
+## Cost Tracking
+
+Calculate costs using token counts:
+
+<<< @/../test/docs/actions/usage_examples_test.rb#cost_tracking{ruby:line-numbers}
+
+**Monitor costs in production:** Use [Instrumentation](/framework/instrumentation#cost-tracking) to automatically track costs across all requests.
+
+## Embeddings Usage
+
+Embedding responses have zero output tokens:
+
+<<< @/../test/docs/actions/usage_examples_test.rb#embeddings_usage{ruby:line-numbers}
+
+## Field Mapping
+
+How provider fields map to normalized names:
+
+| Provider | input_tokens | output_tokens | total_tokens |
+|----------|--------------|---------------|--------------|
+| OpenAI Chat | prompt_tokens | completion_tokens | total_tokens |
+| OpenAI Embed | prompt_tokens | 0 | total_tokens |
+| OpenAI Responses | input_tokens | output_tokens | total_tokens |
+| Anthropic | input_tokens | output_tokens | calculated |
+| Ollama | prompt_eval_count | eval_count | calculated |
+| OpenRouter | prompt_tokens | completion_tokens | total_tokens |
+
+**Note:** `total_tokens` is automatically calculated as `input_tokens + output_tokens` when not provided by the provider.
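
Reviewer sketch (not part of the commit): the field mapping table and the closing note describe a simple normalization rule, which the snippet below spells out. The provider key names come from the table; the `FIELD_MAP` constant, the provider symbols, the `normalize_usage` helper, and the use of string keys for raw provider data are all assumptions for illustration and do not reflect the gem's actual implementation.

```ruby
# Provider key names are taken from the field mapping table; everything
# else here is a hypothetical illustration.
FIELD_MAP = {
  openai_chat:      { input: "prompt_tokens",     output: "completion_tokens", total: "total_tokens" },
  openai_responses: { input: "input_tokens",      output: "output_tokens",     total: "total_tokens" },
  anthropic:        { input: "input_tokens",      output: "output_tokens",     total: nil },
  ollama:           { input: "prompt_eval_count", output: "eval_count",        total: nil }
}.freeze

# Map a raw provider usage hash (assumed to use string keys) to the
# normalized field names.
def normalize_usage(provider, raw)
  keys = FIELD_MAP.fetch(provider)
  input = raw.fetch(keys[:input], 0)
  output = raw.fetch(keys[:output], 0)
  # Use the provider's total when it reports one, otherwise calculate it.
  total = keys[:total] && raw[keys[:total]]
  { input_tokens: input, output_tokens: output, total_tokens: total || input + output }
end

ollama = normalize_usage(:ollama, { "prompt_eval_count" => 12, "eval_count" => 30 })
puts ollama[:total_tokens] # prints 42 (calculated)

# An embedding-style payload has no output count, so it normalizes to zero.
embed = normalize_usage(:openai_chat, { "prompt_tokens" => 8, "total_tokens" => 8 })
puts embed[:output_tokens] # prints 0
```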

docs/agents/generation.md

Lines changed: 9 additions & 6 deletions
@@ -93,10 +93,11 @@ response.raw_request # The most recent request in provider format
 response.raw_response # The most recent response in provider format
 response.context # The original context that was sent

-# Usage statistics (when available from provider)
-response.prompt_tokens # Input tokens used
-response.completion_tokens # Output tokens used
-response.total_tokens # Total tokens used
+# Usage statistics (see /actions/usage for details)
+response.usage # Normalized usage object across all providers
+response.usage.input_tokens
+response.usage.output_tokens
+response.usage.total_tokens
 ```

 For embeddings:
@@ -110,14 +111,16 @@ response.raw_request # The most recent request in provider format
 response.raw_response # The most recent response in provider format
 response.context # The original context that was sent

-# Usage statistics (when available from provider)
-response.prompt_tokens
+# Usage statistics
+response.usage # Normalized usage object
+response.usage.input_tokens
 ```

 ## Next Steps

 - [Agents](/agents) - Understanding the full agent lifecycle
 - [Actions](/actions) - Define what your agents can do
+- [Usage Statistics](/actions/usage) - Track token consumption and costs
 - [Messages](/actions/messages) - Work with multimodal content
 - [Tools](/actions/tools) - Enable function calling capabilities
 - [Streaming](/agents/streaming) - Stream responses in real-time
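
Reviewer sketch (not part of the commit): these hunks replace the documented `response.prompt_tokens` and `response.completion_tokens` accessors with `response.usage.*`. Code that has to work on both sides of the change could read counts through a small adapter like the one below. The `token_counts` helper and the two `Struct` stand-ins are hypothetical, and whether the legacy accessors remain callable after this commit is not shown in the diff, hence the `respond_to?` guards.

```ruby
# Hypothetical adapter: prefer the normalized usage object, fall back to the
# legacy accessors, and return nil when neither is available.
def token_counts(response)
  if response.respond_to?(:usage) && response.usage
    usage = response.usage
    { input: usage.input_tokens, output: usage.output_tokens, total: usage.total_tokens }
  elsif response.respond_to?(:prompt_tokens)
    input = response.prompt_tokens.to_i
    output = response.respond_to?(:completion_tokens) ? response.completion_tokens.to_i : 0
    { input: input, output: output, total: input + output }
  end
end

# Stand-ins for real response objects, for illustration only.
UsageStub      = Struct.new(:input_tokens, :output_tokens, :total_tokens)
NewResponse    = Struct.new(:usage)
LegacyResponse = Struct.new(:prompt_tokens, :completion_tokens)

new_counts    = token_counts(NewResponse.new(UsageStub.new(100, 25, 125)))
legacy_counts = token_counts(LegacyResponse.new(100, 25))

puts new_counts[:total]    # prints 125
puts legacy_counts[:total] # prints 125
```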

docs/framework.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ When you define an agent, you create a specialized participant that interacts wi

 - **Agent** (Controller) - Manages lifecycle, defines actions, configures providers
 - **Generation** (Request Proxy) - Coordinates execution, holds configuration, provides synchronous/async methods. Created by invocation, it's lazy—execution doesn't start until you call `.prompt_now`, `.embed_now`, or `.prompt_later`.
-- **Response** (Result) - Contains messages, metadata, token usage, and parsed output. Returned after Generation executes.
+- **Response** (Result) - Contains messages, metadata, and normalized usage statistics (see **[Usage Statistics](/actions/usage)**). Returned after Generation executes.

 **Request-Response Lifecycle:**
