Merged
Changes from 3 commits
6 changes: 5 additions & 1 deletion .gitignore
@@ -20,6 +20,7 @@ node_modules

 # IDE - VSCode
 .vscode/*
+.vscode/
 !.vscode/settings.json
 !.vscode/tasks.json
 !.vscode/launch.json
@@ -46,4 +47,7 @@ Thumbs.db
 # env
 .env*
 chroma.sqlite3
-chroma.log
+chroma.log
+
+# claude
+.claude
140 changes: 140 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,140 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

OpenLLMetry-JS is a JavaScript/TypeScript observability framework for LLM applications, built on OpenTelemetry. It provides instrumentation for major LLM providers (OpenAI, Anthropic, etc.) and vector databases, with a unified SDK for easy integration.

## Development Commands

### Building
```bash
# Build all packages
pnpm nx run-many -t build
# or
pnpm nx run-many --targets=build

# Build affected packages only
pnpm nx affected -t build
```

### Testing
Each package has its own test command:
```bash
# Test individual packages
cd packages/traceloop-sdk
pnpm test

# Test specific instrumentation
cd packages/instrumentation-openai
pnpm test
```

### Linting
```bash
# Lint individual packages
cd packages/[package-name]
pnpm lint

# Fix lint issues
pnpm lint:fix
```

### Publishing
```bash
pnpm nx run-many --targets=build
pnpm lerna publish --no-private
```

## Architecture

### Monorepo Structure
- **Lerna + Nx**: Manages multiple packages with shared tooling
- **packages/**: Contains all publishable packages and internal tooling
- **Rollup**: Used for building packages with TypeScript compilation

### Core Packages

#### `traceloop-sdk` (Main SDK)
- **Path**: `packages/traceloop-sdk/`
- **Exports**: `@traceloop/node-server-sdk`
- **Purpose**: Primary entry point that orchestrates all instrumentations
- **Key Files**:
- `src/lib/tracing/decorators.ts`: Workflow and task decorators (`@workflow`, `@task`, `@agent`)
- `src/lib/tracing/tracing.ts`: Core tracing utilities and span management
- `src/lib/node-server-sdk.ts`: Main initialization logic

#### Instrumentation Packages
Each follows the pattern: `packages/instrumentation-[provider]/`
- **OpenAI**: `@traceloop/instrumentation-openai`
- **Anthropic**: `@traceloop/instrumentation-anthropic`
- **Bedrock**: `@traceloop/instrumentation-bedrock`
- **Vector DBs**: Pinecone, Chroma, Qdrant packages
- **Frameworks**: LangChain, LlamaIndex packages

#### `ai-semantic-conventions`
- **Path**: `packages/ai-semantic-conventions/`
- **Purpose**: OpenTelemetry semantic conventions for AI/LLM spans
- **Key File**: `src/SemanticAttributes.ts` - defines all span attribute constants

### Instrumentation Pattern
All instrumentations extend `InstrumentationBase` from `@opentelemetry/instrumentation`:
1. **Hook Registration**: Wrap target library functions using `InstrumentationModuleDefinition`
2. **Span Creation**: Create spans with appropriate semantic attributes
3. **Data Extraction**: Extract request/response data and token usage
4. **Error Handling**: Capture and record errors appropriately
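The four steps above can be sketched without any OpenTelemetry dependency. This is a hypothetical, dependency-free illustration only: `FakeSpan`, `spans`, and `wrapMethod` are local stand-ins, not the real API — the actual packages extend `InstrumentationBase` and emit spans through the OpenTelemetry tracer.

```typescript
// Span-like record used only for this sketch.
interface FakeSpan {
  name: string;
  attributes: Record<string, unknown>;
  error?: string;
}

const spans: FakeSpan[] = [];

// Step 1: hook registration — replace a client method with a wrapper.
function wrapMethod(
  target: { [k: string]: unknown },
  method: string,
  spanName: string,
): void {
  const original = target[method] as (...args: unknown[]) => Promise<unknown>;
  target[method] = async (...args: unknown[]) => {
    // Steps 2 and 3: create a span and extract request data as attributes.
    const span: FakeSpan = { name: spanName, attributes: { "llm.prompts": args } };
    spans.push(span);
    try {
      return await original.apply(target, args);
    } catch (e) {
      // Step 4: record the error on the span before rethrowing.
      span.error = String(e);
      throw e;
    }
  };
}

// Wrap a fake LLM client and call through the instrumented method.
const client = { complete: async (prompt: string) => `echo: ${prompt}` };
wrapMethod(client, "complete", "llm.completion");

client.complete("hi").then((reply) => {
  console.log(reply, spans[0].name); // echo: hi llm.completion
});
```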

### Testing Strategy
- **Polly.js**: Records HTTP interactions for consistent test execution
- **ts-mocha**: TypeScript test runner
- **Recordings**: Stored in `recordings/` folders for replay testing

## Key Patterns

### Workspace Dependencies
Packages reference each other using `workspace:*` in package.json, managed by pnpm workspaces.
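For instance, a package depending on the shared conventions package would declare it like this (the package name comes from the list above; the exact dependency set varies per package):

```json
{
  "dependencies": {
    "@traceloop/ai-semantic-conventions": "workspace:*"
  }
}
```

pnpm resolves `workspace:*` to the local package during development and rewrites it to the published version on publish.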

### Decorator Usage
```typescript
// TypeScript decorators apply to class members, not free functions
class MyWorkflows {
  // Workflow spans
  @workflow("my-workflow")
  async myWorkflow() {}

  // Task spans
  @task("my-task")
  async myTask() {}
}
```

### Manual Instrumentation
```typescript
import { trace } from "@traceloop/node-server-sdk";
const span = trace.withLLMSpan("my-llm-call", () => {
// LLM operations
});
```

### Telemetry Configuration
- Anonymous telemetry enabled by default
- Opt-out via `TRACELOOP_TELEMETRY=FALSE` environment variable
- Only collected in SDK, not individual instrumentations
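To opt out for the current shell session (variable name as listed above):

```shell
# Disable anonymous telemetry before initializing the SDK
export TRACELOOP_TELEMETRY=FALSE
```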

## Common Development Tasks

### Adding New LLM Provider
1. Create new instrumentation package in `packages/instrumentation-[provider]/`
2. Implement instrumentation extending `InstrumentationBase`
3. Add to main SDK dependencies in `packages/traceloop-sdk/package.json`
4. Register in SDK initialization

### Running Single Test
```bash
cd packages/[package-name]
pnpm test -- --grep "test name pattern"
```

### Debugging Instrumentations
Enable OpenTelemetry debug logging:
```bash
export OTEL_LOG_LEVEL=debug
```
123 changes: 123 additions & 0 deletions packages/sample-app/README.md
@@ -0,0 +1,123 @@
# Experiment Sample Application

This sample app demonstrates the new experiment functionality in the OpenLLMetry-JS SDK.

## 🚀 Quick Start

### Prerequisites
- Node.js >= 14
- pnpm

### Setup
1. Copy environment variables:
```bash
cp .env.example .env
```

2. Fill in your API keys in `.env`:
```bash
TRACELOOP_API_KEY=your_traceloop_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
```

3. Build and run:
```bash
# Run JSONL parsing tests (no API keys required)
npm run run:experiment_test

# Run experiment example (requires API keys)
npm run run:experiment
```

## 🧪 Experiment Features Implemented

### Core Components
- **Experiment Client**: `client.experiment.run(taskFunction, options)`
- **Evaluator Client**: `client.evaluator.runExperimentEvaluator(...)`
- **SSE Streaming**: Real-time progress updates via Server-Sent Events
- **Dataset Integration**: Works with existing dataset functionality
- **JSONL Support**: Parse dataset versions in JSONL format
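The JSONL convention is simple: one JSON object per line, blank lines ignored. The client handles this internally; `parseJsonl` below is a local helper written only to illustrate the format, and the row contents are made-up placeholders.

```typescript
// Parse a JSONL string: each non-empty line is an independent JSON object.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

const rows = parseJsonl('{"input":"What is a fever?"}\n{"input":"Hello"}\n');
console.log(rows.length); // 2
```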

### Example Usage
```typescript
import * as traceloop from "@traceloop/node-server-sdk";

traceloop.initialize({ apiKey: "your-key", appName: "my-app" });
const client = traceloop.getClient();

// Define your experiment task
const myTask: ExperimentTaskFunction = async (input) => {
// Your task logic here
return { result: "processed", input };
};

// Run experiment
const results = await client.experiment.run(myTask, {
datasetSlug: "my-dataset",
datasetVersion: "v1",
evaluators: [{ name: "accuracy" }],
experimentSlug: "my-experiment",
stopOnError: false,
waitForResults: true,
concurrency: 3
});

console.log(`Results: ${results.results.length}`);
console.log(`Errors: ${results.errors.length}`);
```

## 🐛 Debug Configuration

### VS Code Debugging
Three debug configurations are available:

1. **Debug Experiment Example**: Full debug with real API calls
2. **Debug Simple Experiment Test**: Debug the JSONL parsing tests
3. **Debug Experiment Example (Mock Mode)**: Debug with mocked API responses

### Environment Variables
- `MOCK_MODE=true`: Enable mock responses for debugging without API keys
- `DEBUG=traceloop:*`: Enable debug logging
- `OTEL_LOG_LEVEL=debug`: Enable OpenTelemetry debug logs

### Running in Mock Mode
```bash
MOCK_MODE=true npm run run:experiment
```

## 📁 Files Structure

- `src/experiment_example.ts`: Main experiment demonstration
- `src/medical_prompts.ts`: Example prompt templates for healthcare experiments
- `src/simple_experiment_test.ts`: JSONL parsing tests
- `.vscode/launch.json`: Debug configurations
- `.vscode/tasks.json`: Build and run tasks

## 🔧 Available Scripts

- `npm run build`: Build TypeScript
- `npm run run:experiment`: Run experiment example
- `npm run run:experiment_test`: Run JSONL parsing tests
- `npm run lint`: Run ESLint
- `npm run lint:fix`: Fix ESLint issues

## 📊 Experiment Types Demonstrated

### Medical Question Experiments
- **Refuse Advice Strategy**: Redirects users to medical professionals
- **Provide Info Strategy**: Educational responses with disclaimers
- **Comparison**: Side-by-side evaluation of different approaches

### Sentiment Analysis Experiment
- **Task**: Analyze text sentiment
- **Evaluators**: Accuracy and confidence calibration
- **Concurrency**: Demonstrates parallel processing

## 🔗 Integration Points

The experiment feature integrates with:
- **Existing Dataset API**: Uses `client.datasets.get()`
- **TraceloopClient**: Available as `client.experiment`
- **OpenTelemetry**: All experiments are traced
- **Error Handling**: Configurable error propagation
- **Type Safety**: Full TypeScript support
3 changes: 3 additions & 0 deletions packages/sample-app/package.json
@@ -33,6 +33,8 @@
     "run:image_generation": "npm run build && node dist/src/sample_openai_image_generation.js",
     "run:sample_edit": "npm run build && node dist/src/test_edit_only.js",
     "run:sample_generate": "npm run build && node dist/src/test_generate_only.js",
+    "run:experiment": "npm run build && node dist/src/experiment_example.js",
+    "run:experiment_test": "npm run build && node dist/src/simple_experiment_test.js",
     "dev:image_generation": "pnpm --filter @traceloop/instrumentation-openai build && pnpm --filter @traceloop/node-server-sdk build && npm run build && node dist/src/sample_openai_image_generation.js",
     "lint": "eslint .",
     "lint:fix": "eslint . --fix"
@@ -71,6 +73,7 @@
     "langchain": "^0.3.30",
     "llamaindex": "^0.11.19",
     "openai": "^5.12.2",
+    "eventsource": "^3.0.2",
     "zod": "^3.25.76"
   },
   "private": true,