Commit 43c4bab

nicolangr and TonsOfFun authored
added ollama embeddigs parameters for nomic-embed-text (#214)
* added ollama embeddigs parameters for nomic-embed-text
* fixed Space missing after colon.
* refactoring
* lint fix
* lint fix
* added embed_now generation
* added embed_later generation
* Adding embeddings callback and tests
* Linting
* Updating embeddings docs and cassettes.

---------

Co-authored-by: Justin Bowen <JusBowen@gmail.com>
1 parent f7b3041 commit 43c4bab

26 files changed

Lines changed: 55324 additions & 3096 deletions

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -1299,3 +1299,4 @@ When updating documentation:

- VCR cassettes need to be removed and tests run again to record new cassettes when the request params change
- Do not hardcode examples and make sure to use vscode regions and vite-press code snippets imports

docs/docs/framework/embeddings.md

Lines changed: 384 additions & 0 deletions
@@ -0,0 +1,384 @@
# Embeddings

Embeddings are numerical representations of text that capture semantic meaning, enabling similarity searches, clustering, and other vector-based operations. ActiveAgent provides a unified interface for generating embeddings across all supported providers.

## Overview

An embedding model maps text to a high-dimensional vector. Texts with similar meaning map to nearby vectors, which enables features such as:

- **Semantic Search** - Find related content by meaning, not just keywords
- **Clustering** - Group similar documents automatically
- **Classification** - Categorize text based on similarity to examples
- **Recommendation** - Suggest related content based on embeddings
- **Anomaly Detection** - Identify outliers in text data

## Basic Usage

### Generating Embeddings

Use the `embed_now` method to generate embeddings synchronously:

<<< @/../test/agents/embedding_agent_test.rb#embedding_sync_generation {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-generates-embeddings-synchronously-with-embed-now.md -->
:::

### Async Embeddings

Use `embed_later` to generate embeddings in a background job:

<<< @/../test/agents/embedding_agent_test.rb#embedding_async_generation {ruby:line-numbers}

## Embedding Callbacks

Use callbacks to process embeddings before and after generation:

<<< @/../test/agents/embedding_agent_test.rb#embedding_with_callbacks {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-processes-embeddings-with-callbacks.md -->
:::

## Provider Configuration

Each provider supports different embedding models and configurations:

### OpenAI

Configure OpenAI-specific embedding models:

<<< @/../test/agents/embedding_agent_test.rb#embedding_openai_model_config {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-uses-configured-openai-embedding-model.md -->
:::

### Ollama

Configure Ollama for local embedding generation:

<<< @/../test/agents/embedding_agent_test.rb#embedding_ollama_provider_test {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-generates-embeddings-with-Ollama-provider.md -->
:::

### Error Handling

ActiveAgent provides error handling for connection issues:

<<< @/../test/generation_provider/ollama_provider_test.rb#ollama_provider_embed {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/ollama-provider-test.rb-test-embed-method-works-with-ollama-provider.md -->
:::

## Working with Embeddings

### Similarity Search

Find similar documents using cosine similarity:

<<< @/../test/agents/embedding_agent_test.rb#embedding_similarity_search {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-performs-similarity-search-with-embeddings.md -->
:::

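
For reference, cosine similarity itself fits in a few lines of plain Ruby. The sketch below is an illustration only, not the code imported from the test file: `cosine_similarity` and `most_similar` are hypothetical helper names, and embeddings are assumed to be plain arrays of numbers of equal length.

```ruby
# Cosine similarity between two equal-length numeric vectors.
# Returns a value in [-1.0, 1.0]; 0.0 if either vector has zero length.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Returns the top_k [id, score] pairs from a { id => embedding } hash,
# ordered from most to least similar to the query vector.
def most_similar(query, corpus, top_k: 3)
  corpus
    .map { |id, embedding| [id, cosine_similarity(query, embedding)] }
    .max_by(top_k) { |_id, score| score }
end
```

An in-memory scan like this is linear in the number of documents, so it suits small collections; larger ones usually call for a vector index.
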
### Batch Processing

Process multiple embeddings efficiently:

<<< @/../test/agents/embedding_agent_test.rb#embedding_batch_processing {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-processes-multiple-embeddings-in-batch.md -->
:::

### Embedding Dimensions

Different models produce different embedding dimensions:

<<< @/../test/agents/embedding_agent_test.rb#embedding_dimension_test {ruby:line-numbers}

::: details Response Example
<!-- @include: @/parts/examples/embedding-agent-test.rb-test-verifies-embedding-dimensions-for-different-models.md -->
:::

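
Because vectors of different sizes cannot be compared, it can help to check the dimension before storing an embedding. The following is a minimal sketch, not part of ActiveAgent: `EXPECTED_DIMENSIONS`, `DimensionMismatchError`, and `validate_dimensions!` are hypothetical names, and the sizes listed are the models' default output dimensions.

```ruby
# Default output sizes. The text-embedding-3 models can be shortened with
# OpenAI's `dimensions` request parameter, so treat these as defaults only.
EXPECTED_DIMENSIONS = {
  "text-embedding-3-small" => 1536,
  "text-embedding-3-large" => 3072,
  "nomic-embed-text" => 768
}.freeze

class DimensionMismatchError < StandardError; end

# Returns the embedding unchanged when its size matches the model's
# expected dimension; raises otherwise.
def validate_dimensions!(embedding, model)
  expected = EXPECTED_DIMENSIONS.fetch(model) do
    raise ArgumentError, "unknown embedding model: #{model}"
  end
  return embedding if embedding.size == expected

  raise DimensionMismatchError,
        "#{model} should produce #{expected} dimensions, got #{embedding.size}"
end
```
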
## Advanced Patterns

### Caching Embeddings

Cache embeddings to avoid regenerating them:

```ruby
require "digest"

class CachedEmbeddingAgent < ApplicationAgent
  def get_embedding(text)
    # Key the cache on a hash of the text so identical inputs share one entry
    cache_key = "embedding:#{Digest::SHA256.hexdigest(text)}"

    Rails.cache.fetch(cache_key, expires_in: 30.days) do
      generation = self.class.with(message: text).prompt_context
      generation.embed_now.message.content
    end
  end
end
```

### Multi-Model Embeddings

Use different models for different purposes:

```ruby
class MultiModelEmbeddingAgent < ApplicationAgent
  # Note: generate_with is called on the class, so the model switch likely
  # persists across calls; avoid relying on this pattern under concurrent use.
  def generate_semantic_embedding(text)
    # High-quality semantic embedding
    self.class.generate_with :openai,
      embedding_model: "text-embedding-3-large"

    generation = self.class.with(message: text).prompt_context
    generation.embed_now
  end

  def generate_fast_embedding(text)
    # Faster, smaller embedding for real-time use
    self.class.generate_with :openai,
      embedding_model: "text-embedding-3-small"

    generation = self.class.with(message: text).prompt_context
    generation.embed_now
  end
end
```

## Vector Databases

Store and query embeddings using vector databases:

### PostgreSQL with pgvector

```ruby
class PgVectorAgent < ApplicationAgent
  def store_document(text)
    # Generate embedding
    generation = self.class.with(message: text).prompt_context
    embedding = generation.embed_now.message.content

    # Store in PostgreSQL with pgvector
    Document.create!(
      content: text,
      embedding: embedding # pgvector column
    )
  end

  def search_similar(query, limit: 10)
    query_embedding = get_embedding(query)

    # Use pgvector's <=> operator for cosine distance
    # (<-> is Euclidean/L2 distance)
    Document
      .order(Arel.sql("embedding <=> '#{query_embedding}'"))
      .limit(limit)
  end

  private

  # Generates an embedding for the given text
  def get_embedding(text)
    generation = self.class.with(message: text).prompt_context
    generation.embed_now.message.content
  end
end
```

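
The query above interpolates the embedding directly into SQL. One way to make that safer is to build the vector literal explicitly and reject anything that is not a finite number. This is a sketch under the assumption that the embedding is a Ruby array; `to_pgvector_literal` is a hypothetical helper, not part of ActiveAgent or pgvector.

```ruby
# Formats an array of numbers as a pgvector text literal such as "[0.1,2.0]".
# Raises if any element is not a finite Integer or Float, so arbitrary
# strings can never reach the SQL fragment.
def to_pgvector_literal(embedding)
  values = embedding.map do |value|
    numeric = value.is_a?(Integer) || value.is_a?(Float)
    unless numeric && value.to_f.finite?
      raise ArgumentError, "embedding must contain only finite numbers"
    end

    value.to_f
  end

  "[#{values.join(',')}]"
end
```

The result can then replace the raw interpolation, for example `"embedding <=> '#{to_pgvector_literal(query_embedding)}'"`.
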
### Pinecone Integration

```ruby
class PineconeAgent < ApplicationAgent
  def initialize
    super
    # Assumes a Pinecone client gem that exposes this interface
    @pinecone = Pinecone::Client.new(api_key: ENV['PINECONE_API_KEY'])
    @index = @pinecone.index('documents')
  end

  def upsert_document(id, text, metadata = {})
    embedding = get_embedding(text)

    @index.upsert(
      vectors: [{
        id: id,
        values: embedding,
        metadata: metadata.merge(text: text)
      }]
    )
  end

  def query_similar(text, top_k: 10)
    embedding = get_embedding(text)

    @index.query(
      vector: embedding,
      top_k: top_k,
      include_metadata: true
    )
  end

  private

  # Generates an embedding for the given text
  def get_embedding(text)
    generation = self.class.with(message: text).prompt_context
    generation.embed_now.message.content
  end
end
```

## Testing Embeddings

The examples above are imported from the test suite (`test/agents/embedding_agent_test.rb`), which covers callbacks, similarity search, and batch processing. Use those tests as a starting point for your own.

## Performance Optimization

### Batch Processing

Slice large document sets into batches and embed each batch on a small thread pool:

```ruby
require "parallel" # provided by the parallel gem

class BatchOptimizedAgent < ApplicationAgent
  def process_documents(documents)
    documents.each_slice(100) do |batch|
      Parallel.each(batch, in_threads: 5) do |doc|
        generation = self.class.with(message: doc.content).prompt_context
        embedding = generation.embed_now.message.content

        # Each thread checks out its own database connection
        ActiveRecord::Base.connection_pool.with_connection do
          doc.update!(embedding: embedding)
        end
      end
    end
  end
end
```

### Caching Strategy

Cache selectively, for example only for long or high-value texts:

```ruby
require "digest"

class SmartCacheAgent < ApplicationAgent
  def get_or_generate_embedding(text)
    # Check cache first
    cached = fetch_from_cache(text)
    return cached if cached

    # Generate if not cached
    embedding = generate_embedding(text)

    # Cache based on text length and importance
    cache_embedding(text, embedding) if should_cache?(text)

    embedding
  end

  private

  def should_cache?(text)
    text.length > 100 || text.include?("important")
  end

  def cache_key(text)
    "embedding:#{Digest::SHA256.hexdigest(text)}"
  end

  def fetch_from_cache(text)
    Rails.cache.read(cache_key(text))
  end

  def cache_embedding(text, embedding)
    Rails.cache.write(cache_key(text), embedding, expires_in: 30.days)
  end

  def generate_embedding(text)
    self.class.with(message: text).prompt_context.embed_now.message.content
  end
end
```

## Best Practices

1. **Choose the Right Model** - Balance quality, speed, and cost
2. **Normalize Text** - Preprocess consistently before embedding
3. **Cache Aggressively** - Embeddings are expensive to generate
4. **Batch When Possible** - Process multiple texts together
5. **Monitor Dimensions** - Different models produce different sizes
6. **Use Callbacks** - Process embeddings consistently
7. **Handle Failures** - Implement retry logic and fallbacks
8. **Version Embeddings** - Track which model generated each embedding

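
Two of these practices, normalizing text and versioning embeddings, can be combined in the cache key. The sketch below is one possible approach; `normalize_for_embedding` and `embedding_cache_key` are hypothetical helpers, and the specific normalization steps (NFC plus whitespace collapsing) are an assumption rather than an ActiveAgent convention.

```ruby
require "digest"

# Applies Unicode NFC normalization and collapses runs of whitespace so that
# equivalent inputs produce the same text (and therefore the same cache key).
def normalize_for_embedding(text)
  text.unicode_normalize(:nfc).gsub(/\s+/, " ").strip
end

# Includes the model name in the key so that vectors generated by different
# models are never served for one another.
def embedding_cache_key(text, model:)
  digest = Digest::SHA256.hexdigest(normalize_for_embedding(text))
  "embedding:#{model}:#{digest}"
end
```

With a key like this, switching models naturally misses the cache instead of returning vectors with the wrong dimensions.
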
## Common Use Cases

### Semantic Search

```ruby
class SemanticSearchAgent < ApplicationAgent
  def build_search_index(documents)
    documents.each do |doc|
      generation = self.class.with(message: doc.content).prompt_context
      doc.update!(embedding: generation.embed_now.message.content)
    end
  end

  def search(query)
    query_embedding = get_embedding(query)

    # <-> is pgvector's Euclidean (L2) distance; smaller means closer
    Document
      .select("*, embedding <-> '#{query_embedding}' as distance")
      .order("distance")
      .limit(10)
  end

  private

  # Generates an embedding for the given text
  def get_embedding(text)
    generation = self.class.with(message: text).prompt_context
    generation.embed_now.message.content
  end
end
```

### Content Recommendations

```ruby
class RecommendationAgent < ApplicationAgent
  def recommend_similar(article)
    article_embedding = article.embedding || generate_embedding(article.content)

    # <-> returns a distance, so ascending order puts the closest articles first
    Article
      .where.not(id: article.id)
      .select("*, embedding <-> '#{article_embedding}' as distance")
      .order("distance")
      .limit(5)
  end

  private

  # Generates an embedding when the article does not have one stored
  def generate_embedding(text)
    self.class.with(message: text).prompt_context.embed_now.message.content
  end
end
```

### Clustering

```ruby
class ClusteringAgent < ApplicationAgent
  def cluster_documents(documents, num_clusters: 5)
    # Generate embeddings
    embeddings = documents.map do |doc|
      get_embedding(doc.content)
    end

    # Use k-means or other clustering algorithm.
    # perform_clustering is a placeholder you must supply; it should return
    # one cluster id per embedding, in the same order.
    clusters = perform_clustering(embeddings, num_clusters)

    # Assign documents to clusters
    documents.zip(clusters).each do |doc, cluster_id|
      doc.update!(cluster_id: cluster_id)
    end
  end

  private

  # Generates an embedding for the given text
  def get_embedding(text)
    generation = self.class.with(message: text).prompt_context
    generation.embed_now.message.content
  end
end
```

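
The `perform_clustering` call above is left undefined. As one possible shape for it, here is a small k-means sketch in plain Ruby. It is an assumption about what such a helper could look like, not the author's implementation: it seeds the centroids with the first `num_clusters` vectors to stay deterministic, whereas production code would more commonly use random or k-means++ initialization or a dedicated library.

```ruby
# Squared Euclidean distance between two equal-length vectors.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Component-wise mean of a non-empty list of vectors.
def centroid(vectors)
  vectors.transpose.map { |column| column.sum / column.size.to_f }
end

# Returns one cluster id (0...num_clusters) per embedding, in input order.
def perform_clustering(embeddings, num_clusters, max_iterations: 50)
  raise ArgumentError, "need at least num_clusters embeddings" if embeddings.size < num_clusters

  # Deterministic seeding: the first num_clusters vectors become the centroids.
  centroids = embeddings.first(num_clusters).map(&:dup)
  assignments = nil

  max_iterations.times do
    # Assignment step: attach each vector to its nearest centroid.
    new_assignments = embeddings.map do |vector|
      (0...num_clusters).min_by { |i| squared_distance(vector, centroids[i]) }
    end
    break if new_assignments == assignments # converged

    assignments = new_assignments

    # Update step: move each centroid to the mean of its members,
    # keeping the old centroid if a cluster ended up empty.
    centroids = centroids.each_with_index.map do |old, i|
      members = embeddings.each_index.select { |j| assignments[j] == i }.map { |j| embeddings[j] }
      members.empty? ? old : centroid(members)
    end
  end

  assignments
end
```
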
## Troubleshooting

### Common Issues

1. **Dimension Mismatch** - Ensure all embeddings that are stored or compared together come from the same model
2. **Memory Issues** - Large embedding vectors can consume significant RAM
3. **Rate Limits** - Implement exponential backoff for API limits
4. **Cost Management** - Monitor embedding API usage and costs
5. **Connection Errors** - Handle network issues with Ollama and other providers

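
For rate limits and transient connection errors, exponential backoff can be sketched as a small wrapper. This is an illustration, not an ActiveAgent API: `with_backoff` and its parameters are hypothetical, and the error classes to retry on depend on what your provider client raises.

```ruby
# Runs the block, retrying with exponential backoff when it raises one of
# the errors in `retry_on`. Returns the block's result on success and
# re-raises the last error once max_attempts is reached.
# `sleeper` is injectable so that delays can be stubbed in tests.
def with_backoff(max_attempts: 5, base_delay: 0.5, retry_on: [StandardError],
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *retry_on
    raise if attempt >= max_attempts

    # Delay doubles on each attempt, with up to 10% random jitter added.
    delay = base_delay * (2**(attempt - 1))
    sleeper.call(delay + rand * delay * 0.1)
    retry
  end
end
```

A call such as `generation.embed_now` could then be wrapped in `with_backoff(retry_on: [...]) { ... }`, with `retry_on` narrowed to the errors that are actually worth retrying.
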
### Debugging

```ruby
class DebuggingAgent < ApplicationAgent
  def debug_embedding(text)
    generation = self.class.with(message: text).prompt_context

    Rails.logger.info "Generating embedding for: #{text[0..100]}..."
    Rails.logger.info "Provider: #{generation_provider.class.name}"
    Rails.logger.info "Model: #{generation_provider.embedding_model}"

    response = generation.embed_now
    embedding = response.message.content

    # Basic statistics help spot empty or degenerate vectors
    Rails.logger.info "Dimensions: #{embedding.size}"
    Rails.logger.info "Range: [#{embedding.min}, #{embedding.max}]"
    Rails.logger.info "Mean: #{embedding.sum / embedding.size}"

    embedding
  end
end
```

## Related Documentation

- [Generation Provider Overview](/docs/framework/generation-provider)
- [OpenAI Provider](/docs/generation-providers/openai-provider)
- [Ollama Provider](/docs/generation-providers/ollama-provider)
- [Callbacks](/docs/active-agent/callbacks)
- [Generation](/docs/active-agent/generation)
