- Add SQLite database for storing embeddings and chunks by content hash
- Embeddings are deduplicated across branches - switching branches reuses existing embeddings
- Search results are filtered to only include chunks on the current branch
- Auto-migrate legacy indexes on first run
- Add git branch detection and HEAD watcher for automatic re-indexing on branch switch
- Fix glob pattern to match root-level files (e.g., `**/*.js` now matches `file.js`)
- Add health check garbage collection for orphaned embeddings and chunks
- Fix lint errors: replace require() with ES imports
- Update README with accurate Quick Start instructions
Storage structure:
.opencode/index/
├── codebase.db # SQLite: embeddings, chunks, branch catalog
├── vectors.usearch # Vector index (uSearch)
├── inverted-index.json # BM25 keyword index
└── file-hashes.json # File change detection
 1. **Parsing**: We use `tree-sitter` to intelligently parse your code into meaningful blocks (functions, classes, interfaces). JSDoc comments and docstrings are automatically included with their associated code.
 2. **Chunking**: Large blocks are split with overlapping windows to preserve context across chunk boundaries.
 3. **Embedding**: These blocks are converted into vector representations using your configured AI provider.
-4. **Storage**: Vectors are stored in a high-performance local index using `usearch` with F16 quantization for 50% memory savings.
-5. **Hybrid Search**: Combines semantic similarity (vectors) with BM25 keyword matching for best results.
+4. **Storage**: Embeddings are stored in SQLite (deduplicated by content hash) and vectors in `usearch` with F16 quantization for 50% memory savings. A branch catalog tracks which chunks exist on each branch.
+5. **Hybrid Search**: Combines semantic similarity (vectors) with BM25 keyword matching, filtered by current branch.
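The chunking step in the list above ("overlapping windows") can be illustrated with a small sketch. This is not the plugin's code: the helper name `chunkWithOverlap`, the line-based windows, and the sizes used are all assumptions chosen for illustration.

```typescript
// Hypothetical helper: split a long block into windows of `windowSize` lines,
// where each window repeats the last `overlap` lines of the previous one.
function chunkWithOverlap(lines: string[], windowSize: number, overlap: number): string[][] {
  if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
    throw new Error("require windowSize > 0 and 0 <= overlap < windowSize");
  }
  const step = windowSize - overlap; // how far each window advances
  const chunks: string[][] = [];
  for (let start = 0; start < lines.length; start += step) {
    chunks.push(lines.slice(start, start + windowSize));
    if (start + windowSize >= lines.length) break; // this window reached the end
  }
  return chunks;
}

// Illustrative input: a 10-line block, 4-line windows, 2 lines of overlap.
const block: string[] = [];
for (let i = 1; i <= 10; i++) block.push("line " + i);
const chunks = chunkWithOverlap(block, 4, 2);
```

With these illustrative values the block yields four windows, and each window starts with the last two lines of the one before it, so text near a boundary appears in both neighbouring chunks.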

 **Performance characteristics:**
 - **Incremental indexing**: ~50ms check time, only re-embeds changed files
 - **Smart chunking**: Understands code structure to keep functions whole, with overlap for context
 - **Native speed**: Core logic written in Rust for maximum performance
 - **Memory efficient**: F16 vector quantization reduces index size by 50%
+- **Branch-aware**: Automatically tracks which chunks exist on each git branch
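The 50% figure for F16 quantization follows from storing each vector component in 2 bytes instead of 4. The arithmetic below is a sketch only: the chunk count and embedding dimension are assumed values, and it counts raw vector payload, not any additional index structure.

```typescript
// Bytes needed to store `count` vectors of `dimensions` components each.
function vectorBytes(count: number, dimensions: number, bytesPerValue: number): number {
  return count * dimensions * bytesPerValue;
}

const chunkCount = 100000; // assumed number of indexed chunks
const dimensions = 768;    // assumed embedding dimension

const f32Bytes = vectorBytes(chunkCount, dimensions, 4); // 32-bit floats
const f16Bytes = vectorBytes(chunkCount, dimensions, 2); // 16-bit floats
const savings = 1 - f16Bytes / f32Bytes;                 // fraction of payload saved
```

The ratio is independent of the assumed sizes: halving the bytes per component halves the vector payload.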
+
+## 🌿 Branch-Aware Indexing
+
+The plugin automatically detects git branches and optimizes indexing across branch switches.
+
+### How It Works
+
+When you switch branches, code changes but embeddings for unchanged content remain the same. The plugin:
+
+1. **Stores embeddings by content hash**: Embeddings are deduplicated across branches
+2. **Tracks branch membership**: A lightweight catalog tracks which chunks exist on each branch
+3. **Filters search results**: Queries only return results relevant to the current branch
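The three steps above can be sketched with in-memory structures. This is an illustration under stated assumptions, not the plugin's implementation: the real storage is SQLite, while the class `BranchAwareStore`, its methods, the FNV-1a stand-in hash, and the fake embedding function here are all hypothetical.

```typescript
type Embedding = number[];

// Stand-in content hash (32-bit FNV-1a) so the sketch has no dependencies.
// The hash the plugin actually uses is not stated in the source.
function contentHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 via shifts, then wrap to 32 bits.
    hash = (hash + ((hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24))) >>> 0;
  }
  return hash.toString(16);
}

class BranchAwareStore {
  // content hash -> embedding, shared by every branch
  private embeddings: Record<string, Embedding> = Object.create(null);
  // branch name -> content hashes present on that branch
  private catalog: Record<string, string[]> = Object.create(null);
  private embed: (text: string) => Embedding;
  embedCalls = 0;

  constructor(embed: (text: string) => Embedding) {
    this.embed = embed;
  }

  // Steps 1 and 2: embed only unseen content, then record branch membership.
  indexChunk(branch: string, text: string): string {
    const hash = contentHash(text);
    if (!(hash in this.embeddings)) {
      this.embeddings[hash] = this.embed(text);
      this.embedCalls++;
    }
    const hashes = this.catalog[branch] || (this.catalog[branch] = []);
    if (hashes.indexOf(hash) === -1) hashes.push(hash);
    return hash;
  }

  // Step 3: keep only the candidate hashes that exist on the given branch.
  filterToBranch(branch: string, candidates: string[]): string[] {
    const onBranch = this.catalog[branch] || [];
    return candidates.filter((h) => onBranch.indexOf(h) !== -1);
  }
}

// Placeholder for a call to the configured embedding provider.
const fakeEmbed = (text: string): Embedding => [text.length];

const store = new BranchAwareStore(fakeEmbed);
const shared = store.indexChunk("main", "function add(a, b) { return a + b; }");
store.indexChunk("feature", "function add(a, b) { return a + b; }"); // reuses the embedding
const featureOnly = store.indexChunk("feature", "function sub(a, b) { return a - b; }");

// Pretend a vector search returned both chunks; only one exists on main.
const visibleOnMain = store.filterToBranch("main", [shared, featureOnly]);
```

In this sketch the chunk shared by both branches is embedded once, and a query on `main` drops the chunk that only exists on `feature`.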
+
+### Benefits
+
+| Scenario | Without Branch Awareness | With Branch Awareness |