
Commit c874abb

Merge branch 'main' into Resolve-conflict

2 parents: bd3ceb6 + 3a2387b

File tree

3 files changed: +36 −15 lines

README.md

Lines changed: 24 additions & 0 deletions
````diff
@@ -798,6 +798,10 @@ ollama pull nomic-embed-text
 }
 ```
 
+The built-in `ollama` provider uses Ollama's native `/api/embeddings` endpoint and is the simplest setup when you want to use `nomic-embed-text`.
+
+If you want to use a different Ollama embedding model through its OpenAI-compatible API, use the `custom` provider instead and set `customProvider.baseUrl` to `http://127.0.0.1:11434/v1` so the plugin calls `.../v1/embeddings`.
+
 ## 📈 Performance
 
 The plugin is built for speed with a Rust native module (`tree-sitter`, `usearch`, SQLite). In practice, indexing and retrieval remain fast enough for interactive use on medium/large repositories.
````
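The `/v1` suffix guidance above can be illustrated with a small sketch. This is a hypothetical `embeddings_url` helper, not the plugin's actual code; it only shows what happens when a provider appends `/embeddings` to the configured base URL:

```python
# Hypothetical sketch, not the plugin's actual code: it illustrates why
# `baseUrl` needs the `/v1` suffix when the provider appends `/embeddings`.
def embeddings_url(base_url: str) -> str:
    """Join the configured baseUrl with the /embeddings path."""
    return base_url.rstrip("/") + "/embeddings"

# With the /v1 suffix, the request reaches Ollama's OpenAI-compatible route:
print(embeddings_url("http://127.0.0.1:11434/v1"))  # http://127.0.0.1:11434/v1/embeddings
# Without it, the request would go to a route Ollama does not serve:
print(embeddings_url("http://127.0.0.1:11434"))     # http://127.0.0.1:11434/embeddings
```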
````diff
@@ -857,6 +861,26 @@ Works with any server that implements the OpenAI `/v1/embeddings` API format (ll
 ```
 Required fields: `baseUrl`, `model`, `dimensions` (positive integer). Optional: `apiKey`, `maxTokens`, `timeoutMs` (default: 30000), `maxBatchSize` (or `max_batch_size`) to cap inputs per `/embeddings` request for servers like text-embeddings-inference. `{env:VAR_NAME}` placeholders are resolved before config validation for fields that are actually used and throw if the referenced environment variable is missing or malformed.
 
+**Custom Ollama models via OpenAI-compatible API**
+If you are running Ollama locally and want to use an embedding model other than the built-in `ollama` setup, point the custom provider at Ollama's OpenAI-compatible base URL with the `/v1` suffix:
+
+```json
+{
+  "embeddingProvider": "custom",
+  "customProvider": {
+    "baseUrl": "http://127.0.0.1:11434/v1",
+    "model": "qwen3-embedding:0.6b",
+    "dimensions": 1024,
+    "apiKey": "ollama"
+  }
+}
+```
+
+Notes:
+- The plugin appends `/embeddings`, so `baseUrl` should be `http://127.0.0.1:11434/v1`, not just `http://127.0.0.1:11434`.
+- Ollama ignores the API key, but some OpenAI-compatible clients expect one, so a placeholder like `"ollama"` is fine.
+- Make sure `dimensions` matches the actual output size of the model you pulled locally.
+
 ## ⚠️ Tradeoffs
 
 Be aware of these characteristics:
````
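The `{env:VAR_NAME}` placeholder resolution described in the custom-provider docs could look roughly like the following. This is an illustrative sketch under stated assumptions, not the plugin's implementation (the real code resolves only fields that are actually used and also checks for malformed values):

```python
import os
import re

# Illustrative sketch, not the plugin's implementation: resolve
# {env:VAR_NAME} placeholders in a config string, raising if the
# referenced environment variable is missing.
_PLACEHOLDER = re.compile(r"\{env:([A-Z_][A-Z0-9_]*)\}")

def resolve_placeholders(value: str) -> str:
    """Replace each {env:VAR_NAME} with the variable's value."""
    def _sub(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]
    return _PLACEHOLDER.sub(_sub, value)

os.environ["EMBED_API_KEY"] = "sk-test"
print(resolve_placeholders("{env:EMBED_API_KEY}"))  # sk-test
```

Resolving before validation (as the docs describe) means a missing variable fails fast at startup rather than producing a malformed request at embedding time.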

package-lock.json

Lines changed: 11 additions & 14 deletions

package.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -70,6 +70,7 @@
     ]
   },
   "dependencies": {
+    "@opencode-ai/plugin": "~1.3.13",
     "chokidar": "^5.0.0",
     "ignore": "^7.0.5",
     "p-queue": "^9.1.1",
@@ -80,7 +81,6 @@
     "@eslint/js": "^9.39.4",
     "@modelcontextprotocol/sdk": "^1.29.0",
     "@napi-rs/cli": "^3.6.0",
-    "@opencode-ai/plugin": "^1.3.13",
     "@types/node": "^25.5.2",
     "@vitest/coverage-v8": "^4.1.2",
     "eslint": "^9.39.4",
```
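This diff moves `@opencode-ai/plugin` from devDependencies to dependencies and tightens its range from `^1.3.13` to `~1.3.13`. The npm semantics of that change can be sketched as follows (a simplified model for versions with major >= 1, ignoring 0.x and pre-release rules):

```python
# Simplified sketch of npm range semantics for this diff:
# "~1.3.13" allows patch updates only (>=1.3.13 <1.4.0), while the
# previous "^1.3.13" also allowed minor updates (>=1.3.13 <2.0.0).
def satisfies(version: str, spec: str) -> bool:
    """Check a x.y.z version against a ~x.y.z or ^x.y.z range."""
    major, minor, patch = (int(p) for p in version.split("."))
    smaj, smin, spat = (int(p) for p in spec[1:].split("."))
    if (major, minor, patch) < (smaj, smin, spat):
        return False
    if spec.startswith("~"):
        return (major, minor) == (smaj, smin)
    if spec.startswith("^"):
        return major == smaj
    raise ValueError("only ~ and ^ specs are handled in this sketch")

print(satisfies("1.3.20", "~1.3.13"))  # True: patch bump allowed
print(satisfies("1.4.0", "~1.3.13"))   # False: minor bump rejected
print(satisfies("1.4.0", "^1.3.13"))   # True: caret allowed minor bumps
```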
