Commit 49b83fa

docs: add key features section to README
1 parent e2110ed commit 49b83fa

1 file changed: +11 -2 lines changed

README.md

Lines changed: 11 additions & 2 deletions
@@ -2,9 +2,18 @@
 
 A MCP server for fetching and searching 3rd party package documentation.
 
-This project provides a Model Context Protocol (MCP) server designed to scrape, process, index, and search documentation for various software libraries and packages. It fetches content from specified URLs, splits it into meaningful chunks using semantic splitting techniques, generates vector embeddings using OpenAI, and stores the data in an SQLite database. The server utilizes `sqlite-vec` for efficient vector similarity search and FTS5 for full-text search capabilities, combining them for hybrid search results. It supports versioning, allowing documentation for different library versions (including unversioned content) to be stored and queried distinctly.
+## ✨ Key Features
+
+- 🌐 **Scrape & Index:** Fetch documentation from web sources or local files.
+- 🧠 **Smart Processing:** Utilize semantic splitting and OpenAI embeddings for meaningful content chunks.
+- 💾 **Efficient Storage:** Store data in SQLite, leveraging `sqlite-vec` for vector search and FTS5 for full-text search.
+- 🔍 **Hybrid Search:** Combine vector and full-text search for relevant results across different library versions.
+- ⚙️ **Job Management:** Handle scraping tasks asynchronously with a robust job queue and management tools (MCP & CLI).
+- 🐳 **Easy Deployment:** Run the server easily using the provided Docker image.
 
-The scraping process is managed by an asynchronous job queue (`PipelineManager`), allowing multiple scrape jobs to run concurrently.
+## Overview
+
+This project provides a Model Context Protocol (MCP) server designed to scrape, process, index, and search documentation for various software libraries and packages. It fetches content from specified URLs, splits it into meaningful chunks using semantic splitting techniques, generates vector embeddings using OpenAI, and stores the data in an SQLite database. The server utilizes `sqlite-vec` for efficient vector similarity search and FTS5 for full-text search capabilities, combining them for hybrid search results. It supports versioning, allowing documentation for different library versions (including unversioned content) to be stored and queried distinctly.
 
 The server exposes MCP tools for:
 
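The README text in this diff says that vector and full-text results are combined into hybrid search results, but it does not say how. As a rough illustration only, the sketch below merges two ranked lists with reciprocal rank fusion (RRF). RRF is an assumption made for this example, not necessarily what the project does, and the function name, the constant `k=60`, and the chunk IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Assumption: RRF is used here purely for illustration; the project's
    actual fusion method is not stated in the README.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1; earlier positions contribute larger scores.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
vector_hits = ["chunk-3", "chunk-1", "chunk-7"]  # e.g. from a vector similarity query
fts_hits = ["chunk-1", "chunk-9", "chunk-3"]     # e.g. from a full-text query

fused = reciprocal_rank_fusion([vector_hits, fts_hits])
print(fused)
```

Chunks that appear in both lists accumulate score from each, so they tend to rise above chunks that only one search method found.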
