A local RAG system for chatting with your technical PDF library
Open-Books TUI in action — chat with your technical library directly from the terminal.
Open-Books is a fully local Retrieval-Augmented Generation (RAG) application designed to ingest technical PDF books and enable natural language querying through an elegant Terminal User Interface (TUI).
Built for developers, researchers, and technical professionals who want to leverage their personal PDF library as a knowledge base—without sending data to external servers.
- Fully Local — Your documents and queries never leave your machine
- PDF Ingestion — Intelligent parsing using Docling with table/formula extraction
- Semantic Search — Vector-based retrieval with sentence-transformers embeddings
- Conversational Interface — Beautiful TUI powered by Textual
- Smart Syncing — Hash-based incremental indexing (only re-process changed files)
- Configurable — YAML-based configuration for all components
The system is built around two main pipelines: Ingestion (document processing) and Retrieval (query answering).
flowchart LR
subgraph Ingestion Pipeline
A[PDF Files] --> B[DoclingParser]
B --> C[MarkdownChunker]
C --> D[SentenceTransformers<br/>Embedder]
D --> E[(ChromaDB)]
end
subgraph Retrieval Pipeline
F[User Query] --> G[Query Constructor]
G --> H[Vector Search]
E --> H
H --> I[Query Answerer]
I --> J[OllamaGenerator]
J --> K[Response]
end
style E fill:#f9a825,stroke:#f57f17
style A fill:#e3f2fd,stroke:#1565c0
style K fill:#e8f5e9,stroke:#2e7d32
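To make the retrieval side of the flowchart concrete, here is a minimal sketch of the Vector Search and prompt-assembly steps in plain Python. It is not the project's actual code: the function names `cosine_similarity`, `top_k`, and `build_prompt`, the in-memory dictionary standing in for ChromaDB, and the three-dimensional toy vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: dict[str, list[float]], k: int = 3):
    """Return the k (chunk_id, score) pairs most similar to the query vector."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, contexts: list[str]) -> str:
    """Assemble retrieved chunks and the question into one prompt string."""
    context_block = "\n\n".join(contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Hypothetical toy store: chunk id -> embedding (real vectors would be 384-dim).
    store = {
        "chunk-a": [1.0, 0.0, 0.0],
        "chunk-b": [0.0, 1.0, 0.0],
        "chunk-c": [0.7, 0.7, 0.0],
    }
    for chunk_id, score in top_k([1.0, 0.0, 0.0], store, k=2):
        print(chunk_id, round(score, 3))
```

In the real system the ranking is delegated to the vector database and the prompt is sent to the local LLM; the sketch only shows the shape of the data moving between those steps.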
| Stage | Component | Description |
|---|---|---|
| Parsing | DoclingParser | Extracts text, tables, and formulas from PDFs with structure preservation |
| Chunking | MarkdownChunker | Splits documents into semantic chunks respecting section boundaries |
| Embedding | SentenceTransformersEmbedder | Generates 384-dim vectors using all-MiniLM-L6-v2 |
| Storage | ChromaStore | Persistent vector storage with metadata filtering |
| Retrieval | SimpleRAGPipeline | Coordinates search and answer generation |
| Generation | OllamaGenerator | Local LLM inference via Ollama |
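As an illustration of the chunking stage, the sketch below splits Markdown at headings and then cuts each section into overlapping windows, so no chunk crosses a section boundary. It is a simplified stand-in, not the real MarkdownChunker: the function names are hypothetical, and it counts whitespace-separated words where a real chunker would count tokens.

```python
import re

# A Markdown ATX heading: one to six '#' characters followed by text.
HEADING = re.compile(r"^#{1,6}\s+\S")


def split_sections(markdown: str) -> list[str]:
    """Split a Markdown document into sections, each starting at a heading."""
    sections: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        if HEADING.match(line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]


def chunk_section(section: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Cut one section into windows of chunk_size words sharing chunk_overlap words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = section.split()  # note: this flattens line breaks inside the section
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the section
    return chunks


def chunk_markdown(markdown: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Chunk every section independently so chunks never span two sections."""
    return [
        chunk
        for section in split_sections(markdown)
        for chunk in chunk_section(section, chunk_size, chunk_overlap)
    ]
```

The default values mirror the `chunk_size` and `chunk_overlap` settings from the configuration; minimum and maximum sizes and code-block preservation are left out to keep the sketch short.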
| Technology | Purpose |
|---|---|
| Python 3.11+ | Runtime |
| uv | Fast package manager and resolver |
| Pydantic | Configuration validation and settings management |
| Typer | CLI framework |
| Technology | Purpose |
|---|---|
| ChromaDB | Persistent vector database |
| Docling | PDF parsing with structure extraction |
| sentence-transformers | Text embedding models |
| Ollama | Local LLM serving (llama3.2 default) |
| Technology | Purpose |
|---|---|
| Opik | ML observability and tracing |
| Redis + RedisVL | Semantic caching for repeated queries |
| Loguru | Structured logging |
| Technology | Purpose |
|---|---|
| Textual | Modern TUI framework |
| Rich | Terminal formatting and tables |
- Python 3.11 or higher
- Ollama installed and running
- (Optional) Redis for semantic caching
- Clone the repository

      git clone https://github.com/Moad26/Open-Code.git
      cd open-books

- Install uv (if not already installed)

      curl -LsSf https://astral.sh/uv/install.sh | sh

- Install dependencies

      uv sync

- Start Ollama and pull the default model

      ollama serve  # In a separate terminal
      ollama pull llama3.2

- Configure the application (optional)

  Edit config/config.yaml to customize settings. See Configuration for details.
Place your PDF files in the books/ directory:

    cp /path/to/your/book.pdf books/

Index your books into the vector store:

    uv run python main.py sync

This command will:
- Scan the books/ folder for PDFs
- Parse new/modified files using Docling
- Chunk and embed the content
- Store vectors in ChromaDB
- Update the manifest to track file hashes
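The hash-based part of these steps can be sketched as follows. This is a minimal illustration that assumes the manifest is a JSON file mapping file names to SHA-256 digests. The real manifest format, its location, and the function names `file_hash`, `plan_sync`, and `save_manifest` are not taken from the project.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA-256 of a file's contents, read in 1 MiB blocks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def plan_sync(books_dir: Path, manifest_path: Path) -> tuple[list[Path], dict[str, str]]:
    """Return (PDFs that are new or modified, manifest describing the current state)."""
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new: dict[str, str] = {}
    changed: list[Path] = []
    for pdf in sorted(books_dir.glob("*.pdf")):
        digest = file_hash(pdf)
        new[pdf.name] = digest
        if old.get(pdf.name) != digest:
            changed.append(pdf)  # unseen file, or contents differ from last sync
    return changed, new


def save_manifest(manifest_path: Path, manifest: dict[str, str]) -> None:
    """Persist the manifest once the changed files have been processed."""
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

Only the files returned in the first element would need to be parsed, chunked, and embedded again. Writing the manifest after processing, rather than before, means an interrupted run is retried on the next sync.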
View information about indexed books:

    uv run python main.py info

Start the interactive terminal interface:

    uv run python main.py chat

TUI Keybindings:
| Key | Action |
|---|---|
| q | Quit application |
| d | Toggle dark/light mode |
| s | Toggle sidebar (book list) |
| Ctrl+L | Clear chat history |
| PageUp/Down | Scroll messages |
open-books/
├── main.py # Application entry point
├── pyproject.toml # Project metadata and dependencies
├── uv.lock # Locked dependency versions
├── config/
│ └── config.yaml # Application configuration
├── books/ # Drop PDFs here for indexing
├── data/
│ └── chroma_db/ # Persisted vector storage
├── logs/ # Application logs
├── src/
│ ├── cli.py # Typer CLI commands (sync, info, chat)
│ ├── ingestion/ # Data Processing Pipeline
│ │ ├── parsers/ # PDF parsing (DoclingParser, MarkerParser)
│ │ ├── chunking/ # Text chunking strategies
│ │ ├── embedding/ # Vector embedding (sentence-transformers)
│ │ ├── indexer/ # Library management and sync logic
│ │ └── vector_store/ # ChromaDB and Redis store wrappers
│ ├── generation/ # RAG Logic
│ │ ├── generator.py # LLM interfaces (OllamaGenerator)
│ │ ├── query_constructor.py # Multi-query expansion
│ │ ├── answerer.py # Context-based answer generation
│ │ └── pipeline.py # RAG pipeline orchestration
│ ├── retrieval/ # Search utilities
│ ├── ui/ # Terminal User Interface
│ │ ├── app.py # Textual RAGApp main class
│ │ └── widgets.py # Custom message widgets
│ ├── shared/ # Shared models and types
│ └── utils/ # Configuration and logging
└── tests/ # Pytest test suite
All settings are managed via config/config.yaml. Below is the configuration schema with defaults:
parsing:
parser: docling # Options: docling, marker
extract_images: true # Extract diagrams/figures
extract_tables: true # Extract tables as structured data
  ocr_enabled: false # Enable OCR for scanned PDFs (slow)
chunking:
strategy: markdown_based # Options: markdown_based, semantic
chunk_size: 512 # Target tokens per chunk
chunk_overlap: 50 # Overlap between chunks
respect_boundaries: true # Never split across sections
min_chunk_size: 100 # Discard chunks smaller than this
max_chunk_size: 1024 # Hard maximum size
preserve_code_blocks: true # Keep code as atomic chunks
  preserve_equations: true # Keep equations with context
embedding:
provider: sentence_transformers
model_name: all-MiniLM-L6-v2 # HuggingFace model ID
dimensions: 384 # Must match model output
device: cpu # Options: cpu, cuda, mps
  batch_size: 32
vector_store:
client_path: /path/to/data/chroma_db
  collection_name: technical_books
llm:
provider: ollama # Options: ollama, openai
model_name: llama3.2 # Model to use
api_key: null # Required for OpenAI
base_url: http://localhost:11434 # Ollama server URL
  temperature: 0.1 # Response creativity (0-1)
redis:
host: localhost
port: 6379
  cache_threshold: 0.1 # Semantic similarity threshold

Run the test suite:

    uv run pytest

Run with coverage:

    uv run pytest --cov=src

Skip tests marked as slow:

    uv run pytest -m "not slow"

This project is open source. See LICENSE for details.
Built for knowledge seekers