A literary discussion platform that combines Large Language Models (LLMs) with retrieval-augmented generation (RAG) to provide intelligent, context-aware discussions about books.
- FastAPI: Modern, fast web framework for building APIs
- ChromaDB: Vector database for efficient similarity search and storage
- Sentence Transformers: For generating high-quality text embeddings
- Google Gemini API: For generating contextual responses
- Wikipedia API: For fetching book-related information
- Google Books API: For book metadata and search
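As a minimal sketch of how these components might be wired together at application startup: the embedding model, storage path, and environment-variable names below match this README, while the Gemini model name is an assumption for illustration.

```python
import os

import chromadb
import google.generativeai as genai
from fastapi import FastAPI
from sentence_transformers import SentenceTransformer

app = FastAPI()

# Embedding model named later in this README; downloaded on first use.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Persistent vector store; the directory name matches the storage section below.
chroma_client = chromadb.PersistentClient(path="vectorstore")

# Gemini client; the key name matches the setup section, but the model
# name here is an assumption for illustration.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
gemini = genai.GenerativeModel("gemini-1.5-flash")
```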
The system follows a three-step process to prepare books for discussion:
1. **Book Search & Selection**
   - Uses the Google Books API to search for and retrieve book metadata
   - Endpoint: `/search-books` returns book details including title, authors, description, and thumbnail
2. **Wikipedia Data Fetching**
   - Retrieves comprehensive book information from Wikipedia
   - Endpoint: `/books/fetch-wiki` stores book summaries and content in JSON format
   - Handles disambiguation and not-found cases gracefully
3. **Content Embedding**
   - Processes book content into vector embeddings via the `/books/embed` endpoint
   - Uses SentenceTransformer (`all-MiniLM-L6-v2`) for embedding generation
   - Chunks content into 500-character segments so each embedding covers a focused span of text (see the sketch below)
   - Stores embeddings in ChromaDB for efficient retrieval
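A minimal sketch of the chunk-and-embed step, assuming simple fixed-width slicing; the helper names are hypothetical, and production chunkers often split on sentence or paragraph boundaries instead.

```python
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def chunk_text(text: str, size: int = 500) -> list[str]:
    """Split book content into fixed 500-character segments."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed_chunks(chunks: list[str]) -> list[list[float]]:
    # encode() returns a numpy array; ChromaDB accepts plain Python lists.
    return embedder.encode(chunks).tolist()
```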
- **ChromaDB Integration**
  - Persistent vector storage in the `vectorstore` directory; each book gets its own collection (see the sketch after this list)
  - Efficient similarity search for context retrieval
  - Automatic collection management and persistence
- **Embedding Process**
  - Content chunking into manageable segments
  - High-quality embeddings via SentenceTransformer
  - Unique IDs for each chunk
  - Automatic collection creation and management
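A sketch of the per-book storage described above: `PersistentClient` and `get_or_create_collection` are standard ChromaDB calls, while the `book_<id>` collection naming and chunk ID format are assumptions for illustration.

```python
import chromadb

client = chromadb.PersistentClient(path="vectorstore")

def store_book_embeddings(book_id: str, chunks: list[str],
                          embeddings: list[list[float]]) -> None:
    # One collection per book, created on first use and reused afterwards.
    collection = client.get_or_create_collection(name=f"book_{book_id}")
    collection.add(
        ids=[f"{book_id}-chunk-{i}" for i in range(len(chunks))],  # unique per chunk
        documents=chunks,
        embeddings=embeddings,
    )
```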
The system answers user questions through a three-stage pipeline:
1. **Context Retrieval** (sketched after this list)
   - Converts the user's question into an embedding
   - Retrieves the most relevant book chunks using similarity search
   - Combines the retrieved context with the conversation history
2. **Response Generation**
   - Uses the Google Gemini API for response generation
   - Incorporates:
     - Retrieved book context
     - Conversation history
     - Book metadata
     - Wikipedia information
   - Generates concise, contextually relevant responses
3. **Response Management**
   - Maintains conversation history
   - Compresses responses for efficient storage
   - Handles error cases gracefully
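A sketch of the retrieval stage, reusing the embedding model and vector store from the sketches above; `n_results` and the collection naming are illustrative choices, not the project's confirmed values.

```python
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="vectorstore")

def retrieve_context(book_id: str, question: str, n_results: int = 3) -> list[str]:
    """Embed the question and fetch the most similar book chunks."""
    query_embedding = embedder.encode([question]).tolist()
    collection = client.get_or_create_collection(name=f"book_{book_id}")
    results = collection.query(query_embeddings=query_embedding, n_results=n_results)
    return results["documents"][0]  # one result list per query embedding
```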
- `GET /search-books`: Search for books using the Google Books API
- `POST /books/fetch-wiki`: Fetch Wikipedia data for a book
- `POST /books/embed`: Generate and store embeddings for a book
- `GET /books/check`: Check whether a book is prepared for discussion
- `POST /books/prepare`: Prepare a book for discussion (combines the wiki fetch and embedding steps)
- `POST /chat/query`: Process a user question and generate a response (example call below)
  - Input: `book_id`, `question`, conversation history
  - Output: AI response with updated history
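An illustrative client call against `POST /chat/query`, assuming the server runs on uvicorn's default local port; the exact field names and history format are assumptions based on the input/output description above.

```python
import requests

payload = {
    "book_id": "some-book-id",  # hypothetical ID obtained from /search-books
    "question": "What motivates the protagonist?",
    "history": [],  # prior turns; empty for a new conversation
}
resp = requests.post("http://localhost:8000/chat/query", json=payload)
resp.raise_for_status()
print(resp.json())  # AI response plus updated history
```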
1. **Book Preparation**

   User Request → Book Search → Wikipedia Fetch → Content Embedding → Vector Storage

2. **Query Processing**

   User Question → Embedding → Context Retrieval → Response Generation → User Response
- Environment variables for API keys
- CORS middleware for secure cross-origin requests (configured as sketched below)
- Error handling and validation at each step
- Persistent storage with proper directory management
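A sketch of the CORS configuration using FastAPI's standard middleware; the allowed origin is an illustrative local frontend address, not the project's actual deployment setting.

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # illustrative frontend origin
    allow_methods=["*"],
    allow_headers=["*"],
)
```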
The frontend is a modern web application that provides:
- Book search and selection interface
- Real-time chat interface for book discussions
- Progress tracking for book preparation
- Responsive design for various devices
- Python 3.10+
- Required API keys:
- Google Books API
- Google Gemini API
- (Optional) Wikipedia API
1. Create a `.env` file with the required API keys:

   ```
   GOOGLE_API_KEY=your_google_books_api_key
   GEMINI_API_KEY=your_gemini_api_key
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Start the backend server:

   ```bash
   uvicorn backend.main:app --reload
   ```
The system implements RAG to enhance LLM responses with relevant book context:
- Retrieval: Uses vector similarity search to find relevant book passages
- Augmentation: Combines retrieved context with user query
- Generation: Uses Gemini to generate contextually relevant responses
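A sketch of the augmentation and generation steps, assuming the `retrieve_context` helper from earlier; the prompt wording and Gemini model name are illustrative, not the project's actual prompt.

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption

def answer_question(question: str, context_chunks: list[str],
                    history: list[str]) -> str:
    # Augmentation: fold retrieved passages and prior turns into the prompt.
    context = "\n".join(context_chunks)
    turns = "\n".join(history)
    prompt = (
        "You are discussing a book with a reader.\n\n"
        f"Relevant passages:\n{context}\n\n"
        f"Conversation so far:\n{turns}\n\n"
        f"Question: {question}\n"
        "Answer concisely using the passages above."
    )
    # Generation: Gemini produces the final contextual answer.
    return model.generate_content(prompt).text
```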
- Chunk size optimization (500 characters)
- Efficient embedding storage and retrieval
- Persistent vector database for quick access
- Automatic collection management
- Graceful handling of API failures
- Proper error messages for users
- Automatic retry mechanisms (see the sketch below)
- Data validation at each step
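One possible shape for the retry behavior, with exponential backoff; the attempt count and delays are illustrative. A call like `with_retries(lambda: requests.get(url))` would wrap any flaky external request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(call: Callable[[], T], attempts: int = 3,
                 base_delay: float = 1.0) -> T:
    """Retry a flaky external call with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller surface a proper error message
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")  # for type checkers; the loop returns or raises
```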
1. **Enhanced Context Retrieval**
   - Implement more sophisticated chunking strategies
   - Add metadata-aware retrieval
   - Implement hybrid search (keyword + semantic)
2. **Performance Optimization**
   - Implement caching for frequent queries
   - Optimize vector search parameters
   - Add batch processing for embeddings
3. **Additional Features**
   - Support for multiple languages
   - Enhanced book metadata integration
   - Advanced conversation management
   - User feedback integration