🔗 Live Demo: 🌐 Live Demo
AI-Powered Document Query System – Upload PDFs or share document URLs and ask natural language questions to get instant, intelligent answers.
Built with a cutting-edge stack: FastAPI (backend) ⚡ + Next.js (frontend) 🎨.
- Document Processing – Upload PDFs or provide document URLs for seamless ingestion
- Google Drive Smart Links – Auto-converts shareable Drive links into direct download links
- AI-Powered Q&A – Uses Google Gemini 2.0 Flash Lite to generate answers
- Vector Search – Fast & accurate retrieval using FAISS + Jina embeddings
- Persistent Caching – Indexes cached with SHA256 hashing for lightning-fast reuse
- Modern UI/UX – Clean, responsive design with dark & light mode support
- Real-time Processing – Concurrent question handling for speed & scalability
- Secure by Design – Bearer token authentication & input validation
✔️ No more manual searching through huge PDFs
✔️ Get answers with context in seconds
✔️ Works for research, legal docs, policies, contracts, study notes
✔️ Fast ⚡, Secure 🔒, and Beautiful ✨
| Light Mode 🌞 | Dark Mode 🌙 |
|---|---|
| ![Light mode screenshot]() | ![Dark mode screenshot]() |
- Framework: FastAPI with async support
- Document Processing: PyMuPDF for PDF parsing
- Vector Store: FAISS with Jina embeddings
- LLM: Google Generative AI (Gemini 2.0 Flash Lite)
- Text Splitting: LangChain's RecursiveCharacterTextSplitter
- Caching: File-based persistent indexes with SHA256 hashing
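The SHA256-based caching listed above can be pictured with a small sketch. It assumes the cache key is the SHA-256 digest of the raw document bytes and that each index lives in a directory named after that digest. Both are assumptions for illustration; the backend may hash the URL or something else. `CACHE_DIR`, `cache_key`, and `index_path` are hypothetical names.

```python
import hashlib
from pathlib import Path

# Hypothetical cache location; the real backend may use a different path.
CACHE_DIR = Path("index_cache")


def cache_key(document_bytes: bytes) -> str:
    """Return a stable hex digest that identifies a document by its content."""
    return hashlib.sha256(document_bytes).hexdigest()


def index_path(document_bytes: bytes, cache_dir: Path = CACHE_DIR) -> Path:
    """Map a document to the directory where its index would be stored."""
    return cache_dir / cache_key(document_bytes)


def is_cached(document_bytes: bytes, cache_dir: Path = CACHE_DIR) -> bool:
    """True if an index directory for this exact document already exists."""
    return index_path(document_bytes, cache_dir).exists()
```

Because the key depends only on the bytes, re-uploading the same PDF under a different filename would map to the same cached index under this scheme.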
- Framework: Next.js 15 with TypeScript
- UI Components: Radix UI with Tailwind CSS
- State Management: React hooks
- Theme Support: next-themes for dark/light mode
- File Handling: Native FormData API for file uploads
- Python 3.8+
- Node.js 18+
- npm or yarn
- API Keys:
  - Google AI API key (for Gemini)
  - Jina AI API key (for embeddings)
- Navigate to the backend directory:
    cd backend
- Install Python dependencies:
    pip install -r requirements.txt
- Create a .env file with your API keys:
    GOOGLE_API_KEY=your_google_api_key_here
    JINA_API_KEY=your_jina_api_key_here
    BEARER_TOKEN=your_secure_bearer_token_here
- Start the FastAPI server:
    uvicorn main:app --reload --host 0.0.0.0 --port 8000
- Production deployment: the current setup runs on Hugging Face Spaces (Docker build).
- Navigate to the frontend directory:
    cd frontend
- Install Node.js dependencies:
    npm install
- Create a .env.local file:
    NEXT_PUBLIC_API_URL=http://localhost:8000/hackrx/run
    NEXT_PUBLIC_API_TOKEN=your_secure_bearer_token_here
- Start the development server:
    npm run dev
  The application will be available at http://localhost:9002.
- Production deployment: the current setup runs on Vercel.

Process documents and answer questions.
Request Body (multipart/form-data):
- questions: Array of strings (questions to ask)
- file: PDF file (optional)
- documents: Document URL (optional)
Response:
{
  "answers": ["Answer 1", "Answer 2", "..."]
}
Authentication: Bearer token required in Authorization header.
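As a rough illustration of the request described above, the following standard-library sketch builds a multipart/form-data request with a Bearer token. It assumes the endpoint is the `/hackrx/run` URL from the frontend configuration, that it accepts POST, and that the `questions` array is sent as a repeated form field. None of these are confirmed here, so check them against the backend. `build_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request
import uuid

API_URL = "http://localhost:8000/hackrx/run"  # same value as NEXT_PUBLIC_API_URL
API_TOKEN = "your_secure_bearer_token_here"   # must match BEARER_TOKEN in .env


def build_request(questions, document_url):
    """Build (but do not send) a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    # ASSUMPTION: the array is encoded as one "questions" field per question.
    fields = [("questions", q) for q in questions]
    fields.append(("documents", document_url))

    lines = []
    for name, value in fields:
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")

    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",  # ASSUMPTION: the endpoint accepts POST
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def ask(questions, document_url):
    """Send the request and return the "answers" list from the JSON response."""
    with urllib.request.urlopen(build_request(questions, document_url)) as resp:
        return json.load(resp)["answers"]
```

With the backend running locally, `ask(["What is the notice period?"], "https://example.com/contract.pdf")` would return the list of answers, provided the assumptions above hold.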
Health check endpoint.
- Document Input: Choose between uploading a PDF file or providing a document URL (including Google Drive links)
- Questions: Enter your questions, one per line
- Processing: The system will:
  - Download/process the document
  - Create vector embeddings (cached for future use)
  - Use RAG (Retrieval-Augmented Generation) to answer questions
- Results: View AI-generated answers with source context
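The Google Drive handling mentioned in the steps above can be sketched as follows. This is not the project's actual code. It assumes shareable links look like `https://drive.google.com/file/d/<id>/view` or `https://drive.google.com/open?id=<id>`, and rewrites them to the commonly used `uc?export=download&id=<id>` form. `to_direct_download` is a hypothetical name.

```python
import re
from urllib.parse import parse_qs, urlparse

# Matches the file ID in paths such as /file/d/<id>/view
_FILE_PATH = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def to_direct_download(url: str) -> str:
    """Rewrite a Drive share link to a direct-download link.

    URLs that are not on drive.google.com, or that carry no
    recognizable file ID, are returned unchanged.
    """
    parsed = urlparse(url)
    if parsed.netloc != "drive.google.com":
        return url

    match = _FILE_PATH.search(parsed.path)
    if match:
        file_id = match.group(1)
    else:
        ids = parse_qs(parsed.query).get("id")
        if not ids:
            return url
        file_id = ids[0]

    return f"https://drive.google.com/uc?export=download&id={file_id}"
```

Non-Drive URLs pass through untouched, so the same function can be applied to every document URL before download.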
- LangChain: Document processing and RAG pipeline
- FAISS: Vector similarity search
- Jina Embeddings: High-quality text embeddings
- Google Gemini: Large language model for answer generation
- Radix UI: Accessible component library
- Tailwind CSS: Utility-first CSS framework
- Concurrent Processing: Multiple questions processed simultaneously
- Intelligent Caching: Document indexes persisted to disk
- Chunking Strategy: Optimized text splitting (1000 chars, 150 overlap)
- Retrieval Optimization: Top-7 relevant chunks for context
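To make the chunking and retrieval numbers above concrete, here is a simplified pure-Python sketch. It uses a fixed sliding window rather than LangChain's RecursiveCharacterTextSplitter (which also tries to split on separators), and a brute-force cosine-similarity ranking rather than a FAISS index. The similarity metric is an assumption. `chunk_text`, `cosine`, and `top_k` are hypothetical names.

```python
import math

CHUNK_SIZE = 1000     # characters per chunk, as stated above
CHUNK_OVERLAP = 150   # characters shared between neighboring chunks
TOP_K = 7             # number of chunks passed to the LLM as context


def chunk_text(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into windows of `size` chars that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=TOP_K):
    """Return indices of the k chunk vectors most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks, at the cost of embedding slightly more text.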
- Bearer token authentication
- Input validation and sanitization
- CORS configuration for frontend communication
- Secure file handling with temporary files
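The Bearer token check listed above could look roughly like the sketch below. It is framework-agnostic and does not use FastAPI's security helpers, so treat it as an illustration of the comparison logic only. `is_authorized` is a hypothetical name, and the expected token is assumed to come from the `BEARER_TOKEN` environment variable.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Validate an Authorization header of the form "Bearer <token>"."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest takes time independent of where the strings differ,
    # which avoids leaking token prefixes through response timing.
    return hmac.compare_digest(
        token.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

In a request handler, a `False` result would typically translate to an HTTP 401 response.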
Backend:
    cd backend
    uvicorn main:app --reload --host 0.0.0.0 --port 8000
Frontend:
    cd frontend
    npm run dev
Type checking:
    npm run typecheck
- Students: Quickly summarize large PDFs and extract key points
- Enterprises: Analyze contracts, policies, and reports
- Legal Professionals: Retrieve specific clauses from long agreements
- Developers: Integrate smart document search into apps
- Fork the repository
- Create a feature branch
- Make your changes
- Optimize caching or embedding logic
- Add tests if applicable
- Submit a pull request
This project was developed by Team Innov8 as a solution for the HackRX6.0 Hackathon by Bajaj Finserv.
For questions or issues, please refer to the application's built-in FAQ section or contact the development team.
Made with ❤️ by Team Innov8




