This project provides a document retrieval system that leverages LlamaIndex for efficient query processing. The architecture is designed to run locally, integrating advanced Retrieval-Augmented Generation (RAG) strategies with language models such as Mistral, Gemma, and Meta's Llama for natural language processing.
The architecture is split into several key components:
- **User Interface (UI)**
  - Streamlit: Provides a user-friendly interface for querying documents. Users interact with the system by entering their queries through the Streamlit application.
- **Document Processing and Querying**
  - Document Upload: Users can upload documents, which are then processed and stored locally.
  - LlamaIndex:
    - Chunking: Documents are split into smaller, manageable chunks.
    - Embedding: Each chunk is embedded into a vector space using LlamaIndex's vectorization techniques, allowing for efficient searching and retrieval.
  - Advanced RAG Retrieval Strategies: Implements advanced RAG strategies to enhance the retrieval process (see the sketch after this component list).
  - Prompt Templates: Custom templates are used to format the retrieved chunks as context for generating responses.
- **Data Storage**
  - PostgreSQL: A PostgreSQL database, running in a Docker container, is used to store the indexed chunks and their embeddings. This enables persistent storage and quick retrieval of document data.
- **Language Model Interaction**
  - Colab Machine: A Google Colab instance runs the language models, providing powerful computation without requiring local resources.
  - Ngrok: Ngrok tunnels securely expose the Colab machine's server to the local environment, facilitating communication between the components.
  - Language Models: Supports multiple models, including Mistral, Gemma, and Meta's Llama, for generating responses to queries.
- **Response Generation**
  - After retrieving the most relevant chunks from the database, the system formulates a prompt incorporating the retrieved context and sends it to the language model. The model then generates a response, which is relayed back to the user through the Streamlit interface.
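The end-to-end flow above can be illustrated with a short LlamaIndex sketch. This is not the repository's actual code: the document folder, embedding model, table name, database credentials, and the Colab server's `/generate` endpoint are all assumptions used for illustration.

```python
# Minimal sketch of the indexing + retrieval flow described above.
# Assumptions (not taken from this repo): a "./docs" folder, the
# BAAI/bge-small-en-v1.5 embedding model, the "rag_chunks" table, local
# PostgreSQL credentials, and a hypothetical POST /generate endpoint on
# the Colab server exposed through ngrok.
import requests
from llama_index.core import (
    PromptTemplate,
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

# Chunking and embedding configuration.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=64)

# PostgreSQL (pgvector) vector store running in the Docker container.
vector_store = PGVectorStore.from_params(
    database="rag", host="localhost", port="5432",
    user="postgres", password="postgres",
    table_name="rag_chunks", embed_dim=384,  # 384 matches bge-small
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Index the uploaded documents: chunk, embed, and persist to PostgreSQL.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Retrieve the most relevant chunks for a user query.
query = "What does the contract say about termination?"
nodes = index.as_retriever(similarity_top_k=3).retrieve(query)
context = "\n\n".join(n.node.get_content() for n in nodes)

# Assemble the prompt from a custom template.
qa_prompt = PromptTemplate(
    "Context information is below.\n{context_str}\n\n"
    "Given the context, answer the question: {query_str}\n"
)
prompt = qa_prompt.format(context_str=context, query_str=query)

# Send the prompt to the language model served from Colab through the ngrok
# tunnel. The URL and /generate endpoint are placeholders, not the repo's API.
NGROK_URL = "https://<your-tunnel>.ngrok-free.app"
response = requests.post(f"{NGROK_URL}/generate", json={"prompt": prompt}, timeout=120)
print(response.json())
```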
**Prerequisites**
- Python 3.x
- Docker
- PostgreSQL
- Streamlit
- Google Colab Account
- Ngrok Account
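To satisfy the Docker and PostgreSQL prerequisites, one way to start the database container described under Data Storage is the pgvector image shown below; the container name, password, and database name are placeholders, not necessarily the repo's exact configuration.

```bash
docker run -d --name rag-postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=rag \
  -p 5432:5432 \
  pgvector/pgvector:pg16
```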
**Installation**
- Clone the Repository
  ```bash
  git clone https://github.com/Elma-dev/ask_document_rag.git
  cd ask_document_rag
  ```
- Install Dependencies
  ```bash
  pip install -r requirements.txt
  ```
- Setup Ngrok for Colab Machine (see the sketch after these steps)
- Run the Application
  ```bash
  streamlit run app.py
  ```
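For the Ngrok step, the Colab side might look like the following sketch using the pyngrok package; the port and the model-serving code are assumptions, so adapt them to however Mistral, Gemma, or Llama is actually served in your notebook.

```python
# Run inside the Colab notebook that hosts the language model server.
from pyngrok import ngrok

ngrok.set_auth_token("YOUR_NGROK_AUTH_TOKEN")  # token from your Ngrok account
tunnel = ngrok.connect(8000)                   # assumed port of the model server
print("Public URL to configure in the local app:", tunnel.public_url)
```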
**Usage**
- Upload Document: Use the Streamlit interface to upload the documents you want to query.
- Submit Query: Enter your query in the provided input box.
- Receive Response: The system processes your query, retrieves the most relevant information from the documents, and generates a response using the language model. The response is displayed in the Streamlit interface.
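For reference, the upload / query / response loop above maps onto a handful of Streamlit calls. This is an illustrative sketch, not the repository's `app.py`; `ask_document` is a placeholder for the RAG pipeline.

```python
# Illustrative Streamlit front-end for the upload -> query -> response loop.
import os

import streamlit as st


def ask_document(query: str) -> str:
    # Placeholder for the retrieval + generation pipeline;
    # not a function from this repository.
    return f"(answer for: {query})"


st.title("Ask your documents")

# Upload: store the file locally so it can be indexed.
uploaded = st.file_uploader("Upload a document", type=["pdf", "txt", "md"])
if uploaded is not None:
    os.makedirs("./docs", exist_ok=True)
    with open(os.path.join("./docs", uploaded.name), "wb") as f:
        f.write(uploaded.getvalue())
    st.success(f"Stored {uploaded.name}")

# Query and response.
query = st.text_input("Enter your query")
if st.button("Ask") and query:
    st.write(ask_document(query))
```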
**Contributing**
If you wish to contribute to this project, please fork the repository and submit a pull request with a detailed description of your changes.