🎙️ Audtext

Transform Audio into Text & Insights - 100% Local, 100% Private

🚀 Quick Start • ✨ Features • 📖 Documentation • 🤝 Contributing


🌟 Why Audtext?

🔒 Privacy First

Your audio never leaves your computer. Everything runs locally using OpenAI's Whisper model - no cloud uploads, no API keys needed, no subscription costs.

⚡ Lightning Fast

CPU-optimized transcription with faster-whisper. Process 1-hour audio files in minutes, not hours. Real-time progress tracking included.

🤖 AI-Powered Summaries

Get intelligent summaries using Ollama's local LLM. Choose from concise, detailed, or bullet-point formats - all without API costs.

📤 Multiple Export Formats

Export your transcripts as TXT, SRT, VTT, or JSON. Perfect for subtitles, documentation, or further processing.
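
As a rough illustration of what an SRT export involves, the sketch below turns timestamped segments into SubRip text. The `Segment` tuple and the `to_srt` and `srt_timestamp` helpers are hypothetical names introduced for this example; they are not taken from the Audtext code base.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical segment shape: start/end in seconds plus the spoken text."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    print(to_srt([Segment(0.0, 2.5, "Hello there."), Segment(2.5, 5.0, "Welcome.")]))
```

VTT output is close to this: it starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.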


✨ Features

| Feature | Description |
| --- | --- |
| 🎵 Multi-Format Support | MP3, WAV, M4A, FLAC, OGG, WEBM, MP4 |
| 📊 Real-Time Progress | Watch transcription progress live |
| 🕐 Timestamps | Every segment includes precise timing |
| 🌍 Multi-Language | Automatic language detection |
| 📱 Responsive UI | Beautiful interface on any device |
| 🔄 No Size Limits | Upload audio files of any length |
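
A minimal sketch of how an upload could be checked against the formats listed under Multi-Format Support. The `is_supported` helper is a hypothetical name, and whether Audtext validates by file extension at all is an assumption.

```python
from pathlib import Path

# Extensions taken from the Multi-Format Support entry in this README.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".mp4"}


def is_supported(filename: str) -> bool:
    """Return True when the file extension is one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("interview.MP3", "notes.txt"):
        print(name, "->", "accepted" if is_supported(name) else "rejected")
```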

🏗️ Architecture

┌──────────────────────────────────────────────────────────────────┐
│                         AUDTEXT                                  │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│    ┌──────────────┐     ┌──────────────┐     ┌──────────────┐    │
│    │              │     │              │     │              │    │
│    │   Frontend   │────▶│   Backend    │────▶│   Whisper    │    │
│    │   React 18   │     │   FastAPI    │     │   (Local)    │    │
│    │              │     │              │     │              │    │
│    └──────────────┘     └──────┬───────┘     └──────────────┘    │
│                                │                                 │
│                                ▼                                 │
│                         ┌──────────────┐                         │
│                         │              │                         │
│                         │   Ollama     │                         │
│                         │   (LLM)      │                         │
│                         │              │                         │
│                         └──────────────┘                         │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
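
To make the Backend → Whisper step more concrete, here is a sketch of CPU transcription with progress reporting using the faster-whisper package. The function names, the `int8` compute type, and the way progress is derived from segment end times are assumptions for illustration; the actual `whisper_service.py` may work differently.

```python
def progress_percent(segment_end: float, duration: float) -> float:
    """Estimate progress from the end time of the latest segment."""
    if duration <= 0:
        return 0.0
    return min(100.0, round(segment_end / duration * 100, 1))


def transcribe_with_progress(audio_path: str, model_size: str = "base"):
    """Yield (percent, text) pairs as faster-whisper produces segments."""
    # Imported lazily so the helper above works without the package installed.
    from faster_whisper import WhisperModel

    # ASSUMPTION: int8 on CPU; Audtext's real settings are not documented here.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path)
    for segment in segments:  # segments is a generator; decoding happens here
        yield progress_percent(segment.end, info.duration), segment.text.strip()


if __name__ == "__main__":
    import sys

    for percent, text in transcribe_with_progress(sys.argv[1]):
        print(f"[{percent:5.1f}%] {text}")
```

Because `transcribe` returns a lazy generator, progress can be reported while the audio is still being decoded rather than only at the end.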

🚀 Quick Start

Prerequisites

| Requirement | Version | Installation |
| --- | --- | --- |
| Python | 3.11+ | python.org |
| Node.js | 18+ | nodejs.org |
| FFmpeg | Latest | See below |
| Ollama | Latest | ollama.ai |

📦 Install FFmpeg
# Windows (winget)
winget install ffmpeg

# Windows (chocolatey)
choco install ffmpeg

# macOS
brew install ffmpeg

# Linux (Ubuntu/Debian)
sudo apt install ffmpeg
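
Once everything is installed, a short script can confirm the prerequisites are visible on your PATH. This is only a sketch: the command names (`node`, `ffmpeg`, `ollama`) are assumed to be the executables these tools normally install, and the helper names are hypothetical.

```python
import shutil
import sys

# ASSUMPTION: these are the executable names the installers put on PATH.
REQUIRED_COMMANDS = ["node", "ffmpeg", "ollama"]


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if which(name) is None]


def python_ok(version=sys.version_info, minimum=(3, 11)):
    """Check the interpreter against the Python 3.11+ requirement."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    print("Python 3.11+:", "ok" if python_ok() else "too old")
    for name in missing_commands(REQUIRED_COMMANDS):
        print(f"Missing from PATH: {name}")
```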

⚡ 3-Step Setup

# 1️⃣ Clone & Set Up Backend
git clone https://github.com/DandaAkhilReddy/Audtext.git
cd Audtext/backend
python -m venv venv && .\venv\Scripts\activate  # Windows
# macOS/Linux: python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 2️⃣ Set Up Frontend
cd ../frontend
npm install

# 3️⃣ Download AI Model
ollama pull llama3.1:8b

🎬 Run the App

Open 3 terminals:

# Terminal 1 - AI Engine
ollama serve

# Terminal 2 - Backend (activate venv first!)
cd Audtext/backend && .\venv\Scripts\activate  # Windows
# macOS/Linux: cd Audtext/backend && source venv/bin/activate
uvicorn main:app --reload --port 8000

# Terminal 3 - Frontend
cd Audtext/frontend
npm run dev

🌐 Open → http://localhost:5173
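
If the page does not load, it can help to check which of the three processes is actually listening. The sketch below probes the ports used in this README (8000 and 5173) plus 11434, Ollama's default port; the `port_open` helper is a hypothetical name, and the port numbers will differ if you changed them.

```python
import socket

# Ports from this README: backend 8000, frontend 5173; 11434 is Ollama's default.
SERVICES = {"Ollama": 11434, "Backend": 8000, "Frontend": 5173}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_open("localhost", port) else "not reachable"
        print(f"{name:<8} :{port:<5} {state}")
```

Probing `localhost` rather than `127.0.0.1` lets `create_connection` try both IPv4 and IPv6, which matters when a dev server binds to only one of them.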


📖 Installation

🔧 Detailed Backend Setup
cd backend

# Create virtual environment
python -m venv venv

# Activate it
# Windows:
.\venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Dependencies include:

  • fastapi - Modern web framework
  • faster-whisper - Optimized speech recognition
  • httpx - Async HTTP client for Ollama
  • pydantic - Data validation

🎨 Detailed Frontend Setup
cd frontend

# Install dependencies
npm install

# Start development server
npm run dev

# Build for production
npm run build

Built with:

  • React 18 - UI framework
  • Vite - Lightning fast bundler
  • Tailwind CSS - Utility-first styling
  • Lucide React - Beautiful icons

⚙️ Configuration

🎤 Whisper Models

Edit backend/core/config.py:

WHISPER_MODEL: str = "base"  # Options: tiny, base, small, medium, large

| Model | RAM | Speed (1hr audio) | Quality |
| --- | --- | --- | --- |
| tiny | 1GB | ~5 min | ⭐⭐ |
| base | 1.5GB | ~10 min | ⭐⭐⭐ |
| small | 2.5GB | ~20 min | ⭐⭐⭐⭐ |
| medium | 5GB | ~40 min | ⭐⭐⭐⭐⭐ |
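
As a worked example of reading the table, the sketch below picks the largest listed model that fits a given RAM budget. The `pick_model` helper is hypothetical and only encodes the RAM figures from this README; `large` is left out because no figure is given for it.

```python
# RAM figures (GB) copied from the table in this section, smallest model first.
MODEL_RAM_GB = [("tiny", 1.0), ("base", 1.5), ("small", 2.5), ("medium", 5.0)]


def pick_model(available_ram_gb: float) -> str:
    """Return the largest listed model whose RAM figure fits the budget."""
    choice = None
    for name, ram in MODEL_RAM_GB:
        if ram <= available_ram_gb:
            choice = name
    if choice is None:
        raise ValueError("less than 1 GB available; no listed model fits")
    return choice


if __name__ == "__main__":
    for budget in (1, 2, 4, 8):
        print(f"{budget} GB free -> WHISPER_MODEL = {pick_model(budget)!r}")
```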

🤖 Ollama Models

OLLAMA_MODEL: str = "llama3.1:8b"  # Or any Ollama model
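
For orientation, this is roughly what a summary request to a local Ollama server looks like. The sketch uses Ollama's `/api/generate` endpoint with the standard library instead of httpx to stay dependency-free; the style keys, the prompt wording, and the function names are assumptions and not the prompts Audtext actually sends.

```python
import json
import urllib.request

# Style names follow the formats named in this README; the wording is illustrative.
STYLE_INSTRUCTIONS = {
    "concise": "Summarize the transcript in two or three sentences.",
    "detailed": "Write a detailed summary covering every main topic.",
    "bullets": "Summarize the transcript as a short bullet-point list.",
}


def build_payload(transcript: str, style: str = "concise",
                  model: str = "llama3.1:8b") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    if style not in STYLE_INSTRUCTIONS:
        raise ValueError(f"unknown style: {style}")
    prompt = f"{STYLE_INSTRUCTIONS[style]}\n\nTranscript:\n{transcript}"
    return {"model": model, "prompt": prompt, "stream": False}


def summarize(transcript: str, style: str = "concise",
              base_url: str = "http://localhost:11434") -> str:
    """Send the prompt to a running Ollama server and return its reply."""
    body = json.dumps(build_payload(transcript, style)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    print(summarize("Speaker one greets the team and reviews the agenda."))
```

Swapping `OLLAMA_MODEL` for another model only changes the `model` field of the payload, provided that model has been pulled first.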

🔌 API Reference

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/upload | POST | Upload audio file |
| /api/status/{task_id} | GET | Get transcription progress |
| /api/result/{task_id} | GET | Get full transcript |
| /api/summarize | POST | Generate AI summary |
| /api/export/{format}/{task_id} | GET | Export (txt/srt/vtt/json) |
| /api/ollama/health | GET | Check Ollama status |

📚 Interactive Docs → http://localhost:8000/docs
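
A client would typically upload a file, poll the status endpoint, then fetch or export the result. The sketch below covers the polling and URL-building parts only. The response shape is an assumption: this README does not document the JSON fields, so the `"status"` key and its `"completed"`/`"failed"` values are placeholders to check against the interactive docs.

```python
import time
from typing import Callable

EXPORT_FORMATS = {"txt", "srt", "vtt", "json"}


def status_url(base_url: str, task_id: str) -> str:
    """Build the URL for the transcription progress endpoint."""
    return f"{base_url}/api/status/{task_id}"


def export_url(base_url: str, fmt: str, task_id: str) -> str:
    """Build the export URL, rejecting formats the API does not list."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    return f"{base_url}/api/export/{fmt}/{task_id}"


def wait_until_done(fetch_json: Callable[[str], dict], base_url: str,
                    task_id: str, interval: float = 1.0,
                    max_polls: int = 600) -> dict:
    """Poll the status endpoint until the task reports a final state."""
    for _ in range(max_polls):
        status = fetch_json(status_url(base_url, task_id))
        # ASSUMPTION: the response carries a "status" field with these values.
        if status.get("status") in {"completed", "failed"}:
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Passing the HTTP call in as `fetch_json` keeps the loop independent of any particular client library, so it works with httpx, requests, or urllib alike.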


📁 Project Structure

Audtext/
├── 🐍 backend/
│   ├── main.py              # FastAPI entry point
│   ├── requirements.txt     # Python dependencies
│   ├── api/routes/          # API endpoints
│   ├── services/            # Business logic
│   │   ├── whisper_service.py   # Transcription
│   │   └── ollama_service.py    # Summarization
│   ├── core/config.py       # Settings
│   └── tests/               # Test suite
│
├── ⚛️ frontend/
│   ├── src/
│   │   ├── App.tsx          # Main component
│   │   ├── components/      # UI components
│   │   └── services/api.ts  # API client
│   └── package.json
│
└── 📂 uploads/              # Temporary storage

🐛 Troubleshooting

❌ "Failed to fetch" on upload

Make sure the backend is running on port 8000:

uvicorn main:app --reload --port 8000

❌ Summary returns 500 error
  1. Ensure Ollama is running: ollama serve
  2. Download the model: ollama pull llama3.1:8b
  3. Verify: curl http://localhost:11434/api/tags
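
The same verification can be scripted. The sketch below reads Ollama's `/api/tags` response and checks whether the model is listed; `has_model` and `fetch_tags` are hypothetical helper names, and the URL assumes Ollama's default port.

```python
import json
import urllib.request


def has_model(tags: dict, model: str) -> bool:
    """Return True if the /api/tags payload lists the given model name."""
    return any(entry.get("name") == model for entry in tags.get("models", []))


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of locally available models from Ollama."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    try:
        installed = has_model(fetch_tags(), "llama3.1:8b")
    except OSError:
        print("Ollama is not reachable; start it with: ollama serve")
    else:
        print("model ready" if installed else "run: ollama pull llama3.1:8b")
```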

❌ First transcription is slow

The first run downloads the Whisper model (~150MB for base). Subsequent runs are faster.


🤝 Contributing

Contributions are welcome! Here's how you can help:

  1. 🍴 Fork the repository
  2. 🌿 Create a feature branch (git checkout -b feature/amazing)
  3. 💾 Commit your changes (git commit -m 'Add amazing feature')
  4. 📤 Push to the branch (git push origin feature/amazing)
  5. 🔃 Open a Pull Request

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

| Technology | Purpose |
| --- | --- |
| 🎤 OpenAI Whisper | Speech Recognition |
| ⚡ faster-whisper | Optimized Inference |
| 🦙 Ollama | Local LLM Runtime |
| 🚀 FastAPI | Backend Framework |
| ⚛️ React | Frontend Framework |

⭐ Star this repo if you find it useful!

Made with ❤️ by Akhil Reddy

