Private, on-device audio transcription and summarization. No audio or text ever leaves your machine.
- Transcribe audio files (MP3, WAV, M4A, OGG, FLAC, WebM) using Whisper large-v3
- Optionally summarize transcripts using Llama-3.1-8B-Instruct running locally via llama.cpp
- Download results as plain text or Markdown
- Jobs are queued and processed one at a time; temporary files are cleaned up automatically
- macOS (Apple Silicon recommended — Metal is used for both Whisper and Llama inference)
- Python 3.11+
- Node.js 18+
- ffmpeg (required by pywhispercpp to decode audio)
- The Llama-3.1-8B-Instruct Q4_K_M GGUF model at `backend/models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf` (~4.7 GB)
```bash
# 1. Install ffmpeg (required by pywhispercpp to decode audio)
brew install ffmpeg

# 2. Create and activate a Python virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 3. Install Python dependencies
pip install -r backend/requirements.txt

# 4. Install llama-cpp-python with Metal support
CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python

# 5. Install frontend dependencies
npm install --prefix frontend
```

Place your Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf file at:

`backend/models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf`
The Whisper large-v3 model (~3.1 GB) is downloaded automatically to `~/Library/Application Support/pywhispercpp/models/` on first run.
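To sanity-check the Whisper setup outside the app, a minimal sketch using pywhispercpp is shown below; the audio file name is a placeholder, and this is not the actual code in `transcriber.py`:

```python
# Minimal sketch: load Whisper large-v3 via pywhispercpp and transcribe one file.
# The model (~3.1 GB) is fetched automatically on first use; ffmpeg must be on PATH.
from pywhispercpp.model import Model

model = Model("large-v3")
segments = model.transcribe("example.mp3")  # "example.mp3" is a placeholder path
print(" ".join(segment.text for segment in segments))
```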
```bash
./start.sh
```

This starts both servers:

- Backend API: http://localhost:8000
- Frontend UI: http://localhost:5173
Open http://localhost:5173 in your browser. The UI shows a loading spinner while models are initializing (typically 5–15 seconds), then presents the upload form.
Press Ctrl+C to stop both servers.
```
local-transcriber-app/
├── backend/
│   ├── main.py            # FastAPI app — routes, lifespan, background job runner
│   ├── job_store.py       # In-memory job store with threading.Lock and TTL cleanup
│   ├── transcriber.py     # Whisper large-v3 via pywhispercpp / whisper.cpp (Metal)
│   ├── summarizer.py      # Llama-3.1-8B-Instruct Q4_K_M via llama-cpp-python (Metal)
│   ├── models/            # GGUF model file (gitignored)
│   └── tmp/               # Ephemeral audio and output files (gitignored)
├── frontend/
│   └── src/
│       ├── App.tsx        # State machine: loading → idle → processing → complete/error
│       └── components/
│           ├── UploadCard.tsx     # Drag-and-drop upload, format/mode toggles
│           ├── ProgressBar.tsx    # Polling /status every 2s, indeterminate then determinate
│           └── DownloadPanel.tsx  # Download trigger and reset
└── start.sh               # Starts uvicorn + Vite dev server, kill -9 on Ctrl+C
```
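`job_store.py` is described above as an in-memory job store guarded by a `threading.Lock` with TTL cleanup. A minimal sketch of that pattern follows; the class and field names are illustrative rather than the actual implementation:

```python
# Illustrative sketch of an in-memory job store with a lock and TTL-based cleanup.
import threading
import time
import uuid


class JobStore:
    def __init__(self, ttl_seconds: float = 3600):
        self._jobs: dict[str, dict] = {}
        self._lock = threading.Lock()
        self._ttl = ttl_seconds

    def create(self) -> str:
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "queued", "progress": 0, "created_at": time.time()}
        return job_id

    def update(self, job_id: str, **fields) -> None:
        with self._lock:
            self._jobs[job_id].update(fields)

    def get(self, job_id: str) -> dict | None:
        with self._lock:
            return self._jobs.get(job_id)

    def cleanup_expired(self) -> None:
        # Drop jobs older than the TTL so the store doesn't grow without bound.
        cutoff = time.time() - self._ttl
        with self._lock:
            expired = [jid for jid, job in self._jobs.items() if job["created_at"] < cutoff]
            for jid in expired:
                del self._jobs[jid]
```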
- User selects a file, output format (TXT/Markdown), and mode (transcript only or transcript + summary)
- `POST /transcribe` — the file is saved to `backend/tmp/`, and a job is created and queued
- A background task acquires the inference semaphore (serializing jobs) and runs transcription in a thread pool executor so the event loop stays unblocked (see the sketch after this list)
- If summarization is requested, Llama-3.1-8B runs after transcription completes
- The frontend polls `GET /status/{job_id}` every 2 seconds; progress advances 0→80% during transcription and 80→100% during summarization
- On completion, `GET /download/{job_id}` returns the file and schedules cleanup of all temporary files for that job
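A minimal sketch of the serialization pattern described above, using an `asyncio.Semaphore(1)` and a thread pool executor; `transcribe_file` and `summarize_text` are hypothetical stand-ins for the real transcriber and summarizer calls:

```python
# Illustrative sketch: serialize jobs with a semaphore and keep the event loop
# responsive by running blocking model calls in a thread pool executor.
import asyncio
from concurrent.futures import ThreadPoolExecutor

inference_semaphore = asyncio.Semaphore(1)   # only one inference job at a time
executor = ThreadPoolExecutor(max_workers=1)


def transcribe_file(audio_path: str) -> str:
    # Placeholder for the blocking Whisper call in transcriber.py.
    return f"transcript of {audio_path}"


def summarize_text(transcript: str) -> str:
    # Placeholder for the blocking Llama call in summarizer.py.
    return f"summary of {transcript}"


async def run_job(audio_path: str, summarize: bool) -> str:
    async with inference_semaphore:          # queued jobs wait here until the slot frees
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(executor, transcribe_file, audio_path)
        if summarize:
            result = await loop.run_in_executor(executor, summarize_text, result)
        return result


if __name__ == "__main__":
    print(asyncio.run(run_job("example.mp3", summarize=True)))
```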
| Model | Purpose | Runtime | Approximate memory |
|---|---|---|---|
| Whisper large-v3 | Transcription | pywhispercpp / whisper.cpp (Metal) | ~3.1 GB |
| Llama-3.1-8B-Instruct Q4_K_M | Summarization | llama-cpp-python / Metal | ~4.7 GB |
- Transcription speed: Whisper large-v3 runs via whisper.cpp with Metal acceleration on Apple Silicon. The Whisper model auto-downloads (~3.1 GB) to `~/Library/Application Support/pywhispercpp/models/` on first startup. Rough speed: ~15–20× realtime on Apple Silicon.
- Summarization: Llama-3.1-8B must run with `verbose=True` in llama-cpp-python; `verbose=False` suppresses file descriptors in a way that breaks Metal inference on macOS (see the sketch below).
- Job queue: Only one job runs at a time. A second upload while a job is in progress will queue and start automatically when the first finishes.
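For reference, a summarizer setup along these lines might look like the sketch below; the model path matches this repo's layout, while `n_ctx`, the prompt, and other parameter values are illustrative:

```python
# Illustrative sketch: load the GGUF model with Metal offload and verbose=True,
# which this project requires for Metal inference to work on macOS.
from llama_cpp import Llama

llm = Llama(
    model_path="backend/models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    n_gpu_layers=-1,    # offload all layers to Metal
    n_ctx=8192,         # context window; illustrative value
    verbose=True,       # verbose=False breaks Metal inference on macOS (see note above)
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Summarize the transcript concisely."},
        {"role": "user", "content": "<transcript text>"},
    ],
)
print(response["choices"][0]["message"]["content"])
```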
