A voice transcription tool using faster-whisper that records audio and converts speech to text on Linux systems.
- Real-time audio recording using Linux's `sox` utility
- Speech-to-text transcription using faster-whisper
- Automatic clipboard copy of transcribed text (using `wl-copy` for Wayland)
- Voice activity detection (VAD) to filter silence
- Support for different whisper models (small.en, large-v3, distil-medium.en)
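The transcription and VAD features listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `transcribe.py`: the helper names `join_segments` and `transcribe_file` are hypothetical, and the `device="cpu"` and `compute_type="int8"` settings are assumptions for a CPU-only machine.

```python
def join_segments(segments):
    """Concatenate the text of transcribed segments into one string."""
    texts = (seg.text.strip() for seg in segments)
    return " ".join(text for text in texts if text)


def transcribe_file(path, model_size="small.en"):
    """Transcribe one audio file with VAD enabled (hypothetical helper)."""
    # Imported lazily so join_segments works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # device and compute_type are assumptions, not settings taken from the tool.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # vad_filter=True asks faster-whisper to skip stretches without speech.
    segments, _info = model.transcribe(str(path), vad_filter=True)
    return join_segments(segments)
```

`model.transcribe` returns a lazy generator of segments, so the actual decoding happens while `join_segments` iterates over it.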
- Linux operating system
- Python >= 3.12
- `sox` for audio recording:

      # Ubuntu/Debian
      sudo apt install sox
      # Fedora
      sudo dnf install sox

- `wl-clipboard` for Wayland clipboard support:

      # Ubuntu/Debian
      sudo apt install wl-clipboard
      # Fedora
      sudo dnf install wl-clipboard

- `uv` package manager
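A quick way to confirm that the requirements above are met is a small preflight check. The sketch below is not part of the tool; the function names are hypothetical, and it only verifies that the executables are on `PATH` and that the interpreter is new enough.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter meets the Python >= 3.12 requirement."""
    return tuple(version_info[:2]) >= (3, 12)


def report():
    """Print any unmet requirement; print nothing when all are satisfied."""
    # wl-copy is the executable installed by the wl-clipboard package.
    missing = missing_tools(["sox", "wl-copy", "uv"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    if not python_is_supported():
        print("Python 3.12 or newer is required")
```

Calling `report()` before the first run prints any unmet requirement and stays silent otherwise.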
- Install uv

      curl -LsSf https://astral.sh/uv/install.sh | sh

- Install dependencies

      # Ubuntu
      sudo apt install sox wl-clipboard python3

- Clone this repository
Run directly with Python:

    OMP_NUM_THREADS=2 uv run transcribe.py

Or use the terminal launcher, which opens a terminal and runs the script inside it. This is useful for sway hotkeys.

    ./run_in_terminal.sh

- The program will start recording automatically
- Press Enter to stop recording
- Wait for transcription to complete
- The transcribed text will be copied to clipboard automatically
- Press Enter to record another message or 'q' + Enter to quit
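The record, stop, and repeat flow described above could look roughly like this. It is a sketch under stated assumptions rather than the contents of `transcribe.py`: the function names and the `recording_N.wav` file naming are hypothetical, and the 16 kHz mono recording format is an assumption (Whisper models operate on 16 kHz mono audio).

```python
import signal
import subprocess
from pathlib import Path

RECORDINGS_DIR = Path("/tmp/recordings")  # location named in this README


def build_record_command(output_path):
    """Build a sox command that records the default input device to a file."""
    # -d selects the default audio device; -r and -c set the output format.
    return ["sox", "-d", "-r", "16000", "-c", "1", str(output_path)]


def wants_quit(user_input):
    """True when the user typed 'q' before pressing Enter."""
    return user_input.strip().lower() == "q"


def record_until_enter(output_path):
    """Record until Enter is pressed, then stop sox."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    recorder = subprocess.Popen(build_record_command(output_path))
    input("Recording... press Enter to stop ")
    # An interrupt signal (the same as Ctrl-C) lets sox finish the file and exit.
    recorder.send_signal(signal.SIGINT)
    recorder.wait()


def main():
    count = 0
    while True:
        count += 1
        record_until_enter(RECORDINGS_DIR / f"recording_{count}.wav")
        # Transcription and the clipboard copy would happen here.
        if wants_quit(input("Enter to record again, 'q' + Enter to quit: ")):
            break
```

Calling `main()` starts the loop. Keeping `build_record_command` and `wants_quit` as pure functions makes them easy to check without a microphone.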
You can change the model size in `transcribe.py`:

    # Available options:
    model_size = "small.en"  # Faster, less accurate
    # model_size = "large-v3"  # Slower, more accurate
    # model_size = "distil-medium.en"  # Balanced performance

- This tool is designed specifically for Linux systems running Wayland
- For X11 systems, you'll need to change the clipboard command from `wl-copy` to `xclip`
- Recorded audio files are temporarily stored in `/tmp/recordings/`
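One way to handle both Wayland and X11 is to choose the clipboard command from the session type. The sketch below is an illustration, not the tool's current behavior: the function names are hypothetical, and falling back to Wayland when `XDG_SESSION_TYPE` is unset is an assumption. Note that `xclip` writes to the primary selection unless `-selection clipboard` is given.

```python
import os
import subprocess


def clipboard_command(session_type):
    """Pick a clipboard command for the given XDG session type."""
    if session_type == "wayland":
        return ["wl-copy"]
    # xclip targets the primary selection by default, so request the clipboard.
    return ["xclip", "-selection", "clipboard"]


def copy_to_clipboard(text):
    """Pipe `text` to the clipboard tool for the current session."""
    # Assumption: treat an unset session type as Wayland, the tool's target.
    session = os.environ.get("XDG_SESSION_TYPE", "wayland")
    subprocess.run(clipboard_command(session), input=text.encode("utf-8"), check=True)
```

Both tools read the text to copy from standard input, so the same `subprocess.run` call works for either command.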
MIT