Secure and scalable speech transcription for local and HPC #2723

pablobernabeu · 2026-01-31T14:56:06Z


Cloud-based speech-to-text services are convenient, but they often impose file size limits, offer little transparency for reproducible research, and can pose privacy risks under regulations such as the GDPR. To address these limitations, this project introduces a production-ready, local transcription workflow built on OpenAI's Whisper models. The system is self-contained, so all data stays under the user's control, and it is designed to scale, supporting batch operations on high-performance computing (HPC) clusters with GPU acceleration. The workflow also includes quality-control and privacy features: algorithms that detect and remove AI-generated repetitions, context-aware name masking, speaker diarisation, and a flexible audio enhancement pipeline. Implemented as a single Python script, the system offers a robust, reproducible, and secure alternative for academic and enterprise transcription.
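To give a rough idea of what detecting and removing repetitions can involve, here is a minimal sketch in Python. It is not the project's actual algorithm: the function name `collapse_repetitions` and the parameters `max_phrase_len` and `min_repeats` are hypothetical, chosen only for illustration. It assumes the unwanted repetition is a phrase looped back-to-back several times, and it keeps a single copy of any phrase that repeats at least `min_repeats` times in a row.

```python
def collapse_repetitions(text: str, max_phrase_len: int = 10, min_repeats: int = 3) -> str:
    """Collapse phrases repeated back-to-back at least `min_repeats` times.

    Illustrative sketch only (hypothetical helper, not the project's code).
    """
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        collapsed = False
        # Try longer candidate phrases first, down to single words.
        longest = min(max_phrase_len, (len(words) - i) // 2)
        for n in range(longest, 0, -1):
            phrase = words[i:i + n]
            count = 1
            # Count how many times the phrase repeats consecutively.
            while words[i + count * n:i + (count + 1) * n] == phrase:
                count += 1
            if count >= min_repeats:
                out.extend(phrase)   # keep one copy of the looped phrase
                i += count * n       # skip all the repeats
                collapsed = True
                break
        if not collapsed:
            out.append(words[i])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    looped = "thank you thank you thank you thank you for watching"
    print(collapse_repetitions(looped))
```

The `min_repeats` threshold is a design choice in this sketch: natural speech contains short doublings such as "very very", so only longer runs are treated as artefacts. A real pipeline would likely also need to handle punctuation, near-duplicates, and repetition across segment boundaries.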

Click to view documentation and source code

Reference

Bernabeu, P. (2025). Secure and scalable speech transcription for local and HPC (Version 1.0.0) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.17624830
