Evaluation: whisper-1 on Indian-context audio — Hinglish, names, addresses, call-center phrases #2761
weekendpm started this conversation in Show and tell
What this is
A structured benchmark of whisper-1 across six categories of Indian-context audio: pure Hindi, pure English (Indian vocabulary), Hinglish (code-mixed), Indian proper nouns, addresses with PIN codes, and call-center phrases.
Methodology transparency upfront:

- Audio is synthetic, generated with gTTS (`hi` for Hindi/Hinglish, `en` for English)
- Transcription via `whisper-1` through the OpenAI API

The synthetic audio means results are directional, not definitive. Real human recordings — especially for Hinglish — would shift numbers. Treating this as a baseline and an invitation for the community to build on it.
Dataset (clips + ground truth + results CSV): https://huggingface.co/datasets/Primepluto/hinglish-whisper-benchmark
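The post doesn't state how accuracy was scored against the ground truth, so for anyone reproducing numbers from the results CSV, here is a plain word error rate (WER) over whitespace tokens — the usual baseline metric, but an assumption on my part, not necessarily what was used here:

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edits needed to turn the first i ref words into the first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deletions
    for j in range(len(h) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1  # substitution cost
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / max(len(r), 1)
```

Note that plain WER punishes script mismatches maximally: a Devanagari hypothesis against a romanized reference scores near 100% even when the words are "right," which matters for Finding 1 below.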
Results
Finding 1: No output path for romanized Hinglish
All five Hinglish clips were spoken as code-mixed Hindi-English (romanized, e.g. "Mera order abhi tak deliver nahi hua, please check karo"). whisper-1 consistently transcribed them in Devanagari:

```
REF: Mera order abhi tak deliver nahi hua, please check karo.
HYP: mera order abhi tak deliver nahi hua, please check karo. (Devanagari in actual output)

REF: Support team se baat karni hai, hold pe mat rakho.
HYP: (fully Devanagari output)
```
This is expected given `language=hi` was set. But the broader issue is structural: there is no way to request romanized Hinglish output from whisper-1. For developers building Indian support-call or WhatsApp transcription tools where downstream systems expect Latin-script text, this is a blocker. No flag, no workaround via the API today.

Finding 2: Hindi loanword spelling inconsistency
Whisper transcribes English loanwords in Hindi with non-standard Devanagari spellings. Examples from actual output:
- `order` (loanword) transcribed with a non-standard Devanagari variant
- `password` transcribed phonetically but incorrectly

These aren't random errors — they reflect real ambiguity in how English loanwords are written in Devanagari. But for downstream NLP tasks like search or entity extraction, inconsistency across runs is more damaging than a fixed variant.
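One mitigation, if you standardize on a single canonical spelling per loanword, is a post-processing normalization table. The Devanagari variants below are illustrative examples I've picked, not the exact strings whisper-1 produced:

```python
# Hypothetical normalization table: maps spelling variants to one
# canonical Devanagari form. Variants shown are illustrative.
LOANWORD_VARIANTS = {
    "आर्डर": "ऑर्डर",      # "order"
    "ओर्डर": "ऑर्डर",
    "पास्वर्ड": "पासवर्ड",  # "password"
}

def normalize_loanwords(text: str) -> str:
    """Replace known loanword spelling variants with a canonical form."""
    for variant, canonical in LOANWORD_VARIANTS.items():
        text = text.replace(variant, canonical)
    return text
```

A fixed table obviously doesn't scale to the long tail of loanwords, but it makes search and entity extraction deterministic for the terms you care about, which is the actual pain point here.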
Finding 3: PIN code digit hallucination
One address clip had a 6-digit PIN (400050) transcribed with an extra digit (4000050) — making the PIN invalid. Small sample, needs more testing, but worth flagging for any address or logistics use case.
Finding 4: Indian proper noun accuracy
In English-language context, proper nouns are slightly off but recognizable:

- `Koramangala` transcribed as `Kormangala`
- `Abhijit` transcribed as `Abhijeet`

In Hindi-language context, names get fully transliterated to Devanagari, losing the Latin spelling entirely — which breaks any downstream system expecting the original name string.
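If your downstream system expects Latin-script strings, it's worth classifying each transcript's script at ingestion so fully transliterated output fails loudly instead of silently. A minimal stdlib check (my own sketch, not part of the benchmark):

```python
def script_of(text: str) -> str:
    """Classify a transcript as 'devanagari', 'latin', or 'mixed'
    by counting letters in each script. Digits/punctuation are ignored."""
    dev = sum(1 for ch in text if "\u0900" <= ch <= "\u097F")  # Devanagari block
    lat = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    if dev and lat:
        return "mixed"
    return "devanagari" if dev else "latin"
```

For genuinely code-mixed Hinglish you'd expect `mixed`; a `devanagari` result on a clip whose reference is romanized is exactly the Finding 1 failure mode.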
What I'd want to see next
- Real human recordings — especially Hinglish from actual speakers. gTTS Hinglish is too clean and phonetically Hindi-dominant. Real code-mixed speech would stress the model differently.
- Auto language-detection behavior — I set language explicitly. What does whisper-1 do with Hinglish audio when language is unset? Does it detect `hi` or `en`? That changes the output script entirely.
- large-v3 comparison — this benchmark used whisper-1 (API). Would large-v3 handle loanword spelling more consistently?
The dataset is small and synthetic, but the categories and ground truth are reusable. If you have real Hinglish recordings — WhatsApp voice notes, support call clips, anything CC-licensed — happy to extend this.