Gemini 2.5 Flash Native Audio — Telephony Compatibility & Output Sample Rate Support #1428
Description
Hi,
I've been integrating the Gemini 2.5 Flash Native Audio model with telephony vendors (Plivo, Exotel). The model accepts input at 16 kHz and produces output at 24 kHz. While 16 kHz input works seamlessly with telephony, I'm observing noticeable latency on the output side at 24 kHz.
The model works well over browser-based WebRTC, but the telephony experience is impacted. I have two questions:
1. Is the Gemini 2.5 Flash Native Audio model intended to support telephony use cases, or is it currently optimized primarily for browser/WebRTC?
2. Are there plans to support lower output sample rates (16 kHz / 8 kHz)? I understand this would reduce audio quality, but it would significantly improve latency for telephony and unlock a wide range of voice-based use cases.
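For context, without a lower output rate the telephony bridge has to downsample the 24 kHz stream itself before handing it to the carrier. Below is a minimal sketch of that step, assuming the output is mono 16-bit little-endian PCM and the target is 8 kHz (an integer factor of 3). The function name `downsample_pcm16` is hypothetical, and the block averaging is only a crude low-pass filter; a production bridge would likely use a proper resampler.

```python
import struct


def downsample_pcm16(pcm: bytes, factor: int = 3) -> bytes:
    """Downsample mono 16-bit little-endian PCM by an integer factor.

    Each group of `factor` samples is averaged (a crude low-pass filter)
    and replaced by one sample, e.g. 24 kHz -> 8 kHz with factor=3.
    Trailing samples that do not fill a whole group are dropped.
    """
    count = len(pcm) // 2  # number of complete 16-bit samples
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    usable = count - count % factor
    averaged = [
        sum(samples[i:i + factor]) // factor
        for i in range(0, usable, factor)
    ]
    return struct.pack(f"<{len(averaged)}h", *averaged)


# Assumed framing: 20 ms of 24 kHz audio is 480 samples (960 bytes).
frame_24k = struct.pack("<480h", *([1000] * 480))
frame_8k = downsample_pcm16(frame_24k)
print(len(frame_8k) // 2)  # samples in the 8 kHz frame
```

Doing this per chunk on the client adds buffering and CPU work on every response, which is part of why a native 8 kHz or 16 kHz output option would help.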