Gemini 2.5 Flash Native Audio — Telephony Compatibility & Output Sample Rate Support #1428

@smeetagrawal

Hi,
I've been integrating the Gemini 2.5 Flash Native Audio model with telephony vendors (Plivo, Exotel). The model accepts input at 16 kHz and produces output at 24 kHz. While 16 kHz input works seamlessly with telephony, I'm observing noticeable latency on the output side at 24 kHz.
The model works well over browser-based WebRTC, but the telephony experience is impacted. I have two questions:

1. Is the Gemini 2.5 Flash Native Audio model intended to support telephony use cases, or is it currently optimized primarily for browser/WebRTC?
2. Are there plans to support lower output sample rates (16 kHz / 8 kHz)? I understand this would reduce audio quality, but it would significantly improve latency for telephony and unlock a wide range of voice-based use cases.
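
For context, the output-side conversion a telephony bridge has to perform today can be sketched roughly as below. This is an illustration only, not the actual integration code: it assumes 16-bit little-endian mono PCM and an 8 kHz telephony leg, the names `downsample_24k_to_8k` and `frame_bytes` are hypothetical, and the three-sample averaging stands in for the proper low-pass resampling filter a production bridge would use.

```python
import struct

MODEL_OUTPUT_RATE_HZ = 24_000  # output rate stated in this issue
TELEPHONY_RATE_HZ = 8_000      # assumption: narrowband telephony leg


def downsample_24k_to_8k(pcm16: bytes) -> bytes:
    """Decimate 16-bit little-endian mono PCM by 3.

    Each output sample is the average of three input samples, which is
    only a crude low-pass filter; trailing samples that do not fill a
    complete group of three are dropped.
    """
    factor = MODEL_OUTPUT_RATE_HZ // TELEPHONY_RATE_HZ  # 3
    sample_count = len(pcm16) // 2
    usable = sample_count - sample_count % factor
    samples = struct.unpack(f"<{usable}h", pcm16[: usable * 2])
    averaged = [
        sum(samples[i : i + factor]) // factor
        for i in range(0, usable, factor)
    ]
    return struct.pack(f"<{len(averaged)}h", *averaged)


def frame_bytes(rate_hz: int, frame_ms: int = 20) -> int:
    """Size in bytes of one mono 16-bit PCM frame of the given duration."""
    return rate_hz * frame_ms // 1000 * 2
```

Under these assumptions, a 20 ms frame is 960 bytes at 24 kHz and 320 bytes at 8 kHz, so a native lower-rate output would remove this conversion step and cut the audio payload per frame to a third.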

Labels

api:gemini-api
priority: p3 (Desirable enhancement or fix. May not be included in next release.)
status:awaiting user response (Issues requiring a response from the user.)
status:stale (Issues tagged for cleanup by the stale issue workflow.)
type: question (Request for information or clarification. Not an issue.)
