Thanks for the library. I'm reporting this here because the issue is triggered by a @google/genai Live API config value that the client accepts and forwards. The underlying failure may be a Gemini Live backend/product issue, but it would be helpful to know whether the JS SDK should validate, reject, clamp, or document this configuration for native-audio Live TTS.
Environment details
- Programming language: TypeScript / JavaScript
- OS: macOS 26.3.1, arm64
- Language runtime version: Node.js v22.22.2
- Package version: @google/genai ^1.47.0
Model / API path
- Model: models/gemini-3.1-flash-live-preview
- API: Gemini Live API via ai.live.connect
- Output modality: audio only
- Use case: native-audio TTS, text input sent to Live session
- Voice tested: puck; also compared with AI Studio-style Zephyr
- Language: Belarusian
Steps to reproduce
- Create a Live API session with models/gemini-3.1-flash-live-preview.
- Configure audio response modality and a normal speechConfig.
- Add explicit temperature: 0 to the Live config.
- Send Belarusian text to be spoken, either through sendRealtimeInput or AI Studio-style sendClientContent.
- Collect raw audio chunks before playback and observe session close metadata.
- Repeat a few times. The failure is intermittent but reproducible.
Minimal shape:
import { GoogleGenAI, Modality } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const session = await ai.live.connect({
  model: "models/gemini-3.1-flash-live-preview",
  config: {
    responseModalities: [Modality.AUDIO],
    temperature: 0,
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: {
          voiceName: "puck",
        },
      },
    },
    systemInstruction: {
      parts: [
        {
          text: "You are a robotic Text-To-Speech engine. Read the input text aloud exactly as provided. Language: Belarusian.",
        },
      ],
    },
  },
  callbacks: {
    onmessage(message) {
      // collect message.serverContent.modelTurn.parts[].inlineData.data
      // into raw PCM/WAV before playback
    },
    onerror(error) {
      console.error(error);
    },
    onclose(event) {
      console.log("closed", event.code, event.reason, event.wasClean);
    },
  },
});

session.sendClientContent({
  turns: [
    "Вядома, я ўсталю таймер на пяць хвілін і нагадаю вам, калі час скончыцца. Каб праверыць сінтэз маўлення, я прачытаю яшчэ некалькі беларускіх сказаў без зменаў і без адказаў на пытанні. Шчыра кажучы, гэта павінна гучаць натуральна: вучымся, груша, вугаль, ґанак, каўнер, рака, дзеці. Праігрываю гурт Naviband са spotify, але не перафармулёўваю гэты сказ і не замяняю словы сінонімамі. Калі ў тэксце ёсць пытанне, напрыклад: ці ўсё добра?, я проста агучваю пытанне як напісана.",
  ],
});
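The collection step inside onmessage could be sketched roughly as follows. This is a minimal sketch, not part of the SDK: AudioMessageLike, extractAudioChunks and PcmCollector are hypothetical names, the message shape is reduced to the fields named in the comment above, and inlineData.data is assumed to be base64-encoded.

```typescript
// Hypothetical, reduced message shape: only the fields this sketch reads.
interface AudioMessageLike {
  serverContent?: {
    modelTurn?: {
      parts?: Array<{ inlineData?: { data?: string; mimeType?: string } }>;
    };
  };
}

// Decode every base64 inlineData payload found in one server message.
function extractAudioChunks(message: AudioMessageLike): Buffer[] {
  const parts = message.serverContent?.modelTurn?.parts ?? [];
  const chunks: Buffer[] = [];
  for (const part of parts) {
    const data = part.inlineData?.data;
    if (data) {
      chunks.push(Buffer.from(data, "base64"));
    }
  }
  return chunks;
}

// Accumulates raw PCM so the total size can be inspected before any playback.
class PcmCollector {
  private readonly chunks: Buffer[] = [];
  private bytes = 0;

  add(message: AudioMessageLike): void {
    for (const chunk of extractAudioChunks(message)) {
      this.chunks.push(chunk);
      this.bytes += chunk.length;
    }
  }

  get totalBytes(): number {
    return this.bytes;
  }

  toBuffer(): Buffer {
    return Buffer.concat(this.chunks);
  }
}
```

With something like this, onmessage would call collector.add(message), and onclose could log collector.totalBytes next to event.code so that oversized runs are visible without playing anything.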
Expected behavior
The Live session should return bounded audio corresponding to the input text, then complete/close cleanly. If temperature: 0 is unsupported or unsafe for this model/output mode, the SDK or API should reject it or document the supported range.
Actual behavior
With explicit temperature: 0, the Live native-audio TTS response can become pathological before playback:
- Audio output becomes oversized/runaway, far longer than the input text requires.
- In some runs, the session closes with 1011 Resource exhausted.
- In other baseline runs, there is no usable audio.
- The problem is visible in raw PCM/WAV capture before any local playback, so this does not appear to be a speaker/playback issue.
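One way to make "oversized" concrete on the client is a byte budget derived from the input length. The sketch below is a hypothetical guard, not an SDK feature: RunawayAudioGuard and audioByteBudget are illustrative names, the speaking-rate floor and safety factor are placeholder values that would need tuning, and 16-bit mono PCM at 24 kHz is an assumption about the stream format.

```typescript
// Assumed output format: 24 kHz, 16-bit, mono PCM (48,000 bytes per second).
const BYTES_PER_SECOND = 24_000 * 2;

// Placeholder tuning values; assumptions, not measured figures.
const MIN_CHARS_PER_SECOND = 8;
const SAFETY_FACTOR = 2;

// Upper bound on the plausible audio size for a given input text.
function audioByteBudget(text: string): number {
  const maxSeconds = (text.length / MIN_CHARS_PER_SECOND) * SAFETY_FACTOR;
  return Math.ceil(maxSeconds * BYTES_PER_SECOND);
}

// Tracks received bytes and reports when the budget has been exceeded.
class RunawayAudioGuard {
  private received = 0;
  private readonly budgetBytes: number;

  constructor(budgetBytes: number) {
    this.budgetBytes = budgetBytes;
  }

  // Returns true once the cumulative audio exceeds the budget.
  record(chunkBytes: number): boolean {
    this.received += chunkBytes;
    return this.received > this.budgetBytes;
  }
}
```

A caller could close the session as soon as record returns true instead of waiting for the server to end it with 1011.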
Evidence
Using the same Belarusian prompt/text:
- Product-style Live path with explicit temperature: 0:
  - 1/3 oversized failures.
  - Failing sample: 8,911,204 raw PCM bytes, about 185.65s at 24 kHz.
  - Close code: 1011 Resource exhausted.
- Manual baseline with explicit temperature: 0:
  - 5 attempts: 3 OK, 1 no-audio, 1 oversized/error.
  - Oversized sample: 10,930,564 raw PCM bytes, about 227.7s at 24 kHz.
  - Close code: 1011 Resource exhausted.
- Literal AI Studio-style control without explicit temperature:
- Temperature isolation using literal AI Studio-style session shape:
  - Without explicit temperature: 3/3 clean.
  - With temperature: 0: 1/3 oversized.
  - Failing sample: 12,616,804 raw PCM bytes, about 262.85s at 24 kHz.
  - Close code: 1011 Resource exhausted.
- After removing explicit temperature from our production TTS path:
  - 20/20 product-path attempts clean.
  - 0 oversized/no-audio/1011 failures.
We have only reproduced and validated this with Belarusian input; we have not tested whether the same temperature: 0 behavior occurs with English or other languages.
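For reference, the durations quoted above follow from the byte counts if the stream is 16-bit mono PCM at 24 kHz. The sample rate is from the measurements; the sample width and channel count are assumptions that happen to match the reported figures. pcmDurationSeconds is an illustrative helper name.

```typescript
// 24 kHz is the rate reported above; 16-bit mono is an assumption.
const SAMPLE_RATE_HZ = 24_000;
const BYTES_PER_SAMPLE = 2;
const CHANNELS = 1;

// Converts a raw PCM byte count into a playback duration in seconds.
function pcmDurationSeconds(byteLength: number): number {
  return byteLength / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS);
}

for (const bytes of [8_911_204, 10_930_564, 12_616_804]) {
  console.log(bytes, pcmDurationSeconds(bytes).toFixed(2));
}
// prints:
// 8911204 185.65
// 10930564 227.72
// 12616804 262.85
```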
Why I'm filing here
This may ultimately be a Gemini Live product/backend bug, but from the JS client side, it is unclear whether temperature: 0 is valid for gemini-3.1-flash-live-preview native-audio output.
Could you clarify whether:
- temperature: 0 is supported for Gemini Live native-audio TTS?
- the JS SDK should validate or reject unsupported temperature ranges for this mode?
- the docs should mention a safe temperature range or recommend omitting temperature for native-audio TTS?
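Until the supported range is clarified, the caller-side workaround consistent with the results above is to leave temperature unset for audio-only sessions. The sketch below is hypothetical: LiveTtsOptions and buildLiveTtsConfig are illustrative names, not SDK APIs, and the returned object only mirrors the config fields used in the minimal shape above. The modality is written as the string "AUDIO" on the assumption that it matches Modality.AUDIO, to keep the sketch dependency-free.

```typescript
// Hypothetical caller-side options for a native-audio TTS session.
interface LiveTtsOptions {
  voiceName: string;
  systemText: string;
  temperature?: number; // accepted here, but deliberately not forwarded
}

// Builds a Live config object without a temperature field.
function buildLiveTtsConfig(options: LiveTtsOptions) {
  if (options.temperature !== undefined) {
    console.warn("temperature is not forwarded for native-audio TTS in this sketch");
  }
  return {
    responseModalities: ["AUDIO"],
    speechConfig: {
      voiceConfig: { prebuiltVoiceConfig: { voiceName: options.voiceName } },
    },
    systemInstruction: { parts: [{ text: options.systemText }] },
  };
}
```

Whether the SDK itself should do something similar, or reject the value outright, is exactly the open question in this report.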