Swap ElevenLabs SDK for @speech-sdk/core and add a provider picker by piersonmarks · Pull Request #35 · BolajiAyodeji/chat-with-siri

piersonmarks · 2026-04-08T18:18:19Z

What I did

Replaces the direct elevenlabs SDK integration in /api/speech with @speech-sdk/core — a unified, multi-provider text-to-speech SDK — and adds a Speech Provider dropdown next to the existing voice picker so users can switch between TTS backends at runtime.

Concretely:

Added @speech-sdk/core dependency (^0.4.1)
New provider registry at app/utils/providers.ts — single source of truth for supported providers (currently ElevenLabs + OpenAI) with static voice catalogs where applicable
/api/speech now constructs a ResolvedModel via createElevenLabs / createOpenAI and calls generateSpeech({ model, text, voice }); responds with the SDK-reported mediaType instead of hardcoding audio/mpeg
getVoices(provider) accepts a provider arg and returns a normalized ProviderVoice shape — ElevenLabs voices stay dynamic (via the ElevenLabs voices API), OpenAI uses the static catalog (alloy / ash / ballad / coral / echo / fable / nova / onyx / sage / shimmer)
ChatVoice renders two <select>s (provider + voice). Provider + voice are persisted to localStorage; switching provider auto-snaps selectedVoice to that provider's default when the prior selection isn't in the new catalog
Fixed a latent bug surfaced by the provider change: the chat page was gated on voices.length === 0 as a loading indicator, which caused the UI to hang forever if the ElevenLabs key lacked voices_read scope. Replaced with an explicit voicesLoading flag that flips off after the first fetch attempt (success or failure), so users can always reach the provider dropdown and switch backends
Updated README with a "Speech providers" section documenting the dropdown, the two built-in providers, and how to add more
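
The voice-snapping rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `ProviderVoice` fields, the `snapVoice` helper, and the `OPENAI_VOICES` constant name are assumptions made for the example. Only the ten OpenAI voice ids and the `alloy` default come from the description.

```typescript
// Assumed normalized voice shape; the real ProviderVoice may carry more fields.
interface ProviderVoice {
  id: string;
  name: string;
}

// Static OpenAI catalog, using the ten voice ids listed in the PR description.
const OPENAI_VOICES: ProviderVoice[] = [
  "alloy", "ash", "ballad", "coral", "echo",
  "fable", "nova", "onyx", "sage", "shimmer",
].map((id) => ({ id, name: id }));

// Hypothetical helper: keep the previous selection if the new provider's
// catalog contains it, otherwise fall back to that provider's default voice.
function snapVoice(
  previousVoiceId: string | null,
  catalog: ProviderVoice[],
  defaultVoiceId: string,
): string {
  if (previousVoiceId !== null && catalog.some((v) => v.id === previousVoiceId)) {
    return previousVoiceId;
  }
  return defaultVoiceId;
}
```

For example, switching from an ElevenLabs voice id to the OpenAI catalog would yield `alloy`, while a previously selected `nova` would be kept.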

ElevenLabs remains the default so behavior at chat-with-siri.vercel.app is unchanged for anyone who doesn't touch the new dropdown.

Minor behavior change to call out: the previous route passed voice_settings: { similarity_boost: 0.5, stability: 0.5 } to ElevenLabs directly. @speech-sdk/core uses provider defaults; these can be reintroduced via providerOptions if you'd like them back — happy to add that in a follow-up commit if you prefer.

Closes:

How to test

git checkout <this-branch>
npm install
cp .env.example .env.local
# add OPENAI_API_KEY and ELEVENLABS_API_KEY
npm run dev

Then at http://localhost:3000/chat:

ElevenLabs regression check — provider dropdown defaults to "ElevenLabs", voice list populates dynamically, sending a message plays audio. Behavior should be identical to main.
OpenAI path — switch the provider dropdown to "OpenAI". The voice dropdown swaps to the 10 static OpenAI voices, selection snaps to alloy. Send a message — audio plays via OpenAI TTS (tts-1).
Persistence — reload the page; provider + voice selection survives via localStorage.
Error surface — temporarily break one of the keys; the toast shows the correct provider name ("Your OpenAI API Key is invalid…" vs ElevenLabs).

npx tsc --noEmit is clean and npm run build passes.

Any background context you want to add?

@speech-sdk/core is an open-source, MIT-licensed multi-provider TTS SDK — repo at Jellypod-Inc/speech-sdk. Full disclosure: I'm one of the maintainers, which is part of why I was interested in this swap — chat-with-siri's architecture (single ElevenLabs integration with a voice picker) is exactly the shape the SDK abstracts cleanly, and it was a good real-world test. No hard feelings if you'd rather not take the dependency.
Adding more providers (Deepgram, Cartesia, Hume, Google, Fish Audio, xAI, etc.) is now a two-line change: append to SPEECH_PROVIDERS in app/utils/providers.ts and add a case in app/api/speech/route.ts. I kept it to ElevenLabs + OpenAI in this PR to minimize review surface.
I deliberately did not touch /api/chat, the API-key modal, styling, or any unrelated code — scoped the PR tightly per your contributor guide.
No tests added because the repo has no test harness and introducing one felt out of scope. Happy to add a vitest setup in a separate PR if you'd find that useful.
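
To illustrate the "append to the registry, add a case" pattern mentioned above, here is a self-contained sketch. The registry entry fields, `ModelStub`, `parseProvider`, and `resolveModel` are hypothetical stand-ins. The real route builds a model through the SDK factories, which are deliberately not reproduced here, and the fallback to ElevenLabs for unknown values is an assumption.

```typescript
// Hypothetical registry; the real SPEECH_PROVIDERS entries may look different.
const SPEECH_PROVIDERS = [
  { id: "elevenlabs", label: "ElevenLabs" },
  { id: "openai", label: "OpenAI" },
] as const;

type ProviderId = (typeof SPEECH_PROVIDERS)[number]["id"];

// Stand-in for whatever the SDK factories return; not the real ResolvedModel.
interface ModelStub {
  provider: ProviderId;
  apiKeyEnv: string;
}

// Assumption: a missing or unknown provider field falls back to ElevenLabs.
function parseProvider(raw: unknown): ProviderId {
  const match = SPEECH_PROVIDERS.find((p) => p.id === raw);
  return match ? match.id : "elevenlabs";
}

function resolveModel(provider: ProviderId): ModelStub {
  switch (provider) {
    case "elevenlabs":
      return { provider, apiKeyEnv: "ELEVENLABS_API_KEY" };
    case "openai":
      return { provider, apiKeyEnv: "OPENAI_API_KEY" };
    default: {
      // Exhaustiveness check: a registry entry without a matching case
      // makes this assignment fail to type-check.
      const unhandled: never = provider;
      throw new Error(`Unsupported provider: ${String(unhandled)}`);
    }
  }
}
```

Deriving `ProviderId` from the registry means a new entry that lacks a `case` is flagged by `tsc` rather than discovered at runtime.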

- /api/speech now uses generateSpeech() from @speech-sdk/core, branching on a 'provider' field (elevenlabs default, openai added)
- ChatVoice gains a provider dropdown alongside the voice dropdown
- Provider + voice persist to localStorage; switching provider resets the voice to that provider's default if the prior voice isn't in the catalog
- ElevenLabs remains the default so existing deployment behavior is unchanged

The previous `voices.length === 0` gate made sense when ElevenLabs was the only provider and an empty list meant "still loading." With the provider picker, an empty list is a legitimate state (e.g. ElevenLabs key lacks voices_read scope) and the user needs the UI to be interactive so they can switch to a provider with a static catalog. Replaced the gate with an explicit `voicesLoading` flag that flips off in the useEffect's finally block, so the page always renders after the first fetch attempt regardless of outcome.
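
The loading-flag change can be sketched outside React as below. This is an illustration under assumptions, not the component code: `VoicesState`, `loadVoices`, and the `setState` callback are hypothetical names standing in for the effect body and its state setters.

```typescript
interface VoicesState<V> {
  voices: V[];
  loading: boolean;
}

// Hypothetical stand-in for the effect body: whatever happens to the fetch,
// the finally block clears the loading flag so the page can render.
async function loadVoices<V>(
  fetchVoices: () => Promise<V[]>,
  setState: (next: VoicesState<V>) => void,
): Promise<void> {
  let voices: V[] = [];
  try {
    voices = await fetchVoices();
  } catch {
    // A failed fetch (for example, a key without voices_read scope) leaves
    // the list empty; that is treated as a valid state, not as "loading".
  } finally {
    setState({ voices, loading: false });
  }
}
```

Gating the page on `loading` rather than on `voices.length === 0` lets an empty list and a finished fetch coexist, which is what keeps the provider dropdown reachable.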

vercel · 2026-04-08T18:18:25Z

@piersonmarks is attempting to deploy a commit to the BA Team on Vercel.

A member of the Team first needs to authorize it.

socket-security · 2026-04-08T18:19:47Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	@speech-sdk/core@0.4.1

View full report

piersonmarks added 5 commits April 8, 2026 10:49

chore: add @speech-sdk/core dependency

6e6063b

feat: add speech provider registry

8a63675

docs: document @speech-sdk/core provider picker in README

a910c9b

piersonmarks mentioned this pull request Apr 8, 2026

Proposal: swap ElevenLabs SDK for @speech-sdk/core + add a provider picker #36

Open

piersonmarks wants to merge 5 commits into BolajiAyodeji:main from Jellypod-Inc:pierson/spe-41-pr-bolajiayodejichat-with-siri-swap-elevenlabs-for-speechsdk
