Skip to content

TTS (CrispASR): send consent_attestation for voice cloning#11504

Merged
niksedk merged 2 commits into
mainfrom
crispasr-tts-consent-attestation
Jun 9, 2026
Merged

TTS (CrispASR): send consent_attestation for voice cloning#11504
niksedk merged 2 commits into
mainfrom
crispasr-tts-consent-attestation

Conversation

@niksedk

@niksedk niksedk commented Jun 9, 2026

Copy link
Copy Markdown
Member

Problem

CrispASR v0.7.0 added a consent gate for voice cloning: any /v1/audio/speech request that clones a reference voice must include a consent_attestation field, or the server returns HTTP 400 consent_required. Without this, every cloning synth fails, e.g.:

F5-TTS (CrispASR) synthesis failed (400): {"error": {"message": "voice cloning requires a
'consent_attestation' field in the request body. ...", "code": "consent_required"}}

VoxCPM2CrispAsr already sent the field; the other cloning engines never did, so they all break identically on 0.7.0.

Fix

Send the same consent_attestation value (matching VoxCPM2's existing wording) on every cloning request:

Engine Change
F5-TTS added to payload (always clones)
IndexTTS added to payload (always clones)
VibeVoice added to payload (always clones)
CosyVoice3 added inside the if (isClone) branch only
Qwen3 added only when a reference WAV is set (CustomVoice; VoiceDesign does not clone)

The user supplies their own reference voice by importing a WAV into SE, which is the act being attested.

Notes

🤖 Generated with Claude Code

niksedk and others added 2 commits June 9, 2026 09:04
CrispASR v0.7.0 gates voice cloning behind a consent attestation and
returns HTTP 400 consent_required without it. VoxCPM2 already sent the
field; apply the same to the remaining cloning engines so they work
against 0.7.0:

- F5-TTS, IndexTTS, VibeVoice: always clone, so always send it
- CosyVoice3: only inside the isClone branch
- Qwen3: only for CustomVoice (reference WAV); VoiceDesign does not clone

The user supplies their own reference voice by importing a WAV into SE,
which is the act being attested.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CrispASR added a `spoken_disclaimer` request flag (default true) that, when
false, suppresses the audible AI-disclosure prefix it otherwise prepends to
cloned audio. The inaudible watermark + C2PA provenance metadata remain
embedded regardless, so machine-readable provenance is unaffected.

Send `spoken_disclaimer = false` from every cloning engine (F5-TTS, IndexTTS,
VibeVoice, CosyVoice3, Qwen3 CustomVoice, VoxCPM2). SE surfaces the
AI-generated nature in its own UI, so the spoken prefix would only corrupt the
synthesized subtitle audio.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@niksedk niksedk merged commit a63fd1d into main Jun 9, 2026
1 of 3 checks passed
@niksedk niksedk deleted the crispasr-tts-consent-attestation branch June 9, 2026 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant