- Fix `pathvalidate` dependency
- Add missing wheels
- Add Chinese phonemizer based on g2pW
  - Uses a quantized version of the original model with `quantize_dynamic`
- Add `--data.phoneme_type pinyin` for Chinese phonemization using g2pW
- Add `--data.phoneme_type text` for using IPA phonemes directly (no espeak-ng)
- Add `--model.vocoder_warmstart_ckpt <CHECKPOINT>` to restore vocoder params only
- Add `--data.dataset_type 'phoneme_ids'` to train with pre-generated phoneme ids
  - Use `--data.num_symbols <N>` to set the number of phonemes
  - Use `--data.phonemes_path "/path/to/phonemes.json"` for the phoneme/id map
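As a rough illustration of the phoneme/id map that `--data.phonemes_path` points to, the sketch below writes and reloads a small JSON file. The exact schema Piper expects (key names, reserved ids such as padding) is an assumption here; check the training docs for the real format.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical phoneme -> id map; the real file's schema and any
# reserved ids (padding, BOS/EOS, etc.) may differ in Piper itself.
phoneme_to_id = {"_": 0, "a": 1, "b": 2, "ʃ": 3}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "phonemes.json"
    path.write_text(json.dumps(phoneme_to_id, ensure_ascii=False), encoding="utf-8")

    # A training run would load this map to convert phonemes to ids
    loaded = json.loads(path.read_text(encoding="utf-8"))
    ids = [loaded[p] for p in ("b", "a", "ʃ")]
    print(ids)  # [2, 1, 3]
```

Under this assumed layout, `--data.num_symbols` would simply be the size of the map (4 here).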
- Add `--output-dir-naming` option with `timestamp` (default) and `text`
- Add experimental support for alignments (see docs/ALIGNMENTS.md)
- Raw phonemes no longer split sentences
- Fix training for multi-speaker voices
- Moved development to OHF-Voice org
- Removed C++ code for now to focus on Python development
  - A C API `libpiper` written in C++ is planned
- Embed espeak-ng directly instead of using the separate `piper-phonemize` library
- Change license to GPLv3
- Use Python stable ABI (3.9+) so only a single wheel per platform is needed
- Change Python API:
  - `PiperVoice.synthesize` takes a `SynthesisConfig` and generates `AudioChunk` objects
  - `PiperVoice.synthesize_raw` is removed
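A minimal sketch of the new API shape, assuming `piper` is installed and a voice model file is available locally; the class and method names follow the changelog, but the exact signatures, `SynthesisConfig` fields, and `AudioChunk` attributes shown are assumptions and may differ:

```python
# Sketch only: assumes the `piper` package is installed and the
# model file below exists; names beyond those in the changelog are guesses.
from piper import PiperVoice, SynthesisConfig

voice = PiperVoice.load("en_US-lessac-medium.onnx")
syn_config = SynthesisConfig(volume=1.0)  # assumed field name

# synthesize() now yields AudioChunk objects rather than raw audio
for chunk in voice.synthesize("Hello world", syn_config=syn_config):
    # assumed attributes: PCM bytes plus sample metadata
    print(chunk.sample_rate, len(chunk.audio_int16_bytes))
```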
- Add separate `piper.download_voices` utility for downloading voices from HuggingFace
- Allow text as a CLI argument: `piper ... -- "Text to speak"`
- Allow text from one or more files with `--input-file <FILE>`
- Excluding any file output arguments will play audio directly with `ffplay`
- Support raw phonemes in text with `[[ <phonemes> ]]`
- Adjust output volume with `--volume <MULTIPLIER>` (default is 1.0)
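To make the `[[ <phonemes> ]]` syntax concrete, here is an illustrative parser that splits input into plain-text and raw-phoneme segments. This is only a sketch of the syntax, not Piper's internal tokenizer.

```python
import re

# Matches a [[ ... ]] raw-phoneme span; non-greedy so multiple
# spans in one string are handled independently.
RAW_PHONEMES = re.compile(r"\[\[(.*?)\]\]")

def split_raw_phonemes(text: str):
    """Split text into ("text", ...) and ("phonemes", ...) segments."""
    parts, last = [], 0
    for match in RAW_PHONEMES.finditer(text):
        if match.start() > last:
            parts.append(("text", text[last:match.start()]))
        parts.append(("phonemes", match.group(1).strip()))
        last = match.end()
    if last < len(text):
        parts.append(("text", text[last:]))
    return parts

print(split_raw_phonemes("Say [[ h ə l oʊ ]] now"))
# [('text', 'Say '), ('phonemes', 'h ə l oʊ'), ('text', ' now')]
```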