Text to Speech#
Overview#
On all platforms, an external TTS (text to speech) application may be defined by the --voice-command option. Alternatively, on POSIX platforms, a speech library may be dynamically loaded at run time. This requires the desired speech library to have been available when mwp was compiled.
Built-in libraries#
- Espeak / Espeak-ng
- Speech Dispatcher
- Flite
None of these provides very good speech synthesis.
External Commands#
You can use an external command on all platforms (it is the only option on Windows). Any external speech command should:
- Read lines of text to be spoken from stdin (standard input)
- Directly output the synthesised speech
- Only require invoking once, reading stdin for new text until it is closed.
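A speech program that only accepts its text as a command-line argument does not meet these requirements by itself. The following is a minimal sketch of a wrapper that adapts such a program; the function name `speak_lines` is hypothetical and not part of mwp, and it assumes the wrapped program takes the text to speak as its final argument.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of mwp): adapts a one-shot speech program,
# given as arguments, to the "read lines from stdin" contract.
speak_lines() {
    # Read until stdin is closed, speaking each non-empty line in turn.
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        # Redirect stdin so the speech program cannot consume pending lines.
        "$@" "$line" </dev/null
    done
}

# When saved as a script, end it with the speech program of your choice, e.g.
# speak_lines espeak-ng
```

Starting a new process for each line adds latency, so a program that reads stdin natively is preferable where one exists.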
External command usage#
The simplest way is to add a --voice-command line to your cmdopts file.
Examples:
# Espeak-ng
--voice-command="espeak-ng"
# Speech Dispatcher
--voice-command="spd-say -t female2 -e"
# piper-tts
# Choose your model, I like the Scottish lady ...
#VMODEL=/usr/share/piper-voices/en/en_GB/jenny_dioco/medium/en_GB-jenny_dioco-medium.onnx
#VMODEL=/usr/share/piper-voices/en/en_GB/aru/medium/en_GB-aru-medium.onnx
VMODEL=/usr/share/piper-voices/en/en_GB/alba/medium/en_GB-alba-medium.onnx
--voice-command="sh -c \"piper-tts -q --model $VMODEL --output-raw | aplay -q -r 22050 -f S16_LE -t raw -\""
In the piper-tts example (by far the best TTS for Linux), the voice model file is defined by an environment variable, VMODEL, which is evaluated by mwp before the voice command is invoked, making it easy to test out different voices.
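To find candidate values for VMODEL, something like the following sketch may help. The helper name `list_voice_models` is hypothetical, and the default directory is simply the one used in the example above; your distribution may install the models elsewhere.

```shell
# Hypothetical helper: print every piper voice model (*.onnx) found under a
# directory, one per line. The default path is an assumption taken from the
# cmdopts example and may differ on your system.
list_voice_models() {
    find "${1:-/usr/share/piper-voices}" -type f -name '*.onnx' 2>/dev/null | sort
}

# Audition each model in turn, using the pipeline from the cmdopts example:
# for VMODEL in $(list_voice_models); do
#     echo "Testing $VMODEL" |
#         piper-tts -q --model "$VMODEL" --output-raw |
#         aplay -q -r 22050 -f S16_LE -t raw -
# done
```

Once a voice sounds right, copy its path into the VMODEL line of the cmdopts file.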
Flite specifics#
mwp can use the flite text to speech engine directly (as well as espeak or speech-dispatcher). Flite is enabled if:
- You have the flite development files installed
Flite is available at run-time if:
- The flite version is 2.0 or later.
Unfortunately, it is non-trivial to detect the flite version at mwp build time.
Flite provides reasonable quality voices with low overhead, including some female voices.
Configuration#
Flite is configured using two gsettings keys:
Key | Usage |
---|---|
speech-api | Defines the speech API to be used, one of none, espeak, speechd or flite |
flite-voice-file | The voice file to be used. If not specified, the internal slt (female) voice is used. The value takes the absolute path name to a voice file, optionally followed by a comma and a floating point speed factor (see below) |
$ gsettings set org.stronnag.mwp speech-api flite
$ gsettings set org.stronnag.mwp flite-voice-file /home/jrh/.config/mwp/cmu_us_clb.flitevox,0.9
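The voice value is easy to mistype, since it combines an absolute path with an optional speed factor. The following sketch builds the value and rejects relative paths. The helper name `flite_voice_value` is hypothetical, and this only illustrates the format described in the table; it is not how mwp parses the key.

```shell
# Hypothetical helper: build the "path[,speed]" value described in the table.
# Prints the value on stdout, or fails if the path is not absolute.
flite_voice_value() {
    case "$1" in
        /*) ;;
        *) echo "voice file must be an absolute path: $1" >&2; return 1 ;;
    esac
    if [ -n "${2:-}" ]; then
        printf '%s,%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Usage, with the key name from the commands above:
# gsettings set org.stronnag.mwp flite-voice-file \
#     "$(flite_voice_value "$HOME/.config/mwp/cmu_us_clb.flitevox" 0.9)"
```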
Flite Discussion#
Voice Files#
flite can use external voice files that provide better quality than the built-in voices. Your distro may provide these voice files in an optional package, or you can download them from http://www.festvox.org, e.g. for flite 2.1, http://www.festvox.org/flite/packed/flite-2.1/voices/ (replace 2.1 with 2.0 etc.; not all the 2.1 voices may exist for 2.0). The following script will bulk download the non-Indic voices; you can test them out with the flite application.
#!/bin/bash
BASE=http://www.festvox.org/flite/packed/flite-2.1/voices
for V in cmu_us_aew.flitevox cmu_us_ahw.flitevox cmu_us_aup.flitevox \
cmu_us_awb.flitevox cmu_us_axb.flitevox cmu_us_bdl.flitevox \
cmu_us_clb.flitevox cmu_us_eey.flitevox cmu_us_fem.flitevox \
cmu_us_gka.flitevox cmu_us_jmk.flitevox cmu_us_ksp.flitevox \
cmu_us_ljm.flitevox cmu_us_lnh.flitevox cmu_us_rms.flitevox \
cmu_us_rxr.flitevox cmu_us_slp.flitevox cmu_us_slt.flitevox
do
    wget -P . "$BASE/$V"
done
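Once the voices are downloaded, a loop along the following lines can play a test phrase with each one. This is a sketch only: the function name `audition_voices` and the `FLITE` override variable are hypothetical, and it assumes the flite command-line program is installed and accepts a voice file path via `-voice` and the text via `-t`.

```shell
# Hypothetical helper: speak a test phrase with every voice file in a directory.
# FLITE may be overridden; it defaults to the flite command-line program.
audition_voices() {
    dir=${1:-.}
    for v in "$dir"/*.flitevox; do
        [ -e "$v" ] || continue      # skip the literal pattern if nothing matches
        echo "== $v"                 # show which voice is about to play
        "${FLITE:-flite}" -voice "$v" -t "Waypoint reached, altitude fifty metres"
    done
}

# audition_voices "$HOME/.config/mwp"
```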
Replay Speed#
The default replay speed for some flite voices is rather slow. The optional rate setting in the gsettings flite-voice-file key may be used to increase the rate.