Mention the MLX quantization.

laurent 2025-07-03 07:52:27 +02:00
parent 236df522b8
commit 6c1e9f12cf

@@ -250,10 +250,13 @@ hardware acceleration on Apple silicon.
 Use our example script to run Kyutai TTS on MLX.
 The script takes text from stdin or a file and can output to a file or stream the resulting audio.
+When streaming the output, if the model is not fast enough to keep up with
+real time, you can use the `--quantize 8` or `--quantize 4` flags to quantize
+the model, resulting in faster inference.
 ```bash
 # From stdin, plays audio immediately
-echo "Hey, how are you?" | python scripts/tts_mlx.py - -
+echo "Hey, how are you?" | python scripts/tts_mlx.py - - --quantize 8
 # From text file to audio file
 python scripts/tts_mlx.py text_to_say.txt audio_output.wav