Reorganize for TTS

Vaclav Volhejn 2025-07-02 17:49:15 +02:00
parent 8e254d8b09
commit 64b3a50e80

@@ -1,10 +1,4 @@
-<a href="https://huggingface.co/collections/kyutai/speech-to-text-685403682cf8a23ab9466886" target="_blank" style="margin: 2px;">
-<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-KyutaiSTT-blue" style="display: inline-block; vertical-align: middle;"/>
-</a>
-<a target="_blank" href="https://colab.research.google.com/github/kyutai-labs/delayed-streams-modeling/blob/main/transcribe_via_pytorch.ipynb">
-<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
-</a>
# Delayed Streams Modeling: Kyutai STT & TTS
This repo contains instructions and examples of how to run
[Kyutai Speech-To-Text](#kyutai-speech-to-text)
@@ -18,6 +12,10 @@ to be notified when we open-source text-to-speech and [Unmute](https://unmute.sh
## Kyutai Speech-To-Text
+<a href="https://huggingface.co/collections/kyutai/speech-to-text-685403682cf8a23ab9466886" target="_blank" style="margin: 2px;">
+<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-KyutaiSTT-blue" style="display: inline-block; vertical-align: middle;"/>
+</a>
**More details can be found on the [project page](https://kyutai.org/next/stt).**
Kyutai STT models are optimized for real-time usage, can be batched for efficiency, and return word level timestamps.