diff --git a/README.md b/README.md
index 34b2ec6..bd6d97d 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@ This repo contains instructions and examples of how to run
 and [Kyutai Text-To-Speech](#kyutai-text-to-speech) models.
 These models are powered by delayed streams modeling (DSM),
 a flexible formulation for streaming, multimodal sequence-to-sequence learning.
+See also [Unmute](https://github.com/kyutai-labs/unmute), a voice AI system built using Kyutai STT and Kyutai TTS.
 
 But wait, what is "Delayed Streams Modeling"? It is a technique for solving many streaming X-to-Y tasks (with X, Y in `{speech, text}`)
 that formalize the approach we had with Moshi and Hibiki. A pre-print paper is coming soon!
@@ -248,17 +249,13 @@ and just prefix the command above with `uvx --with moshi`.
 The Rust implementation provides a server that can process
 multiple streaming queries in parallel.
 
-In order to run the server, install the [moshi-server
-crate](https://crates.io/crates/moshi-server) via the following command. The
-server code can be found in the
-[kyutai-labs/moshi](https://github.com/kyutai-labs/moshi/tree/main/rust/moshi-server)
-repository.
-```bash
-cargo install --features cuda moshi-server
-```
+Installing the Rust server is a bit tricky because it uses our Python implementation under the hood,
+which also requires installing the Python dependencies.
+Use the [start_tts.sh](https://github.com/kyutai-labs/unmute/blob/main/dockerless/start_tts.sh) script to properly install the Rust server.
+If you already installed the `moshi-server` crate before and it's not working, you might need to force a reinstall by running `cargo uninstall moshi-server` first.
+Feel free to open an issue if the installation is still broken.
 
-
-Then the server can be started via the following command using the config file
+Once installed, the server can be started via the following command using the config file
 from this repository.
 ```bash