diff --git a/README.md b/README.md
index 03afd92..7f40f21 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,9 @@ transcribed into text. We provide two such models:
More details can be found on the [project page](https://kyutai.org/next/stt).
### PyTorch implementation
-[[Hugging Face]](https://huggingface.co/kyutai/stt-2.6b-en)
+
+
+
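For reference, the PyTorch path appears to rely on the `moshi` package on PyPI — an assumption inferred from the `moshi.run_inference` entry point invoked below, not something this diff states. A minimal install sketch:

```bash
# Assumed package name, inferred from the `moshi.run_inference` module used below.
pip install moshi
```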
@@ -30,7 +32,10 @@ python -m moshi.run_inference --hf-repo kyutai/stt-2.6b-en bria.mp3
```
### MLX implementation
-[[Hugging Face]](https://huggingface.co/kyutai/stt-2.6b-en-mlx)
+
+
+
This requires the [moshi-mlx package](https://pypi.org/project/moshi-mlx/)
with version 0.2.5 or later, which can be installed via pip.
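Since a minimum version is stated, the pip install can pin it explicitly; a minimal sketch, using the package name from the PyPI link above:

```bash
# Install the MLX backend; 0.2.5 is the minimum version stated above.
pip install "moshi-mlx>=0.2.5"
```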
@@ -41,7 +46,9 @@ python -m moshi_mlx.run_inference --hf-repo kyutai/stt-2.6b-en-mlx bria.mp3 --te
```
### Rust implementation
-[[Hugging Face]](https://huggingface.co/kyutai/stt-2.6b-en-candle)
+
+
+
A standalone Rust example is provided in the `stt-rs` directory in this repo.
This can be used as follows:
@@ -51,7 +58,9 @@ cargo run --features cuda -r -- bria.mp3
```
### Rust server
-[[Hugging Face]](https://huggingface.co/kyutai/stt-2.6b-en-candle)
+
+
+
The Rust implementation provides a server that can process multiple streaming
queries in parallel. Depending on the amount of memory on your GPU, you may