Readme update.
parent f87b8f1e6f
commit 21ea77169b
README.md (22 lines changed)

--- a/README.md
+++ b/README.md
@@ -4,14 +4,14 @@ Delayed Streams Modeling (DSM) is a flexible formulation for streaming, multimod
 ## Speech To Text

 ### English only model
-The main model handles english only, it has ~2.6B parameters.
+The main model handles english only, it has ~2.6b parameters.

 #### PyTorch implementation
 [[Hugging Face]](https://huggingface.co/kyutai/stt-2.6b-en)

 ```bash
 # wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
-python -m moshi.run_inference --hf-repo kyutai/stt-2.6B-en bria.mp3
+python -m moshi.run_inference --hf-repo kyutai/stt-2.6b-en bria.mp3
 ```

 #### MLX implementation
@@ -56,7 +56,23 @@ can be triggered by setting the real-time factor, e.g. `--rtf 500` will process
 the data as fast as possible.

 ### English + French model
-This model has ~1B parameters and supports both English and French.
+This model has ~1b parameters and supports both English and French.

+#### PyTorch implementation
+[[Hugging Face]](https://huggingface.co/kyutai/stt-1b-en_fr)
+
+```bash
+# wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
+python -m moshi.run_inference --hf-repo kyutai/stt-1b-en_fr bria.mp3
+```
+
+#### MLX implementation
+[[Hugging Face]](https://huggingface.co/kyutai/stt-1b-en_fr-mlx)
+
+```bash
+# wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
+python -m moshi_mlx.run_inference --hf-repo kyutai/stt-1b-en_fr-mlx bria.mp3 --temp 0
+```
+
 #### Rust implementation
 [[Hugging Face]](https://huggingface.co/kyutai/stt-1b-en_fr-candle)
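
Note on the commands in this diff: the PyTorch and MLX invocations differ only in the module name (`moshi.run_inference` vs `moshi_mlx.run_inference`) and in the trailing `--temp 0` on the MLX side. The sketch below collects that difference in one place. The function name `stt_command` and its argument order are hypothetical and not part of the README. It only prints the command line and does not run inference.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the inference
# command shown in the README for a given backend, model repo, and audio file.
stt_command() {
  local backend="$1" repo="$2" audio="$3"
  case "$backend" in
    pytorch)
      printf 'python -m moshi.run_inference --hf-repo %s %s\n' "$repo" "$audio"
      ;;
    mlx)
      # The README passes --temp 0 only in the MLX example.
      printf 'python -m moshi_mlx.run_inference --hf-repo %s %s --temp 0\n' "$repo" "$audio"
      ;;
    *)
      # No other backend has a command in this diff.
      printf 'unknown backend: %s\n' "$backend" >&2
      return 1
      ;;
  esac
}
```

For example, `stt_command mlx kyutai/stt-1b-en_fr-mlx bria.mp3` prints the MLX command added by this commit. Printing rather than executing keeps the helper usable as a dry run before the model weights are downloaded.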