# Serge - LLaMA made easy

Serge is a chat interface crafted with [llama.cpp](https://github.com/ggerganov/llama.cpp) for running Alpaca models. No API keys, entirely self-hosted!

- **SvelteKit** frontend
- **[Redis](https://github.com/redis/redis)** for storing chat history & parameters
- **FastAPI + LangChain** for the API, wrapping calls to [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [python bindings](https://github.com/abetlen/llama-cpp-python)

## Supported Models

| Category | Models |
|:-------------:|:-------|
| **Alpaca** | Alpaca-LoRA-65B, GPT4-Alpaca-LoRA-30B |
| **Chronos** | Chronos-13B, Chronos-33B, Chronos-Hermes-13B |
| **GPT4All** | GPT4All-13B |
| **Koala** | Koala-7B, Koala-13B |
| **LLaMA** | FinLLaMA-33B, LLaMA-Supercot-30B, LLaMA2 7B, LLaMA2 13B, LLaMA2 70B |
| **Lazarus** | Lazarus-30B |
| **Nous** | Nous-Hermes-13B |
| **OpenAssistant** | OpenAssistant-30B |
| **Orca** | Orca-Mini-v2-7B, Orca-Mini-v2-13B, OpenOrca-Preview1-13B |
| **Samantha** | Samantha-7B, Samantha-13B, Samantha-33B |
| **Vicuna** | Stable-Vicuna-13B, Vicuna-CoT-7B, Vicuna-CoT-13B, Vicuna-v1.1-7B, Vicuna-v1.1-13B, VicUnlocked-30B, VicUnlocked-65B |
| **Wizard** | Wizard-Mega-13B, WizardLM-Uncensored-7B, WizardLM-Uncensored-13B, WizardLM-Uncensored-30B, WizardCoder-Python-13B-V1.0 |

Additional weights can be added to the `serge_weights` volume using `docker cp`:

```bash
docker cp ./my_weight.bin serge:/usr/src/app/weights/
```

## Memory Usage

LLaMA will crash if you don't have enough available memory for the model:

| Model | Max RAM Required |
|-------------|------------------|
| 7B | 4.5GB |
| 7B-q2_K | 5.37GB |
| 7B-q3_K_L | 6.10GB |
| 7B-q4_1 | 6.71GB |
| 7B-q4_K_M | 6.58GB |
| 7B-q5_1 | 7.56GB |
| 7B-q5_K_M | 7.28GB |
| 7B-q6_K | 8.03GB |
| 7B-q8_0 | 9.66GB |
| 13B | 12GB |
| 13B-q2_K | 8.01GB |
| 13B-q3_K_L | 9.43GB |
| 13B-q4_1 | 10.64GB |
| 13B-q4_K_M | 10.37GB |
| 13B-q5_1 | 12.26GB |
| 13B-q5_K_M | 11.73GB |
| 13B-q6_K | 13.18GB |
| 13B-q8_0 | 16.33GB |
| 33B | 20GB |
| 33B-q2_K | 16.21GB |
| 33B-q3_K_L | 19.78GB |
| 33B-q4_1 | 22.83GB |
| 33B-q4_K_M | 22.12GB |
| 33B-q5_1 | 26.90GB |
| 33B-q5_K_M | 25.55GB |
| 33B-q6_K | 29.19GB |
| 33B-q8_0 | 37.06GB |
| 65B | 50GB |
| 65B-q2_K | 29.95GB |
| 65B-q3_K_L | 37.15GB |
| 65B-q4_1 | 43.31GB |
| 65B-q4_K_M | 41.85GB |
| 65B-q5_1 | 51.47GB |
| 65B-q5_K_M | 48.74GB |
| 65B-q6_K | 56.06GB |
| 65B-q8_0 | 71.87GB |

## License

[Nathan Sarrazin](https://github.com/nsarrazin) and [Contributors](https://github.com/serge-chat/serge/graphs/contributors). `Serge` is free and open-source software licensed under the [MIT License](https://github.com/serge-chat/serge/blob/master/LICENSE).

## Support

Need help? Join our [Discord](https://discord.gg/62Hc6FEYQH)
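
As a rough aid for the Memory Usage table, the following Python sketch checks whether a given amount of free RAM covers a model's listed requirement before you try to load it. It is not part of Serge: the names `MAX_RAM_GB`, `fits_in_memory`, and `available_ram_gb` are hypothetical, only a few table rows are copied in, and the free-memory probe assumes a Linux host where `os.sysconf` exposes `SC_AVPHYS_PAGES`. It also treats the table's "GB" as GiB, which is an assumption.

```python
import os

# A few "Max RAM Required" figures copied from the Memory Usage table (in GB).
# Extend this mapping with other rows as needed.
MAX_RAM_GB = {
    "7B-q4_K_M": 6.58,
    "13B-q4_K_M": 10.37,
    "33B-q4_K_M": 22.12,
    "65B-q4_K_M": 41.85,
}


def fits_in_memory(model: str, available_gb: float) -> bool:
    """Return True if available_gb meets the table's figure for the model."""
    return available_gb >= MAX_RAM_GB[model]


def available_ram_gb() -> float:
    """Approximate free physical memory in GiB.

    Assumption: Linux, where these sysconf names are defined. The value counts
    free pages only, so it can understate what the kernel could reclaim.
    """
    pages = os.sysconf("SC_AVPHYS_PAGES")
    page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size / 1024**3


if __name__ == "__main__":
    free_gb = available_ram_gb()
    for name in MAX_RAM_GB:
        verdict = "ok" if fits_in_memory(name, free_gb) else "too large"
        print(f"{name}: {verdict} ({free_gb:.1f} GB free)")
```

When Serge runs in Docker with a memory limit, the limit on the container rather than the host's free memory is what matters, so treat the result as a first approximation only.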