chore(ollama-nvidia): small config adjustments

Nicolas Meienberger 2024-05-11 11:47:20 +02:00
parent 13d5bb04c8
commit 4b09d44d56
2 changed files with 16 additions and 18 deletions


@@ -5,10 +5,8 @@ services:
     image: ollama/ollama:0.1.33
     restart: unless-stopped
     container_name: ollama-nvidia
-    environment:
-      - PORT=11435
     ports:
-      - '${APP_PORT}:11435'
+      - '${APP_PORT}:11434'
     networks:
       - tipi_main_network
     deploy:
@@ -20,12 +18,12 @@ services:
             capabilities:
               - gpu
     volumes:
-      - ${APP_DATA_DIR}/.ollama:/root/.ollama
+      - ${APP_DATA_DIR}/data/.ollama:/root/.ollama
     labels:
       # Main
       traefik.enable: true
       traefik.http.middlewares.ollama-nvidia-web-redirect.redirectscheme.scheme: https
-      traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11435
+      traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11434
       # Web
       traefik.http.routers.ollama-nvidia-insecure.rule: Host(`${APP_DOMAIN}`)
       traefik.http.routers.ollama-nvidia-insecure.entrypoints: web
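The net effect of this hunk is that the container now serves Ollama on its default port 11434 (so the custom `PORT` override is gone) and persists model data under `${APP_DATA_DIR}/data/.ollama`. A minimal sketch for verifying both after restarting the app, assuming it runs on the local host; the value 8000 below is a hypothetical stand-in for whatever `${APP_PORT}` resolves to in your settings:

```sh
# The API should answer on the published host port (mapped to 11434 in the container).
curl http://localhost:8000/api/version

# Model data should now be persisted under the app's data directory on the host.
ls "${APP_DATA_DIR}/data/.ollama"
```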


@@ -1,22 +1,24 @@
+# Ollama - Nvidia
+[Ollama](https://github.com/ollama/ollama) allows you to run open-source large language models, such as Llama 3 & Mistral, locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
+---
 ## Nvidia Instructions
 To enable your Nvidia GPU in Docker:
 - You need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation)
 - And configure Docker to use the Nvidia driver
 ```sh
 sudo nvidia-ctk runtime configure --runtime=docker
 sudo systemctl restart docker
 ```
 ---
 ## Usage
-⚠️ This app runs on port **11435**. Take this into account when configuring tools connecting to the app.
 ### Use with a frontend
 - [LobeChat](https://github.com/lobehub/lobe-chat)
 - [LibreChat](https://github.com/danny-avila/LibreChat)
 - [OpenWebUI](https://github.com/open-webui/open-webui)
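The Nvidia setup steps quoted in the hunk above can be sanity-checked before wiring up any frontend by running a CUDA-enabled container through the newly configured runtime. A minimal sketch, assuming the NVIDIA Container Toolkit is installed as described; the CUDA image tag is only an example, not something this app pins:

```sh
# nvidia-smi inside a throwaway container should list the host GPU;
# if it fails, Docker is not using the Nvidia runtime yet.
sudo docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi
```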
@@ -25,9 +27,11 @@ sudo systemctl restart docker
 ---
 ### Try the REST API
 Ollama has a REST API for running and managing models.
 **Generate a response**
 ```sh
 curl http://localhost:11434/api/generate -d '{
   "model": "llama3",
@@ -36,6 +40,7 @@ curl http://localhost:11434/api/generate -d '{
 ```
 **Chat with a model**
 ```sh
 curl http://localhost:11434/api/chat -d '{
   "model": "llama3",
@@ -44,16 +49,11 @@ curl http://localhost:11434/api/chat -d '{
   ]
 }'
 ```
----
-### Try in terminal
-```sh
-docker exec -it ollama-nvidia ollama run llama3 --verbose
-```
 ---
 ## Compatible GPUs
 Ollama supports Nvidia GPUs with compute capability 5.0+.
 Check your compute compatibility to see if your card is supported:
@@ -82,10 +82,10 @@ Check your compute compatibility to see if your card is supported:
 | 5.0 | GeForce GTX | `GTX 750 Ti` `GTX 750` `NVS 810` |
 |     | Quadro      | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
 ---
 ## Model library
 Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library 'ollama model library')
 Here are some example models that can be downloaded:
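Models from the library have to be downloaded before the `/api/generate` and `/api/chat` calls above return anything useful. A minimal sketch of doing that over the same REST API, assuming the API is reachable on localhost:11434 and using llama3 purely as an example model name:

```sh
# Pull a model from the Ollama library (progress streams back as JSON lines).
curl http://localhost:11434/api/pull -d '{
  "name": "llama3"
}'

# List the models now available locally.
curl http://localhost:11434/api/tags
```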