chore(ollama-nvidia): small config adjustments
parent 13d5bb04c8
commit 4b09d44d56
@@ -5,10 +5,8 @@ services:
     image: ollama/ollama:0.1.33
     restart: unless-stopped
     container_name: ollama-nvidia
-    environment:
-      - PORT=11435
     ports:
-      - '${APP_PORT}:11435'
+      - '${APP_PORT}:11434'
     networks:
       - tipi_main_network
     deploy:
@@ -20,12 +18,12 @@ services:
               capabilities:
                 - gpu
     volumes:
-      - ${APP_DATA_DIR}/.ollama:/root/.ollama
+      - ${APP_DATA_DIR}/data/.ollama:/root/.ollama
     labels:
       # Main
       traefik.enable: true
       traefik.http.middlewares.ollama-nvidia-web-redirect.redirectscheme.scheme: https
-      traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11435
+      traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11434
       # Web
       traefik.http.routers.ollama-nvidia-insecure.rule: Host(`${APP_DOMAIN}`)
       traefik.http.routers.ollama-nvidia-insecure.entrypoints: web
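Net effect of the hunks above: the container now uses Ollama's default internal port 11434 (so the `PORT` override is dropped) while the published host port stays `${APP_PORT}`. A quick sanity check from the Docker host, assuming `APP_PORT` resolves to 11435 as the description below states:

```sh
# List locally available models through the published port.
# Assumes APP_PORT=11435; adjust if your install uses a different value.
curl http://localhost:11435/api/tags
```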
@@ -1,22 +1,24 @@
 # Ollama - Nvidia
 [Ollama](https://github.com/ollama/ollama) allows you to run open-source large language models, such as Llama 3 & Mistral, locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
 
 ---
 
 ## Nvidia Instructions
 
 To enable your Nvidia GPU in Docker:
 
 - You need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation)
 
 - And configure Docker to use Nvidia driver
 
 ```sh
 sudo nvidia-ctk runtime configure --runtime=docker
 sudo systemctl restart docker
 ```
 
 ---
 
 ## Usage
 
+⚠️ This app runs on port **11435**. Take this into account when configuring tools connecting to the app.
+
 ### Use with a frontend
 
 - [LobeChat](https://github.com/lobehub/lobe-chat)
 - [LibreChat](https://github.com/danny-avila/LibreChat)
 - [OpenWebUI](https://github.com/open-webui/open-webui)
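Because the app is published on 11435 rather than Ollama's default 11434, any frontend from the list above has to be pointed at that port explicitly. A minimal sketch with OpenWebUI; the host placeholder and the port value are assumptions based on the note added above, not part of this repository:

```sh
# Hypothetical example: run OpenWebUI against this Ollama instance.
# Replace <runtipi-host> with your server's address; 11435 matches the
# APP_PORT mentioned in the description.
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://<runtipi-host>:11435 \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```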
@@ -25,9 +27,11 @@ sudo systemctl restart docker
 ---
 
 ### Try the REST API
 
 Ollama has a REST API for running and managing models.
 
 **Generate a response**
 
 ```sh
 curl http://localhost:11434/api/generate -d '{
   "model": "llama3",
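The hunk above is cut off at the diff context boundary, so the request body is incomplete here. For reference, a full call in the style of the upstream Ollama docs looks like the sketch below; the prompt text is illustrative, not taken from the file:

```sh
# Illustrative /api/generate request; "model" and "prompt" are the
# standard fields, and the prompt value is a placeholder.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```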
@@ -36,6 +40,7 @@ curl http://localhost:11434/api/generate -d '{
 ```
 
 **Chat with a model**
 
 ```sh
 curl http://localhost:11434/api/chat -d '{
   "model": "llama3",
@@ -44,16 +49,11 @@ curl http://localhost:11434/api/chat -d '{
   ]
 }'
 ```
----
 
-### Try in terminal
-```sh
-docker exec -it ollama-nvidia ollama run llama3 --verbose
-```
 
 ---
 
 ## Compatible GPUs
 
 Ollama supports Nvidia GPUs with compute capability 5.0+.
 
 Check your compute compatibility to see if your card is supported:
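The table that follows is keyed by CUDA compute capability. On reasonably recent Nvidia drivers you can query your card's value directly instead of looking it up; a sketch, assuming `nvidia-smi` from the driver package is on the PATH and new enough to support the `compute_cap` query field:

```sh
# Print each GPU's name and its CUDA compute capability.
nvidia-smi --query-gpu=name,compute_cap --format=csv
```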
@@ -82,10 +82,10 @@ Check your compute compatibility to see if your card is supported:
 | 5.0 | GeForce GTX | `GTX 750 Ti` `GTX 750` `NVS 810` |
 | | Quadro | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
 
 
 ---
 
 ## Model library
 
 Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library 'ollama model library')
 
 Here are some example models that can be downloaded:
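Models from the library still have to be pulled into the container before they can be served. A minimal sketch using the `container_name` from the compose file above; `llama3` is simply the example model used elsewhere in this description:

```sh
# Download a model from ollama.com/library into the app's data volume.
docker exec -it ollama-nvidia ollama pull llama3
```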