chore(ollama-nvidia): small config adjustments

Nicolas Meienberger 2024-05-11 11:47:20 +02:00
parent 13d5bb04c8
commit 4b09d44d56
2 changed files with 16 additions and 18 deletions


@@ -5,10 +5,8 @@ services:
    image: ollama/ollama:0.1.33
    restart: unless-stopped
    container_name: ollama-nvidia
-   environment:
-     - PORT=11435
    ports:
-     - '${APP_PORT}:11435'
+     - '${APP_PORT}:11434'
    networks:
      - tipi_main_network
    deploy:
deploy:
@@ -18,14 +16,14 @@ services:
          - driver: nvidia
            count: all
            capabilities:
-             - gpu
+             - gpu
    volumes:
-     - ${APP_DATA_DIR}/.ollama:/root/.ollama
+     - ${APP_DATA_DIR}/data/.ollama:/root/.ollama
    labels:
      # Main
      traefik.enable: true
      traefik.http.middlewares.ollama-nvidia-web-redirect.redirectscheme.scheme: https
-     traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11435
+     traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11434
# Web
traefik.http.routers.ollama-nvidia-insecure.rule: Host(`${APP_DOMAIN}`)
traefik.http.routers.ollama-nvidia-insecure.entrypoints: web
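With this change the container maps `${APP_PORT}` to Ollama's default internal port, 11434, instead of the custom 11435. A quick sanity check after redeploying (a sketch, assuming the app is reachable on `${APP_PORT}` on the host):

```sh
# Ollama now listens on its default port inside the container;
# the host side is whatever ${APP_PORT} resolves to.
curl http://localhost:${APP_PORT}/api/version
```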


@@ -1,22 +1,24 @@
# Ollama - Nvidia
[Ollama](https://github.com/ollama/ollama) allows you to run open-source large language models, such as Llama 3 & Mistral, locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
---
## Nvidia Instructions
To enable your Nvidia GPU in Docker:
- You need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation)
- And configure Docker to use the Nvidia runtime:
```sh
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
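Before starting the app, a common way to confirm Docker can actually reach the GPU (a sketch; the CUDA image tag here is just an example and may need adjusting):

```sh
# Should print the usual nvidia-smi table if the toolkit is wired up
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
```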
---
## Usage
⚠️ This app runs on port **11435**. Take this into account when configuring tools connecting to the app.
### Use with a frontend
- [LobeChat](https://github.com/lobehub/lobe-chat)
- [LibreChat](https://github.com/danny-avila/LibreChat)
- [OpenWebUI](https://github.com/open-webui/open-webui)
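As a minimal sketch of connecting one of these, OpenWebUI reads the Ollama address from its `OLLAMA_BASE_URL` environment variable; the host address below is a placeholder, and the port is the one noted above:

```sh
# Run OpenWebUI and point it at this Ollama instance
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://<your-host>:11435 \
  ghcr.io/open-webui/open-webui:main
```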
@@ -25,9 +27,11 @@ sudo systemctl restart docker
---
### Try the REST API
Ollama has a REST API for running and managing models.
**Generate a response**
```sh
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
@@ -36,6 +40,7 @@ curl http://localhost:11434/api/generate -d '{
```
**Chat with a model**
```sh
curl http://localhost:11434/api/chat -d '{
"model": "llama3",
@@ -44,16 +49,11 @@ curl http://localhost:11434/api/chat -d '{
]
}'
```
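For reference, the complete requests as given in the Ollama documentation look like this (assuming the default port 11434 and a `llama3` model that has already been pulled):

```sh
# One-shot completion
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'

# Multi-turn chat
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
```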
---
### Try in terminal
```sh
docker exec -it ollama-nvidia ollama run llama3 --verbose
```
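`ollama run` downloads the model on first use; to fetch it ahead of time instead, you can pull it explicitly:

```sh
docker exec -it ollama-nvidia ollama pull llama3
```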
---
## Compatible GPUs
Ollama supports Nvidia GPUs with compute capability 5.0+.
Check your compute compatibility to see if your card is supported:
@@ -82,10 +82,10 @@ Check your compute compatibility to see if your card is supported:
| 5.0 | GeForce GTX | `GTX 750 Ti` `GTX 750` `NVS 810` |
| | Quadro | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
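On reasonably recent Nvidia drivers you can also query the compute capability directly (the `compute_cap` field is not available on older driver releases):

```sh
# Prints e.g. "NVIDIA GeForce GTX 750 Ti, 5.0"
nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
```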
---
## Model library
Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library 'ollama model library').
Here are some example models that can be downloaded:
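Any model in the library can be fetched by name. For example, to pull Mistral (one of the models mentioned at the top):

```sh
docker exec -it ollama-nvidia ollama pull mistral
```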