chore(ollama-nvidia): small config adjustments
parent 13d5bb04c8
commit 4b09d44d56
In `docker-compose.yml`:

```diff
@@ -5,10 +5,8 @@ services:
     image: ollama/ollama:0.1.33
     restart: unless-stopped
     container_name: ollama-nvidia
-    environment:
-      - PORT=11435
     ports:
-      - '${APP_PORT}:11435'
+      - '${APP_PORT}:11434'
     networks:
       - tipi_main_network
     deploy:
@@ -20,12 +18,12 @@ services:
               capabilities:
                 - gpu
     volumes:
-      - ${APP_DATA_DIR}/.ollama:/root/.ollama
+      - ${APP_DATA_DIR}/data/.ollama:/root/.ollama
     labels:
       # Main
       traefik.enable: true
       traefik.http.middlewares.ollama-nvidia-web-redirect.redirectscheme.scheme: https
-      traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11435
+      traefik.http.services.ollama-nvidia.loadbalancer.server.port: 11434
       # Web
       traefik.http.routers.ollama-nvidia-insecure.rule: Host(`${APP_DOMAIN}`)
       traefik.http.routers.ollama-nvidia-insecure.entrypoints: web
```
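With this change the container listens on Ollama's stock port 11434 again, and the Tipi-assigned `${APP_PORT}` (11435, per the warning added in the description below) is mapped onto it. A minimal smoke test from the host, assuming that host port; `/api/tags` lists the models the instance has pulled locally:

```sh
# Hypothetical host port: substitute your Tipi APP_PORT value if it differs.
# A JSON model list in the response confirms the port mapping works.
curl http://localhost:11435/api/tags
```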
In the app description:

````diff
@@ -1,22 +1,24 @@
-# Ollama - Nvidia
-[Ollama](https://github.com/ollama/ollama) allows you to run open-source large language models, such as Llama 3 & Mistral, locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
-
----
-
 ## Nvidia Instructions
+
 To enable your Nvidia GPU in Docker:
+
 - You need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation)
 
 - And configure Docker to use Nvidia driver
+
 ```sh
 sudo nvidia-ctk runtime configure --runtime=docker
 sudo systemctl restart docker
 ```
+
 ---
 
 ## Usage
 
+⚠️ This app runs on port **11435**. Take this into account when configuring tools connecting to the app.
+
 ### Use with a frontend
+
 - [LobeChat](https://github.com/lobehub/lobe-chat)
 - [LibreChat](https://github.com/danny-avila/LibreChat)
 - [OpenWebUI](https://github.com/open-webui/open-webui)
@@ -25,9 +27,11 @@ sudo systemctl restart docker
 ---
 
 ### Try the REST API
+
 Ollama has a REST API for running and managing models.
 
 **Generate a response**
+
 ```sh
 curl http://localhost:11434/api/generate -d '{
   "model": "llama3",
@@ -36,6 +40,7 @@ curl http://localhost:11434/api/generate -d '{
 ```
 
 **Chat with a model**
+
 ```sh
 curl http://localhost:11434/api/chat -d '{
   "model": "llama3",
@@ -44,16 +49,11 @@ curl http://localhost:11434/api/chat -d '{
   ]
 }'
 ```
----
-
-### Try in terminal
-```sh
-docker exec -it ollama-nvidia ollama run llama3 --verbose
-```
 
 ---
 
 ## Compatible GPUs
+
 Ollama supports Nvidia GPUs with compute capability 5.0+.
 
 Check your compute compatibility to see if your card is supported:
@@ -82,10 +82,10 @@ Check your compute compatibility to see if your card is supported:
 | 5.0                | GeForce GTX         | `GTX 750 Ti` `GTX 750` `NVS 810`                                                                            |
 |                    | Quadro              | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M`  |
 
-
 ---
 
 ## Model library
+
 Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library 'ollama model library')
 
 Here are some example models that can be downloaded:
````
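After the `nvidia-ctk` steps shown in the description, a quick way to confirm Docker actually sees the GPU is a throwaway CUDA container; the image tag below is only an example, and the `compute_cap` query needs a reasonably recent driver:

```sh
# Verify the Nvidia runtime works: any recent CUDA base image will do.
docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi

# Ollama needs compute capability 5.0+; print it directly (recent drivers only).
nvidia-smi --query-gpu=compute_cap --format=csv,noheader
```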
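Because the app is exposed on 11435 rather than Ollama's default 11434 (per the warning added above), clients need the port spelled out. A sketch, assuming the host port stays at 11435: the `ollama` CLI honors `OLLAMA_HOST`, and the same substitution applies to the curl examples in the description.

```sh
# Point the ollama CLI at the non-default host port (assumed 11435 here).
export OLLAMA_HOST=http://localhost:11435
ollama list

# The REST examples from the description, adjusted the same way:
curl http://localhost:11435/api/generate -d '{
  "model": "llama3",
  "prompt": "Hello from Tipi"
}'
```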