										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
								## Nvidia Instructions
 
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
									 
								
							 
							
								 
							 
							
							
								To enable your Nvidia GPU in Docker:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
- Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation).
- Then configure Docker to use the NVIDIA container runtime and restart it:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
```sh
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
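
To confirm that containers can actually see the GPU after the restart, you can run `nvidia-smi` inside a throwaway container. The sketch below wraps that in a small function; the function name `verify_docker_gpu` is only an illustration, and it assumes the `ubuntu` image can be pulled on your machine.

```sh
# Hypothetical helper: runs nvidia-smi in a temporary container.
# Requires Docker and the NVIDIA Container Toolkit to be installed.
verify_docker_gpu() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found" >&2
    return 127
  fi
  # --gpus all exposes every GPU on the host to the container.
  sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
}
```

Calling `verify_docker_gpu` should print the usual `nvidia-smi` table if the runtime is configured correctly.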
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
								---
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								## Usage
 
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
									 
								
							 
							
								 
							 
							
							
⚠️ This app runs on port **11435**. Take this into account when configuring tools that connect to the app.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
								### Use with a frontend
 
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
- [LobeChat](https://github.com/lobehub/lobe-chat)
- [LibreChat](https://github.com/danny-avila/LibreChat)
- [OpenWebUI](https://github.com/open-webui/open-webui)
- [And more...](https://github.com/ollama/ollama)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								---
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								### Try the REST API
 
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
								Ollama has a REST API for running and managing models.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								**Generate a response**
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
```sh
# This app listens on port 11435 (see the warning under "Usage").
curl http://localhost:11435/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```
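
By default the generate endpoint streams its answer as a series of JSON objects. If a single JSON response is easier to handle, the request can set `"stream": false`. The sketch below shows one way to do that from a script; the names `OLLAMA_URL`, `generate_body` and `generate` are illustrative only, and the body builder assumes the model name and prompt contain no double quotes or backslashes.

```sh
# Base URL of the app; 11435 is the port stated in the Usage section.
# OLLAMA_URL is a hypothetical variable name used only in this sketch.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11435}"

# Build a request body for /api/generate with streaming disabled.
# No JSON escaping is done, so keep quotes and backslashes out of the arguments.
generate_body() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send the request. Defined here but not called automatically.
generate() {
  curl -s "$OLLAMA_URL/api/generate" -d "$(generate_body "$1" "$2")"
}
```

For example, `generate llama3 "Why is the sky blue?"` sends the same request as above and waits for the complete answer.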
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								**Chat with a model**
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
```sh
# This app listens on port 11435 (see the warning under "Usage").
curl http://localhost:11435/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								---
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								## Compatible GPUs
 
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
								Ollama supports Nvidia GPUs with compute capability 5.0+.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
Check your card's compute capability to see if it is supported:
[https://developer.nvidia.com/cuda-gpus](https://developer.nvidia.com/cuda-gpus)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
| Compute Capability | Family              | Cards |
| ------------------ | ------------------- | ----- |
| 9.0                | NVIDIA              | `H100` |
| 8.9                | GeForce RTX 40xx    | `RTX 4090` `RTX 4080` `RTX 4070 Ti` `RTX 4060 Ti` |
|                    | NVIDIA Professional | `L4` `L40` `RTX 6000` |
| 8.6                | GeForce RTX 30xx    | `RTX 3090 Ti` `RTX 3090` `RTX 3080 Ti` `RTX 3080` `RTX 3070 Ti` `RTX 3070` `RTX 3060 Ti` `RTX 3060` |
|                    | NVIDIA Professional | `A40` `RTX A6000` `RTX A5000` `RTX A4000` `RTX A3000` `RTX A2000` `A10` `A16` `A2` |
| 8.0                | NVIDIA              | `A100` `A30` |
| 7.5                | GeForce GTX/RTX     | `GTX 1650 Ti` `TITAN RTX` `RTX 2080 Ti` `RTX 2080` `RTX 2070` `RTX 2060` |
|                    | NVIDIA Professional | `T4` `RTX 5000` `RTX 4000` `RTX 3000` `T2000` `T1200` `T1000` `T600` `T500` |
|                    | Quadro              | `RTX 8000` `RTX 6000` `RTX 5000` `RTX 4000` |
| 7.0                | NVIDIA              | `TITAN V` `V100` `Quadro GV100` |
| 6.1                | NVIDIA TITAN        | `TITAN Xp` `TITAN X` |
|                    | GeForce GTX         | `GTX 1080 Ti` `GTX 1080` `GTX 1070 Ti` `GTX 1070` `GTX 1060` `GTX 1050` |
|                    | Quadro              | `P6000` `P5200` `P4200` `P3200` `P5000` `P4000` `P3000` `P2200` `P2000` `P1000` `P620` `P600` `P500` `P520` |
|                    | Tesla               | `P40` `P4` |
| 6.0                | NVIDIA              | `Tesla P100` `Quadro GP100` |
| 5.2                | GeForce GTX         | `GTX TITAN X` `GTX 980 Ti` `GTX 980` `GTX 970` `GTX 960` `GTX 950` |
|                    | Quadro              | `M6000 24GB` `M6000` `M5000` `M5500M` `M4000` `M2200` `M2000` `M620` |
|                    | Tesla               | `M60` `M40` |
| 5.0                | GeForce GTX         | `GTX 750 Ti` `GTX 750` `NVS 810` |
|                    | Quadro              | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
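
If you would rather check the installed card from the command line than look it up in the table, the 5.0 threshold can be tested in a few lines of shell. This is a sketch only: the function names are made up for illustration, and `check_local_gpus` assumes your driver's `nvidia-smi` supports the `compute_cap` query field, which older drivers may not.

```sh
# Succeeds (exit status 0) when the given compute capability is at least 5.0.
is_supported_cc() {
  awk -v cc="$1" 'BEGIN { exit !(cc + 0 >= 5.0) }'
}

# Hypothetical helper: lists each local GPU with a supported / not supported verdict.
# Assumes nvidia-smi accepts the compute_cap query field.
check_local_gpus() {
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cc; do
      cc=${cc# }  # drop the space that follows the comma in the CSV output
      if is_supported_cc "$cc"; then
        echo "$name: supported ($cc)"
      else
        echo "$name: not supported ($cc)"
      fi
    done
}
```

`is_supported_cc 8.6` succeeds and `is_supported_cc 3.5` fails, so the function can be used directly in an `if` statement.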
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								---
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								## Model library
 
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							 
							
							
Ollama supports the models listed at [ollama.com/library](https://ollama.com/library 'ollama model library').
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								Here are some example models that can be downloaded:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
| Model              | Parameters | Size  | Download                       |
| ------------------ | ---------- | ----- | ------------------------------ |
| Llama 3            | 8B         | 4.7GB | `ollama run llama3`            |
| Llama 3            | 70B        | 40GB  | `ollama run llama3:70b`        |
| Phi-3              | 3.8B       | 2.3GB | `ollama run phi3`              |
| Mistral            | 7B         | 4.1GB | `ollama run mistral`           |
| Neural Chat        | 7B         | 4.1GB | `ollama run neural-chat`       |
| Starling           | 7B         | 4.1GB | `ollama run starling-lm`       |
| Code Llama         | 7B         | 3.8GB | `ollama run codellama`         |
| Llama 2 Uncensored | 7B         | 3.8GB | `ollama run llama2-uncensored` |
| LLaVA              | 7B         | 4.5GB | `ollama run llava`             |
| Gemma              | 2B         | 1.4GB | `ollama run gemma:2b`          |
| Gemma              | 7B         | 4.8GB | `ollama run gemma:7b`          |
| Solar              | 10.7B      | 6.1GB | `ollama run solar`             |
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
									 
								
							 
							
								 
							 
							
							
								>  Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
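
The RAM guidance in the note can be turned into a quick pre-flight check before pulling a model. The sketch below is one possible way to do it: both function names are hypothetical, only the three sizes named in the note are covered, and `available_ram_gb` reads `/proc/meminfo`, so it works on Linux only.

```sh
# Minimum RAM in GB for a given parameter size, following the note above.
# Returns a non-zero status for sizes the note does not mention.
min_ram_gb() {
  case "$1" in
    7B|7b)   echo 8 ;;
    13B|13b) echo 16 ;;
    33B|33b) echo 32 ;;
    *)       return 1 ;;
  esac
}

# Available memory in whole GB (Linux only, rounded down).
available_ram_gb() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}
```

For example, `[ "$(available_ram_gb)" -ge "$(min_ram_gb 7B)" ]` succeeds when the machine has enough free memory for a 7B model.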