[GH-ISSUE #1302] Ollama does not see GPU #26433

Closed
opened 2026-04-22 02:43:45 -05:00 by GiteaMirror · 8 comments

Originally created by @robertsd on GitHub (Nov 28, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1302

Originally assigned to: @dhiltgen on GitHub.

```
curl https://ollama.ai/install.sh | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7883    0  7883    0     0  32574      0 --:--:-- --:--:-- --:--:-- 32440
>>> Downloading ollama...
######################################################################## 100.0%#=#=#              ######################################################################## 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> NVIDIA GPU installed.
>>> The Ollama API is now available at 0.0.0.0:11434.
>>> Install complete. Run "ollama" from the command line.
```

ollama serve

```
2023/11/28 14:54:33 images.go:784: total blobs: 8
2023/11/28 14:54:33 images.go:791: total unused blobs removed: 0
2023/11/28 14:54:33 routes.go:777: Listening on 127.0.0.1:11434 (version 0.1.12)
2023/11/28 14:54:33 routes.go:797: GPU support may not enabled, check you have installed GPU drivers and have the necessary permissions to run nvidia-smi
```

nvidia-smi

```
Tue Nov 28 14:55:48 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100 80G...  Off  | 00000000:25:00.0 Off |                   On |
| N/A   33C    P0    74W / 300W |                  N/A |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    0   0   0  |      0MiB / 81085MiB | 98      0 |  7   0    5    1    1 |
|                  |      1MiB / 13107... |           |                       |
+------------------+----------------------+-----------+-----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```

uname -a

```
4.18.0-477.27.1.el8_8.x86_64 #1 SMP Thu Aug 31 10:29:22 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
```
GiteaMirror added the bug label 2026-04-22 02:43:45 -05:00

@jmorganca commented on GitHub (Nov 28, 2023):

It seems the `ollama` user created for the ollama system service may not have access to the GPU. From [this thread](https://stackoverflow.com/questions/52507744/enable-nvidia-smi-permissions-to-be-run-by-all-users) it's possible the `ollama` user may need to get added to a group such as `vglusers` (if that exists for you). Will keep looking into this
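
For anyone hitting this, a minimal sketch of that check, assuming the default install where the service runs as the system user `ollama`; the right group name varies by distribution (`video`, `render`, or `vglusers` are common):

```
# Which group owns the GPU device nodes? (group name varies by distro)
ls -l /dev/nvidia*

# Hypothetical fix: add the ollama service user to that group, e.g. "video",
# then restart the service so the new group membership takes effect
sudo usermod -aG video ollama
sudo systemctl restart ollama
```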

@dillera commented on GitHub (Nov 28, 2023):

Look at my issue in: https://github.com/jmorganca/ollama/issues/1289

Not exactly the same setup, but the same symptom: ollama won't touch the GPU.


@wookayin commented on GitHub (Nov 28, 2023):

First I encourage @robertsd to see [this](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks) to learn how to use backticks to format code on GitHub.

This seems like a permission issue: user `ollama` does not have permission on the `/dev/nvidia*` files. What if you run ollama with your own account instead of the `ollama` user? (It doesn't have to run as a daemon or with sudo.)

```
curl -fSL --show-error --progress-bar -o ./ollama "https://ollama.ai/download/ollama-linux-amd64"
chmod +x ./ollama
./ollama serve
```
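
A quick way to test the permission hypothesis directly, assuming the installer created the `ollama` system user, is to run `nvidia-smi` as that user:

```
# If this fails with a permission error while plain nvidia-smi works for
# your own account, the ollama user lacks access to /dev/nvidia*
sudo -u ollama nvidia-smi
```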

@robertsd commented on GitHub (Nov 28, 2023):

Thank you @wookayin for the markdown tip!

I ran the commands you suggested, but still receive the GPU not enabled warning.

```
coder@coder-robertsd-deeplearning-01:~$ curl -fSL --show-error --progress-bar -o ./ollama "https://ollama.ai/download/ollama-linux-amd64"
########################################################################################### 100.0%########################################################################################### 100.0%
coder@coder-robertsd-deeplearning-01:~$ chmod +x ./ollama 
coder@coder-robertsd-deeplearning-01:~$ ./ollama serve
2023/11/28 21:53:01 images.go:784: total blobs: 8
2023/11/28 21:53:01 images.go:791: total unused blobs removed: 0
2023/11/28 21:53:01 routes.go:777: Listening on 127.0.0.1:11434 (version 0.1.12)
2023/11/28 21:53:01 routes.go:797: GPU support may not enabled, check you have installed GPU drivers and have the necessary permissions to run nvidia-smi
coder@coder-djroberts-deeplearning-01:~$
```
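
Since the warning persists even when ollama runs as the interactive user, permissions alone may not explain it. The log message itself suggests what to verify next; none of these checks are Ollama-specific:

```
command -v nvidia-smi             # is nvidia-smi on this user's PATH?
nvidia-smi                        # does it run without a permission error?
ls -l /dev/nvidia*                # device nodes and their group ownership
ldconfig -p | grep -i nvidia-ml   # can the loader find the NVML library?
```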

@jmorganca commented on GitHub (Jan 12, 2024):

Hi @robertsd, the last few versions of Ollama (latest is [0.1.20](https://github.com/jmorganca/ollama/releases/tag/v0.1.20)) include improvements to GPU discovery; would it be possible to give it a try? The A100 is definitely supported (and should be quite fast 😄 )

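If it helps, re-running the install script from the top of this issue upgrades in place (hedged: that was the documented Linux upgrade path at the time; check the current docs):

```
curl https://ollama.ai/install.sh | sh
ollama --version    # should now report 0.1.20 or newer
```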

@johnnyq commented on GitHub (Feb 5, 2024):

Hello, I have an Nvidia A2 GPU passed through from Proxmox to a virtual machine running Debian 12.
The VM can see the Nvidia A2 GPU, but Ollama is not taking advantage of it. I am logged in as root.

See here

```
root@ai-gpu:~# nvidia-smi
Mon Feb  5 17:44:28 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A2                      On  | 00000000:01:00.0 Off |                    0 |
|  0%   39C    P8               5W /  60W |      4MiB / 15356MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
```

I'm not sure what other steps to take and need help. Thank you.


@johnnyq commented on GitHub (Feb 5, 2024):

I also tried this with an Ubuntu 22.04 virtual machine using the Ollama Linux install process, which also installed the latest Nvidia CUDA drivers, and it is not using my GPU.

However, I can verify the GPU is working: hashcat is installed and runs benchmarks on it.


@dhiltgen commented on GitHub (Mar 12, 2024):

@robertsd are you still unable to get Ollama running on your GPU with the latest version? If so, can you enable debug logging with OLLAMA_DEBUG=1 for the server and share your server log so we can see more details on why it's not able to discover the GPU properly?

@johnnyq your problem is likely a lack of AVX in Proxmox (#2187). By default, Proxmox masks the vector features of the CPU to enhance portability, but this has a major performance impact on LLM processing. You should update your Proxmox configuration to expose all the AVX features.

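A sketch of both suggestions. For the systemd service created by the install script, `OLLAMA_DEBUG=1` can be set through a unit override; on Proxmox, exposing AVX means switching the VM's CPU type to `host` (the unit name `ollama.service` is assumed, and the VM ID `100` below is hypothetical):

```
# Debug logging on the systemd service: add an override, restart, watch logs
sudo systemctl edit ollama.service   # in the editor, add:
                                     #   [Service]
                                     #   Environment="OLLAMA_DEBUG=1"
sudo systemctl restart ollama
journalctl -u ollama -f

# Or a one-off foreground run with debug logging:
OLLAMA_DEBUG=1 ollama serve

# Inside the VM: does the guest CPU expose AVX at all?
grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u

# On the Proxmox host: pass the full host CPU (including AVX) to VM 100
qm set 100 --cpu host
```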

Reference: github-starred/ollama#26433