[GH-ISSUE #4395] Cannot Use GPU properly #80428

Closed
opened 2026-05-09 08:54:21 -05:00 by GiteaMirror · 7 comments

Originally created by @applepieiris on GitHub (May 13, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4395

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

I installed Ollama on my Linux server following the official documentation:
curl -fsSL https://ollama.com/install.sh | sh
The installation succeeded and printed:

>>> Downloading ollama...
######################################################################## 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
>>> NVIDIA GPU installed.

But when I run `ollama run llama2` after the model file has already been downloaded, the GPU shows no running process:

ubuntu@:~$ sudo nvidia-smi
Mon May 13 09:15:28 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          Off |   00000000:03:00.0 Off |                   On |
| N/A   29C    P0             41W /  300W |       0MiB /  81920MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

But when I checked the CPU usage:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1701363 ollama    20   0   20.0g  19.1g  18.1g R 840.0  10.1   9:51.51 /tmp/ollama872259507/runners/cpu_avx2/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-949974ebf5978d3d2e232+
   1554 root      20   0 1236380  10880   8320 S   6.7   0.0   3:48.73 /usr/bin/containerd-shim-runc-v2 -namespace moby -id d2abaf7e2a6553dc1eae353c2e5eda9138ee8b2b925d1fdaae2ab97518a6996a -address /run/c+
1704361 ubuntu    20   0   11080   4736   3712 R   6.7   0.0   0:00.01 top -bn 1 -i -c

From the above, we can see that Ollama is running on the CPU!
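
The clue in the `top` output is the runner directory in the process command line (`runners/cpu_avx2`). As a minimal sketch, assuming the runner name is simply the path component after `runners/` and that CPU runners start with `cpu` (both inferred only from the single command line above; `runner_from_cmdline` and `looks_like_cpu_runner` are hypothetical helpers, not part of Ollama), the check could be automated like this:

```python
import re
from typing import Optional

# Matches the path component that follows "runners/" in a command line,
# e.g. ".../runners/cpu_avx2/ollama_llama_server" -> "cpu_avx2".
# Assumption: this layout is inferred only from the `top` output in this thread.
_RUNNER_RE = re.compile(r"/runners/([^/\s]+)/")


def runner_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the runner directory name found in a command line, or None."""
    match = _RUNNER_RE.search(cmdline)
    return match.group(1) if match else None


def looks_like_cpu_runner(cmdline: str) -> bool:
    """True when the runner name starts with "cpu" (an assumed naming convention)."""
    runner = runner_from_cmdline(cmdline)
    return runner is not None and runner.startswith("cpu")


if __name__ == "__main__":
    # Command line copied from the `top` output above (truncated there with "+").
    cmd = (
        "/tmp/ollama872259507/runners/cpu_avx2/ollama_llama_server "
        "--model /usr/share/ollama/.ollama/models/blobs/sha256-949974ebf5978d3d2e232+"
    )
    print(runner_from_cmdline(cmd))
    print(looks_like_cpu_runner(cmd))
```

This only inspects the process name, so it says which runner was started, not why the GPU was skipped; the server log is still needed for that.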

I checked the Ollama logs, which show:
![image](https://github.com/ollama/ollama/assets/36785462/fa3327df-4ded-4423-a37a-8491adf09713)

Is there a solution to this?

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.37

GiteaMirror added the bug, nvidia, gpu labels 2026-05-09 08:54:21 -05:00

@mr-j0nes commented on GitHub (May 14, 2024):

I find this relevant to the topic, so I hope it's ok to reuse this ticket:

I am trying ollama (preview for Windows) on my Windows box with llama3. This runs on a Dell 9510 with an Intel GPU (0) and an NVIDIA GPU (1). I installed CUDA before installing ollama (not sure if this makes any difference).

Is this normal?

![image](https://github.com/ollama/ollama/assets/24613872/93668c4e-3b6d-4a9f-b82c-0f13d69caa13)

The GPU usage doesn't go above 0.1%. Shouldn't the GPU be used more than the CPU?


@pdevine commented on GitHub (May 18, 2024):

@applepieiris what version of Linux are you using? I have been using an a100 on Ubuntu 22.04 and it's working correctly. If you can upgrade to the newest version of ollama you can try out the ollama ps command which should tell you if your model is using the GPU or not. e.g.:

$ ollama ps
NAME               	ID          	SIZE  	PROCESSOR	UNTIL
qwen:1.8b-chat-fp16	7b9c77c7b5b6	3.7 GB	100% GPU 	4 minutes from now

If you're on the CPU, it will say 100% CPU instead of 100% GPU.
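
The same check can be scripted. The following is a minimal sketch that parses the table printed by `ollama ps`, assuming the column layout shown in the example above (fields separated by tabs or runs of spaces); `parse_ollama_ps` and `fully_on_gpu` are hypothetical helper names, and other Ollama versions may format the table differently.

```python
import re
import subprocess
from typing import Dict, List

# Fields are separated by a tab or by two or more whitespace characters, so
# values with a single inner space ("3.7 GB", "100% GPU") stay intact.
_FIELD_SEP = re.compile(r"\s{2,}|\t")


def _split_fields(line: str) -> List[str]:
    """Split one table line into stripped, non-empty fields."""
    return [f.strip() for f in _FIELD_SEP.split(line.strip()) if f.strip()]


def parse_ollama_ps(text: str) -> List[Dict[str, str]]:
    """Turn `ollama ps` output into one dict per model, keyed by header name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = _split_fields(lines[0])
    return [dict(zip(header, _split_fields(ln))) for ln in lines[1:]]


def fully_on_gpu(row: Dict[str, str]) -> bool:
    """True when the PROCESSOR column reads exactly "100% GPU"."""
    return row.get("PROCESSOR", "") == "100% GPU"


def read_ollama_ps() -> str:
    """Run `ollama ps` and return its stdout (requires a running Ollama server)."""
    result = subprocess.run(
        ["ollama", "ps"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Sample copied from the comment above; use read_ollama_ps() for live data.
    sample = (
        "NAME               \tID          \tSIZE  \tPROCESSOR\tUNTIL\n"
        "qwen:1.8b-chat-fp16\t7b9c77c7b5b6\t3.7 GB\t100% GPU \t4 minutes from now\n"
    )
    for row in parse_ollama_ps(sample):
        print(row["NAME"], "->", row["PROCESSOR"], "| fully on GPU:", fully_on_gpu(row))
```

Comparing against the exact string `100% GPU` is deliberately strict: anything else in the PROCESSOR column is treated as "not fully on the GPU" rather than guessed at.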

@mr-j0nes I think your issue is probably different than this one, however, you can use the same ollama ps command to see if your model loaded correctly. Although in re-reading your message, the Intel GPU probably isn't going to work (I think there are some other issues for this already).


@dhiltgen commented on GitHub (May 21, 2024):

In addition to what Patrick mentioned, the log you included above seems to be from 0.1.31.

Please do upgrade to pick up fixes we've made around GPU discovery, and you might also want to check out the troubleshooting notes here https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#container-fails-to-run-on-nvidia-gpu which might cover your scenario. (cudart init 3 failures are one symptom covered by the troubleshooting)

If things still aren't working as expected and it's not running on the GPU, please share your server log so I can see what caused us not to load on GPU.
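
For anyone collecting that log on a systemd-based Linux install, here is a minimal sketch that reads the service journal with `journalctl -u ollama` and keeps only lines that mention the GPU stack. The keyword list and the helper names (`read_server_log`, `gpu_related_lines`) are assumptions made for illustration, not an official filter; the full, unfiltered log is still the most useful thing to attach to the issue.

```python
import shutil
import subprocess
from typing import Iterable, List

# Assumed keywords for spotting GPU discovery messages; adjust as needed.
DEFAULT_KEYWORDS = ("gpu", "cuda", "nvidia")


def gpu_related_lines(
    log_text: str, keywords: Iterable[str] = DEFAULT_KEYWORDS
) -> List[str]:
    """Return log lines containing any keyword, compared case-insensitively."""
    lowered = [k.lower() for k in keywords]
    return [
        line
        for line in log_text.splitlines()
        if any(k in line.lower() for k in lowered)
    ]


def read_server_log(max_lines: int = 500) -> str:
    """Read the most recent journal lines of the `ollama` systemd unit."""
    result = subprocess.run(
        ["journalctl", "-u", "ollama", "--no-pager", "-n", str(max_lines)],
        capture_output=True,
        text=True,
        check=False,  # a missing unit should not raise; it just yields no lines
    )
    return result.stdout


if __name__ == "__main__":
    if shutil.which("journalctl") is None:
        print("journalctl not found; this sketch assumes a systemd-based system")
    else:
        for line in gpu_related_lines(read_server_log()):
            print(line)
```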


@algocrypto commented on GitHub (May 23, 2024):

I am having a similar issue. I have a server running Ubuntu 22.02 desktop with Nvidia 3070 and 3080 GPUs.
I have installed nvidia-driver-535 and CUDA version 11.5.
I have the same versions installed on another gaming PC, and it works there.
But with the same software versions installed on the server, it does not use the GPU.


@dhiltgen commented on GitHub (May 23, 2024):

@algocrypto the server log will help us understand why it's not able to discover the GPU.

https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md


@algocrypto commented on GitHub (May 24, 2024):

> @algocrypto the server log will help us understand why it's not able to discover the GPU.
>
> https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md

Thanks for sharing the link. It seems I had an Nvidia GPU/CUDA driver issue; it is now fixed using Nvidia driver 535 and CUDA toolkit 11.5.


@dhiltgen commented on GitHub (May 25, 2024):

@applepieiris are you still having trouble after upgrading and following the Nvidia troubleshooting notes linked above?


Reference: github-starred/ollama#80428