Ollama does not run on the GPU with version 0.4.0-rc5-rocm #4671

Closed
opened 2025-11-12 12:27:13 -06:00 by GiteaMirror · 5 comments

Originally created by @chiehpower on GitHub (Oct 24, 2024).

Originally assigned to: @dhiltgen on GitHub.

### What is the issue?

Hi all,

I was testing a very new version (`0.4.0-rc5-rocm`); the server was deployed as a Docker container:

```
docker run -itd --name=ollama --gpus=all --shm-size=100GB \
    -v ollama:/root/.ollama -p 11434:11434 \
    ollama/ollama:0.4.0-rc5-rocm
```

The client sent this request:

```
curl http://10.1.2.10:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": {
    "top_k": 20,
    "temperature": 0.8
  }
}'
```

I monitored GPU usage with `nvidia-smi` and the GPU was never used.
However, when I switched the Docker image to version `0.3.14`, it worked fine.
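
For anyone reproducing this, a couple of quick checks to confirm whether the container actually picked up the GPU (a sketch; the container name `ollama` comes from the `docker run` command above):

```
# On the host: watch GPU utilization while the request is running
watch -n 1 nvidia-smi

# Check Ollama's startup logs for its GPU discovery messages
docker logs ollama 2>&1 | grep -i gpu
```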

### Spec

- GPU: A100

### OS

Linux

### GPU

Nvidia

### CPU

_No response_

### Ollama version

0.4.0-rc5-rocm

GiteaMirror added the bug label 2025-11-12 12:27:13 -06:00

@aNyaaCaldari commented on GitHub (Oct 25, 2024):

I'm using rc3 in WSL (no point changing to rc5).

After upgrading the driver from 565 to 566 (on Windows) and restarting WSL, I finally saw GPU usage for Ollama:

```
Fri Oct 25 21:09:47 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.02              Driver Version: 566.03         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    On  |   00000000:01:00.0  On |                  N/A |
|  0%   40C    P8             15W /  285W |   15681MiB /  16376MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      4071      C   /ollama_llama_server                          N/A    |
+-----------------------------------------------------------------------------------------+
```
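
For reference, "restarting WSL" here means shutting it down from a Windows terminal so the next launch picks up the new driver (standard `wsl` CLI, nothing Ollama-specific):

```
# Run in PowerShell or cmd on Windows; stops all WSL distributions
wsl --shutdown
```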


@dhiltgen commented on GitHub (Oct 25, 2024):

@chiehpower ROCm is for AMD GPUs, but it looks like you have an NVIDIA GPU. You should try `ollama/ollama:0.4.0-rc5` instead, which has the necessary components for NVIDIA.
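
In other words, the original `docker run` command should work once the image tag is switched to the CUDA-capable variant (same flags as the reporter's command above):

```
docker run -itd --name=ollama --gpus=all --shm-size=100GB \
    -v ollama:/root/.ollama -p 11434:11434 \
    ollama/ollama:0.4.0-rc5
```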


@aNyaaCaldari commented on GitHub (Oct 25, 2024):

> @chiehpower ROCm is for AMD GPUs, but it looks like you have an NVIDIA GPU. You should try `ollama/ollama:0.4.0-rc5` instead, which has the necessary components for NVIDIA.

```
$ ollama --version
ollama version is 0.4.0-rc3
```

Since yesterday I've been using the release `ollama/ollama:0.4.0-rc3`.

![{052E72DA-48F2-4366-9C06-9E293C77737B}](https://github.com/user-attachments/assets/f1d1b57d-8971-415c-972a-144e57537fd3)

My previous point was that the Processes section had been empty at the time; after reinstalling the driver, it now correctly shows `C /ollama_llama_server`.

Example logs:

```
Oct 25 16:34:59 WIN ollama[243]: time=2024-10-25T16:34:59.593+03:00 level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
Oct 25 16:35:00 WIN ollama[243]: time=2024-10-25T16:35:00.897+03:00 level=INFO source=gpu.go:606 msg="no nvidia devices detected by library /usr/lib/x86_64-linux-gnu/libcuda.so.535.183.01"
```

When "no nvidia devices detected" is reported, the Windows Task Manager does not display GPU RAM consumption or 3D usage (CUDA sensor in Windows 11).

Now everything is working correctly.
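
If you hit the same "no nvidia devices detected" message, one way to see which driver library WSL is exposing (a sketch based on the log path above; `/usr/lib/wsl/lib` is where WSL 2 normally maps in the Windows driver's CUDA stubs) is:

```
# The CUDA stubs mapped in from the Windows driver
ls -l /usr/lib/wsl/lib/libcuda.so*

# The copy Ollama reported loading in the log above
ls -l /usr/lib/x86_64-linux-gnu/libcuda.so*
```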

Before the fix (no usage):

![{CD01336C-5481-4E25-B0D3-BFADD2EDD68A}](https://github.com/user-attachments/assets/3b556642-2659-4e26-8316-49996cde5431)

After the fix (usage):

![{D29ED908-06B0-4A33-80AC-F0EB88D278B4}](https://github.com/user-attachments/assets/1d662d68-dbee-4318-8540-20f393b415b3)


@dhiltgen commented on GitHub (Oct 25, 2024):

Happy to hear you got it working!

The ROCm image is missing NVIDIA library support. In the future we may improve error reporting to make this easier to detect.
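
A quick way to check whether a given image variant actually ships the CUDA runtime (a sketch; the library names and their locations inside the image are an assumption and vary by release):

```
# The CUDA image should contain libcudart; the ROCm image should not
docker run --rm --entrypoint sh ollama/ollama:0.4.0-rc5 \
    -c "find / -name 'libcudart*' 2>/dev/null"
docker run --rm --entrypoint sh ollama/ollama:0.4.0-rc5-rocm \
    -c "find / -name 'libcudart*' 2>/dev/null"
```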


@chiehpower commented on GitHub (Oct 26, 2024):

Thank you all!
It works fine now!

Reference: github-starred/ollama-ollama#4671