[GH-ISSUE #9885] Ollama not using GPU though CUDA driver exists #6471

Closed
opened 2026-04-12 18:02:09 -05:00 by GiteaMirror · 10 comments

Originally created by @sivag-csod on GitHub (Mar 19, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9885

What is the issue?

I am using an AWS g5.4xlarge instance to run Ollama on AWS.

```
ollama ps
NAME          ID              SIZE     PROCESSOR    UNTIL
gemma3:12b    6fd036cefda5    13 GB    100% GPU     4 minutes from now
```
```
top
top - 09:58:01 up 2 days, 16 min,  2 users,  load average: 1.30, 0.35, 1.14
Tasks: 260 total,   1 running, 259 sleeping,   0 stopped,   0 zombie
%Cpu(s): 50.0 us,  0.4 sy,  0.0 ni, 49.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  63592.1 total,   7811.2 free,  21319.2 used,  34461.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  41548.0 avail Mem
```
```
nvidia-smi
Wed Mar 19 09:58:12 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03             Driver Version: 550.144.03     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A10G                    On  |   00000000:00:1E.0 Off |                    0 |
|  0%   18C    P8             15W /  300W |       4MiB /  23028MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
```
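The mismatch above is the core symptom: `ollama ps` reports 100% GPU, yet `nvidia-smi` shows no process, 0% utilization, and only 4 MiB of GPU memory in use, while `top` shows heavy CPU load. A quick way to cross-check on any host (a sketch, not from the original report; the model name matches the one loaded above):

```
# Terminal 1: force a generation
ollama run gemma3:12b "why is the sky blue?"

# Terminal 2: poll the GPU once a second; a model genuinely offloaded to
# the GPU shows an ollama process holding several GiB of GPU memory
watch -n 1 nvidia-smi
```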

Relevant log output

(empty)
OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.6.1

GiteaMirror added the bug label 2026-04-12 18:02:09 -05:00

@rick-github commented on GitHub (Mar 19, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.

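On a standard Linux install Ollama runs as a systemd service, so per the linked troubleshooting doc the server log can be pulled with:

```
# jump to the end of the ollama unit's journal
journalctl -e -u ollama
```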

@sivag-csod commented on GitHub (Mar 19, 2025):

> [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.

@rick-github

[log.txt](https://github.com/user-attachments/files/19338875/log.txt)


@rick-github commented on GitHub (Mar 19, 2025):

```
Mar 19 09:57:29 ip-10-50-153-71 ollama[28834]: time=2025-03-19T09:57:29.254Z level=INFO source=ggml.go:109 msg=system CPU.0.LLAMAFILE=1 compiler=cgo(gcc)
```

No GPU or vector CPU backends found. How did you install ollama? What's the output of

```
find /usr/local/lib/ollama /usr/lib/ollama
```
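That `msg=system` line enumerates every compute backend the server managed to load; here only the plain CPU backend is present, whereas a working GPU install would also list CUDA entries. A quick way to pull that line on the affected host (a sketch, assuming the systemd unit used above):

```
# extract the backend discovery line from the server log;
# a GPU-enabled install should list CUDA entries next to the CPU ones
journalctl -u ollama --no-pager | grep 'msg=system'
```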

@sivag-csod commented on GitHub (Mar 19, 2025):

@rick-github cc @vaibhavi-csod
Installation steps

```
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git unzip docker.io python3-pip build-essential
curl -fsSL https://ollama.com/install.sh | sh
```

```
find /usr/local/lib/ollama
/usr/local/lib/ollama
/usr/local/lib/ollama/cuda_v11
/usr/local/lib/ollama/cuda_v11/libggml-cuda.so
```

@rick-github commented on GitHub (Mar 19, 2025):

You are missing a bunch of libraries; this is what `/usr/local/lib/ollama` should look like:

```
/usr/local/lib/ollama
/usr/local/lib/ollama/cuda_v11
/usr/local/lib/ollama/cuda_v11/libcublasLt.so.11
/usr/local/lib/ollama/cuda_v11/libcublasLt.so.11.5.1.109
/usr/local/lib/ollama/cuda_v11/libcublas.so.11
/usr/local/lib/ollama/cuda_v11/libcublas.so.11.5.1.109
/usr/local/lib/ollama/cuda_v11/libcudart.so.11.0
/usr/local/lib/ollama/cuda_v11/libcudart.so.11.3.109
/usr/local/lib/ollama/cuda_v11/libggml-cuda.so
/usr/local/lib/ollama/cuda_v12
/usr/local/lib/ollama/cuda_v12/libcublasLt.so.12
/usr/local/lib/ollama/cuda_v12/libcublasLt.so.12.8.4.1
/usr/local/lib/ollama/cuda_v12/libcublas.so.12
/usr/local/lib/ollama/cuda_v12/libcublas.so.12.8.4.1
/usr/local/lib/ollama/cuda_v12/libcudart.so.12
/usr/local/lib/ollama/cuda_v12/libcudart.so.12.8.90
/usr/local/lib/ollama/cuda_v12/libggml-cuda.so
/usr/local/lib/ollama/libggml-base.so
/usr/local/lib/ollama/libggml-cpu-alderlake.so
/usr/local/lib/ollama/libggml-cpu-haswell.so
/usr/local/lib/ollama/libggml-cpu-icelake.so
/usr/local/lib/ollama/libggml-cpu-sandybridge.so
/usr/local/lib/ollama/libggml-cpu-skylakex.so
```

Were any errors emitted during the install?

Are you building this inside a docker container? If so, it might be easier to just inherit from the official image.

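For reference, the project's documented way to run the official image with GPU access (it assumes the NVIDIA Container Toolkit is installed on the host):

```
# official image, all GPUs exposed, models persisted in a named volume
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```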

@vaibhavi-csod commented on GitHub (Mar 19, 2025):

No, @rick-github, there were no errors during the installation of Ollama using the following two commands:

```
sudo apt install -y curl git unzip docker.io python3-pip build-essential
curl -fsSL https://ollama.com/install.sh | sh
```

Also, @sivag-csod now has full root permissions in case any modifications or reinstallation are needed on the machine.

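Even when the installer prints no obvious error, capturing its output on a rerun makes a partial download or extraction failure easy to spot (a sketch; the log filename is arbitrary):

```
# rerun the installer with all output saved for inspection
curl -fsSL https://ollama.com/install.sh | sh 2>&1 | tee ollama-install.log
```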

@sivag-csod commented on GitHub (Mar 19, 2025):

@rick-github Could you please provide the steps to remove this setup and reinstall?


@rick-github commented on GitHub (Mar 19, 2025):

https://github.com/ollama/ollama/blob/main/docs/linux.md#uninstall

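Condensed from that doc (a sketch; verify against the current version before running, and note the doc also covers removing the `ollama` user and group):

```
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
sudo rm $(which ollama)
sudo rm -r /usr/share/ollama        # downloaded models and keys
sudo rm -r /usr/local/lib/ollama    # the library directory that was incomplete here
# then reinstall:
curl -fsSL https://ollama.com/install.sh | sh
```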

@OscarBuilds commented on GitHub (Mar 20, 2025):

@rick-github Thank you! I was trying to fix this forever. I didn't read the docs and improperly installed Ollama with pacman/yay, which doesn't install all of the libraries alongside Ollama.


@sivag-csod commented on GitHub (Mar 20, 2025):

@rick-github That worked, and inference is fast now. Thanks for your help :)
