[GH-ISSUE #6961] UNABLE TO USE GPU FOR OLLAMA MODELS #4407

Closed
opened 2026-04-12 15:20:51 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @Paramjethwa on GitHub (Sep 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6961

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Ollama is not utilizing the GPU.

This is what I get in the Ubuntu terminal:

[+] Running 2/0
 ✔ Container local_multimodal_ai-ollama-1  Created                                                                 0.0s
 ✔ Container local_multimodal_ai-app-1     Created                                                                 0.0s
Attaching to app-1, ollama-1
ollama-1  | 2024/09/25 17:46:47 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:15m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
ollama-1  | time=2024-09-25T17:46:47.254Z level=INFO source=images.go:753 msg="total blobs: 28"
ollama-1  | time=2024-09-25T17:46:47.320Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
ollama-1  | time=2024-09-25T17:46:47.391Z level=INFO source=routes.go:1172 msg="Listening on [::]:11434 (version 0.3.10)"
ollama-1  | time=2024-09-25T17:46:47.394Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama80602225/runners
app-1     |
app-1     | Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
app-1     |
app-1     |
app-1     |   You can now view your Streamlit app in your browser.
app-1     |
app-1     |   Local URL: http://localhost:8501
app-1     |   Network URL: http://172.18.0.3:8501
app-1     |   External URL: http://103.110.166.152:8501
app-1     |
ollama-1  | time=2024-09-25T17:46:55.459Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v60102 cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
ollama-1  | time=2024-09-25T17:46:55.460Z level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
ollama-1  | time=2024-09-25T17:46:55.538Z level=INFO source=gpu.go:568 msg="unable to load cuda driver library" library=/usr/lib/x86_64-linux-gnu/libcuda.so.1 error="cuda driver library init failure: 500"
ollama-1  | time=2024-09-25T17:46:55.539Z level=INFO source=gpu.go:568 msg="unable to load cuda driver library" library=/usr/lib/wsl/drivers/nvaci.inf_amd64_bcb4d5d133099d13/libcuda.so.1.1 error="cuda driver library init failure: 500"
ollama-1  | time=2024-09-25T17:46:55.552Z level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
ollama-1  | time=2024-09-25T17:46:55.552Z level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="7.6 GiB" available="6.5 GiB"
^CGracefully stopping... (press Ctrl+C again to force)
[+] Stopping 2/2
 ✔ Container local_multimodal_ai-app-1     Stopped                                                                10.4s
 ✔ Container local_multimodal_ai-ollama-1  Stopped                                                                 0.6s
canceled
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ sudo systemctl restart docker
[sudo] password for paramubuntu:
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ docker compose down
[+] Running 3/3
 ✔ Container local_multimodal_ai-app-1     Removed                                                                 0.0s
 ✔ Container local_multimodal_ai-ollama-1  Removed                                                                 0.0s
 ✔ Network local_multimodal_ai_default     Removed                                                                 0.5s
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ sudo systemctl restart docker
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ docker compose up --gpus all
unknown flag: --gpus
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ docker compose up --runtime=nvidia
unknown flag: --runtime
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ docker run --gpus all
"docker run" requires at least 1 argument.
See 'docker run --help'.

Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Create and run a new container from an image
paramubuntu@LAPTOP-AF3LO3NQ:/mnt/c/Users/Param Jethwa/Desktop/local_multimodal_ai$ docker compose up
[+] Running 3/1
 ✔ Network local_multimodal_ai_default     Created                                                                 0.1s
 ✔ Container local_multimodal_ai-ollama-1  Created                                                                 0.0s
 ✔ Container local_multimodal_ai-app-1     Created                                                                 0.0s
Attaching to app-1, ollama-1
ollama-1  | 2024/09/25 17:51:02 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:15m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
ollama-1  | time=2024-09-25T17:51:02.311Z level=INFO source=images.go:753 msg="total blobs: 28"
ollama-1  | time=2024-09-25T17:51:02.375Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
ollama-1  | time=2024-09-25T17:51:02.435Z level=INFO source=routes.go:1172 msg="Listening on [::]:11434 (version 0.3.10)"
ollama-1  | time=2024-09-25T17:51:02.437Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama1976662840/runners
app-1     |
app-1     | Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
app-1     |
app-1     |
app-1     |   You can now view your Streamlit app in your browser.
app-1     |
app-1     |   Local URL: http://localhost:8501
app-1     |   Network URL: http://172.18.0.3:8501
app-1     |   External URL: http://103.110.166.152:8501
app-1     |
ollama-1  | time=2024-09-25T17:51:10.557Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 cuda_v11 cuda_v12 rocm_v60102 cpu cpu_avx]"
ollama-1  | time=2024-09-25T17:51:10.558Z level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
ollama-1  | time=2024-09-25T17:51:10.631Z level=INFO source=gpu.go:568 msg="unable to load cuda driver library" library=/usr/lib/x86_64-linux-gnu/libcuda.so.1 error="cuda driver library init failure: 500"
ollama-1  | time=2024-09-25T17:51:10.632Z level=INFO source=gpu.go:568 msg="unable to load cuda driver library" library=/usr/lib/wsl/drivers/nvaci.inf_amd64_bcb4d5d133099d13/libcuda.so.1.1 error="cuda driver library init failure: 500"
ollama-1  | time=2024-09-25T17:51:10.646Z level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
ollama-1  | time=2024-09-25T17:51:10.647Z level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="7.6 GiB" available="6.2 GiB"
ollama-1  | [GIN] 2024/09/25 - 17:51:11 | 200 |    67.01424ms |      172.18.0.3 | GET      "/api/tags"

I am using WSL2 with Docker to run a Streamlit chat application.
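
For context, "docker compose up" has no --gpus or --runtime flag (as the transcript above shows); with Compose, GPU access is requested per service in the compose file instead. A minimal sketch, assuming an ollama service and the NVIDIA Container Toolkit on the host (service and image names are illustrative):

    services:
      ollama:
        image: ollama/ollama            # illustrative image name
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia        # requires the NVIDIA Container Toolkit
                  count: all            # or an integer / a device_ids list
                  capabilities: [gpu]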

OS

Windows, Docker, WSL2

GPU

Nvidia

CPU

Intel

Ollama version

0.3.11

GiteaMirror added the docker, wsl, question, nvidia labels 2026-04-12 15:20:51 -05:00
Author
Owner

@dhiltgen commented on GitHub (Sep 25, 2024):

"cuda driver library init failure: 500" indicates the container runtime isn't able to talk to the driver correctly.

    /**
     * This indicates that a named symbol was not found. Examples of symbols
     * are global/constant variable names, driver function names, texture names,
     * and surface names.
     */
    CUDA_ERROR_NOT_FOUND                      = 500,

Since you're running in WSL, there's likely some configuration mismatch between the versions installed in the Ubuntu system and/or the container runtime.

I would suggest making sure the GPU is accessible from WSL first, and troubleshoot that layer. Once that is confirmed to work, then look at the container layer.

https://docs.nvidia.com/cuda/wsl-user-guide/index.html
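
A minimal sketch of that check order, assuming the NVIDIA Container Toolkit is installed in the WSL distro (the CUDA image tag is illustrative; any CUDA base image works):

    # 1. Confirm the GPU is visible inside WSL itself
    nvidia-smi

    # 2. Confirm the container runtime can reach the GPU
    docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

If step 1 fails, fix the Windows NVIDIA driver / WSL setup first; if only step 2 fails, the problem is in the Docker + NVIDIA Container Toolkit layer.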

Reference: github-starred/ollama#4407