[GH-ISSUE #684] WSL2 Ubuntu 22.04 GPU "CUDA error 100" ggml-cuda.cu:5522 ggml-cuda.cu:4883 no CUDA-capable device is detected #26071

Closed
opened 2026-04-22 01:58:23 -05:00 by GiteaMirror · 14 comments
Owner

Originally created by @iamexe on GitHub (Oct 2, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/684

Thank you so much for ollama and the WSL2 support.
I already wrote a Vue.js frontend and it works great with CPU.

I want GPU support on WSL.

I installed CUDA as recommended by NVIDIA for WSL2 (CUDA on Windows).

I ran the following:

go generate ./...
go build .

I got an ollama binary that runs on CPU but not on GPU.

In journalctl | grep cuda I see:
/home/y/Dev/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:5522: no CUDA-capable device is detected

Every time I run any model in ollama I get that error. I tried with mistral and with my own gpu-mistral that had num_gpu 50, and the same with num_gpu 1000; it doesn't matter. I am able to create the models with num_gpu.

When I run ollama/llm/llama.cpp/gguf/build/cuda/bin/server or the corresponding ggml/...../server binary, they used to produce the same error I am still facing with ollama now.

I adjusted my environment variables, and now the error no longer shows for those built "server" binaries. ollama still shows the error.

I adjusted my environment variables like this:

cat /etc/*/environment_variables.sh

export CUDA_PATH="/usr/local/cuda-12.2/bin"
export LD_LIBRARY_PATH="/mnt/c/Windows/System32/lxss/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib64/stubs:/usr/lib/x86_64-linux-gnu"
export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}

It works fine for the server binaries of gguf and ggml (the error 100 was no longer present there).
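One general CUDA pitfall worth flagging (not confirmed as the cause in this thread): the LD_LIBRARY_PATH above includes /usr/local/cuda/lib64/stubs. The stub libcuda.so in that directory is a link-time placeholder that reports no devices at runtime, which can itself produce "no CUDA-capable device is detected". A minimal check, sketched here against the value posted above:

```shell
# The LD_LIBRARY_PATH value posted above (in practice, use the live variable)
LD_LIBRARY_PATH="/mnt/c/Windows/System32/lxss/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib64/stubs:/usr/lib/x86_64-linux-gnu"

# Warn if the CUDA stub directory is on the runtime search path;
# the stub libcuda.so is meant only for linking, not for running.
case ":$LD_LIBRARY_PATH:" in
  *:/usr/local/cuda/lib64/stubs:*) stub_on_path=yes ;;
  *) stub_on_path=no ;;
esac
echo "stub dir on LD_LIBRARY_PATH: $stub_on_path"
```

If the stub directory is present, removing it from LD_LIBRARY_PATH is a cheap thing to try before deeper debugging.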

Info: From here on there is no more prose, only console commands and their output.

./Dev/ollama/llm/llama.cpp/gguf/build/cuda/bin/server

ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 2060 with Max-Q Design, compute capability 7.5
{"timestamp":1696288901,"level":"INFO","function":"main","line":1294,"message":"build info","build":1267,"commit":"bc9d3e3"}
{"timestamp":1696288901,"level":"INFO","function":"main","line":1296,"message":"system info","n_threads":8,"total_threads":16,"system_info":"AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | "}
error loading model: failed to open models/7B/ggml-model-f16.gguf: No such file or directory
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/7B/ggml-model-f16.gguf'
{"timestamp":1696288901,"level":"ERROR","function":"loadModel","line":265,"message":"unable to load model","model":"models/7B/ggml-model-f16.gguf"}

./Dev/ollama/llm/llama.cpp/ggml/build/cuda/bin/server

ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 2060 with Max-Q Design, compute capability 7.5
{"timestamp":1696288885,"level":"INFO","function":"main","line":1190,"message":"build info","build":1009,"commit":"9e232f0"}
{"timestamp":1696288885,"level":"INFO","function":"main","line":1192,"message":"system info","n_threads":8,"total_threads":16,"system_info":"AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | "}
error loading model: failed to open models/7B/ggml-model.bin: No such file or directory
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/7B/ggml-model.bin'
{"timestamp":1696288885,"level":"ERROR","function":"loadModel","line":261,"message":"unable to load model","model":"models/7B/ggml-model.bin"}

cmake --version
cmake version 3.27.6

go version
go version go1.21.1 linux/amd64

gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

echo $PATH (abridged to the relevant entries; the full env output, including the full PATH, is at the bottom)
/usr/local/cuda-12.2/bin
/usr/lib/wsl/lib
/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin
/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/libnvvp
/mnt/c/program files/python311/scripts/
/mnt/c/program files/python311/
/mnt/c/program files/nvidia corporation/nvidia nvdlisr
/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common
/mnt/c/Program Files/NVIDIA Corporation/Nsight Compute 2023.2.2/

Other relevant environment variables: (full env at bottom)
LD_LIBRARY_PATH=/mnt/c/Windows/System32/lxss/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib64/stubs:/usr/lib/x86_64-linux-gnu
WSL2_GUI_APPS_ENABLED=1
WSL_DISTRO_NAME=Ubuntu-22.04
CUDA_PATH=/usr/local/cuda-12.2/bin

journalctl | grep cuda

Oct 03 01:10:56 c1 unknown: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link
Oct 03 01:12:30 c1 ollama[884]: CUDA error 100 at /home/y/Dev/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:5522: no CUDA-capable device is detected
Oct 03 01:15:33 c1 ollama[1138]: CUDA error 100 at /home/y/Dev/ollama/llm/llama.cpp/ggml/ggml-cuda.cu:4883: no CUDA-capable device is detected
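The "/usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link" warning above is a known WSL2 quirk: the driver ships plain files where ldconfig expects symlinks. A commonly suggested workaround (not verified in this thread) is to recreate the links under /usr/lib/wsl/lib with sudo; the sketch below demonstrates the expected layout in a throwaway directory so it runs without root:

```shell
# Demonstrates the symlink layout ldconfig expects
# (the real directory on WSL2 is /usr/lib/wsl/lib and needs sudo)
libdir=$(mktemp -d)
touch "$libdir/libcuda.so.1.1"                 # stands in for the real driver library
ln -sf libcuda.so.1.1 "$libdir/libcuda.so.1"   # soname link
ln -sf libcuda.so.1 "$libdir/libcuda.so"       # dev link
# both names should now be symlinks, which silences the ldconfig warning
[ -L "$libdir/libcuda.so.1" ] && [ -L "$libdir/libcuda.so" ] && echo "symlinks ok"
```

After fixing the real directory, running sudo ldconfig should no longer emit the warning.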

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

nvidia-smi
Tue Oct 3 01:18:06 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.112 Driver Version: 537.42 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2060 ... On | 00000000:01:00.0 Off | N/A |
| N/A 54C P8 4W / 65W | 12MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+

env

SHELL=/bin/bash
NVM_INC=/home/y/.nvm/versions/node/v20.6.1/include/node
WSL2_GUI_APPS_ENABLED=1
WSL_DISTRO_NAME=Ubuntu-22.04
NAME=c1
PWD=/home/y
LOGNAME=y
HOME=/home/y
LANG=C.UTF-8
WSL_INTEROP=/run/WSL/391_interop
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
WAYLAND_DISPLAY=wayland-0
NVM_DIR=/home/y/.nvm
LESSCLOSE=/usr/bin/lesspipe %s %s
TERM=xterm-256color
LESSOPEN=| /usr/bin/lesspipe %s
USER=y
CUDA_PATH=/usr/local/cuda-12.2/bin
DISPLAY=:0
SHLVL=1
NVM_CD_FLAGS=
LD_LIBRARY_PATH=/mnt/c/Windows/System32/lxss/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib64/stubs:/usr/lib/x86_64-linux-gnu
XDG_RUNTIME_DIR=/run/user/1000/
WSLENV=
XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop
PATH=/home/y/.local/bin:/home/y/.nvm/versions/node/v20.6.1/bin:/usr/local/cuda-12.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files/WindowsApps/CanonicalGroupLimited.Ubuntu22.04LTS_2204.2.47.0_x64__79rhkp1fndgsc:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/libnvvp:/mnt/c/program files/python311/scripts/:/mnt/c/program files/python311/:/mnt/c/program files/common files/oracle/java/javapath:/mnt/c/windows/system32:/mnt/c/windows:/mnt/c/windows/system32/wbem:/mnt/c/windows/system32/windowspowershell/v1.0/:/mnt/c/windows/system32/openssh/:/mnt/c/program files/dotnet/:/mnt/c/programdata/chocolatey/bin:/mnt/c/program files/microsoft vs code/bin:/mnt/c/program files/putty/:/mnt/c/program files/nvidia corporation/nvidia nvdlisr:/mnt/c/program files (x86)/vim/vim82/:/mnt/c/windows/system32/openssh/:/mnt/c/program files/nodejs/:/mnt/c/program files/process lasso/:/mnt/c/Program Files/PowerShell/7/:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Program Files/NVIDIA Corporation/Nsight Compute 2023.2.2/:/mnt/c/Users/User/AppData/Local/Microsoft/WindowsApps:/mnt/c/Windows/Microsoft.NET/Framework/v4.0.30319/:/mnt/c/Program Files (x86)/Vim/vim82/vim.exe:/mnt/c/Program Files/Java/jdk-19/bin/java.exe:/mnt/c/Users/User/AppData/Roaming/npm:/mnt/c/Users/User/AppData/Local/GitHubDesktop/bin:/mnt/c/Program Files (x86)/Nmap:/snap/bin
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
NVM_BIN=/home/y/.nvm/versions/node/v20.6.1/bin
HOSTTYPE=x86_64
PULSE_SERVER=unix:/mnt/wslg/PulseServer
_=/usr/bin/env

So am I missing something? Thank you for any hints!


@BruceMacD commented on GitHub (Oct 3, 2023):

Thanks for opening this issue.

There's a good amount going on here; I've got a couple of suggestions to get us started on troubleshooting.

  1. Rather than building from source, download the official release now that it is available:
     curl https://ollama.ai/install.sh | sh
  2. Rather than increasing num_gpu, try lowering it. If you try to load more layers onto the GPU than it can handle, the GPU runner will crash and revert to CPU. For testing purposes try something really low and see if that works.
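Suggestion 2 can be tried with a minimal Modelfile; this is a sketch (the model name low-gpu-test is made up here, and it assumes the mistral base model is already pulled):

```shell
# Write a Modelfile that offloads only one layer to the GPU for testing
cat > Modelfile <<'EOF'
FROM mistral
PARAMETER num_gpu 1
EOF
cat Modelfile
# Then build and run it:
#   ollama create low-gpu-test -f Modelfile
#   ollama run low-gpu-test
```

If that loads on the GPU, raise num_gpu until it stops fitting in VRAM.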

@iamexe commented on GitHub (Oct 3, 2023):

@BruceMacD Thank you. Installing the official release worked.

For future readers: I originally installed ollama, but at that time I had not installed CUDA yet.
In my case I could just run the install script again, regardless of my build activities.


@eshack94 commented on GitHub (Oct 21, 2023):

I'm having a similar issue, running WSL2 Debian (Bullseye).


@eshack94 commented on GitHub (Oct 21, 2023):

@BruceMacD Any advice for those on Debian (WSL2)?


@roychri commented on GitHub (Nov 14, 2023):

The recent version of Ollama doesn't detect my GPU, but an older version does.
I get this "no CUDA-capable device is detected" error with version 0.1.9.
The older version is so old that ollama --version is not even supported, so I can't tell which version it is!


@iukea1 commented on GitHub (Dec 27, 2023):

Getting this error also on a 4090 using WSL2 Ubuntu. The CUDA toolkit is also installed.

I can get the GPU working when running in Docker.


@noahhaon commented on GitHub (Jan 3, 2024):

Hitting this as well with the latest version, 0.1.17. Ubuntu, WSL2. nvidia-smi runs fine. I tried running ollama serve both as root and as a non-root user.

CUDA error 100 at /go/src/github.com/jmorganca/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:495: no CUDA-capable device is detected
current device: 18037184
GGML_ASSERT: /go/src/github.com/jmorganca/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:495: !"CUDA error"
2024/01/03 17:02:08 llama.go:451: 100 at /go/src/github.com/jmorganca/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:495: no CUDA-capable device is detected
current device: 18037184
GGML_ASSERT: /go/src/github.com/jmorganca/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:495: !"CUDA error"
2024/01/03 17:02:08 llama.go:459: error starting llama runner: llama runner process has terminated

@BruceMacD commented on GitHub (Jan 8, 2024):

Are people hitting this on CUDA 12? You can check your CUDA version in the nvidia-smi output. We target CUDA 12 in our releases so I'm wondering if this is a CUDA 11 problem.
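For anyone scripting this check, the CUDA version appears in nvidia-smi's header line; a sketch using the header posted earlier in this thread (in practice, pipe live nvidia-smi output instead):

```shell
# Header line as posted above; live version: header=$(nvidia-smi | grep 'CUDA Version')
header='| NVIDIA-SMI 535.112    Driver Version: 537.42    CUDA Version: 12.2 |'

# Extract just the version number from the "CUDA Version:" field
cuda_version=$(printf '%s\n' "$header" | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')
echo "CUDA version: $cuda_version"
```

Note this reports the maximum CUDA version the driver supports, which can differ from the toolkit version nvcc reports.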


@mongolu commented on GitHub (Jan 8, 2024):

I'm using 12.3


@iamexe commented on GitHub (Jan 8, 2024):

Not sure if it helps, but I can note that my ollama on WSL2 Ubuntu worked fine.

What I ended up doing was removing ollama and reinstalling it, not by building but simply with the script on the website.

Maybe the error arises if some CUDA components are installed after ollama?

Good luck!


@noahhaon commented on GitHub (Jan 11, 2024):

My nvidia-smi output:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.33.01              Driver Version: 546.29       CUDA Version: 12.3     |

WSL:

$ uname -a
Linux XXXX 5.15.133.1-microsoft-standard-WSL2 #1 SMP Thu Oct 5 21:02:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/issue
Ubuntu 20.04.5 LTS \n \l

@m4ttgit commented on GitHub (Jan 15, 2024):

I had the same issue on WSL2, but on Ubuntu 20.04 LTS with CUDA Version 12.3.

Update to ollama version 0.1.20 and it should be fixed. (But my GPU is too old to be useful:)

2024/01/15 17:37:47 gpu.go:88: Detecting GPU type
2024/01/15 17:37:47 gpu.go:203: Searching for GPU management library libnvidia-ml.so
2024/01/15 17:37:47 gpu.go:248: Discovered GPU libraries: [/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.226.00 /usr/lib/wsl/lib/libnvidia-ml.so.1]
2024/01/15 17:37:47 gpu.go:259: Unable to load CUDA management library /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.226.00: nvml vram init failure: 9
2024/01/15 17:37:48 gpu.go:94: Nvidia GPU detected
2024/01/15 17:37:48 gpu.go:138: CUDA GPU is too old. Falling back to CPU mode. Compute Capability detected: 5.0
2024/01/15 17:37:48 routes.go:953: no GPU detected


@tux-00 commented on GitHub (Jan 17, 2024):

I'm using ollama 0.1.20 with CUDA 12.3 and a 3080; same behaviour for me: no CUDA-capable device is detected.


@simonnxren commented on GitHub (Jan 21, 2024):

> I'm using ollama 0.1.20 with CUDA 12.3 and a 3080, same behaviour for me: no CUDA-capable device is detected.

Same here for a 4090. Why was this issue closed? I don't see a solution to it.

Reference: github-starred/ollama#26071