[GH-ISSUE #1460] Getting the GPU running in WSL2? #784

Closed
opened 2026-04-12 10:28:01 -05:00 by GiteaMirror · 18 comments

Originally created by @gerroon on GitHub (Dec 11, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1460

Originally assigned to: @BruceMacD on GitHub.

Hi

I am running it under WSL2. It is telling me that it can't find the GPU. Is anyone running it under WSL with a GPU? I have a 3080.

```
>>> The Ollama API is now available at 0.0.0.0:11434.
>>> Install complete. Run "ollama" from the command line.
WARNING: No NVIDIA GPU detected. Ollama will run in CPU-only mode.
>>> The Ollama API is now available at 0.0.0.0:11434.
>>> Install complete. Run "ollama" from the command line.
```

@igorschlum commented on GitHub (Dec 11, 2023):

Hi @gerroon, did you install the driver? These links could help with installing the latest drivers.

https://learn.microsoft.com/en-us/windows/ai/directml/gpu-cuda-in-wsl

https://developer.nvidia.com/cuda/wsl

https://sylabs.io/2022/03/wsl2-gpu/

https://askubuntu.com/questions/1252964/please-help-configuring-nvidia-smi-ubuntu-20-04-on-wsl-2

The command
`nvidia-smi`
should display the GPU status, and it will help determine whether this is a configuration issue or an Ollama issue.
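To make that check concrete, here is a hedged sketch that classifies the `nvidia-smi` situation under WSL2. The `/usr/lib/wsl/lib` path is an assumption based on where WSL usually places the driver stubs (as a later comment in this thread confirms); verify it on your system.

```shell
# Classify GPU driver visibility under WSL2.
# /usr/lib/wsl/lib is the usual WSL stub directory (an assumption; verify locally).
if command -v nvidia-smi >/dev/null 2>&1; then
    GPU_STATUS="on-path"            # nvidia-smi resolvable; Ollama should find it
elif [ -x /usr/lib/wsl/lib/nvidia-smi ]; then
    GPU_STATUS="stubbed-not-on-path" # driver installed, but PATH needs fixing
else
    GPU_STATUS="driver-missing"      # install/update the Windows NVIDIA driver
fi
echo "nvidia-smi status: $GPU_STATUS"
```

If the status is `stubbed-not-on-path`, the symlink or PATH fixes discussed later in this thread apply; if it is `driver-missing`, the Windows-side driver install is the first step.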

@gerroon commented on GitHub (Dec 11, 2023):

Hi

Thanks for your help. I believe I have CUDA running now, but it still complains about it. It sounds like there is no `nvidia-smi` for WSL2. Maybe the code can check for it in another way.

```
2023/12/11 17:43:55 images.go:732: total blobs: 6
2023/12/11 17:43:55 images.go:739: total unused blobs removed: 0
2023/12/11 17:43:55 routes.go:843: Listening on 127.0.0.1:11434 (version 0.1.14)
2023/12/11 17:43:55 routes.go:863: warning: gpu support may not be enabled, check that you have installed GPU drivers: nvidia-smi command failed
```

Checking CUDA in the container.

```
./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 3080 Ti"
  CUDA Driver Version / Runtime Version          12.2 / 12.3
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 12288 MBytes (12884377600 bytes)
  (080) Multiprocessors, (128) CUDA Cores/MP:    10240 CUDA Cores
  GPU Max Clock rate:                            1665 MHz (1.66 GHz)
  Memory Clock rate:                             9501 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 6291456 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        102400 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.2, CUDA Runtime Version = 12.3, NumDevs = 1
Result = PASS
```

@BruceMacD commented on GitHub (Dec 12, 2023):

Typically `nvidia-smi` will be available in WSL2 if the NVIDIA drivers have been installed on the host system:
https://www.nvidia.com/Download/index.aspx

I'd recommend checking the NVIDIA drivers in Windows to see whether they are up to date. I'll look into whether we can package in deviceQuery like you used here; it would be nice to have something more reliable.
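As an aside, a driver-independent sanity check is possible under WSL2: GPU paravirtualization exposes the `/dev/dxg` device node when passthrough is active. This is only a sketch of the idea, not what Ollama actually does; the `/usr/lib/wsl/lib` stub path is likewise an assumption about a default WSL setup.

```shell
# Probe for WSL2 GPU paravirtualization without relying on nvidia-smi.
# /dev/dxg exists when the WSL GPU passthrough device is active.
if [ -e /dev/dxg ]; then
    DXG_PRESENT=yes
else
    DXG_PRESENT=no
fi
echo "WSL GPU paravirtualization device (/dev/dxg): $DXG_PRESENT"

# The CUDA stub libraries, if present, live in the WSL lib directory.
ls /usr/lib/wsl/lib/libcuda* 2>/dev/null || echo "no CUDA stub libraries found"
```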

@gerroon commented on GitHub (Dec 12, 2023):

@BruceMacD Thanks for the reply. I was able to install it properly after trying a couple of links. I think this is the one that helped me:

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0

@ghost commented on GitHub (Dec 19, 2023):

Ollama cannot find the GPU no matter what I try:

`/var/log/syslog`:

```
routes.go:891: warning: gpu support may not be enabled, check that you have installed GPU drivers: nvidia-smi command failed
```
```
$ which nvidia-smi
/usr/lib/wsl/lib/nvidia-smi
$ nvidia-smi
Tue Dec 19 06:31:34 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.36                 Driver Version: 546.33       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4080        On  | 00000000:01:00.0  On |                  N/A |
|  0%   48C    P5              35W / 320W |   1374MiB / 16376MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
$
```
```
$ cat /proc/version
Linux version 5.15.133.1-microsoft-standard-WSL2 (root@1c602f52c2e4) (gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP Thu Oct 5 21:02:42 UTC 2023
```

Latest version of the WSL2 kernel (I just ran `wsl --update`). All the latest Windows updates.

I even tried installing `cuda-toolkit-12-3` (WSL-Ubuntu); it didn't change anything.

@mongolu commented on GitHub (Dec 19, 2023):

I am on Win11 with WSL2 and I run Ollama in Docker (built locally from the Dockerfile), and it is using the GPU.

@gerroon commented on GitHub (Dec 19, 2023):

@yurigeinish You might need to symlink `nvidia-smi`. If you installed it, it should be on the system, but it is not on the PATH by default.

@ghost commented on GitHub (Dec 19, 2023):

@gerroon Looks like `sudo ln -s $(which nvidia-smi) /usr/bin/` helped, thanks. At least I'm no longer seeing the related error in the logs. And when Ollama generates an answer, first I get a CPU spike around 41%, and several seconds later I get 100% GPU usage while the answer is being generated. I assume that's how it should be; thank you.
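A guarded version of that one-liner avoids clobbering an existing `nvidia-smi` and only acts when the stub is actually present. This is a sketch; the `/usr/lib/wsl/lib` location is taken from the `which nvidia-smi` output earlier in this thread and may differ on your system.

```shell
# Create the symlink only if the WSL stub exists and nvidia-smi is not
# already resolvable on PATH.
if [ -x /usr/lib/wsl/lib/nvidia-smi ] && ! command -v nvidia-smi >/dev/null 2>&1; then
    sudo ln -s /usr/lib/wsl/lib/nvidia-smi /usr/bin/nvidia-smi
    LINKED=yes
else
    LINKED=no   # nothing to do: stub absent, or nvidia-smi already on PATH
fi
echo "symlink created: $LINKED"
```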

@KDVan commented on GitHub (Dec 19, 2023):

Hi, I'm having trouble getting Ollama (or maybe WSL) to utilize my GPU. When I run the model, only the CPU spikes up to 100%. Am I missing something? I have installed all the necessary drivers for Windows and Ubuntu.

@cadeon commented on GitHub (Dec 21, 2023):

<removed because I just realized I wrote this against the entirely wrong project. Sorry.>

@siikdUde commented on GitHub (Dec 22, 2023):

I got Ollama to start using my RTX 4090 by:

  1. Uninstalling Ubuntu
  2. Uninstalling WSL
  3. Rebooting
  4. Installing WSL
  5. Installing Ubuntu
  6. (Crucial part) This step is optional for you, but it streamlines the process:
  • Installed oobabooga via the one-click WSL installer start_wsl.bat in my root folder.
  • Entered all the values for my system (such as specifying that I have an NVIDIA GPU), and it went ahead and downloaded all the CUDA drivers, the toolkit, PyTorch, and all the other dependencies.
  • Again, this part is optional since it is for installing oobabooga, but as a welcome side effect it installed everything I needed to get Ollama working with my GPU. As a result, my GPU usage is now between 40% and 100% with the CPU around 60% while the model is working. Before, the GPU was at 0% with the CPU at around 70%.

Also, it installs version 12.1 of the toolkit, which I believe is the one that works (at least for me). When I updated to 12.3, my GPU stopped working with Ollama, so be mindful of that.

Hope this helps anyone who comes across this thread.

@mongolu commented on GitHub (Dec 22, 2023):

Wow!
That's a lot of steps you've been through.
Glad you sorted it out.

@pai1234 commented on GitHub (Feb 8, 2024):

What if you are running on WSL2 but only have a built-in GPU (Intel Iris Xe Graphics)? Any idea how to set this up on Ubuntu?

@gerroon commented on GitHub (Feb 8, 2024):

> What if you are running on WSL2 but only have a built-in GPU (Intel Iris Xe Graphics)? Any idea how to set this up on Ubuntu?

I do not think these iGPU models are supported. You need an NVIDIA card.

@HusseinAdeiza commented on GitHub (Oct 18, 2024):

How do I run Ollama on the GPU?

@igorschlum commented on GitHub (Oct 19, 2024):

@HusseinAdeiza Ollama runs on the GPU by default. What type of computer do you have, and how much GPU memory do you have?

@aguywithcode commented on GitHub (May 20, 2025):

It looks like, if you run the Windows native version, the WSL `ollama` command uses that engine.

@CryptoDragonLady commented on GitHub (Mar 18, 2026):

@BruceMacD Sorry to necrobump a closed thread, but this is the thread that led me to the solution.

I just realized why everyone is having issues in WSL.

WSL stubs the CUDA drivers into /usr/lib/wsl/lib/, which is also where it puts nvidia-smi.
After adding that to my PATH (`export PATH=$PATH:/usr/lib/wsl/lib/`), voilà, everything worked.

Maybe add a check for WSL and, if so, attempt to execute nvidia-smi at /usr/lib/wsl/lib/ if you're going to rely on it to locate CUDA devices.

SMH, I spent three hours trying to figure this out, and it's all because an executable that WSL stubs into nearly the same place every time, which isn't on the typical PATH, isn't found. Why this engineering decision was made is completely beyond me.
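For anyone applying that fix, here is a sketch of making the PATH change persistent rather than per-session. It assumes bash with `~/.bashrc` as the startup file (adjust for zsh or fish), and it guards against adding the directory twice.

```shell
# Add the WSL driver stub directory to PATH, once, and persist it.
WSL_LIB=/usr/lib/wsl/lib

# Only extend PATH if the directory is not already on it.
case ":$PATH:" in
    *":$WSL_LIB:"*) ;;                        # already present, do nothing
    *) export PATH="$PATH:$WSL_LIB" ;;
esac

# Persist for future shells unless ~/.bashrc already mentions it.
grep -qs "$WSL_LIB" "$HOME/.bashrc" || \
    echo "export PATH=\"\$PATH:$WSL_LIB\"" >> "$HOME/.bashrc"
```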

Reference: github-starred/ollama#784