[GH-ISSUE #5275] ROCm on WSL #49816

Open
opened 2026-04-28 13:01:52 -05:00 by GiteaMirror · 18 comments

Originally created by @justinkb on GitHub (Jun 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5275

Originally assigned to: @dhiltgen on GitHub.

Recently, AMD released preview drivers for Windows that, together with userspace packages for WSL, make it possible to use ROCm through WSL. Ollama's detection of AMD GPUs on Linux, however, relies on the presence of the loaded amdgpu driver and other sysfs entries to determine various properties of the GPU. These are not available with this WSL ROCm setup, nor is rocm-smi usable for querying VRAM size, usage, etc. I was wondering whether it would be feasible to add detection for this setup so it can be used anyway, even if some runtime information is unavailable. Is runtime knowledge of the available VRAM strictly necessary? Could a user simply take care not to load too large a model and, failing that, accept that the ROCm runtime will hard-error on failed hipMalloc calls? Perhaps a warning in the output could note that this might happen.
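
For anyone reproducing this, a few quick sanity checks for the WSL ROCm preview setup (standard paths assumed; adjust if your install differs):

```
ls -l /dev/dxg                      # WSL2 GPU paravirtualization device exposed by Windows
ls /opt/rocm/lib/libhsa-runtime*    # ROCm userspace runtime from the AMD WSL packages
```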

GiteaMirror added the wsl, amd, feature request labels 2026-04-28 13:01:54 -05:00

@jmorganca commented on GitHub (Jun 25, 2024):

cc @dhiltgen


@dhiltgen commented on GitHub (Jun 25, 2024):

The [installation docs](https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/wsl/install-radeon.html) seem to imply the amdgpu driver is installed. I'll have to set up a test system so I can poke around and see what discovery options we've got. @justinkb if you have already done so, can you check out the following paths on your system?

```
ls /sys/class/kfd/kfd/topology/nodes/*
cat /sys/class/kfd/kfd/topology/nodes/*/properties
ls /sys/class/drm/card*/device
```

@justinkb commented on GitHub (Jun 25, 2024):

I didn't realize the kernel driver would be installed and loadable(?) on WSL, since I got some Stable Diffusion and llama stuff working with just userspace, without amdgpu loaded (I guess with this setup the amdgpu driver might be a stub that indirects and translates calls to the actual Windows driver). I'm not at my PC right now, so I can't check the driver question, but I will see what I can get working myself, too.

On Tue, Jun 25, 2024, 5:56 PM Daniel Hiltgen wrote:

> The [installation docs](https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/wsl/install-radeon.html) seem to imply the amdgpu driver is installed. I'll have to set up a test system so I can poke around and see what discovery options we've got.




@justinkb commented on GitHub (Jun 25, 2024):

I just noticed those docs specify installing with `amdgpu-install -y --usecase=wsl,rocm --no-dkms`, meaning the kernel driver source for DKMS won't be installed. That isn't to say getting it installed and loaded is impossible on WSL, but I doubt it, since I don't think any of the DRM subsystem is actually available in the WSL Linux kernel. In any case, compiling the DKMS driver isn't trivial, since I'd need to install the WSL2 kernel headers in the layout DKMS expects. And even if I get that set up, I can't imagine it would actually load the real amdgpu driver. I checked the sources: it's the same amdgpu driver used on real Linux, not a forwarding stub, so I don't see how it could work when the VM underlying WSL2 has no way to paravirtualize the physical GPU while Windows is also using it.


@justinkb commented on GitHub (Jun 27, 2024):

> The [installation docs](https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/wsl/install-radeon.html) seem to imply the amdgpu driver is installed. I'll have to set up a test system so I can poke around and see what discovery options we've got. @justinkb if you have already done so, can you check out the following paths on your system?
>
> ```
> ls /sys/class/kfd/kfd/topology/nodes/*
> cat /sys/class/kfd/kfd/topology/nodes/*/properties
> ls /sys/class/drm/card*/device
> ```

The devices in /sys/class/drm/card0 and /sys/class/drm/render128 just point to vgem, the virtual GEM provider, and /sys/class/kfd is completely absent. As expected, I wasn't able to load the amdgpu driver on WSL.


@justinkb commented on GitHub (Jun 28, 2024):

I managed to hack this into working order, see https://github.com/justinkb/ollama/tree/wsl-rocm-hack - I can confirm it works perfectly like this. Theoretically, I could write a Windows program that periodically updates the referenced text file containing the used memory, which would let Ollama monitor VRAM usage.
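
As a rough illustration of that idea (purely a sketch: the usage-file path is a placeholder rather than the one the branch actually reads, and it assumes WSL interop plus the English-named "GPU Adapter Memory" performance counter present on recent Windows builds):

```
#!/bin/bash
# Periodically write the GPU's dedicated memory usage (bytes) to a text file,
# querying the Windows side through WSL interop (powershell.exe).
USAGE_FILE=/tmp/ollama-vram-used   # placeholder path

while true; do
    powershell.exe -NoProfile -Command \
        "((Get-Counter '\GPU Adapter Memory(*)\Dedicated Usage').CounterSamples | Measure-Object CookedValue -Sum).Sum" \
        | tr -d '\r' > "$USAGE_FILE"
    sleep 5
done
```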


@xaxaxa7b9 commented on GitHub (Jul 10, 2024):

@justinkb How do I install your solution?


@MorrisLu-Taipei commented on GitHub (Aug 1, 2024):

@justinkb Sorry, I don't know how to apply the hack from your link. Can you give more hints? Thanks.


@evshiron commented on GitHub (Aug 6, 2024):

@xaxaxa7b9 @MorrisLu-Taipei

1. Clone the repo and switch to that branch:

```
git clone https://github.com/justinkb/ollama
cd ollama
git checkout wsl-rocm-hack
```

2. Hack for the hack:

Edit `gpu/amd_linux.go`, change [this line](https://github.com/justinkb/ollama/commit/ab8cc737ba51897085908ce63ef4c1f482a364ed#diff-14be148cab6c347e3e401abe4a5a3587c8202147cae7ead7377518d696bb2ff6R25) to `usedMemory := uint64(0)`, and save.

The original hack returns GPU info that tricks Ollama into using our AMD GPU in WSL regardless. The change above skips retrieving used memory from that file and pretends all VRAM can be used by Ollama.

USE AT YOUR OWN RISK.

3. Build Ollama:

```
go generate ./...
go build .
```

4. Run Ollama directly:

```
# service
./ollama serve

# client, in another terminal session
./ollama run phi3
```

EDIT: You may try https://github.com/ollama/ollama/pull/6201 if interested.


@githust66 commented on GitHub (Sep 2, 2024):

> @xaxaxa7b9 @MorrisLu-Taipei
>
> […]
>
> The original hack returns GPU info that tricks Ollama into using our AMD GPU in WSL regardless. The change above skips retrieving used memory from that file and pretends all VRAM can be used by Ollama.
>
> USE AT YOUR OWN RISK.
>
> […]
>
> EDIT: You may try #6201 if interested.

Excuse me, what are the risks associated with this?


@githust66 commented on GitHub (Sep 2, 2024):

> @xaxaxa7b9 @MorrisLu-Taipei
>
> […]
>
> 2. Hack for the hack:
>
> Edit `gpu/amd_linux.go`, change [this line](https://github.com/justinkb/ollama/commit/ab8cc737ba51897085908ce63ef4c1f482a364ed#diff-14be148cab6c347e3e401abe4a5a3587c8202147cae7ead7377518d696bb2ff6R25) to `usedMemory := uint64(0)`, and save.
>
> […]
>
> EDIT: You may try #6201 if interested.

Is it to modify this line, please?

![image](https://github.com/user-attachments/assets/883e36db-7d89-4e94-8fa4-6693f10af7c1)


@evshiron commented on GitHub (Sep 2, 2024):

@githust66

Yes. It just tricks Ollama into thinking no VRAM is currently in use, so it treats the remaining (in this case, all) VRAM as available for its LLM sessions. That lets Ollama use your AMD GPU in WSL without needing a text file to report VRAM usage.

If Ollama requests more VRAM than is actually free, for example it wants 20 GB but 6 GB is already taken by a game, the GPU may become unstable and the driver may crash.

By the way, I use https://github.com/ollama/ollama/pull/6201 to obtain actual VRAM usage, and it works just fine.


@seboss666 commented on GitHub (Oct 1, 2024):

Hello, can we expect any of this hacky stuff to make it into stable releases somehow? (Thanks to AMD for the shoddy support; the amdgpu-install stuff you end up with when following the doc is scary as hell.)

Ollama runs really well on Windows (kudos for that), but I want to use it from WSL for various tasks, and for now the networking limitations on the Windows side prevent me from doing that. Having it run "natively" inside WSL would be sweet, even if I doubt AMD deserves all the hassle you'd have to take on, given their unwillingness to understand what they need to stay relevant in the field...


@evshiron commented on GitHub (Oct 2, 2024):

If you use the branch in https://github.com/ollama/ollama/pull/6201, it's already working pretty well in WSL. I do hope it will be merged someday but this PR might not progress unless AMD rolls out a stable release for ROCm in WSL.
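
For reference, one way to try that PR without waiting for a merge is to fetch it via GitHub's standard pull-request refs (the local branch name is arbitrary), then build as described above:

```
git clone https://github.com/ollama/ollama
cd ollama
git fetch origin pull/6201/head:pr-6201
git checkout pr-6201
go generate ./...
go build .
```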


@nikkanon commented on GitHub (Jan 23, 2025):

Running Ollama on WSL with an AMD GPU still does not work.

When will the Ollama team fix this? @dhiltgen


@daedmunoz commented on GitHub (Mar 19, 2025):

> Hello, can we expect any of this hacky stuff to make it into stable releases somehow? (Thanks to AMD for the shoddy support; the amdgpu-install stuff you end up with when following the doc is scary as hell.)
>
> Ollama runs really well on Windows (kudos for that), but I want to use it from WSL for various tasks, and for now the networking limitations on the Windows side prevent me from doing that. Having it run "natively" inside WSL would be sweet, even if I doubt AMD deserves all the hassle you'd have to take on, given their unwillingness to understand what they need to stay relevant in the field...

In fact, you can use models served by Ollama on Windows from a WSL machine. To do that, set the OLLAMA_HOST user environment variable on Windows, e.g. `OLLAMA_HOST=0.0.0.0:11434`, and you will then be able to reach the Ollama server from WSL via the Windows host IP.
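
A minimal sketch of that setup (assuming WSL2's default NAT networking, where the Windows host is WSL's default gateway; `/api/tags` is just a convenient endpoint to test against):

```
# On Windows (PowerShell/cmd), make Ollama listen on all interfaces, then restart it:
#   setx OLLAMA_HOST "0.0.0.0:11434"

# From WSL2, resolve the Windows host and query the Ollama API:
WIN_HOST=$(ip route show default | awk '{print $3}')
curl "http://${WIN_HOST}:11434/api/tags"   # lists the models served by the Windows Ollama
```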


@cromefire commented on GitHub (Jul 5, 2025):

> nor is rocm-smi usable for querying VRAM size, usage, etc.

`rocminfo` is available, though, and can show whether an AMD GPU is present (Device Type: GPU).
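
For example, a quick check along those lines (`rocminfo` ships with the ROCm userspace packages; the grep is just a convenience):

```
rocminfo | grep -i 'device type'   # a working setup lists at least one "Device Type: GPU" agent
```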


@atsetoglou commented on GitHub (Feb 24, 2026):

AMD has released this repo: https://github.com/ROCm/librocdxg
I tried running Ollama on WSL, but it seems to report 0 B of VRAM and ends up using the CPU...

Is there any update on WSL support, please?
