[GH-ISSUE #3172] Allow choosing a preferred variant (an AMD GPU / an NVIDIA GPU / CPU) when running a model #63991

Closed
opened 2026-05-03 15:43:51 -05:00 by GiteaMirror · 12 comments
Owner

Originally created by @Inokinoki on GitHub (Mar 15, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3172

Originally assigned to: @dhiltgen on GitHub.

What are you trying to do?

I have both NVIDIA and AMD cards on one PC. Both nvml.dll and amdhip64.dll are available on Windows.
I saw in gpu/gpu.go that ollama tries to detect NVIDIA first and will not try AMD if NVIDIA is found.

How should we solve this?

Could it be possible to add an arg to indicate the preferred device or variant?

What is the impact of not solving this?

As a workaround, I just move nvml.dll out of the path to bypass the detection of NVIDIA.
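As a hedged sketch of that Windows workaround (the nvml.dll location varies by NVIDIA driver installation; this path is illustrative, not taken from the thread):

```shell
# Illustrative PowerShell, run from an elevated prompt. Renaming nvml.dll
# makes NVIDIA detection fail, so ollama's detection falls through to AMD
# (amdhip64.dll). Rename it back to restore NVIDIA detection.
Rename-Item "C:\Windows\System32\nvml.dll" "nvml.dll.bak"
```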

Anything else?

No response

GiteaMirror added the windows label 2026-05-03 15:43:51 -05:00

@frostworx commented on GitHub (Mar 16, 2024):

I have a similar setup, but use Linux as the operating system, and can confirm that the NVIDIA GPU is prioritized over the AMD GPU.

My working workaround to "forcefully" use the AMD GPU is to LD_PRELOAD the stub library
/opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so (ironically, belonging to CUDA)

(previously I tried a self-rolled dummy library for preloading, but it crashed on program launch)

LD_PRELOAD is not accepted by systemd as an Environment override, so I use a custom ExecStart script and preload there.

Using this workaround I'm able to use both GPUs simultaneously, because nvidia still finds its original shared library from nvidia-utils.

Something gated behind an environment variable like USE_CUDA or similar would fix this issue.
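For readers with a similar setup, the systemd workaround described above can be wired in roughly as follows (unit and script paths are illustrative, not taken from this thread):

```shell
# /etc/systemd/system/ollama.service.d/override.conf (illustrative):
#   [Service]
#   ExecStart=
#   ExecStart=/usr/local/bin/ollama-preload.sh
#
# /usr/local/bin/ollama-preload.sh (illustrative wrapper script):
#!/bin/sh
# Preload the CUDA NVML stub so NVIDIA detection fails and the AMD GPU is used.
export LD_PRELOAD=/opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
exec /usr/local/bin/ollama serve
```

After creating the drop-in, `systemctl daemon-reload` and restarting the service would apply it.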


@louwangzhiyuY commented on GitHub (Mar 16, 2024):

I have the same question as you. I want an option to choose which GPU to use.


@dhiltgen commented on GitHub (Mar 21, 2024):

I believe this should be possible by setting a mixture of CUDA_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES to invalid IDs like "-1".


@frostworx commented on GitHub (Mar 21, 2024):

Thanks for the hint! I haven't thought of trying invalid IDs.
I won't be able to test before next week, but will report back.


@Inokinoki commented on GitHub (Mar 21, 2024):

> I believe this should be possible by setting a mixture of CUDA_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES to invalid IDs like "-1".

Thanks for the env-based workaround! I will try it.

I also created an arg myself to select the GPU variant: bed32826d4

Should it be done another way?


@frostworx commented on GitHub (Mar 25, 2024):

I tested a bit and have "bad" news for you:

The ollama build I used for testing was your official release v0.1.29 (ollama-linux-amd64)

The command

CUDA_VISIBLE_DEVICES=-1 HIP_VISIBLE_DEVICES=-1 /usr/local/bin/ollama-linux-amd64-v0.1.29 serve

did not make ollama use the AMD GPU automatically. Instead, the NVIDIA GPU was detected, because /usr/lib/libnvidia-ml.so.550.54.14 was loaded, but it could not be used, and therefore ollama fell back to CPU:

$ CUDA_VISIBLE_DEVICES=-1 HIP_VISIBLE_DEVICES=-1 /usr/local/bin/ollama-linux-amd64-v0.1.29 serve
time=2024-03-25T16:50:34.703+01:00 level=INFO source=images.go:806 msg="total blobs: 29"
time=2024-03-25T16:50:34.704+01:00 level=INFO source=images.go:813 msg="total unused blobs removed: 0"
time=2024-03-25T16:50:34.704+01:00 level=INFO source=routes.go:1110 msg="Listening on 127.0.0.1:11434 (version 0.1.29)"
time=2024-03-25T16:50:34.704+01:00 level=INFO source=payload_common.go:112 msg="Extracting dynamic libraries to /tmp/ollama3416071157/runners ..."
time=2024-03-25T16:50:36.697+01:00 level=INFO source=payload_common.go:139 msg="Dynamic LLM libraries [rocm_v60000 cpu cpu_avx2 cuda_v11 cpu_avx]"
time=2024-03-25T16:50:36.697+01:00 level=INFO source=gpu.go:77 msg="Detecting GPU type"
time=2024-03-25T16:50:36.697+01:00 level=INFO source=gpu.go:191 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-03-25T16:50:36.707+01:00 level=INFO source=gpu.go:237 msg="Discovered GPU libraries: [/usr/lib/libnvidia-ml.so.550.54.14 /usr/lib64/libnvidia-ml.so.550.54.14 /opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so]"
time=2024-03-25T16:50:36.714+01:00 level=INFO source=gpu.go:82 msg="Nvidia GPU detected"
time=2024-03-25T16:50:36.714+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-03-25T16:50:36.719+01:00 level=INFO source=gpu.go:119 msg="CUDA Compute Capability detected: 6.1"
time=2024-03-25T16:51:33.387+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-03-25T16:51:33.387+01:00 level=INFO source=gpu.go:119 msg="CUDA Compute Capability detected: 6.1"
time=2024-03-25T16:51:33.387+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-03-25T16:51:33.387+01:00 level=INFO source=gpu.go:119 msg="CUDA Compute Capability detected: 6.1"
time=2024-03-25T16:51:33.387+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
loading library /tmp/ollama3416071157/runners/cuda_v11/libext_server.so
time=2024-03-25T16:51:33.394+01:00 level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama3416071157/runners/cuda_v11/libext_server.so"
time=2024-03-25T16:51:33.394+01:00 level=INFO source=dyn_ext_server.go:150 msg="Initializing llama server"
time=2024-03-25T16:51:33.398+01:00 level=WARN source=llm.go:170 msg="Failed to load dynamic library /tmp/ollama3416071157/runners/cuda_v11/libext_server.so  Unable to init GPU: no CUDA-capable device is detected"
loading library /tmp/ollama3416071157/runners/cpu_avx2/libext_server.so
time=2024-03-25T16:51:33.398+01:00 level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama3416071157/runners/cpu_avx2/libext_server.so"
time=2024-03-25T16:51:33.398+01:00 level=INFO source=dyn_ext_server.go:150 msg="Initializing llama server"

The same binary successfully uses the AMD GPU when preloading the corresponding cuda stub library:

LD_PRELOAD=/opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so /usr/local/bin/ollama-linux-amd64-v0.1.29 serve

$ LD_PRELOAD=/opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so /usr/local/bin/ollama-linux-amd64-v0.1.29 serve
time=2024-03-25T16:52:56.560+01:00 level=INFO source=images.go:806 msg="total blobs: 29"
time=2024-03-25T16:52:56.560+01:00 level=INFO source=images.go:813 msg="total unused blobs removed: 0"
time=2024-03-25T16:52:56.560+01:00 level=INFO source=routes.go:1110 msg="Listening on 127.0.0.1:11434 (version 0.1.29)"
time=2024-03-25T16:52:56.560+01:00 level=INFO source=payload_common.go:112 msg="Extracting dynamic libraries to /tmp/ollama1611052111/runners ..."
time=2024-03-25T16:52:58.544+01:00 level=INFO source=payload_common.go:139 msg="Dynamic LLM libraries [cuda_v11 cpu_avx cpu cpu_avx2 rocm_v60000]"
time=2024-03-25T16:52:58.544+01:00 level=INFO source=gpu.go:77 msg="Detecting GPU type"
time=2024-03-25T16:52:58.544+01:00 level=INFO source=gpu.go:191 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-03-25T16:52:58.560+01:00 level=INFO source=gpu.go:237 msg="Discovered GPU libraries: [/usr/lib/libnvidia-ml.so.550.54.14 /usr/lib64/libnvidia-ml.so.550.54.14 /opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so]"

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING:

You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Linked to libnvidia-ml library at wrong path : /usr/lib/libnvidia-ml.so.550.54.14

time=2024-03-25T16:52:58.588+01:00 level=INFO source=gpu.go:249 msg="Unable to load CUDA management library /usr/lib/libnvidia-ml.so.550.54.14: nvml vram init failure: 9"

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING:

You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Linked to libnvidia-ml library at wrong path : /usr/lib/libnvidia-ml.so.550.54.14

time=2024-03-25T16:52:58.614+01:00 level=INFO source=gpu.go:249 msg="Unable to load CUDA management library /usr/lib64/libnvidia-ml.so.550.54.14: nvml vram init failure: 9"

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING:

You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Linked to libnvidia-ml library at wrong path : /opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

time=2024-03-25T16:52:58.637+01:00 level=INFO source=gpu.go:249 msg="Unable to load CUDA management library /opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so: nvml vram init failure: 9"
time=2024-03-25T16:52:58.637+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-03-25T16:52:58.637+01:00 level=WARN source=amd_linux.go:53 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers: amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-03-25T16:52:58.638+01:00 level=INFO source=amd_linux.go:88 msg="detected amdgpu versions [gfx1030]"
time=2024-03-25T16:52:58.639+01:00 level=INFO source=amd_linux.go:119 msg="amdgpu [0] gfx1030 is supported"
time=2024-03-25T16:52:58.639+01:00 level=INFO source=amd_linux.go:246 msg="[0] amdgpu totalMemory 16368M"
time=2024-03-25T16:52:58.639+01:00 level=INFO source=amd_linux.go:247 msg="[0] amdgpu freeMemory  16368M"

Simply running

/usr/local/bin/ollama-linux-amd64-v0.1.29 serve

successfully makes use of the Nvidia GPU

Please let me know if you need anything else (likely I won't reply before the 2nd week of April)


@dhiltgen commented on GitHub (Apr 12, 2024):

Within PR #3418 I've added support for mixed GPU types. Once that merges, and we ship a release with it, please give it a try.


@frostworx commented on GitHub (Apr 16, 2024):

Thanks for the heads up!

Sorry for the lag; a mention of the patch at work today reminded me that I still wanted to test it :)

I just successfully compiled current master with 3418 applied, and can confirm that ollama found both the NVIDIA and the AMD GPU as valid GPUs.
During a quick test it picked the NVIDIA GPU by default.

Please let me know how to best test/force amdgpu usage.

There's a minor(?) glitch with the amdgpu kernel module, btw:
time=2024-04-16T15:48:32.254+02:00 level=WARN source=amd_linux.go:49 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
(I'm on Arch Linux, currently running 6.8.5-zen1-1-zen; please let me know if more details are required)

The gpu is detected correctly:

time=2024-04-16T15:48:59.836+02:00 level=INFO source=amd_linux.go:217 msg="amdgpu memory" gpu=0 totalMB=16368
time=2024-04-16T15:48:59.836+02:00 level=INFO source=amd_linux.go:218 msg="amdgpu memory" gpu=0 freeMB=16368
time=2024-04-16T15:48:59.838+02:00 level=INFO source=amd_linux.go:276 msg="amdgpu is supported" gpu=0 gpu_type=gfx1030

@dhiltgen commented on GitHub (Apr 16, 2024):

> there's a minor(?) glitch with the amdgpu kernel module

That's "working as designed" - the lack of the version file implies this is an upstream amdgpu driver merged into the linux kernel source, and is quite a bit out of date compared to the AMD downstream driver, so has bugs and missing features that may impact some GPUs. This warning is there to guide users if they run into problems so they can try upgrading to the latest driver. If things are working properly, then you should be OK.

The design goal for 3418 is to allow models to run on both GPUs, but we're taking it one step at a time to make sure things are solid. You can set OLLAMA_MAX_RUNNERS to zero for dynamic, or some number greater than 1 to put a hard cap on the number of runners (e.g. 2 in your case) and then try to load two (or more) different models. Once the nvidia card is full, it should move to the AMD card.
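As a concrete reading of the suggestion above (model names are illustrative and this invocation is untested here):

```shell
# Cap runners at 2 so a second model can spill onto the AMD card once
# the NVIDIA card is full, per the comment above (0 would mean dynamic).
OLLAMA_MAX_RUNNERS=2 ollama serve &

# Load two different models; the second should land on the other GPU.
ollama run llama2 "hello"
ollama run mistral "hello"
```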


@frostworx commented on GitHub (Apr 16, 2024):

Ah, I see. Thank you for the clarification!


@dhiltgen commented on GitHub (Apr 28, 2024):

The 0.1.33 release is available now as a pre-release.


@dhiltgen commented on GitHub (May 4, 2024):

I believe everything should be in place for this to work now with 0.1.33, so I'll close this ticket.

Reference: github-starred/ollama#63991