[GH-ISSUE #4355] Ollama doesn't work well with ZLUDA after 0.1.34 #28477

Closed
opened 2026-04-22 06:40:51 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @4thanks on GitHub (May 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4355

What is the issue?

When I was using Ollama 0.1.32, it worked well with ZLUDA on my GPU (RX 5700 XT), following the steps in [ollama_windows_10_rx6600xt_zluda](https://www.reddit.com/r/ollama/comments/1cf5tq1/ollama_windows_10_rx6600xt_zluda/).
I recently updated to the newest version (0.1.37), and the GPU isn't being utilized anymore. I tried downgrading to 0.1.34, which didn't work either, but 0.1.33 is OK.

Update log:

```
time=2024-05-13T23:45:14.969+08:00 level=INFO source=images.go:704 msg="total blobs: 20"
time=2024-05-13T23:45:14.970+08:00 level=INFO source=images.go:711 msg="total unused blobs removed: 0"
time=2024-05-13T23:45:14.971+08:00 level=INFO source=routes.go:1052 msg="Listening on 127.0.0.1:11434 (version 0.1.37)"
time=2024-05-13T23:45:14.971+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 cuda_v11.3 rocm_v5.7 cpu cpu_avx]"
time=2024-05-13T23:45:15.003+08:00 level=INFO source=gpu.go:197 msg="error looking up nvidia GPU memory" error="nvcuda failed to get primary device context 801"
time=2024-05-13T23:45:15.005+08:00 level=WARN source=amd_windows.go:95 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1010:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1100 gfx1101 gfx1102 gfx803 gfx900 gfx906]"
time=2024-05-13T23:45:15.005+08:00 level=WARN source=amd_windows.go:97 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-05-13T23:45:15.005+08:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="31.8 GiB" available="20.9 GiB"
[GIN] 2024/05/13 - 23:45:48 | 200 |      2.5827ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2024/05/13 - 23:46:02 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2024/05/13 - 23:46:02 | 200 |      1.0369ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2024/05/13 - 23:46:02 | 200 |       518.2µs |       127.0.0.1 | POST     "/api/show"
time=2024-05-13T23:46:02.341+08:00 level=INFO source=gpu.go:197 msg="error looking up nvidia GPU memory" error="nvcuda failed to get primary device context 801"
time=2024-05-13T23:46:02.343+08:00 level=WARN source=amd_windows.go:95 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1010:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1100 gfx1101 gfx1102 gfx803 gfx900 gfx906]"
time=2024-05-13T23:46:02.343+08:00 level=WARN source=amd_windows.go:97 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-05-13T23:46:02.578+08:00 level=WARN source=server.go:207 msg="multimodal models don't support parallel requests yet"
```

ref: https://github.com/lshqqytiger/ZLUDA/issues/16 https://github.com/lshqqytiger/ZLUDA/issues/13#issuecomment-2085675119

OS

Windows

GPU

AMD

CPU

Intel

Ollama version

0.1.35

GiteaMirror added the bug label 2026-04-22 06:40:51 -05:00

@igorschlum commented on GitHub (May 11, 2024):

Hi @4thanks, can you try with version 0.1.36?


@4thanks commented on GitHub (May 12, 2024):

> Hi @4thanks, can you try with version 0.1.36?

It still doesn't work with ZLUDA in 0.1.36!


@igorschlum commented on GitHub (May 12, 2024):

Sorry, I cannot help; I'm on macOS.


@4thanks commented on GitHub (May 13, 2024):

After adding the `HSA_OVERRIDE_GFX_VERSION=10.3.0` environment variable and starting with `zluda ollama.exe serve` (not `zluda ollama app.exe serve`), my GPU is running again!
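For anyone hitting the same warnings, the workaround above can be sketched as a Windows cmd session (a non-authoritative sketch: it assumes ZLUDA and the Ollama install directory are on `PATH`; adjust paths to your own setup):

```shell
:: Sketch of the workaround from this thread (Windows cmd).
:: HSA_OVERRIDE_GFX_VERSION tells the ROCm runtime to treat the
:: RX 5700 XT (gfx1010:xnack-) as the gfx1030 target, which appears
:: in the supported_types list printed in the server log.
set HSA_OVERRIDE_GFX_VERSION=10.3.0

:: Launch the server binary itself (ollama.exe) through ZLUDA.
:: Per the comment above, launching the tray app ("ollama app.exe")
:: through ZLUDA did not get the GPU picked up.
zluda ollama.exe serve
```

This mirrors the `HSA_OVERRIDE_GFX_VERSION` guidance linked from the `amd_windows.go:97` warning in the log.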

Reference: github-starred/ollama#28477