[GH-ISSUE #8765] Doesn't work with AMD GPU when path has a space in it - ROCm error: no kernel image is available for execution on the device #31450

Open
opened 2026-04-22 11:53:57 -05:00 by GiteaMirror · 2 comments

Originally created by @pkrasicki on GitHub (Feb 1, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8765

What is the issue?

I can't get Ollama 0.5.7 to work with my Radeon RX 6700 XT (gfx1031) GPU on Debian 13. I got Ollama by downloading and extracting the archives from the releases page: ollama-linux-amd64.tgz and ollama-linux-amd64-rocm.tgz.
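
For context, a minimal sketch of that install, with illustrative paths (the report doesn't say where the archives were unpacked):

```
# Illustrative only: extract both release archives into one directory.
# The destination path is a placeholder; the report doesn't give the real one.
mkdir -p ~/ollama
tar -C ~/ollama -xzf ollama-linux-amd64.tgz
tar -C ~/ollama -xzf ollama-linux-amd64-rocm.tgz
cd ~/ollama/bin   # the binary's location inside the archive may differ; adjust as needed
```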

gfx1031 isn't detected as a supported GPU, so I set the override variable to gfx1030 when starting the server:
HSA_OVERRIDE_GFX_VERSION=10.3.0 ./ollama serve

Here's the log:

2025/02/01 19:37:43 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:10.3.0 HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/user/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-02-01T19:37:43.401Z level=INFO source=images.go:432 msg="total blobs: 5"
time=2025-02-01T19:37:43.401Z level=INFO source=images.go:439 msg="total unused blobs removed: 0"
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.

[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:	export GIN_MODE=release
 - using code:	gin.SetMode(gin.ReleaseMode)

[GIN-debug] POST   /api/pull                 --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers)
[GIN-debug] POST   /api/generate             --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
[GIN-debug] POST   /api/chat                 --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
[GIN-debug] POST   /api/embed                --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
[GIN-debug] POST   /api/embeddings           --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
[GIN-debug] POST   /api/create               --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers)
[GIN-debug] POST   /api/push                 --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers)
[GIN-debug] POST   /api/copy                 --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers)
[GIN-debug] DELETE /api/delete               --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
[GIN-debug] POST   /api/show                 --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers)
[GIN-debug] POST   /api/blobs/:digest        --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
[GIN-debug] HEAD   /api/blobs/:digest        --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
[GIN-debug] GET    /api/ps                   --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers)
[GIN-debug] POST   /v1/chat/completions      --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
[GIN-debug] POST   /v1/completions           --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
[GIN-debug] POST   /v1/embeddings            --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
[GIN-debug] GET    /v1/models                --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
[GIN-debug] GET    /v1/models/:model         --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
[GIN-debug] GET    /                         --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] GET    /api/tags                 --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] GET    /api/version              --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
[GIN-debug] HEAD   /                         --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] HEAD   /api/tags                 --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] HEAD   /api/version              --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
time=2025-02-01T19:37:43.402Z level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11434 (version 0.5.7)"
time=2025-02-01T19:37:43.402Z level=INFO source=routes.go:1267 msg="Dynamic LLM libraries" runners="[cpu_avx2 cuda_v11_avx cuda_v12_avx rocm_avx cpu cpu_avx]"
time=2025-02-01T19:37:43.402Z level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
time=2025-02-01T19:37:43.412Z level=WARN source=amd_linux.go:61 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2025-02-01T19:37:43.412Z level=INFO source=amd_linux.go:391 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.3.0
time=2025-02-01T19:37:43.412Z level=INFO source=types.go:131 msg="inference compute" id=0 library=rocm variant="" compute=gfx1031 driver=0.0 name=1002:73df total="12.0 GiB" available="11.3 GiB"

(...)

time=2025-02-01T19:38:11.459Z level=INFO source=server.go:594 msg="llama runner started in 2.26 seconds"
ggml_cuda_compute_forward: RMS_NORM failed
ROCm error: no kernel image is available for execution on the device
  current device: 0, in function ggml_cuda_compute_forward at llama/ggml-cuda/ggml-cuda.cu:2218
  err
llama/ggml-cuda/ggml-cuda.cu:96: ROCm error
SIGSEGV: segmentation violation
PC=0x7f170cf4928b m=7 sigcode=1 addr=0x7f1685bbc018
signal arrived during cgo execution

(...)

I've never been able to run a ROCm version newer than 5.7 (I don't know why), so maybe that's the issue? Running Ollama on the CPU gives no errors.

OS

Linux

GPU

AMD

CPU

No response

Ollama version

0.5.7


@pkrasicki commented on GitHub (Feb 11, 2025):

It started working after I moved Ollama to a different location that doesn't have a space in the path.
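
For anyone hitting the same thing, a minimal sketch of that workaround; the original and new locations below are hypothetical, since the comment doesn't name them:

```
# Hypothetical paths: move the extracted Ollama tree out of a directory whose
# path contains a space, then start the server again with the gfx override.
mv ~/"my apps/ollama" ~/ollama
cd ~/ollama
HSA_OVERRIDE_GFX_VERSION=10.3.0 ./ollama serve
```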


@pkrasicki commented on GitHub (Feb 13, 2025):

This is caused by a bug in ROCm: https://github.com/ROCm/ROCm/issues/4329#issuecomment-2651923830. Can we get some kind of workaround for this in Ollama, or at least have it documented?
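
Until ROCm fixes this, one idea for a user-side guard (not something Ollama currently does) is to check the install path for spaces before starting the server and warn, for example:

```
# Illustrative pre-flight check: warn if the Ollama binary lives in a path with a
# space, which is known to break ROCm kernel-image loading (ROCm/ROCm#4329).
OLLAMA_BIN="$(command -v ollama || echo "$PWD/ollama")"
case "$OLLAMA_BIN" in
  *" "*)
    echo "warning: path '$OLLAMA_BIN' contains a space;" >&2
    echo "ROCm may fail with 'no kernel image is available for execution on the device'." >&2
    ;;
esac
HSA_OVERRIDE_GFX_VERSION=10.3.0 "$OLLAMA_BIN" serve
```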

Reference: github-starred/ollama#31450