[GH-ISSUE #12388] Ollama windows build regression - Failing to use AMD GPU gfx1030 #33989

Closed
opened 2026-04-22 17:12:31 -05:00 by GiteaMirror · 3 comments

Originally created by @dsimoes on GitHub (Sep 23, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12388

What is the issue?

I installed a new version of Ollama (0.12.1) for Windows, updating my previous one (I don't know the previous version; it was several months old), and now Ollama fails to run with my GPU, an AMD 6800 XT.
Log attached below.

Thanks in advance for the help.

Relevant log output

time=2025-09-24T00:25:12.801+01:00 level=INFO source=routes.go:1475 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\dsimo\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-09-24T00:25:12.805+01:00 level=INFO source=images.go:518 msg="total blobs: 34"
time=2025-09-24T00:25:12.807+01:00 level=INFO source=images.go:525 msg="total unused blobs removed: 0"
time=2025-09-24T00:25:12.808+01:00 level=INFO source=routes.go:1528 msg="Listening on [::]:11434 (version 0.12.1)"
time=2025-09-24T00:25:12.808+01:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-09-24T00:25:12.808+01:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-09-24T00:25:12.808+01:00 level=INFO source=gpu_windows.go:183 msg="efficiency cores detected" maxEfficiencyClass=1
time=2025-09-24T00:25:12.808+01:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=16 efficiency=8 threads=24
time=2025-09-24T00:25:13.182+01:00 level=INFO source=types.go:131 msg="inference compute" id=0 library=rocm variant="" compute=gfx1030 driver=6.4 name="AMD Radeon RX 6800 XT" total="16.0 GiB" available="15.8 GiB"
time=2025-09-24T00:25:13.182+01:00 level=INFO source=routes.go:1569 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
[GIN] 2025/09/24 - 00:25:13 | 200 |      2.0352ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/09/24 - 00:25:13 | 200 |     33.4291ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/09/24 - 00:26:14 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/09/24 - 00:26:14 | 200 |     67.0215ms |       127.0.0.1 | POST     "/api/show"
time=2025-09-24T00:26:14.906+01:00 level=INFO source=sched.go:192 msg="one or more GPUs detected that are unable to accurately report free memory - disabling default concurrency"
time=2025-09-24T00:26:15.347+01:00 level=INFO source=server.go:399 msg="starting runner" cmd="C:\\Users\\dsimo\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model C:\\Users\\dsimo\\.ollama\\models\\blobs\\sha256-aeda25e63ebd698fab8638ffb778e68bed908b960d39d0becc650fa981609d25 --port 57590"
time=2025-09-24T00:26:15.349+01:00 level=INFO source=server.go:672 msg="loading model" "model layers"=35 requested=-1
time=2025-09-24T00:26:15.373+01:00 level=INFO source=runner.go:1252 msg="starting ollama engine"
time=2025-09-24T00:26:15.374+01:00 level=INFO source=runner.go:1287 msg="Server listening on 127.0.0.1:57590"
time=2025-09-24T00:26:15.687+01:00 level=INFO source=server.go:678 msg="system memory" total="31.8 GiB" free="19.2 GiB" free_swap="22.0 GiB"
time=2025-09-24T00:26:15.687+01:00 level=INFO source=server.go:686 msg="gpu memory" id=0 available="15.3 GiB" free="15.7 GiB" minimum="457.0 MiB" overhead="0 B"
time=2025-09-24T00:26:15.688+01:00 level=INFO source=runner.go:1171 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:false KvSize:4096 KvCacheType: NumThreads:8 GPULayers:35[ID:0 Layers:35(0..34)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2025-09-24T00:26:15.722+01:00 level=INFO source=ggml.go:131 msg="" architecture=gemma3 file_type=Q4_K_M name="" description="" num_tensors=883 num_key_values=36
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 6800 XT, gfx1030 (0x1030), VMM: no, Wave Size: 32, ID: 0
load_backend: loaded ROCm backend from C:\Users\dsimo\AppData\Local\Programs\Ollama\lib\ollama\ggml-hip.dll
load_backend: loaded CPU backend from C:\Users\dsimo\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 6800 XT, gfx1030 (0x1030), VMM: no, Wave Size: 32
load_backend: loaded ROCm backend from C:\Users\dsimo\AppData\Local\Programs\Ollama\lib\ollama\rocm\ggml-hip.dll
Exception 0xc0000005 0x8 0xf 0xf
PC=0xf
signal arrived during external code execution

runtime.cgocall(0x7ff634a02660, 0xc00004b9d0)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/cgocall.go:167 +0x3e fp=0xc00004b9a8 sp=0xc00004b940 pc=0x7ff633ce2dbe
github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_reg_get_proc_address(0x23421e1fc00, 0x23421dfdc90)
(... stacktrace)

OS

Windows

GPU

AMD

CPU

Intel

Ollama version

0.12.1

GiteaMirror added the bug label 2026-04-22 17:12:31 -05:00

@rick-github commented on GitHub (Sep 24, 2025):

You have multiple copies of ggml-hip.dll. I suggest stopping ollama, deleting C:\Users\dsimo\AppData\Local\Programs\Ollama\lib\ollama and re-installing.
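The log above confirms the diagnosis: `ggml-hip.dll` is loaded twice, once from `lib\ollama\` and again from `lib\ollama\rocm\`, right before the access-violation crash. As an illustrative sketch (not part of Ollama itself), one can check an install tree for duplicate backend DLLs before deciding to wipe and reinstall:

```python
from collections import defaultdict
from pathlib import Path

def find_duplicate_dlls(root):
    """Group DLL files under `root` by filename and return any
    name that appears in more than one directory."""
    seen = defaultdict(list)
    for p in Path(root).rglob("*.dll"):
        seen[p.name.lower()].append(str(p))
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

Pointing this at `C:\Users\<you>\AppData\Local\Programs\Ollama\lib\ollama` would list `ggml-hip.dll` twice in a broken install like the one above; an empty result suggests a clean tree.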


@pdevine commented on GitHub (Sep 24, 2025):

cc @dhiltgen


@dsimoes commented on GitHub (Sep 24, 2025):

Thanks @rick-github, I've done as you suggested and it's working now.
Cheers.

Reference: github-starred/ollama#33989