[GH-ISSUE #12915] Runner crashes checking AMD MI50 GPU #55075

Closed
opened 2026-04-29 08:17:13 -05:00 by GiteaMirror · 3 comments
Originally created by @0xE1 on GitHub (Nov 2, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12915

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Something seems to go very wrong with the runner: it crashes while checking for compatible GPUs, and the host also logs a lot of amdgpu driver errors.

It works without issues if I roll back to the 0.12.3-rocm tag.

Flags:
HCC_AMDGPU_TARGET=gfx906
HSA_OVERRIDE_GFX_VERSION=10.3.0
OLLAMA_DEBUG=1
AMD_LOG_LEVEL=3 (this one oddly didn't change anything)

Relevant log output

time=2025-11-02T22:55:49.353Z level=INFO source=routes.go:1524 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:10.3.0 HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:DEBUG OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[* http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-11-02T22:55:49.384Z level=INFO source=images.go:522 msg="total blobs: 5"
time=2025-11-02T22:55:49.401Z level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-02T22:55:49.419Z level=INFO source=routes.go:1577 msg="Listening on [::]:11434 (version 0.12.9)"
time=2025-11-02T22:55:49.419Z level=DEBUG source=sched.go:120 msg="starting llm scheduler"
time=2025-11-02T22:55:49.420Z level=INFO source=runner.go:76 msg="discovering available GPUs..."
time=2025-11-02T22:55:49.422Z level=INFO source=server.go:400 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 44589"
time=2025-11-02T22:55:49.422Z level=DEBUG source=server.go:401 msg=subprocess PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin OLLAMA_ORIGINS=* HSA_OVERRIDE_GFX_VERSION=10.3.0 OLLAMA_DEBUG=1 LD_LIBRARY_PATH=/usr/lib/ollama:/usr/lib/ollama/rocm:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 OLLAMA_HOST=0.0.0.0:11434 OLLAMA_LIBRARY_PATH=/usr/lib/ollama:/usr/lib/ollama/rocm
time=2025-11-02T22:55:52.550Z level=DEBUG source=runner.go:471 msg="bootstrap discovery took" duration=3.129890019s OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" extra_envs=map[]
time=2025-11-02T22:55:52.550Z level=DEBUG source=runner.go:120 msg="evluating which if any devices to filter out" initial_count=1
time=2025-11-02T22:55:52.550Z level=DEBUG source=runner.go:132 msg="verifying GPU is supported" library=/usr/lib/ollama/rocm description="AMD Radeon Graphics" compute=gfx1030 id=GPU-214a716173497dd3 pci_id=0000:45:00.0
time=2025-11-02T22:55:52.551Z level=INFO source=server.go:400 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 41391"
time=2025-11-02T22:55:52.551Z level=DEBUG source=server.go:401 msg=subprocess PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin OLLAMA_ORIGINS=* HSA_OVERRIDE_GFX_VERSION=10.3.0 OLLAMA_DEBUG=1 LD_LIBRARY_PATH=/usr/lib/ollama:/usr/lib/ollama/rocm:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 OLLAMA_HOST=0.0.0.0:11434 OLLAMA_LIBRARY_PATH=/usr/lib/ollama:/usr/lib/ollama/rocm GGML_CUDA_INIT=1 ROCR_VISIBLE_DEVICES=GPU-214a716173497dd3
time=2025-11-02T22:55:58.210Z level=INFO source=runner.go:498 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-214a716173497dd3]" error="runner crashed"
time=2025-11-02T22:55:58.210Z level=DEBUG source=runner.go:471 msg="bootstrap discovery took" duration=5.660153408s OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-214a716173497dd3]"
time=2025-11-02T22:55:58.210Z level=DEBUG source=runner.go:158 msg="filtering device which didn't fully initialize" id=GPU-214a716173497dd3 libdir=/usr/lib/ollama/rocm pci_id=0000:45:00.0 library=ROCm
time=2025-11-02T22:55:58.211Z level=DEBUG source=runner.go:41 msg="GPU bootstrap discovery took" duration=8.791322532s
time=2025-11-02T22:55:58.211Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="251.8 GiB" available="197.0 GiB"
time=2025-11-02T22:55:58.211Z level=INFO source=routes.go:1618 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"
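To pick the relevant record out of a log like the one above, the key=value lines can be parsed mechanically. The following is a minimal sketch, assuming each line is a flat sequence of `key=value` pairs with optional double quotes; the helper names `parse_logfmt` and `discovery_failures` are hypothetical and not part of Ollama.

```python
import shlex


def parse_logfmt(line):
    """Split one key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            fields[key] = value
    return fields


def discovery_failures(lines):
    """Return the parsed records whose msg reports a failed GPU discovery."""
    records = (parse_logfmt(line) for line in lines)
    return [r for r in records if r.get("msg") == "failure during GPU discovery"]


# Two lines copied from the log in this report.
sample = [
    'time=2025-11-02T22:55:52.550Z level=DEBUG source=runner.go:132 '
    'msg="verifying GPU is supported" library=/usr/lib/ollama/rocm '
    'description="AMD Radeon Graphics" compute=gfx1030 '
    'id=GPU-214a716173497dd3 pci_id=0000:45:00.0',
    'time=2025-11-02T22:55:58.210Z level=INFO source=runner.go:498 '
    'msg="failure during GPU discovery" '
    'OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" '
    'extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-214a716173497dd3]" '
    'error="runner crashed"',
]

for record in discovery_failures(sample):
    print(record["time"], record["error"])
```

On the two sample lines this reports only the second record, the one carrying error="runner crashed".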

dmesg from the host:
[Sun Nov  2 23:41:39 2025] gmc_v9_0_process_interrupt: 43 callbacks suppressed
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x008012B1
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (inst) (0x9)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x1
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0xb
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:  for process ollama pid 103233 thread ollama pid 103233)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MORE_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          WALKER_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          MAPPING_ERROR: 0x0
[Sun Nov  2 23:41:39 2025] amdgpu 0000:45:00.0: amdgpu:          RW: 0x0
[Sun Nov  2 23:41:43 2025] amdgpu 0000:45:00.0: amdgpu: Queue preemption failed for queue with doorbell_id: 80004000
[Sun Nov  2 23:41:43 2025] amdgpu 0000:45:00.0: amdgpu: queue id 0x0 at pasid 0x8001 is reset
[Sun Nov  2 23:41:43 2025] amdgpu 0000:45:00.0: amdgpu: Queues reset on process ollama tid 103233 thread ollama pid 103233
[Sun Nov  2 23:41:43 2025] event_interrupt_wq_v9: 230 callbacks suppressed
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 2, data 0x0, sh 0, priv 1, wave_id 0, simd_id 1, cu_id 15, err_type 1
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 2, data 0x0, sh 0, priv 1, wave_id 0, simd_id 2, cu_id 15, err_type 1
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 2, data 0x0, sh 0, priv 1, wave_id 0, simd_id 3, cu_id 15, err_type 2
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 2, data 0x0, sh 0, priv 1, wave_id 0, simd_id 0, cu_id 15, err_type 2
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 0, data 0x0, sh 0, priv 1, wave_id 0, simd_id 2, cu_id 15, err_type 1
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 0, data 0x0, sh 0, priv 1, wave_id 0, simd_id 3, cu_id 15, err_type 1
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 0, data 0x0, sh 0, priv 1, wave_id 0, simd_id 1, cu_id 15, err_type 1
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 2, data 0x0, sh 0, priv 1, wave_id 0, simd_id 1, cu_id 14, err_type 2
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 2, data 0x0, sh 0, priv 1, wave_id 0, simd_id 2, cu_id 14, err_type 2
[Sun Nov  2 23:41:43 2025] amdgpu: sq_intr: error, se 0, data 0x0, sh 0, priv 1, wave_id 0, simd_id 0, cu_id 15, err_type 2
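Because the dmesg output repeats the same fault block many times, a short summary is easier to read than the raw lines. This is a rough sketch, assuming the line wording shown above; the function name `summarize_faults` and the regular expressions are illustrative and are not part of any amdgpu tooling.

```python
import re
from collections import Counter

# Patterns matched against the amdgpu lines as they appear in this report.
FAULT_RE = re.compile(r"no-retry page fault")
ADDR_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+)")
CLIENT_RE = re.compile(r"Faulty UTCL2 client ID: (.+?) \((0x[0-9a-fA-F]+)\)")


def summarize_faults(lines):
    """Count page faults and tally the faulting addresses and UTCL2 clients."""
    faults = 0
    addresses = Counter()
    clients = Counter()
    for line in lines:
        if FAULT_RE.search(line):
            faults += 1
        match = ADDR_RE.search(line)
        if match:
            addresses[match.group(1)] += 1
        match = CLIENT_RE.search(line)
        if match:
            clients[match.group(1)] += 1
    return faults, addresses, clients


# A few lines copied from the dmesg output in this report.
sample = [
    "amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)",
    "amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)",
    "amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (inst) (0x9)",
    "amdgpu 0000:45:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:88 vmid:8 pasid:32769)",
    "amdgpu 0000:45:00.0: amdgpu:   in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)",
    "amdgpu 0000:45:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)",
]

faults, addresses, clients = summarize_faults(sample)
print(faults, dict(addresses), dict(clients))
```

Feeding it the full dmesg capture (for example, the lines of a saved text file) would show at a glance that every fault in this report lands at address 0x0000000000000000.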

OS

Linux, Docker

GPU

AMD

CPU

AMD

Ollama version

0.12.9

GiteaMirror added the gpu, amd, bug labels 2026-04-29 08:17:13 -05:00

@rick-github commented on GitHub (Nov 2, 2025):

The MI50 is not supported by ROCm as of 0.12.5 (https://github.com/ollama/ollama/releases/tag/v0.12.5). Experimental Vulkan support, which should cover this GPU, was added in 0.12.6 (https://github.com/ollama/ollama/releases/tag/v0.12.6). You can clone the repo and build ollama to test the experimental support, or stay on 0.12.4 until Vulkan support is mainlined.

@dhiltgen commented on GitHub (Nov 4, 2025):

gfx1030 is quite a bit different than gfx906, so I'm not surprised things go bad when forcing it via the override.

Let's track this via #12600
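The mismatch between the flags in the report and the compute target in the log can be made concrete. This is a minimal sketch, assuming the usual convention that the major.minor.stepping value of HSA_OVERRIDE_GFX_VERSION maps onto a gfx target name with minor and stepping written as hex digits; the function name `gfx_target_from_override` is hypothetical and not part of Ollama or ROCm.

```python
def gfx_target_from_override(version):
    """Convert 'major.minor.stepping' into a gfx target name.

    Assumed convention: major in decimal, minor and stepping as hex digits,
    so '10.3.0' becomes 'gfx1030' and '9.0.6' becomes 'gfx906'.
    """
    major, minor, stepping = (int(part) for part in version.split("."))
    return f"gfx{major}{minor:x}{stepping:x}"


# Values taken from the flags listed in the report.
hardware_target = "gfx906"   # HCC_AMDGPU_TARGET
override = "10.3.0"          # HSA_OVERRIDE_GFX_VERSION

reported = gfx_target_from_override(override)
print(reported, reported == hardware_target)
```

With the values from the report this yields gfx1030, the same compute target that appears in the "verifying GPU is supported" log line, and it does not match gfx906.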


@0xE1 commented on GitHub (Nov 5, 2025):

As for the no-retry page fault errors in the log, I wonder whether they are caused by the amdgpu or Mesa drivers in kernel 6.12.54, which ships with the latest Unraid 7.2.0. I passed the GPU through to an Ubuntu 24 LTS VM with a 6.17 kernel, and it has no issues running 0.12.3.

Thanks for the tip about experimental Vulkan support; I'll give it a try sometime later.
