[GH-ISSUE #14869] 0.18.0 AMD Ryzen AI Max+ 395 can not allocate memory #71648

Open
opened 2026-05-05 02:16:48 -05:00 by GiteaMirror · 16 comments

Originally created by @Haiwen-Yin on GitHub (Mar 16, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14869

What is the issue?

After upgrading Ollama to version 0.18.0 and extracting the relevant files, the model cannot use GPU memory properly when it starts.
The hardware is an AMD Ryzen AI Max+ 395, and ROCm is already installed.
I have already rolled back to 0.17.7, so the logs are no longer available.

Relevant log output


OS

Windows

GPU

AMD

CPU

AMD

Ollama version

0.18.0

GiteaMirror added the bug label 2026-05-05 02:16:48 -05:00

@rick-github commented on GitHub (Mar 16, 2026):

What version of ROCm is installed? 0.18.0+ needs ROCm 7.
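
A minimal sketch of the version check implied here, assuming the installed ROCm version is already available as a string (for example 7.1.1, or a longer major.minor.patch-hash form such as 7.2.26015-fc0010cf6a). The function names are hypothetical, and the threshold of 7 is taken from the comment above; how Ollama itself verifies the ROCm version is not shown in this thread.

```python
import re


def rocm_major(version: str) -> int:
    """Return the major version from a string such as '7.2.26015-fc0010cf6a'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized ROCm version string: {version!r}")
    return int(match.group(1))


def meets_requirement(version: str, required_major: int = 7) -> bool:
    """True if the major version is at least required_major (7 per the comment above)."""
    return rocm_major(version) >= required_major


if __name__ == "__main__":
    # Sample strings only; substitute whatever your ROCm install reports.
    for sample in ("7.1.1", "6.4.1"):
        print(sample, meets_requirement(sample))
```

This only compares the major version; it says nothing about whether a given GPU is supported by that ROCm release.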


@Jasdfgh commented on GitHub (Mar 16, 2026):

Could you try reinstalling 0.18.0 and capturing the logs before rolling back? On Windows they should be at %LOCALAPPDATA%\Ollama\server.log. I'm looking for whether ROCm detects the GPU at all (there should be a line like "inference compute" with your GPU info).

One thing worth checking: is your VRAM allocation set to "Auto" or to a fixed size in the BIOS? There have been issues with Ryzen AI APUs where dynamic VRAM mode causes GPU detection to fail (#11451). If it's on Auto, try setting a fixed VRAM size.

Also, what driver version are you on?
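
As a rough sketch of the log check described above, the snippet below splits key=value log lines and collects the "inference compute" entries along with any "failure during GPU discovery" errors. The helper names are hypothetical, and the two sample lines are abbreviated versions of server log lines of the kind posted in this thread; the parsing assumes the key=value layout seen there and is not an official Ollama tool.

```python
import shlex


def parse_log_line(line: str) -> dict:
    """Split one key=value log line into a dict; tokens without '=' are skipped."""
    fields = {}
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes (for example a truncated line): treat as unparseable.
        return fields
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def summarize_discovery(lines):
    """Return (detected, failures) gathered from an iterable of log lines."""
    detected, failures = [], []
    for line in lines:
        fields = parse_log_line(line)
        msg = fields.get("msg")
        if msg == "inference compute":
            detected.append((fields.get("library"), fields.get("description")))
        elif msg == "failure during GPU discovery":
            failures.append(fields.get("error"))
    return detected, failures


# Abbreviated sample lines, for illustration only.
SAMPLE = [
    'time=2026-03-17T01:48:19.006Z level=INFO source=runner.go:464 '
    'msg="failure during GPU discovery" error="runner crashed"',
    'time=2026-03-17T01:48:19.007Z level=INFO source=types.go:60 '
    'msg="inference compute" id=cpu library=cpu description=cpu total="31.0 GiB"',
]

if __name__ == "__main__":
    print(summarize_discovery(SAMPLE))
```

To run it against a real log, pass an open file object for server.log to `summarize_discovery`. If the only detected library is `cpu`, the GPU was not picked up.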


@mcpata2002 commented on GitHub (Mar 17, 2026):

Same for me, using the ollama/ollama:rocm Docker image on Debian 12.

OS

Debian 12

CPU and GPU

AMD RYZEN AI MAX+ 395 w/ Radeon 8060S

Ollama version
0.18.0

hipconfig -v

7.2.26015-fc0010cf6a

HSA System Attributes

ROCk module version 6.16.13 is loaded

HSA System Attributes

Runtime Version: 1.18
Runtime Ext Version: 1.15
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES

Output when I start ollama serve via the docker image.

aillm-ollama  | time=2026-03-17T01:48:18.523Z level=INFO source=routes.go:1727 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:11.0.0 HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:INFO OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:2 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
aillm-ollama  | time=2026-03-17T01:48:18.523Z level=INFO source=routes.go:1729 msg="Ollama cloud disabled: false"
aillm-ollama  | time=2026-03-17T01:48:18.531Z level=INFO source=images.go:477 msg="total blobs: 154"
aillm-ollama  | time=2026-03-17T01:48:18.533Z level=INFO source=images.go:484 msg="total unused blobs removed: 0"
aillm-ollama  | time=2026-03-17T01:48:18.533Z level=INFO source=routes.go:1782 msg="Listening on [::]:11434 (version 0.18.0)"
aillm-ollama  | time=2026-03-17T01:48:18.534Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
aillm-ollama  | time=2026-03-17T01:48:18.534Z level=WARN source=runner.go:485 msg="user overrode visible devices" HSA_OVERRIDE_GFX_VERSION=11.0.0
aillm-ollama  | time=2026-03-17T01:48:18.534Z level=WARN source=runner.go:489 msg="if GPUs are not correctly discovered, unset and try again"
aillm-ollama  | time=2026-03-17T01:48:18.534Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 41113"
aillm-ollama  | time=2026-03-17T01:48:18.635Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 41653"
aillm-ollama  | time=2026-03-17T01:48:19.006Z level=INFO source=runner.go:464 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]" error="runner crashed"
aillm-ollama  | time=2026-03-17T01:48:19.007Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.0 GiB" available="30.8 GiB"
aillm-ollama  | time=2026-03-17T01:48:19.007Z level=INFO source=routes.go:1832 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096

@rick-github commented on GitHub (Mar 17, 2026):

Don't set HSA_OVERRIDE_GFX_VERSION.


@Haiwen-Yin commented on GitHub (Mar 17, 2026):

> What version of ROCm is installed? 0.18.0+ needs ROCm 7.

ROCm 7.1.1 is installed.


@rick-github commented on GitHub (Mar 17, 2026):

Server logs (https://docs.ollama.com/troubleshooting) will aid in debugging.


@jdblns commented on GitHub (Mar 17, 2026):

tysy@horse:~$ ollama serve
time=2026-03-17T13:45:10.571Z level=INFO source=routes.go:1727 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:11.5.1 HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:DEBUG OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:13624 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY:rocm OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/tysy/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-03-17T13:45:10.572Z level=INFO source=routes.go:1729 msg="Ollama cloud disabled: false"
time=2026-03-17T13:45:10.572Z level=INFO source=images.go:477 msg="total blobs: 0"
time=2026-03-17T13:45:10.572Z level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2026-03-17T13:45:10.573Z level=INFO source=routes.go:1782 msg="Listening on [::]:13624 (version 0.18.0)"
time=2026-03-17T13:45:10.573Z level=DEBUG source=sched.go:145 msg="starting llm scheduler"
time=2026-03-17T13:45:10.574Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-03-17T13:45:10.574Z level=WARN source=runner.go:485 msg="user overrode visible devices" HSA_OVERRIDE_GFX_VERSION=11.5.1
time=2026-03-17T13:45:10.574Z level=WARN source=runner.go:489 msg="if GPUs are not correctly discovered, unset and try again"
time=2026-03-17T13:45:10.574Z level=DEBUG source=runner.go:98 msg="skipping available library at user's request" requested=rocm libDir=/usr/local/lib/ollama/cuda_v12
time=2026-03-17T13:45:10.574Z level=DEBUG source=runner.go:98 msg="skipping available library at user's request" requested=rocm libDir=/usr/local/lib/ollama/cuda_v13
time=2026-03-17T13:45:10.574Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 41361"
time=2026-03-17T13:45:10.574Z level=DEBUG source=server.go:431 msg=subprocess GGML_VK_DEVICE_VRAM=96G OLLAMA_HOST=0.0.0.0:13624 OLLAMA_LLM_LIBRARY=rocm OLLAMA_DEBUG=1 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm::/opt/rocm/lib PATH=/home/tysy/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/rocm/bin:/opt/rocm/hip/bin:/home/tysy/.local/bin:/home/tysy/.local/bin HSA_OVERRIDE_GFX_VERSION=11.5.1 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm
time=2026-03-17T13:45:10.767Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=193.258796ms OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs=map[]
time=2026-03-17T13:45:10.767Z level=DEBUG source=runner.go:98 msg="skipping available library at user's request" requested=rocm libDir=/usr/local/lib/ollama/vulkan
time=2026-03-17T13:45:10.767Z level=DEBUG source=runner.go:124 msg="evaluating which, if any, devices to filter out" initial_count=1
time=2026-03-17T13:45:10.767Z level=DEBUG source=runner.go:146 msg="verifying if device is supported" library=/usr/local/lib/ollama/rocm description="Radeon 8060S Graphics" compute=gfx1151 id=0 pci_id=0000:c5:00.0
time=2026-03-17T13:45:10.767Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 44141"
time=2026-03-17T13:45:10.767Z level=DEBUG source=server.go:431 msg=subprocess GGML_VK_DEVICE_VRAM=96G OLLAMA_HOST=0.0.0.0:13624 OLLAMA_LLM_LIBRARY=rocm OLLAMA_DEBUG=1 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm::/opt/rocm/lib PATH=/home/tysy/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/rocm/bin:/opt/rocm/hip/bin:/home/tysy/.local/bin:/home/tysy/.local/bin HSA_OVERRIDE_GFX_VERSION=11.5.1 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm ROCR_VISIBLE_DEVICES=0 GGML_CUDA_INIT=1
time=2026-03-17T13:45:40.768Z level=INFO source=runner.go:464 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]" error="failed to finish discovery before timeout"
time=2026-03-17T13:45:40.769Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=30.001439724s OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]"
time=2026-03-17T13:45:40.769Z level=DEBUG source=runner.go:153 msg="filtering device which didn't fully initialize" id=0 libdir=/usr/local/lib/ollama/rocm pci_id=0000:c5:00.0 library=ROCm
time=2026-03-17T13:45:40.769Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=30.195110396s
time=2026-03-17T13:45:40.769Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.0 GiB" available="28.4 GiB"
time=2026-03-17T13:45:40.769Z level=INFO source=routes.go:1832 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096

@rick-github commented on GitHub (Mar 17, 2026):

Don't set HSA_OVERRIDE_GFX_VERSION. Set OLLAMA_DEBUG=2 and post the log.
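
One way to follow this advice without touching the shell profile is to build a cleaned environment and start the server from it. This is only a sketch: the helper names and the log file name are hypothetical, and it assumes `ollama` is on PATH and that the server is started by hand rather than by systemd or Docker (where the variables would be set in the unit or container configuration instead).

```python
import os
import subprocess


def debug_env(base) -> dict:
    """Copy base, drop HSA_OVERRIDE_GFX_VERSION, and set OLLAMA_DEBUG=2."""
    env = dict(base)
    env.pop("HSA_OVERRIDE_GFX_VERSION", None)  # no-op if it was never set
    env["OLLAMA_DEBUG"] = "2"
    return env


def serve_with_debug(log_path: str = "ollama-debug.log") -> int:
    """Run `ollama serve` with the cleaned environment, capturing output to log_path.

    Blocks until the server exits; the file name is an arbitrary choice.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(
            ["ollama", "serve"],
            env=debug_env(os.environ),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode
```

Calling `serve_with_debug()` leaves the combined output in the log file, which can then be posted to the issue.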


@jdblns commented on GitHub (Mar 17, 2026):

2026-03-17T13:58:13.283995+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.283Z level=INFO source=routes.go:1727 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES:0 HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:DEBUG-4 OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:13624 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY:rocm OLLAMA_LOAD_TIMEOUT:1m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
2026-03-17T13:58:13.284080+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.283Z level=INFO source=routes.go:1729 msg="Ollama cloud disabled: false"
2026-03-17T13:58:13.284208+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=INFO source=images.go:477 msg="total blobs: 8"
2026-03-17T13:58:13.284266+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=INFO source=images.go:484 msg="total unused blobs removed: 0"
2026-03-17T13:58:13.284394+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=INFO source=routes.go:1782 msg="Listening on [::]:13624 (version 0.18.0)"
2026-03-17T13:58:13.284572+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=DEBUG source=sched.go:145 msg="starting llm scheduler"
2026-03-17T13:58:13.284747+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
2026-03-17T13:58:13.284787+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=WARN source=runner.go:485 msg="user overrode visible devices" HIP_VISIBLE_DEVICES=0
2026-03-17T13:58:13.284809+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=WARN source=runner.go:489 msg="if GPUs are not correctly discovered, unset and try again"
2026-03-17T13:58:13.284825+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=DEBUG source=runner.go:98 msg="skipping available library at user's request" requested=rocm libDir=/usr/local/lib/ollama/cuda_v12
2026-03-17T13:58:13.284836+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=DEBUG source=runner.go:98 msg="skipping available library at user's request" requested=rocm libDir=/usr/local/lib/ollama/cuda_v13
2026-03-17T13:58:13.284850+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.284Z level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extraEnvs=map[]
2026-03-17T13:58:13.285588+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.285Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 34259"
2026-03-17T13:58:13.285779+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.285Z level=DEBUG source=server.go:431 msg=subprocess PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/snap/bin OLLAMA_GPU_INFO="{\"library\":\"rocm\",\"vram\":103079215104,\"compute\":\"gfx1151\"}" OLLAMA_VULKAN=0 GGML_CUDA_INIT=0 OLLAMA_INTEL_GPU=0 HIP_VISIBLE_DEVICES=0 OLLAMA_MAX_VRAM=98304 OLLAMA_DEBUG=2 OLLAMA_SKIP_GPU_CHECK=1 OLLAMA_LLM_LIBRARY=rocm OLLAMA_LOAD_TIMEOUT=1m OLLAMA_MODELS=/home/ollama/models OLLAMA_HOST=0.0.0.0:13624 ROCM_VVM_ENABLE=1 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm:/opt/rocm/lib:/usr/local/lib/ollama/rocm OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm
2026-03-17T13:58:13.294054+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.294Z level=INFO source=runner.go:1411 msg="starting ollama engine"
2026-03-17T13:58:13.294184+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.294Z level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:34259"
2026-03-17T13:58:13.296715+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=gguf.go:604 msg=general.architecture type=string
2026-03-17T13:58:13.296727+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
2026-03-17T13:58:13.296741+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
2026-03-17T13:58:13.296777+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
2026-03-17T13:58:13.296796+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
2026-03-17T13:58:13.296806+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
2026-03-17T13:58:13.296818+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
2026-03-17T13:58:13.296831+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.296Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
2026-03-17T13:58:13.300580+00:00 horse ollama[1917]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-icelake.so
2026-03-17T13:58:13.300612+00:00 horse ollama[1917]: time=2026-03-17T13:58:13.300Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/rocm
2026-03-17T13:58:13.467398+00:00 horse ollama[1917]: ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
2026-03-17T13:58:13.467495+00:00 horse ollama[1917]: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
2026-03-17T13:58:13.467522+00:00 horse ollama[1917]: ggml_cuda_init: found 1 ROCm devices:
2026-03-17T13:58:13.467545+00:00 horse ollama[1917]: ggml_cuda_init: initializing rocBLAS on device 0
2026-03-17T13:57:34.457263+00:00 horse kernel: message repeated 2 times: [ ath12k_pci 0000:c2:00.0: received scan start failure event]
2026-03-17T13:58:13.471179+00:00 horse kernel: gmc_v11_0_process_interrupt: 114 callbacks suppressed
2026-03-17T13:58:13.471196+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:153 vmid:8 pasid:32770)
2026-03-17T13:58:13.471197+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:  Process ollama pid 1932 thread ollama pid 1938
2026-03-17T13:58:13.471197+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:   in page starting at address 0x00007ba6e668a000 from client 10
2026-03-17T13:58:13.471198+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00800932
2026-03-17T13:58:13.471200+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      Faulty UTCL2 client ID: CPF (0x4)
2026-03-17T13:58:13.471201+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      MORE_FAULTS: 0x0
2026-03-17T13:58:13.471201+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      WALKER_ERROR: 0x1
2026-03-17T13:58:13.471202+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
2026-03-17T13:58:13.471202+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      MAPPING_ERROR: 0x1
2026-03-17T13:58:13.471203+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      RW: 0x0
2026-03-17T13:58:36.695702+00:00 horse wpa_supplicant[1131]: wlp194s0: WNM: Preferred List Available
2026-03-17T13:58:36.696589+00:00 horse kernel: ath12k_pci 0000:c2:00.0: received scan start failure event
2026-03-17T13:58:43.315160+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.314Z level=INFO source=runner.go:464 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs=map[] error="failed to finish discovery before timeout"
2026-03-17T13:58:43.315230+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.314Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" devices=[]
2026-03-17T13:58:43.315291+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.314Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=30.03013886s OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs=map[]
2026-03-17T13:58:43.315329+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.314Z level=DEBUG source=runner.go:98 msg="skipping available library at user's request" requested=rocm libDir=/usr/local/lib/ollama/vulkan
2026-03-17T13:58:43.315340+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.314Z level=DEBUG source=runner.go:124 msg="evaluating which, if any, devices to filter out" initial_count=0
2026-03-17T13:58:43.315352+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.315Z level=TRACE source=runner.go:174 msg="supported GPU library combinations before filtering" supported=map[]
2026-03-17T13:58:43.315361+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.315Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=30.030502875s
2026-03-17T13:58:43.315423+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.315Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.0 GiB" available="28.4 GiB"
2026-03-17T13:58:43.315451+00:00 horse ollama[1917]: time=2026-03-17T13:58:43.315Z level=INFO source=routes.go:1832 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096
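
For anyone triaging a similar log, the lines worth finding are the `failure during GPU discovery` message and its `error` field. Below is a minimal sketch of pulling those out, assuming the `key=value` layout seen in the log above. The helper names `parse_fields` and `discovery_failed` are hypothetical and are not part of Ollama.

```python
import re

# One key=value pair. The value is either a double-quoted string
# (backslash escapes allowed) or a run of non-space characters.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> dict:
    """Return the key=value fields of one log line as a dict (hypothetical helper)."""
    fields = {}
    for key, value in PAIR.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def discovery_failed(lines) -> bool:
    """True if any line carries the GPU discovery failure message."""
    return any(
        parse_fields(line).get("msg") == "failure during GPU discovery"
        for line in lines
    )
```

The syslog prefix (timestamp, host, `ollama[1917]:`) contains no `=` and is ignored by the pattern, so the function can be fed journal lines unchanged.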

@rick-github commented on GitHub (Mar 17, 2026):

env -i HOME=$HOME OLLAMA_DEBUG=2 /usr/local/bin/ollama serve
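
`env -i` starts the command with an empty environment, so only `HOME` and `OLLAMA_DEBUG` reach `ollama serve` and none of the service's overrides apply. As a rough sketch of what that removes, the snippet below lists the variable names from the runner environment in the service log above that the clean command does not pass through. The helper names are hypothetical, and the list makes no claim about which variable, if any, causes the failure.

```python
# Runner environment copied from the "msg=subprocess" line of the service log
# above, limited to the short assignments (paths and OLLAMA_GPU_INFO omitted).
SERVICE_ENV = (
    "OLLAMA_VULKAN=0 GGML_CUDA_INIT=0 OLLAMA_INTEL_GPU=0 HIP_VISIBLE_DEVICES=0 "
    "OLLAMA_MAX_VRAM=98304 OLLAMA_DEBUG=2 OLLAMA_SKIP_GPU_CHECK=1 "
    "OLLAMA_LLM_LIBRARY=rocm OLLAMA_LOAD_TIMEOUT=1m ROCM_VVM_ENABLE=1"
)

# Names the `env -i HOME=$HOME OLLAMA_DEBUG=2 ...` command sets explicitly.
KEPT = {"HOME", "OLLAMA_DEBUG"}


def parse_assignments(text: str) -> dict:
    """Split space-separated NAME=value tokens into a dict (hypothetical helper)."""
    env = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        if sep:  # skip tokens that are not assignments
            env[name] = value
    return env


def dropped_by_clean_run(service_env: dict, kept: set) -> list:
    """Names set for the service that the clean run does not pass through."""
    return sorted(name for name in service_env if name not in kept)


if __name__ == "__main__":
    for name in dropped_by_clean_run(parse_assignments(SERVICE_ENV), KEPT):
        print(name)
```

If the GPU is discovered under the clean environment, the printed names are the candidates to remove from the service configuration one at a time.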

@jdblns commented on GitHub (Mar 17, 2026):

$ env -i HOME=$HOME OLLAMA_DEBUG=2 /usr/local/bin/ollama serve
time=2026-03-17T14:28:55.106Z level=INFO source=routes.go:1727 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:DEBUG-4 OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/tysy/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-03-17T14:28:55.106Z level=INFO source=routes.go:1729 msg="Ollama cloud disabled: false"
time=2026-03-17T14:28:55.107Z level=INFO source=images.go:477 msg="total blobs: 0"
time=2026-03-17T14:28:55.107Z level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2026-03-17T14:28:55.107Z level=INFO source=routes.go:1782 msg="Listening on 127.0.0.1:11434 (version 0.18.0)"
time=2026-03-17T14:28:55.107Z level=DEBUG source=sched.go:145 msg="starting llm scheduler"
time=2026-03-17T14:28:55.107Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-03-17T14:28:55.107Z level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" extraEnvs=map[]
time=2026-03-17T14:28:55.108Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 45729"
time=2026-03-17T14:28:55.108Z level=DEBUG source=server.go:431 msg=subprocess OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v13 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v13
time=2026-03-17T14:28:55.118Z level=INFO source=runner.go:1411 msg="starting ollama engine"
time=2026-03-17T14:28:55.118Z level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:45729"
time=2026-03-17T14:28:55.120Z level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-03-17T14:28:55.120Z level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-03-17T14:28:55.120Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.120Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.120Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.file_type default=0
time=2026-03-17T14:28:55.120Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
time=2026-03-17T14:28:55.120Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
time=2026-03-17T14:28:55.120Z level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-03-17T14:28:55.120Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-icelake.so
time=2026-03-17T14:28:55.124Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/cuda_v13
time=2026-03-17T14:28:55.126Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.pooling_type default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.expert_count default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.embedding_length default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.key_length default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2026-03-17T14:28:55.126Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2026-03-17T14:28:55.126Z level=DEBUG source=runner.go:1386 msg="dummy model load took" duration=6.564176ms
time=2026-03-17T14:28:55.126Z level=DEBUG source=runner.go:1391 msg="gathering device infos took" duration=563ns
time=2026-03-17T14:28:55.126Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" devices=[]
time=2026-03-17T14:28:55.126Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=18.849389ms OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" extra_envs=map[]
time=2026-03-17T14:28:55.126Z level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extraEnvs=map[]
time=2026-03-17T14:28:55.126Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 38207"
time=2026-03-17T14:28:55.126Z level=DEBUG source=server.go:431 msg=subprocess OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm
time=2026-03-17T14:28:55.134Z level=INFO source=runner.go:1411 msg="starting ollama engine"
time=2026-03-17T14:28:55.135Z level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:38207"
time=2026-03-17T14:28:55.137Z level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-03-17T14:28:55.137Z level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-03-17T14:28:55.137Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.137Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.137Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.file_type default=0
time=2026-03-17T14:28:55.137Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
time=2026-03-17T14:28:55.137Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
time=2026-03-17T14:28:55.137Z level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-03-17T14:28:55.137Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-icelake.so
time=2026-03-17T14:28:55.141Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/rocm
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, ID: 0
load_backend: loaded ROCm backend from /usr/local/lib/ollama/rocm/libggml-hip.so
time=2026-03-17T14:28:55.315Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 ROCm.0.NO_VMM=1 ROCm.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.pooling_type default=0
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.expert_count default=0
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.315Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.embedding_length default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.key_length default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2026-03-17T14:28:55.316Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2026-03-17T14:28:55.316Z level=DEBUG source=runner.go:1386 msg="dummy model load took" duration=178.651667ms
ggml_hip_get_device_memory searching for device 0000:c5:00.0
ggml_backend_cuda_device_get_memory device 0000:c5:00.0 utilizing AMD specific memory reporting free: 102082875392 total: 102271614976
time=2026-03-17T14:28:55.316Z level=DEBUG source=runner.go:1391 msg="gathering device infos took" duration=494.69µs
time=2026-03-17T14:28:55.316Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" devices="[{DeviceID:{ID:0 Library:ROCm} Name:ROCm0 Description:Radeon 8060S Graphics FilterID: Integrated:true PCIID:0000:c5:00.0 TotalMemory:102271614976 FreeMemory:102082875392 ComputeMajor:17 ComputeMinor:81 DriverMajor:70226 DriverMinor:1 LibraryPath:[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]}]"
time=2026-03-17T14:28:55.317Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=190.333927ms OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs=map[]
time=2026-03-17T14:28:55.317Z level=INFO source=runner.go:106 msg="experimental Vulkan support disabled.  To enable, set OLLAMA_VULKAN=1"
time=2026-03-17T14:28:55.317Z level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" extraEnvs=map[]
time=2026-03-17T14:28:55.317Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 45877"
time=2026-03-17T14:28:55.317Z level=DEBUG source=server.go:431 msg=subprocess OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v12 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v12
time=2026-03-17T14:28:55.326Z level=INFO source=runner.go:1411 msg="starting ollama engine"
time=2026-03-17T14:28:55.327Z level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:45877"
time=2026-03-17T14:28:55.328Z level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-03-17T14:28:55.328Z level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-03-17T14:28:55.328Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.328Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.328Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.file_type default=0
time=2026-03-17T14:28:55.328Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
time=2026-03-17T14:28:55.328Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
time=2026-03-17T14:28:55.328Z level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-03-17T14:28:55.328Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-icelake.so
time=2026-03-17T14:28:55.332Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/cuda_v12
time=2026-03-17T14:28:55.334Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.pooling_type default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.expert_count default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.embedding_length default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.key_length default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2026-03-17T14:28:55.334Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2026-03-17T14:28:55.334Z level=DEBUG source=runner.go:1386 msg="dummy model load took" duration=6.101799ms
time=2026-03-17T14:28:55.334Z level=DEBUG source=runner.go:1391 msg="gathering device infos took" duration=291ns
time=2026-03-17T14:28:55.334Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" devices=[]
time=2026-03-17T14:28:55.334Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=17.608304ms OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" extra_envs=map[]
time=2026-03-17T14:28:55.334Z level=DEBUG source=runner.go:124 msg="evaluating which, if any, devices to filter out" initial_count=1
time=2026-03-17T14:28:55.334Z level=DEBUG source=runner.go:146 msg="verifying if device is supported" library=/usr/local/lib/ollama/rocm description="Radeon 8060S Graphics" compute=gfx1151 id=0 pci_id=0000:c5:00.0
time=2026-03-17T14:28:55.334Z level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extraEnvs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]"
time=2026-03-17T14:28:55.334Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 39173"
time=2026-03-17T14:28:55.334Z level=DEBUG source=server.go:431 msg=subprocess OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm GGML_CUDA_INIT=1 ROCR_VISIBLE_DEVICES=0
time=2026-03-17T14:28:55.343Z level=INFO source=runner.go:1411 msg="starting ollama engine"
time=2026-03-17T14:28:55.343Z level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:39173"
time=2026-03-17T14:28:55.345Z level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-03-17T14:28:55.345Z level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-03-17T14:28:55.345Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.345Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-17T14:28:55.345Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.file_type default=0
time=2026-03-17T14:28:55.345Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
time=2026-03-17T14:28:55.345Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
time=2026-03-17T14:28:55.345Z level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-03-17T14:28:55.345Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-icelake.so
time=2026-03-17T14:28:55.349Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/rocm
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
ggml_cuda_init: initializing rocBLAS on device 0
time=2026-03-17T14:29:25.335Z level=INFO source=runner.go:464 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]" error="failed to finish discovery before timeout"
time=2026-03-17T14:29:25.335Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" devices=[]
time=2026-03-17T14:29:25.336Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=30.001273733s OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]"
time=2026-03-17T14:29:25.336Z level=DEBUG source=runner.go:153 msg="filtering device which didn't fully initialize" id=0 libdir=/usr/local/lib/ollama/rocm pci_id=0000:c5:00.0 library=ROCm
time=2026-03-17T14:29:25.336Z level=TRACE source=runner.go:174 msg="supported GPU library combinations before filtering" supported=map[]
time=2026-03-17T14:29:25.336Z level=TRACE source=runner.go:183 msg="removing unsupported or overlapping GPU combination" libDir=/usr/local/lib/ollama/rocm description="Radeon 8060S Graphics" compute=gfx1151 pci_id=0000:c5:00.0
time=2026-03-17T14:29:25.336Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=30.228668583s
time=2026-03-17T14:29:25.336Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.5 GiB" available="60.2 GiB"
time=2026-03-17T14:29:25.336Z level=INFO source=routes.go:1832 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096

/var/log/syslog

2026-03-17T14:28:28.823059+00:00 horse ollama[1181]: time=2026-03-17T14:28:28.822Z level=DEBUG source=runner.go:124 msg="evaluating which, if any, devices to filter out" initial_count=0
2026-03-17T14:28:28.823074+00:00 horse ollama[1181]: time=2026-03-17T14:28:28.822Z level=TRACE source=runner.go:174 msg="supported GPU library combinations before filtering" supported=map[]
2026-03-17T14:28:28.823089+00:00 horse ollama[1181]: time=2026-03-17T14:28:28.822Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=30.030429611s
2026-03-17T14:28:28.826475+00:00 horse ollama[1181]: time=2026-03-17T14:28:28.826Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.5 GiB" available="60.1 GiB"
2026-03-17T14:28:28.826499+00:00 horse ollama[1181]: time=2026-03-17T14:28:28.826Z level=INFO source=routes.go:1832 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096
2026-03-17T14:28:28.988058+00:00 horse systemd[1]: systemd-timedated.service: Deactivated successfully.
2026-03-17T14:28:37.334306+00:00 horse wpa_supplicant[1127]: wlp194s0: WNM: Preferred List Available
2026-03-17T14:28:37.336360+00:00 horse kernel: ath12k_pci 0000:c2:00.0: received scan start failure event
2026-03-17T14:28:55.298448+00:00 horse kernel: amdxdna 0000:c6:00.1: [drm] *ERROR* amdxdna_drm_open: SVA bind device failed, ret -19
2026-03-17T14:28:55.386318+00:00 horse kernel: amdxdna 0000:c6:00.1: [drm] *ERROR* amdxdna_drm_open: SVA bind device failed, ret -19
2026-03-17T14:28:55.400104+00:00 horse kernel: gmc_v11_0_process_interrupt: 114 callbacks suppressed
2026-03-17T14:28:55.400116+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:153 vmid:8 pasid:32770)
2026-03-17T14:28:55.400117+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:  Process ollama pid 1841 thread ollama pid 1843
2026-03-17T14:28:55.400118+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:   in page starting at address 0x0000703b0a281000 from client 10
2026-03-17T14:28:55.400120+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00800932
2026-03-17T14:28:55.400120+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      Faulty UTCL2 client ID: CPF (0x4)
2026-03-17T14:28:55.400121+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      MORE_FAULTS: 0x0
2026-03-17T14:28:55.400122+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      WALKER_ERROR: 0x1
2026-03-17T14:28:55.400123+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
2026-03-17T14:28:55.400123+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      MAPPING_ERROR: 0x1
2026-03-17T14:28:55.400123+00:00 horse kernel: amdgpu 0000:c5:00.0: amdgpu:      RW: 0x0
2026-03-17T14:29:07.538608+00:00 horse kernel: ath12k_pci 0000:c2:00.0: received scan start failure event
2026-03-17T14:29:07.536684+00:00 horse wpa_supplicant[1127]: wlp194s0: WNM: Preferred List Available
2026-03-17T14:29:24.491386+00:00 horse systemd[1]: Started session-3.scope - Session 3 of User blns.
2026-03-17T14:30:05.630735+00:00 horse systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
2026-03-17T14:30:05.637165+00:00 horse systemd[1]: sysstat-collect.service: Deactivated successfully.
2026-03-17T14:30:05.637388+00:00 horse systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
2026-03-17T14:30:24.204880+00:00 horse kernel: wlp194s0: AP 64:61:40:19:99:a8 changed bandwidth in beacon, new used config is 2437.000 MHz, width 2 (2447.000/0 MHz)
2026-03-17T14:30:32.789745+00:00 horse wpa_supplicant[1127]: wlp194s0: WNM: Preferred List Available
2026-03-17T14:30:32.791395+00:00 horse kernel: ath12k_pci 0000:c2:00.0: received scan start failure event
Author
Owner

@rick-github commented on GitHub (Mar 17, 2026):

Searching for "amdxdna_drm_open: SVA bind device failed" returns a lot of results, and it appears to be a kernel issue. It's not clear which version of ROCm you are using; if it isn't 7.x, try upgrading. You can also try an alternative kernel or run the Vulkan driver:

env -i HOME=$HOME OLLAMA_DEBUG=2 OLLAMA_VULKAN=1 OLLAMA_LLM_LIBRARY=vulkan /usr/local/bin/ollama serve
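
For readers unfamiliar with `env -i`: it starts the process with an empty environment, so only the variables named on the command line reach the server. A minimal Python sketch of the same launch, assuming the binary path from the command above; the helper names `vulkan_env` and `launch` are made up for illustration:

```python
import os
import subprocess

OLLAMA_BIN = "/usr/local/bin/ollama"  # path taken from the command above


def vulkan_env(home: str) -> dict:
    """Return only the variables that the env -i command passes through."""
    return {
        "HOME": home,
        "OLLAMA_DEBUG": "2",
        "OLLAMA_VULKAN": "1",
        "OLLAMA_LLM_LIBRARY": "vulkan",
    }


def launch() -> subprocess.CompletedProcess:
    """Run `ollama serve` with a replaced (not inherited) environment."""
    # Passing env= replaces the parent's environment, which mirrors env -i.
    return subprocess.run(
        [OLLAMA_BIN, "serve"],
        env=vulkan_env(os.environ.get("HOME", "/")),
        check=False,
    )
```

Calling `launch()` blocks until the server exits, so it is only meant for an interactive debugging session.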
Author
Owner

@jdblns commented on GitHub (Mar 17, 2026):

Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=device.go:245 msg="model weights" device=CPU size="170.2 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=device.go:251 msg="kv cache" device=ROCm0 size="19.3 GiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=device.go:262 msg="compute graph" device=ROCm0 size="665.6 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=device.go:267 msg="compute graph" device=CPU size="4.0 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=device.go:272 msg="total memory" size="37.7 GiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=server.go:782 msg=memory success=true required.InputWeights=178421760 required.CPU.Graph=4194304 required.ROCm0.ID=0 required.ROCm0.Weights="[58693376 416184320 416184320 363468800 415373312 364279808 363468800 415373312 364279808 363468800 415373312 364279808 363468800 415373312 364279808 363468800 415373312 364279808 363468800 415373312 364279808 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 416184320 260206592]" required.ROCm0.Cache="[441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 441188352 0]" required.ROCm0.Graph=697962112
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=server.go:976 msg="available gpu" id=0 library=ROCm "available layer vram"="93.7 GiB" backoff=0.00 minimum="457.0 MiB" overhead="0 B" graph="665.6 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=DEBUG source=server.go:793 msg="new layout created" layers="48[ID:0 Layers:48(0..47)]"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=INFO source=runner.go:1284 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:202752 KvCacheType: NumThreads:16 GPULayers:48[ID:0 Layers:48(0..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=INFO source=ggml.go:482 msg="offloading 47 repeating layers to GPU"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=INFO source=ggml.go:489 msg="offloading output layer to GPU"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=INFO source=ggml.go:494 msg="offloaded 48/48 layers to GPU"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=INFO source=device.go:240 msg="model weights" device=ROCm0 size="17.5 GiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.837Z level=INFO source=device.go:245 msg="model weights" device=CPU size="170.2 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=device.go:251 msg="kv cache" device=ROCm0 size="19.3 GiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=device.go:262 msg="compute graph" device=ROCm0 size="665.6 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=device.go:267 msg="compute graph" device=CPU size="4.0 MiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=device.go:272 msg="total memory" size="37.7 GiB"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=sched.go:565 msg="loaded runners" count=1
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=server.go:1350 msg="waiting for llama runner to start responding"
Mar 17 14:42:10 horse ollama[68153]: time=2026-03-17T14:42:10.838Z level=INFO source=server.go:1384 msg="waiting for server to become available" status="llm server loading model"
Mar 17 14:42:11 horse ollama[68153]: time=2026-03-17T14:42:11.090Z level=DEBUG source=server.go:1394 msg="model load progress 0.23"
Mar 17 14:42:11 horse ollama[68153]: time=2026-03-17T14:42:11.341Z level=DEBUG source=server.go:1394 msg="model load progress 0.46"
Mar 17 14:42:11 horse ollama[68153]: time=2026-03-17T14:42:11.592Z level=DEBUG source=server.go:1394 msg="model load progress 0.69"
Mar 17 14:42:11 horse ollama[68153]: time=2026-03-17T14:42:11.843Z level=DEBUG source=server.go:1394 msg="model load progress 0.94"
Mar 17 14:42:11 horse ollama[68153]: time=2026-03-17T14:42:11.913Z level=DEBUG source=ggml.go:324 msg="key with type not found" key=glm4moelite.pooling_type default=0
Mar 17 14:42:12 horse ollama[68153]: time=2026-03-17T14:42:12.095Z level=INFO source=server.go:1388 msg="llama runner started in 2.60 seconds"
Mar 17 14:42:12 horse ollama[68153]: time=2026-03-17T14:42:12.095Z level=DEBUG source=sched.go:577 msg="finished setting up" runner.name=registry.ollama.ai/library/glm-4.7-flash:q4_K_M runner.inference="[{ID:0 Library:ROCm}]" runner.size="37.7 GiB" runner.vram="37.7 GiB" runner.parallel=1 runner.pid=68226 runner.model=/home/ollama/models/blobs/sha256-9eba2761cf0b88b8bc11a065a7b5b47f1b13ce820e8e492cb1010b450f9ec950 runner.num_ctx=262144
Mar 17 14:42:12 horse ollama[68153]: time=2026-03-17T14:42:12.114Z level=DEBUG source=server.go:1536 msg="completion request" images=0 prompt=76 format=""
Mar 17 14:42:12 horse ollama[68153]: time=2026-03-17T14:42:12.128Z level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=0 prompt=13 used=0 remaining=13
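
The byte counts in the `msg=memory` line can be cross-checked against the human-readable sizes in the log. A small sketch using only numbers copied from the lines above (the variable names are mine):

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Copied from required.ROCm0.Cache and required.ROCm0.Graph in the log above.
kv_bytes_per_layer = 441188352   # reported for 47 of the 48 entries
kv_layers = 47                   # the last entry in the list is 0
graph_bytes = 697962112

kv_total = kv_bytes_per_layer * kv_layers

print(f"kv cache: {kv_total / GIB:.1f} GiB")          # kv cache: 19.3 GiB
print(f"compute graph: {graph_bytes / MIB:.1f} MiB")  # compute graph: 665.6 MiB
```

Both results agree with the `kv cache` and `compute graph` lines for ROCm0, which suggests the allocation plan itself is consistent in this run.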

I followed the method Gemini gave me:

Critical Log Analysis
This log is vital as it reveals a deep-seated conflict: the AMD NPU (XDNA) driver and the GPU (amdgpu) driver are competing for resources on the highly integrated Strix Halo chip, leading to a Shared Virtual Addressing (SVA) binding failure.

The "smoking gun" is this specific error line:

amdxdna 0000:c6:00.1: [drm] *ERROR* amdxdna_drm_open: SVA bind device failed, ret -19

This indicates that the system's IOMMU/SVA failed while attempting to establish memory mapping for the NPU. Consequently, when the GPU's rocBLAS tried to initialize immediately after, it triggered the PERMISSION_FAULTS: 0x3.

Since your attempt using env -i was unsuccessful, we must implement a more thorough isolation at the system level. Please follow these steps:

  1. Disable the NPU Driver (Eliminate Interference)
    On the Strix Halo architecture, the NPU and GPU share the same memory bus. Current Linux drivers (specifically amdxdna) often cause IOMMU state exceptions when handling large memory allocations exceeding 64GB.

Blacklist the NPU driver to see if it restores GPU functionality:

Create a blacklist file: sudo nano /etc/modprobe.d/blacklist-xdna.conf

Add the following content: blacklist amdxdna

Save and update: sudo update-initramfs -u

Reboot your computer.

  2. Corrected Launch Command (With Environment Variables)
    Your previous env -i command wiped all environment variables, which prevents the gfx1151 from being identified (Ollama requires HSA_OVERRIDE_GFX_VERSION for proper spoofing).

After rebooting, use this full command to start manually:

# Stop the background service
sudo systemctl stop ollama

# Manually inject key variables and start
export HSA_OVERRIDE_GFX_VERSION=11.5.1
export HSA_XNACK=0
export OLLAMA_DEBUG=2
export ROCR_VISIBLE_DEVICES=0

/usr/local/bin/ollama serve

  3. Check IOMMU Status
    If the PERMISSION_FAULTS persist, verify if iommu=off is actually active.

Run: cat /proc/cmdline

Confirm: Check if iommu=off appears in the output.

If it shows iommu=on or iommu=pt, you must go into your BIOS settings, locate the "IOMMU" option, and set it to Disabled.

Note: On some Strix Halo firmware, even if you specify off in Grub, the kernel may still attempt to initialize DMAR tables if IOMMU is enabled in BIOS, obstructing the 40CU GPU from accessing memory.

  4. Why is 102GB of VRAM being displayed?
    There is an interesting line in your log:
    free: 102082875392 total: 102271614976

This shows that the driver actually detects over 95GB of available space. This exceeds your 64GB BIOS setting because the driver is attempting to use the system's GTT (Graphics Translation Table) shared space.

The current bottleneck: The driver "sees" the memory, but because the SVA bind device failed, it doesn't have the "knife" (access permissions) to use it.
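
The "102GB" and "over 95GB" figures in the quoted advice refer to the same byte counts expressed in decimal and binary units. A quick sketch of the conversion, using the two values from the quoted log line:

```python
free_bytes = 102082875392
total_bytes = 102271614976


def gb(n: int) -> float:
    """Decimal gigabytes (10^9 bytes)."""
    return n / 10 ** 9


def gib(n: int) -> float:
    """Binary gibibytes (2^30 bytes)."""
    return n / 2 ** 30


print(f"free:  {gb(free_bytes):.1f} GB = {gib(free_bytes):.2f} GiB")    # free:  102.1 GB = 95.07 GiB
print(f"total: {gb(total_bytes):.1f} GB = {gib(total_bytes):.2f} GiB")  # total: 102.3 GB = 95.25 GiB
```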

Summary of Recommendations
Blacklist amdxdna (the XDNA driver is currently extremely unstable with large memory mappings).

Disable IOMMU in BIOS (if the option is available).

Relaunch using the command with the specified environment variables.

Once you have blacklisted the driver and rebooted, please run ollama serve again. If the error message changes (e.g., it is no longer a Permission Fault), let me know!
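
Whether the blacklist and IOMMU steps above actually took effect can be checked after a reboot. The sketch below only reports state and does not imply the quoted advice is correct. The function names, the sample strings, and the choice to look for `iommu` and `amd_iommu` keys are my assumptions, not something from the thread:

```python
def module_loaded(proc_modules_text: str, name: str) -> bool:
    """True if `name` is the first field of any line of /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


def iommu_params(cmdline: str) -> dict:
    """Collect iommu-related key=value tokens from /proc/cmdline text."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ("iommu", "amd_iommu"):
            params[key] = value
    return params


def read_text(path: str) -> str:
    """Read a small procfs file as text."""
    with open(path, encoding="utf-8") as fh:
        return fh.read()


# Hypothetical samples for illustration, not taken from the thread.
sample_modules = "amdgpu 19000000 42 - Live 0x0000000000000000\n"
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro iommu=pt quiet"

print(module_loaded(sample_modules, "amdxdna"))
print(iommu_params(sample_cmdline))
```

On a real system, pass `read_text("/proc/modules")` and `read_text("/proc/cmdline")` instead of the samples.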

Author
Owner

@slojosic-amd commented on GitHub (Mar 20, 2026):

https://rocm.docs.amd.com/en/latest/how-to/system-optimization/strixhalo.html

Image: https://github.com/user-attachments/assets/bbfec8c7-47d4-4b83-ba9b-bee789db3664
Author
Owner

@gangrif commented on GitHub (Mar 22, 2026):

I've run into this issue tonight on a new installation on a Framework Desktop.

After several hours muddling through this with assistance from Claude, I have the following info:

  • My system is RHEL 10.1, running in bootc/Image Mode, ollama is in a container under podman.
  • The rocBLAS crash produces a SIGSEGV at addr=0x34
  • We worked around it using ollama/ollama:latest with OLLAMA_VULKAN=true instead of the :rocm image

Here is some of the info Claude and I gathered while troubleshooting, in case it's helpful.

Hardware: Framework Desktop, AMD Ryzen AI Max 385 (Strix Halo), 32GB unified memory, RHEL 10 image mode, Podman container using ollama/ollama:rocm (Ollama 0.18.2, ROCm 7.2.0)

GPU is correctly detected by the kernel:

[ 6.520007] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 6.520015] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[ 6.520417] kfd kfd: amdgpu: added device 1002:1586

HSA topology correctly identifies the device as gfx1151:

gfx_target_version 110501
simd_count 64

The ROCm container (7.2.0) does include compiled gfx1151 kernels:

/usr/lib/ollama/rocm/rocblas/library/Kernels.so-000-gfx1151.hsaco
/usr/lib/ollama/rocm/rocblas/library/TensileLibrary_..._fallback_gfx1151.hsaco

The actual failure is a SIGSEGV during rocBLAS initialization:

ggml_cuda_init: found 1 ROCm devices:
ggml_cuda_init: initializing rocBLAS on device 0
SIGSEGV: segmentation violation
PC=0x7ff0c88f8170 m=0 sigcode=1 addr=0x34
signal arrived during cgo execution

goroutine 1 gp=0xc000002380 m=0 mp=0x55ea353b5c80 [syscall]:
runtime.cgocall(0x55ea3404f800, 0xc00029f710)
github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_load_all_from_path
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1.1

addr=0x34 is a null pointer dereference (offset 52 from null) inside rocBLAS handle initialization.
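
The offset can be read straight from the Go runtime's fault line. A small sketch that parses the line quoted above; `parse_fault_line` is a name I made up for illustration:

```python
import re


def parse_fault_line(line: str) -> dict:
    """Parse 'PC=0x... m=0 sigcode=1 addr=0x34' into a dict of integers."""
    pairs = re.findall(r"(\w+)=(0x[0-9a-fA-F]+|\d+)", line)
    # int(value, 0) accepts both decimal and 0x-prefixed hexadecimal.
    return {key: int(value, 0) for key, value in pairs}


fields = parse_fault_line("PC=0x7ff0c88f8170 m=0 sigcode=1 addr=0x34")
print(fields["addr"])  # 52
```

A fault address this close to zero is typical of reading a struct field through a null pointer, which matches the interpretation given above.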

Things that did NOT fix the ROCm crash:

HSA_OVERRIDE_GFX_VERSION=11.0.0 or 11.5.1
GGML_HIP_UMA=1
HSA_ENABLE_UNIFIED_MEMORY=1
Ulimit=memlock=-1:-1
Increasing UMA Frame Buffer to 16GB in BIOS (confirms memory size is not the root cause)
SELinux permissive mode

Memory state with 16GB UMA Frame Buffer:

mem_info_vram_total: 17179869184 (16GB)
mem_info_gtt_total: 16447606784 (~15.3GB)
heap_type 1 (system/unified memory)

Workaround: Switch from ollama/ollama:rocm to ollama/ollama:latest with OLLAMA_VULKAN=true. The standard image includes the RADV Vulkan backend (libvulkan_radeon.so, radeon_icd.json), which works correctly on gfx1151.

Result with Vulkan:

NAME ID SIZE PROCESSOR CONTEXT
llama3.1:8b 46e0c10c039e 11 GB 100% GPU 32768

Author
Owner

@fdff87554 commented on GitHub (Apr 8, 2026):

Sharing my experience on the same hardware (Ryzen AI MAX+ 395 / Radeon 8060S / gfx1151) in case it's useful for anyone here. I ran into the same crash with ollama/ollama:0.20.3-rocm on Ubuntu 24.04 in Docker, and want to share what ended up working for me.

What I saw

The bootstrap discovery pass detected the GPU correctly:

verifying if device is supported library=/usr/lib/ollama/rocm
description="Radeon 8060S Graphics" compute=gfx1151 id=0

But the second runner (the one started with GGML_CUDA_INIT=1) exited within ~150 ms:

runner.go:464 msg="failure during GPU discovery"
extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]"
error="runner crashed"
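
When scanning a long debug log for this failure, it can help to split each line into key/value fields. A minimal sketch, assuming the lines follow the `key=value` / `key="quoted value"` shape shown above; `parse_logfmt` is an illustrative name, not an Ollama API:

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict, honouring double quotes."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip bare tokens such as "runner.go:464"
            fields[key] = value
    return fields


line = (
    'runner.go:464 msg="failure during GPU discovery" '
    'extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]" '
    'error="runner crashed"'
)
fields = parse_logfmt(line)
print(fields["msg"], "->", fields["error"])
```

The `error` field distinguishes this case ("runner crashed") from the 30-second case reported earlier in the thread ("failed to finish discovery before timeout").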

What I tried before finding the fix

To narrow down which call was failing, I built small C tests against the libraries bundled in the container (/usr/lib/ollama/rocm/). In my environment:

| Call | Result |
|---|---|
| hipInit / hipGetDeviceCount / hipSetDevice | OK |
| hipGetDeviceProperties (Radeon 8060S Graphics) | OK |
| hipMemGetInfo (98148 / 98304 MB) | OK |
| hipMalloc(64MB) / hipFree | OK |
| hipblasCreate | OK |
| hipblasLtCreate | crashes with Memory access fault by GPU node-1 ... Page not present or supervisor privilege |
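
For anyone who wants to repeat the first rows of this table without a compiler, here is a rough Python `ctypes` sketch. It is not the C test code used for the table; it covers only `hipInit` and `hipGetDeviceCount`, and the default library name `libamdhip64.so` is an assumption about how the HIP runtime is installed. It does not reach `hipblasLtCreate`, which is where the crash occurred.

```python
import ctypes


def probe_hip(lib_name: str = "libamdhip64.so"):
    """Return the HIP device count, or None if the runtime fails to load or init."""
    try:
        hip = ctypes.CDLL(lib_name)
    except OSError:
        return None  # library not found or not loadable

    # hipError_t hipInit(unsigned int flags)
    hip.hipInit.argtypes = [ctypes.c_uint]
    hip.hipInit.restype = ctypes.c_int
    # hipError_t hipGetDeviceCount(int* count)
    hip.hipGetDeviceCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
    hip.hipGetDeviceCount.restype = ctypes.c_int

    if hip.hipInit(0) != 0:  # 0 is hipSuccess
        return None
    count = ctypes.c_int(0)
    if hip.hipGetDeviceCount(ctypes.byref(count)) != 0:
        return None
    return count.value
```

Inside the container, passing the full path to the bundled library under `/usr/lib/ollama/rocm/` would test the same copy that Ollama loads.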

I also tried a few env-var workarounds before going further: HSA_OVERRIDE_GFX_VERSION (with no override, 11.0.0, and 11.5.0), HSA_ENABLE_SDMA=0, and ROCBLAS_USE_HIPBLASLT=0. None of them changed the outcome on my system.

What ended up working

While reading around, I found that AMD's Strix Halo system optimization guide lists Linux kernel 6.18.4+ as the minimum and recommends the kernel inbox amdgpu driver (rather than amdgpu-dkms) for Ryzen APUs. My system was on 6.17.0-20-generic, so I tried upgrading.
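
A quick way to compare a running kernel against that minimum, as a sketch: the 6.18.4 threshold is the one quoted above from AMD's guide, and the function names are mine.

```python
import re

MIN_KERNEL = (6, 18, 4)  # minimum quoted above from AMD's Strix Halo guide


def kernel_tuple(release: str) -> tuple:
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(release: str) -> bool:
    """True if the release is at least MIN_KERNEL."""
    return kernel_tuple(release) >= MIN_KERNEL


print(meets_minimum("6.17.0-20-generic"))       # False
print(meets_minimum("6.18.14-061814-generic"))  # True
```

On a live system, `platform.release()` from the standard library returns the same string as `uname -r`.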

I installed mainline kernel 6.18.14 from https://kernel.ubuntu.com/mainline/v6.18.14/amd64/ (4 .deb files: linux-headers _all, linux-headers -generic, linux-modules -generic, linux-image-unsigned -generic):

sudo dpkg -i linux-headers-6.18*.deb linux-modules-6.18*.deb linux-image-unsigned-6.18*.deb
sudo update-grub && sudo reboot

After reboot, the existing amdgpu-dkms 6.16.13 no longer builds against 6.18.x, so the kernel inbox amdgpu driver takes over. ROCm 7.2.1 userspace (installed earlier via the AMD apt repo) kept working unchanged. No other changes needed.

Result

$ uname -r
6.18.14-061814-generic

$ docker compose --profile amd up
ollama  | inference compute id=0 filter_id=0 library=ROCm compute=gfx1151
        |   description="Radeon 8060S Graphics" type=iGPU
        |   total="111.6 GiB" available="111.3 GiB"

ollama/ollama:0.20.3-rocm now picks up the GPU with the full 111.6 GiB UMA allocation, and inference runs on the GPU.

References I found useful

  • AMD Strix Halo system optimization: https://rocm.docs.amd.com/en/latest/how-to/system-optimization/strixhalo.html
  • AMD Ryzen ROCm install (--no-dkms): https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/install/installryz/native_linux/install-ryzen.html
  • A related ROCm issue with the same memory access fault signature: https://github.com/ROCm/ROCm/issues/5824

Caveats

  • I tested 6.18.14 specifically. The AMD docs list 6.18.4 as the minimum, but I haven't personally verified the lowest working point release.
  • Mainline kernels aren't supported by Canonical, so they don't get auto-updates -- worth keeping in mind. The previous kernel stays selectable from the GRUB menu, so rolling back is easy if needed.
  • This is just what worked on my setup; sharing in case it helps others narrow down their own issue.
Reference: github-starred/ollama#71648