[GH-ISSUE #13236] ROCm GPU discovery times out on AMD Radeon AI PRO R9700 (gfx1201) – CPU fallback only #70811

Open
opened 2026-05-04 23:04:24 -05:00 by GiteaMirror · 13 comments

Originally created by @dmoraine on GitHub (Nov 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13236

Hi,

I’m trying to use Ollama with an AMD GPU via ROCm, but Ollama always times out during GPU discovery and falls back to CPU-only inference, both on the host installation and in the official Docker ollama/ollama:rocm image.

I’m opening this issue to report the behavior and provide logs and environment details to help debug it.

Environment

  • OS: Linux Mint (Ubuntu-based)
  • Kernel: 6.14.0-36-generic
  • CPU: AMD Ryzen 7 8700F
  • GPU: AMD Radeon AI PRO R9700
    • gfx arch: gfx1201
    • PCI ID: 0000:03:00.0
  • RAM: 64 GiB
  • ROCm:
    • rocminfo reports:
      • GPU agent: AMD Radeon AI PRO R9700
      • ISA: amdgcn-amd-amdhsa--gfx1201
      • VRAM: 33406976 KB (~32 GiB)
    • rocm-smi sees the card and reports utilization / VRAM correctly outside of Ollama.
  • Ollama versions tested:
    • Native install: ollama version is 0.13.0
    • Docker image: ollama/ollama:rocm (Ollama 0.13.0 inside the container)

ROCm validation (outside Ollama)

On the host, ROCm sees the GPU correctly:

  • rocminfo shows both CPU and GPU agents, including:
    • Name: AMD Radeon AI PRO R9700
    • Device Type: GPU
    • ISA: amdgcn-amd-amdhsa--gfx1201
  • rocm-smi lists card0 as AMD Radeon AI PRO R9700 and can report GPU utilization and VRAM usage when running ROCm workloads.

So ROCm itself appears to be installed and functioning correctly with this GPU.
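For reference, the validation amounts to something like this (a sketch, assuming the stock ROCm utilities are on PATH):

# Confirm the GPU agent and its ISA are visible to the ROCm runtime
rocminfo | grep -E 'Marketing Name|Device Type|gfx'

# Confirm the SMI tool can read utilization and VRAM for the card
rocm-smi --showuse --showmeminfo vram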

Problem description

When starting ollama serve with ROCm available, Ollama:

  1. Detects the GPU and the ROCm backend.
  2. Starts a runner with OLLAMA_LIBRARY_PATH pointing to the ROCm backend.
  3. Attempts GPU discovery for ~30 seconds.
  4. Fails with failed to finish discovery before timeout.
  5. Filters out the GPU device and falls back to CPU-only inference, reporting total vram = 0 B.

This happens:

  • With the native binary installed via the official script.
  • With the official Docker ollama/ollama:rocm image (with /dev/kfd and /dev/dri passed through); a sketch of the invocation follows.
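A sketch of the Docker invocation (modeled on Ollama's documented ROCm run command; the volume and container names are illustrative):

# Run the ROCm image with the AMD device nodes passed through
docker run -d \
  --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm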

Logs – native install

Command:

export ROCR_VISIBLE_DEVICES=GPU-b718cad49010a54e
OLLAMA_DEBUG=1 ollama serve

Behavior is identical on the native and Docker installs: GPU discovery via ROCm times out, then Ollama runs CPU-only.

What I’ve tried

  • Ensuring ROCm is installed and working on the host (rocminfo, rocm-smi OK).
  • Passing /dev/kfd and /dev/dri into the container.
  • Setting ROCR_VISIBLE_DEVICES=GPU-b718cad49010a54e explicitly.
  • Starting ollama serve with OLLAMA_DEBUG=1 to get detailed logs.

The failure mode is the same across:

  • Host install (0.13.0) and Docker (0.13.0).
  • Different ways of starting ollama serve.

Expected behavior

  • Ollama’s ROCm backend (libggml-hip) should successfully initialize the AMD Radeon AI PRO R9700 (gfx1201) GPU and use it for inference (at least partially, even if with some limitations).

Actual behavior

  • GPU is detected and recognized (correct description, gfx arch, PCI ID).
  • GPU discovery via ROCm backend blocks for ~30 seconds and fails with:
    • error="failed to finish discovery before timeout"
  • Device is filtered out as “didn't fully initialize”.
  • Ollama switches to CPU-only inference with total vram = 0 B.

Questions

  • Is gfx1201 (AMD Radeon AI PRO R9700) currently supported / tested with the ROCm backend in Ollama?
  • Is there any known issue with the combination of:
    • this GPU arch
    • ROCm version
    • and the version of libggml-hip.so shipped in Ollama 0.13.0?
  • Is there any extra debug flag or environment variable I can set to get more detailed ROCm/backend logs (e.g., from libggml-hip itself) to help narrow down where the initialization is hanging?

Relevant log output

time=2025-11-24T20:35:20.629+01:00 level=INFO source=routes.go:1544 msg="server config" env="... OLLAMA_DEBUG:DEBUG ... OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES:GPU-b718cad49010a54e ..."

time=2025-11-24T20:35:20.629+01:00 level=INFO source=routes.go:1597 msg="Listening on 127.0.0.1:11434 (version 0.12.11)"
time=2025-11-24T20:35:20.629+01:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."

time=2025-11-24T20:35:22.127+01:00 level=DEBUG source=runner.go:128 msg="verifying device is supported" library=/usr/local/lib/ollama/rocm description="AMD Radeon AI PRO R9700" compute=gfx1201 id=GPU-b718cad49010a54e pci_id=0000:03:00.0

time=2025-11-24T20:35:22.127+01:00 level=INFO source=server.go:392 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42531"
time=2025-11-24T20:35:22.127+01:00 level=DEBUG source=server.go:393 msg=subprocess OLLAMA_DEBUG=1 ROCR_VISIBLE_DEVICES=GPU-b718cad49010a54e ROCM_PATH=/opt/rocm LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm:/opt/rocm/lib::/usr/local/cuda/lib64:/usr/local/lib/x86_64-linux-gnu ... OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm GGML_CUDA_INIT=1

time=2025-11-24T20:35:52.128+01:00 level=INFO source=runner.go:445 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-b718cad49010a54e]" error="failed to finish discovery before timeout"
time=2025-11-24T20:35:52.128+01:00 level=DEBUG source=runner.go:418 msg="bootstrap discovery took" duration=30.000956172s OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-b718cad49010a54e]"
time=2025-11-24T20:35:52.128+01:00 level=DEBUG source=runner.go:135 msg="filtering device which didn't fully initialize" id=GPU-b718cad49010a54e libdir=/usr/local/lib/ollama/rocm pci_id=0000:03:00.0 library=ROCm

time=2025-11-24T20:35:52.128+01:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.4 GiB" available="50.3 GiB"
time=2025-11-24T20:35:52.128+01:00 level=INFO source=routes.go:1638 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.13.0

GiteaMirror added the bug label 2026-05-04 23:04:24 -05:00

@Battlesheepu commented on GitHub (Nov 25, 2025):

When it comes to the Docker image (where I'm experiencing the same problem with my 9070 XT), from what I can see in the project's Dockerfile, the ROCm release Ollama currently ships is 6.3.3, whereas based on the ROCm release page, the version that introduced support for the 9060 XT, 9070 XT, and AI PRO R9700 is 6.4.1.

So in the case of the rocm-tagged Docker image, I assume that could be the culprit?


@dmoraine commented on GitHub (Nov 25, 2025):

Good point on the Docker image, but locally I run the latest version available from AMD's repository (7.1), which is compatible with my Pro R9700.

$ rocminfo
ROCk module version 6.16.6 is loaded [...]


@Battlesheepu commented on GitHub (Nov 25, 2025):

> Good point on the Docker image, but locally I run the latest version available from AMD's repository (7.1), which is compatible with my Pro R9700.
>
> $ rocminfo
> ROCk module version 6.16.6 is loaded [...]

What helped me debug my issue was setting the following set of env vars:

OLLAMA_DEBUG=2
AMD_LOG_LEVEL=3

to enable trace logs for Ollama and Info logs for the AMD stack.
The first one I deduced myself from the code; I don't know if it's documented anywhere. The second I found in the AMD docs.

These env vars should help Ollama spew out a lot more logs for you to scour through :)
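For example, a one-liner to capture the combined output to a file (a sketch for a host install; for Docker, set the same vars in the container environment):

# Trace-level Ollama logs plus HIP runtime logs, saved for later inspection
OLLAMA_DEBUG=2 AMD_LOG_LEVEL=3 ollama serve 2>&1 | tee /tmp/ollama-rocm-debug.log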

As for the Docker image, I admit I'm a bit lost here. The Dockerfile uses version 6.3.3, but the workflows seem to use ROCm v6.1.2; that said, it should be solved by the PR that's currently being cooked.


@dmoraine commented on GitHub (Nov 25, 2025):

Apparently, I'm not the only one who has struggled with their R9700:
Is AMD's AI Pro R9700 supported? #13085

I switched to kernel 6.17.0-1005-oem:
apt install linux-oem-24.04d
But no improvement.

This appears to be a bug or incompatibility in the way libggml-hip (Ollama's ROCm backend) uses ROCm on gfx1201.
Potentially:

  • a ROCm function that never returns (a deadlock, a spin, or a blocked driver call),
  • or a code path specific to gfx12xx that is not yet stable/tested.

failure during GPU discovery error="failed to finish discovery before timeout"
filtering device which didn't fully initialize id=GPU-b718cad49010a54e
devices=[]

The problem is not detection, but an internal stage of discovery that never completes on gfx1201. According to the more detailed logs, the symptom, as seen by Ollama, is:

  • libggml-hip.so loads,
  • detects the GPU,
  • queries its memory and PCI,
  • goes through HSA / COMGR / code objects gfx1201,
  • then blocks (never returns control) until timeout.

libggml-hip is the caller that waits, but perhaps it is the kernel/driver that never responds correctly for this card as long as I am on ROCk 6.16.6?
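One way to see exactly which call is stuck would be to grab stack traces of the runner while discovery is hanging (a debugging sketch, assuming gdb is installed and ptrace is permitted):

# During the ~30 s discovery window, dump all thread stacks of the runner
# subprocess to see which ROCm/HIP call it is blocked in
pid=$(pgrep -f 'ollama runner' | head -n1)
sudo gdb -p "$pid" -batch -ex 'thread apply all bt' > /tmp/runner-backtrace.txt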


@cminnoy commented on GitHub (Nov 26, 2025):

I can confirm that Ollama does not detect any GPU on Windows 11 for my AMD 9700 AI Pro with Adrenaline drivers 25.11.1.


@Battlesheepu commented on GitHub (Nov 26, 2025):

I did try applying the changes from the aforementioned PR, rebuilding the Docker image myself, and running it with the following config, commenting/uncommenting env vars across my attempts:

services:
  ollama:
    image: ollama-local-build-rocm
#    image: "ollama/ollama:rocm"
    devices:
      - "/dev/kfd"
      - "/dev/dri"
    volumes:
      - ./ollamavolume:/root/.ollama
    ports:
      - "127.0.0.1:11434:11434"
    group_add:
      - video
    environment:
      - "OLLAMA_DEBUG=2"
      - "AMD_LOG_LEVEL=4"
#      - "GPU_MAX_HW_QUEUES=1"      - 
#      - "HSA_OVERRIDE_GFX_VERSION=12.0.1"
#      - "ROCR_VISIBLE_DEVICES=GPU-(...)

No difference on my side.

ollama-1  | time=2025-11-26T18:34:02.532Z level=TRACE source=runner.go:462 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm_v7]" devices=[]
ollama-1  | time=2025-11-26T18:34:02.532Z level=DEBUG source=runner.go:432 msg="bootstrap discovery took" duration=422.012043ms OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm_v7]" extra_envs=map[]
ollama-1  | time=2025-11-26T18:34:02.532Z level=DEBUG source=runner.go:120 msg="evaluating which, if any, devices to filter out" initial_count=0
ollama-1  | time=2025-11-26T18:34:02.532Z level=TRACE source=runner.go:170 msg="supported GPU library combinations before filtering" supported=map[]
ollama-1  | time=2025-11-26T18:34:02.532Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=814.161228ms
ollama-1  | time=2025-11-26T18:34:02.532Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="15.0 GiB" available="14.8 GiB"
ollama-1  | time=2025-11-26T18:34:02.532Z level=INFO source=routes.go:1638 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"

The thing that bothers me is that the AMD docs mention I should have amdgpu-dkms installed for running ROCm in the container, which is not even compatible with my host's kernel (6.17) and just doesn't compile. Could that be the issue too, or are the Ollama images somehow equipped to handle that?
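For what it's worth, containers share the host kernel, so amdgpu-dkms only needs to work on the host; the image itself just needs the device nodes. A quick sanity check (the container name ollama is an assumption):

# The ROCm userspace inside the image talks to the host driver via these nodes
docker exec ollama ls -l /dev/kfd /dev/dri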


@StrykeSlammerII commented on GitHub (Nov 26, 2025):

Edit: I came back after the holiday, did some unrelated updates, restarted the Ollama server, and it's working normally again.
Leaving my initial report below for posterity.


Same issue here, using the Linux CLI with a gfx1200 / RX 9060 XT.
Manjaro updated ROCm from 6.x to 7.1.0 along with Ollama to 0.13.0, so I'm not sure which triggered the issue.

I'm not seeing anything from the AMD_LOG_LEVEL flag, but I suspect this is the main error from the OLLAMA_DEBUG=2 log:

operator() double registration of ggml_uncaught_exception
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-11-26T15:57:02.001-05:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)

Full log follows:
$ AMD_LOG_LEVEL=3 OLLAMA_DEBUG=2 OLLAMA_FLASH_ATTENTION=1 ollama serve
time=2025-11-26T15:57:01.956-05:00 level=INFO source=routes.go:1544 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:DEBUG-4 OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/strike/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-11-26T15:57:01.963-05:00 level=INFO source=images.go:522 msg="total blobs: 75"
time=2025-11-26T15:57:01.965-05:00 level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-26T15:57:01.966-05:00 level=INFO source=routes.go:1597 msg="Listening on 127.0.0.1:11434 (version 0.13.0)"
time=2025-11-26T15:57:01.967-05:00 level=DEBUG source=sched.go:120 msg="starting llm scheduler"
time=2025-11-26T15:57:01.967-05:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2025-11-26T15:57:01.967-05:00 level=TRACE source=runner.go:425 msg="starting runner for device discovery" libDirs=[/usr/lib/ollama] extraEnvs=map[]
time=2025-11-26T15:57:01.969-05:00 level=INFO source=server.go:392 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 34049"
time=2025-11-26T15:57:01.969-05:00 level=DEBUG source=server.go:393 msg=subprocess OLLAMA_FLASH_ATTENTION=1 OLLAMA_DEBUG=2 ROCM_PATH=/opt/rocm PATH=/home/strike/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin:/usr/lib/rustup/bin LD_LIBRARY_PATH=/usr/lib/ollama OLLAMA_LIBRARY_PATH=/usr/lib/ollama
time=2025-11-26T15:57:01.986-05:00 level=INFO source=runner.go:1398 msg="starting ollama engine"
time=2025-11-26T15:57:01.987-05:00 level=INFO source=runner.go:1433 msg="Server listening on 127.0.0.1:34049"
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=gguf.go:589 msg=general.architecture type=string
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=gguf.go:589 msg=tokenizer.ggml.model type=string
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.alignment default=32
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.alignment default=32
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.file_type default=0
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.name default=""
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.description default=""
time=2025-11-26T15:57:01.992-05:00 level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/lib/ollama
operator() double registration of ggml_uncaught_exception
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-11-26T15:57:02.001-05:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.block_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.pooling_type default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.expert_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.block_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.embedding_length default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.head_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.key_length default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:1373 msg="dummy model load took" duration=9.411883ms
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:1378 msg="gathering device infos took" duration=573ns
time=2025-11-26T15:57:02.001-05:00 level=TRACE source=runner.go:452 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH=[/usr/lib/ollama] devices=[]
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:422 msg="bootstrap discovery took" duration=34.110172ms OLLAMA_LIBRARY_PATH=[/usr/lib/ollama] extra_envs=map[]
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:120 msg="evaluating which, if any, devices to filter out" initial_count=0
time=2025-11-26T15:57:02.001-05:00 level=TRACE source=runner.go:160 msg="supported GPU library combinations before filtering" supported=map[]
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=34.498436ms
time=2025-11-26T15:57:02.001-05:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.5 GiB" available="50.2 GiB"
time=2025-11-26T15:57:02.001-05:00 level=INFO source=routes.go:1638 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"


@gbnk0 commented on GitHub (Nov 27, 2025):

Same problem on Ubuntu 22.04 with the latest ROCm drivers. Upgrading to kernel 6.8 resolved the issue:

apt install linux-generic-hwe-22.04

uname -r
6.8.0-87-generic

Docker compose env:
- HSA_OVERRIDE_GFX_VERSION=12.0.1
- HSA_ENABLE_SDMA=0
- ROCR_VISIBLE_DEVICES=GPU-xxx (run rocminfo | grep Uuid to get the UUID)

Hope this can help.
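Spelled out as commands (a sketch; GPU-xxx stands for whatever UUID your card reports):

# Install the HWE kernel, then confirm it after a reboot
sudo apt install linux-generic-hwe-22.04
uname -r                   # expect 6.8.0-xx-generic

# Look up the GPU UUID to pin via ROCR_VISIBLE_DEVICES
rocminfo | grep -i uuid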


@dmoraine commented on GitHub (Dec 3, 2025):

Quick update:

I was previously running with a custom ROCm installation under /opt/rocm (installed from AMD’s packages), in addition to what the distro provides. With that setup, Ollama consistently timed out during ROCm GPU discovery and I was getting repeated ollama crashes in libhsa-runtime64.so / libamdhip64.so / libggml-hip.so in the system journal.

I’ve now removed my custom /opt/rocm stack and switched to using only the ROCm/hip packages from the Linux Mint/Ubuntu repositories (so Ollama links against the distro-provided ROCm libraries instead of a mixed setup).

With this change:

  • ollama run qwen3:32b "test" --verbose completes successfully.
Thinking...

Hello! It looks like you're testing the waters. 😊 How can I assist you today? Whether you need help with a specific question, want to explore a topic, or just need someone to chat with, I'm here! Let me know what's on your mind.

total duration:       11.626279527s
load duration:        5.374859364s
prompt eval count:    11 token(s)
prompt eval duration: 106.941043ms
prompt eval rate:     102.86 tokens/s
eval count:           141 token(s)
eval duration:        6.102646156s
eval rate:            23.10 tokens/s
  • There are no new ROCm/HIP-related ollama coredumps in journalctl since switching.
  • Inference runs stably; I’m no longer seeing the 30s “failed to finish discovery before timeout” followed by CPU-only fallback.

So at least on my system, the discovery timeout and crashes seem to have been caused by the combination of Ollama’s ROCm backend with my manually installed /opt/rocm stack. Using the distro ROCm packages only appears to resolve the issue.

sudo systemctl edit ollama

[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
Environment=OLLAMA_NUM_PARALLEL=3
Environment=OLLAMA_MAX_LOADED_MODELS=2
Environment=OLLAMA_FLASH_ATTENTION=1
Environment=ROCM_PATH=/opt/rocm

If there’s any specific debug flag or env var you’d like me to enable with this “clean” ROCm setup to compare behavior, I’m happy to run it and share the logs.
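One cheap check after a change like this is to confirm which HSA/HIP libraries the backend actually resolves (a sketch; the library path is taken from the logs above and may differ per install):

# Show which ROCm runtime libraries the HIP backend links against at runtime
ldd /usr/local/lib/ollama/rocm/libggml-hip.so | grep -Ei 'hsa|amdhip|rocm'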


@Cresius34 commented on GitHub (Dec 12, 2025):

> I can confirm that Ollama does not detect any GPU on Windows 11 for my AMD 9700 AI Pro with Adrenaline drivers 25.11.1.

Same here, any news on that?


@cvocvo commented on GitHub (Dec 19, 2025):

> > I can confirm that Ollama does not detect any GPU on Windows 11 for my AMD 9700 AI Pro with Adrenaline drivers 25.11.1.
>
> Same here, any news on that?

It looks like Adrenaline v25.12.1 drivers are available as of 12/10/25: https://www.amd.com/en/support/downloads/drivers.html/graphics/radeon-ai-pro/radeon-ai-pro-r9000-series/amd-radeon-ai-pro-r9700.html
Any chance you could test those? I was considering this card too, but without support it's sort of a no-go.

We may also need to wait for this PR to be merged? https://github.com/ollama/ollama/pull/13000#issuecomment-3614654286


@Cresius34 commented on GitHub (Dec 19, 2025):

I tested the latest drivers as well as the HIP SDK, but Ollama uses its own build of ROCm, 6.4.2, and that ROCm version is compatible with the Radeon AI PRO (gfx1201), so I suspect the list of supported GPUs in Ollama simply hasn't been updated.


@LukeLamb commented on GitHub (Apr 25, 2026):

Working configuration for R9700 (gfx1201) on Ubuntu 24.04 / kernel 6.17: build 0.20.6 from source against ROCm 7.2.1 with AMDGPU_TARGETS=gfx1201. Detection succeeds, full layer offload, 92.99 tok/s on llama3.1:8b-q4_K_M.

Full repro steps and the journal line confirming library=ROCm compute=gfx1201 are in #14927.
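A hedged sketch of such a source build (the CMake flag follows the comment above; exact steps vary between Ollama versions, see docs/development.md in the repo):

# Build the GGML backends for gfx1201 only, then the Go binary
git clone https://github.com/ollama/ollama.git && cd ollama
cmake -B build -DAMDGPU_TARGETS=gfx1201
cmake --build build
go build .
./ollama serve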

Reference: github-starred/ollama#70811