[GH-ISSUE #13236] ROCm GPU discovery times out on AMD Radeon AI PRO R9700 (gfx1201) – CPU fallback only #70811

Open
opened 2026-05-04 23:04:24 -05:00 by GiteaMirror · 13 comments

Originally created by @dmoraine on GitHub (Nov 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13236

Hi,

I’m trying to use Ollama with an AMD GPU via ROCm, but Ollama always times out during GPU discovery and falls back to CPU-only inference, both on the host installation and in the official Docker ollama/ollama:rocm image.

I’m opening this issue to report the behavior and provide logs and environment details to help debug it.

Environment

  • OS: Linux Mint (Ubuntu-based)
  • Kernel: 6.14.0-36-generic
  • CPU: AMD Ryzen 7 8700F
  • GPU: AMD Radeon AI PRO R9700
    • gfx arch: gfx1201
    • PCI ID: 0000:03:00.0
  • RAM: 64 GiB
  • ROCm:
    • rocminfo reports:
      • GPU agent: AMD Radeon AI PRO R9700
      • ISA: amdgcn-amd-amdhsa--gfx1201
      • VRAM: 33406976 KB (~32 GiB)
    • rocm-smi sees the card and reports utilization / VRAM correctly outside of Ollama.
  • Ollama versions tested:
    • Native install: ollama version is 0.13.0
    • Docker image: ollama/ollama:rocm (Ollama 0.13.0 inside the container)

ROCm validation (outside Ollama)

On the host, ROCm sees the GPU correctly:

  • rocminfo shows both CPU and GPU agents, including:
    • Name: AMD Radeon AI PRO R9700
    • Device Type: GPU
    • ISA: amdgcn-amd-amdhsa--gfx1201
  • rocm-smi lists card0 as AMD Radeon AI PRO R9700 and can report GPU utilization and VRAM usage when running ROCm workloads.

So ROCm itself appears to be installed and functioning correctly with this GPU.
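For reference, the validation amounts to something like this (a sketch, assuming the stock ROCm utilities are on PATH):

# Confirm the GPU agent and its ISA are visible to the ROCm runtime
rocminfo | grep -E 'Marketing Name|Device Type|gfx'

# Confirm the SMI tool can read utilization and VRAM for the card
rocm-smi --showuse --showmeminfo vram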

Problem description

When starting ollama serve with ROCm available, Ollama:

  1. Detects the GPU and the ROCm backend.
  2. Starts a runner with OLLAMA_LIBRARY_PATH pointing to the ROCm backend.
  3. Attempts GPU discovery for ~30 seconds.
  4. Fails with failed to finish discovery before timeout.
  5. Filters out the GPU device and falls back to CPU-only inference, reporting total vram = 0 B.

This happens:

  • With the native binary installed via the official script.
  • With the official Docker ollama/ollama:rocm image (with /dev/kfd and /dev/dri passed through); a sketch of the invocation follows.
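A sketch of the Docker invocation (modeled on Ollama's documented ROCm run command; the volume and container names are illustrative):

# Run the ROCm image with the AMD device nodes passed through
docker run -d \
  --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm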

Logs – native install

Command:

export ROCR_VISIBLE_DEVICES=GPU-b718cad49010a54e
OLLAMA_DEBUG=1 ollama serve

Behavior is identical on the native and Docker installs: GPU discovery via ROCm times out, then Ollama runs CPU-only.

What I’ve tried

  • Ensuring ROCm is installed and working on the host (rocminfo, rocm-smi OK).
  • Passing /dev/kfd and /dev/dri into the container.
  • Setting ROCR_VISIBLE_DEVICES=GPU-b718cad49010a54e explicitly.
  • Starting ollama serve with OLLAMA_DEBUG=1 to get detailed logs.

The failure mode is the same across:

  • Host install (0.13.0) and Docker (0.13.0).
  • Different ways of starting ollama serve.

Expected behavior

  • Ollama’s ROCm backend (libggml-hip) should successfully initialize the AMD Radeon AI PRO R9700 (gfx1201) GPU and use it for inference (at least partially, even if with some limitations).

Actual behavior

  • GPU is detected and recognized (correct description, gfx arch, PCI ID).
  • GPU discovery via ROCm backend blocks for ~30 seconds and fails with:
    • error="failed to finish discovery before timeout"
  • Device is filtered out as “didn't fully initialize”.
  • Ollama switches to CPU-only inference with total vram = 0 B.

Questions

  • Is gfx1201 (AMD Radeon AI PRO R9700) currently supported / tested with the ROCm backend in Ollama?
  • Is there any known issue with the combination of:
    • this GPU arch
    • ROCm version
    • and the version of libggml-hip.so shipped in Ollama 0.13.0?
  • Is there any extra debug flag or environment variable I can set to get more detailed ROCm/backend logs (e.g., from libggml-hip itself) to help narrow down where the initialization is hanging?

Relevant log output

time=2025-11-24T20:35:20.629+01:00 level=INFO source=routes.go:1544 msg="server config" env="... OLLAMA_DEBUG:DEBUG ... OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES:GPU-b718cad49010a54e ..."

time=2025-11-24T20:35:20.629+01:00 level=INFO source=routes.go:1597 msg="Listening on 127.0.0.1:11434 (version 0.12.11)"
time=2025-11-24T20:35:20.629+01:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."

time=2025-11-24T20:35:22.127+01:00 level=DEBUG source=runner.go:128 msg="verifying device is supported" library=/usr/local/lib/ollama/rocm description="AMD Radeon AI PRO R9700" compute=gfx1201 id=GPU-b718cad49010a54e pci_id=0000:03:00.0

time=2025-11-24T20:35:22.127+01:00 level=INFO source=server.go:392 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42531"
time=2025-11-24T20:35:22.127+01:00 level=DEBUG source=server.go:393 msg=subprocess OLLAMA_DEBUG=1 ROCR_VISIBLE_DEVICES=GPU-b718cad49010a54e ROCM_PATH=/opt/rocm LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm:/opt/rocm/lib::/usr/local/cuda/lib64:/usr/local/lib/x86_64-linux-gnu ... OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/rocm GGML_CUDA_INIT=1

time=2025-11-24T20:35:52.128+01:00 level=INFO source=runner.go:445 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-b718cad49010a54e]" error="failed to finish discovery before timeout"
time=2025-11-24T20:35:52.128+01:00 level=DEBUG source=runner.go:418 msg="bootstrap discovery took" duration=30.000956172s OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:GPU-b718cad49010a54e]"
time=2025-11-24T20:35:52.128+01:00 level=DEBUG source=runner.go:135 msg="filtering device which didn't fully initialize" id=GPU-b718cad49010a54e libdir=/usr/local/lib/ollama/rocm pci_id=0000:03:00.0 library=ROCm

time=2025-11-24T20:35:52.128+01:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.4 GiB" available="50.3 GiB"
time=2025-11-24T20:35:52.128+01:00 level=INFO source=routes.go:1638 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.13.0

GiteaMirror added the bug label 2026-05-04 23:04:24 -05:00

@Battlesheepu commented on GitHub (Nov 25, 2025):

When it comes to the Docker image (where I'm experiencing the same problem with my 9070 XT), from what I can see in the project's Dockerfile, the ROCm release Ollama currently ships is 6.3.3, whereas based on the ROCm release page, the version that introduced support for the 9060 XT, 9070 XT, and AI PRO R9700 is 6.4.1.

So in the case of the rocm-tagged Docker image, I assume that could be the culprit?


@dmoraine commented on GitHub (Nov 25, 2025):

Good point on the Docker image, but locally I run the latest version available from AMD's repository (7.1), which is compatible with my Pro R9700.

$ rocminfo
ROCk module version 6.16.6 is loaded [...]


@Battlesheepu commented on GitHub (Nov 25, 2025):

> Good point on the Docker image, but locally I run the latest version available from AMD's repository (7.1), which is compatible with my Pro R9700.
>
> $ rocminfo
> ROCk module version 6.16.6 is loaded [...]

What helped me debug my issue was setting the following set of env vars:

OLLAMA_DEBUG=2
AMD_LOG_LEVEL=3

to enable trace logs for Ollama and Info logs for the AMD stack.
The first one I deduced myself from the code; I don't know if it's documented anywhere. The second I found in the AMD docs.

These env vars should help Ollama spew out a lot more logs for you to scour through :)
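For example, a one-liner to capture the combined output to a file (a sketch for a host install; for Docker, set the same vars in the container environment):

# Trace-level Ollama logs plus HIP runtime logs, saved for later inspection
OLLAMA_DEBUG=2 AMD_LOG_LEVEL=3 ollama serve 2>&1 | tee /tmp/ollama-rocm-debug.log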

As for the Docker image, I admit I'm a bit lost here. The Dockerfile uses version 6.3.3, but the workflows seem to use ROCm v6.1.2; that said, it should be solved by the PR that's currently being cooked.


@dmoraine commented on GitHub (Nov 25, 2025):

Apparently, I'm not the only one who has struggled with their R9700:
Is AMD's AI Pro R9700 supported? #13085

I switched to kernel 6.17.0-1005-oem:
apt install linux-oem-24.04d
But no improvement.

This appears to be a bug or incompatibility in the way libggml-hip (Ollama's ROCm backend) uses ROCm on gfx1201.
Potentially:

  • a ROCm function that never returns (a deadlock, a spin, or a blocked driver call),
  • or a code path specific to gfx12xx that is not yet stable/tested.

failure during GPU discovery error="failed to finish discovery before timeout"
filtering device which didn't fully initialize id=GPU-b718cad49010a54e
devices=[]

The problem is not detection, but an internal stage of discovery that never completes on gfx1201. According to the more detailed logs, the symptom, as seen by Ollama, is:

  • libggml-hip.so loads,
  • detects the GPU,
  • queries its memory and PCI,
  • goes through HSA / COMGR / code objects gfx1201,
  • then blocks (never returns control) until timeout.

libggml-hip is the caller that waits, but perhaps it is the kernel/driver that never responds correctly for this card as long as I am on ROCk 6.16.6?
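One way to see exactly which call is stuck would be to grab stack traces of the runner while discovery is hanging (a debugging sketch, assuming gdb is installed and ptrace is permitted):

# During the ~30 s discovery window, dump all thread stacks of the runner
# subprocess to see which ROCm/HIP call it is blocked in
pid=$(pgrep -f 'ollama runner' | head -n1)
sudo gdb -p "$pid" -batch -ex 'thread apply all bt' > /tmp/runner-backtrace.txt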


@cminnoy commented on GitHub (Nov 26, 2025):

I can confirm that Ollama does not detect any GPU on Windows 11 for my AMD 9700 AI Pro with Adrenaline drivers 25.11.1.


@Battlesheepu commented on GitHub (Nov 26, 2025):

I did try applying the changes from the aforementioned PR, rebuilding the Docker image myself, and running it with the following config, commenting/uncommenting env vars across my attempts:

services:
  ollama:
    image: ollama-local-build-rocm
#    image: "ollama/ollama:rocm"
    devices:
      - "/dev/kfd"
      - "/dev/dri"
    volumes:
      - ./ollamavolume:/root/.ollama
    ports:
      - "127.0.0.1:11434:11434"
    group_add:
      - video
    environment:
      - "OLLAMA_DEBUG=2"
      - "AMD_LOG_LEVEL=4"
#      - "GPU_MAX_HW_QUEUES=1"      - 
#      - "HSA_OVERRIDE_GFX_VERSION=12.0.1"
#      - "ROCR_VISIBLE_DEVICES=GPU-(...)

No difference on my side.

ollama-1  | time=2025-11-26T18:34:02.532Z level=TRACE source=runner.go:462 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm_v7]" devices=[]
ollama-1  | time=2025-11-26T18:34:02.532Z level=DEBUG source=runner.go:432 msg="bootstrap discovery took" duration=422.012043ms OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm_v7]" extra_envs=map[]
ollama-1  | time=2025-11-26T18:34:02.532Z level=DEBUG source=runner.go:120 msg="evaluating which, if any, devices to filter out" initial_count=0
ollama-1  | time=2025-11-26T18:34:02.532Z level=TRACE source=runner.go:170 msg="supported GPU library combinations before filtering" supported=map[]
ollama-1  | time=2025-11-26T18:34:02.532Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=814.161228ms
ollama-1  | time=2025-11-26T18:34:02.532Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="15.0 GiB" available="14.8 GiB"
ollama-1  | time=2025-11-26T18:34:02.532Z level=INFO source=routes.go:1638 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"

The thing that bothers me is that the AMD docs mention I should have amdgpu-dkms installed for running ROCm in the container, which is not even compatible with my host's kernel (6.17) and just doesn't compile. Could that be the issue too, or are the Ollama images somehow equipped to handle that?
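For what it's worth, containers share the host kernel, so amdgpu-dkms only needs to work on the host; the image itself just needs the device nodes. A quick sanity check (the container name ollama is an assumption):

# The ROCm userspace inside the image talks to the host driver via these nodes
docker exec ollama ls -l /dev/kfd /dev/dri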


@StrykeSlammerII commented on GitHub (Nov 26, 2025):

Edit: I came back after the holiday, did some unrelated updates, restarted the Ollama server, and it's working normally again.
Leaving my initial report below for posterity.


Same issue here, using the Linux CLI with a gfx1200 / RX 9060 XT.
Manjaro updated ROCm from 6.x to 7.1.0 along with Ollama to 0.13.0, so I'm not sure which triggered the issue.

I'm not seeing anything from the AMD_LOG_LEVEL flag, but I suspect this is the main error from the OLLAMA_DEBUG=2 log:

operator() double registration of ggml_uncaught_exception
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-11-26T15:57:02.001-05:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)

Full log follows:
$ AMD_LOG_LEVEL=3 OLLAMA_DEBUG=2 OLLAMA_FLASH_ATTENTION=1 ollama serve
time=2025-11-26T15:57:01.956-05:00 level=INFO source=routes.go:1544 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:DEBUG-4 OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/strike/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-11-26T15:57:01.963-05:00 level=INFO source=images.go:522 msg="total blobs: 75"
time=2025-11-26T15:57:01.965-05:00 level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-26T15:57:01.966-05:00 level=INFO source=routes.go:1597 msg="Listening on 127.0.0.1:11434 (version 0.13.0)"
time=2025-11-26T15:57:01.967-05:00 level=DEBUG source=sched.go:120 msg="starting llm scheduler"
time=2025-11-26T15:57:01.967-05:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2025-11-26T15:57:01.967-05:00 level=TRACE source=runner.go:425 msg="starting runner for device discovery" libDirs=[/usr/lib/ollama] extraEnvs=map[]
time=2025-11-26T15:57:01.969-05:00 level=INFO source=server.go:392 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 34049"
time=2025-11-26T15:57:01.969-05:00 level=DEBUG source=server.go:393 msg=subprocess OLLAMA_FLASH_ATTENTION=1 OLLAMA_DEBUG=2 ROCM_PATH=/opt/rocm PATH=/home/strike/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin:/usr/lib/rustup/bin LD_LIBRARY_PATH=/usr/lib/ollama OLLAMA_LIBRARY_PATH=/usr/lib/ollama
time=2025-11-26T15:57:01.986-05:00 level=INFO source=runner.go:1398 msg="starting ollama engine"
time=2025-11-26T15:57:01.987-05:00 level=INFO source=runner.go:1433 msg="Server listening on 127.0.0.1:34049"
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=gguf.go:589 msg=general.architecture type=string
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=gguf.go:589 msg=tokenizer.ggml.model type=string
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.alignment default=32
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.alignment default=32
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.file_type default=0
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.name default=""
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=general.description default=""
time=2025-11-26T15:57:01.992-05:00 level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2025-11-26T15:57:01.992-05:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/lib/ollama
operator() double registration of ggml_uncaught_exception
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-11-26T15:57:02.001-05:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.block_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.pooling_type default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.expert_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.block_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.embedding_length default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.head_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.key_length default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=ggml.go:278 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:1373 msg="dummy model load took" duration=9.411883ms
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:1378 msg="gathering device infos took" duration=573ns
time=2025-11-26T15:57:02.001-05:00 level=TRACE source=runner.go:452 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH=[/usr/lib/ollama] devices=[]
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:422 msg="bootstrap discovery took" duration=34.110172ms OLLAMA_LIBRARY_PATH=[/usr/lib/ollama] extra_envs=map[]
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:120 msg="evaluating which, if any, devices to filter out" initial_count=0
time=2025-11-26T15:57:02.001-05:00 level=TRACE source=runner.go:160 msg="supported GPU library combinations before filtering" supported=map[]
time=2025-11-26T15:57:02.001-05:00 level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=34.498436ms
time=2025-11-26T15:57:02.001-05:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.5 GiB" available="50.2 GiB"
time=2025-11-26T15:57:02.001-05:00 level=INFO source=routes.go:1638 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"


@gbnk0 commented on GitHub (Nov 27, 2025):

Same problem on Ubuntu 22.04 with the latest ROCm drivers. Upgrading to kernel 6.8 resolved the issue:

apt install linux-generic-hwe-22.04

uname -r
6.8.0-87-generic

Docker compose env:
- HSA_OVERRIDE_GFX_VERSION=12.0.1
- HSA_ENABLE_SDMA=0
- ROCR_VISIBLE_DEVICES=GPU-xxx (run rocminfo | grep Uuid to get the UUID)

Hope this can help.
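Spelled out as commands (a sketch; GPU-xxx stands for whatever UUID your card reports):

# Install the HWE kernel, then confirm it after a reboot
sudo apt install linux-generic-hwe-22.04
uname -r                   # expect 6.8.0-xx-generic

# Look up the GPU UUID to pin via ROCR_VISIBLE_DEVICES
rocminfo | grep -i uuid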


@dmoraine commented on GitHub (Dec 3, 2025):

Quick update:

I was previously running with a custom ROCm installation under /opt/rocm (installed from AMD’s packages), in addition to what the distro provides. With that setup, Ollama consistently timed out during ROCm GPU discovery and I was getting repeated ollama crashes in libhsa-runtime64.so / libamdhip64.so / libggml-hip.so in the system journal.

I’ve now removed my custom /opt/rocm stack and switched to using only the ROCm/hip packages from the Linux Mint/Ubuntu repositories (so Ollama links against the distro-provided ROCm libraries instead of a mixed setup).

With this change:

  • ollama run qwen3:32b "test" --verbose completes successfully.
Thinking...

Hello! It looks like you're testing the waters. 😊 How can I assist you today? Whether you need help with a specific question, want to explore a topic, or just need someone to chat with, I'm here! Let me know what's on your mind.

total duration:       11.626279527s
load duration:        5.374859364s
prompt eval count:    11 token(s)
prompt eval duration: 106.941043ms
prompt eval rate:     102.86 tokens/s
eval count:           141 token(s)
eval duration:        6.102646156s
eval rate:            23.10 tokens/s
  • There are no new ROCm/HIP-related ollama coredumps in journalctl since switching.
  • Inference runs stably; I’m no longer seeing the 30s “failed to finish discovery before timeout” followed by CPU-only fallback.

So at least on my system, the discovery timeout and crashes seem to have been caused by the combination of Ollama’s ROCm backend with my manually installed /opt/rocm stack. Using the distro ROCm packages only appears to resolve the issue.

sudo systemctl edit ollama

[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
Environment=OLLAMA_NUM_PARALLEL=3
Environment=OLLAMA_MAX_LOADED_MODELS=2
Environment=OLLAMA_FLASH_ATTENTION=1
Environment=ROCM_PATH=/opt/rocm

If there’s any specific debug flag or env var you’d like me to enable with this “clean” ROCm setup to compare behavior, I’m happy to run it and share the logs.
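One cheap check after a change like this is to confirm which HSA/HIP libraries the backend actually resolves (a sketch; the library path is taken from the logs above and may differ per install):

# Show which ROCm runtime libraries the HIP backend links against at runtime
ldd /usr/local/lib/ollama/rocm/libggml-hip.so | grep -Ei 'hsa|amdhip|rocm'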


@Cresius34 commented on GitHub (Dec 12, 2025):

> I can confirm that Ollama does not detect any GPU on Windows 11 for my AMD 9700 AI Pro with Adrenaline drivers 25.11.1.

Same here, any news on that?


@cvocvo commented on GitHub (Dec 19, 2025):

> > I can confirm that Ollama does not detect any GPU on Windows 11 for my AMD 9700 AI Pro with Adrenaline drivers 25.11.1.
>
> Same here, any news on that?

It looks like Adrenaline v25.12.1 drivers are available as of 12/10/25: https://www.amd.com/en/support/downloads/drivers.html/graphics/radeon-ai-pro/radeon-ai-pro-r9000-series/amd-radeon-ai-pro-r9700.html
Any chance you could test those? I was considering this card too, but without support it's sort of a no-go.

We may also need to wait for this PR to be merged? https://github.com/ollama/ollama/pull/13000#issuecomment-3614654286


@Cresius34 commented on GitHub (Dec 19, 2025):

I tested the latest drivers as well as the HIP SDK, but Ollama uses its own build of ROCm, 6.4.2, and that ROCm version is compatible with the Radeon AI PRO (gfx1201), so I suspect the list of supported GPUs in Ollama simply hasn't been updated.


@LukeLamb commented on GitHub (Apr 25, 2026):

Working configuration for R9700 (gfx1201) on Ubuntu 24.04 / kernel 6.17: build 0.20.6 from source against ROCm 7.2.1 with AMDGPU_TARGETS=gfx1201. Detection succeeds, full layer offload, 92.99 tok/s on llama3.1:8b-q4_K_M.

Full repro steps and the journal line confirming library=ROCm compute=gfx1201 are in #14927.
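A hedged sketch of such a source build (the CMake flag follows the comment above; exact steps vary between Ollama versions, see docs/development.md in the repo):

# Build the GGML backends for gfx1201 only, then the Go binary
git clone https://github.com/ollama/ollama.git && cd ollama
cmake -B build -DAMDGPU_TARGETS=gfx1201
cmake --build build
go build .
./ollama serve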

Reference: github-starred/ollama#70811