[GH-ISSUE #6685] AMD 7900XTX fails with "Could not initialize Tensile host: No devices found" #29967

Closed
opened 2026-04-22 09:20:34 -05:00 by GiteaMirror · 51 comments
Owner

Originally created by @svaningelgem on GitHub (Sep 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6685

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

I installed the AMD drivers with https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/native-install/ubuntu.html ✔️

OS: Ubuntu 24.04.1 LTS
ROCm: 6.2.0
CPU: AMD Ryzen 9 7950X3D
GPU: Radeon RX 7900 XTX
model: llama3.1

Started with:
docker run --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm

Then tried to start llama3.1 with (I pulled it first successfully):
OLLAMA_DEBUG=1 ollama run llama3.1

Log file:
ollama.log

It looks like it is detecting the GPU correctly at the start of the container, but somehow fails to use it?

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.3.9

GiteaMirror added the docker and bug labels 2026-04-22 09:20:34 -05:00

@svaningelgem commented on GitHub (Sep 7, 2024):

Other (closed) issues I found concerning similar behavior: #4798, #6165 (but no real responses with a solution I could try there)


@rick-github commented on GitHub (Sep 7, 2024):

What's the output of ls -la /dev/kfd* /dev/dri*? Have you tried running the container with --device as mentioned in the docs?


@svaningelgem commented on GitHub (Sep 7, 2024):

Ok, I tried it:

Output of the ls:

$ find /dev/kfd* /dev/dri* | xargs ls -ld
drwxr-xr-x  3 root root        100 sep  7 16:43 /dev/dri
drwxr-xr-x  2 root root         80 sep  7 16:43 /dev/dri/by-path
lrwxrwxrwx  1 root root          8 sep  7 16:43 /dev/dri/by-path/pci-0000:03:00.0-card -> ../card1
lrwxrwxrwx  1 root root         13 sep  7 16:43 /dev/dri/by-path/pci-0000:03:00.0-render -> ../renderD128
crw-rw----+ 1 root video  226,   1 sep  7 16:43 /dev/dri/card1
crw-rw----+ 1 root render 226, 128 sep  7 16:43 /dev/dri/renderD128
crw-rw----  1 root video  235,   0 sep  7 16:43 /dev/kfd

Output of the updated command:

docker run --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --replace --name ollama ollama/ollama:rocm |& tee -a "/var/log/ollama/$(date +%Y-%m-%d).log"

2024-09-07.log

It looks the same to me as before at first glance, though.


@Froggy232 commented on GitHub (Sep 7, 2024):

Hi,
I'm in a very similar situation I think, except I use podman and Fedora Silverblue.
I have the same error messages, and also this one: error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
I run Silverblue 41 beta, but I had the exact same problem a few weeks ago on Fedora Silverblue 40. Running the CPU image works well.
Thanks for your help!

Edit: Sorry, in my case it was SELinux; I had to set container_use_devices to on.
Thanks for your software!
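
For anyone hitting the same SELinux case, that boolean can be flipped persistently like so (a sketch, assuming the stock container-selinux policy provides the boolean under this name):

# persistently allow containers to use host devices under SELinux
sudo setsebool -P container_use_devices on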


@dhiltgen commented on GitHub (Sep 9, 2024):

@svaningelgem does following these instructions resolve the issue, or is there something else preventing GPU access on your system?


@svaningelgem commented on GitHub (Sep 9, 2024):

Hi @dhiltgen , it seems to me that I don't need to force anything as my GPU is supported by default:

time=2024-09-07T14:58:25.149Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cuda_v12 rocm_v60102 cpu cpu_avx cpu_avx2 cuda_v11]"
time=2024-09-07T14:58:25.150Z level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
time=2024-09-07T14:58:25.163Z level=INFO source=amd_linux.go:345 msg="amdgpu is supported" gpu=0 gpu_type=gfx1100
time=2024-09-07T14:58:25.163Z level=INFO source=types.go:107 msg="inference compute" id=0 library=rocm variant="" compute=gfx1100 driver=6.8 name=1002:744c total="24.0 GiB" available="23.3 GiB"

I also have only 1 GPU in my system (unless it sees my CPU as a GPU as well, but I don't see that appearing in the logs).

rocminfo:

$ /opt/rocm/bin/rocminfo | grep gfx
  Name:                    gfx1100 

Full rocminfo output: rocminfo.log


@dhiltgen commented on GitHub (Sep 9, 2024):

@svaningelgem to clarify, the startup messages you're seeing are based on Ollama code looking in sysfs to discover the GPUs. This is different from performing inference, where C++ code is using ROCm libraries to access the device directly. I don't have a system handy to test at the moment, but it's plausible SELinux may only be involved in the device access via ROCm, and not the sysfs discovery at startup. If you haven't tried those steps, I would give them a try so we can rule them out as a possible root cause.
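
A rough way to see those two paths side by side (illustrative only; the exact sysfs layout can vary by kernel version):

# discovery path: sysfs is world-readable, no special group membership needed
ls /sys/class/kfd/kfd/topology/nodes/
# inference path: ROCm needs read-write access to the device nodes
for d in /dev/kfd /dev/dri/renderD128; do
  [ -r "$d" ] && [ -w "$d" ] && echo "$d: rw ok" || echo "$d: no rw access"
done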


@svaningelgem commented on GitHub (Sep 10, 2024):

SELinux is (afaik) a feature of Red Hat-based systems; on Ubuntu it's AppArmor. That might interfere too, but I couldn't immediately find anything there. If it were blocked, I'd expect the GPU simply not to be reported; however, it does get reported as an AMD GPU, so that leads me to conclude it's not blocked.

Could you maybe tell me what I can try and iterate on, and I'll do this when I get back home? Maybe also tell me how to enable debug logging, as I don't see anything wrong until it just crashes with my initial message...

Because as far as I can see, the gfx1100 should be fine. [FYI: Linux kernel patches]

Thanks!


@svaningelgem commented on GitHub (Sep 10, 2024):

Hmmm, I got this list from ChatGPT (so take it with a grain of salt as I couldn't verify the contents):

RDNA 3 (GFX1100 series)

  • GFX1102 - Radeon RX 7900 XTX, RX 7900 XT
  • GFX1103 - Radeon RX 7800 XT, RX 7700 XT

RDNA 2 (GFX1030 series)

  • GFX1030 - Radeon RX 6900 XT, RX 6800 XT, RX 6800
  • GFX1031 - Radeon RX 6700 XT
  • GFX1032 - Radeon RX 6600, RX 6600 XT

RDNA 1 (GFX1010 series)

  • GFX1010 - Radeon RX 5700 XT, RX 5700
  • GFX1011 - Radeon RX 5600 XT
  • GFX1012 - Radeon RX 5500 XT

Vega (GFX900 series)

  • GFX906 - Radeon VII, Radeon Instinct MI50, MI60
  • GFX900 - Radeon RX Vega 64, Vega 56

Polaris (GFX803/804 series)

  • GFX804 - Radeon RX 590, RX 580, RX 570 (Polaris 20)
  • GFX803 - Radeon RX 480, RX 470 (Polaris 10)

Navi 1X (GFX1010/1011)

  • GFX1010 - Radeon RX 5700 Series
  • GFX1011 - Radeon RX 5500 Series

Navi 2X (GFX1030/1031)

  • GFX1030 - Radeon RX 6800 Series, RX 6900 Series
  • GFX1031 - Radeon RX 6700 Series

Navi 3X (GFX1100)

  • GFX1102 - Radeon RX 7900 Series

@svaningelgem commented on GitHub (Sep 10, 2024):

Ok, my current command line:

docker run -e HSA_OVERRIDE_GFX_VERSION=gfx1102 -e OLLAMA_DEBUG=true --gpus=all --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --replace --name ollama ollama/ollama:rocm

Failure log: 2024-09-10.log

I also tried with "11.0.2" instead of "gfx1102", but that also got the same result.
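
Worth noting: HSA_OVERRIDE_GFX_VERSION is parsed as a numeric major.minor.stepping triple, so a gfx name like gfx1102 isn't a valid value at all. Assuming that parsing, the numeric spellings would be:

# gfx1100 (the 7900 XTX's native target, so no override should be needed)
-e HSA_OVERRIDE_GFX_VERSION=11.0.0
# gfx1102
-e HSA_OVERRIDE_GFX_VERSION=11.0.2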


@svaningelgem commented on GitHub (Sep 10, 2024):

Ok, I used this auto-detection script:

#!/bin/bash

# Comprehensive list of GFX versions to try for AMD 7900 XTX
# Starting with GFX11 (RDNA 3) and including earlier versions for compatibility
GFX_VERSIONS=(
    # GFX11 (RDNA 3) - Primary target for 7900 XTX
    "11.0.0" "11.0.1" "11.0.2" "11.0.3" "11.1.0" "11.1.1" "11.1.2"
    # GFX10 (RDNA 2 and 1) - For potential backwards compatibility
    "10.3.0" "10.3.1" "10.3.2" "10.3.3" "10.3.4"
    "10.1.0" "10.1.1" "10.1.2"
    "10.0.0" "10.0.1" "10.0.2" "10.0.3"
    # GFX9 (Vega) - Included for extended backwards compatibility testing
    "9.0.0" "9.0.1" "9.0.2" "9.0.3" "9.0.4" "9.0.5" "9.0.6" "9.0.7" "9.0.8" "9.0.9"
    # Earlier versions included for thoroughness, though less likely to be optimal
    "8.1.0" "8.0.0" "7.0.0"
)

# Function to check if ollama is ready
check_ollama() {
  curl -s http://localhost:11434/api/version > /dev/null
  return $?
}

# Function to run ollama and test it
run_ollama() {
  # Wait for ollama to be ready
  while ! check_ollama; do
    echo "Waiting for ollama to be ready..."
    sleep 5
  done

  # Run ollama and test it
  if docker exec -it ollama ollama run llama3.1 "Hello, how are you?"; then
    echo "Ollama run successful"
    touch /tmp/ollama_success
  else
    echo "Ollama run failed"
    rm -f /tmp/ollama_success
  fi
}

for GFX_VERSION in "${GFX_VERSIONS[@]}"; do
    echo "Trying GFX version: $GFX_VERSION"

    # Stop and remove existing container if it exists
    docker stop ollama 2>/dev/null
    docker rm ollama 2>/dev/null

    # Run the Docker container
    docker run -d \
      -e HSA_OVERRIDE_GFX_VERSION=$GFX_VERSION \
      -e OLLAMA_DEBUG=true \
      --gpus=all \
      --device /dev/kfd \
      --device /dev/dri \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      --name ollama \
      ollama/ollama:rocm

    # Save the log with the version in background
    docker logs -f ollama |& tee "/var/log/ollama/${GFX_VERSION}.log" &
    LOG_PID=$!

    # Run ollama test in background
    run_ollama &
    OLLAMA_PID=$!

    # Wait for the Ollama run to complete or timeout after 5 minutes
    timeout 300 tail --pid=$OLLAMA_PID -f /dev/null

    # Check if Ollama run was successful
    if [ -f "/tmp/ollama_success" ]; then
        echo "Ollama run successful with GFX version $GFX_VERSION" >> /tmp/auto.txt
        kill $LOG_PID
        break
    else
        echo "Ollama run failed or timed out with GFX version $GFX_VERSION. Trying next version..." >> /tmp/auto.txt
        kill $LOG_PID
        kill $OLLAMA_PID 2>/dev/null
    fi
done

# Report the final result, then clean up the marker file
# (checking after removing it would always report failure)
if [ -f "/tmp/ollama_success" ]; then
    echo "Successfully found a working GFX version: $GFX_VERSION"
else
    echo "Failed to find a working GFX version."
fi
rm -f /tmp/ollama_success

But all of them failed...

What else could I try?


@TheRedCyclops commented on GitHub (Sep 10, 2024):

Have you tried with a system install? That worked for me, although it's not ideal.
PS: this is also happening on Arch Linux without SELinux or AppArmor, using a GFX version that has been verified to work in other AI applications and with a system install of ollama; relevant discord thread.
PS2: the --gpus all option only seems to be relevant for NVIDIA GPUs.


@svaningelgem commented on GitHub (Sep 10, 2024):

@Glich440: no, not yet. I'll give it a try, but that's not really what I want to do... I'd like to run it from within a container.
I'll update this comment once I've tried via a system install.


@dhiltgen commented on GitHub (Sep 10, 2024):

@svaningelgem you could try setting AMD_LOG_LEVEL to 2 or 3 and see if some more useful details emerge from ROCm. https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/debugging.html#hip-environment-variable-summary

I'm also curious if this has regressed in newer versions of Ollama, or if all ~recent versions fail in the same way.


@TheRedCyclops commented on GitHub (Sep 10, 2024):

I have also tested with the 0.3.7-rocm and 0.3.1-rocm tags; it fails in the same way.


@dhiltgen commented on GitHub (Sep 10, 2024):

@Glich440 if you can identify which version was working for your setup, that will help us isolate what changed and is causing the regression.


@TheRedCyclops commented on GitHub (Sep 10, 2024):

On the native system it seems to work with any version; I have never actually gotten the GPU to work with the docker container.


@TheRedCyclops commented on GitHub (Sep 10, 2024):

I have now tested the 0.2.1-rocm tag and I get a slightly different error message, from:
Error: llama runner process has terminated: error:Could not initialize Tensile host: No devices found
to:
Error: llama runner process has terminated: signal: aborted (core dumped) error:Could not initialize Tensile host: No devices found
but it still seems to be the same error


@TheRedCyclops commented on GitHub (Sep 10, 2024):

Wait, huge development: when I use this command docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 -e HSA_OVERRIDE_GFX_VERSION="10.3.0" --name ollama-test2 ollama/ollama:rocm it actually runs on the GPU!
Now I just don't understand why that works but this fails:

name: ollama-debugging
services:
  ollama:
    image: ollama/ollama:rocm
    container_name: ollama-testing
    environment:
      - HSA_OVERRIDE_GFX_VERSION="10.3.0"
      - OLLAMA_DEBUG=true
      - AMD_LOG_LEVEL=2
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    volumes:
      - ./data:/root/.ollama
    expose:
      - 11434:11434
    restart: unless-stopped
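
One plausible culprit in the compose variant above (an assumption on my part, not verified in this thread): with the list form of environment:, the quotes become part of the value, so ROCm would see literally "10.3.0" including the quote characters, whereas the shell strips them in the docker run case. Likewise, expose: only documents the port; ports: is what actually publishes it. The equivalent unquoted form would be:

    environment:
      - HSA_OVERRIDE_GFX_VERSION=10.3.0
      - OLLAMA_DEBUG=true
      - AMD_LOG_LEVEL=2
    ports:
      - 11434:11434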

@svaningelgem commented on GitHub (Sep 10, 2024):

AMD_LOG_LEVEL

I do indeed get a little bit more info:

:3:rocdevice.cpp            :468 : 0079107519 us: [pid:37    tid:0x79adf78bf340] Initializing HSA stack.
:1:rocdevice.cpp            :478 : 0079107773 us: [pid:37    tid:0x79adf78bf340] hsa_init failed with 1008
:1:runtime.cpp              :78  : 0079107777 us: [pid:37    tid:0x79adf78bf340] Runtime initialization failed
:3:hip_device_runtime.cpp   :638 : 0079107787 us: [pid:37    tid:0x79adf78bf340]  hipGetDeviceCount ( 0x7ffd06c92c9c ) 
:3:hip_device_runtime.cpp   :640 : 0079107789 us: [pid:37    tid:0x79adf78bf340] hipGetDeviceCount: Returned hipErrorNoDevice : 

rocBLAS error: Could not initialize Tensile host: No devices found

To me it doesn't seem useful, but I hope it does to you? ;-)


@svaningelgem commented on GitHub (Sep 10, 2024):

This is what is loaded inside the pod:

[root@0a507b630d4c lib]# lsmod | grep amd
edac_mce_amd           28672  0 
kvm_amd               208896  0 
kvm                  1404928  1 kvm_amd
ccp                   143360  1 kvm_amd
gpio_amdpt             16384  0 
amdgpu              19636224  11 
amddrm_ttm_helper      12288  1 amdgpu
amdttm                110592  2 amdgpu,amddrm_ttm_helper
amddrm_buddy           20480  1 amdgpu
amdxcp                 12288  1 amdgpu
drm_exec               12288  1 amdgpu
drm_suballoc_helper    16384  1 amdgpu
amd_sched              61440  1 amdgpu
amdkcl                 32768  3 amd_sched,amdttm,amdgpu
drm_display_helper    237568  1 amdgpu
video                  73728  3 asus_wmi,amdgpu,asus_nb_wmi
i2c_algo_bit           16384  1 amdgpu

I was thinking that maybe the kernel module wasn't loaded, but that seems not to be the case.


@dhiltgen commented on GitHub (Sep 10, 2024):

@svaningelgem poking around online, it seems like the kfd driver might be involved. Anything interesting on the host in sudo dmesg | grep kfd ?


@svaningelgem commented on GitHub (Sep 10, 2024):

docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 -e HSA_OVERRIDE_GFX_VERSION="10.3.0" --name ollama-test2 ollama/ollama:rocm

Sorry mate... Not working on my side :( (the only thing I removed is the "-d", to not run it in the background)

I tried the exact same command, and ... nothing, still the same error.


@svaningelgem commented on GitHub (Sep 10, 2024):

sudo dmesg | grep kfd

[    3.980681] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    3.980697] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[    3.980980] kfd kfd: amdgpu: added device 1002:744c

@dhiltgen commented on GitHub (Sep 10, 2024):

  /**
   * The HSA runtime failed to allocate the necessary resources. This error
   * may also occur when the HSA runtime needs to spawn threads or create
   * internal OS-specific events.
   */
  HSA_STATUS_ERROR_OUT_OF_RESOURCES = 0x1008,

It seems like it may be a permissions issue with the kfd device. Are you attempting to run deprivileged by any chance?


@dhiltgen commented on GitHub (Sep 10, 2024):

Try adding --privileged to the docker run and see if that resolves it?


@svaningelgem commented on GitHub (Sep 10, 2024):

Adding --privileged to the docker run command didn't solve anything. Still the same.
I'm now trying with sudo on top of the --privileged command (but it seems to be copying the container to the root user, so it can take a while ;)).

The command now is:
sudo docker run -e HSA_OVERRIDE_GFX_VERSION=11.0.0 -e AMD_LOG_LEVEL=3 -e OLLAMA_DEBUG=true --privileged --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --replace --name ollama ollama/ollama:rocm


@dhiltgen commented on GitHub (Sep 10, 2024):

I've been able to reproduce the same failure mode by running on a Linux host after removing my account from the video and render groups, so it seems like this is most likely a permission problem somewhere in the docker container runtime.

llm_load_print_meta: max token length = 18
:3:rocdevice.cpp            :468 : 1550426468 us: [pid:4574  tid:0x7fedb08b8340] Initializing HSA stack.
:1:rocdevice.cpp            :478 : 1550426510 us: [pid:4574  tid:0x7fedb08b8340] hsa_init failed with 1008
:1:runtime.cpp              :78  : 1550426513 us: [pid:4574  tid:0x7fedb08b8340] Runtime initialization failed
:3:hip_device_runtime.cpp   :638 : 1550426518 us: [pid:4574  tid:0x7fedb08b8340]  hipGetDeviceCount ( 0x7ffd160d91fc )
:3:hip_device_runtime.cpp   :640 : 1550426521 us: [pid:4574  tid:0x7fedb08b8340] hipGetDeviceCount: Returned hipErrorNoDevice :

rocBLAS error: Could not initialize Tensile host: No devices found
time=2024-09-10T16:21:43.039Z level=ERROR source=sched.go:456 msg="error loading llama server" error="llama runner process has terminated: error:Could not initialize Tensile host: No devices found"

We used to have a check at startup on ROCm systems to verify permissions, but it seems that code was accidentally removed at some point in our GPU discovery refactoring work. I'll add a check back in so we can fail fast with a more helpful error message when permissions are not set up correctly for the ollama serve command to access the Radeon device.


@TheRedCyclops commented on GitHub (Sep 10, 2024):

I ended up using composerize and got this; it seems to work:

name: <your project name>
services:
    ollama:
        devices:
            - /dev/kfd
            - /dev/dri
        volumes:
            - ./ollama:/root/.ollama
        ports:
            - 11434:11434
        environment:
            - HSA_OVERRIDE_GFX_VERSION=10.3.0
        container_name: ollama-test2
        image: ollama/ollama:rocm

My issue is solved, hope you can solve it too @svaningelgem


@svaningelgem commented on GitHub (Sep 10, 2024):

@dhiltgen :

(base) root@LinuxPC:/var/lib/containers# adduser steven video
info: The user `steven' is already a member of `video'.
(base) root@LinuxPC:/var/lib/containers# adduser steven render
info: The user `steven' is already a member of `render'.

But indeed it seems to be pointing to a permission issue


@svaningelgem commented on GitHub (Sep 10, 2024):

Might it be an issue that on my machine I am running ROCm 6.2 and in the pod it's ROCm 6.0? I doubt it, but you never know...


@rtaic-coder commented on GitHub (Sep 10, 2024):

I am wondering if this is because of a rootless container. I get this error while running rocminfo inside the container:

docker exec -it ollama rocminfo
ROCk module version 6.8.5 is loaded
Unable to open /dev/kfd read-write: Permission denied

I tried adding the render, video, and nogroup groups to the user running ollama inside the container.


@rtaic-coder commented on GitHub (Sep 10, 2024):

@svaningelgem I thought the same thing, so I built my own ollama image locally with ROCm 6.2, but it still gives me the same error.


@svaningelgem commented on GitHub (Sep 10, 2024):

docker exec -it ollama rocminfo
ROCk module version 6.8.5 is loaded
Unable to open /dev/kfd read-write: Permission denied

I get something more:

(base) steven@LinuxPC:~$ docker exec -it ollama rocminfo
ROCk module version 6.8.5 is loaded
Unable to open /dev/kfd read-write: Permission denied
root is not member of "video" group, the default DRM access group. Users must be a member of the "video" group or another DRM access group in order for ROCm applications to run successfully.

So to me it looks like the root user INSIDE the container needs to be a member of the video group?

Tried it with:

# adduser root video
info: Adding user `root' to group `video' ...

On the host, but that didn't change anything ;)

After: usermod -aG video root, at least the warning went away:

 docker exec -it ollama rocminfo
ROCk module version 6.8.5 is loaded
Unable to open /dev/kfd read-write: Permission denied
root is member of video group

Still isn't right yet...


@rtaic-coder commented on GitHub (Sep 10, 2024):

@svaningelgem Since I am using dev-ubuntu-24.04:6.2-complete as the base of my image, and Ubuntu has nogroup as the owner of the DRM devices, my error was about nogroup. So I added the nogroup, video and render groups to the root user inside the container; now I don't get the group error, only the permission denied error.

Bottom line: it doesn't work.


@svaningelgem commented on GitHub (Sep 10, 2024):

A bit more info:

$ podman info | grep -i apparmor
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +WASM:wasmedge +YAJL
    apparmorEnabled: false

==> AppArmor is NOT enabled on my system, so that isn't interfering with anything either.


@dhiltgen commented on GitHub (Sep 10, 2024):

@svaningelgem it sounds like you're using podman. In that case try:

podman run --rm -it --device=/dev/kfd --device=/dev/dri --ipc=host ...

If that doesn't clear it up, there are some other suggestions on https://github.com/ROCm/ROCm/issues/1549 that might help you find a configuration that gets the permissions wired up correctly so the container can access /dev/kfd

Once we find the solution, I'll update our docs to include that as well.
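
For rootless podman with the crun runtime, another knob worth trying (a suggestion to verify, not a confirmed fix from this thread) is keeping the caller's supplementary groups, so host-side video/render membership survives into the container:

# crun-only: forwards the invoking user's supplementary groups into the container
podman run --rm -it --device=/dev/kfd --device=/dev/dri --group-add keep-groups ollama/ollama:rocm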


@svaningelgem commented on GitHub (Sep 11, 2024):

This is from within the pod:

[root@LinuxPC /]# id && groups
uid=0(root) gid=0(root) groups=0(root)
root
[root@LinuxPC /]# ls -la /dev/kfd
crw-rw---- 1 65534 65534 235, 0 Sep 10 19:47 /dev/kfd
[root@LinuxPC /]# chown root:video /dev/kfd
chown: changing ownership of ‘/dev/kfd’: Operation not permitted

So it looks like the kfd device has a wrong user/group assigned. Checking now how I can make the "root" user part of it.

I also tried via subgid, but that didn't work out either. I'll try a little bit with base rocm images to see if things work out first and take it from there.
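
For what it's worth, 65534 is the kernel's overflow ID: it is what typically shows up when a host UID/GID has no mapping in the container's user namespace, as in rootless setups where /etc/subuid and /etc/subgid map only a limited range. The overflow values can be confirmed on the host:

# both usually print 65534
cat /proc/sys/kernel/overflowuid /proc/sys/kernel/overflowgid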


@svaningelgem commented on GitHub (Sep 11, 2024):

Ok, update: I tried with this command:
docker run -it --device=/dev/kfd --group-add daemon rocm/pytorch:latest rocminfo

This made it output stuff (and not the permission error anymore).
When I check the device in this pytorch image, I see this:

$ docker run -it --device=/dev/kfd --group-add daemon rocm/pytorch:latest ls -l /dev/kfd
crw-rw---- 1 nobody daemon 235, 0 Sep 11 07:18 /dev/kfd

When I do the same in the ollama image, I get:

$ docker run -it --device=/dev/kfd --group-add daemon --replace --name ollama --entrypoint /bin/bash ollama/ollama:rocm -c "ls -l /dev/kfd"
crw-rw---- 1 65534 bin 235, 0 Sep 11 07:18 /dev/kfd

So to me it looks like there is something wrong with the user/group of this device inside the ollama image.
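
A generic way to discover which group a given image resolves the device to (a small sketch; it assumes the image ships GNU coreutils so stat supports -c):

# prints the group name and numeric GID that this image sees on /dev/kfd
docker run --rm --device /dev/kfd --entrypoint stat ollama/ollama:rocm -c '%G (gid %g)' /dev/kfd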


@TheRedCyclops commented on GitHub (Sep 11, 2024):

Have you tried adding root to the bin group?


@svaningelgem commented on GitHub (Sep 11, 2024):

Well, today it showed as "bin"; yesterday it was "65534"... But bin is wrong anyhow: it should be video or render, I presume?
I also noticed that the API calls to the service are colored now, so it's likely a new rocm image.

Ok, rocminfo doesn't give the error anymore, but it's only showing the CPU, not the GPU...


@TheRedCyclops commented on GitHub (Sep 11, 2024):

Have you also forwarded /dev/dri?
And what are the permissions and owner of /dev/kfd on your base system?


@svaningelgem commented on GitHub (Sep 11, 2024):

Yeah, I did notice that one as well:

So the smallest command line that works becomes:

$ docker run -it --device=/dev/kfd --device=/dev/dri --group-add daemon rocm/pytorch:latest rocminfo | grep GPU
  Uuid:                    GPU-afcb395b37c2835e               
  Device Type:             GPU     
$ docker run -it --device=/dev/kfd --device=/dev/dri --group-add daemon rocm/pytorch:latest ls -ld /dev/kfd /dev/dri /dev/dri/*
ls: cannot access '/dev/dri/by-path': No such file or directory
drwxr-xr-x  2 root   root         80 Sep 11 09:18 /dev/dri
crw-rw----+ 1 nobody daemon 226,   1 Sep 11 09:06 /dev/dri/card1
crw-rw----+ 1 nobody bin    226, 128 Sep 11 09:06 /dev/dri/renderD128
crw-rw----  1 nobody daemon 235,   0 Sep 11 09:06 /dev/kfd

In ollama:

$ docker run -it --device=/dev/kfd --device=/dev/dri --group-add bin --replace --name ollama --entrypoint /bin/bash ollama/ollama:rocm -c "ls -ld /dev/kfd /dev/dri /dev/dri/*"
drwxr-xr-x  2 root  root         80 Sep 11 09:19 /dev/dri
crw-rw----+ 1 65534 bin    226,   1 Sep 11 09:06 /dev/dri/card1
crw-rw----+ 1 65534 daemon 226, 128 Sep 11 09:06 /dev/dri/renderD128
crw-rw----  1 65534 bin    235,   0 Sep 11 09:06 /dev/kfd

on my base system:

(base) steven@LinuxPC:~$ ls -ld /dev/kfd /dev/dri /dev/dri/*
drwxr-xr-x  3 root root        100 sep 11 11:06 /dev/dri
drwxr-xr-x  2 root root         80 sep 11 11:06 /dev/dri/by-path
crw-rw----+ 1 root video  226,   1 sep 11 11:06 /dev/dri/card1
crw-rw----+ 1 root render 226, 128 sep 11 11:06 /dev/dri/renderD128
crw-rw----  1 root video  235,   0 sep 11 11:06 /dev/kfd

And we have liftoff 🚀 !!

(base) steven@LinuxPC:~$ ollama run llama3.1
>>> hello
Hello! How are you today? Is there something I can help you with or would you like to chat?

>>> Send a message (/? for help)

Used command line:

docker run -it --replace \
	-v ollama:/root/.ollama \
	--device /dev/kfd --device /dev/dri \
	--group-add bin \
	-e AMD_LOG_LEVEL=3 -e OLLAMA_DEBUG=true \
	-p 11434:11434 \
	--name ollama \
	ollama/ollama:rocm
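
For the non-rootless case, where host and container IDs line up 1:1, one way to avoid hard-coding a group name that clearly varies between setups is to pass the device's numeric GID instead (a sketch; this won't help when a user namespace remaps the GID, as in the rootless reports above):

docker run -it --replace \
	-v ollama:/root/.ollama \
	--device /dev/kfd --device /dev/dri \
	--group-add "$(stat -c '%g' /dev/kfd)" \
	-p 11434:11434 \
	--name ollama \
	ollama/ollama:rocm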

@svaningelgem commented on GitHub (Sep 11, 2024):

Thanks @dhiltgen, but could you maybe also log the group name of the device, so you know what to pass to --group-add? You would assume video or render, but in my case it was "bin" that was necessary.

So it'd be advantageous to have this knowledge in the logs already, so you don't have to shell into the pod to find it.


@rtaic-coder commented on GitHub (Sep 12, 2024):

In my case, the owner of the kfd device is 65534:

docker run -dit --device=/dev/kfd --device=/dev/dri --group-add bin --rm --name ollama ollama/ollama:rocm
docker exec -it ollama bash
[root@4e2fa1b63144 /]# ls -ld /dev/dri/* /dev/kfd
crw-rw---- 1 65534 65534 226,   0 Sep 12 01:01 /dev/dri/card0
crw-rw---- 1 65534 65534 226, 128 Sep 12 01:01 /dev/dri/renderD128
crw-rw---- 1 65534 65534 235,   0 Sep 12 01:01 /dev/kfd

So adding the bin group doesn't really work here.


@svaningelgem commented on GitHub (Sep 12, 2024):

You can use --group-add 65534, but that didn't do anything in my case. It really needs to be an available group.

So the final fix for this would be to have the device assigned to a real group (like the rocm/pytorch image I showed in my comment).


@svaningelgem commented on GitHub (Sep 12, 2024):

@rtaic-coder : as this ticket is closed, maybe reference it in another issue to raise awareness to your case?


@dhiltgen commented on GitHub (Sep 12, 2024):

I updated the troubleshooting section for AMD GPUs here - https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#amd-gpu-discovery

My understanding was that --group-add needed to match the group of the device on the host, not inside the container. Is that not the case? Is there some mapping taking place in these deprivileged scenarios where the GID on the host differs from the GID inside the container?


@svaningelgem commented on GitHub (Sep 20, 2024):

@dhiltgen, no, the --group-add had to match the group of the device inside the container. Which is kind of logical when you think about it: the container is its own system and has certain rights to the file. The rights on the host do not really matter in that case, because the pod is separated from it.

I think the underlying issue is: why is the group name not the same for the pod and the host? (As demonstrated with the pytorch pod above, it has a fixed "daemon" group in place, whereas ollama's seems to be kind of dynamic...)
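
The name mismatch falls out of /etc/group differing between images: the device node keeps its numeric GID across the bind, and each image resolves that GID to whatever name its own /etc/group happens to assign. A quick way to compare (a sketch, assuming getent is available in both images; 44 and 993 are just typical host GIDs for video and render, substitute the numbers from ls -n on your host):

docker run --rm --entrypoint getent ollama/ollama:rocm group 44 993
docker run --rm --entrypoint getent rocm/pytorch:latest group 44 993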


@dhiltgen commented on GitHub (Sep 24, 2024):

@svaningelgem can you try using the numeric group ID instead of the name? On your host ls -n /dev/kfd /dev/dri/renderD128 should show it.

This is somewhat similar to #5986


@rtaic-coder commented on GitHub (Sep 26, 2024):

I ended up switching from rootless docker to sudo docker, since I spent countless hours trying to make it work in the rootless scenario; no matter what I did, it gave the same error in rootless. Thanks for all the help.
