[GH-ISSUE #2749] AMD GPU ROCm library search path hardcoded to wrong path #1655

Closed
opened 2026-04-12 11:36:57 -05:00 by GiteaMirror · 4 comments

Originally created by @jeffcpullen on GitHub (Feb 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2749

Originally assigned to: @dhiltgen on GitHub.

Problem Summary

When using AMD GPUs with ROCm, ollama fails to find the librocm_smi64.so library unless it is installed under the hardcoded path introduced in MR #2553. It does not search system library paths such as /lib64/.

The glob pattern below does not match on Fedora systems, where the rocm-smi-devel package installs the library at /lib64/librocm_smi64.so.

var RocmLinuxGlobs = []string{
	"/opt/rocm*/lib*/librocm_smi64.so*",
}

https://github.com/ollama/ollama/blob/e95b8967909c490cf0cf608388dbeae96fbe3bcf/gpu/gpu.go#L57C1-L59C2
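
To illustrate why the Fedora location is missed, here is a minimal Go sketch that checks candidate paths against the quoted pattern. The `extendedGlobs` list and the `matchesAny` helper are hypothetical and only show what a broader pattern list could look like; they are not taken from the ollama source and are not the fix that was eventually adopted.

```go
package main

import (
	"fmt"
	"path"
)

// rocmLinuxGlobs mirrors the single pattern quoted in the issue.
var rocmLinuxGlobs = []string{
	"/opt/rocm*/lib*/librocm_smi64.so*",
}

// extendedGlobs is a hypothetical list that also covers common system
// library directories. It is an assumption made for illustration only.
var extendedGlobs = []string{
	"/opt/rocm*/lib*/librocm_smi64.so*",
	"/usr/lib*/librocm_smi64.so*",
	"/lib*/librocm_smi64.so*",
}

// matchesAny reports whether candidate matches at least one glob pattern.
// path.Match always treats '/' as the separator, so '*' never crosses
// a directory boundary.
func matchesAny(patterns []string, candidate string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, candidate); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	fedoraPath := "/lib64/librocm_smi64.so"
	fmt.Println(matchesAny(rocmLinuxGlobs, fedoraPath)) // false
	fmt.Println(matchesAny(extendedGlobs, fedoraPath))  // true
}
```

Because every pattern in the original list begins with /opt/rocm, no file directly under /lib64 can match, regardless of what `ldconfig` reports.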

Errors observed

Checking that the library is installed

$ ldconfig -p | grep rocm
	librocm_smi64.so.5 (libc6,x86-64) => /lib64/librocm_smi64.so.5
	librocm_smi64.so (libc6,x86-64) => /lib64/librocm_smi64.so

Starting ollama v0.1.27

$ ollama serve
time=2024-02-25T14:23:31.275-05:00 level=INFO source=images.go:710 msg="total blobs: 0"
time=2024-02-25T14:23:31.276-05:00 level=INFO source=images.go:717 msg="total unused blobs removed: 0"
time=2024-02-25T14:23:31.277-05:00 level=INFO source=routes.go:1019 msg="Listening on 127.0.0.1:11434 (version 0.1.27)"
time=2024-02-25T14:23:31.277-05:00 level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
time=2024-02-25T14:23:34.444-05:00 level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [cuda_v11 cpu rocm_v5 rocm_v6 cpu_avx2 cpu_avx]"
time=2024-02-25T14:23:34.444-05:00 level=INFO source=gpu.go:94 msg="Detecting GPU type"
time=2024-02-25T14:23:34.444-05:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-02-25T14:23:34.449-05:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-02-25T14:23:34.449-05:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
time=2024-02-25T14:23:34.449-05:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-02-25T14:23:34.449-05:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T14:23:34.449-05:00 level=INFO source=routes.go:1042 msg="no GPU detected"

Workaround

Manually add a symlink at a location that the hardcoded glob '/opt/rocm*/lib*/librocm_smi64.so*' matches.

sudo ln -s /lib64/librocm_smi64.so /opt/rocm/lib64/librocm_smi64.so

Verify symlink and start ollama

$ ls -al /opt/rocm/lib64/librocm_smi64.so 
lrwxrwxrwx. 1 root root 23 Feb 25 14:31 /opt/rocm/lib64/librocm_smi64.so -> /lib64/librocm_smi64.so
$ ollama serve
time=2024-02-25T14:31:47.298-05:00 level=INFO source=images.go:710 msg="total blobs: 0"
time=2024-02-25T14:31:47.298-05:00 level=INFO source=images.go:717 msg="total unused blobs removed: 0"
time=2024-02-25T14:31:47.299-05:00 level=INFO source=routes.go:1019 msg="Listening on 127.0.0.1:11434 (version 0.1.27)"
time=2024-02-25T14:31:47.299-05:00 level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
time=2024-02-25T14:31:49.855-05:00 level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [cpu rocm_v6 cuda_v11 cpu_avx cpu_avx2 rocm_v5]"
time=2024-02-25T14:31:49.855-05:00 level=INFO source=gpu.go:94 msg="Detecting GPU type"
time=2024-02-25T14:31:49.855-05:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-02-25T14:31:49.859-05:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-02-25T14:31:49.859-05:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
time=2024-02-25T14:31:49.859-05:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/lib64/librocm_smi64.so.5.0]"
time=2024-02-25T14:31:49.864-05:00 level=INFO source=gpu.go:109 msg="Radeon GPU detected"
time=2024-02-25T14:31:49.864-05:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"

GPU is now detected

System Information

Ollama version: ollama-linux-amd64-v0.1.27
OS: Fedora release 39 (Thirty Nine)
ROCm packages:

$ rpm -qa | grep rocm
hsakmt-1.0.6-34.rocm5.7.0.fc39.x86_64
rocm-runtime-5.7.1-1.fc39.x86_64
rocminfo-5.7.0-1.fc39.x86_64
rocm-comgr-17.0-3.fc39.x86_64
rocm-opencl-5.7.1-1.fc39.x86_64
rocm-clinfo-5.7.1-1.fc39.x86_64
rocm-device-libs-17.1-1.fc39.x86_64
rocm-hip-5.7.1-1.fc39.x86_64
rocm-smi-5.7.1-1.fc39.x86_64
rocm-smi-devel-5.7.1-1.fc39.x86_64

inxi info:

$ inxi -Gzxx
Graphics:
  Device-1: AMD Vega 10 XTX [Radeon Frontier Edition] driver: amdgpu v: kernel
    arch: GCN-5 pcie: speed: 8 GT/s lanes: 16 ports: active: DP-3
    empty: DP-1,DP-2,HDMI-A-1 bus-ID: 03:00.0 chip-ID: 1002:6863
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 23.2.4
    compositor: gnome-shell v: 45.4 driver: X: loaded: amdgpu
    unloaded: fbdev,modesetting,vesa dri: radeonsi gpu: amdgpu display-ID: :1
    screens: 1
  Screen-1: 0 s-res: 3440x1440 s-dpi: 96
  Monitor-1: DP-3 mapped: DisplayPort-2 model: Asus ROG PG348Q
    res: 3440x1440 dpi: 109 diag: 865mm (34.1")
  API: OpenGL v: 4.6 vendor: amd mesa v: 23.3.5 glx-v: 1.4 es-v: 3.2
    direct-render: yes renderer: AMD Radeon Vega Frontier Edition (radeonsi
    vega10 LLVM 17.0.6 DRM 3.57 6.7.5-100.fc38.x86_64) device-ID: 1002:6863
  API: Vulkan v: 1.3.268 surfaces: xcb,xlib device: 0 type: discrete-gpu
    driver: mesa radv device-ID: 1002:6863 device: 1 type: cpu
    driver: mesa llvmpipe device-ID: 10005:0000
  API: EGL Message: EGL data requires eglinfo. Check --recommends.

rocminfo:

*******                  
Agent 2                  
*******                  
  Name:                    gfx900                             
  Uuid:                    GPU-0214ffb436543084               
  Marketing Name:          AMD Radeon Vega Frontier Edition   
  Vendor Name:             AMD                                
<snip>
GiteaMirror added the bug, amd labels 2026-04-12 11:36:57 -05:00

@jmorganca commented on GitHub (Feb 25, 2024):

@dhiltgen


@jeffcpullen commented on GitHub (Feb 25, 2024):

Update to the workaround. Setting the LD_LIBRARY_PATH variable on launch is a lot cleaner.

LD_LIBRARY_PATH=/usr/lib64 ollama serve
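
That this workaround helps suggests the library search also considers directories listed in LD_LIBRARY_PATH. A minimal sketch of how such a lookup could turn that variable into glob patterns follows; `patternsFromLibraryPath` is a hypothetical helper written for illustration, not code copied from ollama.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// patternsFromLibraryPath is a hypothetical helper: it splits a
// colon-separated LD_LIBRARY_PATH value and returns one glob pattern
// per directory for the given library base name.
func patternsFromLibraryPath(ldLibraryPath, baseLibName string) []string {
	var patterns []string
	for _, dir := range strings.Split(ldLibraryPath, ":") {
		if dir == "" {
			continue // skip empty entries such as "a::b"
		}
		patterns = append(patterns, path.Join(dir, baseLibName+"*"))
	}
	return patterns
}

func main() {
	for _, p := range patternsFromLibraryPath("/usr/lib64", "librocm_smi64.so") {
		fmt.Println(p) // /usr/lib64/librocm_smi64.so*
	}
}
```

With LD_LIBRARY_PATH=/usr/lib64, a pattern built this way would match /usr/lib64/librocm_smi64.so and its versioned variants.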

@dhiltgen commented on GitHub (Feb 26, 2024):

I'm working on an update that switches GPU discovery to sysfs instead of the management library, which should make discovery more robust.


@dhiltgen commented on GitHub (Mar 11, 2024):

Starting in 0.1.29 (https://github.com/ollama/ollama/releases/tag/v0.1.29) we no longer rely on the ROCm management library for discovery of Radeon GPUs on Linux. This should be fixed now.


Reference: github-starred/ollama#1655