[GH-ISSUE #11474] 0.9.6 ollama can't find /usr/share/libdrm/amdgpu.ids; instead it searches in /opt/amdgpu/share/libdrm/amdgpu.ids #69633

Closed
opened 2026-05-04 18:43:23 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @ghost on GitHub (Jul 19, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11474

What is the issue?

0.6.8 searches for /usr/share/libdrm/amdgpu.ids, the correct path,
while 0.9.6 uses /opt/amdgpu/share/libdrm/amdgpu.ids, which does not exist.
Fedora 42.

Relevant log output

It basically says file not found: `/opt/amdgpu/share/libdrm/amdgpu.ids`.
Both instances, 0.6.8 and 0.9.6, are installed under `/opt`.

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.9.6

GiteaMirror added the amd, bug, linux labels 2026-05-04 18:43:24 -05:00
Author
Owner

@ghost commented on GitHub (Jul 19, 2025):

Basically my RDNA2 laptop cannot load any models because ROCm fails to load. I've set up the correct RDNA2 HSA_OVERRIDE that works with the Ryzen 7 Pro 6850U iGPU.
But my RDNA3 desktop with the older Ollama version doesn't have this problem.

Author
Owner

@dhiltgen commented on GitHub (Jul 31, 2025):

Can you share server logs with OLLAMA_DEBUG=1 set on both the old version that works and ideally the latest version (currently 0.10.1) so we can compare and see what might be going wrong?
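A minimal sketch of how those logs could be captured, assuming each version is run manually from its own install directory (the `/opt/ollama-*` paths below are hypothetical placeholders, not the reporter's actual layout):

```shell
#!/bin/sh
# Run each binary with debug logging enabled and capture stderr,
# where ollama writes its server log when started by hand.
# Adjust the paths to wherever 0.6.8 and 0.9.6 actually live under /opt.
OLLAMA_DEBUG=1 /opt/ollama-0.6.8/ollama serve 2> old.log   # load a model, then stop
OLLAMA_DEBUG=1 /opt/ollama-0.9.6/ollama serve 2> new.log   # repeat with the new build

# If ollama runs as a systemd service instead, set the variable via a
# drop-in (systemctl edit ollama, add Environment="OLLAMA_DEBUG=1"),
# restart the service, and collect the log with:
#   journalctl -u ollama --no-pager > server.log
```

Either approach yields a per-version log that can be diffed to see where GPU discovery diverges.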

Author
Owner

@ghost commented on GitHub (Aug 6, 2025):

[old.log](https://github.com/user-attachments/files/21614162/old.log)
[new.log](https://github.com/user-attachments/files/21614822/new.log)
I think it works? The GPU does get utilized.
I figure that on a laptop a copy of the model goes into shared RAM and is processed by the CPU and GPU at the same time, so the GPU utilization looks off.

Author
Owner

@dhiltgen commented on GitHub (Aug 8, 2025):

It sounds like everything is working correctly. The new.log shows your GPU is discovered, and the model is loaded on it.

Author
Owner

@ghost commented on GitHub (Aug 8, 2025):

Yeah, I've looked into it; it turns out the laptop wasn't loading all the layers.
4 GB of dedicated VRAM wasn't enough to fit a 4B Q4 model.

```
(...)
load_tensors: offloading 25 repeating layers to GPU
load_tensors: offloaded 25/37 layers to GPU
load_tensors:        ROCm0 model buffer size =  1517.35 MiB
load_tensors:   CPU_Mapped model buffer size =   976.34 MiB
(...)
llama_kv_cache_unified:      ROCm0 KV buffer size =   400.00 MiB
llama_kv_cache_unified:        CPU KV buffer size =   176.00 MiB
```
I just thought the model should have fit, so the issue might have been bogus in the first place.
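The back-of-the-envelope arithmetic behind that conclusion can be checked against the figures in the log above (a rough sketch; the 4096 MiB pool is the "4 GB dedicated VRAM" mentioned earlier, and real headroom for compute buffers would shrink it further):

```shell
#!/bin/sh
# Sum the buffer sizes reported by load_tensors and llama_kv_cache_unified
# to see how the model split across GPU (ROCm0) and CPU memory.
awk 'BEGIN {
  gpu_model = 1517.35; cpu_model = 976.34   # MiB, model weight buffers
  gpu_kv    = 400.00;  cpu_kv    = 176.00   # MiB, KV cache buffers
  printf "total weights + KV : %.2f MiB\n", gpu_model + cpu_model + gpu_kv + cpu_kv
  printf "resident on GPU    : %.2f MiB of a 4096 MiB VRAM pool\n", gpu_model + gpu_kv
  printf "layers offloaded   : %.0f%% (25 of 37)\n", 100 * 25 / 37
}'
```

Roughly 3 GiB of weights plus KV cache against 4 GiB of VRAM leaves too little margin for the compute graph, which is consistent with only 25 of 37 layers being offloaded.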

Reference: github-starred/ollama#69633