[GH-ISSUE #2165] ROCm v5 crash - free(): invalid pointer #47749

Closed
opened 2026-04-28 05:09:38 -05:00 by GiteaMirror · 26 comments

Originally created by @dhiltgen on GitHub (Jan 23, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2165

Originally assigned to: @dhiltgen on GitHub.

loading library /tmp/ollama800487147/rocm_v5/libext_server.so
2024/01/23 19:26:51 dyn_ext_server.go:90: INFO Loading Dynamic llm server: /tmp/ollama800487147/rocm_v5/libext_server.so
2024/01/23 19:26:51 dyn_ext_server.go:145: INFO Initializing llama server
free(): invalid pointer
Aborted (core dumped)

Most likely there is some other problem/error, but it appears we're not handling that error case gracefully and are trying to free an invalid pointer.
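
One way to get a bit more signal here (a sketch, assuming gdb is available, and using the release binary name from the reports below):

```
# Run the release binary under gdb and print a native backtrace when
# the free() abort fires; the faulting library should show in the frames.
OLLAMA_DEBUG=1 gdb -batch -ex run -ex bt --args ./ollama-linux-amd64 serve
```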

GiteaMirror added the amd label 2026-04-28 05:09:38 -05:00

@dhiltgen commented on GitHub (Jan 24, 2024):

I've tried to simulate some potential failure modes and, from what I can tell, this `free(): invalid pointer` isn't coming from ollama's cgo or our extern "C" wrapper code freeing an invalid pointer. It may be something within the ROCm library during some init function, or possibly `llama_backend_init`, before any log messages show up. I've just merged #2162, so once we have a new build available for people to try, it may be helpful to see what else is reported in the logs:

OLLAMA_DEBUG=1 ./ollama-linux-amd64 serve

@kylianpl commented on GitHub (Jan 24, 2024):

had the same problem, with [this log](https://github.com/ollama/ollama/files/14043129/ollama-log.txt)

recompiling it simply with `go generate ./...` and `go build .` made a binary that works
maybe the problem is just the way a lib required by ROCm is loaded
Archlinux, ollama v0.1.21 pre-release

@dhiltgen commented on GitHub (Jan 24, 2024):

Thanks for that data point @kylianpl. Could you also share the output of

rocm-smi --showdriverversion --showproductname --showhw
rocm-smi -V

@kylianpl commented on GitHub (Jan 24, 2024):

$ rocm-smi --showdriverversion --showproductname --showhw
========================= ROCm System Management Interface =========================
============================== Concise Hardware Info ===============================
GPU  DID   GFX RAS  SDMA RAS  UMC RAS  VBIOS                   BUS           
0    73bf  N/A      N/A       N/A      113-1MS21XL203W_210810  0000:08:00.0  
====================================================================================
=========================== Version of System Component ============================
Driver version: 6.7.0-arch3-1
====================================================================================
=================================== Product Info ===================================
GPU[0]          : Card series:          Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
GPU[0]          : Card model:           0x6705
GPU[0]          : Card vendor:          Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]          : Card SKU:             unknown
====================================================================================
=============================== End of ROCm SMI Log ================================

$ rocm-smi -v
========================= ROCm System Management Interface =========================
====================================== VBIOS =======================================
GPU[0]          : VBIOS version: 113-1MS21XL203W_210810
====================================================================================
=============================== End of ROCm SMI Log ================================

(`rocm-smi -V` just said `unrecognized arguments: -V`)

@dhiltgen commented on GitHub (Jan 25, 2024):

@kylianpl it looks like your driver is v6, but we're loading v5 based on the discovered `librocm_smi64` version. Is it possible you have mixed versions installed on your system? If so, you could try upgrading everything to v6 so the driver and ROCm libraries are matched.

You could also try forcing it to use v6, although if the v6 libraries aren't present it won't load properly and should fall back to CPU mode:

OLLAMA_LLM_LIBRARY="rocm_v6" ollama serve

It might also be interesting to know what version of ROCm winds up being used when you build from source.
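
To see which ROCm SMI library the discovery will pick up, a quick check (a sketch; install paths vary by distro, so treat these locations as assumptions):

```
# The librocm_smi64 soname major version (.so.5 vs .so.6) is what the
# discovery keys off of when choosing rocm_v5 vs rocm_v6.
ls -l /opt/rocm/lib/librocm_smi64.so* /usr/lib/librocm_smi64.so* 2>/dev/null
```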

@kylianpl commented on GitHub (Jan 25, 2024):

running with the suggested command indeed made an error about a missing lib (libhipblas.so.2) but didn't fall back to CPU mode (didn't crash either): [ollama-log.txt](https://github.com/ollama/ollama/files/14056968/ollama-log.txt)

I searched the Arch repo and it seems like [hipblas](https://archlinux.org/packages/extra/x86_64/hipblas/) is still on 5.7.1-1, but there is a 6.0.0 release in extra-testing I didn't test.

The compiled version log: [compiled-ollama-log.txt](https://github.com/ollama/ollama/files/14056988/compiled-ollama-log.txt)
let me know if you want other info

@gentooboontoo commented on GitHub (Jan 25, 2024):

I have the same problem when using the pre-release version of 0.1.21:

λ ./ollama -v
ollama version is 0.1.21

λ OLLAMA_DEBUG=1 LD_LIBRARY_PATH=/usr/lib64 ./ollama serve 2>&1 | tee ollama-0.1.21.txt
…
time=2024-01-25T21:24:58.806+01:00 level=INFO source=/go/src/github.com/jmorganca/ollama/llm/dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama3786850131/rocm_v5/libext_server.so"
time=2024-01-25T21:24:58.806+01:00 level=INFO source=/go/src/github.com/jmorganca/ollama/llm/dyn_ext_server.go:139 msg="Initializing llama server"
free(): invalid pointer

[ollama-0.1.21.txt](https://github.com/ollama/ollama/files/14057538/ollama-0.1.21.txt)

But there's **no crash** when building and using the `main` branch (commit `e64b5b0`):

λ OLLAMA_DEBUG=1 LD_LIBRARY_PATH=/usr/lib64 ./ollama serve 2>&1 | tee ../logs/ollama-main-e64b5b0.txt
…
[1706214435] print_timings: prompt eval time =     850.00 ms /    15 tokens (   56.67 ms per token,    17.65 tokens per second)
[1706214435] print_timings:        eval time =   13463.62 ms /    98 runs   (  137.38 ms per token,     7.28 tokens per second)
[1706214435] print_timings:       total time =   14313.62 ms
[1706214435] slot 0 released (113 tokens in cache)
[GIN] 2024/01/25 - 21:27:15 | 200 | 14.314246177s |       127.0.0.1 | POST     "/api/chat"

[ollama-main-e64b5b0.txt](https://github.com/ollama/ollama/files/14057557/ollama-main-e64b5b0.txt)

λ rocm-smi --showdriverversion --showproductname --showhw


======================= ROCm System Management Interface =======================
============================ Concise Hardware Info =============================
GPU  DID   GFX RAS  SDMA RAS  UMC RAS  VBIOS        BUS
0    743f  N/A      N/A       N/A      113-001-XT7  0000:08:00.0
================================================================================
========================= Version of System Component ==========================
Driver version: 5.16.12-gentoo
================================================================================
================================= Product Info =================================
GPU[0]          : Card series:          Navi 24 [Radeon RX 6400/6500 XT/6500M]
GPU[0]          : Card model:           0x2415
GPU[0]          : Card vendor:          Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]          : Card SKU:             001
================================================================================
============================= End of ROCm SMI Log ==============================

λ rocm-smi -v

======================= ROCm System Management Interface =======================
==================================== VBIOS =====================================
GPU[0]          : VBIOS version: 113-001-XT7
================================================================================
============================= End of ROCm SMI Log ==============================

@dhiltgen commented on GitHub (Jan 25, 2024):

@kylianpl that's great to hear it works when you build from source! It sounds like the pre-built v5-linked version we create is somehow incompatible with the libraries on your system. We're building with an official Docker Hub image from AMD/ROCm (https://hub.docker.com/r/rocm/dev-centos-7/tags, tag `5.7.1-complete`). Hopefully once the 6.0 libraries are available, that pre-built binary will start working for you.

@gentooboontoo it looks like your driver and user-space rocm libs are all v5, but our pre-built binary doesn't work. Also good to hear you're able to build from source and get it working.

We'll keep looking into it to see if we can find a way to produce v5 based binaries that work on these systems.

Could you both share your OS/version and rocm version information in case that helps narrow things down?

@dhiltgen commented on GitHub (Jan 27, 2024):

We've just pushed an updated release [v0.1.22](https://github.com/ollama/ollama/releases/tag/v0.1.22) which has some misc ROCm fixes, including the iGPU fix. There's also a container image now specifically for ROCm support, based on v5: `ollama/ollama:0.1.22-rocm`

@matteopt commented on GitHub (Jan 27, 2024):

Hi, having the same issue with the `v0.1.22` binary provided on the "Releases" page, `ollama-linux-amd64`

> uname -a
Linux arch 6.7.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 21 Jan 2024 22:14:10 +0000 x86_64 GNU/Linux
> rocm-smi --showdriverversion --showproductname --showhw


========================= ROCm System Management Interface =========================
============================== Concise Hardware Info ===============================
GPU  DID   GFX RAS  SDMA RAS  UMC RAS  VBIOS          BUS
0    744c  N/A      N/A       N/A      113-D70401-00  0000:03:00.0
====================================================================================
=========================== Version of System Component ============================
Driver version: 6.7.1-arch1-1
====================================================================================
=================================== Product Info ===================================
GPU[0]          : Card series:          Navi 31 [Radeon RX 7900 XT/7900 XTX]
GPU[0]          : Card model:           0x1002
GPU[0]          : Card vendor:          Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]          : Card SKU:             D70401
====================================================================================
=============================== End of ROCm SMI Log ================================
> rocm-smi -v


========================= ROCm System Management Interface =========================
====================================== VBIOS =======================================
GPU[0]          : VBIOS version: 113-D70401-00
====================================================================================
=============================== End of ROCm SMI Log ================================
> pacman -Q | grep -i -e hip -e rocm -e '^roc'
hip-runtime-amd 5.7.1-1
hipblas 5.7.1-1
rocblas 5.7.1-1
rocm-cmake 5.7.1-1
rocm-core 5.7.1-1
rocm-device-libs 5.7.1-1
rocm-hip-runtime 5.7.1-2
rocm-language-runtime 5.7.1-2
rocm-llvm 5.7.1-1
rocm-smi-lib 5.7.1-1
rocminfo 5.7.1-1
rocprim 5.7.1-1
rocsolver 5.7.1-1
rocsparse 5.7.1-1

I did not have `hipblas` installed before, and was also seeing warning logs about missing libs `libhipblas.so.1` and `libhipblas.so.2`. It was working fine but seemed quite slow, although I'm not that knowledgeable on expected performance. After installing `hipblas`, I am seeing the `free(): invalid pointer` error, which results in `SIGABRT`.

Regarding the mentioned "driver version", how do you deduce that from the `rocm-smi` output? I can only see the kernel version. I wonder if I have incompatible libraries on my system, but I've attached the versions of all ROCm/HIP libraries I could find above, and they all seem to match fairly well, being on 5.7.1.

With `hipblas` now installed, I also ran a manual build with `go generate ./...` and `go build .`.

After running `./ollama serve` and `./ollama run <model>` with the built binary, I am stuck at the spinning symbol, with no prompt available. Some of the logs from the server process:

loading library /tmp/ollama3508669940/rocm_v5/libext_server.so
2024/01/27 12:18:41 dyn_ext_server.go:90: INFO Loading Dynamic llm server: /tmp/ollama3508669940/rocm_v5/libext_server.so
2024/01/27 12:18:41 dyn_ext_server.go:145: INFO Initializing llama server
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 ROCm devices:
  Device 0: AMD Radeon RX 7900 XT, compute capability 11.0, VMM: no

The server process keeps printing dots after the log section. At this point, the VRAM is filled up and apparently under load:

> rocm-smi


========================= ROCm System Management Interface =========================
=================================== Concise Info ===================================
GPU  Temp (DieEdge)  AvgPwr  SCLK     MCLK   Fan     Perf  PwrCap  VRAM%  GPU%
0    51.0c           46.0W   2664Mhz  96Mhz  22.75%  auto  257.0W   92%   78%
====================================================================================
=============================== End of ROCm SMI Log ================================

After a while, the server logs this and crashes/aborts:

....................................................................................................
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: freq_base  = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:      ROCm0 KV buffer size =   184.00 MiB
WARNING: failed to allocate 72.00 MB of pinned memory: out of memory
llama_kv_cache_init:        CPU KV buffer size =    72.00 MiB
llama_new_context_with_model: KV self size  =  256.00 MiB, K (f16):  128.00 MiB, V (f16):  128.00 MiB
WARNING: failed to allocate 12.01 MB of pinned memory: out of memory
llama_new_context_with_model:        CPU input buffer size   =    12.01 MiB
WARNING: failed to allocate 180.03 MB of pinned memory: out of memory
llama_new_context_with_model:      ROCm0 compute buffer size =   192.00 MiB
llama_new_context_with_model:        CPU compute buffer size =   180.03 MiB
llama_new_context_with_model: graph splits (measure): 5
CUDA error: shared object initialization failed
  current device: 0, in function ggml_cuda_op_flatten at /home/matt/comp/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:8825
  hipGetLastError()
GGML_ASSERT: /home/matt/comp/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:237: !"CUDA error"
ptrace: Operation not permitted.
No stack.
The program is not being run.
SIGABRT: abort

I assume this is working as expected and the VRAM is simply not enough for this model. That said, I was able to submit prompts without `hipblas` installed, so I wonder if that was making it fall back to CPU. I will try a smaller model and also try the Docker image out of curiosity. Still, the main issue seems to be with the provided binary. Hope this is helpful.
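
As an aside, and purely an assumption on my part: the "failed to allocate ... pinned memory" warnings can also come from a low locked-memory rlimit rather than real VRAM pressure, which is cheap to rule out:

```
# Pinned (page-locked) host buffers count against the memlock rlimit;
# "unlimited" is the usual value on machines where pinning succeeds.
ulimit -l
```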

@matteopt commented on GitHub (Jan 27, 2024):

Sorry, quick update: I tried a much smaller model advertised in the README, `dolphin-phi`, and I was able to use the locally-built binary and achieve great performance. Apologies, but I can only describe it as "very fast text". I assume it is using ROCm correctly.

Using the provided `v0.1.22` binary with `hipblas` installed, I encounter the `free(): invalid pointer` error.

Uninstalling `hipblas` and using the provided `v0.1.22` binary works, but performance is degraded, and I can see that VRAM is not used at all and the GPU is not under load, so I assume it falls back to CPU + RAM. However, it seems the logs are misleading. See the full server logs:

2024/01/27 12:40:52 images.go:857: INFO total blobs: 10
2024/01/27 12:40:52 images.go:864: INFO total unused blobs removed: 0
2024/01/27 12:40:52 routes.go:950: INFO Listening on 127.0.0.1:11434 (version 0.1.22)
2024/01/27 12:40:52 payload_common.go:106: INFO Extracting dynamic libraries...
2024/01/27 12:40:54 payload_common.go:145: INFO Dynamic LLM libraries [cpu_avx rocm_v6 cpu cuda_v11 cpu_avx2 rocm_v5]
2024/01/27 12:40:54 gpu.go:94: INFO Detecting GPU type
2024/01/27 12:40:54 gpu.go:236: INFO Searching for GPU management library libnvidia-ml.so
2024/01/27 12:40:54 gpu.go:282: INFO Discovered GPU libraries: []
2024/01/27 12:40:54 gpu.go:236: INFO Searching for GPU management library librocm_smi64.so
2024/01/27 12:40:54 gpu.go:282: INFO Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.5.0]
2024/01/27 12:40:54 gpu.go:109: INFO Radeon GPU detected
[GIN] 2024/01/27 - 12:40:55 | 200 |      18.591µs |       127.0.0.1 | HEAD     "/"
[GIN] 2024/01/27 - 12:40:55 | 200 |     343.714µs |       127.0.0.1 | POST     "/api/show"
[GIN] 2024/01/27 - 12:40:55 | 200 |     181.422µs |       127.0.0.1 | POST     "/api/show"
2024/01/27 12:40:55 cpu_common.go:11: INFO CPU has AVX2
loading library /tmp/ollama1905349292/rocm_v5/libext_server.so
2024/01/27 12:40:55 llm.go:152: WARN Failed to load dynamic library /tmp/ollama1905349292/rocm_v5/libext_server.so  Unable to load dynamic libr
ary: Unable to load dynamic server library: libhipblas.so.1: cannot open shared object file: No such file or directory
loading library /tmp/ollama1905349292/rocm_v6/libext_server.so
2024/01/27 12:40:55 llm.go:152: WARN Failed to load dynamic library /tmp/ollama1905349292/rocm_v6/libext_server.so  Unable to load dynamic libr
ary: Unable to load dynamic server library: libhipblas.so.2: cannot open shared object file: No such file or directory
loading library /tmp/ollama1905349292/cpu_avx2/libext_server.so
2024/01/27 12:40:55 dyn_ext_server.go:90: INFO Loading Dynamic llm server: /tmp/ollama1905349292/cpu_avx2/libext_server.so
2024/01/27 12:40:55 dyn_ext_server.go:145: INFO Initializing llama server
llama_model_loader: loaded meta data with 22 key-value pairs and 325 tensors from /home/matt/.ollama/models/blobs/sha256:4eca7304a07a42c48887f1
59ef5ad82ed5a5bd30fe52db4aadae1dd938e26f70 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = phi2
llama_model_loader: - kv   1:                               general.name str              = Phi2
llama_model_loader: - kv   2:                        phi2.context_length u32              = 2048
llama_model_loader: - kv   3:                      phi2.embedding_length u32              = 2560
llama_model_loader: - kv   4:                   phi2.feed_forward_length u32              = 10240
llama_model_loader: - kv   5:                           phi2.block_count u32              = 32
llama_model_loader: - kv   6:                  phi2.attention.head_count u32              = 32
llama_model_loader: - kv   7:               phi2.attention.head_count_kv u32              = 32
llama_model_loader: - kv   8:          phi2.attention.layer_norm_epsilon f32              = 0.000010
llama_model_loader: - kv   9:                  phi2.rope.dimension_count u32              = 32
llama_model_loader: - kv  10:                          general.file_type u32              = 2
llama_model_loader: - kv  11:               tokenizer.ggml.add_bos_token bool             = false
llama_model_loader: - kv  12:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,51200]   = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  14:                  tokenizer.ggml.token_type arr[i32,51200]   = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  15:                      tokenizer.ggml.merges arr[str,50000]   = ["Ġ t", "Ġ a", "h e", "i n", "r e",...
llama_model_loader: - kv  16:                tokenizer.ggml.bos_token_id u32              = 50256
llama_model_loader: - kv  17:                tokenizer.ggml.eos_token_id u32              = 50295
llama_model_loader: - kv  18:            tokenizer.ggml.unknown_token_id u32              = 50256
llama_model_loader: - kv  19:            tokenizer.ggml.padding_token_id u32              = 50256
llama_model_loader: - kv  20:                    tokenizer.chat_template str              = {{ bos_token }}{%- set ns = namespace...
llama_model_loader: - kv  21:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  195 tensors
llama_model_loader: - type q4_0:  129 tensors
llama_model_loader: - type q6_K:    1 tensors
llm_load_vocab: mismatch in special tokens definition ( 910/51200 vs 944/51200 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = phi2
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 51200
llm_load_print_meta: n_merges         = 50000
llm_load_print_meta: n_ctx_train      = 2048
llm_load_print_meta: n_embd           = 2560
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 32
llm_load_print_meta: n_layer          = 32
llm_load_print_meta: n_rot            = 32
llm_load_print_meta: n_embd_head_k    = 80
llm_load_print_meta: n_embd_head_v    = 80
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: n_embd_k_gqa     = 2560
llm_load_print_meta: n_embd_v_gqa     = 2560
llm_load_print_meta: f_norm_eps       = 1.0e-05
llm_load_print_meta: f_norm_rms_eps   = 0.0e+00
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff             = 10240
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 2048
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: model type       = 3B
llm_load_print_meta: model ftype      = Q4_0
llm_load_print_meta: model params     = 2.78 B
llm_load_print_meta: model size       = 1.49 GiB (4.61 BPW)
llm_load_print_meta: general.name     = Phi2
llm_load_print_meta: BOS token        = 50256 '<|endoftext|>'
llm_load_print_meta: EOS token        = 50295 '<|im_end|>'
llm_load_print_meta: UNK token        = 50256 '<|endoftext|>'
llm_load_print_meta: PAD token        = 50256 '<|endoftext|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_tensors: ggml ctx size =    0.12 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors:        CPU buffer size =  1526.50 MiB
...........................................................................................
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:        CPU KV buffer size =   640.00 MiB
llama_new_context_with_model: KV self size  =  640.00 MiB, K (f16):  320.00 MiB, V (f16):  320.00 MiB
llama_new_context_with_model:        CPU input buffer size   =     9.01 MiB
llama_new_context_with_model:        CPU compute buffer size =   158.00 MiB
llama_new_context_with_model: graph splits (measure): 1
2024/01/27 12:40:56 dyn_ext_server.go:156: INFO Starting llama main loop

The logs state `offloaded 33/33 layers to GPU` and there is no indication of a CPU fallback, at least from what I can see and understand.

To explain my reasoning: I tried a prompt such as "tell me a long story" and monitored VRAM usage and GPU load with `rocm-smi`, and both stay at 1%. Looking at `top`, however, ollama is using 400% CPU, so I assume 4 cores. Between running and terminating the server process, I can see 1.4 GB of RAM being released.

Building the binary locally works without issues, but it may be worthwhile to document which libraries are required to use ollama with ROCm.
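
A quick way to double-check where inference actually runs, whatever the load logs claim (a sketch, assuming rocm-smi is on PATH):

```
# Refresh GPU utilization and VRAM once a second while a prompt runs;
# flat ~1% readings alongside high CPU usage in top indicate a CPU fallback.
watch -n1 rocm-smi
```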

@chaource commented on GitHub (Jan 27, 2024):

Chiming in to say that I managed to pass my 7900 XTX to the `ollama/ollama:0.1.22-rocm` docker image. However, I had to explicitly pass the device corresponding to my graphics card:

docker run -d --device /dev/kfd --device /dev/dri/renderD128 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:0.1.22-rocm
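
In case it helps others pick the right render node (a sketch; the renderD number varies per machine), the by-path symlinks map PCI addresses to devices:

```
# Match the GPU's PCI bus address from rocm-smi (e.g. 0000:03:00.0)
# to its /dev/dri/renderD* node before passing it with --device.
ls -l /dev/dri/by-path/
```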

@kylianpl commented on GitHub (Jan 27, 2024):

v0.1.22 still doesn't work on "stable" Arch Linux ([ollama-log-0.1.22.txt](https://github.com/ollama/ollama/files/14072560/ollama-log-0.1.22.txt), basically the same error).
After installing a fresh Arch and adding the `extra-testing` repo, which contains the 6.0.0 version of hipblas (as well as the deps...), I can confirm it works on the v0.1.21 pre-release and v0.1.22.
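
For anyone wanting to try the same, a sketch of the standard Arch testing-repo setup (with the usual stability caveats):

```
# In /etc/pacman.conf, uncomment the testing repo ABOVE [extra]:
#   [extra-testing]
#   Include = /etc/pacman.d/mirrorlist
# then pull in the ROCm 6.0.0 stack:
sudo pacman -Syu hipblas
```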

@dhiltgen commented on GitHub (Jan 27, 2024):

@mmmpx from your output, it looks like you have a v6 driver with v5 libraries on Arch Linux. Building from source works, but you're unable to get our pre-built binaries to work. (Correct me if I got any of that wrong.) I'm curious whether you're able to test our container image, and if that works on your v6 driver?

@mlvl42 just to confirm: you're seeing it load on your GPU, no crashes, and everything is stable. Can you share what driver version and OS you're running?

@kylianpl that's great to hear! So Arch Linux with the full v6 stack (driver and libraries) is working for you with our pre-built binaries, correct? You see it load on the GPU and no crashes, with the `rocm_v6` llm library.

@chaource commented on GitHub (Jan 27, 2024):

> @mlvl42 just to confirm, you're seeing it load on your GPU, no crashes, and everything is stable. Can you share what driver version and OS you're running?

Correct, no crashes, and so far everything looks stable using the docker image you mentioned. I am running Arch Linux and my driver version is `6.6.9-arch1-1`:

$ rocm-smi --showdriverversion --showproductname --showhw


========================= ROCm System Management Interface =========================
============================== Concise Hardware Info ===============================
GPU  DID   GFX RAS  SDMA RAS  UMC RAS  VBIOS             BUS
0    164e  N/A      N/A       N/A      102-RAPHAEL-008   0000
1    744c  N/A      N/A       N/A      113-EXT78395-001  0000
====================================================================================
=========================== Version of System Component ============================
Driver version: 6.6.9-arch1-1
====================================================================================
=================================== Product Info ===================================
GPU[0]		: Card series: 		Raphael
GPU[0]		: Card model: 		GA-MA78GM-S2H Motherboard
GPU[0]		: Card vendor: 		Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]		: Card SKU: 		RAPHAEL
GPU[1]		: Card series: 		Navi 31 [Radeon RX 7900 XT/7900 XTX]
GPU[1]		: Card model: 		0x240e
GPU[1]		: Card vendor: 		Advanced Micro Devices, Inc. [AMD/ATI]
GPU[1]		: Card SKU: 		EXT78395
====================================================================================
=============================== End of ROCm SMI Log ================================

@kylianpl commented on GitHub (Jan 27, 2024):

I just tried the docker command above; it worked perfectly with GPU acceleration on Arch Linux without the testing repo (i.e. Arch with the old v5 ROCm).
I also confirm that the full v6 stack in the testing repo of Arch is working for the pre-built binary with GPU acceleration.

To sum up, the 3 current ways to run ollama with GPU acceleration on Arch seem to be:

- running the docker image with the command [above](https://github.com/ollama/ollama/issues/2165#issuecomment-1913148961)
- [compiling ollama](https://github.com/ollama/ollama/blob/main/docs/development.md) on the same machine it will be running on
- [using the extra-testing repo](https://wiki.archlinux.org/title/Official_repositories#Testing_repositories), which could make your system unstable

@dhiltgen commented on GitHub (Jan 27, 2024):

My current theory is that there's some forwards-incompatible variation sneaking in somewhere in the ROCm v5 libraries, and we're building with version(s) that are somewhat newer than what's in the Arch Linux repo(s).

To test that theory, would it be possible for someone who's hitting the crash on Arch Linux and is building from source to try building using our container? First build using `BUILD_ARCH=amd64 ./scripts/build_linux.sh`, which should produce a `./dist/ollama-linux-amd64` binary that will crash on your system; confirm that first. Then modify the Dockerfile around [here](https://github.com/ollama/ollama/blob/main/Dockerfile#L31) so that we're using an older tag for the v5 ROCm library. Looking at Docker Hub (https://hub.docker.com/r/rocm/dev-centos-7/tags), plausible tags to try might be `5.6.1-complete` or maybe `5.5-complete`. With any luck, building with an older base image might just do the trick.
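
A condensed version of that experiment (a sketch; the exact Dockerfile stage and tag to edit are the parts to adapt):

```
# 1) Reproduce with the stock base image:
BUILD_ARCH=amd64 ./scripts/build_linux.sh
./dist/ollama-linux-amd64 serve        # expect: free(): invalid pointer
# 2) Downgrade the ROCm v5 base tag in the Dockerfile, e.g.
#    rocm/dev-centos-7:5.7.1-complete -> rocm/dev-centos-7:5.6.1-complete,
#    then rebuild with the same script and retest.
```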

@matteopt commented on GitHub (Jan 28, 2024):

Hi @dhiltgen, I have tried the Docker image you provide, `ollama/ollama:0.1.22-rocm`, and it works fine. I ran these inside the container too:

```
> rocm-smi --showdriverversion --showproductname --showhw

========================= ROCm System Management Interface =========================
============================== Concise Hardware Info ===============================
GPU  DID   GFX RAS  SDMA RAS  UMC RAS  VBIOS          BUS
0    744c  N/A      N/A       N/A      113-D70401-00  0000:03:00.0
====================================================================================
=========================== Version of System Component ============================
Driver version: 6.7.1-arch1-1
====================================================================================
=================================== Product Info ===================================
GPU[0]          : Card series:          Navi 31 [Radeon RX 7900 XT/7900 XTX/7900M]
GPU[0]          : Card model:           0x1002
GPU[0]          : Card vendor:          Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]          : Card SKU:             D70401
====================================================================================
=============================== End of ROCm SMI Log ================================
```

```
> yum list installed | grep -e hip -e rocm -e '^roc'
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
hip-devel.x86_64                    5.7.31921.50701-98.el7       @ROCm
hip-doc.x86_64                      5.7.31921.50701-98.el7       @ROCm
hip-runtime-amd.x86_64              5.7.31921.50701-98.el7       @ROCm
hip-samples.x86_64                  5.7.31921.50701-98.el7       @ROCm
hipblas.x86_64                      1.1.0.50701-98.el7           @ROCm
hipblas-devel.x86_64                1.1.0.50701-98.el7           @ROCm
hipblaslt.x86_64                    0.3.0.50701-98.el7           @ROCm
hipblaslt-devel.x86_64              0.3.0.50701-98.el7           @ROCm
hipcc.x86_64                        1.0.0.50701-98.el7           @ROCm
hipcub-devel.x86_64                 2.13.1.50701-98.el7          @ROCm
hipfft.x86_64                       1.0.12.50701-98.el7          @ROCm
hipfft-devel.x86_64                 1.0.12.50701-98.el7          @ROCm
hipify-clang.x86_64                 17.0.0.50701-98.el7          @ROCm
hipsolver.x86_64                    1.8.2.50701-98.el7           @ROCm
hipsolver-devel.x86_64              1.8.2.50701-98.el7           @ROCm
hipsparse.x86_64                    2.3.8.50701-98.el7           @ROCm
hipsparse-devel.x86_64              2.3.8.50701-98.el7           @ROCm
miopen-hip.x86_64                   2.20.0.50701-98.el7          @ROCm
miopen-hip-devel.x86_64             2.20.0.50701-98.el7          @ROCm
rocalution.x86_64                   2.1.11.50701-98.el7          @ROCm
rocalution-devel.x86_64             2.1.11.50701-98.el7          @ROCm
rocblas.x86_64                      3.1.0.50701-98.el7           @ROCm
rocblas-devel.x86_64                3.1.0.50701-98.el7           @ROCm
rocfft.x86_64                       1.0.23.50701-98.el7          @ROCm
rocfft-devel.x86_64                 1.0.23.50701-98.el7          @ROCm
rocm-clang-ocl.x86_64               0.5.0.50701-98.el7           @ROCm
rocm-cmake.x86_64                   0.10.0.50701-98.el7          @ROCm
rocm-core.x86_64                    5.7.1.50701-98.el7           @ROCm
rocm-dbgapi.x86_64                  0.70.1.50701-98.el7          @ROCm
rocm-debug-agent.x86_64             2.0.3.50701-98.el7           @ROCm
rocm-dev.x86_64                     5.7.1.50701-98.el7           @ROCm
rocm-device-libs.x86_64             1.0.0.50701-98.el7           @ROCm
rocm-gdb.x86_64                     13.2.50701-98.el7            @ROCm
rocm-libs.x86_64                    5.7.1.50701-98.el7           @ROCm
rocm-llvm.x86_64                    17.0.0.23382.50701-98.el7    @ROCm
rocm-ocl-icd.x86_64                 2.0.0.50701-98.el7           @ROCm
rocm-opencl.x86_64                  2.0.0.50701-98.el7           @ROCm
rocm-opencl-devel.x86_64            2.0.0.50701-98.el7           @ROCm
rocm-smi-lib.x86_64                 5.0.0.50701-98.el7           @ROCm
rocm-utils.x86_64                   5.7.1.50701-98.el7           @ROCm
rocminfo.x86_64                     1.0.0.50701-98.el7           @ROCm
rocprim-devel.x86_64                2.13.1.50701-98.el7          @ROCm
rocprofiler.x86_64                  2.0.0.50701-98.el7           @ROCm
rocprofiler-devel.x86_64            2.0.0.50701-98.el7           @ROCm
rocprofiler-plugins.x86_64          2.0.0.50701-98.el7           @ROCm
rocrand.x86_64                      2.10.17.50701-98.el7         @ROCm
rocrand-devel.x86_64                2.10.17.50701-98.el7         @ROCm
rocsolver.x86_64                    3.23.0.50701-98.el7          @ROCm
rocsolver-devel.x86_64              3.23.0.50701-98.el7          @ROCm
rocsparse.x86_64                    2.5.4.50701-98.el7           @ROCm
rocsparse-devel.x86_64              2.5.4.50701-98.el7           @ROCm
rocthrust-devel.x86_64              2.18.0.50701-98.el7          @ROCm
roctracer.x86_64                    4.1.0.50701-98.el7           @ROCm
roctracer-devel.x86_64              4.1.0.50701-98.el7           @ROCm
rocwmma-devel.x86_64                1.2.0.50701-98.el7           @ROCm
```

I can see that the libraries inside the container have quite inconsistent versions, but it still works. As for the "driver version" shown by `rocm-smi`, I mentioned that I think it is simply the kernel version and doesn't have much to do with ROCm. That is my understanding, but please let me know otherwise. It still shows up as the Arch kernel because the GPU is passed through, I imagine.
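
A quick way to sanity-check that reading (a sketch; if the two values match, the "driver version" field is just echoing the host kernel release into the container):

```
# Compare the host kernel release with what rocm-smi reports as "Driver version".
uname -r                         # e.g. 6.7.1-arch1-1
rocm-smi --showdriverversion
```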

I tried building with the Dockerfile using the command you provided, but I'm running into errors:

```
Step 19/86 : COPY ./scripts/rh_linux_deps.sh /
failed to get destination image "sha256:f8b7564b3710c06848eaff0b68bd3a2ceb75285a598abffa9d775148784dc31a": image with reference sha256:f8b7564b3710c06848eaff0b68bd3a2ceb75285a598abffa9d775148784dc31a was found but does not match the specified platform: wanted linux/amd64, actual: linux/arm64
```

@dhiltgen commented on GitHub (Jan 28, 2024):

> I tried building with the Dockerfile using the command you provided, but I'm running into errors.

The script is primarily intended for Arm Macs, which can emulate x86 via Rosetta, allowing us to build both arm and x86 Linux binaries. The error you got seems to imply you may have omitted `BUILD_ARCH=amd64` to build only x86. Without that variable set, the script will try to compile arm too, and I'm pretty sure that won't work on a standard Docker setup on Linux x86. That said, the script does x86 first, so it may have produced a binary in `./dist/` before it failed to build arm.


@musiaht commented on GitHub (Jan 30, 2024):

I don't know if this helps, but I had the same issue when running the version from the main install script; I was able to get it working by compiling from source. I had to explicitly set `AMDGPU_TARGETS` to the names of the agents in the output of `rocminfo` (as described at https://github.com/ollama/ollama/blob/main/docs/development.md#linux-rocm-amd). I had to make sure the variable identified the GPU and not the integrated graphics.

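A sketch of that workflow, assuming the build scripts pick up `AMDGPU_TARGETS` from the environment as the development docs describe (the `gfx1100`/`gfx1036` values are illustrative; use whatever `rocminfo` reports for your discrete GPU rather than the integrated one):

```
# List the gfx targets of all agents, then rebuild with the discrete GPU's target pinned.
rocminfo | grep -i gfx                    # e.g. gfx1100 (dGPU) and gfx1036 (iGPU)
AMDGPU_TARGETS=gfx1100 go generate ./...
go build .
```
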
So when I encounter this error, I see the following in `journalctl -u ollama`:

```
Jan 29 23:45:00 somehostname systemd[1]: Started Ollama Service.
Jan 29 23:45:00 somehostname ollama[4824]: 2024/01/29 23:45:00 images.go:857: INFO total blobs: 5
Jan 29 23:45:00 somehostname ollama[4824]: 2024/01/29 23:45:00 images.go:864: INFO total unused blobs removed: 0
Jan 29 23:45:00 somehostname ollama[4824]: 2024/01/29 23:45:00 routes.go:950: INFO Listening on 127.0.0.1:11434 (version 0.1.22)
Jan 29 23:45:00 somehostname ollama[4824]: 2024/01/29 23:45:00 payload_common.go:106: INFO Extracting dynamic libraries...
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 payload_common.go:145: INFO Dynamic LLM libraries [cuda_v11 rocm_v6 cpu cpu_avx rocm_v5 cpu_avx2]
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:94: INFO Detecting GPU type
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:236: INFO Searching for GPU management library libnvidia-ml.so
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:282: INFO Discovered GPU libraries: []
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:236: INFO Searching for GPU management library librocm_smi64.so
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:282: INFO Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.5.0.50702 /opt/rocm-5.7.2/lib/librocm_smi64.so.5.0.50702]
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:109: INFO Radeon GPU detected
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:171: INFO ROCm integrated GPU detected - ROCR_VISIBLE_DEVICES=0
Jan 29 23:45:02 somehostname ollama[4824]: [GIN] 2024/01/29 - 23:45:02 | 200 |      29.059µs |       127.0.0.1 | HEAD     "/"
Jan 29 23:45:02 somehostname ollama[4824]: [GIN] 2024/01/29 - 23:45:02 | 200 |     707.873µs |       127.0.0.1 | POST     "/api/show"
Jan 29 23:45:02 somehostname ollama[4824]: [GIN] 2024/01/29 - 23:45:02 | 200 |     181.471µs |       127.0.0.1 | POST     "/api/show"
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:171: INFO ROCm integrated GPU detected - ROCR_VISIBLE_DEVICES=0
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 gpu.go:171: INFO ROCm integrated GPU detected - ROCR_VISIBLE_DEVICES=0
Jan 29 23:45:02 somehostname ollama[4824]: 2024/01/29 23:45:02 cpu_common.go:11: INFO CPU has AVX2
Jan 29 23:45:03 somehostname ollama[4824]: 2024/01/29 23:45:03 dyn_ext_server.go:90: INFO Loading Dynamic llm server: /tmp/ollama2832838508/rocm_v5/libext_server.so
Jan 29 23:45:03 somehostname ollama[4824]: 2024/01/29 23:45:03 dyn_ext_server.go:145: INFO Initializing llama server
Jan 29 23:45:03 somehostname ollama[4824]: free(): invalid pointer
Jan 29 23:45:03 somehostname systemd[1]: ollama.service: Main process exited, code=dumped, status=6/ABRT
Jan 29 23:45:03 somehostname systemd[1]: ollama.service: Failed with result 'core-dump'.
Jan 29 23:45:03 somehostname systemd[1]: ollama.service: Consumed 3.177s CPU time.
Jan 29 23:45:06 somehostname systemd[1]: ollama.service: Scheduled restart job, restart counter is at 1.
Jan 29 23:45:06 somehostname systemd[1]: Stopped Ollama Service.
Jan 29 23:45:06 somehostname systemd[1]: ollama.service: Consumed 3.177s CPU time.
Jan 29 23:45:06 somehostname systemd[1]: Started Ollama Service.
Jan 29 23:45:06 somehostname ollama[4862]: 2024/01/29 23:45:06 images.go:857: INFO total blobs: 5
Jan 29 23:45:06 somehostname ollama[4862]: 2024/01/29 23:45:06 images.go:864: INFO total unused blobs removed: 0
Jan 29 23:45:06 somehostname ollama[4862]: 2024/01/29 23:45:06 routes.go:950: INFO Listening on 127.0.0.1:11434 (version 0.1.22)
Jan 29 23:45:06 somehostname ollama[4862]: 2024/01/29 23:45:06 payload_common.go:106: INFO Extracting dynamic libraries...
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 payload_common.go:145: INFO Dynamic LLM libraries [rocm_v6 cuda_v11 cpu cpu_avx cpu_avx2]
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:94: INFO Detecting GPU type
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:236: INFO Searching for GPU management library libnvidia-ml.so
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:282: INFO Discovered GPU libraries: []
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:236: INFO Searching for GPU management library librocm_smi64.so
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:282: INFO Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.5.0.50702 /opt/rocm-5.7.2/lib/librocm_smi64.so.5.0.50702]
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:109: INFO Radeon GPU detected
Jan 29 23:45:08 somehostname ollama[4862]: 2024/01/29 23:45:08 gpu.go:171: INFO ROCm integrated GPU detected - ROCR_VISIBLE_DEVICES=0
```

I guess by default, it uses the integrated graphics from my CPU and runs out of memory.

This is on Ubuntu 22.04.


@dhiltgen commented on GitHub (Jan 30, 2024):

> I guess by default, it uses the integrated graphics from my CPU and runs out of memory.

I don't think that's what's going wrong. We detected the integrated GPU, and since we didn't detect `ROCR_VISIBLE_DEVICES` set in the environment, we went ahead and set it to force ROCm to use just the discrete GPU. This started to work, but then we crashed with the `free(): invalid pointer`. My current theory is that this is due to mismatched libraries between the build container image we use for the official builds and what is installed on your system. That may explain why building from source works, since the binary is then linked against the correct version(s) of the various ROCm libraries.

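One way to check for such a mismatch (a sketch; the `/tmp` directory name is randomized per run, so substitute the path printed in your server log):

```
# Inspect which ROCm/HIP libraries the extracted payload actually resolves to,
# and compare against the versions installed on the host.
ldd /tmp/ollama*/rocm_v5/libext_server.so | grep -i -e hip -e rocm
```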

@meminens commented on GitHub (Jan 31, 2024):

I am getting the same invalid pointer error using version 0.1.22. Posted some details here:

https://github.com/ollama/ollama/issues/2285


@meminens commented on GitHub (Jan 31, 2024):

> I don't know if this helps, but I had the same issue when running the version from the main install script; I was able to get it working by compiling from source. I had to explicitly set `AMDGPU_TARGETS` to the names of the agents in the output of `rocminfo` […]

Can you show me how to compile it from source on Arch Linux? I have the same problem. Specifically, how do I set the `AMDGPU_TARGETS` parameter? I found the build instructions here: https://github.com/ollama/ollama/blob/main/docs/development.md, but they don't explain how to set it.


@meminens commented on GitHub (Jan 31, 2024):

Scratch that! I was able to figure it out. Thanks for leaving breadcrumbs so I could sort this out. Unfortunately none of the packages worked, but compiling from source with the `AMDGPU_TARGETS` parameter finally worked. The GPU is fully utilized now!


@dhiltgen commented on GitHub (Feb 11, 2024):

I have a repro scenario, but it's based on an older card (`gfx803`) that looks officially unsupported by ROCm these days, although getting it supported might be possible with workarounds. I'm going to split support for older cards out into a new ticket, #2453, and focus on getting this `free(): invalid pointer` crash resolved for newer GPUs. Until we can add support for older cards, we'll make sure we fall back to CPU if we detect one so it doesn't crash.


@dhiltgen commented on GitHub (Mar 12, 2024):

With the latest release, [0.1.29](https://github.com/ollama/ollama/releases/tag/v0.1.29), we've switched to ROCm v6 on Linux, so I believe this issue is now moot, since it appeared to be a v5-specific defect.

If anyone is still seeing this after upgrading to 0.1.29, please let me know and I'll re-open this and continue the investigation.

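A quick way to confirm the upgrade took effect (a sketch; the exact log wording may vary by version):

```
ollama -v                                   # expect 0.1.29 or later
OLLAMA_DEBUG=1 ./ollama-linux-amd64 serve   # startup log should list rocm_v6 among the dynamic LLM libraries
```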