[GH-ISSUE #6591] Ollama failing with CUDA error: PTX JIT compiler library not found #66186

Closed
opened 2026-05-04 00:30:38 -05:00 by GiteaMirror · 9 comments
Owner

Originally created by @leobenkel on GitHub (Sep 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6591

What is the issue?

The GPU seems to be detected:

```
CUDA driver version: 11.4
time=2024-08-30T16:57:23.032Z level=DEBUG source=gpu.go:123 msg="detected GPUs" count=1 library=/usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1
[GPU-d90c0d9d-5e59-56b1-b519-6439b1d74328] CUDA totalMem 30990 mb
[GPU-d90c0d9d-5e59-56b1-b519-6439b1d74328] CUDA freeMem 23431 mb
[GPU-d90c0d9d-5e59-56b1-b519-6439b1d74328] Compute Capability 7.2
...
time=2024-08-30T16:57:23.201Z level=INFO source=types.go:105 msg="inference compute" id=GPU-d90c0d9d-5e59-56b1-b519-6439b1d74328 library=cuda compute=7.2 driver=11.4 name=Xavier total="30.3 GiB" available="22.9 GiB"
...
time=2024-08-30T16:58:09.212Z level=DEBUG source=gpu.go:410 msg="updating cuda memory data" gpu=GPU-d90c0d9d-5e59-56b1-b519-6439b1d74328 name=Xavier overhead="0 B" before.total="30.3 GiB" before.free="22.9 GiB" now.total="30.3 GiB" now.free="23.0 GiB" now.used="7.3 GiB"
```

But then I get this error:

```
CUDA error: PTX JIT compiler library not found
  current device: 0, in function ggml_cuda_compute_forward at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda.cu:2313
/go/src/github.com/ollama/ollama/llm/llama.cpp/ggml/src/ggml-cuda.cu:101: CUDA error
```

Here are the installed versions:

```
ii  nvidia-cuda                                5.1.3-b29                             arm64        NVIDIA CUDA Meta Package
ii  nvidia-cuda-dev                            5.1.3-b29                             arm64        NVIDIA CUDA dev Meta Package
ii  nvidia-jetpack                             5.1.3-b29                             arm64        NVIDIA Jetpack Meta Package
ii  nvidia-jetpack-dev                         5.1.3-b29                             arm64        NVIDIA Jetpack dev Meta Package
ii  nvidia-jetpack-runtime                     5.1.3-b29                             arm64        NVIDIA Jetpack runtime Meta Package
ii  nvidia-l4t-cuda                            35.5.0-20240219203809                 arm64        NVIDIA CUDA Package
ii  cuda                                       11.4.19-1                             arm64        CUDA meta-package
```

Ollama is run from the Docker container, inside a Kubernetes cluster.
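As a first diagnostic (my suggestion, not something stated in the thread), one can check whether the library the error message names is actually visible inside the container; on Jetson/L4T images the NVIDIA libraries typically live under `/usr/lib/aarch64-linux-gnu/tegra/`:

```shell
# Hypothetical check, run inside the Ollama container: is the PTX JIT
# compiler library (libnvidia-ptxjitcompiler) visible to the dynamic linker?
if ldconfig -p | grep -i ptxjitcompiler; then
    echo "PTX JIT compiler library is registered with the linker"
else
    echo "PTX JIT compiler library not found; check the container mounts"
fi
```

If the library exists on the host but not in the container, mounting the host's `tegra` library directory into the container, or running with NVIDIA's container runtime, is one possible fix on Jetson.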

OS

Docker

GPU

Nvidia

CPU

ARM

Ollama version

0.3.8

GiteaMirror added the bug label 2026-05-04 00:30:39 -05:00

@dhiltgen commented on GitHub (Sep 3, 2024):

The Jetson ARM systems won't work properly until we merge #6400.

Support for JetPack 5 is tracked via #4693.


@leobenkel commented on GitHub (Sep 4, 2024):

Thank you for your response. Is there a workaround in the meantime?


@dhiltgen commented on GitHub (Sep 4, 2024):

Building from source locally with the cuda toolkit installed should work.
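The build-from-source route can be sketched as below. The steps follow the Linux section of the Ollama development guide as it stood for the 0.3.x series; a Go toolchain, cmake/gcc, and the CUDA toolkit are assumed to be installed already, and the guide should be checked in case the steps have changed:

```shell
# Sketch of a local source build (commands per the Ollama Linux
# development guide for the 0.3.x era; adjust if the guide has changed).
git clone https://github.com/ollama/ollama.git
cd ollama
go generate ./...   # compiles the bundled llama.cpp runners, including CUDA
go build .          # produces the ./ollama binary
./ollama serve      # run the locally built server
```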


@leobenkel commented on GitHub (Sep 4, 2024):

OK, I am going to try this then. Thank you.


@leobenkel commented on GitHub (Sep 7, 2024):

@dhiltgen, just a question: should I pull and build on the physical node of the cluster, or should I pull and build from inside the Docker image?


@dhiltgen commented on GitHub (Sep 25, 2024):

@leobenkel It doesn't really matter; however, make sure you have the right JetPack CUDA version to match your target system.
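Before building, it may help to confirm what the target actually has installed. A minimal check, assuming a Debian-based Jetson/L4T system like the one in the dpkg listing above:

```shell
# Hypothetical version check on the Jetson target (Debian/L4T assumed;
# package names taken from the dpkg listing in the issue body).
dpkg -l 2>/dev/null | grep -E 'nvidia-jetpack|nvidia-l4t-cuda' \
    || echo "no JetPack packages found"
nvcc --version 2>/dev/null \
    || echo "nvcc not on PATH; the CUDA toolkit may not be installed"
```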


@leobenkel commented on GitHub (Sep 26, 2024):

> @leobenkel it doesn't really matter, however make sure you have the right jetpack cuda version to match your target system.

Thank you for your response.

Right now the system I am working with has JetPack 5; it should work, right?
Which script should I run to build? I found several ways in the documentation. Is it `./build_linux.sh`?


@dhiltgen commented on GitHub (Sep 26, 2024):

Linux dev guide: https://github.com/ollama/ollama/blob/main/docs/development.md#linux


@leobenkel commented on GitHub (Sep 26, 2024):

> Linux dev guide: https://github.com/ollama/ollama/blob/main/docs/development.md#linux

OK, that's what I followed, but without success.

Reference: github-starred/ollama#66186