[GH-ISSUE #4140] Add binary support for Nvidia Jetson Nano- JetPack 4 #28333

Closed
opened 2026-04-22 06:26:15 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @dtischler on GitHub (May 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4140

Originally assigned to: @dhiltgen on GitHub.

Hi folks, I have been experimenting with getting GPU acceleration working on the older (but still nice!) Jetson Nano Developer Kit hardware with Ollama. I have both the 2GB and 4GB RAM versions. I've been unable to get it working, due to this situation:

Nvidia only provides JetPack 4.6 for these devices, which is based on Ubuntu 18.04 and ships with CUDA 10.2 built in. The GPU is an old Maxwell-generation part, `sm_53`, which needs to be added to `gen_common.sh`; no problem there.

I installed Go 1.22 and CMake 3.29 successfully, and added them to `.profile`; all good there too.

However, the challenge is `gcc`: Ollama (and llama.cpp, from what I gather) requires gcc-11, but CUDA 10.2 is not supported past gcc-8. The JetPack / Ubuntu distribution includes gcc-7.5 out of the box. So attempting to build Ollama from source (which I need to do in order to add `sm_53` to `gen_common.sh`) results in errors and, of course, fails. I can upgrade to gcc-11 easily enough, and a build will then actually complete, but CUDA and the GPU are not usable, and it falls back to CPU inferencing.

Any ideas on how to overcome this chicken-and-egg scenario, or on how to build *only* the actual CUDA bits with gcc-7.5 and the rest of Ollama/llama.cpp with gcc-11?
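(For reference, one general approach to this kind of split is to let `nvcc` use a different host compiler than the rest of the build. This is a sketch, not the actual Ollama build commands; the `nvcc -ccbin` flag and the `CMAKE_CUDA_HOST_COMPILER` variable are standard CUDA/CMake mechanisms, and the paths and file names below are illustrative.)

```shell
# Sketch: build the C/C++ side with gcc-11, but tell nvcc to use
# gcc-8 as its host compiler (the newest gcc CUDA 10.2 supports).

# Option 1: pass the host compiler directly to nvcc via -ccbin.
nvcc -ccbin /usr/bin/gcc-8 -arch=sm_53 -c kernel.cu -o kernel.o

# Option 2: in a CMake-driven CUDA build, keep CC/CXX at gcc-11 and
# point only the CUDA host compiler at gcc-8.
CC=gcc-11 CXX=g++-11 cmake \
  -DCMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-8 \
  -DCMAKE_CUDA_ARCHITECTURES=53 \
  ..
```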

I actually tried injecting `update-alternatives --set gcc /usr/bin/gcc-7` into `gen_linux.sh` just before the CUDA bits, but that didn't work, as the build loops through components and some of the later bits need to go back to gcc-11 😄
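(A less invasive variant of that workaround, sketched here under the assumption of a CMake-driven configure step, is to scope the compiler choice to the environment of the one step that needs it instead of flipping the system-wide default. CMake reads the standard `CUDAHOSTCXX` environment variable to seed `CMAKE_CUDA_HOST_COMPILER`; the build directory name is illustrative.)

```shell
# Sketch: scope gcc-8 to the CUDA configure step only. CUDAHOSTCXX
# affects just nvcc's host compilation; CC/CXX stay at gcc-11 for
# everything else, so nothing needs to be switched back afterwards.
CC=gcc-11 CXX=g++-11 CUDAHOSTCXX=/usr/bin/g++-8 cmake -B build-cuda .
```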

Thanks!

GiteaMirror added the feature request, nvidia labels 2026-04-22 06:26:16 -05:00
Author
Owner

@dhiltgen commented on GitHub (May 31, 2024):

Based on my current understanding, to support binary releases we'll need one distinct CUDA build for each JetPack major version. I'm going to dedup and tidy up our issues to track this with 3 distinct issues.

<!-- gh-comment-id:2143033389 -->
Author
Owner

@dtischler commented on GitHub (Jun 1, 2024):

Sounds great, thanks @dhiltgen !

<!-- gh-comment-id:2143263177 -->
Author
Owner

@juhaem commented on GitHub (Jun 13, 2024):

Greatly appreciated. I just installed ollama on my Jetson Nano 4GB w/ CUDA 10.2, and it only uses the CPU.
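(A quick way to confirm CPU fallback is to check Ollama's own reporting. This is a sketch; the exact log wording varies by Ollama version, and `ollama ps` is only available in releases from around mid-2024 onward.)

```shell
# If Ollama runs as a systemd service, inspect its startup logs for
# GPU discovery messages (wording varies by version):
journalctl -u ollama --no-pager | grep -i -E 'cuda|gpu'

# While a model is loaded, check how it was placed; the PROCESSOR
# column distinguishes GPU from CPU placement:
ollama ps
```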

<!-- gh-comment-id:2164487619 -->
Author
Owner

@dhiltgen commented on GitHub (Jul 25, 2024):

I spent some time trying to see if I could get our build working on JP4, but as you pointed out, the GCC version requirements between CUDA v10 and llama.cpp seem to be mutually exclusive. Unfortunately I think this combination isn't something we'll be able to support. JP5 looks like it's going to be the oldest version we'll be able to support.

<!-- gh-comment-id:2250948042 -->
Author
Owner

@dtischler commented on GitHub (Jul 25, 2024):

Thanks for looking into it, appreciate that you all gave it a try.

<!-- gh-comment-id:2251290573 -->
Author
Owner

@konanast commented on GitHub (Feb 8, 2025):

Any updates on this topic? I tried using Ollama but was unsuccessful in getting it to utilize the GPU. It would be interesting to see this supported, especially now that some good smaller models are becoming available.

<!-- gh-comment-id:2645096332 -->
Author
Owner

@felipetobars commented on GitHub (Feb 9, 2026):

any update? :(

<!-- gh-comment-id:3868931266 -->

Reference: github-starred/ollama#28333