[PR #4741] [CLOSED] Add Jetson cuda variants for arm #37454

Closed
opened 2026-04-22 22:09:49 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/4741
Author: @dhiltgen
Created: 5/31/2024
Status: Closed

Base: main ← Head: jetson_variants


📝 Commits (3)

  • c818cd5 Refactor linux packaging
  • e8923a3 Add Jetson cuda variants for arm
  • 1abf1cf Report GPU variant in log

📊 Changes

19 files changed (+275 additions, -158 deletions)

View changed files

📝 .github/workflows/release.yaml (+0 -1)
📝 Dockerfile (+67 -25)
📝 app/ollama.iss (+1 -10)
📝 envconfig/config.go (+2 -2)
📝 gpu/amd_common.go (+1 -1)
📝 gpu/amd_windows.go (+1 -1)
📝 gpu/gpu.go (+76 -15)
📝 gpu/gpu_darwin.go (+2 -2)
📝 gpu/gpu_linux.go (+1 -1)
📝 gpu/types.go (+4 -3)
📝 llm/ext_server/CMakeLists.txt (+2 -1)
📝 llm/generate/gen_common.sh (+15 -2)
📝 llm/generate/gen_linux.sh (+39 -45)
📝 llm/generate/gen_windows.ps1 (+20 -23)
📝 llm/payload.go (+2 -2)
📝 llm/server.go (+5 -7)
📝 scripts/build_linux.sh (+5 -6)
📝 scripts/build_windows.ps1 (+6 -6)
📝 scripts/install.sh (+26 -5)

📄 Description

This adds new arm64 build variants specific to NVIDIA Jetson platforms.

Fixes #2408 #4693 #5100 #4861

Updated to layer on #5631 to reduce the payload size so we're not at risk of exceeding the limits.
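As a rough illustration of the variant split, the runner choice could be sketched like this. This is a hypothetical shell sketch, not the PR's actual Go detection code in `gpu/gpu.go`; it assumes JetPack 5 ships L4T r35.x and JetPack 6 ships L4T r36.x, with the release string coming from `/etc/nv_tegra_release`:

```shell
# Hypothetical mapping from an L4T release string to a bundled runner
# directory (cuda_jetpack5, cuda_jetpack6, or the generic cuda_v11).
pick_variant() {
  l4t="$1"
  case "$l4t" in
    *R36*) echo cuda_jetpack6 ;;   # JetPack 6 (assumed L4T r36.x)
    *R35*) echo cuda_jetpack5 ;;   # JetPack 5 (assumed L4T r35.x)
    *)     echo cuda_v11 ;;        # non-Jetson arm64 with discrete CUDA
  esac
}

pick_variant "# R36 (release), REVISION: 2.0"   # → cuda_jetpack6
```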

After extracting the tgz:

% ls -F
cuda/  cuda_jetpack5/  cuda_jetpack6/  ollama*  ollama-linux-arm64.tgz
% ls -lh ollama-linux-arm64.tgz
-rw-r--r-- 1 daniel daniel 1.2G Jul 12 09:09 ollama-linux-arm64.tgz
% ls -lh ollama
-rwxr-xr-x 1 daniel daniel 245M Jul 12 08:58 ollama
% du -sh cuda*
356M	cuda
519M	cuda_jetpack5
552M	cuda_jetpack6
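Since only one of the CUDA variants is ever used on a given device, a user could prune the others after extraction to reclaim most of that space. A minimal sketch, assuming the directory names from the listing above; the pruning step is my own suggestion, not part of the PR's `scripts/install.sh`:

```shell
set -eu
# Stand-ins for the directories extracted from ollama-linux-arm64.tgz.
mkdir -p cuda cuda_jetpack5 cuda_jetpack6

# On a real Jetson this would be read from /etc/nv_tegra_release.
l4t_release="# R36 (release), REVISION: 2.0"

# Keep only the JetPack variant matching this device.
case "$l4t_release" in
  *R36*) rm -rf cuda_jetpack5 ;;
  *R35*) rm -rf cuda_jetpack6 ;;
esac

ls -d cuda*
```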

"Dynamic LLM libraries [cuda_jetpack6 cuda_v11 cpu cuda_jetpack5]"

time=2024-07-12T09:16:54.654-07:00 level=INFO source=server.go:383 msg="starting llama server" cmd="/tmp/ollama2043741824/runners/cuda_jetpack5/ollama_llama_server --model /home/daniel/.ollama/models/blobs/sha256-6a0746a1ec1aef3e7ec53868f220ff6e389f6f8ef87a01d77c96807de94ca2aa --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --verbose --parallel 4 --port 45659"
time=2024-07-12T09:16:54.654-07:00 level=DEBUG source=server.go:398 msg=subprocess environment="[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin LD_LIBRARY_PATH=/home/daniel/tmp/cuda_jetpack5:/tmp/ollama2043741824/runners/cuda_jetpack5:/tmp/ollama2043741824/runners CUDA_VISIBLE_DEVICES=GPU-de921cca-84a7-545a-ac50-34a5746dc088]"
...
ggml_cuda_init: found 1 CUDA devices:
  Device 0: Orin, compute capability 8.7, VMM: yes
llm_load_tensors: ggml ctx size =    0.27 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
...
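To confirm the Jetson runner is actually using the GPU, the `offloaded N/M layers` line in the log above is the quick check. A small sketch (my own helper, not part of the PR) that parses that line:

```shell
# Parse the llama.cpp offload summary and report whether all layers
# landed on the GPU. The log line format matches the excerpt above.
log='llm_load_tensors: offloaded 33/33 layers to GPU'
offloaded=$(printf '%s\n' "$log" | sed -n 's/.*offloaded \([0-9]*\)\/\([0-9]*\).*/\1 \2/p')
set -- $offloaded
[ "$1" = "$2" ] && echo "full offload" || echo "partial offload: $1/$2"
# → full offload
```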

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 22:09:49 -05:00

Reference: github-starred/ollama#37454