[PR #5049] [MERGED] Cuda v12 #11659

Closed
opened 2026-04-12 23:34:58 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5049
Author: @dhiltgen
Created: 6/14/2024
Status: Merged
Merged: 8/19/2024
Merged by: @dhiltgen

Base: main ← Head: cuda_v12


📝 Commits (10+)

  • 74d45f0 Refactor linux packaging
  • c7bcb00 Wire up ccache and pigz in the docker based build
  • d470ebe Add Jetson cuda variants for arm
  • fc3b4cd Report GPU variant in log
  • 4fe3a55 Add cuda v12 variant and selection logic
  • f6c811b Enable cuda v12 flags
  • 927d98a Add windows cuda v12 + v11 support
  • 3b19cdb Remove Jetpack
  • 88bb9e3 Adjust layout to bin+lib/ollama
  • f9e31da Review comments

📊 Changes

23 files changed (+447 additions, -217 deletions)

View changed files

📝 .github/workflows/release.yaml (+15 -5)
📝 Dockerfile (+103 -30)
📝 app/ollama.iss (+6 -15)
📝 docs/linux.md (+4 -6)
📝 envconfig/config.go (+5 -5)
📝 gpu/amd_common.go (+1 -1)
📝 gpu/amd_windows.go (+1 -1)
📝 gpu/cuda_common.go (+43 -0)
📝 gpu/gpu.go (+49 -21)
📝 gpu/gpu_darwin.go (+2 -2)
📝 gpu/gpu_linux.go (+1 -1)
📝 gpu/types.go (+8 -5)
📝 llm/ext_server/CMakeLists.txt (+2 -1)
📝 llm/generate/gen_common.sh (+24 -8)
📝 llm/generate/gen_darwin.sh (+2 -0)
📝 llm/generate/gen_linux.sh (+41 -45)
📝 llm/generate/gen_windows.ps1 (+26 -29)
📝 llm/payload.go (+2 -2)
📝 llm/server.go (+5 -7)
📝 scripts/build_linux.sh (+6 -6)

...and 3 more files

📄 Description

This builds on the new Linux packaging model in #5631 to support building two different CUDA runners: v11, for support going back to compute capability (CC) 5.0, and v12, for CC 6.0 and newer GPUs. This lets us start enabling new features such as GGML_CUDA_USE_GRAPHS, which require CUDA v12, without dropping support for older GPUs.
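The split described above implies a runner-selection rule keyed on the GPU's compute capability. The sketch below illustrates that rule in Go; the function and variant names are hypothetical and do not reflect the actual identifiers in gpu/cuda_common.go.

```go
package main

import "fmt"

// cudaVariantForCC sketches the selection rule this PR describes:
// GPUs at compute capability (CC) 6.0 or newer get the cuda_v12 runner,
// CC 5.x GPUs fall back to cuda_v11, and anything older is unsupported.
// Names here are illustrative, not the real ones in gpu/cuda_common.go.
func cudaVariantForCC(major, minor int) (string, error) {
	switch {
	case major >= 6:
		return "cuda_v12", nil
	case major == 5:
		return "cuda_v11", nil
	default:
		return "", fmt.Errorf("compute capability %d.%d is below the supported minimum (5.0)", major, minor)
	}
}

func main() {
	// Walk a few representative GPUs through the rule.
	for _, cc := range [][2]int{{8, 6}, {6, 1}, {5, 2}, {3, 7}} {
		v, err := cudaVariantForCC(cc[0], cc[1])
		if err != nil {
			fmt.Printf("CC %d.%d: %v\n", cc[0], cc[1], err)
			continue
		}
		fmt.Printf("CC %d.%d -> %s\n", cc[0], cc[1], v)
	}
}
```

Keeping the threshold in one function mirrors why the real logic lives in a shared file (gpu/cuda_common.go): both the Linux and Windows GPU discovery paths need the same variant decision.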

Fixes #4958
Fixes #5737
Fixes #2361
Fixes #6144

Resulting sizes:

% ls -lh dist/*.xz
-rw-r--r--  1 daniel  staff   1.4G Aug 12 11:43 dist/ollama-linux-amd64.tar.xz
-rw-r--r--  1 daniel  staff   1.5G Aug 12 12:11 dist/ollama-linux-arm64.tar.xz
time=2024-07-12T20:24:36.369Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm_v60101 cpu]"

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:34:58 -05:00
Reference: github-starred/ollama#11659