[PR #556] [MERGED] pack in cuda libs #15489

Closed
opened 2026-04-16 05:00:42 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/556
Author: @BruceMacD
Created: 9/20/2023
Status: Merged
Merged: 9/20/2023
Merged by: @mxyng

Base: main ← Head: pack-cuda


📝 Commits (6)

- 4e8be78 pack in cuda libs
- b9bb5ca use cuda_version
- 1255bc9 only package 11.8 runner
- fc6ec35 remove libcuda.so
- 6c6a31a embed libraries using cmake
- a9ed7cc rename generate.go

📊 Changes

14 files changed (+52 additions, -115 deletions)


📝 docs/development.md (+1 -1)
📝 llm/llama.cpp/generate_darwin_amd64.go (+4 -5)
📝 llm/llama.cpp/generate_darwin_arm64.go (+4 -5)
📝 llm/llama.cpp/generate_linux.go (+9 -9)
📝 llm/llama.cpp/generate_windows.go (+2 -6)
➖ llm/llama.cpp/ggml_patch/0003-metal-add-missing-barriers-for-mul-mat-2699.patch (+0 -32)
📝 llm/llama.cpp/patches/0001-add-detokenize-endpoint.patch (+0 -0)
➕ llm/llama.cpp/patches/0001-copy-cuda-runtime-libraries.patch (+27 -0)
📝 llm/llama.cpp/patches/0002-34B-model-support.patch (+0 -0)
📝 llm/llama.cpp/patches/0003-metal-fix-synchronization-in-new-matrix-multiplicati.patch (+0 -0)
📝 llm/llama.cpp/patches/0004-metal-add-missing-barriers-for-mul-mat-2699.patch (+0 -0)
📝 llm/llama.cpp/patches/0005-ggml-support-CUDA-s-half-type-for-aarch64-1455-2670.patch (+0 -0)
📝 llm/llama.go (+4 -56)
📝 server/routes.go (+1 -1)

📄 Description

This change packs CUDA libs into the llama runner and tells the runner to use those libs.
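The mechanics, roughly: a new patch copies the CUDA runtime libraries into the build output next to the runner binary, and the Go side then points the runner's library search path at that directory when launching the subprocess. Below is a minimal hedged sketch of that launch step; `startRunner`, `runnerPath`, and `libDir` are illustrative names, not identifiers from this diff.

```go
package llm

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// startRunner launches the llama runner with the packed CUDA runtime
// libraries on its dynamic-linker search path (illustrative sketch).
func startRunner(runnerPath string) error {
	// Assume the CUDA libs were packed alongside the runner binary.
	libDir := filepath.Dir(runnerPath)

	cmd := exec.Command(runnerPath)
	// Prepend the packed directory so libcudart and friends resolve
	// from the bundle instead of whatever the host happens to have.
	cmd.Env = append(os.Environ(),
		fmt.Sprintf("LD_LIBRARY_PATH=%s:%s", libDir, os.Getenv("LD_LIBRARY_PATH")))
	return cmd.Start()
}
```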

Here is the generate command in my case:

go generate ./...
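
For context, `go generate ./...` runs the `//go:generate` directives in the `generate_*.go` files touched here, which configure and build the llama.cpp runner with cmake. A hedged illustration of that directive pattern follows; the exact paths and cmake flags are assumptions, not copied from the diff.

```go
// Illustrative go:generate directives for a CUDA runner build.
// LLAMA_CUBLAS was llama.cpp's CUDA build switch in this era; the
// source/build paths shown are assumptions for illustration only.
//
//go:generate cmake -S ggml -B ggml/build/cuda -DLLAMA_CUBLAS=on
//go:generate cmake --build ggml/build/cuda --target server --config Release
package llm
```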

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 05:00:42 -05:00
