[PR #628] [CLOSED] Dynamically support ROCm, CUDA, or OpenCL in the GPU-accelerated binary #10264

Closed
opened 2026-04-12 22:56:26 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/628
Author: @65a
Created: 9/28/2023
Status: Closed

Base: main ← Head: main


📄 Description

This PR changes the way CMake generation works for the cuda binary and adds support for querying AMD VRAM. Support for ROCm or CUDA (or OpenCL if neither is available) is enabled dynamically for gguf, and CUDA or OpenCL support is enabled dynamically for ggml. This is done by a CMake-managing Go script run via go generate, which uses heuristics to detect which accelerator SDKs are present and enables them in the following order: CUDA, ROCm, OpenCL.
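A minimal sketch of the detection heuristic described above, assuming it checks for well-known toolkit binaries and install paths and picks an accelerator in the order CUDA, ROCm, OpenCL. The paths and the detectAccelerator name are illustrative assumptions, not the PR's actual code.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// exists reports whether a filesystem path is present.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// detectAccelerator picks an accelerator backend using simple presence heuristics
// (hypothetical example; the real script manages CMake flags as well).
func detectAccelerator() string {
	// CUDA: nvcc on PATH or a default toolkit install.
	if _, err := exec.LookPath("nvcc"); err == nil || exists("/usr/local/cuda") {
		return "cuda"
	}
	// ROCm: hipcc on PATH or a default ROCm install.
	if _, err := exec.LookPath("hipcc"); err == nil || exists("/opt/rocm") {
		return "rocm"
	}
	// OpenCL: fall back if an ICD loader library is present.
	if exists("/usr/lib/x86_64-linux-gnu/libOpenCL.so") || exists("/usr/lib/libOpenCL.so") {
		return "opencl"
	}
	return "cpu"
}

func main() {
	fmt.Println("selected accelerator:", detectAccelerator())
}
```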

The VRAM detection change uses rocm-info. Note that machines with both an AMD and an NVIDIA GPU will use CUDA and report CUDA VRAM by default, so the default binary name of cuda is still appropriate, but it might make sense to call it gpu or accelerated or something similar in the future.
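A rough sketch of how AMD VRAM could be queried by shelling out to the ROCm info tool and scanning its memory-pool output, as the description suggests. The exact output format, the summing over all pools, and the totalAMDVRAM name are assumptions for illustration; a real implementation would filter to GPU agents.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// totalAMDVRAM sums the pool sizes reported by rocminfo and returns bytes
// (hypothetical helper; shown only to illustrate the parsing approach).
func totalAMDVRAM() (uint64, error) {
	out, err := exec.Command("rocminfo").Output()
	if err != nil {
		return 0, err
	}
	var total uint64
	scanner := bufio.NewScanner(bytes.NewReader(out))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Memory pools appear as lines like "Size: 16760832(0xffc000) KB".
		if strings.HasPrefix(line, "Size:") && strings.HasSuffix(line, "KB") {
			fields := strings.Fields(line)
			if len(fields) < 2 {
				continue
			}
			num := strings.SplitN(fields[1], "(", 2)[0]
			kb, err := strconv.ParseUint(num, 10, 64)
			if err != nil {
				continue
			}
			total += kb * 1024
		}
	}
	return total, scanner.Err()
}

func main() {
	if vram, err := totalAMDVRAM(); err == nil {
		fmt.Printf("reported memory pools: %d bytes\n", vram)
	} else {
		fmt.Println("rocminfo not available:", err)
	}
}
```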


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 22:56:26 -05:00
Reference: github-starred/ollama#10264