[PR #13317] [MERGED] CUDA: filter devices on secondary discovery #45410

Closed
opened 2026-04-25 01:07:11 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13317
Author: @dhiltgen
Created: 12/3/2025
Status: Merged
Merged: 12/3/2025
Merged by: @dhiltgen

Base: main ← Head: cuda_filter


📝 Commits (1)

  • 8137ffb CUDA: filter devices on secondary discovery

📊 Changes

3 files changed (+16 additions, -7 deletions)

View changed files

📝 discover/runner.go (+3 -2)
📝 llm/server.go (+1 -1)
📝 ml/device.go (+12 -4)

📄 Description

We now do a deeper probe of CUDA devices to verify that the library version has the correct compute capability coverage for the device. Because ROCm also interprets the CUDA visibility env var to filter AMD devices, setting it causes problems on mixed-vendor systems, so we try to avoid setting it. However, without it set for this deeper probe, each CUDA library subprocess discovers all CUDA GPUs, and on systems with many GPUs this can lead to timeouts. The fix is to turn on the CUDA visibility env var only for this deeper probe.

Fixes #13308
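
The approach in the description can be sketched roughly as follows. This is an illustration only and is not the code in `discover/runner.go` or `ml/device.go`: the function `probeEnv`, its parameters, and the library strings are hypothetical, and the sketch assumes the "CUDA visibility env var" is `CUDA_VISIBLE_DEVICES`. It builds the environment for a probe subprocess and pins it to one device only when the probe is the deeper CUDA check, leaving the variable untouched in every other case.

```go
package main

import (
	"fmt"
	"strings"
)

// cudaVisibleDevices is assumed to be the "CUDA visibility env var" the
// description refers to.
const cudaVisibleDevices = "CUDA_VISIBLE_DEVICES"

// probeEnv (hypothetical helper) returns a copy of base for a discovery
// subprocess. Only for a deep probe of a CUDA library does it pin the
// subprocess to deviceID; otherwise the environment is copied unchanged so
// that ROCm subprocesses are not filtered by a CUDA setting.
func probeEnv(base []string, library, deviceID string, deepProbe bool) []string {
	pin := deepProbe && library == "CUDA"
	env := make([]string, 0, len(base)+1)
	for _, kv := range base {
		// Drop any inherited value only when we are about to set our own.
		if pin && strings.HasPrefix(kv, cudaVisibleDevices+"=") {
			continue
		}
		env = append(env, kv)
	}
	if pin {
		env = append(env, cudaVisibleDevices+"="+deviceID)
	}
	return env
}

func main() {
	base := []string{"PATH=/usr/bin"}
	fmt.Println(probeEnv(base, "CUDA", "0", true))  // deep CUDA probe: pinned
	fmt.Println(probeEnv(base, "CUDA", "0", false)) // normal discovery: untouched
	fmt.Println(probeEnv(base, "ROCm", "0", true))  // other vendor: untouched
}
```

In a real runner the returned slice would typically be assigned to the `Env` field of an `exec.Cmd` before starting the probe subprocess, so each subprocess enumerates one GPU instead of all of them.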


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 01:07:11 -05:00
Reference: github-starred/ollama#45410