[PR #7212] [MERGED] Better support for AMD multi-GPU on linux #12351

Closed
opened 2026-04-12 23:56:26 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7212
Author: @dhiltgen
Created: 10/15/2024
Status: Merged
Merged: 10/26/2024
Merged by: @dhiltgen

Base: main ← Head: dual_rocm


📝 Commits (2)

  • 42e7386 Better support for AMD multi-GPU
  • 18eef7a ROCR_VISIBLE_DEVICES only works on linux

📊 Changes

6 files changed (+69 additions, -43 deletions)


📝 discover/amd_common.go (+0 -13)
📝 discover/amd_hip_windows.go (+1 -1)
📝 discover/amd_linux.go (+40 -23)
📝 discover/amd_windows.go (+18 -1)
📝 docs/gpu.md (+7 -2)
📝 envconfig/config.go (+3 -3)

📄 Description

This resolves a number of problems related to AMD multi-GPU setups on Linux.

The numeric IDs used by ROCm are not the same as the numeric IDs exposed in sysfs, although the ordering is consistent. We have to count up from zero, starting at the first valid gfx (major/minor/patch with non-zero values) we find.
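
As a rough illustration of the counting rule described above, here is a minimal sketch, not the code from this PR. The `sysfsNode` type and the `rocmIDs` function are hypothetical names, and treating an all-zero gfx version as "not a usable GPU" is an assumption about what "valid" means here.

```go
package main

// sysfsNode is a hypothetical stand-in for one device entry discovered in
// sysfs, listed in the order sysfs exposes it.
type sysfsNode struct {
	SysfsID             int // numeric ID as seen in sysfs
	Major, Minor, Patch int // parsed gfx version components
}

// validGfx reports whether the node carries a usable gfx version.
// Assumption: an all-zero version marks a node that ROCm does not count.
func (n sysfsNode) validGfx() bool {
	return n.Major != 0 || n.Minor != 0 || n.Patch != 0
}

// rocmIDs walks the nodes in sysfs order and assigns ROCm-style numeric IDs,
// counting up from zero and skipping nodes without a valid gfx version.
// The result maps sysfs ID -> ROCm ID.
func rocmIDs(nodes []sysfsNode) map[int]int {
	ids := make(map[int]int)
	next := 0
	for _, n := range nodes {
		if !n.validGfx() {
			continue // not counted, so it does not consume a ROCm ID
		}
		ids[n.SysfsID] = next
		next++
	}
	return ids
}
```

With this sketch, nodes at sysfs IDs 0 through 3 where only 1 and 3 have non-zero gfx versions would map to ROCm IDs 0 and 1, which shows why the two numbering schemes diverge while keeping the same order.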

There are three different environment variables for selecting GPUs, and only ROCR_VISIBLE_DEVICES supports UUID-based identification, so we should favor that one and use UUIDs when they are detected, to avoid potential ordering bugs with numeric IDs.
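
The preference for UUIDs could be sketched roughly as follows. This is an illustrative sketch under assumptions, not the implementation merged here: the `gpu` type and `visibleDevices` function are hypothetical, and the sample UUID string in the usage note is made up.

```go
package main

import (
	"strconv"
	"strings"
)

// gpu is a hypothetical record for one selected device.
type gpu struct {
	Index int    // ROCm numeric ID, used only as a fallback
	UUID  string // empty when no UUID was detected for the device
}

// visibleDevices returns the environment variable name and value that would
// restrict a child process to the given GPUs. It prefers each GPU's UUID and
// falls back to the numeric ID when no UUID is available.
func visibleDevices(gpus []gpu) (key string, value string) {
	ids := make([]string, 0, len(gpus))
	for _, g := range gpus {
		if g.UUID != "" {
			ids = append(ids, g.UUID)
		} else {
			ids = append(ids, strconv.Itoa(g.Index))
		}
	}
	return "ROCR_VISIBLE_DEVICES", strings.Join(ids, ",")
}
```

For example, `visibleDevices([]gpu{{Index: 0, UUID: "GPU-aaaa"}, {Index: 1}})` would yield the value `GPU-aaaa,1`. Per the second commit in this PR, ROCR_VISIBLE_DEVICES only works on Linux, so a sketch like this would apply to the Linux path only.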

Fixes #6595
Fixes #6304
Fixes #6802
Fixes #5143


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:56:26 -05:00
Reference: github-starred/ollama#12351