[PR #1261] [MERGED] Disable CUDA peer access as a workaround for multi-gpu inference bug #15801

Closed
opened 2026-04-16 05:08:48 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/1261
Author: @wookayin
Created: 11/24/2023
Status: Merged
Merged: 11/24/2023
Merged by: @jmorganca

Base: main ← Head: multigpu-workaround


📝 Commits (1)

  • 107385d Disable CUDA peer access as a workaround for multi-gpu inference bug

📊 Changes

1 file changed (+1 addition, -1 deletion)


📝 llm/llama.cpp/generate_linux.go (+1 -1)

📄 Description

When CUDA peer access is enabled, multi-GPU inference produces garbage output. This is a known bug in llama.cpp (or in NVIDIA's stack). Until the upstream bug https://github.com/ggerganov/llama.cpp/issues/3772 is fixed, CUDA peer access can be disabled temporarily to ensure correct output.

See #961.
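For context, llama.cpp exposes a `LLAMA_CUDA_PEER_MAX_BATCH_SIZE` CMake define that gates peer-to-peer transfers between GPUs; setting it to 0 effectively disables peer access. The sketch below shows the shape of such a build-flag change (the surrounding flags and invocation are assumptions for illustration, not the exact diff in `llm/llama.cpp/generate_linux.go`):

```
# Sketch, not the exact diff: the generate script passes CMake defines when
# building llama.cpp. Adding LLAMA_CUDA_PEER_MAX_BATCH_SIZE=0 disables
# peer-to-peer copies between GPUs, sidestepping the garbage-output bug.
cmake -S . -B build -DLLAMA_CUBLAS=on -DLLAMA_CUDA_PEER_MAX_BATCH_SIZE=0
```

Because this only changes build configuration, no runtime flag is needed; multi-GPU tensor transfers fall back to staging through host memory, which is slower but produces correct results.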


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 05:08:48 -05:00

Reference: github-starred/ollama#15801