[PR #13186] discover: increase GPU discovery timeout when HSA_OVERRIDE_GFX_VERSION is set #19385

Open
opened 2026-04-16 07:05:45 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13186
Author: @Guedxx
Created: 11/21/2025
Status: 🔄 Open

Base: main ← Head: main


📝 Commits (1)

  • 09274c4 fix: override at model switch

📊 Changes

1 file changed (+12 additions, -2 deletions)

📝 discover/runner.go (+12 -2)

📄 Description

This PR extends the GPU discovery timeout from 3 seconds to 10 seconds specifically when the HSA_OVERRIDE_GFX_VERSION environment variable is detected. This resolves an issue where "unsupported" AMD GPUs would fall back to CPU inference when switching models.

Setting HSA_OVERRIDE_GFX_VERSION to force support for my specific AMD GPU often results in slower initialization (typically 4-6 seconds), likely due to JIT kernel compilation or driver overhead.

Currently, discover/runner.go has a hardcoded 3-second timeout for refreshing GPU state during a model switch. Because these overridden configurations take longer than 3 seconds to report back, the refresh times out. This causes Ollama to assume the GPU is unavailable or has 0 VRAM free, forcing a fallback to CPU inference for all subsequent models after the first one.

The initial server startup works correctly because it uses a generous 30-second "bootstrap" timeout. This fix applies similar logic to the refresh loop, but only when the user has explicitly opted into the override.


Related Issues

  • Most likely fixes #13002
  • Relates to #13070 - Addresses GPU discovery failures associated with HSA_OVERRIDE_GFX_VERSION.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
GiteaMirror added the pull-request label 2026-04-16 07:05:45 -05:00
Reference: github-starred/ollama#19385