[GH-ISSUE #14357] ollama's NVML queries fail with the 580.xx driver #71390

Closed
opened 2026-05-05 01:28:09 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @hyperu on GitHub (Feb 22, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14357

The issue is that ollama's NVML queries fail with the 580.xx driver, causing `total_vram="0 B"` during GPU discovery. This prevents any GPU layer offloading.

What we tried:

  1. ✅ Added `LD_LIBRARY_PATH=/usr/lib/ollama` to service override
  2. ✅ Added ollama user to `video` and `render` groups
  3. ✅ Rebuilt ollama-cuda against CUDA 13.0
  4. ❌ None fixed the VRAM detection
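The service override in steps 1 and 2 can be expressed as a systemd drop-in. A minimal sketch, assuming a stock `ollama.service` unit; the drop-in file path is the standard systemd location, not something quoted in the report:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
# Apply with: systemctl daemon-reload && systemctl restart ollama
[Service]
Environment="LD_LIBRARY_PATH=/usr/lib/ollama"
# Equivalent to step 2's group change (which could also be done with
# `usermod -aG video,render ollama`):
SupplementaryGroups=video render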

Root cause: Ollama's GPU discovery code uses NVML library calls that don't work properly with the new 580.xx driver (CUDA 13.0). The CUDA backend loads correctly later, but the initial GPU detection fails.
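To check whether NVML itself reports zero memory under the 580.xx driver, independent of ollama, you can query the library directly. A minimal probe in Python via `ctypes` against `libnvidia-ml.so.1`; the entry points (`nvmlInit_v2`, `nvmlDeviceGetHandleByIndex_v2`, `nvmlDeviceGetMemoryInfo`) are real NVML functions, but this is an illustration, not ollama's actual discovery code:

```python
import ctypes

class NvmlMemory(ctypes.Structure):
    # Mirrors NVML's nvmlMemory_t: sizes in bytes.
    _fields_ = [("total", ctypes.c_ulonglong),
                ("free", ctypes.c_ulonglong),
                ("used", ctypes.c_ulonglong)]

def query_total_vram(index=0):
    """Return total VRAM in bytes for GPU `index`, or None if NVML is
    unavailable or any call fails (the "0 B" discovery symptom)."""
    try:
        nvml = ctypes.CDLL("libnvidia-ml.so.1")
    except OSError:
        return None  # driver library not installed on this machine
    if nvml.nvmlInit_v2() != 0:
        return None
    try:
        handle = ctypes.c_void_p()
        if nvml.nvmlDeviceGetHandleByIndex_v2(index, ctypes.byref(handle)) != 0:
            return None
        mem = NvmlMemory()
        if nvml.nvmlDeviceGetMemoryInfo(handle, ctypes.byref(mem)) != 0:
            return None
        return mem.total
    finally:
        nvml.nvmlShutdown()

if __name__ == "__main__":
    total = query_total_vram()
    print("NVML unavailable" if total is None else f"total_vram={total} bytes")
```

If this prints a nonzero total while ollama still logs `total_vram="0 B"`, the problem is in ollama's discovery path rather than the driver's NVML library.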

GiteaMirror added the bug, nvidia, needs more info labels 2026-05-05 01:28:10 -05:00
Author
Owner

@rick-github commented on GitHub (Feb 22, 2026):

Works fine on my systems:

```
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
level=INFO source=routes.go:1739 msg="vram-based default context" total_vram="31.9 GiB" default_num_ctx=32768
```

Set `OLLAMA_DEBUG=2` in the server environment and post [server logs](https://docs.ollama.com/troubleshooting) to help in debugging.
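On a systemd install, the debug flag can be set with another drop-in. A sketch assuming the standard `ollama.service` unit; logs are then collected with `journalctl`:

```ini
# /etc/systemd/system/ollama.service.d/debug.conf
# After editing: systemctl daemon-reload && systemctl restart ollama
# Collect logs:  journalctl -u ollama --no-pager > ollama-debug.log
[Service]
Environment="OLLAMA_DEBUG=2"
```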


Reference: github-starred/ollama#71390