[PR #2162] [MERGED] Report more information about GPUs in verbose mode #10803

Closed
opened 2026-04-12 23:11:19 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/2162
Author: @dhiltgen
Created: 1/23/2024
Status: Merged
Merged: 1/24/2024
Merged by: @dhiltgen

Base: main ← Head: rocm_real_gpus


📝 Commits (1)

  • 987c16b Report more information about GPUs in verbose mode

📊 Changes

6 files changed (+171 additions, -30 deletions)


📝 gpu/gpu.go (+9 -0)
📝 gpu/gpu_info.h (+7 -0)
📝 gpu/gpu_info_cuda.c (+52 -2)
📝 gpu/gpu_info_cuda.h (+12 -0)
📝 gpu/gpu_info_rocm.c (+81 -27)
📝 gpu/gpu_info_rocm.h (+10 -1)

📄 Description

This adds calls to both the CUDA and ROCm management libraries to discover additional attributes of the GPU(s) detected in the system, and wires up runtime verbosity selection. When users hit problems with their GPUs, we can ask them to run OLLAMA_DEBUG=1 ollama serve and share the server log.

Example output on a CUDA laptop:

% OLLAMA_DEBUG=1 ./ollama-linux-amd64 serve
...
time=2024-01-23T11:31:22.828-08:00 level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:256 msg="Discovered GPU libraries: [/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.545.23.08]"
CUDA driver version: 545.23.08
time=2024-01-23T11:31:22.859-08:00 level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:96 msg="Nvidia GPU detected"
[0] CUDA device name: NVIDIA GeForce GTX 1650 with Max-Q Design
[0] CUDA part number:
nvmlDeviceGetSerial failed: 3
[0] CUDA vbios version: 90.17.31.00.26
[0] CUDA brand: 5
[0] CUDA totalMem 4294967296
[0] CUDA usedMem 3789357056
time=2024-01-23T11:31:22.865-08:00 level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:137 msg="CUDA Compute Capability detected: 7.5"
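The totalMem and usedMem figures in the log are raw byte counts. A small illustrative Go helper (not part of the PR) shows how they map to human-readable sizes:

```go
package main

import "fmt"

// formatGiB renders a raw byte count, as logged by the GPU
// discovery code above, in binary gigabytes (GiB).
func formatGiB(b uint64) string {
	return fmt.Sprintf("%.1f GiB", float64(b)/(1<<30))
}

func main() {
	fmt.Println(formatGiB(4294967296)) // the CUDA totalMem line: 4.0 GiB
	fmt.Println(formatGiB(3789357056)) // the CUDA usedMem line: 3.5 GiB
}
```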

Example output on a ROCm GPU system:

% OLLAMA_DEBUG=1 ./ollama-linux-amd64 serve
...
time=2024-01-23T19:24:55.162Z level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:256 msg="Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.6.0.60000 /opt/rocm-6.0.0/lib/librocm_smi64.so.6.0.60000]"
time=2024-01-23T19:24:55.163Z level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:106 msg="Radeon GPU detected"
discovered 1 ROCm GPU Devices
[0] ROCm device name: Navi 31 [Radeon RX 7900 XT/7900 XTX]
[0] ROCm GPU brand: Navi 31 [Radeon RX 7900 XT/7900 XTX]
[0] ROCm GPU vendor: Advanced Micro Devices, Inc. [AMD/ATI]
[0] ROCm GPU VRAM vendor: samsung
[0] ROCm GPU S/N: 43cfeecf3446fbf7
[0] ROCm GPU subsystem name: NITRO+ RX 7900 XTX Vapor-X
[0] ROCm GPU vbios version: 113-4E4710U-T4Y
[0] ROCm totalMem 25753026560
[0] ROCm usedMem 27852800

This also implements the TODO on ROCm to handle multiple GPUs reported by the management library.
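Handling multiple GPUs amounts to asking the management library for a device count and looping over the indices. A hedged Go sketch of the pattern; the gpuDevice type and sumFreeMem function are illustrative stand-ins for the C-side rocm_smi calls, not the PR's code:

```go
package main

import "fmt"

// gpuDevice is an illustrative stand-in for the per-device
// attributes the ROCm management library reports.
type gpuDevice struct {
	Name              string
	TotalMem, UsedMem uint64
}

// sumFreeMem aggregates free VRAM across every reported device,
// mirroring a loop over the library's device count.
func sumFreeMem(devs []gpuDevice) uint64 {
	var free uint64
	for i, d := range devs {
		fmt.Printf("[%d] ROCm device name: %s\n", i, d.Name)
		free += d.TotalMem - d.UsedMem
	}
	return free
}

func main() {
	devs := []gpuDevice{
		{"Navi 31 [Radeon RX 7900 XT/7900 XTX]", 25753026560, 27852800},
	}
	fmt.Println(sumFreeMem(devs))
}
```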


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference: github-starred/ollama#10803