[PR #4238] [MERGED] Record more GPU information #37296

Closed
opened 2026-04-22 22:00:40 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/4238
Author: @dhiltgen
Created: 5/7/2024
Status: Merged
Merged: 5/9/2024
Merged by: @dhiltgen

Base: main ← Head: gpu_info


📝 Commits (1)

  • 8727a9c Record more GPU information

📊 Changes

10 files changed (+150 additions, -96 deletions)


📝 gpu/amd_hip_windows.go (+10 -5)
📝 gpu/amd_linux.go (+61 -21)
📝 gpu/amd_windows.go (+19 -47)
📝 gpu/gpu.go (+11 -5)
📝 gpu/gpu_info.h (+3 -0)
📝 gpu/gpu_info_cpu.c (+0 -4)
📝 gpu/gpu_info_nvcuda.c (+12 -8)
📝 gpu/gpu_info_nvcuda.h (+3 -0)
📝 gpu/types.go (+29 -5)
📝 server/routes.go (+2 -1)

📄 Description

This cleans up the logging for GPU discovery a bit, and can serve as a foundation to report GPU information in a future UX.

Some example output (without OLLAMA_DEBUG set):

Windows CUDA:

time=2024-05-07T14:55:04.202-07:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11.3 cuda_v12.3 rocm_v5.7]"
time=2024-05-07T14:55:04.310-07:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-05-07T14:55:04.572-07:00 level=INFO source=amd_windows.go:90 msg="unsupported Radeon iGPU detected skipping" id=0 name="AMD Radeon(TM) Graphics" gfx=gfx1036
time=2024-05-07T14:55:04.635-07:00 level=INFO source=gpu.go:246 msg="inference compute" id=GPU-13b3d4ff-808b-ab50-e395-de65e58aa716 library=cuda compute=8.9 driver=12.3 name="NVIDIA GeForce RTX 4090" total="24563.5 MiB" available="23008.0 MiB"
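
For readers unfamiliar with this log shape, a record like the "inference compute" line above could be produced with Go's standard `log/slog` text handler. This is only a minimal sketch: the `gpuRecord` struct, the `mib` helper, and `logInferenceCompute` are hypothetical names for illustration and are not the types or functions changed in `gpu/types.go` or `gpu/gpu.go`.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
)

// gpuRecord is a hypothetical stand-in for the per-device data shown in the
// sample logs; it is not the struct defined in gpu/types.go.
type gpuRecord struct {
	ID         string
	Library    string
	Compute    string
	Driver     string
	Name       string
	TotalBytes uint64
	FreeBytes  uint64
}

// mib renders a byte count in the same style as the sample logs,
// one decimal place followed by " MiB".
func mib(b uint64) string {
	return fmt.Sprintf("%.1f MiB", float64(b)/(1024*1024))
}

// logInferenceCompute emits one INFO record per device using key/value attrs.
func logInferenceCompute(logger *slog.Logger, g gpuRecord) {
	logger.Info("inference compute",
		"id", g.ID,
		"library", g.Library,
		"compute", g.Compute,
		"driver", g.Driver,
		"name", g.Name,
		"total", mib(g.TotalBytes),
		"available", mib(g.FreeBytes),
	)
}

func main() {
	// The text handler writes key=value pairs and quotes values with spaces.
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	logInferenceCompute(logger, gpuRecord{
		ID: "GPU-example", Library: "cuda", Compute: "8.9", Driver: "12.3",
		Name: "Example GPU", TotalBytes: 24 << 30, FreeBytes: 22 << 30,
	})
}
```

The `source=` field seen in the real logs would additionally need `AddSource: true` in the handler options; it is left out here to keep the sketch short.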

Windows Radeon:

time=2024-05-07T15:00:22.839-07:00 level=INFO source=amd_windows.go:90 msg="unsupported Radeon iGPU detected skipping" id=0 name="AMD Radeon(TM) Graphics" gfx=gfx1036
time=2024-05-07T15:00:23.082-07:00 level=INFO source=gpu.go:246 msg="inference compute" id=1 library=rocm compute=gfx1100 driver=0.0 name="AMD Radeon RX 7900 XTX" total="24560.0 MiB" available="24432.0 MiB"

Linux Radeon with a mismatched gfx version, without the override:

time=2024-05-07T21:56:20.080Z level=WARN source=amd_linux.go:290 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1034 library=/usr/share/ollama/lib/rocm supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-05-07T21:56:20.080Z level=WARN source=amd_linux.go:292 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-05-07T21:56:20.080Z level=INFO source=amd_linux.go:305 msg="no compatible amdgpu devices detected"
time=2024-05-07T21:56:20.080Z level=INFO source=gpu.go:246 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="32051.6 MiB" available="271.3 MiB"

Same system with the override set:

time=2024-05-07T21:57:08.855Z level=INFO source=amd_linux.go:298 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.3.0
time=2024-05-07T21:57:08.855Z level=INFO source=gpu.go:246 msg="inference compute" id=0 library=rocm compute=gfx1034 driver=6.3 name=1002:743f total="4080.0 MiB" available="1102.9 MiB"
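
The two Linux Radeon runs above suggest a simple decision: when `HSA_OVERRIDE_GFX_VERSION` is set the gfx compatibility check is skipped, otherwise the device's gfx type must appear in the library's supported list. The sketch below illustrates that logic only. The function name `gfxUsable` is hypothetical, the supported list is copied from the sample WARN line, and none of this is taken from `gpu/amd_linux.go`.

```go
package main

import (
	"fmt"
	"os"
	"slices"
)

// gfxUsable reports whether a device would pass the compatibility step.
// Assumption: a non-empty override skips the check entirely, as the
// "skipping rocm gfx compatibility check" log line implies.
func gfxUsable(gfx string, supported []string, override string) bool {
	if override != "" {
		return true
	}
	return slices.Contains(supported, gfx)
}

func main() {
	// Supported types as printed in the sample log output.
	supported := []string{
		"gfx1030", "gfx1100", "gfx1101", "gfx1102", "gfx900", "gfx906",
		"gfx908", "gfx90a", "gfx940", "gfx941", "gfx942",
	}
	override := os.Getenv("HSA_OVERRIDE_GFX_VERSION")
	fmt.Println(gfxUsable("gfx1034", supported, override))
}
```

Run without the environment variable, the gfx1034 device is rejected; run with `HSA_OVERRIDE_GFX_VERSION=10.3.0`, it is accepted, matching the two log excerpts.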

Dual CUDA setup:

time=2024-05-07T21:51:19.894Z level=INFO source=gpu.go:246 msg="inference compute" id=GPU-19fc4f1e-fbcc-de33-f14a-ae21199420b6 library=cuda compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3060" total="12030.6 MiB" available="11922.6 MiB"
time=2024-05-07T21:51:19.894Z level=INFO source=gpu.go:246 msg="inference compute" id=GPU-f3a94ab8-b31d-61ff-9fbb-ce91ac1cdd95 library=cuda compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3060" total="12037.4 MiB" available="8897.4 MiB"
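
Because each record is a flat run of key=value pairs, a tool that wants to consume these lines (for example to list the detected devices) can split them with a small parser. The following is a rough sketch under stated assumptions: `parseLogLine` is a hypothetical helper, and it does not handle escaped quotes inside quoted values, which the real handler may emit.

```go
package main

import "fmt"

// parseLogLine splits a key=value log line into a map.
// Quoted values may contain spaces; escaped quotes are NOT handled.
func parseLogLine(line string) map[string]string {
	out := map[string]string{}
	i := 0
	for i < len(line) {
		// Skip separating spaces.
		for i < len(line) && line[i] == ' ' {
			i++
		}
		// Read the key up to '='.
		start := i
		for i < len(line) && line[i] != '=' && line[i] != ' ' {
			i++
		}
		if i >= len(line) || line[i] != '=' {
			continue // bare token without a value; ignore it
		}
		key := line[start:i]
		i++ // consume '='
		var val string
		if i < len(line) && line[i] == '"' {
			i++ // consume opening quote
			start = i
			for i < len(line) && line[i] != '"' {
				i++
			}
			val = line[start:i]
			if i < len(line) {
				i++ // consume closing quote
			}
		} else {
			start = i
			for i < len(line) && line[i] != ' ' {
				i++
			}
			val = line[start:i]
		}
		out[key] = val
	}
	return out
}

func main() {
	line := `level=INFO msg="inference compute" library=cuda compute=8.6 name="NVIDIA GeForce RTX 3060" total="12030.6 MiB"`
	rec := parseLogLine(line)
	fmt.Println(rec["library"], rec["name"], rec["total"])
}
```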

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 22:00:40 -05:00
Reference: github-starred/ollama#37296