[PR #15608] api: expose GPU device info in /api/status #46476

Open
opened 2026-04-25 01:53:43 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15608
Author: @Frank-Schruefer
Created: 4/15/2026
Status: 🔄 Open

Base: main ← Head: feature/api-status-gpu-info


📝 Commits (2)

  • 132b47d sample: add segment-level repetition loop detection
  • 16d139b api: expose GPU device info in /api/status

📊 Changes

6 files changed (+240 additions, -24 deletions)

View changed files

📝 api/types.go (+31 -0)
📝 runner/ollamarunner/runner.go (+4 -0)
📝 sample/samplers.go (+86 -14)
📝 sample/samplers_benchmark_test.go (+4 -4)
📝 sample/samplers_test.go (+96 -6)
📝 server/routes.go (+19 -0)

📄 Description

Currently there is no way to query GPU hardware information through the Ollama API. This adds a Gpus field to StatusResponse that exposes available GPU devices via the existing discover.GPUDevices() function.

Clients can use this to query GPU hardware (name, total VRAM, compute capability, etc.) for capacity planning and model size selection — for example to determine the largest context window that still allows all model layers to be offloaded to the GPU, or to pick the largest model that fits in available VRAM.

The field is tagged omitempty, so the change is fully backwards-compatible: existing clients that ignore unknown fields see no difference.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 01:53:43 -05:00

Reference: github-starred/ollama#46476