[GH-ISSUE #10940] Detailed info in /api/ps or other new endpoint #53716

Closed
opened 2026-04-29 04:35:28 -05:00 by GiteaMirror · 2 comments

Originally created by @ckuethe on GitHub (Jun 1, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10940

I'd like to propose adding detailed processing statistics, either as an extension of `/api/ps` or as a whole new endpoint, to answer the question "hey ollama, what are you doing right now?"
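
To illustrate, here's roughly what `/api/ps` gives you today (a minimal sketch, assuming a local Ollama on the default port; `size_vram` and `expires_at` are the field names in the current API docs, though the exact shape may vary by version):

```python
import json
import urllib.request

# Query the existing /api/ps endpoint. It reports which models are loaded
# and where their weights live (size, size_vram), but says nothing about
# whether a runner is actively computing or how much energy it's using.
with urllib.request.urlopen("http://localhost:11434/api/ps") as resp:
    ps = json.load(resp)

for m in ps.get("models", []):
    print(m["name"], m.get("size_vram"), m.get("expires_at"))
```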

At this very moment I happen to have cogito loaded, doing nothing because I haven't asked it to do anything. `ollama ps` says cogito is 100% GPU, which is true, but it's also not consuming any CPU and, more importantly, not consuming any energy. If I tell it to go compute something, `amdgpu_top` or `rocm-smi` do show that the ollama runners are now using a bunch of GPU.

What I'd like to see is a way for ollama to report what's actually running and consuming compute and energy, possibly even the prompt. I'm hacking something together to parse the output of `rocm-smi` (I assume `nvidia-smi` has similar output) to try to estimate the energy consumption of running LLMs, and possibly of each of my queries. It seems like Ollama would be better positioned to report on resource consumption.
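
In the spirit of that hack, here's a minimal sketch that integrates GPU power draw over time while a query runs (NVIDIA shown because `nvidia-smi --query-gpu=power.draw` has a clean machine-readable output; the `gpu_power_watts` helper is hypothetical, and an AMD variant would parse `rocm-smi --showpower` instead):

```python
import subprocess
import time

def gpu_power_watts() -> float:
    """Return instantaneous board power draw in watts, summed over GPUs."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    # nvidia-smi prints one power reading per line, one line per GPU.
    return sum(float(line) for line in out.splitlines() if line.strip())

# Riemann-sum the power samples to estimate energy in joules while a
# prompt is being processed (here: a fixed 30-second sampling window).
interval = 0.5  # seconds between samples
joules = 0.0
deadline = time.time() + 30
while time.time() < deadline:
    joules += gpu_power_watts() * interval
    time.sleep(interval)

print(f"~{joules:.0f} J (~{joules / 3600:.3f} Wh)")
```

The obvious limitation is that this measures the whole GPU, so it can't attribute energy to a particular model or request, which is exactly why Ollama itself would be better positioned to report it.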

Thoughts?

GiteaMirror added the feature request label 2026-04-29 04:35:28 -05:00

@rick-github commented on GitHub (Jun 1, 2025):

#3144


@ckuethe commented on GitHub (Jun 1, 2025):

thanks, I'll follow that issue.

