[GH-ISSUE #8814] Show in UI GPU status (models loaded, VRAM available) #53941

Closed
opened 2026-05-05 15:35:59 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @JusefPol on GitHub (Jan 23, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8814

Feature Request

Hi guys, I have checked the existing issues and haven't found this feature request:

One thing I have noticed is that when I use WebUI and switch between models, I end up connecting to my Ollama server to check whether the previous model has already been dropped, or whether I have enough VRAM available to run another one in parallel. I know I can configure the model to unload immediately, but letting it linger for a while gives me time to write prompts without having to reload the model.

Still, it would be nice to know directly from the UI whether I have enough VRAM available to run another model. A couple of friends also access my UI sometimes, and they have no way of knowing which model is loaded or whether there is capacity on the GPUs to run the one they want. With the outputs of nvidia-smi and ollama ps you can pretty much get that information, but I have no idea whether that is possible from the UI. (Since the UI only talks to the API, I guess the API would have to support it first, but I thought I would drop the idea here in case it gains traction.)
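For context, a minimal sketch of how a backend could already gather this information, assuming a reachable Ollama server and `nvidia-smi` on the PATH: Ollama's `GET /api/ps` endpoint reports the models currently kept in memory (including their VRAM footprint), and `nvidia-smi` reports per-GPU memory usage. The URL and helper names below are hypothetical, not part of Open WebUI.

```python
# Sketch only: collect loaded models (via Ollama) and GPU memory (via nvidia-smi).
import json
import subprocess
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # hypothetical default; adjust to your setup


def loaded_models() -> list[dict]:
    """Query Ollama's /api/ps endpoint for models currently loaded in memory."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/ps") as resp:
        data = json.load(resp)
    # Each entry includes the model name and how many bytes of it sit in VRAM.
    return [
        {"name": m["name"], "size_vram": m.get("size_vram", 0)}
        for m in data.get("models", [])
    ]


def gpu_memory() -> list[dict]:
    """Ask nvidia-smi for used/total memory per GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    gpus = []
    for line in out.strip().splitlines():
        used, total = (int(x) for x in line.split(","))
        gpus.append({"used_mib": used, "total_mib": total})
    return gpus


if __name__ == "__main__":
    print("Loaded models:", loaded_models())
    print("GPU memory:", gpu_memory())
```

Exposing something like this through the Open WebUI API, and then surfacing it in the model selector, would cover the use case described above.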

Thanks.


@panda44312 commented on GitHub (Jan 23, 2025):

#8176

Reference: github-starred/open-webui#53941