[Question] Open-Webui, how is it releasing memory? #415

Closed
opened 2025-11-11 14:20:42 -06:00 by GiteaMirror · 0 comments
Owner

Originally created by @tilllt on GitHub (Mar 5, 2024).

Hey there,

I have several AI services running (Whisper-API, Auto1111/Stable Diffusion, TTS-Web-GUI), which all compete for the 12 GB of VRAM on our GPU. Open-Webui / Ollama are the most frequently used, and it happens quite regularly that, even when no one is actively using Ollama, there is no VRAM left for the other services.

It seems that Open-Webui's detection of whether a session is active (keep the model loaded) or idle (free the VRAM) is not working reliably. How is this handled at the moment?
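For reference, Ollama's generate/chat API accepts a `keep_alive` field that controls how long a model stays resident in VRAM after a request (a duration string such as `"5m"`, or `0` to unload immediately); whether and how Open-Webui sets it is exactly what I'm asking about. A minimal sketch of a request body that would ask Ollama to release the model right away (model name and prompt are placeholders):

```python
import json

# Hypothetical request body for Ollama's /api/generate endpoint.
# "keep_alive" controls how long the model stays loaded after the
# response: a duration like "5m", or 0 to unload immediately.
payload = {
    "model": "llama2",       # placeholder model name
    "prompt": "Hello",
    "keep_alive": 0,         # free VRAM as soon as the response is done
}
body = json.dumps(payload)
print(body)
```

If Open-Webui passed something like this through (or honored the `OLLAMA_KEEP_ALIVE` environment variable on the Ollama side), idle sessions presumably wouldn't pin VRAM.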

Cheers


Reference: github-starred/open-webui#415