Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-06 19:08:59 -05:00)
[GH-ISSUE #11509] Isn't anyone optimizing the overall running speed? #54922
Originally created by @shentong0722 on GitHub (Mar 10, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11509
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.5.20
Ollama Version (if applicable)
No response
Operating System
Ubuntu 24.04
Browser (if applicable)
No response
Confirmation
Expected Behavior
Can we optimize the overall architecture? It feels really bloated.
Actual Behavior
The initial screen response speed is really very slow, and the overall chat experience is also very slow.
Steps to Reproduce
I feel that I have maximized everything, with a 4-core 8G server. Through testing, I found that the network response speed is also extremely fast. So, the problem should be with the project itself, right?
Logs & Screenshots
Additional Information
No response
@frenzybiscuit commented on GitHub (Mar 10, 2025):
If it's just the initial page load that's slow, it could be the LLM backend taking a while to reply. Open-WebUI will not load if the LLM backend (at least OpenAI) goes down; by default it waits until the backend is back online and responding before loading pages.
Also, what is your resource usage like? You've given specs like 4 cores / 8 GB RAM, but without knowing the actual utilization we can't really say whether this is an Open-WebUI problem or a problem with your hardware.
@frenzybiscuit commented on GitHub (Mar 10, 2025):
And you're using PostgreSQL, correct?
@mark-kazakov commented on GitHub (Mar 10, 2025):
@shentong0722 It could be because you have a lot of models to load.
When the interface is being launched, it has to fetch all the models.
have a look at this issue: #11228
@frenzybiscuit commented on GitHub (Mar 10, 2025):
Yes, if the backend takes a while to reply, the page will take a while to load.
You can bypass this by setting the environment variable AIOHTTP_CLIENT_TIMEOUT_OPENAI_MODEL_LIST=5 when launching Open-WebUI.
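For reference, a minimal sketch of launching the container with that timeout applied, assuming a standard Docker deployment (the port mapping and volume name are illustrative, not taken from this thread):

```shell
# Cap the OpenAI model-list fetch at 5 seconds, so an unreachable
# backend no longer blocks the initial page load indefinitely.
docker run -d \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e AIOHTTP_CLIENT_TIMEOUT_OPENAI_MODEL_LIST=5 \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

After the timeout expires, the UI loads without waiting on the unresponsive backend; its models simply won't appear in the model list until it recovers.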
@mark-kazakov commented on GitHub (Mar 10, 2025):
I have a tiny server for my instance, and over roughly the last three versions (v0.5.18, v0.5.19, and v0.5.20) I've noticed a drop in performance when initially loading the front-end.
So it would be interesting to know whether there is actually a regression, or an added feature that causes this.
@frenzybiscuit commented on GitHub (Mar 10, 2025):
Well, let's try to isolate the problem. Are you using PostgreSQL or SQLite? That's where I'd start with performance issues. IMO, PostgreSQL is needed for basically every instance with more than one active user at a time.
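Switching from the bundled SQLite file to PostgreSQL is done through the `DATABASE_URL` environment variable; a minimal sketch, where the hostname, credentials, and database name are placeholders:

```shell
# Point Open WebUI at an external PostgreSQL database instead of the
# default SQLite file in the data volume. The connection string below
# uses placeholder credentials and host — substitute your own.
docker run -d \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e DATABASE_URL="postgresql://openwebui:secret@db.example.internal:5432/openwebui" \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

Note that Open WebUI does not migrate existing SQLite data automatically when you switch, so an instance with existing users should plan the move separately.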
@abiari commented on GitHub (Mar 11, 2025):
@frenzybiscuit I'm using PostgreSQL with about 1000 users on the app. I have a simple deployment of 10 replicas scaled through Docker Compose, served behind a reverse proxy. Having a separate Postgres database for each container didn't make sense: even though I have a form of session persistence, if a container goes down and the load balancer switches the user to another container, how would I keep the databases in sync and guarantee the user sees their data no matter which container they land on?
@Classic298 commented on GitHub (Mar 11, 2025):
You can't just split databases. If you do, you get data inconsistencies.
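The arrangement described above — several app replicas behind a load balancer all sharing one database, so there is nothing to keep in sync — can be sketched as a Compose file. Service names, credentials, and the replica count are illustrative:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: openwebui
      POSTGRES_PASSWORD: secret   # placeholder
      POSTGRES_DB: openwebui
    volumes:
      - pgdata:/var/lib/postgresql/data

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # Every replica points at the SAME database, so a user who lands
      # on a different container after a failover still sees the same
      # data. No cross-database sync is needed.
      DATABASE_URL: postgresql://openwebui:secret@db:5432/openwebui
    depends_on:
      - db
    deploy:
      replicas: 10   # matches the scale mentioned above
    # No published ports here: the reverse proxy in front load-balances
    # across the replicas on the internal network.

volumes:
  pgdata:
```

With a single shared Postgres, consistency comes from the database itself rather than from any attempt to replicate state between per-container stores.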