[GH-ISSUE #14945] issue: Performance degradation as active users increase due to "Active Users" count & user-list emitters
Originally created by @taylorwilsdon on GitHub (Jun 12, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14945
Originally assigned to: @tjbck on GitHub.
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.6.14
Ollama Version (if applicable)
No response
Operating System
Ubuntu 22
Browser (if applicable)
Chrome
Confirmation
Expected Behavior
Load should scale in a roughly linear manner with the associated infrastructure when operating in a distributed / multi-node environment (i.e. if you double your users, doubling your hardware should deliver similar performance).
Actual Behavior
When a certain volume of users is present and websockets + redis are in use, chat streaming performance degrades substantially, to the point that chat completions never finish no matter how much infra you throw at it. This appears in large part to be because the number of server-sent websocket events delivered to each client skyrockets under heavy user activity: every join and leave fires a user-list and a usage event. Under low load it streams nicely, but as the event frequency increases, even an M4 Max with plenty of headroom starts to struggle and eventually becomes overwhelmed.
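To put rough, purely illustrative numbers on that fan-out (assumed figures, not measurements from this report): each join or leave triggers a broadcast to every connected client, so message volume grows with the product of churn rate and connection count.

```python
# Back-of-envelope fan-out estimate; all numbers are illustrative assumptions,
# not measurements from this report.
users = 1300               # concurrently connected clients
churn_per_sec = 5          # assumed joins + leaves per second, instance-wide
events_per_change = 2      # each change fires a 'user-list' and a 'usage' event
msgs_per_sec = churn_per_sec * events_per_change * users
print(f"{msgs_per_sec:,} websocket messages/sec")  # 13,000/sec, ~46.8M/hour
```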
https://github.com/user-attachments/assets/e7ab5146-5f88-4579-9e8b-f201c4ca5a85
We commented out these bits in websocket.ts, and these in socket/main.py; a paraphrased sketch of the broadcast pattern in question follows:
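For context, this is a hedged paraphrase of the pattern being commented out, not a verbatim excerpt of open_webui/socket/main.py; names and structure are approximate.

```python
# Paraphrased sketch of the broadcasts at issue (approximate, not verbatim):
# every single join/leave re-broadcasts the full user pool to ALL clients.
import socketio

sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*")
USER_POOL: dict[str, list[str]] = {}  # user_id -> list of session ids

@sio.on("user-join")
async def user_join(sid, data):
    USER_POOL.setdefault(data["user_id"], []).append(sid)
    # The expensive part: no `to=`/`room=` argument, so this emit fans out
    # to every connected client on every join.
    await sio.emit("user-list", {"user_ids": list(USER_POOL.keys())})

@sio.on("disconnect")
async def disconnect(sid):
    for user_id, sids in list(USER_POOL.items()):
        if sid in sids:
            sids.remove(sid)
            if not sids:
                del USER_POOL[user_id]
    # ...and so does every leave.
    await sio.emit("user-list", {"user_ids": list(USER_POOL.keys())})
```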
Making that change resulted in an immediate and dramatic reduction in load on the backend hosts, and better performance on the clients:
Steps to Reproduce
Have 1300 people use an open-webui instance
Logs & Screenshots
https://github.com/user-attachments/assets/e7ab5146-5f88-4579-9e8b-f201c4ca5a85
Additional Information
My proposed solution (will have an associated PR, @tjbck) is a new environment variable, ENABLE_USER_POOL_EVENTS, which, when disabled, simply skips the event emits at the cost of losing the "Active Users" count in the settings menu.
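A minimal sketch of what such a gate could look like (hypothetical; the variable handling and helper names here are illustrative, not taken from the actual PR):

```python
# Hypothetical ENABLE_USER_POOL_EVENTS gate; helper names are illustrative,
# not taken from the actual PR.
import os

ENABLE_USER_POOL_EVENTS = (
    os.environ.get("ENABLE_USER_POOL_EVENTS", "true").lower() == "true"
)

async def emit_user_pool_events(sio, user_ids, models_in_use):
    # When disabled, skip the broadcasts entirely; the "Active Users"
    # count in the settings menu simply stops updating.
    if not ENABLE_USER_POOL_EVENTS:
        return
    await sio.emit("user-list", {"user_ids": user_ids})
    await sio.emit("usage", {"models": models_in_use})
```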
Also probably worth making the console.log statements, e.g. console.log('user-list', data);, not exist in general (even if you do want the counts streamed / enabled), because heavy console.log usage will degrade client performance at high volumes even in ideal circumstances.
@taylorwilsdon commented on GitHub (Jun 12, 2025):
relates to https://github.com/open-webui/open-webui/issues/13026
@Classic298 commented on GitHub (Jun 13, 2025):
So disabling the user count is leading to such drastic performance improvements already? Interesting
@rgaricano commented on GitHub (Jun 13, 2025):
I was thinking the same, but it's impossible that such memory use was due only to those functions.
We are speaking of more than 50%, and in GB ranges!
Unless it is due to a "blocking" effect that prevents memory from being released (which I don't see).
@rgaricano commented on GitHub (Jun 13, 2025):
a question,
why those awaits?
63256136ef/backend/open_webui/socket/main.py (L174)
63256136ef/backend/open_webui/socket/main.py (L195)
Would it not be enough to set send_usage = True (and leave the update to the pool loop), as in the sketch below?
And/or release memory after the emits for both user-list & usage (call release_func() after those awaits)?
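A rough illustration of the flag-and-coalesce idea being suggested (hypothetical; send_usage and the loop below are illustrative, not existing open-webui code):

```python
# Hypothetical flag-and-coalesce sketch: joins/leaves only set a flag, and a
# single periodic loop performs at most one broadcast per interval.
import asyncio

send_usage = False  # flipped to True wherever an emit used to happen

def get_models_in_use():
    # placeholder for the real lookup
    return []

async def usage_pool_loop(sio, interval: float = 5.0):
    global send_usage
    while True:
        await asyncio.sleep(interval)
        if send_usage:
            send_usage = False
            # one coalesced broadcast instead of one per join/leave
            await sio.emit("usage", {"models": get_models_in_use()})
```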
@Ithanil commented on GitHub (Jun 13, 2025):
Just a question, is this somehow a regression with 0.6.14? Because on 0.6.13 we didn't have issues serving 200+ active users concurrently, despite the user list logging to console.
@Classic298 commented on GitHub (Jun 13, 2025):
@Ithanil no, not new. 0.6.14 works fine for me. This seems to be an existing issue, related to the user count on very large instances
@Ithanil commented on GitHub (Jun 13, 2025):
But he says 1000 users in a day; we had like 3000 new users within 2 hours after an official announcement mail.
Not saying this isn't an issue, but I don't see any reason whatsoever for the full user list being logged to the console.
Also the console.log of every streaming chunk seems unnecessary, right?
@Ithanil commented on GitHub (Jun 13, 2025):
FYI https://github.com/jhubbardsf/vite-plugin-svelte-console-remover
also: https://github.com/jhubbardsf/vite-plugin-svelte-console-remover/issues/1#issue-1240211980
@Classic298 commented on GitHub (Jun 13, 2025):
Hm. Yeah, the console log for every chunk is only in debug mode, I think.
@Ithanil commented on GitHub (Jun 13, 2025):
https://github.com/open-webui/open-webui/pull/14958
99% sure it won't be merged, but maybe it's interesting for you.
@taylorwilsdon commented on GitHub (Jun 13, 2025):
This is 1200+ concurrent active users. It is not a visible problem under 1k or so. 200 is a breeze; I've never seen any issues at that level, and you can run one node no problem. At 5k or so users, 16 nodes.
@Ithanil commented on GitHub (Jun 13, 2025):
OK, I see. Thanks for the clarification!
We have a fully HA multi-node setup, and so far I was happy to see all components chilling, except for the GPU servers.
@taylorwilsdon commented on GitHub (Jun 13, 2025):
Yeah, same setup here. Overall it scales linearly well, but this is definitely a bottleneck you will eventually hit, because you have thousands of clients getting thousands of events every second (any time the user count or usage list changes, which is constantly happening at scale). Commenting out the backend Python there was indeed the sole change behind the drop in the AWS resource usage graph; it reduced egress calls from the cluster by millions of requests.
The open websocket waiting for chat completion streaming chunks competes with these nonstop updates and ends up falling behind on the actual text response, so it's very visible on the client side even if you scale your backend infra to infinity. I got something like 3000 user-info console logs while waiting for a 300 token response. Will have a PR today to fix.
@Classic298 commented on GitHub (Jun 13, 2025):
Good catch. Will love to see the pr
@tjbck commented on GitHub (Jun 16, 2025):
423a35782b may have already addressed the issue, will also follow up with additional PRs to disable the feature entirely!
@Ithanil commented on GitHub (Jun 16, 2025):
Losing the feature entirely is sad news ☹️
@tjbck commented on GitHub (Jun 16, 2025):
@Ithanil it should still work as intended? let me know if that's not the case for your deployment.
@Ithanil commented on GitHub (Jun 16, 2025):
Oh, I understood "will also follow up with additional PRs to disable the feature entirely!" as if you were going to remove the user count display. Sorry, apparently a misunderstanding.
@tjbck commented on GitHub (Jun 16, 2025):
to provide an option to disable* 😅
@Classic298 commented on GitHub (Jun 16, 2025):
an option to disable is good. Save every % performance possible haha
@Yash-Patidar commented on GitHub (Jun 16, 2025):
Is it possible to support both SSE and WebSocket, with the option to switch between them using an environment variable? Just curious — I know this might require significant changes, but it could be a good approach for scalability and better resource optimization. For example, even ChatGPT’s website uses SSE for handling streaming responses efficiently.
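For illustration, a minimal SSE streaming endpoint of the kind being suggested might look like this (hypothetical sketch using FastAPI; the route and token generator are made up, not open-webui code):

```python
# Hypothetical SSE streaming endpoint; route and generator are illustrative.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def token_stream():
    # Stand-in for real completion chunks; SSE frames are "data: <payload>\n\n".
    for token in ["Hello", " ", "world"]:
        yield f"data: {token}\n\n"

@app.get("/api/chat/stream")
async def chat_stream():
    return StreamingResponse(token_stream(), media_type="text/event-stream")
```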

@taylorwilsdon commented on GitHub (Jun 16, 2025):
Amazing, appreciate it! This is a much more sensible solution than my bool on/off haha