Mirror of https://github.com/open-webui/open-webui.git (synced 2026-03-10 07:43:10 -05:00)
Issue #5940: streaming events are slow when the user has multiple Open WebUI tabs open in the browser
Originally created by @frdeng on GitHub (Aug 1, 2025).
Originally assigned to: @tjbck on GitHub.
Check Existing Issues
Installation Method
k8s/pg/redis
Open WebUI Version
0.6.18
Ollama Version (if applicable)
No response
Operating System
k8s with multiple pods
Browser (if applicable)
chrome
Confirmation
Expected Behavior
When the user has multiple Open WebUI tabs open, the backend stores sessions for all of these clients. When the backend starts streaming data, it tries to emit event data to every one of those sessions:
b8da4a8cd8/backend/open_webui/socket/main.py (L636-L645)
I don't see the point of including the user's other sessions here; it becomes extremely slow if the user has many Open WebUI tabs open in the browser.
If it's not necessary, I would go ahead and create a PR to fix this.
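For context, the fan-out pattern described above can be simulated in a few lines. This is a minimal sketch, not the actual open-webui code; `USER_POOL`, `emit`, and `broadcast_chunk` here are illustrative stand-ins:

```python
import asyncio

# Hypothetical in-memory pools; the real deployment backs these with Redis.
USER_POOL: dict[str, list[str]] = {}   # user_id -> list of session ids
emitted: list[tuple[str, str]] = []    # (session_id, data), for illustration

async def emit(session_id: str, data: str) -> None:
    # Stand-in for sio.emit(...); each call is real work (and possibly a
    # network round trip) in practice.
    emitted.append((session_id, data))

async def broadcast_chunk(user_id: str, chunk: str) -> None:
    # The pattern the issue describes: every stream chunk is sent to *every*
    # session the user has, not just the one that started the chat.
    for session_id in USER_POOL.get(user_id, []):
        await emit(session_id, chunk)

async def main() -> None:
    USER_POOL["alice"] = [f"tab-{i}" for i in range(5)]  # five open tabs
    for chunk in ["Hel", "lo"]:
        await broadcast_chunk("alice", chunk)
    # 2 chunks * 5 sessions = 10 emits: cost grows with the number of tabs.
    print(len(emitted))

asyncio.run(main())
```

The emit count scales with (chunks × open tabs), which is why streaming slows down as more tabs are opened.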
Actual Behavior
The backend broadcasts streaming data to all of the user's sessions instead of only the current session.
Steps to Reproduce
1. Open WebUI deployed in k8s: multiple pods with Redis.
2. Open multiple Open WebUI tabs and send a request; the response is much slower than with only one tab.
Logs & Screenshots
n/a
Additional Information
No response
@rgaricano commented on GitHub (Aug 2, 2025):
If not, how can I use the same user to work with different chats or client instances (e.g. browser, code helper, desktop helper, ...)?
Maybe it's enough to add a warning that opening many tabs can slow down response time...
@frdeng commented on GitHub (Aug 2, 2025):
OK, it would make sense if the same user has multiple clients (browser, or other apps) interacting with the same chat.
But if the other clients are doing other things (other chats, etc.), then it makes no sense to send the data to them.
Is this correct? Maybe I missed some use cases.
@frdeng commented on GitHub (Aug 2, 2025):
Another issue is that USER_POOL.get(user_id, []) is called for every single stream data chunk. With Redis, each call makes a request, which adds a little latency, but it's not too bad compared to the multiple concurrent sio.emit() calls.
@rgaricano commented on GitHub (Aug 2, 2025):
Yes, I think so; sync should be requested by the client or by the server only when it's needed.
Maybe a kind of sync-requirement pool? Some clients may need to be synced, others maybe not, and only when they request it. It would also need an emitter manager to route events to each synced client, or to enqueue events while waiting for a request...
If you think it's feasible, give it a try.
Although I would wait for other opinions from people with more knowledge.
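The per-chunk USER_POOL.get call mentioned above could be reduced to one lookup per stream by resolving the session list once up front. A hedged sketch under that assumption, with `lookup_sessions` as a hypothetical stand-in for the Redis-backed pool lookup:

```python
# Sketch: resolve the user's sessions once per stream instead of once per chunk.
# lookup_sessions stands in for USER_POOL.get(user_id, []) backed by Redis,
# where every call costs a network round trip.

lookups = 0

def lookup_sessions(user_id: str) -> list[str]:
    global lookups
    lookups += 1  # count simulated Redis round trips
    return {"alice": ["tab-0", "tab-1"]}.get(user_id, [])

def stream_naive(user_id: str, chunks: list[str]) -> None:
    for chunk in chunks:
        for sid in lookup_sessions(user_id):  # one lookup *per chunk*
            pass  # emit(sid, chunk)

def stream_cached(user_id: str, chunks: list[str]) -> None:
    sessions = lookup_sessions(user_id)       # one lookup per *stream*
    for chunk in chunks:
        for sid in sessions:
            pass  # emit(sid, chunk)

chunks = ["a"] * 100
lookups = 0
stream_naive("alice", chunks)
naive_lookups = lookups

lookups = 0
stream_cached("alice", chunks)
cached_lookups = lookups

print(naive_lookups, cached_lookups)  # 100 lookups vs. 1
```

The trade-off: with the cached variant, a tab opened mid-stream would not receive the remaining chunks of that stream.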
@frdeng commented on GitHub (Aug 2, 2025):
An idea: we could probably make use of Socket.IO namespaces and rooms, similar to how we handle channels, but for chats.
Basically the backend would create a room for each chat; if multiple sessions are in the same chat (room), it would emit the event to the room members.
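The room idea can be sketched with a plain dict standing in for Socket.IO's room bookkeeping. This is an illustrative simulation, not open-webui code; in python-socketio the equivalents would be `sio.enter_room(sid, room)` and `sio.emit(event, data, room=room)`:

```python
# Sketch of the room idea: group sessions by the chat they are viewing,
# then emit each stream chunk only to that chat's room members.

rooms: dict[str, set[str]] = {}        # room name -> member session ids
delivered: list[tuple[str, str]] = []  # (session_id, data)

def enter_room(sid: str, room: str) -> None:
    rooms.setdefault(room, set()).add(sid)

def emit_to_room(room: str, data: str) -> None:
    for sid in rooms.get(room, ()):
        delivered.append((sid, data))

# Two tabs viewing chat 42 and one tab viewing chat 99 -- same user.
enter_room("tab-0", "chat:42")
enter_room("tab-1", "chat:42")
enter_room("tab-2", "chat:99")

emit_to_room("chat:42", "token")  # only the chat-42 viewers get the chunk
print(sorted(sid for sid, _ in delivered))
```

This keeps the "multiple clients on the same chat" use case working while sparing unrelated tabs the streaming traffic.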
@sihyeonn commented on GitHub (Aug 16, 2025):
Hi @frdeng! I was wondering if this commit (1a93891d97) actually helps address this issue. It seems that simply setting CHAT_RESPONSE_STREAM_DELTA_CHUNK_SIZE to 2 or 3 might be more effective. Could you please clarify the intention behind this change, or share your thoughts? Thank you!
@frdeng commented on GitHub (Aug 16, 2025):
This is a new feature, right? It would definitely help, I think.
I tend to agree that the stream events should be sent to all user sessions. Maybe we could extend this feature so that, for the non-current sessions, we only send one final event with the stream data of all chunks combined?
Another related issue is the session pool cache: the user's session count doesn't necessarily match the actual browser tab count. I noticed that over time a user can end up with dozens of sessions due to ungraceful disconnects or other reasons (pod restarts, race conditions on cache updates between pods, etc.). We should have a stale-session detection and cleanup mechanism for the Redis session pool and user pool cache.
@sihyeonn commented on GitHub (Aug 17, 2025):
@frdeng Your points regarding stream event handling and session cleanup are very insightful. This PR (#16693 ) directly addresses the session management concerns you raised with a TTL-based cleanup mechanism, ensuring SESSION_POOL and USER_POOL stay synchronized. The simplification of disconnect logic and the new abstraction layer are also positive steps.
@tjbck @rgaricano Regarding the stream-event idea of sending a final combined event to non-current sessions, I think we could explore implementing this logic on top of the new TTL framework. For example, when a session expires, we could trigger a final aggregated event instead of streaming updates. Would it make sense to proceed with this? I’d be happy to take this on if you’re okay with it! 😊
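The TTL-based cleanup under discussion can be sketched abstractly. All names here are hypothetical and this is not the code from PR #16693; the idea is simply that sessions carry a last-seen timestamp and a sweep drops those not seen within the TTL, which covers ungraceful disconnects and pod restarts:

```python
# Sketch of TTL-based stale-session cleanup (hypothetical names).
# touch() records activity for a session; sweep() removes sessions whose
# last activity is older than SESSION_TTL, keeping both pools in sync.

SESSION_TTL = 3600.0  # seconds; illustrative value

last_seen: dict[str, float] = {}      # session_id -> last heartbeat time
user_pool: dict[str, list[str]] = {}  # user_id -> session ids

def touch(user_id: str, sid: str, now: float) -> None:
    last_seen[sid] = now
    if sid not in user_pool.setdefault(user_id, []):
        user_pool[user_id].append(sid)

def sweep(now: float) -> None:
    stale = {sid for sid, t in last_seen.items() if now - t > SESSION_TTL}
    for sid in stale:
        del last_seen[sid]
    for user_id, sids in user_pool.items():
        user_pool[user_id] = [s for s in sids if s not in stale]

touch("alice", "tab-0", now=0.0)
touch("alice", "tab-1", now=0.0)
touch("alice", "tab-1", now=5000.0)  # only tab-1 keeps heart-beating
sweep(now=5000.0)
print(user_pool["alice"])            # stale tab-0 has been dropped
```

With Redis, much of this could instead lean on native key expiry (EXPIRE on per-session keys), at the cost of a consistency check between the expiring keys and the user pool.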
@rgaricano commented on GitHub (Aug 18, 2025):
@Sihyeon
About TTL PR,
In my opinion, as a simple contributor who hasn't delved into the details of this topic:
This type of implementation requires great care and extensive testing and monitoring. These procedures add extra layers of processing that can create further unwanted bottlenecks, starting with the wide variety of configurations Open WebUI works with, both in the systems involved and in the usage and load it can support.
Implementing a timer for the session pool without other kinds of verification and limit checks doesn't seem like a good idea to me. It can result in orphaned processes, or in cutting off transactions that could have continued, for example in complex configurations that use a session for data exchange, streaming, information processing, monitoring, etc.
I'm not saying it's a bad idea; I just think that, in the current state, an isolated proposal isn't the most appropriate approach, and that any new addition in this area should follow a well-planned functionality and structure, with clear objectives and implementation steps. In any case, I would implement it in a lab version first.
@Ithanil commented on GitHub (Aug 23, 2025):
Yes, there is definitely a permanent upward drift in active sessions/users, and yes, for users with many active sessions, models with a high token/s rate ultimately kill the system and Redis. That is what led me to the PR that resulted in 1a93891d97, but we definitely need a solution for the root problem. I haven't delved into https://github.com/open-webui/open-webui/pull/16693 yet, but I very much hope it is the answer.