[PR #7780] [CLOSED] Disable Polling Transport When WebSockets Are Enabled and Implement Cleanup Locking Mechanism #22106

Closed
opened 2026-04-20 03:55:24 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/7780
Author: @jk-f5
Created: 12/11/2024
Status: Closed

Base: dev ← Head: disablepolling


📝 Commits (1)

  • 5a3b650 feat: Make ENABLE_WEBSOCKET_SUPPORT disable polling entirely to allow multiple replicas without sticky sessions.

📊 Changes

4 files changed (+83 additions, -33 deletions)

📝 backend/open_webui/apps/socket/main.py (+52 -31)
📝 backend/open_webui/apps/socket/utils.py (+24 -0)
📝 backend/open_webui/main.py (+2 -0)
📝 src/routes/+layout.svelte (+5 -2)

📄 Description

Overview

This pull request introduces two changes that make WebSocket usage reliable in distributed, multi-replica deployments:

  1. Disable the "polling" transport when WebSocket support is enabled to reduce unnecessary overhead and prevent connectivity issues in multi-replica deployments without sticky sessions.
  2. Implement a Redis-based locking mechanism to synchronize the periodic_usage_pool_cleanup task across multiple instances in a distributed deployment.

Changes

1. Disable Polling Transport When WebSockets Are Enabled

Rationale:

By default, Socket.IO enables both the polling and websocket transports, and a client typically connects over HTTP long-polling first and then upgrades to WebSocket. In environments with multiple application instances (replicas) behind a load balancer without sticky sessions, the polling transport can lead to connectivity issues:

  • 400 Errors Without Sticky Sessions:

    • The polling transport relies on HTTP long-polling requests that must consistently reach the same server instance. Without sticky sessions, requests from the same client may be routed to different instances, causing the server to not recognize the session and respond with HTTP 400 errors.
    • This happens because the initial HTTP request that establishes the Socket.IO session and the subsequent polling requests are handled by different instances: the instance that receives a polling request has no record of the session ID issued during the handshake, so it rejects the request.
    • As a result, clients experience failed connections and errors in the application.
  • Solution:

    • By disabling the polling transport and using only websocket, which operates over a persistent TCP connection, we avoid the need for sticky sessions.
    • WebSocket connections are maintained over the same TCP connection, ensuring that all communication is directed to the same server instance.
    • Most modern load balancers support WebSocket connections and can route them correctly without sticky sessions.
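
The routing problem and its fix can be illustrated with a small simulation. This is a minimal sketch, not code from the PR: `Instance`, `RoundRobinBalancer`, `polling_client`, and `websocket_client` are hypothetical names, and each replica is assumed to keep its session table in local memory only.

```python
import itertools
import uuid


class Instance:
    """One application replica with its own in-memory session table (assumption)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: set[str] = set()

    def handshake(self) -> str:
        # Create a session that only this replica knows about.
        sid = uuid.uuid4().hex
        self.sessions.add(sid)
        return sid

    def handle(self, sid: str) -> int:
        # A request carrying an unknown session ID is rejected with HTTP 400.
        return 200 if sid in self.sessions else 400


class RoundRobinBalancer:
    """Load balancer without sticky sessions: each request goes to the next replica."""

    def __init__(self, instances: list[Instance]) -> None:
        self._cycle = itertools.cycle(instances)

    def route(self) -> Instance:
        return next(self._cycle)


def polling_client(balancer: RoundRobinBalancer, polls: int) -> list[int]:
    # Long-polling: the handshake and every poll are separate HTTP requests,
    # so each one is routed independently.
    sid = balancer.route().handshake()
    return [balancer.route().handle(sid) for _ in range(polls)]


def websocket_client(balancer: RoundRobinBalancer, messages: int) -> list[int]:
    # WebSocket: one routing decision when the connection is opened; every
    # later frame travels over that same connection to the same replica.
    instance = balancer.route()
    sid = instance.handshake()
    return [instance.handle(sid) for _ in range(messages)]
```

With two replicas, `polling_client(balancer, 3)` returns `[400, 200, 400]`, because only every other poll reaches the replica that created the session, while `websocket_client(balancer, 3)` returns `[200, 200, 200]`.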

Benefits:

  • Improved Stability: Eliminates 400 errors caused by misrouted polling requests in non-sticky environments.
  • Better Performance: Skips the initial long-polling phase and the subsequent upgrade to WebSocket, removing the extra HTTP requests they require.
  • Simplified Deployment: Removes the requirement for sticky sessions or additional load balancer configuration, making it easier to deploy in scalable environments.
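
On the server side, the change in this section comes down to restricting the list of allowed transports based on the ENABLE_WEBSOCKET_SUPPORT flag. The sketch below shows one way to derive that list. The helper names, the `"false"` default, and the polling-only fallback when the flag is off are assumptions for illustration, not details confirmed by the PR.

```python
import os
from typing import Mapping


def websocket_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Read the ENABLE_WEBSOCKET_SUPPORT flag (the "false" default is an assumption)."""
    return env.get("ENABLE_WEBSOCKET_SUPPORT", "false").strip().lower() == "true"


def allowed_transports(enable_websocket: bool) -> list[str]:
    """Return the transport list to pass to the Socket.IO server."""
    if enable_websocket:
        # WebSocket only: no long-polling requests, so no sticky sessions needed.
        return ["websocket"]
    # Assumed fallback when WebSocket support is off: long-polling only.
    return ["polling"]


# Hypothetical wiring, assuming a python-socketio / python-engineio version
# that accepts the Engine.IO `transports` option:
#
#   import socketio
#   sio = socketio.AsyncServer(
#       async_mode="asgi",
#       transports=allowed_transports(websocket_enabled()),
#   )
```

The client has to agree with the server. In socket.io-client this is the `transports` connection option, for example `transports: ['websocket']`. If the client still tries polling first, its handshake will be refused by a WebSocket-only server.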

2. Implement Redis-Based Locking for periodic_usage_pool_cleanup

Rationale:

In a distributed deployment with multiple instances of the application, running the periodic_usage_pool_cleanup task concurrently can lead to race conditions and inconsistent state in the USAGE_POOL. A Redis-based lock addresses this:

  • Ensures Single Instance Execution:

    • Only one instance acquires the lock and performs the cleanup at any given time.
    • Prevents conflicts and potential data corruption due to simultaneous access.
  • Compatibility with Distributed Systems:

    • Redis serves as a centralized coordination point accessible by all instances.
    • The lock mechanism leverages Redis's atomic operations to manage concurrency.
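
A lock of this kind is commonly built on Redis's atomic `SET key value NX EX ttl` command. The following is a minimal sketch of that pattern, not the PR's implementation: `CleanupLock`, `run_cleanup_once`, and the key name are hypothetical, and `InMemoryRedis` is a single-process stand-in so the example runs without a Redis server. A real deployment would pass a redis-py client created with `decode_responses=True`, whose `set(name, value, nx=..., ex=...)`, `get`, and `delete` methods take the same arguments.

```python
import time
import uuid


class InMemoryRedis:
    """Stand-in for the small subset of Redis used below (single process only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expiry timestamp)

    def _purge(self, key):
        # Drop the key if its time-to-live has elapsed.
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]

    def set(self, key, value, nx=False, ex=None):
        self._purge(key)
        if nx and key in self._data:
            return None  # like redis-py: None when NX prevents the write
        expiry = self._clock() + ex if ex is not None else float("inf")
        self._data[key] = (value, expiry)
        return True

    def get(self, key):
        self._purge(key)
        entry = self._data.get(key)
        return entry[0] if entry else None

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


class CleanupLock:
    """Hypothetical lock guarding a periodic task across replicas."""

    def __init__(self, client, name, ttl_secs):
        self.client = client
        self.name = name
        self.ttl_secs = ttl_secs
        self.token = uuid.uuid4().hex  # identifies this replica as the holder

    def acquire(self) -> bool:
        # Atomic "create only if absent", with an expiry so that a crashed
        # holder cannot block the other replicas forever.
        return bool(self.client.set(self.name, self.token, nx=True, ex=self.ttl_secs))

    def release(self) -> None:
        # Only the holder removes the key. Against real Redis this
        # check-then-delete should run as one Lua script to stay atomic;
        # it is simplified here.
        if self.client.get(self.name) == self.token:
            self.client.delete(self.name)


def run_cleanup_once(lock, cleanup) -> bool:
    """Run `cleanup` only if this replica obtains the lock; report whether it ran."""
    if not lock.acquire():
        return False
    try:
        cleanup()
        return True
    finally:
        lock.release()
```

The expiry should exceed the longest expected cleanup run. Otherwise the key can lapse mid-task and a second replica can start cleaning up concurrently. A long-running holder can avoid that by renewing the key periodically.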

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-20 03:55:24 -05:00
Reference: github-starred/open-webui#22106