mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-08 04:16:03 -05:00
[GH-ISSUE #23751] perf: MCP tool server reconnects on every message causing 15-20s silent delay #58727
Originally created by @DSavaliya-gh on GitHub (Apr 15, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23751
Bug Description
When an MCP (Model Context Protocol) tool server is enabled in Open WebUI, every chat message—including follow-ups in the same conversation—triggers a full MCP connection lifecycle from scratch:
1. `MCPClient()` — a new instance is allocated.
2. `connect()` — TCP handshake + TLS + HTTP session + MCP `initialize()` protocol exchange (10 s timeout cap).
3. `list_tool_specs()` — a `list_tools()` round-trip to enumerate all tool definitions.

This entire sequence runs before the model API call is made, and no `event_emitter` status event is emitted during the wait. The user sees a blinking dot with zero feedback for 15–20 seconds, then the model starts thinking.

Steps to Reproduce
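The per-message cost can be illustrated with a small simulation. All names below (`FakeMCPClient`, `chat_turn_current`) are hypothetical stand-ins for the behavior described in this issue, not the actual Open WebUI client:

```python
# Hypothetical stand-in illustrating the per-message reconnect cost.
# Not the real Open WebUI MCPClient; the expensive steps are counted
# instead of performing real I/O.

class FakeMCPClient:
    """Counts expensive lifecycle steps instead of doing real I/O."""

    connects = 0        # class-wide tally of connect() calls
    tool_listings = 0   # class-wide tally of list_tool_specs() calls

    def connect(self):
        # Real client: TCP + TLS + HTTP session + MCP initialize()
        FakeMCPClient.connects += 1

    def list_tool_specs(self):
        # Real client: list_tools() round-trip
        FakeMCPClient.tool_listings += 1
        return ["tool_a", "tool_b"]


def chat_turn_current(message):
    """Current behavior: full lifecycle runs on every message."""
    client = FakeMCPClient()
    client.connect()
    client.list_tool_specs()
    return f"answer to {message!r}"


# Five follow-up messages in one conversation:
for i in range(5):
    chat_turn_current(f"message {i}")

print(FakeMCPClient.connects)       # 5 — one full connect per message
print(FakeMCPClient.tool_listings)  # 5 — tools re-enumerated every time
```

With connection reuse, the tallies would stay at 1 regardless of how many messages the conversation contains.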
Expected Behavior
Actual Behavior
Root Cause
In `backend/open_webui/utils/middleware.py`, inside `process_chat_payload`:

No `event_emitter` call is made before this block — the frontend has nothing to render.

Proposed Fix
Two complementary changes to `middleware.py`:

1. Status events (UX — visible immediately)

Emit `type: 'status'` events bracketing the connect call so the existing `StatusHistory` component shows activity:

2. Connection pool for bearer-auth servers (performance)

Cache the `MCPClient` on `app.state.mcp_client_pool`, keyed by `server_id`, when `auth_type == 'bearer'` (static credentials that do not change per-user). Per-user auth types (`session`, `oauth_2.1`) are explicitly excluded from pooling to preserve security. Stale pool entries are evicted automatically on connection errors.
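A minimal sketch of the two changes combined — status events bracketing the connect, and a pool keyed by `server_id` for bearer auth only. The names here (`get_mcp_client`, the module-level `mcp_client_pool`, the stub `MCPClient`) mirror the description above but are illustrative stand-ins, not the actual `middleware.py` code:

```python
import asyncio

# Illustrative sketch of the proposed pooling + status-event logic.
# MCPClient is a stub; the real pool would live on app.state.mcp_client_pool.

class MCPClient:
    def __init__(self, server_id):
        self.server_id = server_id
        self.connected = False

    async def connect(self):
        self.connected = True  # real client: TCP/TLS/initialize here


mcp_client_pool = {}  # stand-in for app.state.mcp_client_pool


async def get_mcp_client(server_id, auth_type, event_emitter):
    # 1. Status event so the frontend shows activity during the wait.
    await event_emitter({"type": "status",
                         "data": {"description": f"Connecting to {server_id}..."}})

    # 2. Reuse a pooled client only for static bearer credentials;
    #    per-user auth (session, oauth_2.1) always gets a fresh connection.
    if auth_type == "bearer" and server_id in mcp_client_pool:
        client = mcp_client_pool[server_id]
    else:
        client = MCPClient(server_id)
        try:
            await client.connect()
        except Exception:
            mcp_client_pool.pop(server_id, None)  # evict stale entry
            raise
        if auth_type == "bearer":
            mcp_client_pool[server_id] = client

    await event_emitter({"type": "status",
                         "data": {"description": "Connected", "done": True}})
    return client


async def demo():
    events = []

    async def emitter(ev):
        events.append(ev["data"]["description"])

    a = await get_mcp_client("srv1", "bearer", emitter)
    b = await get_mcp_client("srv1", "bearer", emitter)      # pooled: same object
    c = await get_mcp_client("srv1", "oauth_2.1", emitter)   # never pooled
    print(a is b, a is c)  # True False

asyncio.run(demo())
```

The second bearer call returns the pooled instance without reconnecting, while the `oauth_2.1` call always constructs a fresh client, matching the security constraint described above.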
Security Consideration
Only static bearer credentials are pooled. Per-user OAuth and session tokens are never cached — those paths always get a fresh connection.
Linked Fix
A ready-to-review PR is available at: https://github.com/DSavaliya-gh/open-webui/tree/fix/mcp-connection-pool-status-events
@0xbrainkid commented on GitHub (Apr 15, 2026):
Reconnecting on every message is a significant latency cost — 15-20s per message is non-interactive even with a fast LLM. The root cause is worth diagnosing: is it a missing session keep-alive (the server closes idle connections), a client that does not reuse connections (always creates fresh), or a load balancer that does not support persistent connections?
From an agent identity perspective, reconnect-on-every-message is also an identity cost. Each reconnect involves a fresh MCP `initialize` handshake, which re-establishes the client-server session but does not carry any identity context from the previous session. An MCP server that performs identity verification at session initialization (checking `X-Agent-ID`, verifying attestation, looking up a behavioral trust score) repeats this work on every message — which is wasteful and also prevents building a within-session behavioral track record.

Two improvements that would help:

1. Persistent session with identity binding at init. If the session is kept alive, identity verification happens once at `initialize` and the result is cached for the session lifetime. The session ID becomes an identity token — holding the session is equivalent to holding the identity.

2. Session ID as identity-bound token. When a fresh session is created, the server binds the verified agent identity to the session ID. If the connection drops and the client reconnects with the same session ID (within a TTL), the server restores the identity context without re-verification. This makes reconnect fast while preserving identity continuity — the session ID carries the identity proof.
The 15-20s delay suggests the reconnect is doing a full initialization including any tool discovery or capability negotiation — persisting that state across reconnects would help too.
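The session-ID-as-identity-token idea (point 2 above) could be sketched server-side like this. Everything here (`init_session`, the `_sessions` store, `verify_identity`, the TTL value) is hypothetical, not part of MCP or Open WebUI:

```python
import time

# Hypothetical server-side store binding a verified identity to a session ID
# with a TTL, so a quick reconnect restores identity without re-verification.

SESSION_TTL = 300.0  # seconds; illustrative value

_sessions = {}       # session_id -> (identity, expires_at)
verifications = 0    # counts expensive verifications, for the demo


def verify_identity(agent_id):
    """Stand-in for attestation / trust-score lookup (the expensive part)."""
    global verifications
    verifications += 1
    return {"agent_id": agent_id, "trust": "verified"}


def init_session(session_id, agent_id, now=None):
    now = time.monotonic() if now is None else now
    entry = _sessions.get(session_id)
    if entry is not None:
        identity, expires_at = entry
        if now < expires_at:
            # Reconnect within TTL: restore identity, skip re-verification.
            _sessions[session_id] = (identity, now + SESSION_TTL)
            return identity
    # New session or expired TTL: do the full verification once.
    identity = verify_identity(agent_id)
    _sessions[session_id] = (identity, now + SESSION_TTL)
    return identity


init_session("sess-1", "agent-42", now=0.0)     # fresh session: verifies
init_session("sess-1", "agent-42", now=10.0)    # restored, no verification
init_session("sess-1", "agent-42", now=1000.0)  # TTL expired: verifies again
print(verifications)  # 2
```

The sliding TTL means an active session never re-verifies, while a long-dead session ID cannot be replayed to inherit a stale identity.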
@Classic298 commented on GitHub (Apr 15, 2026):
STOP OPENING / SPAMMING THE ISSUES SECTION
Final warning
@Classic298 commented on GitHub (Apr 15, 2026):
@0xbrainkid bot?
@Classic298 commented on GitHub (Apr 15, 2026):
@DSavaliya-gh besides my warning to stop spamming the issues, also stop spamming PRs and READ THE PR TEMPLATE. We will aggressively close all automated PRs or PRs that do not follow the PR template
@Classic298 commented on GitHub (Apr 15, 2026):
Also my slop alarm is going off here. 15-20 seconds for one MCP connection is wildly absurd.
The initialize call has a fail_after(10) timeout (line 73 in client.py), so the theoretical maximum for a single connection is ~10 seconds before it'd hard-fail.
So 15-20 seconds is, theoretically and practically, impossible
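For reference, a cap like the one described hard-fails rather than silently stretching past its limit. The real code reportedly uses anyio's `fail_after`; the stdlib `asyncio.wait_for` shown here behaves analogously, with the timeout shrunk to 0.05 s so the demo runs instantly:

```python
import asyncio

# Sketch of a capped initialize call. The real client reportedly uses
# anyio.fail_after(10); stdlib asyncio.wait_for behaves analogously.

async def slow_initialize():
    await asyncio.sleep(1.0)  # pretend the server hangs during initialize
    return "initialized"

async def connect_with_cap(timeout=0.05):
    try:
        return await asyncio.wait_for(slow_initialize(), timeout)
    except asyncio.TimeoutError:
        return "hard-failed at timeout"

print(asyncio.run(connect_with_cap()))  # hard-failed at timeout
```

So a single capped initialize cannot account for a 15-20 s wait by itself; any delay beyond the cap would have to come from steps outside the capped call.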
@Classic298 commented on GitHub (Apr 15, 2026):
An HTTP-based MCP streamable connection to a bearer-auth server should take well under 1 second. It's an HTTP request, not a TCP+TLS+MCP handshake taking 15-20s