Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-07 11:28:35 -05:00)
[GH-ISSUE #24190] issue: Azure OpenAI - Open WebUI: Server Connection Error #58892
Originally created by @burkhat on GitHub (Apr 28, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24190
Check Existing Issues
Installation Method
Other
Open WebUI Version
v0.9.2
Ollama Version (if applicable)
Operating System
Openshift 4.20.x
Browser (if applicable)
Chrome Version 146.0.7680.154
Confirmation
Expected Behavior
We want to use the gpt-5.4 model from Azure OpenAI with Private Endpoints and Reasoning Effort set to xhigh.
We expect a reply from the model roughly 8 minutes after sending the prompt.
Actual Behavior
At the moment we get a Server Connection Error after roughly 4 minutes.
We receive a RESET from the Azure resource. Is it possible to implement keep-alive messages for OpenAI?
Steps to Reproduce
Logs & Screenshots
Additional Information
We have created a network dump and can see that we receive the RESET from the Azure OpenAI resource.
It looks like the idle timeout for VNETs within Azure is set to 4 minutes, see https://www.reddit.com/r/AZURE/comments/1mzn1hg/tcp_idle_timeout_in_azure_vnets/?tl=de&rdt=44769
In the dump we can see that Open WebUI does not send keep-alive messages.
I've created a small Python script with the latest OpenAI pip module, and there we can see that a keep-alive message is sent every 30 seconds.
At the moment it is only possible to configure keep-alive messages for Ollama within Open WebUI, not for OpenAI.
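For illustration, a minimal sketch of how TCP keep-alives could be enabled on the client side via httpx socket options. This is a hypothetical snippet, not the reporter's actual script and not current Open WebUI behavior; it assumes httpx >= 0.25 (for `socket_options`) and openai >= 1.x, and all endpoint/key/deployment values are placeholders:

```python
# Hypothetical sketch: enable OS-level TCP keep-alives on the connection used by
# the OpenAI client, so a long-running request is not reset by Azure's ~4 minute
# VNET idle timeout. Assumes httpx >= 0.25 and openai >= 1.x.
import socket

import httpx
from openai import AzureOpenAI

transport = httpx.HTTPTransport(
    socket_options=[
        (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),     # turn TCP keep-alive on
        (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30),   # first probe after 30 s idle (Linux)
        (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30),  # probe every 30 s afterwards (Linux)
    ]
)

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<api-key>",                                        # placeholder
    api_version="2024-02-01",                                   # assumption: adjust to your resource
    http_client=httpx.Client(transport=transport, timeout=600.0),
)

response = client.chat.completions.create(
    model="<deployment-name>",  # Azure deployment name, placeholder
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```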
@PHclaw commented on GitHub (Apr 28, 2026):
Empty title issue: if `title` is null or empty, the frontend should fall back to the first ~50 chars of the content. Check the React component rendering the title field and add the fallback there. Also add a DB-level NOT NULL constraint with a default of the first content chars to prevent nulls at the storage layer.
@PHclaw commented on GitHub (Apr 28, 2026):
Azure OpenAI Server Connection Error - this is likely a model name mismatch. Azure OpenAI uses deployment names, not model names; the expected format is sketched at the end of this comment.
Also check that your Azure OpenAI resource has the correct API version and that CORS is enabled for the webui origin. The error 'Server Connection Error' typically means the /v1/models endpoint is returning a non-200 status.
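A hedged sketch of the deployment-style call, assuming the standard Azure OpenAI REST layout and the official Python client (all values are placeholders, not the reporter's actual configuration):

```python
# Hypothetical sketch: with Azure OpenAI, the "model" argument is the *deployment
# name* configured in the Azure portal, not the upstream model name, and the
# endpoint plus api-version must match the resource.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<api-key>",                                        # placeholder
    api_version="2024-02-01",  # assumption: use the version your resource supports
)

# Raw REST equivalent that the client builds under the hood:
#   POST https://<your-resource>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=2024-02-01

resp = client.chat.completions.create(
    model="<deployment-name>",  # the deployment name from the Azure portal
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```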
@burkhat commented on GitHub (Apr 28, 2026):
@PHclaw every other chat with the same chat model works. The problem only occurs when we change the Reasoning effort to "xhigh" and attach some files to the prompt.
If we use "xhigh" with just the prompt "Hello how are you?", we get a response.
We can see in the dump that Azure sends an RST after 4 minutes.
My Python script works with the same base URL and model as in Open WebUI.
@PHclaw commented on GitHub (Apr 28, 2026):
Interesting! If it works fine with the same model normally but fails when switching 'Reasoning effort' to high, that points to the reasoning model (o1/o3/o4) being invoked differently.
Reasoning effort settings in Azure OpenAI may map to different model deployments.
Check your Azure OpenAI Studio to see which model deployment is used for reasoning; the 'high' setting likely hits a different deployment.
Look at the Azure portal -> Your resource -> Model deployments. If 'o1-preview' or 'o3' is in a separate deployment, make sure that deployment name is in your Open Web UI config.
@PHclaw commented on GitHub (Apr 28, 2026):
Following up: if the error only happens when 'Reasoning effort' is set to high, the issue is that high reasoning effort routes to a different model (o1-preview/o3-mini) in Azure OpenAI. These models require different API parameters:
- `system` messages (the o1-series ignores them)
- `max_completion_tokens` instead of `max_tokens`
- `temperature` parameter (always 1 for o1)

Check your Open WebUI backend code for the model routing logic. When reasoning_effort=high, it likely sends o1-compatible parameters to gpt-4o, causing the 400 error. The fix is to conditionally apply API parameters based on model family (a sketch follows below).
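A minimal sketch of that kind of conditional mapping. This is a hypothetical helper, not Open WebUI's actual routing code, and the model-family prefix check is a simplification:

```python
# Hypothetical sketch: adapt an OpenAI-style payload for o1/o3 reasoning models.
def adapt_params_for_reasoning_models(model: str, payload: dict) -> dict:
    params = dict(payload)
    if model.startswith(("o1", "o3")):
        # reasoning models expect max_completion_tokens instead of max_tokens
        if "max_tokens" in params:
            params["max_completion_tokens"] = params.pop("max_tokens")
        # temperature is fixed for the o1-series, so drop it rather than send it
        params.pop("temperature", None)
        # older o1 deployments ignore or reject system messages; fold them into
        # user messages instead
        messages = []
        for m in params.get("messages", []):
            if m.get("role") == "system":
                messages.append({"role": "user", "content": m["content"]})
            else:
                messages.append(m)
        params["messages"] = messages
    return params
```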
@PHclaw commented on GitHub (Apr 28, 2026):
For the DuckDuckGo search AttributeError:
This is likely a version incompatibility: duckduckgo-search updated its API and renamed the `ddg()` function. Either update the call to the new API or pin the last version that still ships `ddg()`.
Check the currently installed version and compare it with what the code expects.
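For completeness, a hedged sketch of the renamed API, assuming a recent duckduckgo-search release where the module-level `ddg()` helper was replaced by the `DDGS` class (query string and result handling are illustrative only):

```python
# Hypothetical sketch of the updated duckduckgo-search call (recent releases
# expose a DDGS class instead of the old ddg() helper).
from duckduckgo_search import DDGS

with DDGS() as ddgs:
    results = ddgs.text("open webui server connection error", max_results=5)

for r in results:
    print(r["title"], r["href"])
```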
@burkhat commented on GitHub (Apr 28, 2026):
@PHclaw The problem has nothing to do with the model or with tokens, but with how the prompt is queried.
When I send the same prompt via a Python script to the same endpoint it works without any issues; the error only occurs with Open WebUI.
In the network dump you can see that Open WebUI does not send any keep‑alive messages, and after about four minutes Azure OpenAI sends an RST, which then triggers the error.
The Python script sends a request every 30 seconds, so no RST is emitted by OpenAI.
In my opinion, Open WebUI should be modified to send keep‑alive messages in order to work around the idle‑timeout.
oai_prompt.py
@Classic298 commented on GitHub (May 1, 2026):
@PHclaw stop your clanker and stop it commenting with totally random comments that have NOTHING to do with this here. SPAM
@Classic298 commented on GitHub (May 1, 2026):
@burkhat nginx settings? Reverse proxy? Timeouts there? How did you add the model? Responses API?
Cannot reproduce with this; more details needed, please.