mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[GH-ISSUE #21340] feat: full open responses support #58113
Originally created by @tjbck on GitHub (Feb 12, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21340
@Classic298 commented on GitHub (Feb 13, 2026):
(Idea)
@ahxxm commented on GitHub (Feb 14, 2026):
The collapsed sections for the thinking summary and web.run always show 'Thought for less than a second' while the model is actually still streaming tokens.
It looks like this to me
The above request from v0.8.1 went through litellm
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6, config:

@ahxxm commented on GitHub (Feb 14, 2026):
I could open a new issue, but I imagine it would be automatically closed as a duplicate of this one, so I took the shortcut.
@dhaern commented on GitHub (Feb 14, 2026):
@tjbck My custom multi-kernel Jupyter tool stops working on every execution when I enable the Responses API for my server proxy CLIProxyAPI.
Tested with more community tools, and this happens with every non-built-in tool. Also tested different models, all with native tool calling enabled. Everything works fine in completions mode, with no problem in parallel or single execution.
@yazon commented on GitHub (Feb 14, 2026):
Azure OpenAI does not work through Open WebUI. This request works when sent directly:
curl -X POST "https://my-endpoint.openai.azure.com/openai/responses?api-version=2025-04-01-preview" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-key" \
  -d '{
    "input": [
      {
        "role": "user",
        "content": "Hi"
      }
    ],
    "max_output_tokens": 16384,
    "model": "gpt-5.1-codex-mini"
  }'
@Classic298 commented on GitHub (Feb 14, 2026):
Resource not found. Are you sure your link, config, and deployment are correct?
@yazon commented on GitHub (Feb 14, 2026):
Yes, the URL, API version, and deployment name are the same as for the completions API (which works).
@dhaern commented on GitHub (Feb 16, 2026):
@tjbck Tool call bug in responses API still happening in 0.8.2
@yazon commented on GitHub (Feb 17, 2026):
I manually edited the openai.py file: changed the URL and added the model to the payload. That fixed the issue; my Azure OpenAI expects a completely different URL:
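The exact diff isn't shown; a hypothetical sketch of the kind of edit described (function name and parameters are illustrative, not the actual openai.py code), assuming the URL shape from the working curl example earlier in the thread:

```python
# Illustrative sketch only, not the actual openai.py patch: build the Azure
# Responses URL and move the model into the request payload.
def build_azure_responses_request(base_url: str, api_version: str,
                                  model: str, payload: dict):
    # Azure expects /openai/responses with an api-version query parameter
    url = f"{base_url.rstrip('/')}/openai/responses?api-version={api_version}"
    body = {**payload, "model": model}  # model goes in the body, not the URL
    return url, body
```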
@gaby commented on GitHub (Feb 17, 2026):
@Classic298 Where was this added? I don't see it on the Swagger Docs (running 0.8.3)
@Classic298 commented on GitHub (Feb 17, 2026):
@gaby I think it was added here:
abc9b63093

@gaby commented on GitHub (Feb 17, 2026):
@Classic298 Ah I see, so that one only shows up when using the OpenAI proxy via /openai/responses. Open WebUI itself doesn't serve a Responses API. I don't know if it's possible to add that? It would make API usage cleaner, since I wouldn't have to keep changing base_url.
@Classic298 commented on GitHub (Feb 17, 2026):
Ah okay, so what's implemented at the moment is a workaround: a barebones version via the OpenAI proxy path.
@Classic298 commented on GitHub (Feb 17, 2026):
i edited my comment
@sahrmann commented on GitHub (Feb 23, 2026):
Ran into the same problem, it would be great if this issue could be fixed!
@Classic298 commented on GitHub (Feb 25, 2026):
https://github.com/open-webui/open-webui/discussions/15628
@DanCarrollAI commented on GitHub (Feb 27, 2026):
openai_v1_fix.patch
A note on the Azure Responses API v1 in case it helps with implementation.
Environment:
OpenWebUI 0.8.5
Azure OpenAI Responses API
GPT-5/GPT-5.1 models
Issue
Azure Responses API connections fail with "Resource not found" (404) when using the v1 endpoint format:
https://{resource}.openai.azure.com/openai/v1
OpenWebUI only supports the legacy deployment-based format, causing v1 connections to fail (and requiring an api-version to be specified in the connector even though v1 uses it in neither the payload nor the URL).
Root Cause
The v1 API uses a different format:
/openai/v1/responses
With model in the payload, not the URL.
Fix
Added v1 detection and routing logic to handle both formats:
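The fix itself isn't reproduced above; a minimal sketch of what such detection and routing could look like (function names are illustrative, not the contents of openai_v1_fix.patch; the legacy branch assumes the URL shape from the working curl example earlier in the thread):

```python
# Illustrative sketch only, not the actual patch: detect Azure v1 endpoints
# and route to the matching Responses URL format.
def is_azure_v1(base_url: str) -> bool:
    # v1 connections end in /openai/v1 and take no api-version
    return base_url.rstrip("/").endswith("/openai/v1")

def azure_responses_url(base_url: str, api_version: str) -> str:
    base = base_url.rstrip("/")
    if is_azure_v1(base):
        return f"{base}/responses"  # model goes in the request payload
    # legacy format, with an api-version query parameter
    return f"{base}/openai/responses?api-version={api_version}"
```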
Why Use v1?
Recommended: Use v1 endpoint format (/openai/v1) instead of deployment-based URLs:
Simpler configuration - One endpoint for all models, model name in request body
No api-version required - Cleaner URLs, less configuration
Future-proof - Microsoft is standardizing on v1 format
Testing
I only tested the v1 endpoint, as the other endpoint is already legacy and I cannot see a use case where it would be needed. Verified working with GPT-5 and GPT-5.1; the "Thinking" indicator shows as expected.
Edit: Native function calling has unexpected results on the model's ability to use built-in tools (the spinning wheel described earlier in this thread and in https://github.com/open-webui/open-webui/issues/21426), so I will keep using an external pipe to handle the logic of the built-in tools, with response outputs kept within the reasoning summary. But I thought this may help with built-in support/implementation for Azure OpenAI.
https://github.com/DanCarrollAI/open-webui-developer-toolkit-azure/blob/azure-alpha-preview/functions/pipes/openai_responses_manifold/openai_responses_manifold.py
@i0ntempest commented on GitHub (Mar 9, 2026):
I've noticed that title generation does not work when using the responses endpoint, as described in https://github.com/open-webui/open-webui/issues/21566. In fact, anything that uses the "task model" doesn't work (text completions, follow-ups, etc.).
@Arslo-cloud commented on GitHub (Mar 11, 2026):
Any plans to integrate that change to the main branch?
Edit:
Just found https://github.com/open-webui/open-webui/discussions/15628 that it is already on to-do.
@eatmyvenom commented on GitHub (Mar 12, 2026):
Just for clarity, native mode tool calls still do not work. The rest of the implementation works just fine.
I have tested using both built-in and uploaded tools with the responses endpoint from OpenAI and OpenRouter.
If someone is willing and has time to have a checklist of the progress of this issue somewhere so we can monitor the status of the implementation at a glance, that would be appreciated!
@Arslo-cloud commented on GitHub (Mar 12, 2026):
Well, it doesn't work with the Azure Responses API at all.
@attilaolah commented on GitHub (Mar 12, 2026):
In case this helps anyone, this is a workaround we've been using for about a week now. It still does not allow folks to use Azure Responses-API models in the Open WebUI web app, but it does let the CLI agent tools and other IDE integrations through, directly to Azure.
One downside is that requests that get proxied and thus bypass Open WebUI entirely will not be counted in the statistics for Open WebUI's model usage, but we can live with that for now.
The proxy calls /api/v1/auths/api_key to verify the key. The below proxy config works for both Claude Code and Codex.
The __OPENAI_API_KEY__ placeholder gets replaced in the init container by an env var mounted from a secret, containing the Azure API key. An ingress rule will mount the below Caddy config under the "/api" subpath, matching only /api/anthropic* and /api/v1/responses*, so it won't clash e.g. with the Chat API; that one can still go through Open WebUI. Claude Code needs these three env vars then:
Codex can be configured by adding a model provider as usual, pointing to Open WebUI, that uses the responses wire format (default in recent Codex versions).
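The actual Caddy config isn't reproduced in this capture; a hypothetical sketch of what it could look like, based only on the description above (host names, ports, and the matcher name are illustrative, not the author's real config):

```caddyfile
# Illustrative sketch of the described proxy, not the actual config.
:8080 {
	@agent path /api/anthropic* /api/v1/responses*

	handle @agent {
		# Verify the caller's Open WebUI API key before proxying on.
		forward_auth open-webui:8080 {
			uri /api/v1/auths/api_key
		}
		# Swap in the Azure key; __OPENAI_API_KEY__ is substituted by
		# the init container from a mounted secret.
		reverse_proxy https://my-resource.openai.azure.com {
			header_up Host {upstream_hostport}
			header_up Authorization "Bearer __OPENAI_API_KEY__"
		}
	}
}
```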
Note that the /api part we use for the subpath is somewhat arbitrary; we have it because we run a second reverse proxy in front of Open WebUI that does an additional layer of authentication, and /api/* is exempted from that rule so CLI tools can sign in using only the token.

@gaby commented on GitHub (Mar 12, 2026):
@attilaolah Be aware that /openai endpoints do not work with Workspace models. I raised an issue, but it was closed: https://github.com/open-webui/open-webui/issues/22134
The only endpoint that works with workspace models is chat completions.
@attilaolah commented on GitHub (Mar 12, 2026):
Indeed, but it does work with the foundation models deployed via the AI Foundry, including the Codex models, which is better than nothing.
@Arslo-cloud commented on GitHub (Mar 13, 2026):
I changed the Azure API connection to use the OpenAI settings. Now it works, but I've run into the tool call problem.
@JiwaniZakir commented on GitHub (Mar 14, 2026):
I can pick this up. The current streaming implementation in open-webui doesn't fully handle OpenAI-style streaming responses where partial JSON chunks need to be assembled — I'd work through the chat completion handler to ensure delta objects, including tool calls and function arguments, are properly accumulated and rendered incrementally rather than waiting for the final assembled response.
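As an illustration of the accumulation described (not Open WebUI's actual handler; names are illustrative), OpenAI-style streamed deltas can be folded together so tool-call names and their partial JSON argument fragments assemble incrementally:

```python
# Illustrative sketch: fold OpenAI-style streaming deltas into full text
# and complete tool calls, rather than waiting for the final response.
def accumulate(chunks):
    content, tool_calls = [], {}
    for chunk in chunks:  # each chunk is a parsed SSE delta object
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            content.append(delta["content"])
        for tc in delta.get("tool_calls", []):
            # tool-call deltas carry an index; arguments arrive as
            # partial JSON fragments that must be concatenated
            slot = tool_calls.setdefault(tc["index"],
                                         {"name": "", "arguments": ""})
            fn = tc.get("function", {})
            slot["name"] += fn.get("name", "")
            slot["arguments"] += fn.get("arguments", "")
    return "".join(content), tool_calls
```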
@JiwaniZakir commented on GitHub (Mar 14, 2026):
Hey, dropping my claim on this one. I spent a good chunk of time on it but couldn't land on a clean approach for the full open responses support, so I'm stepping aside to let someone else take a crack at it.
@Arslo-cloud commented on GitHub (Mar 15, 2026):
Hope that solves it https://github.com/open-webui/open-webui/pull/22497#issuecomment-4062796363
@flange-ipb commented on GitHub (Mar 17, 2026):
Hello, Ollama recently added support for the Responses API (https://github.com/ollama/ollama/pull/13351). As low-hanging fruit, would it be possible to add it to the Ollama proxy?
Edit: This was added with #23483. Thanks!
@tjbck commented on GitHub (Apr 7, 2026):
I believe we've reached full parity with chat completions in the latest version, any issues everyone?
@Classic298 commented on GitHub (Apr 7, 2026):
4 issues identified, PRs incoming
@Classic298 commented on GitHub (Apr 14, 2026):
All PRs merged; this should be fully done now.
@wolffberg commented on GitHub (Apr 28, 2026):
@yazon, @Classic298
Did this ever make it into a release? I'm getting the exact same issue with gpt-5.1-codex-mini and 2025-04-01-preview.

Log:
@yazon commented on GitHub (Apr 29, 2026):
No, I just manually replace the file for each openwebui release 🙂
@wolffberg commented on GitHub (Apr 29, 2026):
Would you mind making it into a PR or issue with a description on where you add it? 🙏
I'm running a stateless instance, so replacing the file is not viable.