[GH-ISSUE #21340] feat: full open responses support #58113

Closed
opened 2026-05-05 22:21:06 -05:00 by GiteaMirror · 35 comments
Owner

Originally created by @tjbck on GitHub (Feb 12, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21340


@Classic298 commented on GitHub (Feb 13, 2026):

(Idea)

- [x] Responses support on Open WebUI API - /responses endpoint
<!-- gh-comment-id:3897150697 -->

@ahxxm commented on GitHub (Feb 14, 2026):

The collapsed sections for the thinking summary and web.run always show "Thought for less than a second" while tokens are actually still streaming.

It looks like this to me

Image

The above request from v0.8.1 went through litellm `ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6`; config:


```yaml
model_list:
  - model_name: gpt-5.2
    litellm_params:
      model: openai/gpt-5.2
      api_base: https://oooo/openai/v1
      api_key: "akey"
      service_tier: "priority"
      reasoning: {"effort": "xhigh", "summary": "auto"}
      tools:
        - type: web_search
      tool_choice: auto
    model_info:
      base_model: gpt-5.2
```
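The "Thought for less than a second" label suggests the duration is stamped when the collapsed section opens rather than when the summary finishes streaming. A minimal sketch of the alternative (illustrative only, not Open WebUI's handler; the event names assume the OpenAI Responses streaming shape, e.g. `response.reasoning_summary_text.delta`):

```python
# Hypothetical stream handler: derive the reasoning-summary duration from the
# first and last streamed summary events, instead of stamping it at open time.
# Event shape is illustrative; "t" is a timestamp recorded on receipt.
def summary_duration(events):
    started = ended = None
    for event in events:
        if event["type"] == "response.reasoning_summary_text.delta":
            if started is None:
                started = event["t"]   # first summary token seen
            ended = event["t"]         # keep extending while tokens arrive
        elif event["type"] == "response.reasoning_summary_text.done":
            ended = event["t"]
    return 0.0 if started is None else ended - started

# Simulated timeline: summary tokens streamed over 2.5 seconds.
events = [
    {"type": "response.reasoning_summary_text.delta", "t": 10.0},
    {"type": "response.reasoning_summary_text.delta", "t": 11.2},
    {"type": "response.reasoning_summary_text.done", "t": 12.5},
]
print(summary_duration(events))  # 2.5
```

With this approach the label would read "Thought for 2.5 seconds" once the stream completes, rather than freezing at the open timestamp.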
<!-- gh-comment-id:3900604171 -->

@ahxxm commented on GitHub (Feb 14, 2026):

I could open a new issue, but I imagine it to be automatically closed by this one so I took the shortcut

<!-- gh-comment-id:3900609365 -->

@dhaern commented on GitHub (Feb 14, 2026):

Image

@tjbck My custom multi-kernel Jupyter tool stops working on every execution when I enable the Responses API for my server proxy [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI).

Tested with more community tools; this happens with every non-built-in tool. Also tested different models, each with native tool calling enabled. Works fine in completions mode, with no problem in parallel or single execution.

<!-- gh-comment-id:3900685332 -->

@yazon commented on GitHub (Feb 14, 2026):

Azure OpenAI does not work:

Image

This works:

```sh
curl -X POST "https://my-endpoint.openai.azure.com/openai/responses?api-version=2025-04-01-preview" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-key" \
  -d '{
    "input": [
      {
        "role": "user",
        "content": "Hi"
      }
    ],
    "max_output_tokens": 16384,
    "model": "gpt-5.1-codex-mini"
  }'
```
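The working request above is a plain Responses payload: `input` instead of `messages`, and `max_output_tokens` instead of `max_tokens`. A rough sketch of that translation (illustrative only; this is not Open WebUI's `convert_to_responses_payload`, just the field renames visible in the public API shapes):

```python
# Illustrative sketch of translating a Chat Completions payload into a
# Responses API payload. Field names follow the public OpenAI API shapes;
# this is NOT Open WebUI's internal converter.
def to_responses_payload(chat_payload):
    payload = dict(chat_payload)
    # Responses takes "input" instead of "messages".
    if "messages" in payload:
        payload["input"] = payload.pop("messages")
    # The token limit is named differently.
    if "max_tokens" in payload:
        payload["max_output_tokens"] = payload.pop("max_tokens")
    return payload

converted = to_responses_payload({
    "model": "gpt-5.1-codex-mini",
    "messages": [{"role": "user", "content": "Hi"}],
    "max_tokens": 16384,
})
print(converted["input"])              # [{'role': 'user', 'content': 'Hi'}]
print(converted["max_output_tokens"])  # 16384
```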

<!-- gh-comment-id:3901830645 -->

@Classic298 commented on GitHub (Feb 14, 2026):

Resource not found. Are you sure your link, config, and deployment are correct?

<!-- gh-comment-id:3901842706 -->

@yazon commented on GitHub (Feb 14, 2026):

Yes, the URL, API version, and deployment name are the same as for the completions API (which works).

<!-- gh-comment-id:3901930669 -->

@dhaern commented on GitHub (Feb 16, 2026):

> Image
>
> [@tjbck](https://github.com/tjbck) My custom multi-kernel Jupyter tool stops working on every execution when I enable the Responses API for my server proxy [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI).
>
> Tested with more community tools; this happens with every non-built-in tool. Also tested different models, each with native tool calling enabled. Works fine in completions mode, with no problem in parallel or single execution.

@tjbck The tool call bug in the Responses API is still happening in 0.8.2

Image
<!-- gh-comment-id:3907057736 -->

@yazon commented on GitHub (Feb 17, 2026):

I manually edited the openai.py file: I changed the URL and added the model to the payload. That fixed the issue; my Azure OpenAI endpoint expects a completely different URL:

```python
if api_config.get("azure", False):
    api_version = api_config.get("api_version", "2023-03-15-preview")
    model = payload.get("model", "")
    request_url, payload = convert_to_azure_payload(url, payload, api_version)

    # Only set api-key header if not using Azure Entra ID authentication
    auth_type = api_config.get("auth_type", "bearer")
    if auth_type not in ("azure_ad", "microsoft_entra_id"):
        headers["api-key"] = key

    headers["api-version"] = api_version

    if is_responses:
        payload = convert_to_responses_payload(payload)
        payload["model"] = model
        request_url = f"{url}/openai/responses?api-version={api_version}"
```
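The edit above boils down to which of several Azure URL shapes gets built. A hypothetical helper contrasting the forms discussed in this thread (illustrative only; Open WebUI constructs these URLs inside openai.py, and the function name here is made up):

```python
# Hypothetical helper contrasting the Azure Responses URL shapes mentioned in
# this thread. Illustrative only, not Open WebUI code.
def azure_responses_url(base_url, api_version=None, deployment=None):
    base = base_url.rstrip("/")
    if deployment:
        # Legacy deployment-scoped form.
        return f"{base}/openai/deployments/{deployment}/responses?api-version={api_version}"
    if api_version:
        # Resource-level form that worked for the endpoint above.
        return f"{base}/openai/responses?api-version={api_version}"
    # v1 form: no api-version; the model goes in the payload instead.
    return f"{base}/openai/v1/responses"

print(azure_responses_url("https://my-endpoint.openai.azure.com",
                          api_version="2025-04-01-preview"))
# https://my-endpoint.openai.azure.com/openai/responses?api-version=2025-04-01-preview
```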
<!-- gh-comment-id:3912901270 -->

@gaby commented on GitHub (Feb 17, 2026):

> (Idea)
>
> - [x] Responses support on Open WebUI API - /responses endpoint on API was added 🥳

@Classic298 Where was this added? I don't see it on the Swagger Docs (running 0.8.3)

Image
<!-- gh-comment-id:3915249586 -->

@Classic298 commented on GitHub (Feb 17, 2026):

@gaby I think it was added here: abc9b63093

<!-- gh-comment-id:3915260252 -->

@gaby commented on GitHub (Feb 17, 2026):

> @gaby i think was added here: abc9b63

@Classic298 Ah, I see, so that one only shows up when using the OpenAI proxy via `/openai/responses`. Open WebUI itself doesn't serve the Responses API.

I don't know whether it's possible to add that? It would make API usage cleaner, since I wouldn't have to keep changing base_url.

<!-- gh-comment-id:3915284247 -->

@Classic298 commented on GitHub (Feb 17, 2026):

Aha, okay, so what's implemented at the moment is a workaround: a barebones version via the OpenAI path.

<!-- gh-comment-id:3915289448 -->

@Classic298 commented on GitHub (Feb 17, 2026):

i edited my comment

<!-- gh-comment-id:3915291246 -->

@sahrmann commented on GitHub (Feb 23, 2026):

> I manually edited the openai.py file: I changed the URL and added the model to the payload. That fixed the issue; my Azure OpenAI endpoint expects a completely different URL:
>
> ```python
> if api_config.get("azure", False):
>     api_version = api_config.get("api_version", "2023-03-15-preview")
>     model = payload.get("model", "")
>     request_url, payload = convert_to_azure_payload(url, payload, api_version)
>
>     # Only set api-key header if not using Azure Entra ID authentication
>     auth_type = api_config.get("auth_type", "bearer")
>     if auth_type not in ("azure_ad", "microsoft_entra_id"):
>         headers["api-key"] = key
>
>     headers["api-version"] = api_version
>
>     if is_responses:
>         payload = convert_to_responses_payload(payload)
>         payload["model"] = model
>         request_url = f"{url}/openai/responses?api-version={api_version}"
> ```

Ran into the same problem; it would be great if this issue could be fixed!

<!-- gh-comment-id:3946826256 -->

@Classic298 commented on GitHub (Feb 25, 2026):

- [ ] Azure Responses <-> OpenResponses support

https://github.com/open-webui/open-webui/discussions/15628

<!-- gh-comment-id:3962113076 -->

@DanCarrollAI commented on GitHub (Feb 27, 2026):

openai_v1_fix.patch

A note on the Azure Responses API v1 in case it helps with implementation.

**Environment:**
OpenWebUI 0.8.5
Azure OpenAI Responses API
GPT-5/GPT-5.1 models

**Issue**
Azure Responses API connections fail with "Resource not found" (404) when using the v1 endpoint format:

`https://{resource}.openai.azure.com/openai/v1`

OpenWebUI only supports the legacy deployment-based format, causing v1 connections to fail (and requiring an API version to be specified in the connector even though it is not used in the payload or URL when using v1).

**Root Cause**

`/app/backend/open_webui/routers/openai.py` (lines ~1067-1090) always constructs deployment-based URLs:

`/openai/deployments/{model}/responses?api-version={version}`

The v1 API uses a different format:

`/openai/v1/responses`

with the model in the payload, not the URL.

**Fix**
Added v1 detection and routing logic to handle both formats:

```python
# Detect v1 API
is_v1_api = "/openai/v1" in url

if is_responses:
    payload = convert_to_responses_payload(payload)

    if is_v1_api:
        # V1: model in payload, no api-version
        payload["model"] = payload.get("model", "")
        request_url = f"{url.rstrip('/')}/responses"
    else:
        # Legacy: model in URL path
        request_url, payload = convert_to_azure_payload(url, payload, api_version)
        request_url = f"{request_url}/responses?api-version={api_version}"
        headers["api-version"] = api_version
```

**Why Use v1?**
Recommended: use the v1 endpoint format (`/openai/v1`) instead of deployment-based URLs:

- Simpler configuration - one endpoint for all models, model name in the request body
- No api-version required - cleaner URLs, less configuration
- Future-proof - Microsoft is standardizing on the v1 format

**Testing**
I only tested the v1 endpoint, as the other endpoint is already legacy and I cannot see a use case where it would still be needed. Verified working with GPT-5 and GPT-5.1. The "Thinking" indicator shows as expected.

**Edit:** Native function calling has unexpected results on the model's ability to use built-in tools (a spinning wheel, as described earlier in this thread and in https://github.com/open-webui/open-webui/issues/21426), so I will keep using an external pipe to handle the logic of the built-in tools and keep response outputs within the reasoning summary. But I thought this might help with built-in support/implementation for Azure OpenAI.
https://github.com/DanCarrollAI/open-webui-developer-toolkit-azure/blob/azure-alpha-preview/functions/pipes/openai_responses_manifold/openai_responses_manifold.py

<!-- gh-comment-id:3973006209 -->

@i0ntempest commented on GitHub (Mar 9, 2026):

I've noticed that title generation does not work when using the Responses endpoint, as noted in https://github.com/open-webui/open-webui/issues/21566. In fact, anything that uses the "task model" doesn't work (text completions, follow-ups, etc.).

<!-- gh-comment-id:4020965007 -->

@Arslo-cloud commented on GitHub (Mar 11, 2026):

Any plans to integrate that change into the main branch?

Edit:
Just found https://github.com/open-webui/open-webui/discussions/15628; it's already on the to-do list.

<!-- gh-comment-id:4037310233 -->

@eatmyvenom commented on GitHub (Mar 12, 2026):

Just for clarity, native mode tool calls still do not work. The rest of the implementation works just fine.

I have tested using both built-in and uploaded tools with the responses endpoint from OpenAI and OpenRouter.

If someone is willing and has time to have a checklist of the progress of this issue somewhere so we can monitor the status of the implementation at a glance, that would be appreciated!

<!-- gh-comment-id:4044025109 -->

@Arslo-cloud commented on GitHub (Mar 12, 2026):

Well, it doesn't work with the Azure Responses API at all.

<!-- gh-comment-id:4044074075 -->

@attilaolah commented on GitHub (Mar 12, 2026):

In case this helps anyone, this is a workaround we've been using for about a week now. It still does not allow folks to use Azure Responses-API models in the Open WebUI web app, but it does let the CLI agent tools and other IDE integrations through, directly to Azure.

One downside is that requests that get proxied and thus bypass Open WebUI entirely will not be counted in the statistics for Open WebUI's model usage, but we can live with that for now.


  1. Run a Caddy reverse-proxy in the same namespace as open-webui.
    • I would run a sidecar container, but currently the Helm chart doesn't expose a way to run sidecar containers, only init containers.
  2. Incoming requests must have an Open WebUI API key, and an auth request is sent to /api/v1/auths/api_key to verify the key.
    • A dedicated "authenticate this request and return nothing" endpoint in Open WebUI would be convenient, but the above works too.
  3. Proxy incoming request directly to Azure, replacing the auth headers with the Azure API key.

The below proxy config works for both Claude Code and Codex.

The `__OPENAI_API_KEY__` gets replaced in the init container by an env var mounted from a secret, containing the Azure API key. An ingress rule mounts the below Caddy config under the `/api` subpath, matching only `/api/anthropic*` and `/api/v1/responses*`, so it won't clash e.g. with the Chat API; that one can still go through Open WebUI.

```caddy
{
  auto_https off
}

https://:8443 {
  tls /etc/tls/tls.crt /etc/tls/tls.key

  @anthropic path /anthropic*
  handle @anthropic {
    # Auth check is always GET to this exact path.
    forward_auth https://open-webui {
      uri /api/v1/auths/api_key
      header_up Authorization "Bearer {header.X-Api-Key}"
      transport http {
        tls_server_name open-webui
        tls_client_auth /etc/tls/tls.crt /etc/tls/tls.key
        tls_trust_pool file /etc/tls/ca.crt
      }
    }

    reverse_proxy https://my-azure-resource.services.ai.azure.com {
      # Anthropic API uses the X-Api-Key header for authorisation.
      header_up x-api-key "__OPENAI_API_KEY__"
      header_up Host {upstream_hostport}
    }
  }

  @responses path /v1/responses*
  handle @responses {
    # Responses requests already arrive with a Bearer token.
    forward_auth https://open-webui {
      uri /api/v1/auths/api_key
      header_up Authorization "{header.Authorization}"
      transport http {
        tls_server_name open-webui
        tls_client_auth /etc/tls/tls.crt /etc/tls/tls.key
        tls_trust_pool file /etc/tls/ca.crt
      }
    }

    uri replace /v1/responses /openai/v1/responses

    reverse_proxy https://my-azure-resource.services.ai.azure.com {
      # Open Responses requires a standard bearer token for authorisation.
      header_up Authorization "Bearer __OPENAI_API_KEY__"
      header_up Host {upstream_hostport}
    }
  }
}
```

Claude Code then needs these three env vars:

```sh
ANTHROPIC_FOUNDRY_API_KEY=sk-my-open-webui-api-key
ANTHROPIC_FOUNDRY_BASE_URL=https://openwebui.example.com/api/anthropic
CLAUDE_CODE_USE_FOUNDRY=1
```

Codex can be configured by adding a model provider as usual, pointing at Open WebUI, using the responses wire format (the default in recent Codex versions).

```toml
model = "gpt-5.3-codex"
model_provider = "openwebui"
model_reasoning_effort = "medium"
personality = "pragmatic"

[model_providers.openwebui]
name = "Open WebUI"
base_url = "https://openwebui.example.com/api/v1"
env_key = "OPEN_WEBUI_API_KEY"
wire_api = "responses"  # the default these days
# Some models are very flaky; hammer them with more than the default 5 retries.
stream_max_retries = 30
```

Note that the `/api` part we use for the subpath is somewhat arbitrary; we have it because a second reverse proxy in front of Open WebUI does an additional layer of authentication, and `/api/*` is exempted from that rule so CLI tools can sign in using only the token.

<!-- gh-comment-id:4046210185 -->

@gaby commented on GitHub (Mar 12, 2026):

@attilaolah Be aware that `/openai` endpoints do not work with Workspace models.

I raised an issue, but it was closed: https://github.com/open-webui/open-webui/issues/22134

With workspace models, the only endpoint that works is chat completions.

<!-- gh-comment-id:4046334724 -->

@attilaolah commented on GitHub (Mar 12, 2026):

Indeed, but it does work with the foundation models deployed via the AI Foundry, including the Codex models, which is better than nothing.

<!-- gh-comment-id:4046525395 -->

@Arslo-cloud commented on GitHub (Mar 13, 2026):

I changed the Azure API to use the OpenAI settings. Now it works, but I now have the tool call problem.

<!-- gh-comment-id:4057057746 -->

@JiwaniZakir commented on GitHub (Mar 14, 2026):

I can pick this up. The current streaming implementation in open-webui doesn't fully handle OpenAI-style streaming responses where partial JSON chunks need to be assembled — I'd work through the chat completion handler to ensure delta objects, including tool calls and function arguments, are properly accumulated and rendered incrementally rather than waiting for the final assembled response.
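The delta-assembly problem described here can be sketched roughly as follows (a simplified illustration, not Open WebUI's handler; the chunk shape assumes the OpenAI streaming format, where `tool_calls` deltas carry argument fragments keyed by `index`):

```python
# Simplified sketch of accumulating streamed tool-call deltas into complete
# calls. Not Open WebUI's implementation; the chunk shape follows the OpenAI
# streaming format, where function arguments arrive as partial JSON strings.
def accumulate_tool_calls(chunks):
    calls = {}  # index -> {"id", "name", "arguments"}
    for chunk in chunks:
        for delta in chunk.get("tool_calls", []):
            call = calls.setdefault(
                delta["index"], {"id": "", "name": "", "arguments": ""}
            )
            if "id" in delta:
                call["id"] = delta["id"]
            fn = delta.get("function", {})
            if "name" in fn:
                call["name"] += fn["name"]
            if "arguments" in fn:
                call["arguments"] += fn["arguments"]  # concatenate partial JSON

    return [calls[i] for i in sorted(calls)]

# Arguments for one call arrive split across three chunks.
chunks = [
    {"tool_calls": [{"index": 0, "id": "call_1", "function": {"name": "get_weather"}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '{"city": '}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '"Oslo"}'}}]},
]
calls = accumulate_tool_calls(chunks)
print(calls[0]["name"])       # get_weather
print(calls[0]["arguments"])  # {"city": "Oslo"}
```

Only once the accumulated argument string parses as complete JSON can the tool actually be invoked; rendering can still happen incrementally per fragment.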

<!-- gh-comment-id:4060255616 -->

@JiwaniZakir commented on GitHub (Mar 14, 2026):

Hey, dropping my claim on this one. I spent a good chunk of time on it but couldn't land on a clean approach for the full open responses support, so I'm stepping aside to let someone else take a crack at it.

<!-- gh-comment-id:4060275897 -->

@Arslo-cloud commented on GitHub (Mar 15, 2026):

Hope that solves it https://github.com/open-webui/open-webui/pull/22497#issuecomment-4062796363

@flange-ipb commented on GitHub (Mar 17, 2026):

Hello, Ollama recently added support for the Responses API (https://github.com/ollama/ollama/pull/13351). As low-hanging fruit, would it be possible to add it to the [Ollama Proxy](https://docs.openwebui.com/reference/api-endpoints#-ollama-api-proxy-support)?

Edit: This was added with #23483. Thanks!

@tjbck commented on GitHub (Apr 7, 2026):

I believe we've reached full parity with chat completions in the latest version. Any issues, everyone?

@Classic298 commented on GitHub (Apr 7, 2026):

4 issues identified, PRs incoming

@Classic298 commented on GitHub (Apr 14, 2026):

All PRs are merged; this should be fully done now.

@wolffberg commented on GitHub (Apr 28, 2026):

@yazon, @Classic298

Did this ever make it into a release? I'm getting the exact same issue with `gpt-5.1-codex-mini` and `2025-04-01-preview`.

Log:

```
open_webui.main:process_chat:1884 - Error processing chat payload: Resource not found
```

> I manually edited the openai.py file, changed the URL, and added the model to the payload. That fixed the issue; my Azure OpenAI deployment expects a completely different URL:
>
> ```python
> if api_config.get("azure", False):
>     api_version = api_config.get("api_version", "2023-03-15-preview")
>     model = payload.get("model", "")
>     request_url, payload = convert_to_azure_payload(url, payload, api_version)
>
>     # Only set api-key header if not using Azure Entra ID authentication
>     auth_type = api_config.get("auth_type", "bearer")
>     if auth_type not in ("azure_ad", "microsoft_entra_id"):
>         headers["api-key"] = key
>
>     headers["api-version"] = api_version
>
>     if is_responses:
>         payload = convert_to_responses_payload(payload)
>         payload["model"] = model
>         request_url = f"{url}/openai/responses?api-version={api_version}"
> ```
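For reference, the URL rewrite in the quoted workaround can be sketched as a small standalone helper. The function name and the default `api_version` here are illustrative; only the `/openai/responses?api-version=...` shape comes from the snippet above:

```python
# Hedged sketch of the workaround above: build the request URL and body
# for Azure's Responses endpoint. build_azure_responses_request is a
# made-up name; the URL shape mirrors the quoted openai.py patch.

def build_azure_responses_request(base_url, payload, api_version="2025-04-01-preview"):
    payload = dict(payload)  # don't mutate the caller's dict
    # Azure expects the model/deployment name in the request body and the
    # api-version as a query parameter on the /openai/responses path.
    payload.setdefault("model", "")
    request_url = f"{base_url}/openai/responses?api-version={api_version}"
    return request_url, payload
```

A helper like this keeps the Azure-specific URL logic in one place, which is exactly the part the stock openai.py gets wrong for these deployments.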

@yazon commented on GitHub (Apr 29, 2026):

No, I just manually replace the file for each openwebui release 🙂

@wolffberg commented on GitHub (Apr 29, 2026):

Would you mind turning it into a PR, or filing an issue describing where the change goes? 🙏

I'm running a stateless instance, so replacing the file is not viable.

Reference: github-starred/open-webui#58113