mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #22905] feat: Per-model API type override (Chat Completions vs Responses) #35368
Originally created by @shishiraiyar on GitHub (Mar 20, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22905
Check Existing Issues
Verify Feature Scope
Problem Description
With the introduction of the Responses API toggle on connections, users can choose between Chat Completions and Responses API types — but only at the connection level. This becomes a problem when a single backend (e.g., a LiteLLM proxy) serves models that require different API types.
Real-world example
I use a LiteLLM proxy that routes to GitHub Copilot models. All models are served from the same endpoint, but:
gpt-5.4 and gpt-5.4-mini only work via /v1/responses.
All other models (Claude, Gemini, older GPT) only work via /v1/chat/completions.
Current workarounds and why they fail
With the connection set to Chat Completions, gpt-5.4 and gpt-5.4-mini fail with "model only supported in /v1/responses".
A duplicate connection set to the Responses API exposes every model, not just the /responses-only ones, and takes priority due to ordering.
The only working solution today requires manually maintaining two separate model lists across two connections to the same backend and giving up auto-discovery.
Desired Solution
Add the ability to set the API type (Chat Completions vs Responses) per model, not just per connection. This could take several forms:
Option A: Per-model override in Admin → Models
In the model editor (Admin Settings → Models), add an "API Type" dropdown that overrides the connection-level default for that specific model.
Option B: Model-level config within a connection
In the connection's model ID list, allow specifying the API type per model, e.g.:
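The issue's original example is not preserved here; as a purely hypothetical illustration (this syntax is not part of Open WebUI), the model ID list could annotate each entry with its API type:

```
gpt-5.4=responses
gpt-5.4-mini=responses
claude-sonnet=chat_completions
```

Entries without an annotation would fall back to the connection-level default.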
Option C: Connection priority / conflict resolution for duplicate models
When the same model ID appears in multiple connections, allow the user to choose which connection takes priority for that model (rather than always defaulting to the first connection).
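To make the proposed resolution order concrete, here is a minimal sketch (hypothetical names, not Open WebUI's actual code) of how a per-model override could take precedence over today's first-connection-wins behavior:

```python
# Hypothetical sketch of API-type resolution for a model ID.
# "overrides" models the proposed per-model setting (Options A/B);
# the loop over connections models today's first-match behavior.

def resolve_api_type(model_id, connections, overrides):
    """connections: ordered list of {"models": set, "api_type": str}.
    overrides: mapping of model ID -> API type (the proposed feature)."""
    if model_id in overrides:
        # Proposed: a per-model override always wins.
        return overrides[model_id]
    for conn in connections:
        # Current behavior: the first connection serving the model wins.
        if model_id in conn["models"]:
            return conn["api_type"]
    return None
```

With an override map of {"gpt-5.4": "responses"}, gpt-5.4 would use the Responses API even on a Chat Completions connection, while unlisted models keep the connection default.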
Additional Context
As more models move to /responses-only (which is increasingly likely given that the Responses API is OpenAI's new format), this problem will affect more users.
The current workaround is manageable with a small, static model list, but breaks down for users with many models or frequently changing model availability.
The Responses API is a superset of Chat Completions, so the trend is clearly toward more models requiring it.
@pr-validator-bot commented on GitHub (Mar 20, 2026):
⚠️ Invalid Issue Title
Hey @shishiraiyar, please provide a descriptive title for your issue. Titles that are empty, very short (under 10 characters), or generic (like "issue:" or "feat:") make it difficult for volunteer contributors to understand and triage issues.
Please update the title to reflect the content of your issue.
⚠️ Missing Issue Title Prefix
@shishiraiyar, your issue title is missing a prefix (e.g., bug:, feat:, docs:).
Please update your issue title to include one of the following prefixes:
Example:
bug: Login fails when using special characters in password