[GH-ISSUE #16395] issue: max_tokens deprecated, breaking Title Generation with GPT-5 #33416

Closed
opened 2026-04-25 07:19:37 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @17jmumford on GitHub (Aug 8, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16395

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip install locally, Docker in production

Open WebUI Version

latest

Ollama Version (if applicable)

No response

Operating System

mac

Browser (if applicable)

chrome

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
      • Start with the initial platform/version/OS and dependencies used,
      • Specify exact install/launch/configure commands,
      • List URLs visited, user input (incl. example values/emails/passwords if needed),
      • Describe all options and toggles enabled or changed,
      • Include any files or environmental changes,
      • Identify the expected and actual result at each stage,
      • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When a title is generated in the background for a chat, Open WebUI sets the max_tokens parameter to ensure the title fits in the sidebar. Normally, this generation works great.

Actual Behavior

max_tokens is marked as deprecated in the OpenAI API for /chat/completions. Interestingly, despite the docs stating it won't work with o-series models, those models still generate titles successfully. Does Open WebUI already have a hardcoded fix for this, or is the documentation wrong?

However, the music finally stopped with GPT-5: title generation errors out with these models. See the screenshots and links below.

Now, obviously you can set a different model to generate the title in Open WebUI. Unfortunately, Open WebUI requires a public base model to be set as the task model, and our organization is set up so that the only public models are models created inside Open WebUI. We humbly request that Open WebUI either update title generation to use max_completion_tokens for limiting title length, or allow Open WebUI models to be selected as task models.
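For clarity, here is a minimal sketch of the two request shapes against /chat/completions. The payload values are hypothetical illustrations, not Open WebUI's actual task request:

```python
# Hypothetical payloads to illustrate the parameter difference; not Open WebUI's actual task request.
legacy_payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Generate a short title for this chat."}],
    "max_tokens": 50,  # deprecated on /chat/completions; rejected by GPT-5 with a 400
}

current_payload = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Generate a short title for this chat."}],
    "max_completion_tokens": 50,  # replacement parameter the newer models expect
}
```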

Steps to Reproduce

Pretty easy to reproduce. Just use a GPT-5 model 🤷‍♂️

Logs & Screenshots

Here is the error message that our proxy gateway (similar to LiteLLM but homegrown) caught:

openai.badrequesterror: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}

Additional Information

Here is the link to the OpenAI documentation. Again, despite it saying that o-series models can't use this parameter, they actually generate titles successfully 🤷. Not sure if Open WebUI has a hardcoded fix for this, or if the documentation is wrong.
https://platform.openai.com/docs/api-reference/chat/create

GiteaMirror added the bug label 2026-04-25 07:19:37 -05:00
Author
Owner

@17jmumford commented on GitHub (Aug 8, 2025):

After grepping the codebase, there IS a hardcoded fix for this with o-series models.

https://github.com/open-webui/open-webui/blob/main/backend/open_webui/routers/openai.py

So it seems like the short-term fix would be to add gpt-5 to the list 🤷

Long term, this is not ideal. However, I understand that other proxy systems emulating OpenAI expect the deprecated max_tokens parameter by default. Anthropic, Bedrock, Google, and Grok all accept max_tokens with their OpenAI emulators, and I suspect at least one of them would break if we switched to max_completion_tokens.

I wonder if there is some way to identify reasoning models automatically....?

I'll make a PR to get this updated.
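For illustration, the kind of swap being discussed might look like the sketch below. The prefix tuple follows the model IDs mentioned later in this thread, and the function is a hypothetical stand-in, not the actual code in backend/open_webui/routers/openai.py:

```python
# Sketch only: prefix list per this thread; the real logic lives in backend/open_webui/routers/openai.py.
REASONING_PREFIXES = ("o1", "o3", "o4", "gpt-5")

def adapt_max_tokens(payload: dict) -> dict:
    """Replace the deprecated max_tokens with max_completion_tokens for reasoning-style models."""
    model = payload.get("model", "")
    if model.startswith(REASONING_PREFIXES) and "max_tokens" in payload:
        payload["max_completion_tokens"] = payload.pop("max_tokens")
    return payload
```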

Author
Owner

@17jmumford commented on GitHub (Aug 8, 2025):

https://github.com/open-webui/open-webui/pull/16397
Merged!

Author
Owner

@diegoscl commented on GitHub (Feb 4, 2026):

This issue still pops up when you're using a gateway like OpenRouter or Cloudflare AI Gateway.

From what I can tell, it happens because GPT-5 models aren't being recognized as "reasoning models" in this context. The logic at line 767 (https://github.com/open-webui/open-webui/blob/2b26355002064228e9b671339f8f3fb9d1fafa73/backend/open_webui/routers/openai.py#L767) doesn't trigger the switch from max_tokens to max_completion_tokens, because it specifically looks for model IDs that start with "o1", "o3", "o4", or "gpt-5". Since gateways usually format IDs as {provider}/{model}, an ID like openai/gpt-5-nano fails the check.

@17jmumford, what do you think of this?
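One possible direction (a sketch only, with hypothetical helper names, not the project's code) would be to strip any {provider}/ prefix before the startswith check, so gateway IDs like openai/gpt-5-nano still match:

```python
# Illustrative helpers, not the project's code: normalize a gateway-style ID before the prefix check.
def base_model_id(model: str) -> str:
    """Drop a 'provider/' prefix, e.g. 'openai/gpt-5-nano' -> 'gpt-5-nano'."""
    return model.rsplit("/", 1)[-1]

def is_reasoning_model(model: str) -> bool:
    return base_model_id(model).startswith(("o1", "o3", "o4", "gpt-5"))
```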

Author
Owner

@danny70437 commented on GitHub (Feb 4, 2026):

Also with gpt-5-mini: we use lm-proxy to connect to openai.com, and Open WebUI connects to 127.0.0.1:8000, so the URL check (https://github.com/open-webui/open-webui/blob/2b26355002064228e9b671339f8f3fb9d1fafa73/backend/open_webui/routers/openai.py#L896) does not work. Open WebUI sends the wrong parameter, max_tokens, instead of max_completion_tokens.

```
lm-proxy           | 20:59:22 INFO: Querying LLM... params: {'model': 'gpt-5-mini', 'stream': True, 'max_tokens': 0}
lm-proxy           | INFO:     172.19.0.7:41920 - "POST /v1/chat/completions HTTP/1.1" 200 OK
lm-proxy           | 20:59:22 INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
lm-proxy           | 20:59:22 ERROR: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
```
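To illustrate why a URL-keyed gate cannot fire in this setup, here is a guess at the shape of such a check (hypothetical function, not the actual code at the linked line):

```python
# Guess at the shape of the gate, not the actual condition at the linked line: if the swap is
# keyed on the upstream base URL, it never fires when Open WebUI only sees the local proxy.
def should_swap_param(base_url: str, model: str) -> bool:
    talks_to_openai = "api.openai.com" in base_url          # False for http://127.0.0.1:8000
    looks_like_reasoning = model.startswith(("o1", "o3", "o4", "gpt-5"))
    return talks_to_openai and looks_like_reasoning

# With the proxy setup above: should_swap_param("http://127.0.0.1:8000/v1", "gpt-5-mini") is False,
# so max_tokens goes out unchanged and OpenAI rejects the request with the 400 shown in the log.
```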
Reference: github-starred/open-webui#33416