[GH-ISSUE #5600] num_predict slider max value too small. #14046

Closed
opened 2026-04-19 20:32:58 -05:00 by GiteaMirror · 0 comments

Originally created by @JamesClarke7283 on GitHub (Sep 22, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/5600

Bug Report

Installation Method

Docker

Environment

  • Open WebUI Version: v0.3.23

  • Ollama (if applicable): 0.3.10

  • Operating System: ArchLinux

  • Browser (if applicable): Zen 1.0.1-a.2 (Fork of Firefox)

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • N/A: I have included the browser console logs.
  • N/A: I have included the Docker container logs.
  • I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

Being able to set num_predict up to 65536 tokens.

Actual Behavior:

I can only move the slider up to 16000.

Description

Bug Summary:
There are now models that support higher num_predict values, such as 65536, but the slider caps out at 16000.

Reproduction Details

Steps to Reproduce:
Go to the Chat Window or the Model Editor and, under the advanced parameters, try to change num_predict to a number higher than 16000; the slider will not go past that value.
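
For reference, here is a minimal sketch of the kind of clamping behavior the slider appears to impose. The constant names and the exact mechanism are assumptions for illustration, not taken from the Open WebUI source:

```typescript
// Hypothetical sketch of a capped num_predict slider.
// NUM_PREDICT_MAX mirrors the 16000 limit observed in the UI;
// the names and bounds are illustrative, not from the codebase.
const NUM_PREDICT_MIN = 1;
const NUM_PREDICT_MAX = 16000;

function clampNumPredict(value: number): number {
  // Values outside [min, max] snap back to the nearest bound,
  // which is why requesting 65536 ends up as 16000.
  return Math.min(Math.max(value, NUM_PREDICT_MIN), NUM_PREDICT_MAX);
}

console.log(clampNumPredict(65536)); // -> 16000
console.log(clampNumPredict(8192));  // -> 8192
```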

Logs and Screenshots

N/A

Additional Information

This is the highest number I have seen (where num_predict could fill the whole context window):
https://openrouter.ai/models/meta-llama/llama-3.1-405b-instruct/providers

This is the more likely case:
https://openrouter.ai/models/openai/o1-mini/providers

Soon, seeing a max output token limit of 65536 will be commonplace.

Even commonly used models go slightly over the 16000 limit, for example 16384:
https://platform.openai.com/docs/models/gpt-4o

Conclusion

I think an upper limit of 65536 or 128000 would be good (the former is conservative and covers most of today's models; the latter covers edge cases like Llama 3.1, as well as future models).
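
Until the slider cap is raised, larger values can be sent to Ollama directly, since the backend itself accepts them. A minimal workaround sketch, assuming a local Ollama instance at the default port (the model name is illustrative):

```typescript
// Workaround sketch: set num_predict past the UI cap by calling
// Ollama's REST API directly. Assumes Ollama is reachable at its
// default port; the model name and prompt are illustrative.
async function main(): Promise<void> {
  const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.1",
      prompt: "Write a long-form summary of your training data cutoff.",
      stream: false,
      options: { num_predict: 65536 }, // beyond the 16000 slider limit
    }),
  });
  const data = await response.json();
  console.log(data.response);
}

main().catch(console.error);
```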


Reference: github-starred/open-webui#14046