[GH-ISSUE #1327] Dynamic Behavior of Maximum Tokens (max_tokens) in Claude 3 models using LiteLLM
Originally created by @EnzoAndree on GitHub (Mar 27, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1327
Bug Report
Description
Bug Summary:
The maximum output token count appears to decrease dynamically, even though Claude 3 models support a context window of 200,000 tokens and the max_tokens (Max output) parameter is set to 4096.
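For context, this is roughly how the parameter in question is passed on a direct LiteLLM call (a minimal sketch, not from the original report; the model id and prompt are illustrative, and an ANTHROPIC_API_KEY is assumed to be set in the environment):

```python
import litellm

# Every request should carry the configured value, max_tokens=4096;
# the report below is that this number shrinks as the chat grows.
response = litellm.completion(
    model="claude-3-opus-20240229",  # illustrative Claude 3 model id
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=4096,
)
print(response.choices[0].message.content)
```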
Steps to Reproduce:
- Observe the max_tokens value, which should be 4096 in the log.
- Note that the max_tokens value decreases with each subsequent message.

Expected Behavior:
The max_tokens parameter should remain constant at 4096 throughout the conversation and should only be adjusted once the maximum context length (200,000 tokens) is reached.

Actual Behavior:
The max_tokens value decreases as the conversation progresses, even though the context window for Claude 3 models is 200,000 tokens.

Environment
Reproduction Details
Confirmation:
Logs and Screenshots
Installation Method
Manual installation
@tjbck commented on GitHub (Mar 27, 2024):
This might be a LiteLLM issue, so you might want to try testing LiteLLM in isolation. Keep us updated!
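Following up on that suggestion, one minimal way to exercise LiteLLM in isolation (a sketch, not from the original thread; it assumes the litellm package is installed and ANTHROPIC_API_KEY is set) is to grow a conversation turn by turn while always passing max_tokens=4096. If the value stays fixed here, any shrinking number seen in Open WebUI's logs would have to originate in the caller rather than in LiteLLM itself:

```python
import litellm

messages = []
for turn in range(1, 4):
    messages.append({"role": "user", "content": f"Test message {turn}"})
    # max_tokens is pinned at 4096 on every request, matching the
    # behavior the report expects from Open WebUI.
    response = litellm.completion(
        model="claude-3-opus-20240229",  # illustrative Claude 3 model id
        messages=messages,
        max_tokens=4096,
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    print(f"turn {turn}: request sent with max_tokens=4096")
```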