[GH-ISSUE #23606] feat: Option to auto truncate chat context when size doesn't fit model (not Ollama) #35556
Originally created by @xNissX233 on GitHub (Apr 11, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23606
Problem Description
When a chat's context grows too large, Open WebUI displays an error because the accumulated context exceeds the size configured in llama.cpp.
Ollama handles this situation itself, but when a different backend is used there is no context-size setting that would let Open WebUI handle it.
This problem is highly relevant: anyone who uses an LLM long enough in the same chat will hit it eventually, and the only workarounds are starting a new chat, manually summarizing, or relying on knowledge/notes tools or external plugins, none of which is ideal.
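For contrast, this is roughly what the Ollama path already offers: a num_ctx option can be passed per request, and Ollama sizes the context window itself. A minimal sketch of such a request (the URL and model name are placeholders, not anything mandated by Open WebUI):

```python
import requests

# Ollama accepts a per-request num_ctx option and manages the context
# window on its own; this is the behavior the issue contrasts against.
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello"}],
        "options": {"num_ctx": 8192},  # Ollama fits the prompt to this window
        "stream": False,
    },
)
print(response.json()["message"]["content"])
```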
Desired Solution
Adding a "max_ctx" parameter that copies the feature of "num_ctx" for Ollama, but handled by Open-WebUI.
This would simply send to llama.cpp as much context as specified, discarding the older messages that wouldn't fit.
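A minimal sketch of what such truncation could look like on the Open WebUI side. Everything here is illustrative: max_ctx is the proposed parameter, and count_tokens and truncate_messages are hypothetical helpers, not existing Open WebUI APIs:

```python
def truncate_messages(messages, max_ctx, count_tokens):
    """Drop the oldest non-system messages until the total fits max_ctx.

    messages: list of {"role": ..., "content": ...} dicts
    max_ctx: token budget to stay under (the proposed parameter)
    count_tokens: callable returning the token count of a string
                  (hypothetical; a real implementation would use the
                  model's own tokenizer)
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    # Discard the oldest user/assistant turns until the budget is met.
    while rest and total(system + rest) > max_ctx:
        rest.pop(0)
    return system + rest
```

Keeping the system prompt while dropping the oldest user/assistant turns first preserves the model's instructions and frees room for the most recent context, which is what the "discard the older messages" behavior described above implies.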
Alternatives Considered
No response
Additional Context
No response