issue: preserve context in multi-turn chat #5910
Originally created by @dariko on GitHub (Jul 30, 2025).
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v143
Ollama Version (if applicable)
No response
Operating System
debian 12
Browser (if applicable)
firefox 135
Confirmation
I have read and followed all instructions in README.md.
Expected Behavior
When running a chat with multiple turns, all of the context from the first request should be sent in subsequent requests.
Actual Behavior
When running a chat with multiple turns, the system prompt and the files attached to the first prompt are removed from the prompt when submitting subsequent requests.
Steps to Reproduce
1. Create /tmp/file.txt containing a string ("the cat is on the table")
2. Start a new chat, attach /tmp/file.txt, write "first prompt", ctrl+enter
3. Write "second prompt" (ctrl+enter)
Logs & Screenshots
chat screenshot:
prompts received by llama-server (dumped using /slots):
first prompt:
second prompt:
container log
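For anyone trying to repeat the check, a minimal sketch of how such a /slots dump can be collected. This assumes llama-server is reachable at the address below and exposes the /slots endpoint with a prompt field in each slot object; the exact schema varies between llama.cpp versions.

```python
import json
import urllib.request

# Assumed llama-server address; adjust host/port to your setup.
SLOTS_URL = "http://localhost:8080/slots"

def dump_slot_prompts(url: str = SLOTS_URL) -> None:
    """Fetch the slot state from llama-server and print each slot's prompt."""
    with urllib.request.urlopen(url) as resp:
        slots = json.load(resp)
    for slot in slots:
        # "prompt" is the field seen in the dumps attached to this issue;
        # other llama.cpp builds may name or gate it differently.
        print(f"--- slot {slot.get('id')} ---")
        print(slot.get("prompt", "<no prompt field>"))

if __name__ == "__main__":
    dump_slot_prompts()
```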
Additional Information
Having the <system prompt/files> removed from subsequent requests makes the model/server:
I honestly don't know if this is intended behaviour, but it makes using open-webui a lot slower with local models (and limited resources/preprocessing speed).
edit: reword/reformat
@tjbck commented on GitHub (Jul 30, 2025):
This has to do with model context length, related: https://docs.openwebui.com/troubleshooting/rag
@rgaricano commented on GitHub (Jul 30, 2025):
For local models (Ollama), you can also set the num_keep advanced parameter to always keep the first x tokens of the conversation in context.
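As an illustration of what that parameter looks like on the wire, a sketch of an Ollama /api/chat request with num_keep set; the model name and the value 256 are only examples.

```python
import json
import urllib.request

# Example Ollama chat request that pins the start of the conversation.
# "num_keep" tells the server how many tokens from the beginning of the
# prompt to retain when the context window overflows.
payload = {
    "model": "llama3",             # example model name
    "messages": [
        {"role": "system", "content": "You answer using the attached file."},
        {"role": "user", "content": "second prompt"},
    ],
    "options": {"num_keep": 256},  # keep the first 256 tokens in context
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/chat",  # default Ollama address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["message"]["content"])
```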
@dariko commented on GitHub (Jul 30, 2025):
thank you for pointing me to "model context length"!
I raised Chat Controls -> num_keep (Ollama) (which was 24) to a number greater than the token count and now the behavior is what I expected.
I'm a little confused about why this parameter has Ollama in its name, if it is also applied to openai-compatible endpoints.
@rgaricano commented on GitHub (Jul 30, 2025):
Maybe because some OpenAI-API-compatible endpoints don't support it and could return errors (while others may accept it, e.g. Ollama's endpoint exposed as OpenAI-API-compatible).
If I'm not wrong, the officially accepted OpenAI API params are:
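For illustration only, a sketch of the kind of parameters OpenAI-compatible chat completion endpoints commonly accept, next to Ollama-style options such as num_keep; this is an illustrative subset, not an authoritative list.

```python
# Widely supported OpenAI chat completion parameters (illustrative subset).
openai_safe_params = {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 512,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "stop": ["###"],
    "seed": 42,
}

# Ollama-style options are not part of the OpenAI spec; a strict
# OpenAI-compatible server may ignore them or reject the request.
ollama_only_params = {"num_keep": 256, "num_ctx": 8192}

payload = {
    "model": "my-model",  # example model name
    "messages": [{"role": "user", "content": "hello"}],
    **openai_safe_params,
}
```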
@dariko commented on GitHub (Jul 30, 2025):
If I understand correctly, num_keep (ollama) is not sent to the model but used locally when "generating" the prompt sent to the model.
I think so because I do not see the default value (24) in the llama-server /slots dumps I collected when opening this issue.
slots_first_prompt.json
slots_second_prompt.json
edit: add dumps grep
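A minimal sketch of the kind of check described above, assuming the attached dumps are saved locally as plain JSON files; the key searched for is simply "num_keep".

```python
import json
from pathlib import Path

def contains_key(obj, key):
    """Recursively check whether a JSON structure contains the given key."""
    if isinstance(obj, dict):
        return key in obj or any(contains_key(v, key) for v in obj.values())
    if isinstance(obj, list):
        return any(contains_key(v, key) for v in obj)
    return False

# Assumed local copies of the dumps attached to this issue.
for name in ("slots_first_prompt.json", "slots_second_prompt.json"):
    data = json.loads(Path(name).read_text())
    print(name, "contains num_keep:", contains_key(data, "num_keep"))
```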
@rgaricano commented on GitHub (Jul 30, 2025):
openAI advanced params mapping:
b8da4a8cd8/backend/open_webui/utils/payload.py (L102-L114)
ollama's advanced params mapping:
b8da4a8cd8/backend/open_webui/utils/payload.py (L148-L176)
But with OpenAI-API-compatible endpoints you can also add these as custom params if you need them.
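To illustrate the kind of mapping those payload.py lines perform, a rough sketch; this is not the actual Open WebUI code, and the function name and parameter lists are simplified assumptions.

```python
# Sketch: advanced params are copied onto the outgoing request body.
# OpenAI-style params go to the top level of the payload, while
# Ollama-specific params are nested under "options". Parameter sets are
# illustrative, not the exact ones in backend/open_webui/utils/payload.py.

OPENAI_PARAMS = {"temperature", "top_p", "max_tokens", "frequency_penalty",
                 "presence_penalty", "seed", "stop"}
OLLAMA_PARAMS = {"num_keep", "num_ctx", "num_predict", "repeat_penalty",
                 "temperature", "top_p", "seed", "stop"}

def apply_params(body: dict, params: dict, *, ollama: bool) -> dict:
    """Copy recognized advanced params from model settings into the body."""
    allowed = OLLAMA_PARAMS if ollama else OPENAI_PARAMS
    selected = {k: v for k, v in params.items() if k in allowed and v is not None}
    if ollama:
        body.setdefault("options", {}).update(selected)
    else:
        body.update(selected)
    return body

# Example: num_keep survives the Ollama mapping but not the OpenAI one.
params = {"temperature": 0.7, "num_keep": 256}
print(apply_params({"model": "m"}, params, ollama=True))
print(apply_params({"model": "m"}, params, ollama=False))
```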