Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-07 03:18:23 -05:00)
[GH-ISSUE #12058] issue: Cached tokens out of nowhere #55120
Originally created by @davidvpe on GitHub (Mar 25, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/12058
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.5.20
Ollama Version (if applicable)
No response
Operating System
Ubuntu 24.04
Browser (if applicable)
Brave
Confirmation
Expected Behavior
When sending a message saying "hello" for the first time in a new chat, token usage should be at the minimum.
Actual Behavior
The token counts are really high, and some tokens are already reported as cached, but nothing unusual appears in the logs.
Steps to Reproduce
I am using Open WebUI with LiteLLM.
Logs & Screenshots
These are the request/response/metadata captured from LiteLLM. I can see from the browser logs that the request doesn't contain any history, so I am a bit puzzled.
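As a rough sanity check on the numbers above, a back-of-the-envelope estimate of the prompt size can establish what "minimal" should look like. This is only a sketch using the common ~4-characters-per-token heuristic; the model name is a placeholder, not taken from the report:

```python
import json

def estimate_tokens(messages):
    """Very rough estimate: ~4 characters per token,
    plus a few tokens of per-message chat overhead."""
    text = "".join(m["role"] + m["content"] for m in messages)
    return len(text) // 4 + 4 * len(messages)

# The minimal first-turn payload described in the report: a single "hello".
messages = [{"role": "user", "content": "hello"}]
payload = {"model": "some-model", "messages": messages}  # hypothetical model name

print(json.dumps(payload))
print("estimated prompt tokens:", estimate_tokens(messages))
```

If the prompt-token count LiteLLM reports is orders of magnitude above an estimate like this, the extra tokens must be coming from somewhere other than the visible message.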
Request:
Response:
Metadata
Additional Information
Just to mention that I am using LiteLLM from other clients as well (Telegram bots) and I don't see the same issue there, so I think Open WebUI is doing something behind the curtains to cache some tokens and reuse them within the same conversation, because it sometimes randomly mentions things I said in different chats.
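One way to confirm from the client side whether history from other chats leaks into a request is to capture the request bodies from the browser's network tab and diff their `messages` arrays. A minimal sketch, assuming two saved JSON request bodies (the inline strings below stand in for real captures; they are illustrative, not from the report):

```python
import json

def messages_of(raw: str):
    """Extract the messages array from a captured request body."""
    return json.loads(raw)["messages"]

# Hypothetical captures: one from a known-clean client, one from Open WebUI.
clean_request  = '{"model": "some-model", "messages": [{"role": "user", "content": "hello"}]}'
webui_request  = '{"model": "some-model", "messages": [{"role": "user", "content": "hello"}]}'

# Any message present in the suspect request but not in the clean one.
extra = [m for m in messages_of(webui_request) if m not in messages_of(clean_request)]
print("unexpected extra messages:", extra)
```

If `extra` stays empty for a fresh chat, the inflated count is being introduced after the browser request, i.e. server-side (Open WebUI backend or LiteLLM), which narrows down where to look.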