mirror of
https://github.com/open-webui/open-webui.git
synced 2026-06-03 23:38:13 -05:00
[GH-ISSUE #573] feat: token counting according to model's context size #12126
Originally created by @peperunas on GitHub (Jan 25, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/573
Is your feature request related to a problem? Please describe.
I believe a crucial feature would be to show how much of the model's context size a conversation has used. That way, the user knows how much longer the chat can continue.
As of now, it seems there is no "official" way to pull a model's context size from Ollama via its API. The issue has been raised and tracked here 1.
I guess this issue can track the advancement of the API endpoint on Ollama's side as well.
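For reference, a minimal sketch of how a client might read a context size once a backend exposes it. It assumes a payload shaped like the `model_info` object of Ollama's `/api/show` response, where the context length sits under an architecture-prefixed key such as `llama.context_length`. The helper name `context_length_from_model_info` and the sample payload are hypothetical and only illustrate the lookup.

```python
from typing import Any, Optional


def context_length_from_model_info(model_info: dict[str, Any]) -> Optional[int]:
    """Return the first '<arch>.context_length' value found, or None."""
    for key, value in model_info.items():
        # Keys are assumed to be prefixed with the model architecture,
        # e.g. "llama.context_length".
        if key.endswith(".context_length") and isinstance(value, int):
            return value
    return None


# Hypothetical payload for illustration, not captured from a real server.
sample_model_info = {
    "general.architecture": "llama",
    "llama.context_length": 8192,
    "llama.embedding_length": 4096,
}

print(context_length_from_model_info(sample_model_info))
```

Returning `None` when no matching key exists lets a UI hide the indicator instead of showing a made-up limit.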
@jukofyork commented on GitHub (Feb 3, 2024):
You can sort of see this by clicking on the little "info" icon, I think, but I agree it would be nice to have a clearer representation.
@justinh-rahb commented on GitHub (Feb 19, 2024):
I'm not sure you've got this in the right place. This seems like a suggestion to make to the Ollama project, not here.
@scscgit commented on GitHub (Aug 7, 2024):
Cross-linking with issue https://github.com/open-webui/open-webui/discussions/4246#discussioncomment-10264834 which has been sadly turned into a discussion
@evolutioned commented on GitHub (Jul 22, 2025):
I would suggest removing this as a "Good First Issue", as it requires a deep understanding of how to work with Ollama in the backend. It's a complex problem for a first-timer.
@Classic298 commented on GitHub (Dec 18, 2025):
@silentoplayz can this be closed? there are filters on openwebui.com which visually show the context window and how much has been used
@scscgit commented on GitHub (Dec 18, 2025):
Can you post a screenshot of this feature please (just for a reference)? I couldn't find it in the release notes or docs (and Docker update requires a lot of effort).
@Classic298 commented on GitHub (Dec 18, 2025):
@scscgit there are Filters on openwebui.com built by the community that do exactly this, search the community there for filters that do that. They will visually show the context window
@silentoplayz commented on GitHub (Dec 18, 2025):
With that having been said, while community filters can visually display a token count, their accuracy is fundamentally limited by how they are coded by their authors.
Tokenization is not a "one-size-fits-all" process. Different models (e.g., Llama 3, GPT-4, Claude, Mistral) often use different tokenizers and vocabularies. A filter that simply counts words or uses a generic tokenizer (like `tiktoken` with a standard encoding) will likely drift significantly from the actual context usage seen by the model, especially for non-English languages or code.
Without an official API endpoint from Ollama (or the respective provider) that exposes the exact tokenizer or returns the authoritative token count for a given prompt, any client-side counter is just an estimation that can be misleading.
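Where a backend does report token counts after a generation, a display can use those numbers instead of a client-side estimate. A minimal sketch, assuming the response carries `prompt_eval_count` and `eval_count` fields (the names Ollama uses in its final chat and generate responses) and that the context size `num_ctx` is already known. The helper `usage_from_response`, the `ContextUsage` class, and the sample numbers are hypothetical, and the counts only become available after the model has answered.

```python
from dataclasses import dataclass


@dataclass
class ContextUsage:
    used: int   # tokens consumed by prompt plus generated output
    limit: int  # context size the model was run with

    @property
    def remaining(self) -> int:
        return max(self.limit - self.used, 0)

    @property
    def fraction(self) -> float:
        # Clamp to 1.0 so an overflowing chat still renders as "full".
        return min(self.used / self.limit, 1.0) if self.limit > 0 else 0.0


def usage_from_response(response: dict, num_ctx: int) -> ContextUsage:
    # Missing fields are treated as zero rather than guessed.
    prompt_tokens = int(response.get("prompt_eval_count", 0))
    output_tokens = int(response.get("eval_count", 0))
    return ContextUsage(used=prompt_tokens + output_tokens, limit=num_ctx)


# Hypothetical numbers for illustration only.
sample_response = {"prompt_eval_count": 1500, "eval_count": 548}
usage = usage_from_response(sample_response, num_ctx=8192)
print(f"{usage.used}/{usage.limit} tokens used ({usage.fraction:.0%}), {usage.remaining} left")
```

This sidesteps the tokenizer mismatch because the counts come from the model's own tokenizer, but it cannot predict how many tokens a prompt will use before it is sent.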
@silentoplayz commented on GitHub (Dec 18, 2025):
@scscgit This screenshot displays the `Chat Metrics Advanced` filter function, a fork of the original `Chat Metrics` filter. It reuses a lot of the original statistics you'd find in the model generation info icon when the mouse is hovered above it, but has Valve toggles for more options too.
https://openwebui.com/posts/586c67cd-4115-46be-8c69-21abaabf8c01