[GH-ISSUE #573] feat: token counting according to model's context size #12126

Closed
opened 2026-04-19 18:55:48 -05:00 by GiteaMirror · 9 comments
Owner

Originally created by @peperunas on GitHub (Jan 25, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/573

Is your feature request related to a problem? Please describe.

I believe a crucial feature would be showing how much of the model's context size the conversation has used, so that the user knows how much longer the chat can continue.

As of now, there seems to be no "official" way to pull a model's context size from Ollama via its API. The issue has been raised and is tracked here [1].

I guess this issue can track the advancement of the API endpoint on Ollama's side as well.
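For reference, a minimal sketch of how a client could read the context size once such an endpoint exists. It assumes the endpoint returns model metadata as a flat mapping containing a key of the form `<architecture>.context_length` (GGUF-style naming). The function name `find_context_length` and the sample dictionary are hypothetical, not a confirmed API.

```python
from typing import Any, Mapping, Optional


def find_context_length(model_info: Mapping[str, Any]) -> Optional[int]:
    """Return the first '<arch>.context_length' value found, or None.

    Assumption: metadata keys follow GGUF-style naming such as
    'llama.context_length'. This is an illustration only.
    """
    for key, value in model_info.items():
        if key.endswith(".context_length") and isinstance(value, int):
            return value
    return None


# Hypothetical metadata, shaped like GGUF key/value pairs.
sample_info = {"general.architecture": "llama", "llama.context_length": 8192}
print(find_context_length(sample_info))
```

Matching on the key suffix rather than a fixed key keeps the lookup independent of the model architecture.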

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

[1]: https://github.com/ollama/ollama/issues/1473
GiteaMirror added the enhancement, help wanted, non-core labels 2026-04-19 18:55:48 -05:00

@jukofyork commented on GitHub (Feb 3, 2024):

You can sort of see this by clicking on the little "info" icon I think but agree it would be nice to have a clearer representation.


@justinh-rahb commented on GitHub (Feb 19, 2024):

> should be a way to set "num_ctx" like when using ollama in code
>
> Example of this:
> ollamaLLM1 = Ollama(model="mistral:7b", temperature=0.9, num_ctx=8192)
>
> It is already an option to set the "temperature", should be doable to add the "num_ctx" for manual context length.
> or even automatic if that information can be pulled from the ollama api, but to my knowledge there is no way to do that, so it just defaults to 2048.

I'm not sure you've got this in the right place. This seems like a suggestion to make to the Ollama project, not here.
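For context, Ollama's REST API accepts `num_ctx` per request inside the `options` object, so a front end could forward a user-chosen value. A rough sketch of what such a request might look like; the helper name `build_generate_request` and the default base URL are assumptions for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "mistral:7b",
    num_ctx: int = 8192,
    base_url: str = "http://localhost:11434",  # assumed local Ollama address
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/generate with a custom num_ctx."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Per-request model options; num_ctx sets the context window size.
        "options": {"temperature": 0.9, "num_ctx": num_ctx},
    }
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("Why is the sky blue?")
print(request.full_url)
print(json.loads(request.data)["options"])
# Sending it requires a running Ollama server:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["response"])
```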


@scscgit commented on GitHub (Aug 7, 2024):

Cross-linking with issue https://github.com/open-webui/open-webui/discussions/4246#discussioncomment-10264834 which has been sadly turned into a discussion


@evolutioned commented on GitHub (Jul 22, 2025):

I would suggest removing this as a "Good First Issue", as it requires a deep understanding of how to work with Ollama in the backend. It's a complex problem for a first-timer.


@Classic298 commented on GitHub (Dec 18, 2025):

@silentoplayz can this be closed? there are filters on openwebui.com which visually show the context window and how much has been used


@scscgit commented on GitHub (Dec 18, 2025):

> @silentoplayz can this be closed? there are filters on openwebui.com which visually show the context window and how much has been used

Can you post a screenshot of this feature please (just for a reference)? I couldn't find it in the release notes or docs (and Docker update requires a lot of effort).


@Classic298 commented on GitHub (Dec 18, 2025):

@scscgit There are Filters on openwebui.com, built by the community, that do exactly this; search the community section there for them. They visually show the context window.


@silentoplayz commented on GitHub (Dec 18, 2025):

With that having been said, while community filters can visually display a token count, their accuracy is fundamentally limited by how they are coded by their authors.

Tokenization is not a "one-size-fits-all" process. Different models (e.g., Llama 3, GPT-4, Claude, Mistral) often use different tokenizers and vocabularies. A filter that simply counts words or uses a generic tokenizer (like tiktoken with a standard encoding) will likely drift significantly from the actual context usage seen by the model, especially for non-English languages or code.

Without an official API endpoint from Ollama (or the respective provider) that exposes the exact tokenizer or returns the authoritative token count for a given prompt, any client-side counter is just an estimation that can be misleading.
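To illustrate the gap, here is a small sketch contrasting a crude client-side estimate with counts reported by the server after the fact. It assumes the final response of a generate or chat call carries `prompt_eval_count` and `eval_count`, as Ollama responses do; the helper names, the 4-characters-per-token ratio, and the sample response are hypothetical.

```python
from typing import Any, Mapping


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude client-side guess; the ratio is a rule of thumb, not a tokenizer."""
    return max(1, round(len(text) / chars_per_token))


def context_usage(response: Mapping[str, Any], num_ctx: int) -> float:
    """Fraction of the context window used, based on server-reported counts.

    Missing fields are treated as 0, so the result is a lower bound.
    """
    used = response.get("prompt_eval_count", 0) + response.get("eval_count", 0)
    return used / num_ctx


# Hypothetical final response from a non-streaming call.
fake_response = {"prompt_eval_count": 1200, "eval_count": 336}
print(f"{context_usage(fake_response, num_ctx=2048):.0%}")
print(estimate_tokens("Tokenization is not one-size-fits-all."))
```

The server-reported counts only become available once a turn has completed, so they describe past usage rather than predicting the cost of the next prompt.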


@silentoplayz commented on GitHub (Dec 18, 2025):

@scscgit This screenshot displays the Chat Metrics Advanced filter function, a fork of the original Chat Metrics filter. It reuses a lot of the original statistics you'd find in the model generation info icon when the mouse is hovered above it, but has Valve toggles for more options too.

https://openwebui.com/posts/586c67cd-4115-46be-8c69-21abaabf8c01

[Screenshot: Chat Metrics Advanced filter]

Reference: github-starred/open-webui#12126