[GH-ISSUE #11958] issue: Context Length completely ignored #31945

Closed
opened 2026-04-25 05:50:11 -05:00 by GiteaMirror · 1 comment

Originally created by @frenzybiscuit on GitHub (Mar 22, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11958

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

latest

Ollama Version (if applicable)

No response

Operating System

Debian 12

Browser (if applicable)

Firefox

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

When using TabbyAPI as an OpenAI-compatible backend, the context length limit set in the admin model settings is ignored.

I have confirmed with other people using open-webui that this happens.

*THIS IS NOT MARKED AS AN OLLAMA-ONLY FEATURE

Actual Behavior

Set context length to 50. Send chat. Watch as tokens go over 50 and continue.

Image: https://github.com/user-attachments/assets/d7be4d6d-1bc5-441f-9415-d83c6af5d558

Tabby log:
2025-03-22 11:29:37.899 INFO: Finished chat completion streaming request 52c1a4f4badd475c8067daab9971af49
2025-03-22 11:29:37.900 INFO: Metrics (ID: 52c1a4f4badd475c8067daab9971af49): 89 tokens generated in 3.7 seconds (Queue: 0.0 s, Process: 95 cached tokens and 17 new tokens at 109.06 T/s, Generate: 25.09 T/s, Context: 112 tokens)
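
For context on why the limit cannot reach the server: a minimal sketch of the request shape involved, assuming a TabbyAPI instance at http://localhost:5000 (the URL, API key, and model name are placeholders, not values from this report). The OpenAI chat-completions payload has a cap for generated tokens but no field carrying a total context-length limit, which is consistent with Tabby reporting 112 context tokens above.

import requests

payload = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Write a long story."}],
    "max_tokens": 200,  # caps *generated* tokens only
    # Note: the OpenAI chat-completions schema has no field for
    # total context length, so a 50-token context limit configured
    # in Open WebUI cannot be expressed in this payload.
}

r = requests.post(
    "http://localhost:5000/v1/chat/completions",   # placeholder URL
    headers={"Authorization": "Bearer YOUR_KEY"},  # placeholder key
    json=payload,
    timeout=60,
)
print(r.json()["usage"])  # prompt/completion token counts as reported by the server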

Steps to Reproduce

Read the above

Logs & Screenshots

No logs

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 05:50:11 -05:00

@tjbck commented on GitHub (Mar 23, 2025):

e5b7188379553b52436776af8ed85fa7b77fcc2f

Marked as Ollama Only in dev.
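
For readers hitting the same issue: the setting is Ollama-only because context length is a per-request option in Ollama's native API but has no counterpart in the OpenAI chat-completions schema. A hedged sketch of the two request shapes (illustrative payloads, not code from the Open WebUI source; the Ollama fields follow its documented /api/chat options):

# Ollama native API: context window is a per-request option.
ollama_payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "hi"}],
    "options": {
        "num_ctx": 50,       # context length -- honored by Ollama
        "num_predict": 128,  # generation cap
    },
}

# OpenAI-compatible API (what TabbyAPI exposes): only a generation
# cap exists; there is no context-length field to forward.
openai_payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "hi"}],
    "max_tokens": 128,
}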
