[GH-ISSUE #11958] issue: Context Length completely ignored #31945

Closed
opened 2026-04-25 05:50:11 -05:00 by GiteaMirror · 1 comment

Originally created by @frenzybiscuit on GitHub (Mar 22, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11958

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

latest

Ollama Version (if applicable)

No response

Operating System

Debian 12

Browser (if applicable)

Firefox

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

When using TabbyAPI as an OpenAI-compatible backend, the context length limit set in the admin model settings is ignored.

I have confirmed with other people using open-webui that this happens.

*THIS IS NOT MARKED AS AN OLLAMA-ONLY FEATURE

Actual Behavior

Set context length to 50. Send chat. Watch as tokens go over 50 and continue.

Image: https://github.com/user-attachments/assets/d7be4d6d-1bc5-441f-9415-d83c6af5d558

Tabby log:
2025-03-22 11:29:37.899 INFO: Finished chat completion streaming request 52c1a4f4badd475c8067daab9971af49
2025-03-22 11:29:37.900 INFO: Metrics (ID: 52c1a4f4badd475c8067daab9971af49): 89 tokens generated in 3.7 seconds (Queue: 0.0 s, Process: 95 cached tokens and 17 new tokens at 109.06 T/s, Generate: 25.09 T/s, Context: 112 tokens)
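
For context on why the limit cannot reach the server: a minimal sketch of the request shape involved, assuming a TabbyAPI instance at http://localhost:5000 (the URL, API key, and model name are placeholders, not values from this report). The OpenAI chat-completions payload has a cap for generated tokens but no field carrying a total context-length limit, which is consistent with Tabby reporting 112 context tokens above.

import requests

payload = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Write a long story."}],
    "max_tokens": 200,  # caps *generated* tokens only
    # Note: the OpenAI chat-completions schema has no field for
    # total context length, so a 50-token context limit configured
    # in Open WebUI cannot be expressed in this payload.
}

r = requests.post(
    "http://localhost:5000/v1/chat/completions",   # placeholder URL
    headers={"Authorization": "Bearer YOUR_KEY"},  # placeholder key
    json=payload,
    timeout=60,
)
print(r.json()["usage"])  # prompt/completion token counts as reported by the server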

Steps to Reproduce

Read the above

Logs & Screenshots

No logs

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 05:50:11 -05:00

@tjbck commented on GitHub (Mar 23, 2025):

e5b7188379553b52436776af8ed85fa7b77fcc2f

Marked as Ollama Only in dev.
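
For readers hitting the same issue: the setting is Ollama-only because context length is a per-request option in Ollama's native API but has no counterpart in the OpenAI chat-completions schema. A hedged sketch of the two request shapes (illustrative payloads, not code from the Open WebUI source; the Ollama fields follow its documented /api/chat options):

# Ollama native API: context window is a per-request option.
ollama_payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "hi"}],
    "options": {
        "num_ctx": 50,       # context length -- honored by Ollama
        "num_predict": 128,  # generation cap
    },
}

# OpenAI-compatible API (what TabbyAPI exposes): only a generation
# cap exists; there is no context-length field to forward.
openai_payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "hi"}],
    "max_tokens": 128,
}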
