mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 11:28:35 -05:00
[GH-ISSUE #18397] issue: 0.6.33 problem with long context (128k) #18584
Originally created by @batot1 on GitHub (Oct 17, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/18397
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.6.33
Ollama Version (if applicable)
0.12.5
Operating System
Debian 12 (all updates)
Browser (if applicable)
FF 143.0.4 (64-bit)
Confirmation
Expected Behavior
BUG: long context (128k).
After roughly the second question, the model never stops answering and repeats the same thing over and over in an infinite loop.
How to reproduce:
/ollama pull qwen3-coder:30b
/set parameter num_ctx 131072
/save qwen3-coder-128k
Now when I ask it 2-3 queries in the chat window, it gets stuck in a loop and never stops answering.
When I do the same directly in the ollama window, everything works properly.
Actual Behavior
BUG: long context (128k).
After roughly the second question, the model never stops answering and repeats the same thing over and over in an infinite loop.
How to reproduce:
/ollama pull qwen3-coder:30b
/set parameter num_ctx 131072
/save qwen3-coder-128k
Now when I ask it 2-3 queries in the chat window, it gets stuck in a loop and never stops answering.
When I do the same directly in the ollama window, everything works properly.
Steps to Reproduce
How to reproduce:
/ollama pull qwen3-coder:30b
/set parameter num_ctx 131072
/save qwen3-coder-128k
Now when I ask it 2-3 queries in a chat window in Open WebUI, it gets stuck in a loop and never stops answering.
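For reference, the interactive `/set` + `/save` steps above can also be expressed non-interactively with a Modelfile. This is a sketch only, assuming a standard ollama installation; the model and file names mirror the ones used in this report:

```shell
# Sketch: non-interactive equivalent of the interactive repro steps
# (assumes ollama is installed and the 30B model fits on the machine).
ollama pull qwen3-coder:30b

# Modelfile raising the context window to 128k tokens
cat > Modelfile <<'EOF'
FROM qwen3-coder:30b
PARAMETER num_ctx 131072
EOF

# Create the derived model that the reproduction uses
ollama create qwen3-coder-128k -f Modelfile
```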
Logs & Screenshots
In the Open WebUI window:
Response payload is not completed: <TransferEncodingError: 400, message='Not enough data to satisfy transfer length header.'>
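For context on that message: it is the kind of error a streaming HTTP client raises when a chunked response ends before the announced transfer length is satisfied, i.e. the upstream connection is cut mid-stream. A minimal stdlib-only sketch of the same failure mode (illustrative only, not Open WebUI code; the server here is made up to truncate deliberately):

```python
import http.client
import socket
import threading

def truncating_server(sock):
    """Accept one request, then send a chunked response whose chunk
    header promises 16 bytes (0x10) but deliver only 9 before closing,
    simulating an upstream that drops the connection mid-stream."""
    conn, _ = sock.accept()
    conn.recv(1024)  # read and ignore the request
    conn.sendall(
        b"HTTP/1.1 200 OK\r\n"
        b"Transfer-Encoding: chunked\r\n\r\n"
        b"10\r\nonly five"  # header says 16 bytes, body has 9
    )
    conn.close()  # hang up before the chunk completes

sock = socket.socket()
sock.bind(("127.0.0.1", 0))
sock.listen(1)
port = sock.getsockname()[1]
threading.Thread(target=truncating_server, args=(sock,), daemon=True).start()

client = http.client.HTTPConnection("127.0.0.1", port)
client.request("GET", "/")
resp = client.getresponse()
try:
    resp.read()
    outcome = "complete"
except http.client.IncompleteRead:
    outcome = "truncated"  # the client detects the short chunked body
```

The client-side symptom ("Not enough data to satisfy transfer length header") points at the stream being cut off rather than at the model itself, which fits the report that the same prompts work when run directly against ollama.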
Additional Information
In the Docker logs I don't see any warnings or errors, only INFO. Nothing special.
@Classic298 commented on GitHub (Oct 17, 2025):
Do you have the hardware to support a 128k context window?
The reproduction steps are not clear; how do I reproduce this, and what should I do?
@batot1 commented on GitHub (Oct 17, 2025):
I gave you instructions on how to reproduce it:
/ollama pull qwen3-coder:30b
/set parameter num_ctx 131072
/save qwen3-coder-128k
When you save this you get a new model with a long context. It works with ollama, but with Open WebUI it does not work properly.
Yes, I have the hardware to reproduce a 128k context.
$ ollama ps
NAME                         ID            SIZE   PROCESSOR        CONTEXT  UNTIL
qwen3-coder-30b-128k:latest  571f59fefc54  44 GB  48%/52% CPU/GPU  131072   4 minutes from now
I'm guessing that Open WebUI probably won't work with any long-context model, as it seems it isn't handling long contexts correctly.
Why are you closing the ticket without resolving it, or even verifying that the problem exists?
It's 4 simple steps to reproduce.
@Classic298 commented on GitHub (Oct 17, 2025):
You did not provide any sensible steps to reproduce. How do I reproduce these steps inside of Open WebUI?
And yes, Open WebUI CAN handle long context models just fine.
@batot1 commented on GitHub (Oct 17, 2025):
Open WebUI side:
new chat ---> search model (select qwen3-coder-30b-128k:latest)
type "10 rows with question"
---> press Enter
---> press Enter
type another "10 rows with question" (no matter what you write)
---> press Enter
---> press Enter
@silentoplayz commented on GitHub (Oct 17, 2025):
https://www.lenovo.com/us/en/glossary/pebkac/