mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-05 18:38:17 -05:00
[GH-ISSUE #16465] issue: RAG requires multiple regens to work #17916
Originally created by @frenzybiscuit on GitHub (Aug 11, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16465
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.6.21
Ollama Version (if applicable)
No response
Operating System
Debian 12
Browser (if applicable)
No response
Confirmation
Expected Behavior
RAG works on first attempt
Actual Behavior
RAG requires multiple regenerations before it works.
Initial response:
Regeneration:
Also, title generation and prompt follow-ups are busted, as you can see in the screenshot. (Question 1, Question 2, Question 3...)
Steps to Reproduce
Logs & Screenshots
.
Additional Information
.
@frenzybiscuit commented on GitHub (Aug 11, 2025):
Note that I was not having this issue two versions ago.
@onestardao commented on GitHub (Aug 11, 2025):
hey, looks like you’re hitting [Problem No.5], a typical RAG collapse where the system keeps regenerating because the semantic lock failed on the first try.
we’ve actually built a public diagnostic map to solve this one (MIT licensed, not commercial).
it’s fixed at the symbolic layer, not just tweaking rerankers.
if you’re curious, i can share the problem page with worked examples.
you’ll get consistent answers on the first try, no more hoping a regen hits the mark.
just let me know ^_____^
@frenzybiscuit commented on GitHub (Aug 11, 2025):
I changed the task model from the 1.5B model to the current 123B main model and moved the PostgreSQL + pgvector database to a dedicated machine, and that seems to have fixed the problem.
However, I'd really prefer the task model to not be the main model. What size model is needed for this to function correctly?
@onestardao commented on GitHub (Aug 11, 2025):
It matches Problem No.3 in our diagnostic map: a typical RAG collapse where the system keeps regenerating because the semantic lock failed mid-task.
Here’s the public MIT-licensed fix with working examples: WFGY ProblemMap →
You can follow the steps there and you’ll get consistent answers without the looping/regeneration issue.