mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-10 07:43:10 -05:00
issue: v0.6.33 RAG Retrieval Pull All Files for Collection Ignoring top_k #6635
Originally created by @jamesottera on GitHub (Oct 9, 2025).
Installation Method
Docker
Open WebUI Version
v0.6.33
Ollama Version (if applicable)
0.12.3
Operating System
Ubuntu
Browser (if applicable)
Safari Latest
Expected Behavior
During RAG retrieval, top_k should limit the number of files used as sources.
Actual Behavior
After updating to 0.6.33, RAG retrieval no longer uses top_k to rerank / filter the files that can be used as context. Previously, if you set top_k to 5, it would use up to 5 files from a collection as sources.
Now it takes ALL files in the collection. This is a major issue for collections with a large number of files.
In my case, the collection has 695 markdown files. This leads to HUGE context bloat (and cost), and to cases where the context exceeds the model's maximum. It also leads to incorrect answers, because retrieval looks too widely and runs into context limits.
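To make the expected behavior concrete, here is a minimal sketch of what limiting by top_k would look like. It is not Open WebUI's actual retrieval code: the `Chunk` type, the `select_top_k` helper, the one-chunk-per-file setup, and the "higher score means more relevant" convention are all assumptions made for illustration.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    """Hypothetical retrieved chunk; not an Open WebUI type."""
    file: str
    score: float  # assumption: higher score means more relevant


def select_top_k(chunks: list[Chunk], top_k: int) -> list[Chunk]:
    """Keep only the top_k highest-scoring chunks."""
    if top_k <= 0:
        return []
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]


# Simulate a 695-file collection with one scored chunk per file.
chunks = [Chunk(file=f"doc_{i}.md", score=i / 695) for i in range(695)]

selected = select_top_k(chunks, top_k=5)
sources = {c.file for c in selected}  # distinct files passed on as context
```

With top_k set to 5, `sources` can never contain more than 5 files, no matter how large the collection is. The behavior reported here corresponds to the truncation step being skipped, so all 695 files end up as sources.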
Steps to Reproduce
Logs & Screenshots
Logs can be provided. I tried pasting them, but they exceeded the character limit. The relevant log line is:
open_webui.retrieval.utils:get_doc:142 - query_doc:result [[
What follows it is a massive blob referencing all 695 files.
Additional Information
No response
@silentoplayz commented on GitHub (Oct 9, 2025):
Related - https://github.com/open-webui/open-webui/issues/18133