mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-11 16:35:32 -05:00
[GH-ISSUE #10061] Improved chunking options for RAG #54414
Originally created by @subashc2023 on GitHub (Feb 15, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/10061
Feature Request
Right now, the only options seem to be chunk size and overlap. Could we introduce additional chunking options that automatically keep code blocks intact and split Markdown docs by paragraph? The current chunking algorithm destroys code, and the retrieval system does not work very well with this arbitrarily chunked code.
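To illustrate the request, a minimal Markdown-aware chunker could treat fenced code blocks as atomic units and otherwise split on paragraph boundaries. This is only a sketch; the function name and the greedy packing strategy are hypothetical and are not Open WebUI's actual splitter:

```python
import re

def chunk_markdown(text, max_chars=500):
    """Split Markdown into chunks at paragraph boundaries,
    keeping fenced code blocks intact as single units."""
    # Separate fenced code blocks from the surrounding prose.
    parts = re.split(r"(```.*?```)", text, flags=re.DOTALL)
    units = []
    for part in parts:
        if part.startswith("```"):
            units.append(part)  # never split inside a code fence
        else:
            units.extend(p for p in part.split("\n\n") if p.strip())
    # Greedily pack whole units into chunks of at most max_chars.
    chunks, current = [], ""
    for unit in units:
        if current and len(current) + len(unit) + 2 > max_chars:
            chunks.append(current)
            current = unit
        else:
            current = f"{current}\n\n{unit}" if current else unit
    if current:
        chunks.append(current)
    return chunks
```

With this approach a code block is either kept whole in one chunk or emitted as its own chunk, so retrieval never sees half a function.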
@Schwenn2002 commented on GitHub (Feb 15, 2025):
A suggestion for further optimizing the RAG function.
Currently, you can improve the quality of the query results using Top k and reranking.
It would be optimal to first filter the most important documents using Top k = 50 or 70 and reranking (relevance threshold 0.5 to 0.8). An additional parameter would then be useful so that, after reranking, at most the best hits (Top Best = 10 or 20) are passed on to the LLM as context.
This way, the context length can be kept shorter and the response time improved. It also ensures that the context cannot grow larger than what is configured for the LLM.
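The two-stage filter proposed above could be sketched roughly as follows. The function name, the `rerank` callable, and the default parameter values are illustrative assumptions, not Open WebUI's actual API:

```python
def select_context(hits, rerank, min_score=0.5, top_best=10):
    """Two-stage filter: rerank the retrieved hits, drop low scores,
    and cap how many are passed on to the LLM as context."""
    # hits: the top-k retrieval results; rerank: doc -> relevance score
    scored = [(rerank(h), h) for h in hits]
    kept = [(s, h) for s, h in scored if s >= min_score]
    kept.sort(key=lambda sh: sh[0], reverse=True)  # best hits first
    return [h for _, h in kept[:top_best]]
```

The retriever can then stay generous (Top k = 50 or 70) while the LLM still only ever sees the Top Best hits.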
The current behavior is that a context that is too large cannot be processed by the LLM (it is effectively treated as empty). It would therefore make sense to truncate the RAG context after reranking to the LLM's context length, so that only the best hits are passed on.
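A token-budget truncation step of the kind suggested here might look like the sketch below. The helper name and the whitespace token counter are assumptions for illustration; a real implementation would count tokens with the model's own tokenizer:

```python
def fit_to_context(ranked_docs, max_tokens,
                   count_tokens=lambda t: len(t.split())):
    """Truncate a reranked hit list so the combined context never
    exceeds the LLM's context window; best hits are kept first."""
    selected, used = [], 0
    for doc in ranked_docs:  # assumed sorted best-first by rerank score
        cost = count_tokens(doc)
        if used + cost > max_tokens:
            break  # stop before overflowing the window
        selected.append(doc)
        used += cost
    return selected
```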
@Schwenn2002 commented on GitHub (Feb 15, 2025):
Furthermore, I have replaced the integrated ChromaDB with Qdrant. Retrieving information seems to work much better with Qdrant!
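For reference, Open WebUI documents a `VECTOR_DB` environment variable for selecting the vector store, so a swap to Qdrant might look like the snippet below. The variable names are taken from the Open WebUI docs as I recall them; verify them against your installed version:

```shell
# Assumed configuration sketch: point Open WebUI at a Qdrant instance
# instead of the bundled ChromaDB.
export VECTOR_DB=qdrant
export QDRANT_URI=http://localhost:6333
```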
I have applied the model suggested above in the source code, which makes the retrieval much better. I also chose 50% overlap in the RAG settings.
Nevertheless, forming the chunks along semantic boundaries would certainly make the vectors even better for RAG.
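One simple form of such semantic chunking starts a new chunk whenever the embedding similarity between consecutive sentences drops below a threshold. This sketch assumes a caller-supplied `embed` function and is not part of Open WebUI:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def semantic_chunks(sentences, embed, threshold=0.7):
    """Group consecutive sentences into chunks; start a new chunk
    whenever similarity to the previous sentence drops below threshold."""
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) >= threshold:
            current.append(cur)
        else:
            chunks.append(" ".join(current))
            current = [cur]
    chunks.append(" ".join(current))
    return chunks
```

Chunks built this way follow topic shifts in the text rather than a fixed character count, which tends to produce more coherent vectors.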