[GH-ISSUE #12077] feat: Handling of a large number of knowledge base files #16459
Originally created by @setuin on GitHub (Mar 26, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/12077
Problem Description
I always feel frustrated when I have to upload a large number of PDF files to the knowledge base. It wastes a lot of tokens, and I can't handle very many text files either.
Desired Solution
I'm wondering how document processing could be incorporated into the step of creating the knowledge base itself, especially when dealing with a large number of PDF files. What would be the best way to handle this?
Alternatives Considered
What I'm thinking of is interfacing with the vector database. In the knowledge base area, we could use a better embedding model, or set up a more appropriate database to compress the data, reduce token usage, or adopt a better retrieval method. I'm not quite clear on whether this would be handled on the client side or the server side.
Additional Context
No response
@mahenning commented on GitHub (Mar 27, 2025):
I'm not sure what you want here. What do you mean by "token waste" when uploading PDFs?
Do you want a separate vector database and embedding model for each knowledge base? How would that "compress" the data?
I'm even more lost here; please rephrase and explain.
Maybe you can enlighten me with examples, but right now I have no idea what exactly you want, sorry.
@ivanbaldo commented on GitHub (Apr 14, 2025):
Three weeks and no clarification; this should be clarified or closed...
@Mariano215 commented on GitHub (Aug 10, 2025):
I'm having a similar issue. If I upload many documents into a knowledge base, the query ends up blowing past the token limit because, it seems, the results returned from the many documents are not ranked and limited to only the top choices. It appears to send EVERYTHING it finds as a potential match back to the LLM as context, and we get an error.
There should be some functionality that limits the RAG retrieval to only the top choices. Even better, allow for hybrid RAG, not just cosine similarity?
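For illustration, here is a minimal sketch of the kind of top-k limiting described above: score every candidate chunk against the query embedding by cosine similarity and keep only the best matches instead of forwarding everything. The function and variable names are hypothetical, not Open WebUI's actual API.

```python
import numpy as np

def top_k_chunks(query_emb: np.ndarray, chunk_embs: np.ndarray,
                 chunks: list[str], k: int = 5) -> list[str]:
    """Return only the k chunks most similar to the query embedding."""
    # Cosine similarity between the query and every candidate chunk.
    sims = chunk_embs @ query_emb / (
        np.linalg.norm(chunk_embs, axis=1) * np.linalg.norm(query_emb)
    )
    # Keep the k highest-scoring chunks (best first) instead of
    # sending every potential match to the LLM as context.
    best = np.argsort(sims)[::-1][:k]
    return [chunks[i] for i in best]
```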
@mahenning commented on GitHub (Aug 11, 2025):
There is already an option for hybrid search with a reranker, along with the Top K (embedding) and Top K Reranker settings to limit the number of results. Note that this limit currently applies per query, and typically 3 queries are generated per user request in the chat to retrieve information. Maybe try a lower Top K Reranker value?
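To see why the per-query limit matters, here is a self-contained sketch of how the two limits compound across the queries generated for one user request. The stub functions and names are hypothetical stand-ins, not Open WebUI's internals.

```python
def embedding_search(query: str, top_k: int) -> list[str]:
    # Stand-in for the vector-store lookup; returns top_k candidate chunks.
    return [f"{query}-candidate-{i}" for i in range(top_k)]

def rerank(query: str, candidates: list[str], top_k_reranker: int) -> list[str]:
    # Stand-in for a cross-encoder reranker; keeps only the best few.
    return candidates[:top_k_reranker]

def build_context(queries: list[str],
                  top_k: int = 10, top_k_reranker: int = 3) -> list[str]:
    context: list[str] = []
    for q in queries:
        # Stage 1: hybrid/embedding search takes up to top_k candidates.
        candidates = embedding_search(q, top_k)
        # Stage 2: the reranker trims them to top_k_reranker, per query.
        context += rerank(q, candidates, top_k_reranker)
    return context

# Three generated queries per user request (the typical case above)
# with top_k_reranker=3 yields up to 9 chunks of context.
chunks = build_context(["q1", "q2", "q3"])
print(len(chunks))  # 9
```

Under these assumptions, lowering Top K Reranker (or generating fewer queries) is what actually bounds the prompt size, which matches the suggestion above.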