mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #1293] feat: better RAG #51098
Originally created by @zachrattner on GitHub (Mar 25, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1293
Is your feature request related to a problem? Please describe.
The document-based RAG feature shows a glimpse of how useful it can be. Unfortunately, for most practical applications, data is not sitting in a single self-contained file.
Describe the solution you'd like
Describe alternatives you've considered
Additional context
@tjbck commented on GitHub (Mar 25, 2024):
We'd love to see this feature! Feel free to make a draft PR; I'm sure a lot of people in the community would be interested in helping as well.
@zachrattner commented on GitHub (Mar 25, 2024):
@tjbck Do you have any pointers in terms of docs/code I could look at to see how the RAG is implemented? Happy to take a look
@bright-ren commented on GitHub (Mar 29, 2024):
Adding support for MySQL and other databases is a good idea.
@bozo32 commented on GitHub (Mar 29, 2024):
Please allow reporting by or across sources. Reporting by source would permit iterating through a set of sources.
@tjbck commented on GitHub (Mar 31, 2024):
We'd love to incorporate all the amazing ideas into our webui, but the current RAG implementation requires a major overhaul to accommodate all the requested features. I believe I should have more time in April to actually start taking a look at this; let's make our webui feature the best RAG implementation to ever exist and beyond! 🚀
@zachrattner Feel free to join our discord! All the contributors are extremely active on our server, and I would love to establish high bandwidth communication before we start working on this feature. As for the pointers for the current implementation, I'd suggest you take a look at these files:
86aa2ca6cb/backend/main.py (L76)
@raoulg commented on GitHub (Apr 16, 2024):
I will start working on this, probably tomorrow
@buroa commented on GitHub (Apr 24, 2024):
Relevant: https://github.com/open-webui/open-webui/pull/1693
@FaintWhisper commented on GitHub (May 6, 2024):
As part of this proposal to improve RAG support, please consider integrating injection of specific documents into context as suggested in #1719.
The current approach based on semantic and hybrid search methods does not cover all RAG use cases. Especially, the current method does not work properly according to my tests when the goal is not to retrieve or ask about specific information semantically related to the query, but to ask general questions about the document or to generate new information given all the information it contains.
In addition, we must consider what will happen when the document is larger than the available context of the selected model. Chunking (preferably using smart splitting techniques such as [1]) and summarization could be an option, or maybe injecting only the N last/first tokens into the available context. Letting the user select relevant passages or sections or transparently selecting those that are relevant for the given query using an LLM and some heuristics could also be considered.
[1] https://github.com/nlmatics/llmsherpa
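One of the fallbacks suggested above, injecting only the first/last N tokens when a document exceeds the context window, can be sketched as follows. This is illustrative only: the function name is hypothetical and the whitespace split is a naive stand-in for the model's real tokenizer.

```python
# Sketch of the "inject only the first/last N tokens" fallback described above.
# Token counting here is naively whitespace-based; a real implementation would
# use the model's tokenizer. All names are illustrative.

def truncate_for_context(text: str, max_tokens: int, keep: str = "first") -> str:
    """Keep only as much of the document as fits the model's context budget."""
    tokens = text.split()  # naive stand-in for a real tokenizer
    if len(tokens) <= max_tokens:
        return text  # whole document fits; no truncation needed
    if keep == "first":
        kept = tokens[:max_tokens]
    elif keep == "last":
        kept = tokens[-max_tokens:]
    else:  # "both": split the budget between the head and tail
        half = max_tokens // 2
        kept = tokens[:half] + ["..."] + tokens[-half:]
    return " ".join(kept)
```

Letting the user choose `first`, `last`, or `both` corresponds to the "letting the user select relevant passages" idea in the comment.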
@thebetauser commented on GitHub (May 14, 2024):
This project has a great RAG for YouTube transcripts; however, the RAG implementation doesn't do a good job of summarizing them. I would like to propose an option (via toggle) to allow summarization through the use of in-context memory. This would import the transcript directly into the context instead of a vector store, so the full text would be included with the prompt on how to summarize it.
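The proposed toggle could look roughly like this. The function and the `retrieve` helper are hypothetical, not Open WebUI's actual API; the sketch only shows the branching between full-transcript injection and vector-store retrieval.

```python
# Sketch of the proposed toggle: bypass the vector store and inject the full
# transcript into the prompt when "in-context memory" is enabled. The function
# name and retrieve() helper are hypothetical, not Open WebUI's API.

def build_prompt(question: str, transcript: str, use_full_context: bool,
                 retrieve=None) -> str:
    if use_full_context:
        # Put the entire transcript in the prompt so the model can summarize it.
        context = transcript
    else:
        # Fall back to the usual vector-store retrieval of top chunks.
        context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"
```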
@tjbck commented on GitHub (Jun 19, 2024):
We're at a stage where we need to start prioritising this. PR/discussion welcome everyone!
@JOduMonT commented on GitHub (Jun 20, 2024):
(ironically like a human with a conversation)
If this is still relevant:
@JOduMonT commented on GitHub (Jun 20, 2024):
DB-wise: Postgres has pgvector.
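For context, pgvector adds distance operators to Postgres, such as `<->` (L2) and `<=>` (cosine distance), so a query like `SELECT id FROM documents ORDER BY embedding <=> %s LIMIT 5;` ranks rows by embedding similarity. The pure-Python function below mirrors what the `<=>` operator computes, purely for illustration:

```python
# Pure-Python equivalent of pgvector's <=> operator (cosine distance,
# i.e. 1 - cosine similarity), shown only to illustrate the ranking metric.
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```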
@jeremiahsb commented on GitHub (Jul 5, 2024):
An alternative could be something like this: https://microsoft.github.io/graphrag/
@menelic commented on GitHub (Jul 7, 2024):
GraphRAG would be an important addition to Open WebUI because of its many benefits over pure embeddings-based RAG: better contextual understanding, more accurate retrieval, fewer hallucinations, and better understanding and retrieval of complex, connected information and of the relationships between the entities relevant to a query. This comes in addition to the complementary benefits of using both types of RAG together.
Aside from Microsoft Graph Rag there are two more open GraphRAG implementations to consider:
Neo4j's implementation has the benefit that the graph is browsable and editable with the visual graph editor Bloom: https://neo4j.com/developer-blog/graphrag-llm-knowledge-graph-builder/
LlamaIndex also has its own recently improved GraphRAG implementation: https://neo4j.com/developer-blog/property-graph-index-llamaindex/
Note the code that offers entity disambiguation and deduplication in the linked LlamaIndex example. This is one key advantage over standard RAG. If this were part of Open WebUI, it would be extremely useful, especially for larger text files or many shorter text files mentioning the same entities, be they concepts, organizations, people, or processes.
Please make GraphRAG part of Open WebUI's RAG pipeline.
Edit: Ok it seems that https://github.com/open-webui/open-webui/discussions/3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph
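To illustrate the entity-deduplication idea mentioned above: the goal is that different surface forms of one entity ("IBM", "I.B.M.", "ibm") collapse into a single graph node. Real GraphRAG pipelines use embeddings and an LLM for this; the toy normalization below is deliberately crude and only shows the concept.

```python
# Toy entity deduplication: mentions that normalize to the same key are
# grouped into one node. Real systems use embeddings + an LLM for this.

def dedupe_entities(mentions: list[str]) -> dict[str, list[str]]:
    """Group surface forms by a normalized (lowercase, alphanumeric) key."""
    groups: dict[str, list[str]] = {}
    for m in mentions:
        key = "".join(ch for ch in m.lower() if ch.isalnum())
        groups.setdefault(key, []).append(m)
    return groups
```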
@Domi31tls commented on GitHub (Jul 25, 2024):
I use RAGFlow (https://github.com/infiniflow/ragflow) for RAG, which gives good results. It is accessible via API and Docker. It offers the advantage of creating 'datarooms' identifiable by their collection name, which allows it to cross-reference information from one collection to another.
@gigq commented on GitHub (Sep 7, 2024):
Is R2R integration actually happening? I couldn't find any mention in the discussion other than them telling him he could write a custom function.
@menelic commented on GitHub (Sep 8, 2024):
Good question! This seems to have stalled a bit, but thanks to your question he has now responded that the documentation was a bit unclear to him: https://github.com/open-webui/open-webui/discussions/3687#discussioncomment-10579619 I hope they assist, because this addition would really enhance Open WebUI use cases.
@muhanstudio commented on GitHub (Oct 10, 2024):
In fact, for official apps or web pages like OpenAI’s PLUS membership, they seem to be using a more costly method of document parsing. Specifically, they upload the entire document content into the context, regardless of the length, and then apply an additional layer of RAG (Retrieval-Augmented Generation) to enhance the model's attention and improve the responses. This restores RAG to its original purpose: merely augmenting context retrieval rather than directly replacing the entire document.
When engaging in a conversation on OpenAI’s PLUS membership webpage and uploading a file, the results are far better. For example, if we upload a short novel with only the main text, removing the table of contents, and ask the AI to sequentially provide each chapter's title, OpenAI can accurately and in order provide the chapter titles. Meanwhile, third-party RAG implementations often directly replace the document with retrieval fragments, which results in missing chapters or a jumbled order. This indicates that OpenAI actually holds the complete document contents. Furthermore, when asking questions about specific parts of the document, OpenAI shows surprisingly good attention, being able to accurately locate the relevant portions of the document related to the query, which suggests that they are also incorporating RAG.
I believe this is the ultimate reason why third-party RAG systems always seem to lack some content or present the material in a disordered fashion compared to OpenAI’s document-upload-to-AI feature. Of course, there’s the issue of cost to consider, but it should be an option. I’m confident that, given the significant decline in large model prices and the increasing size of context windows, many people would still prefer to optimize document parsing performance with AI using the high-cost approach that the official systems employ. Importantly, unless cost is a very pressing issue, keeping RAG as an enhancement for retrieval rather than as a replacement for the entire document seems far more reasonable. For scenarios where only RAG is used, I believe it makes sense when dealing with a knowledge base comprising hundreds of documents to use RAG for retrieval of specific fragments. However, for a single or small number of documents, we shouldn’t need to cut corners.
Lastly, I would appreciate the inclusion of metadata (such as file name, extension, and file size) when uploading documents. You can refer to another issue for detailed application scenarios. https://github.com/lobehub/lobe-chat/issues/4102
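The policy this comment argues for, full-document injection when the document fits, chunked RAG retrieval only when it does not, can be sketched as a simple selector. The function name and the whitespace token count are illustrative assumptions, not an existing Open WebUI setting.

```python
# Sketch of the selection policy suggested above: inject the whole document
# when it fits the model's context window, and fall back to chunked RAG
# retrieval only when it does not. The token count is a naive stand-in
# for the model's tokenizer.

def choose_strategy(doc: str, context_window: int, reserved: int = 1024) -> str:
    """Return "full_context" or "rag" for a single uploaded document."""
    budget = context_window - reserved  # leave room for the prompt and reply
    n_tokens = len(doc.split())        # replace with a real tokenizer
    return "full_context" if n_tokens <= budget else "rag"
```

For a knowledge base of hundreds of documents the same check would almost always return `"rag"`, matching the comment's distinction between single documents and large collections.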
@silentoplayz commented on GitHub (Oct 15, 2024):
Related - https://github.com/open-webui/open-webui/issues/6133
@dsanr commented on GitHub (Oct 20, 2024):
Support for external Vector DB would be better https://github.com/open-webui/open-webui/discussions/938
@bozo32 commented on GitHub (Oct 20, 2024):
Picking up an old notion. I'm not sure how much should be asked of the WebUI:
1. RAG returns are limited, normally 3-4 chunks. This is great if you want one answer across all documents, but many RAG use cases, such as versions of entity recognition, require all valid responses. That requires parsing resources into chunks that are meaningful units in which to ask for the presence of the entity (e.g. sentences). This is quite easily done with something like GROBID to parse documents and Python to feed them sequentially to an LLM, but it may not be reasonable to ask of an accessible WebUI.
2. Context size and missing-in-the-middle have an unfortunate inverse correlation: the bigger the context, the greater the probability of missing in the middle. This, again, requires some carefully thought-out strategy (perhaps segmentation as in 1, perhaps CoT, perhaps passing through a formalisation strategy like the recent work on Prolog). In any case, these are not the sorts of things I would ask of an idiot-friendly WebUI.
-p
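Point 1 above, splitting a document into sentence-sized chunks so every chunk can be checked for an entity rather than only the top few retrieval hits, can be sketched with a naive splitter. A real pipeline would use GROBID or an NLP library for proper sentence segmentation; the regex here is only illustrative.

```python
# Sketch of sentence-level chunking for exhaustive entity checks (point 1
# above). The regex splitter is naive; GROBID or spaCy would do this properly.
import re

def sentence_chunks(text: str) -> list[str]:
    """Split text into sentence-sized chunks on terminal punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]
```

Each chunk could then be fed sequentially to an LLM with a question like "does this sentence mention entity X?", collecting every valid hit instead of the top 3-4.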
@sir3mat commented on GitHub (Nov 5, 2024):
Hi, I have seen strange behaviour using RAG and pipelines.
The workflow is as follows:
Documents are loaded and fed into OpenWebUI.
OpenWebUI encodes the documents, splits them into chunks, and stores them.
The /inlet endpoint on the pipeline is called.
OpenWebUI then performs the RAG operation.
The pipeline’s /pipe endpoint is triggered, displaying the response on OpenWebUI.
Finally, the client calls the /outlet endpoint
Question on Implementation:
Why do the /inlet, /pipe, and /outlet endpoints each have different request bodies?
Why is the process openwebuiClient -> inlet -> openwebuiClient -> pipe -> openwebuiClient -> outlet -> openwebuiClient?
If I implement RAG in my custom pipeline's pipe logic, Open WebUI's own RAG is still executed.
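The inlet -> pipe -> outlet round trips described above can be sketched as a minimal pipeline class. The method signatures here are simplified assumptions modelled on the plugin pattern, not Open WebUI's actual Pipelines interface.

```python
# Minimal sketch of the inlet -> pipe -> outlet flow described above.
# Signatures are simplified assumptions, not Open WebUI's actual interface.

class Pipeline:
    def inlet(self, body: dict) -> dict:
        # Called before Open WebUI's own RAG step; may rewrite the request.
        body.setdefault("metadata", {})["seen_by_inlet"] = True
        return body

    def pipe(self, user_message: str, body: dict) -> str:
        # Produces the response; custom RAG logic could run here instead.
        return f"echo: {user_message}"

    def outlet(self, body: dict) -> dict:
        # Called after the response is shown; may post-process or log it.
        body["logged"] = True
        return body
```

The differing request bodies follow from the different stages: `inlet` sees the incoming request, `pipe` sees the (possibly RAG-augmented) prompt, and `outlet` sees the completed exchange, which is why a custom RAG in `pipe` runs in addition to, not instead of, Open WebUI's own retrieval.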
@tjbck commented on GitHub (Feb 16, 2025):
Closing in favour of #10094