[GH-ISSUE #1293] feat: better RAG #51098

New Issue

GiteaMirror · 2026-05-05T11:57:37-05:00

GiteaMirror commented

2026-05-05 11:57:37 -05:00

Originally created by @zachrattner on GitHub (Mar 25, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1293

Is your feature request related to a problem? Please describe.

The document-based RAG feature shows a glimpse of how useful it can be. Unfortunately for most practical applications, data is not sitting in a single self contained file:

Source code repos span multiple files
Cloud services like Jira, GitHub, Wordpress, Slack, etc. have REST APIs/webhooks and exporting data to a flat file is clunky and not always possible
Structured data sources like MySQL or MongoDB databases

Describe the solution you'd like

Generalize the Documents tab to Data Sources
Keep the Document option as a Data Source
Allow a Folder to be a Data Source, and recursively include all its containing files
Allow for an API connector to connect common cloud tools like Slack, Jira, GitHub, Notion
Allow for database connectors like MySQL, MongoDB so folks can enter a connection string and access the data
Allow for generic REST API so folks can connect their own services

Describe alternatives you've considered

Exporting from the sources to a flat file, then uploading the flat file

Additional context

Happy to help if the team decides to do this
Better to do it one step at a time, since each new Data Source adds incremental value to the tool

Originally created by @zachrattner on GitHub (Mar 25, 2024). Original GitHub issue: https://github.com/open-webui/open-webui/issues/1293 **Is your feature request related to a problem? Please describe.** The document-based RAG feature shows a glimpse of how useful it can be. Unfortunately for most practical applications, data is not sitting in a single self contained file: - Source code repos span multiple files - Cloud services like Jira, GitHub, Wordpress, Slack, etc. have REST APIs/webhooks and exporting data to a flat file is clunky and not always possible - Structured data sources like MySQL or MongoDB databases **Describe the solution you'd like** - Generalize the Documents tab to Data Sources - Keep the Document option as a Data Source - Allow a Folder to be a Data Source, and recursively include all its containing files - Allow for an API connector to connect common cloud tools like Slack, Jira, GitHub, Notion - Allow for database connectors like MySQL, MongoDB so folks can enter a connection string and access the data - Allow for generic REST API so folks can connect their own services **Describe alternatives you've considered** - Exporting from the sources to a flat file, then uploading the flat file **Additional context** - Happy to help if the team decides to do this - Better to do it one step at a time, since each new Data Source adds incremental value to the tool

GiteaMirror added the enhancement good first issue help wanted core labels 2026-05-05 11:57:38 -05:00

GiteaMirror closed this issue

2026-05-05 11:57:41 -05:00

GiteaMirror commented

2026-05-05 11:57:43 -05:00

@tjbck commented on GitHub (Mar 25, 2024):

We'd love to this see feature! Feel free to make a draft PR, I'm sure a lot of people in the community would be interested in helping as well.

@tjbck commented on GitHub (Mar 25, 2024): We'd love to this see feature! Feel free to make a draft PR, I'm sure a lot of people in the community would be interested in helping as well.

GiteaMirror commented

2026-05-05 11:57:44 -05:00

@zachrattner commented on GitHub (Mar 25, 2024):

@tjbck Do you have any pointers in terms of docs/code I could look at to see how the RAG is implemented? Happy to take a look

@zachrattner commented on GitHub (Mar 25, 2024): @tjbck Do you have any pointers in terms of docs/code I could look at to see how the RAG is implemented? Happy to take a look

GiteaMirror commented

2026-05-05 11:57:45 -05:00

@bright-ren commented on GitHub (Mar 29, 2024):

add support for mysql and other db is a good idea

@bright-ren commented on GitHub (Mar 29, 2024): add support for mysql and other db is a good idea

GiteaMirror commented

2026-05-05 11:57:46 -05:00

@bozo32 commented on GitHub (Mar 29, 2024):

Please allow reporting by or across sources. By would permit iteration through a set of sources.

@bozo32 commented on GitHub (Mar 29, 2024): Please allow reporting by or across sources. By would permit iteration through a set of sources.

GiteaMirror commented

2026-05-05 11:57:47 -05:00

@tjbck commented on GitHub (Mar 31, 2024):

We'd love to incorporate all the amazing ideas to our webui, but the current RAG implementation requires major overhaul to accommodate all requested features. I believe I should have more time in April to actually start taking a look at this, let's make our webui to feature the best RAG implementation to ever exist and beyond! 🚀

@zachrattner Feel free to join our discord! All the contributors are extremely active on our server, and I would love to establish high bandwidth communication before we start working on this feature. As for the pointers for the current implementation, I'd suggest you take a look at these files:

@tjbck commented on GitHub (Mar 31, 2024): We'd love to incorporate all the amazing ideas to our webui, but the current RAG implementation requires major overhaul to accommodate all requested features. I believe I should have more time in April to actually start taking a look at this, let's make our webui to feature the best RAG implementation to ever exist and beyond! 🚀 @zachrattner [Feel free to join our discord](https://discord.gg/5rJgQTnV4s)! All the contributors are extremely active on our server, and I would love to establish high bandwidth communication before we start working on this feature. As for the pointers for the current implementation, I'd suggest you take a look at these files: - https://github.com/open-webui/open-webui/tree/main/backend/apps/rag - https://github.com/open-webui/open-webui/blob/86aa2ca6cb14df2399f8297726a3bd978ab3d541/backend/main.py#L76

GiteaMirror commented

2026-05-05 11:57:48 -05:00

@raoulg commented on GitHub (Apr 16, 2024):

I will start working on this, probably tomorrow

@raoulg commented on GitHub (Apr 16, 2024): I will start working on this, probably tomorrow

GiteaMirror commented

2026-05-05 11:57:48 -05:00

@buroa commented on GitHub (Apr 24, 2024):

Relevant: https://github.com/open-webui/open-webui/pull/1693

@buroa commented on GitHub (Apr 24, 2024): Relevant: https://github.com/open-webui/open-webui/pull/1693

GiteaMirror commented

2026-05-05 11:57:49 -05:00

@FaintWhisper commented on GitHub (May 6, 2024):

As part of this proposal to improve RAG support, please consider integrating injection of specific documents into context as suggested in #1719.

The current approach based on semantic and hybrid search methods does not cover all RAG use cases. Especially, the current method does not work properly according to my tests when the goal is not to retrieve or ask about specific information semantically related to the query, but to ask general questions about the document or to generate new information given all the information it contains.

In addition, we must consider what will happen when the document is larger than the available context of the selected model. Chunking (preferably using smart splitting techniques such as [1]) and summarization could be an option, or maybe injecting only the N last/first tokens into the available context. Letting the user select relevant passages or sections or transparently selecting those that are relevant for the given query using an LLM and some heuristics could also be considered.

[1] https://github.com/nlmatics/llmsherpa

@FaintWhisper commented on GitHub (May 6, 2024): As part of this proposal to improve RAG support, please consider integrating injection of specific documents into context as suggested in #1719. The current approach based on semantic and hybrid search methods does not cover all RAG use cases. Especially, the current method does not work properly according to my tests when the goal is not to retrieve or ask about specific information semantically related to the query, but to ask general questions about the document or to generate new information given all the information it contains. In addition, we must consider what will happen when the document is larger than the available context of the selected model. Chunking (preferably using smart splitting techniques such as **[1]**) and summarization could be an option, or maybe injecting only the _N_ last/first tokens into the available context. Letting the user select relevant passages or sections or transparently selecting those that are relevant for the given query using an LLM and some heuristics could also be considered. **[1]** https://github.com/nlmatics/llmsherpa

GiteaMirror commented

2026-05-05 11:57:49 -05:00

@thebetauser commented on GitHub (May 14, 2024):

This project has as great RAG for youtube transcripts however the RAG implementation doesn't do a good job summarizing the transcripts. I would like to propose an option (via toggle) to allow summarization through the use of in-context memory. This would import the transcript into the context memory directly instead of a vector store which would then be included with the prompt on how to summarize the text

@thebetauser commented on GitHub (May 14, 2024): This project has as great RAG for youtube transcripts however the RAG implementation doesn't do a good job summarizing the transcripts. I would like to propose an option (via toggle) to allow summarization through the use of in-context memory. This would import the transcript into the context memory directly instead of a vector store which would then be included with the prompt on how to summarize the text

GiteaMirror commented

2026-05-05 11:57:50 -05:00

@tjbck commented on GitHub (Jun 19, 2024):

We're at a stage where we need to start prioritising this. PR/discussion welcome everyone!

@tjbck commented on GitHub (Jun 19, 2024): We're at a stage where we need to start prioritising this. PR/discussion welcome everyone!

GiteaMirror commented

2026-05-05 11:57:51 -05:00

@JOduMonT commented on GitHub (Jun 20, 2024):

Don't remember where and when; but I eared that Ai mostly will remember (use) the beginning and the end of a documents

(ironically like a human with a conversation)

If this is and still revelant;

to make a better RAG would you considere to remove the beginning and end of ebooks like TOC, copyrights, ...
if yes how ethically would you consider that, since most of the time it would be consider as tampering the document.

@JOduMonT commented on GitHub (Jun 20, 2024): - Don't remember where and when; but I eared that Ai mostly will remember (use) the beginning and the end of a documents (_ironically like a human with a conversation_) ### If this is and still revelant; 1. to make a better RAG would you considere to remove the beginning and end of ebooks like TOC, copyrights, ... 2. if yes how ethically would you consider that, since most of the time it would be consider as tampering the document.

GiteaMirror commented

2026-05-05 11:57:53 -05:00

@JOduMonT commented on GitHub (Jun 20, 2024):

DB wize: Postgres have a this PG Vector

@JOduMonT commented on GitHub (Jun 20, 2024): DB wize: Postgres have a this [PG Vector](https://github.com/pgvector/pgvector)

GiteaMirror commented

2026-05-05 11:57:58 -05:00

@jeremiahsb commented on GitHub (Jul 5, 2024):

An alternative could be something like this: https://microsoft.github.io/graphrag/

@jeremiahsb commented on GitHub (Jul 5, 2024): An alternative could be something like this: https://microsoft.github.io/graphrag/

GiteaMirror commented

2026-05-05 11:58:03 -05:00

@menelic commented on GitHub (Jul 7, 2024):

GraphRag would be an important addition to Open WebUI due to the many benefits GraphRAG has over pure embeddings-based RAG such as better contextual understanding, more accurate retrieval, less hallucinations and better understanding and retrieval of complex, connected information and the relationships between entities relevant for a query - in addition to the complementary features and benefits form using both types of RAG together.

Aside from Microsoft Graph Rag there are two more open GraphRAG implementations to consider:

Neo4J s implementation has the benefit of the graph being browse able and editable with the visual graph editor bloom https://neo4j.com/developer-blog/graphrag-llm-knowledge-graph-builder/

LamaIndex also has its own recently improved GraphRAG implementation https://neo4j.com/developer-blog/property-graph-index-llamaindex/

Note the code that offers entity disambiguation and deduplication in the linked LLamaIndex examle. This is one key advantage over standard RAG. Just imagine this would be part of Open WebUI, it would be extremely useful especially when it comes to either larger text files or many shorter text files mentioning the same entities, be they concepts, organizations, people or processes.

Please make GraphRAG a part of Open WebUIs RAg pipeline.

Edit: Ok it seems that https://github.com/open-webui/open-webui/discussions/3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph

@menelic commented on GitHub (Jul 7, 2024): GraphRag would be an important addition to Open WebUI due to the many benefits GraphRAG has over pure embeddings-based RAG such as better contextual understanding, more accurate retrieval, less hallucinations and better understanding and retrieval of complex, connected information and the relationships between entities relevant for a query - in addition to the complementary features and benefits form using both types of RAG together. Aside from Microsoft Graph Rag there are two more open GraphRAG implementations to consider: Neo4J s implementation has the benefit of the graph being browse able and editable with the visual graph editor bloom https://neo4j.com/developer-blog/graphrag-llm-knowledge-graph-builder/ LamaIndex also has its own recently improved GraphRAG implementation https://neo4j.com/developer-blog/property-graph-index-llamaindex/ Note the code that offers entity disambiguation and deduplication in the linked LLamaIndex examle. This is one key advantage over standard RAG. Just imagine this would be part of Open WebUI, it would be extremely useful especially when it comes to either larger text files or many shorter text files mentioning the same entities, be they concepts, organizations, people or processes. Please make GraphRAG a part of Open WebUIs RAg pipeline. Edit: Ok it seems that https://github.com/open-webui/open-webui/discussions/3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph

GiteaMirror commented

2026-05-05 11:58:06 -05:00

@Domi31tls commented on GitHub (Jul 25, 2024):

I use ragflow ( https://github.com/infiniflow/ragflow) for rag, which gives good results. It is accessible via API and Docker. It offers the advantage of creating 'datarooms' identifiable by their collection name, which allows it to cross-reference information from one collection to another...

@Domi31tls commented on GitHub (Jul 25, 2024): I use ragflow ( https://github.com/infiniflow/ragflow) for rag, which gives good results. It is accessible via API and Docker. It offers the advantage of creating 'datarooms' identifiable by their collection name, which allows it to cross-reference information from one collection to another...

GiteaMirror commented

2026-05-05 11:58:07 -05:00

@gigq commented on GitHub (Sep 7, 2024):

Edit: Ok it seems that #3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph

Is R2R integration actually happening, I couldn't find any mention in the discussion other than them telling him he could write a custom function.

@gigq commented on GitHub (Sep 7, 2024): > Edit: Ok it seems that #3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph Is R2R integration actually happening, I couldn't find any mention in the discussion other than them telling him he could write a custom function.

GiteaMirror commented

2026-05-05 11:58:08 -05:00

@menelic commented on GitHub (Sep 8, 2024):

Edit: Ok it seems that #3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph

Is R2R integration actually happening, I couldn't find any mention in the discussion other than them telling him he could write a custom function.

God question! This seems to have stalled a bit, but thanks to your question he now responded that the documentation was a bit unclear to him https://github.com/open-webui/open-webui/discussions/3687#discussioncomment-10579619 I hope they assist because this addition would really enhance Open WebUI usecases

@menelic commented on GitHub (Sep 8, 2024): > > Edit: Ok it seems that #3687 is about to take care of this and much more with an upcoming R2R integration in Open WebUI. R2R includes a Neo4j GraphRAG implementation https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph > > Is R2R integration actually happening, I couldn't find any mention in the discussion other than them telling him he could write a custom function. God question! This seems to have stalled a bit, but thanks to your question he now responded that the documentation was a bit unclear to him https://github.com/open-webui/open-webui/discussions/3687#discussioncomment-10579619 I hope they assist because this addition would really enhance Open WebUI usecases

GiteaMirror commented

2026-05-05 11:58:09 -05:00

@muhanstudio commented on GitHub (Oct 10, 2024):

In fact, for official apps or web pages like OpenAI’s PLUS membership, they seem to be using a more costly method of document parsing. Specifically, they upload the entire document content into the context, regardless of the length, and then apply an additional layer of RAG (Retrieval-Augmented Generation) to enhance the model's attention and improve the responses. This restores RAG to its original purpose: merely augmenting context retrieval rather than directly replacing the entire document.

When engaging in a conversation on OpenAI’s PLUS membership webpage and uploading a file, the results are far better. For example, if we upload a short novel with only the main text, removing the table of contents, and ask the AI to sequentially provide each chapter's title, OpenAI can accurately and in order provide the chapter titles. Meanwhile, third-party RAG implementations often directly replace the document with retrieval fragments, which results in missing chapters or a jumbled order. This indicates that OpenAI actually holds the complete document contents. Furthermore, when asking questions about specific parts of the document, OpenAI shows surprisingly good attention, being able to accurately locate the relevant portions of the document related to the query, which suggests that they are also incorporating RAG.

I believe this is the ultimate reason why third-party RAG systems always seem to lack some content or present the material in a disordered fashion compared to OpenAI’s document-upload-to-AI feature. Of course, there’s the issue of cost to consider, but it should be an option. I’m confident that, given the significant decline in large model prices and the increasing size of context windows, many people would still prefer to optimize document parsing performance with AI using the high-cost approach that the official systems employ. Importantly, unless cost is a very pressing issue, keeping RAG as an enhancement for retrieval rather than as a replacement for the entire document seems far more reasonable. For scenarios where only RAG is used, I believe it makes sense when dealing with a knowledge base comprising hundreds of documents to use RAG for retrieval of specific fragments. However, for a single or small number of documents, we shouldn’t need to cut corners.

Lastly, I would appreciate the inclusion of metadata (such as file name, extension, and file size) when uploading documents. You can refer to another issue for detailed application scenarios. https://github.com/lobehub/lobe-chat/issues/4102

@muhanstudio commented on GitHub (Oct 10, 2024): In fact, for official apps or web pages like OpenAI’s PLUS membership, they seem to be using a more costly method of document parsing. Specifically, they upload the entire document content into the context, regardless of the length, and then apply an additional layer of RAG (Retrieval-Augmented Generation) to enhance the model's attention and improve the responses. This restores RAG to its original purpose: merely augmenting context retrieval rather than directly replacing the entire document. When engaging in a conversation on OpenAI’s PLUS membership webpage and uploading a file, the results are far better. For example, if we upload a short novel with only the main text, removing the table of contents, and ask the AI to sequentially provide each chapter's title, OpenAI can accurately and in order provide the chapter titles. Meanwhile, third-party RAG implementations often directly replace the document with retrieval fragments, which results in missing chapters or a jumbled order. This indicates that OpenAI actually holds the complete document contents. Furthermore, when asking questions about specific parts of the document, OpenAI shows surprisingly good attention, being able to accurately locate the relevant portions of the document related to the query, which suggests that they are also incorporating RAG. I believe this is the ultimate reason why third-party RAG systems always seem to lack some content or present the material in a disordered fashion compared to OpenAI’s document-upload-to-AI feature. Of course, there’s the issue of cost to consider, but it should be an option. I’m confident that, given the significant decline in large model prices and the increasing size of context windows, many people would still prefer to optimize document parsing performance with AI using the high-cost approach that the official systems employ. Importantly, unless cost is a very pressing issue, keeping RAG as an enhancement for retrieval rather than as a replacement for the entire document seems far more reasonable. For scenarios where only RAG is used, I believe it makes sense when dealing with a knowledge base comprising hundreds of documents to use RAG for retrieval of specific fragments. However, for a single or small number of documents, we shouldn’t need to cut corners. Lastly, I would appreciate the inclusion of metadata (such as file name, extension, and file size) when uploading documents. You can refer to another issue for detailed application scenarios. https://github.com/lobehub/lobe-chat/issues/4102

GiteaMirror commented

2026-05-05 11:58:09 -05:00

@silentoplayz commented on GitHub (Oct 15, 2024):

Related - https://github.com/open-webui/open-webui/issues/6133

@silentoplayz commented on GitHub (Oct 15, 2024): Related - https://github.com/open-webui/open-webui/issues/6133

GiteaMirror commented

2026-05-05 11:58:11 -05:00

@dsanr commented on GitHub (Oct 20, 2024):

Support for external Vector DB would be better https://github.com/open-webui/open-webui/discussions/938

@dsanr commented on GitHub (Oct 20, 2024): Support for external Vector DB would be better https://github.com/open-webui/open-webui/discussions/938

GiteaMirror commented

2026-05-05 11:58:15 -05:00

@bozo32 commented on GitHub (Oct 20, 2024):

Picking up an old notion
I'm not sure how much should be asked of webUI

RAG returns are limited...normally 3-4. This is great if you want one answer across all documents. Many RAG use cases, such as versions of entity recognition, require all valid responses. This requires parsing of resources into chunks that are meaningful units to ask for the presence of the entity (e.g. sentences). This is quite easily done with something like grobid to parse documents and python to sequentially feed them to a LLM. This may not be reasonable to ask of an accessible webUI
2.
context size and missing in the middle seem to have a sweet inverse correlation problem. The bigger the context, the greater the probability of missing in the middle. This, again, requires some sort of carefully thought strategy (perhaps segmentation like in 1, perhaps COT, perhaps passing through a formalisation strategy like the recent stuff on prolog). In any case, these are not the sorts of things I would ask of an idiot friendly webUI

-p

From: muhanstudio @.>
Sent: Thursday, October 10, 2024 02:38
To: open-webui/open-webui @.>
Cc: Tamas, Peter @.>; Comment @.>
Subject: Re: [open-webui/open-webui] feat: better RAG (Issue #1293)

In fact, for the official app or web page, such as openai's PLUS members, they should take a relatively expensive method of parsing documents. That is, passing all the content of the document into context anyway, and then adding a layer of RAG search to increase the model's attention and optimize the response, which returns RAG to its original purpose: to enhance context search rather than to replace the entire document. It's much better to have a conversation on openai's PLUS membership page and upload a file. For example, if we upload a short story with the contents deleted and only the text, Let AI tell us the titles of each chapter in order, openai can accurately and in order tell us all the chapter titles, while the current third-party RAG replaces the document directly with the retrieval fragment, resulting in the loss of some chapters. Or the order is completely messed up, which means that OpenAI actually owns the entire document content, and when we ask questions about specified content, OpenAI's attention is surprisingly good, and it can locate exactly where we ask the question in the document, which indicates that it also introduces RAG. I think this is the ultimate reason why there is always a lack of content or the order is messed up compared to the AI answer uploaded by the third-party RAG and the official website. Of course, the cost issue is something we're thinking about, but it should also be selective, and I believe that prices generally decline significantly in the big models, Although the context window is getting bigger and bigger, there are still many people who hope to optimize their performance in document parsing with AI through the same high-cost approach as the official. Most importantly, it is much more reasonable to use RAG only as an enhanced retrieval rather than a direct replacement for the entire document, unless the cost is very pressing. For a scenario where only a few hundred documents are imported into RAG, I think it makes more sense for a knowledge base of probably hundreds of documents to be imported into just a RAG retrieval segment, and we don't have to be so swamped with individual or small amounts of documents. In addition, I hope that when uploading a document, including the file's meta information (file name, extension name, file size), the specific application scenarios we can look at in another issue. lobehub/lobe-chat#4102 https://github.com/lobehub/lobe-chat/issues/4102

—
Reply to this email directly, view it on GitHubhttps://github.com/open-webui/open-webui/issues/1293#issuecomment-2403673361, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AYKOUNMZVEYQLSJJ23ZOZI3Z2XD6XAVCNFSM6AAAAABFGZLQXWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMBTGY3TGMZWGE.
You are receiving this because you commented.Message ID: @.***>

@bozo32 commented on GitHub (Oct 20, 2024): Picking up an old notion I'm not sure how much should be asked of webUI 1. RAG returns are limited...normally 3-4. This is great if you want one answer across all documents. Many RAG use cases, such as versions of entity recognition, require all valid responses. This requires parsing of resources into chunks that are meaningful units to ask for the presence of the entity (e.g. sentences). This is quite easily done with something like grobid to parse documents and python to sequentially feed them to a LLM. This may not be reasonable to ask of an accessible webUI 2. context size and missing in the middle seem to have a sweet inverse correlation problem. The bigger the context, the greater the probability of missing in the middle. This, again, requires some sort of carefully thought strategy (perhaps segmentation like in 1, perhaps COT, perhaps passing through a formalisation strategy like the recent stuff on prolog). In any case, these are not the sorts of things I would ask of an idiot friendly webUI -p ________________________________ From: muhanstudio ***@***.***> Sent: Thursday, October 10, 2024 02:38 To: open-webui/open-webui ***@***.***> Cc: Tamas, Peter ***@***.***>; Comment ***@***.***> Subject: Re: [open-webui/open-webui] feat: better RAG (Issue #1293) In fact, for the official app or web page, such as openai's PLUS members, they should take a relatively expensive method of parsing documents. That is, passing all the content of the document into context anyway, and then adding a layer of RAG search to increase the model's attention and optimize the response, which returns RAG to its original purpose: to enhance context search rather than to replace the entire document. It's much better to have a conversation on openai's PLUS membership page and upload a file. For example, if we upload a short story with the contents deleted and only the text, Let AI tell us the titles of each chapter in order, openai can accurately and in order tell us all the chapter titles, while the current third-party RAG replaces the document directly with the retrieval fragment, resulting in the loss of some chapters. Or the order is completely messed up, which means that OpenAI actually owns the entire document content, and when we ask questions about specified content, OpenAI's attention is surprisingly good, and it can locate exactly where we ask the question in the document, which indicates that it also introduces RAG. I think this is the ultimate reason why there is always a lack of content or the order is messed up compared to the AI answer uploaded by the third-party RAG and the official website. Of course, the cost issue is something we're thinking about, but it should also be selective, and I believe that prices generally decline significantly in the big models, Although the context window is getting bigger and bigger, there are still many people who hope to optimize their performance in document parsing with AI through the same high-cost approach as the official. Most importantly, it is much more reasonable to use RAG only as an enhanced retrieval rather than a direct replacement for the entire document, unless the cost is very pressing. For a scenario where only a few hundred documents are imported into RAG, I think it makes more sense for a knowledge base of probably hundreds of documents to be imported into just a RAG retrieval segment, and we don't have to be so swamped with individual or small amounts of documents. In addition, I hope that when uploading a document, including the file's meta information (file name, extension name, file size), the specific application scenarios we can look at in another issue. lobehub/lobe-chat#4102<https://github.com/lobehub/lobe-chat/issues/4102> — Reply to this email directly, view it on GitHub<https://github.com/open-webui/open-webui/issues/1293#issuecomment-2403673361>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AYKOUNMZVEYQLSJJ23ZOZI3Z2XD6XAVCNFSM6AAAAABFGZLQXWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMBTGY3TGMZWGE>. You are receiving this because you commented.Message ID: ***@***.***>

GiteaMirror commented

2026-05-05 11:58:19 -05:00

@sir3mat commented on GitHub (Nov 5, 2024):

Hi, i have seen a strange behaviour using rag and pipelines

The workflow is as follows:

Documents are loaded and fed into OpenWebUI.
OpenWebUI encodes the documents, splits them into chunks, and stores them.
The /inlet endpoint on the pipeline is called.
OpenWebUI then performs the RAG operation.
The pipeline’s /pipe endpoint is triggered, displaying the response on OpenWebUI.
Finally, the client calls the /outlet endpoint

Question on Implementation:

Why do the /inlet, /pipe, and /outlet endpoints each have different request bodies?
Why the process is openwebuiClient->inlet->openwebuiClient->pipe->openwebuiClient->outlet->openwebuiClient?

If I develop RAG on my custom pipeline in pipe logic tha RAG of openwebui is still executed.

@sir3mat commented on GitHub (Nov 5, 2024): Hi, i have seen a strange behaviour using rag and pipelines The workflow is as follows: Documents are loaded and fed into OpenWebUI. OpenWebUI encodes the documents, splits them into chunks, and stores them. The /inlet endpoint on the pipeline is called. OpenWebUI then performs the RAG operation. The pipeline’s /pipe endpoint is triggered, displaying the response on OpenWebUI. Finally, the client calls the /outlet endpoint Question on Implementation: Why do the /inlet, /pipe, and /outlet endpoints each have different request bodies? Why the process is openwebuiClient->inlet->openwebuiClient->pipe->openwebuiClient->outlet->openwebuiClient? If I develop RAG on my custom pipeline in pipe logic tha RAG of openwebui is still executed.

GiteaMirror commented

2026-05-05 11:58:20 -05:00

@tjbck commented on GitHub (Feb 16, 2025):

Closing in favour of #10094

@tjbck commented on GitHub (Feb 16, 2025): Closing in favour of #10094

Sign in to join this conversation.

Branches Tags

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/open-webui#51098