mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-12 01:54:38 -05:00
feat: web content pipeline for rag #177
Originally created by @tjbck on GitHub (Jan 13, 2024).
Originally assigned to: @tjbck on GitHub.
https://www.reddit.com/r/LocalLLaMA/comments/192dz3r/q_is_it_possible_to_give_ollama_access_to_a_local/
@justinh-rahb commented on GitHub (Jan 14, 2024):
Integrating a feature in Ollama WebUI that allows users to provide a URL and have it automatically processed through RAG would be an exciting addition. However, determining how to trigger this feature raises some questions. For instance, we wouldn't want every URL mentioned in the chat to automatically trigger the RAG process since that could lead to unnecessary processing and potentially unwanted results.
One solution could be adding a button that explicitly triggers the RAG process for a given URL. However, it would be ideal if we could make this feature as automatic as possible while maintaining control over when it's triggered. One way to achieve this could be by implementing some sort of semantic routing model that recognizes specific commands or keywords in the chat input and automatically triggers the RAG process for a provided URL.
For example, if a user types "Read this article" followed by a URL, Ollama WebUI could automatically recognize the command and trigger the RAG process without requiring any additional steps. This approach would maintain the clean interface we currently have.
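A minimal sketch of that kind of trigger, assuming a plain keyword match rather than a real semantic router (which would compare embeddings instead). The phrase list and the `find_rag_url` helper are hypothetical names for illustration, not anything that exists in Ollama WebUI:

```python
import re
from typing import Optional

# Hypothetical trigger phrases; a semantic router would match on meaning, not exact text.
TRIGGER_PHRASES = ("read this article", "summarize this page")

# Loose URL matcher: anything starting with http:// or https:// up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def find_rag_url(message: str) -> Optional[str]:
    """Return the first URL only if the message also contains a trigger phrase."""
    lowered = message.lower()
    if not any(phrase in lowered for phrase in TRIGGER_PHRASES):
        return None  # A bare URL without a command does not trigger RAG.
    match = URL_PATTERN.search(message)
    if match is None:
        return None
    # Drop punctuation that commonly trails a URL at the end of a sentence.
    return match.group(0).rstrip(".,;:!?)")


print(find_rag_url("Read this article https://example.com/post"))
print(find_rag_url("I found https://example.com/post yesterday"))
```

With this approach a URL mentioned in passing is ignored, and only a message that pairs a command phrase with a URL would be handed to the RAG step.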
Here's some further information on Semantic Routing:
YouTube
Demo notebook
LangChain example notebook
Repo
OpenAI must be implementing a similar concept in ChatGPT Plus to figure out whether to generate images or code or what have you, as it wouldn't be feasible or cost-effective to require two API calls for every message, one with GPT-3.5 to determine what to use and another to perform the action with GPT-4/DALLE-3/CodeInterpreter.
@RLutsch commented on GitHub (Jan 14, 2024):
How about adding an endpoint you can publish data to? Something like
curl -X POST <myUrl.com>/rag --data-binary @my.csv
Then an option to add ChromaDB?
Maybe also support for external db?
@ChingWeiChan commented on GitHub (Jan 16, 2024):
As an addition to the web content pipeline, I think we could integrate a search API for RAG, such as SerpApi (the free plan allows 100 searches/month), the Bing Search API (the free plan allows 1000 transactions/month), the Wikipedia API, or others.
@Marclass commented on GitHub (Jan 17, 2024):
The RAG endpoint to scrape web pages was added in #333 with /web.
@oliverbob commented on GitHub (Jan 23, 2024):
Any update to this yet?
Thanks.
@tjbck commented on GitHub (Jan 27, 2024):
You can now add website content to the RAG pipeline directly by using the '#' command followed by the website URL. Let me know if you guys encounter any issues!
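For anyone curious how a '#' command like this could be picked out of a prompt, here is a rough sketch. It assumes the URL comes immediately after the '#' with no space, and `split_hash_urls` is a hypothetical helper for illustration, not the actual WebUI code:

```python
import re
from typing import List, Tuple

# Assumption: the command is '#' immediately followed by an http(s) URL.
HASH_URL = re.compile(r"#(https?://\S+)")


def split_hash_urls(prompt: str) -> Tuple[List[str], str]:
    """Return the URLs marked with '#' and the prompt text with those markers removed."""
    urls = HASH_URL.findall(prompt)
    remaining = HASH_URL.sub("", prompt)
    # Collapse the whitespace left behind where the markers were removed.
    remaining = " ".join(remaining.split())
    return urls, remaining


urls, question = split_hash_urls("#https://example.com/docs What is covered here?")
print(urls)
print(question)
```

The extracted URLs would then be fetched and indexed, while the remaining text is sent on as the actual question.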
As for the API integration support, let's continue our discussion in #586. Thanks!
@oliverbob commented on GitHub (Jan 28, 2024):
Thanks mate. Been waiting for this. Will try the latest update.
@justinh-rahb commented on GitHub (Jan 28, 2024):
Working great for me 💯 Congrats @tjbck for landing this absolutely huge feature!