[GH-ISSUE #586] feat: rag api integration support (web search) #12130

Closed
opened 2026-04-19 18:56:15 -05:00 by GiteaMirror · 17 comments
Owner

Originally created by @tjbck on GitHub (Jan 27, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/586

Extension of #464

> Additional information about the web content pipeline: I think we can integrate a search API for RAG, such as [SerpApi](https://serpapi.com/) (the free plan allows 100 searches/month), the [Bing Web Search API](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) (the free plan allows 1,000 transactions/month), the [Wikipedia API](https://api.wikimedia.org/wiki/Searching_for_Wikipedia_articles_using_Python), or others.
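
As a rough illustration of the Wikipedia option mentioned above, the sketch below queries the Wikimedia search endpoint and maps each hit to a title, URL, and snippet that a RAG step could consume. The endpoint comes from the linked Wikimedia tutorial; `search_wikipedia`, `http_get_json`, the `fetch` parameter, and the output shape are assumptions made for illustration and are not part of open-webui.

```python
import json
import urllib.parse
import urllib.request

# Search endpoint described in the Wikimedia tutorial linked above.
WIKI_SEARCH = "https://api.wikimedia.org/core/v1/wikipedia/en/search/page"


def http_get_json(url):
    """Fetch a URL and decode its JSON body (requires network access)."""
    req = urllib.request.Request(url, headers={"User-Agent": "rag-search-sketch/0.1"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def search_wikipedia(query, limit=3, fetch=http_get_json):
    """Return a list of {'title', 'url', 'snippet'} dicts for the top matches.

    `fetch` is injectable (an assumption of this sketch) so the function
    can be exercised without network access.
    """
    params = urllib.parse.urlencode({"q": query, "limit": limit})
    data = fetch(f"{WIKI_SEARCH}?{params}")
    return [
        {
            "title": page["title"],
            "url": "https://en.wikipedia.org/wiki/" + page["key"],
            "snippet": page.get("excerpt", ""),
        }
        for page in data.get("pages", [])
    ]
```

Calling `search_wikipedia("retrieval augmented generation")` would hit the live API; the returned URLs could then be passed to the existing web content pipeline as documents.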

@zinwelzl commented on GitHub (Jan 28, 2024):

Hi.

Can you give an example of how to use it?

I tried with
count words from #http://example.com/
but it doesn't work for me.


@tjbck commented on GitHub (Jan 29, 2024):

@zinwelzl

You should start the prompt with the '#' command followed by the website URL, like so:

<img width="772" alt="image" src="https://github.com/ollama-webui/ollama-webui/assets/25473318/f464d276-0f0d-4285-9492-1580adc8900c">

Make sure to click the button to include the website as a document.

@ChingWeiChan commented on GitHub (Jan 30, 2024):

@tjbck
I think we can make the URL block clickable (a hyperlink), or fetch the website title, so that users can check whether the URL is the one they expected.
![Screenshot 2024-01-30 7:34:52 PM](https://github.com/ollama-webui/ollama-webui/assets/14084937/652828ca-7dc5-42d7-bb45-1895114c1007)

@tjbck commented on GitHub (Feb 1, 2024):

@ChingWeiChan Great idea, I'll make it clickable!

Just merged #616, should work as expected!


@adan89lion commented on GitHub (Feb 21, 2024):

Hi,

I tried some news articles with RAG, but it seems the models I tried are unable to process the "correct" content.

For instance, given this example article from [CNN](https://edition.cnn.com/2024/02/19/business/walmart-earnings-walkup/index.html), the model is unable to extract answers from the article.
![Screenshot_20240221_134956_Chrome](https://github.com/open-webui/open-webui/assets/6585644/8686c23c-934a-4e33-a25c-12a9d9314b91)

And when I ask the model to introduce the document, it responds with content that is irrelevant to the article (website headers or SEO data, I presume).
![Screenshot_20240221_135057_Chrome](https://github.com/open-webui/open-webui/assets/6585644/988fb30e-ab15-42ff-912d-baac0238acaa)

I think an integration with [Mozilla's Readability library](https://github.com/mozilla/readability) or similar projects could vastly improve the efficiency of website RAG support for open-webui.
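
One possible shape for the cleanup step suggested above, as a minimal sketch: it drops text inside typical boilerplate tags and prefers `<article>` content when present. This is a crude stand-in written with Python's standard `html.parser`, not Readability's scoring algorithm and not what open-webui does; `MainTextExtractor`, `extract_main_text`, and the `SKIP_TAGS` list are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose text is usually navigation, chrome, or code rather than article
# content. The list is an assumption of this sketch, not a standard.
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside", "form"}


class MainTextExtractor(HTMLParser):
    """Collect visible text, tracking whether we are inside <article>."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0      # > 0 while inside any SKIP_TAGS element
        self.article_depth = 0   # > 0 while inside an <article> element
        self.all_text = []
        self.article_text = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "article":
            self.article_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "article" and self.article_depth > 0:
            self.article_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text or self.skip_depth:
            return
        self.all_text.append(text)
        if self.article_depth:
            self.article_text.append(text)


def extract_main_text(html):
    """Return <article> text if any was found, else all non-boilerplate text."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.article_text or parser.all_text)
```

Feeding the output of `extract_main_text` to the embedding step, instead of the raw page text, would keep menus, footers, and scripts out of the retrieved chunks on pages that use semantic markup. Pages without such markup would need a scoring approach like Readability's.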

@tbendien commented on GitHub (Mar 1, 2024):

DuckDuckGo search API is free: https://pypi.org/project/duckduckgo-search/


@dillfrescott commented on GitHub (Mar 22, 2024):

Can't wait for this! Super cool stuff!


@d416 commented on GitHub (Mar 22, 2024):

HuggingChat is one of the best web search implementations I've seen in terms of UX, so it'd be great if open-webui used this flow: https://huggingface.co/chat/

HuggingChat is also open source and OpenAI API compatible, so it can be used with Ollama. The web search can be configured to use different search engines, including SearXNG (a private, open-source search engine):
https://github.com/huggingface/chat-ui?tab=readme-ov-file#web-search-config

Edit: chat-ui has Ollama support built in (it’s in the readme)


@strikeoncmputrz commented on GitHub (Apr 4, 2024):

[LLM_Web_search](https://github.com/mamei16/LLM_Web_search) is a similar capability implemented in Text Generation WebUI that uses LangChain for RAG. It has worked very well for me, and with a good system or character prompt I don't even need to ask it to search the web. I made a minor bug fix but am otherwise not affiliated.


@dillfrescott commented on GitHub (Apr 5, 2024):

If open-webui were to use a search engine, I nominate SearXNG (running locally) 100%.

I'm using it with another similar project and it works flawlessly, and it's totally free too, unlike a lot of these APIs.


@sammcj commented on GitHub (Apr 11, 2024):

Seconding the recommendation of searxng, it's really very good, self-hostable and works with many backend search engine providers.


@spergware commented on GitHub (Apr 25, 2024):

+1
Can't wait for a web search feature, game changer!


@MohamedAliRashad commented on GitHub (Apr 25, 2024):

Any updates?


@knd775 commented on GitHub (Apr 25, 2024):

If you don't have anything to add, please don't comment like this. It just makes it harder for other people to follow.

@9cento
Reactions exist for a reason, use them instead

@MohamedAliRashad
Do you see any updates?


@ProjectMoon commented on GitHub (May 20, 2024):

Idea for how to implement this: make a checkbox/toggle in the message prompt to enable "use web search for enhanced response accuracy". If this toggle is enabled, first send a hidden message to the LLM asking it to analyze the user's message for terms to search for. Then make a search request for each of these terms using some supported search engine, and send the top results to the normal web RAG pipeline.

This would require the admin to configure some search API endpoint. Starting with SearxNG would probably be good because it has an easily configurable JSON API.
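
The flow proposed above could be sketched roughly as follows. It assumes a SearXNG instance with the JSON output format enabled in its settings; `extract_terms` is a hypothetical stand-in for the hidden LLM request, and `searxng_search`, `web_search_context`, `enabled`, and `top_k` are illustrative names, not open-webui's actual implementation.

```python
import json
import urllib.parse
import urllib.request


def http_get_json(url):
    """Fetch a URL and decode its JSON body (requires network access)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def searxng_search(base_url, term, fetch=http_get_json):
    """Query SearXNG's /search endpoint with format=json and return its results list."""
    query = urllib.parse.urlencode({"q": term, "format": "json"})
    data = fetch(f"{base_url.rstrip('/')}/search?{query}")
    return data.get("results", [])


def web_search_context(user_message, extract_terms, search, enabled=True, top_k=3):
    """Return deduplicated result URLs to hand to the normal web RAG pipeline.

    extract_terms: callable(str) -> list[str]; stands in for the hidden
                   "which terms should I search for?" LLM message.
    search:        callable(str) -> list[dict]; each dict carries a 'url' key.
    enabled:       mirrors the proposed per-message toggle.
    """
    if not enabled:
        return []
    urls = []
    for term in extract_terms(user_message):
        for result in search(term)[:top_k]:
            url = result.get("url")
            if url and url not in urls:
                urls.append(url)
    return urls
```

A caller might wire it up as `web_search_context(msg, ask_llm_for_terms, lambda t: searxng_search("http://localhost:8080", t))`, where `ask_llm_for_terms` and the instance URL are placeholders for whatever the admin configures.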


@spergware commented on GitHub (May 20, 2024):

> Reactions exist for a reason, use them instead

That was a bump, frustrated mass-replier. And this is another bump.


@tjbck commented on GitHub (May 27, 2024):

Implemented in dev.


Reference: github-starred/open-webui#12130