[PR #3078] [CLOSED] Implement domain whitelisting for web search results #7976

Closed
opened 2025-11-11 17:41:41 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/3078
Author: @que-nguyen
Created: 6/12/2024
Status: Closed

Base: dev ← Head: whitelist-websearch


📝 Commits (10+)

  • c41b33c Merge pull request #3013 from open-webui/dev
  • 0917fa6 Merge pull request #3056 from open-webui/dev
  • 47ac0fb Define configuration variable RAG_WEB_SEARCH_WHITE_LIST_DOMAINS
  • 5bdf7c9 Implemented filter_by_whitelist function
  • ffb06e7 Implement domain whitelisting for BRAVE search results
  • 242be8b Implement domain whitelisting for DUCKDUCKGO search results
  • 314f2fb Implement domain whitelisting for GOOGLE_PSE search results
  • 8feced3 Implement domain whitelisting for SEARXNG search results
  • 38cf4c7 Implement domain whitelisting for SERPER search results
  • e031a7e Implement domain whitelisting for SERPLY search results

📊 Changes

10 files changed (+68 additions, -32 deletions)

View changed files

📝 backend/apps/rag/main.py (+9 -1)
📝 backend/apps/rag/search/brave.py (+5 -4)
📝 backend/apps/rag/search/duckduckgo.py (+6 -5)
📝 backend/apps/rag/search/google_pse.py (+5 -4)
📝 backend/apps/rag/search/main.py (+11 -1)
📝 backend/apps/rag/search/searxng.py (+8 -5)
📝 backend/apps/rag/search/serper.py (+5 -4)
📝 backend/apps/rag/search/serply.py (+5 -4)
📝 backend/apps/rag/search/serpstack.py (+5 -4)
📝 backend/config.py (+9 -0)

📄 Description

  • Note: Meta search engines like SearxNG do not natively support filtering results by domain, so the filter is applied to the results after the search returns.
  • Added a filter that restricts search results to a configured list of domains.
  • Updated search result processing to apply the whitelist filter before the final results are returned.
  • When a whitelist is specified, only results from the allowed domains are included in the output.
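
For readers without access to the diff, the post-search filtering described above could look roughly like the sketch below. This is not the PR's code. The name `filter_by_whitelist` is taken from the commit list, but its signature here is a guess. The helpers `parse_whitelist` and `is_allowed`, the comma-separated format for the whitelist value, the `url` key on each result, and the subdomain-matching rule are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def parse_whitelist(raw: str) -> list[str]:
    """Hypothetical helper: split a comma-separated domain string into a clean list."""
    return [d.strip().lower() for d in raw.split(",") if d.strip()]


def is_allowed(url: str, whitelist: list[str]) -> bool:
    """Hypothetical helper: True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in whitelist)


def filter_by_whitelist(results: list[dict], whitelist: list[str]) -> list[dict]:
    """Keep only results from whitelisted domains.

    Assumption: an empty whitelist means "no restriction", so results pass through unchanged.
    Assumption: each result is a dict carrying its address under the "url" key.
    """
    if not whitelist:
        return results
    return [r for r in results if is_allowed(r.get("url", ""), whitelist)]


# Usage: filter results after any engine (for example SearxNG) has returned them.
whitelist = parse_whitelist("example.com, docs.python.org")
results = [
    {"url": "https://example.com/a"},
    {"url": "https://sub.example.com/b"},
    {"url": "https://notexample.com/c"},
]
allowed = filter_by_whitelist(results, whitelist)
```

Matching on the parsed hostname rather than on a substring of the URL avoids accepting look-alike hosts such as `notexample.com` when `example.com` is whitelisted.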

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-11 17:41:41 -06:00

Reference: github-starred/open-webui#7976