[PR #12769] [MERGED] feat: Add Frontend Configuration for RAG_WEB_LOADER_ENGINE #23014

Closed
opened 2026-04-20 04:34:21 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/12769
Author: @tth37
Created: 4/12/2025
Status: Merged
Merged: 4/12/2025
Merged by: @tjbck

Base: dev ← Head: feat_frontend_web_loader


📝 Commits (1)

  • 5eac596 feat: Add frontend configuration for web loader

📊 Changes

3 files changed (+357 additions, -191 deletions)


📝 backend/open_webui/config.py (+18 -18)
📝 backend/open_webui/routers/retrieval.py (+95 -55)
📝 src/lib/components/admin/Settings/WebSearch.svelte (+244 -118)

📄 Description

Pull Request Checklist

  • Target branch: Please verify that the pull request targets the dev branch.
  • Description: Provide a concise description of the changes made in this pull request.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: Have you updated relevant documentation Open WebUI Docs, or other documentation sources?
  • Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • Testing: Have you written and run sufficient tests to validate the changes?
  • Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Changelog Entry

Description

Related Discussion: #12744

Currently, users can configure RAG_WEB_SEARCH_ENGINE (e.g., DuckDuckGo, Google) directly from the Open WebUI frontend settings. However, the related setting RAG_WEB_LOADER_ENGINE, which controls how web page content is fetched and parsed for RAG, appears to be configurable only via environment variables or backend configuration (on the latest main branch). This makes it less convenient to switch between or experiment with loading mechanisms than it is with the search engine setting.

This PR adds configuration options in the frontend under Settings -> Web Search, allowing users to select the RAG_WEB_LOADER_ENGINE from a dropdown menu.

To accommodate this and improve structure, the Web Search settings section has been reorganized in both the backend API schema and the frontend UI.

Key Changes

  1. Backend (open_webui/routers/retrieval.py, open_webui/config.py)

    • Restructured the WebConfig Pydantic model to have distinct search and loader subsections.
    • Moved web loader-related configurations (RAG_WEB_LOADER_ENGINE, ENABLE_RAG_WEB_LOADER_SSL_VERIFICATION, BYPASS_WEB_SEARCH_EMBEDDING_AND_RETRIEVAL, RAG_WEB_SEARCH_TRUST_ENV, Playwright settings, Firecrawl settings, Tavily loader settings, YouTube loader settings) under the new loader section.
    • Adjusted configuration key paths in config.py for consistency (e.g., rag.web.loader.playwright_ws_uri). I hope this won't introduce compatibility issues.
    • Updated the /config GET and /config/update POST endpoints to use the new nested schema.
  2. Frontend (components/admin/Settings/WebSearch.svelte)

    • Added a dropdown selector for RAG_WEB_LOADER_ENGINE (options: safe_web, playwright, firecrawl, tavily)
    • Added conditional input fields that appear based on the selected loader engine:
      • Playwright: WS URI & Timeout
      • Firecrawl: API Base URL & API Key
      • Tavily: Extract Depth & API Key (reuses the search API key input if Tavily search isn't selected)
    • Restructured the Web Search settings UI into logical sections: "General" (general enable toggle, engine and related settings), and "Loader" (engine and related settings, including YouTube, SSL, Trust Env, Bypass Embedding).
    • Consolidated the tavily_api_key input field to serve both the search and loader configurations when Tavily is selected.
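
The nested layout and the engine-dependent inputs described above can be pictured with a minimal sketch. It uses standard-library dataclasses rather than the Pydantic models touched by the PR, and every class, field, and function name in it (`SearchSettings`, `LoaderSettings`, `WebSettings`, `fields_for_engine`) is hypothetical, chosen for illustration and not copied from `retrieval.py` or `WebSearch.svelte`. The WebSocket URI is a placeholder value.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional

# Hypothetical names throughout: this only mirrors the search/loader split
# described in the PR, not the actual model definitions.


@dataclass
class SearchSettings:
    engine: Optional[str] = None
    tavily_api_key: Optional[str] = None  # shared with the loader, per the PR


@dataclass
class LoaderSettings:
    engine: str = "safe_web"  # safe_web, playwright, firecrawl, or tavily
    enable_ssl_verification: bool = True
    playwright_ws_uri: Optional[str] = None
    playwright_timeout: Optional[int] = None
    firecrawl_api_base_url: Optional[str] = None
    firecrawl_api_key: Optional[str] = None
    tavily_extract_depth: Optional[str] = None


@dataclass
class WebSettings:
    search: SearchSettings = field(default_factory=SearchSettings)
    loader: LoaderSettings = field(default_factory=LoaderSettings)


# Extra inputs revealed per loader engine, following the list in the PR text.
ENGINE_FIELDS: Dict[str, List[str]] = {
    "safe_web": [],
    "playwright": ["playwright_ws_uri", "playwright_timeout"],
    "firecrawl": ["firecrawl_api_base_url", "firecrawl_api_key"],
    "tavily": ["tavily_extract_depth", "tavily_api_key"],
}


def fields_for_engine(engine: str) -> List[str]:
    """Return the extra inputs to show for the selected loader engine."""
    if engine not in ENGINE_FIELDS:
        raise ValueError(f"unknown loader engine: {engine}")
    return ENGINE_FIELDS[engine]


config = WebSettings(
    loader=LoaderSettings(
        engine="playwright",
        playwright_ws_uri="ws://localhost:3000",  # placeholder value
    )
)
payload = asdict(config)  # nested dict with "search" and "loader" keys
```

Grouping loader options under their own key keeps engine-specific settings out of the search section, which is the separation the restructured schema aims for.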

Bug Fixes

  • Typo Correction: Fixed a typo serachapi_api_key -> searchapi_api_key in the response data structure of the update_rag_config endpoint
  • SSL Verification Logic: Corrected the logic for ENABLE_RAG_WEB_LOADER_SSL_VERIFICATION. When the frontend "Bypass SSL verification" switch is toggled ON, the backend enable_ssl_verification setting is now correctly updated to false upon saving. This behavior was previously described in a code comment but not correctly implemented.
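
The SSL fix amounts to negating the UI toggle before it is stored. A minimal sketch, assuming a hypothetical helper `apply_bypass_toggle` and a plain dict standing in for the backend configuration (neither comes from the PR):

```python
def apply_bypass_toggle(config: dict, bypass_ssl_verification: bool) -> dict:
    """Store the inverse of the UI's "Bypass SSL verification" switch.

    Hypothetical helper: the UI expresses "bypass" while the backend
    stores "enable", so the value is negated on save.
    """
    updated = dict(config)  # copy so the caller's dict is not mutated
    updated["enable_ssl_verification"] = not bypass_ssl_verification
    return updated


# Turning the bypass switch ON should leave verification disabled.
saved = apply_bypass_toggle(
    {"enable_ssl_verification": True}, bypass_ssl_verification=True
)
```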

Additional Information

  • Strange Configuration Structure: The configuration entry YOUTUBE_LOADER_TRANSLATION lives directly in app.state, while the others are managed within app.state.config. I wonder whether this is a bug or a deliberate design choice.

Screenshots or Videos

Before: [screenshot not preserved]

After: [screenshot not preserved]


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-20 04:34:21 -05:00
Reference: github-starred/open-webui#23014