[PR #1419] [MERGED] feat: improve embedding model update & resolve network dependency #43714

Closed
opened 2026-04-29 17:45:37 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/1419
Author: @ghost
Created: 4/4/2024
Status: Merged
Merged: 4/10/2024
Merged by: @tjbck

Base: dev ← Head: embedding-model-fix-and-manual-update


📝 Commits (10+)

📊 Changes

6 files changed (+424 additions, -196 deletions)

📝 backend/apps/rag/main.py (+35 -13)
📝 backend/apps/rag/utils.py (+42 -0)
📝 backend/config.py (+6 -0)
📝 src/lib/apis/rag/index.ts (+61 -0)
📝 src/lib/components/documents/Settings/General.svelte (+273 -183)
📝 src/lib/i18n/locales/en-US/translation.json (+7 -0)

📄 Description

Pull Request Checklist

  • [x] Description: Briefly describe the changes in this pull request.
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog (https://keepachangelog.com/) is added at the bottom of the PR description.
  • [ ] Documentation: Have you updated relevant documentation?

Description

Improve the embedding model update process and remove the network dependency at startup. uvicorn can now start without network access: Hugging Face is not contacted unless a manual update is initiated from the GUI/API or RAG_EMBEDDING_MODEL_AUTO_UPDATE is set to True.
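As an illustration of the opt-in behavior described above, here is a minimal sketch of how such a flag could be read from the environment. The helper name env_flag and the exact parsing rule (only the string "true", in any casing, enables it) are assumptions made for this example, not necessarily what backend/config.py does.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumption: only the string "true" (any casing) enables the flag.
    return raw.strip().lower() == "true"


# Off unless explicitly enabled, so startup does not need the network by default.
RAG_EMBEDDING_MODEL_AUTO_UPDATE = env_flag("RAG_EMBEDDING_MODEL_AUTO_UPDATE")
```

Defaulting to False matches the stated goal: the network is only touched when the operator asks for it.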

The huggingface_hub function snapshot_download is called directly so that the local_files_only kwarg can be controlled. RAG main.py is also cleaned up: unused commented-out code and the associated direct sentence_transformers import are removed. embedding_model_get_path() returns the full filesystem path to the snapshot; when that path is passed to Chroma, Chroma does not attempt to download or update the model automatically.
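The idea can be sketched roughly as follows. This is not the PR's actual embedding_model_get_path(); the function name resolve_embedding_model_path, its parameters, and the fallback behavior are assumptions for illustration. The only real library call used is huggingface_hub.snapshot_download with its repo_id and local_files_only arguments, and the download parameter exists only so the sketch can be exercised without network access.

```python
import os
from typing import Callable, Optional


def resolve_embedding_model_path(
    model: str,
    update_model: bool = False,
    download: Optional[Callable[..., str]] = None,
) -> str:
    """Return a local path for `model`; contact the Hub only when update_model is True."""
    if download is None:
        # Imported lazily so the sketch can be tested with a stand-in downloader.
        from huggingface_hub import snapshot_download as download

    # Assumption: a value that is already an existing directory is used as-is.
    if os.path.isdir(model):
        return model

    try:
        # local_files_only=True resolves the snapshot from the local cache
        # without any network request; False allows a download or update.
        return download(repo_id=model, local_files_only=not update_model)
    except Exception:
        # Assumption: if nothing can be resolved, hand back the original
        # name and let the caller decide how to proceed.
        return model
```

With update_model left at False, a model that is already cached resolves to its snapshot directory offline; a manual update would call the same function with update_model=True.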

Relates to #1302 and may fix issues like #1122.

Image: https://github.com/open-webui/open-webui/assets/126992880/74febf19-d63d-4fc1-8214-f4c5db1ae688


Changelog Entry

Added

  • Add config variable RAG_EMBEDDING_MODEL_AUTO_UPDATE to control update behavior
  • Add RAG utils function embedding_model_get_path(), which returns the model's filesystem path and can also update the model using huggingface_hub
  • Add GUI setting to execute manual update process

Fixed

  • Fix Huggingface network dependency on startup

Changed

  • Update and utilize existing RAG functions in main: get_embedding_model() & update_embedding_model()

Removed

  • Remove unused commented-out code & associated import

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Commits:

  • 3b66aa5 Improve embedding model update & resolve network dependency (https://github.com/open-webui/open-webui/commit/3b66aa55c0d7b2a63841cc33ac31fdd142b6c3f4)
  • bcf79c8 Format fixes (https://github.com/open-webui/open-webui/commit/bcf79c836652496c21015d5280a1048afdb5f0b8)
  • 075fbed More format fixes (https://github.com/open-webui/open-webui/commit/075fbedb02c256843afd496db707c30be557d3e8)
  • 9f82f5a Formatting... (https://github.com/open-webui/open-webui/commit/9f82f5abbad3fef40352784d020c049bd84176c2)
  • 11741ea Tooltip info & warn. Detect file path during update. Add translation. (https://github.com/open-webui/open-webui/commit/11741ea7f03fc2559c8107bcefa587e8d9d68da3)
  • ec530ac format fix (https://github.com/open-webui/open-webui/commit/ec530ac9f8a043232dc5ca2a72c96a9ea586b955)
  • 506a061 Merge branch 'dev' into embedding-model-fix-and-manual-update (https://github.com/open-webui/open-webui/commit/506a061387f55a234d758aef3edd12115aa40442)
  • 48aad65 refac (https://github.com/open-webui/open-webui/commit/48aad6551455e8595b2820b68bc0c921ff96ea98)
  • f4b87ec refac (https://github.com/open-webui/open-webui/commit/f4b87ecb23783b1496445a44dd2d6ea50b67e326)
  • abfccee refac (https://github.com/open-webui/open-webui/commit/abfcceecefa6251ab780558eb46931ffed9b9e78)

GiteaMirror added the pull-request label 2026-04-29 17:45:37 -05:00
Reference: github-starred/open-webui#43714