Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-06 10:58:17 -05:00)
[GH-ISSUE #20053] issue: Embedding Batch Size setting is ignored for SentenceTransformers (Local Embedding), causing high memory usage #34600
Originally created by @taka817123 on GitHub (Dec 20, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/20053
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
v0.6.41
Ollama Version (if applicable)
No response
Operating System
Windows 11
Browser (if applicable)
No response
Confirmation
Expected Behavior
When using the default local embedding engine (SentenceTransformers), the Embedding Batch Size setting in Admin Settings > Documents should be respected.
Specifically, the [batch_size] parameter should be passed to the [embedding_function.encode] method in the backend. This allows users to lower the batch size (e.g., to 1 or 2) to reduce VRAM/RAM usage, especially when using large context models or running on hardware with limited memory.
Additionally, the Embedding Batch Size setting UI should be visible when the engine is set to Default (SentenceTransformers).
Actual Behavior
Backend Issue: The Embedding Batch Size setting is ignored. The backend code in [utils.py] calls [embedding_function.encode] without passing the [batch_size] argument. Consequently, sentence-transformers uses its default batch size (usually 32).
This causes massive RAM/VRAM spikes when embedding documents, especially with models that support long contexts (e.g., ModernBERT with 8k context) or when chunk sizes are large.
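Why long contexts make the default batch size so costly: activation memory grows linearly with the batch size, so at an 8k-token context even a modest hidden size adds up quickly. A back-of-envelope sketch (all model numbers below are illustrative assumptions, not measurements of ModernBERT or any real model, and attention's quadratic term is ignored):

```python
# Back-of-envelope activation-memory estimate for one forward pass.
# All model numbers here are illustrative assumptions, not measurements.
def activation_bytes(batch_size, seq_len, hidden_dim, n_layers, bytes_per_float=4):
    # Each layer materializes a handful of activation tensors shaped
    # (batch_size, seq_len, hidden_dim); "4" is a coarse per-layer multiplier.
    per_layer = 4 * batch_size * seq_len * hidden_dim * bytes_per_float
    return n_layers * per_layer

# Hypothetical long-context settings: 8k tokens, 768-dim hidden, 22 layers.
full = activation_bytes(batch_size=32, seq_len=8192, hidden_dim=768, n_layers=22)
small = activation_bytes(batch_size=1, seq_len=8192, hidden_dim=768, n_layers=22)
print(f"batch_size=32: ~{full / 2**30:.0f} GiB of activations")
print(f"batch_size=1:  ~{small / 2**30:.1f} GiB of activations")
```

With these toy numbers the batch_size=32 pass needs 32x the activation memory of batch_size=1, which is exactly the kind of spike the issue describes.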
Steps to Reproduce
1. Set the Embedding Model Engine to "Default (SentenceTransformers)".
2. Upload a large document (or many documents) to the Knowledge Base.
3. Monitor RAM/VRAM usage: the system processes embeddings with the library's default batch size (32), causing high memory consumption regardless of any user configuration attempts.
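The memory pattern in step 3 can be mimicked without a real model using only the standard library: a plain list allocation stands in for the model's forward pass, and `tracemalloc` records the peak. (`fake_embed_batch` and `peak_embedding_bytes` are hypothetical helpers for illustration, not Open WebUI code.)

```python
import tracemalloc

def fake_embed_batch(batch, dim=4096):
    # Stand-in for a model forward pass: allocates one Python float slot per
    # (item, dimension), so transient memory scales with the batch size.
    return [[0.0] * dim for _ in batch]

def peak_embedding_bytes(chunks, batch_size):
    # Measure the peak Python-heap allocation while embedding in batches.
    tracemalloc.start()
    for i in range(0, len(chunks), batch_size):
        fake_embed_batch(chunks[i:i + batch_size])
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

chunks = ["chunk %d" % i for i in range(256)]
print("batch_size=32:", peak_embedding_bytes(chunks, 32), "bytes peak")
print("batch_size=1: ", peak_embedding_bytes(chunks, 1), "bytes peak")
```

The batch of 32 holds 32 result rows alive at once, so its peak is roughly 32x the single-item case, mirroring how a smaller configured batch size caps the spike.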
Logs & Screenshots
Code Analysis:
The current call in [utils.py] looks like this:
It should be:
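The two code snippets attached to the original issue did not survive in this mirror. The sketch below reconstructs the described before/after behavior from the issue text alone, with a stub standing in for the real SentenceTransformer so it runs without the library; the actual code in `utils.py` may differ in names and structure.

```python
# Hypothetical reconstruction of the call site described in the issue.
# A stub stands in for the real SentenceTransformer; its encode() mirrors
# the real method's batch_size=32 default.
class StubSentenceTransformer:
    def encode(self, sentences, batch_size=32, **kwargs):
        self.last_batch_size = batch_size  # record what was actually used
        return [[0.0] * 4 for _ in sentences]  # dummy embedding vectors

embedding_function = StubSentenceTransformer()
texts = ["chunk one", "chunk two"]

# Current behavior: batch_size is never forwarded, so the library default
# of 32 wins no matter what the Admin Settings say.
embedding_function.encode(texts)
assert embedding_function.last_batch_size == 32

# Proposed fix: forward the configured Embedding Batch Size.
configured_batch_size = 2  # e.g. the value set in Admin Settings > Documents
embedding_function.encode(texts, batch_size=configured_batch_size)
assert embedding_function.last_batch_size == 2
```

The fix is a one-argument change at the call site: thread the configured batch size through to `encode` instead of letting the default apply.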
Additional Information
No response
@owui-terminator[bot] commented on GitHub (Dec 20, 2025):
🔍 Similar Issues Found
I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:
- #19749: Embedding model not working ("NoneType has no attribute encode") when using local SentenceTransformers (engine="") (bug, by tar-s, Dec 04, 2025)
- #19867: Memory Leak in Attach Web Page Function Due to Null Bytes in Postgres Embeddings (bug, by fgonzalez-glmc, Dec 10, 2025)
- #19723: "Async Embedding Processing" does not seem to have an effect (bug, by Elettrotecnica, Dec 03, 2025)
- #19474: Embeddings using API not working (bug, by curious-broccoli, Nov 25, 2025)
- #19421: save embedding to vector DB freezes the whole application (bug, by FBH93, Nov 24, 2025)
- #19281: RAG Template applied with "Bypass Embedding and Retrieval" enabled (bug, by lucyknada, Nov 19, 2025)
- #16389: embeddings based on OpenAI-compatible APIs are broken (bug, by MattBash17, Aug 08, 2025)
- #17845: web search too slow / generating embeddings for 10.000+ chunks (bug, by tfriedel, Sep 28, 2025)
- #17699: Generating embeddings two time for one file (bug, by koddev, Sep 24, 2025)
- #16158: Processing does not continue after open_webui.retrieval.utils:generate_openai_batch_embeddings call (bug, by BAngelis, Jul 30, 2025)
This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.