mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[PR #20637] Perf: Optimize retrieval logic for pure vector search with reranker [hybrid search enabled with BM25 weight set to 0], significant performance improvements #25713
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/20637
Author: @shrutichy91
Created: 1/13/2026
Status: 🔄 Open
Base: dev ← Head: main
📝 Commits (10+)
fe6783c Merge pull request #19030 from open-webui/dev
fc05e0a Merge pull request #19405 from open-webui/dev
e3faec6 Merge pull request #19416 from open-webui/dev
9899293 Merge pull request #19448 from open-webui/dev
140605e Merge pull request #19462 from open-webui/dev
6f1486f Merge pull request #19466 from open-webui/dev
d95f533 Merge pull request #19729 from open-webui/dev
a727153 0.6.43 (#20093)
6adde20 Merge pull request #20394 from open-webui/dev
f9b0534 Merge pull request #20522 from open-webui/dev
📊 Changes
1 file changed (+70 additions, -40 deletions)
📝 backend/open_webui/retrieval/utils.py (+70 -40)
📄 Description
Checks hybrid_bm25_weight before collecting the BM25 texts and building the bm25_retriever.
Also skips the VECTOR_DB_CLIENT.get call, which is not required for pure vector search with a reranker. This significantly improves performance (8x) for larger docs (>3K).
Before submitting, make sure you've checked the following:
This PR optimizes the document retrieval pipeline by skipping BM25-related data loading and computation when hybrid_bm25_weight <= 0.
These operations are unnecessary for vector-only search and add avoidable latency and memory overhead.
Tested on documents and observed significant performance improvements:
2k large docs now take 13s, compared to 2m3s previously
10k small docs now take 11s, compared to 2m12s previously
This does not break hybrid search with a BM25 weight > 0 or non-hybrid search, because both paths are guarded by if conditions.
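As a quick sanity check on the timings reported above, the speedup factors can be worked out with a small helper. This is only an illustrative sketch: `to_seconds` and `speedup` are hypothetical names, not part of the PR, and the inputs are the durations quoted in this description.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a duration such as '2m3s' or '13s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", duration)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def speedup(before: str, after: str) -> float:
    """Return how many times faster 'after' is than 'before'."""
    return to_seconds(before) / to_seconds(after)


if __name__ == "__main__":
    # Timings quoted in the PR description.
    print(f"2k large docs:  {speedup('2m3s', '13s'):.1f}x")
    print(f"10k small docs: {speedup('2m12s', '11s'):.1f}x")
```

With the reported numbers this works out to roughly 9.5x (123s / 13s) and 12x (132s / 11s), in line with the 8x or better improvement claimed in the description.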
Code review completed
Changelog Entry
Description
This PR optimizes the document retrieval pipeline by skipping BM25-related data loading and computation when hybrid_bm25_weight <= 0.
Currently, even when BM25 is effectively disabled (weight ≤ 0), the system may still:
Fetch documents using VECTOR_DB_CLIENT.get
Construct BM25 retrievers in memory
These operations are unnecessary for vector-only search and add avoidable latency and memory overhead, causing the app server to hang for knowledge with more than 2k docs
Changed
Introduced an explicit short-circuit for the vector-only path when hybrid_bm25_weight <= 0
VECTOR_DB_CLIENT.get is no longer called when BM25 is disabled
Bypassed BM25 retriever creation and ensemble logic in vector-only scenarios: an if condition was added, and the BM25 retriever logic now runs only when the BM25 weight is > 0
Reranking still applies correctly to vector search results (taking advantage of the reranker option in hybrid search)
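The control flow described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual code in backend/open_webui/retrieval/utils.py: the function name `retrieve` and the injected callables are hypothetical, `load_all_texts` stands in for the VECTOR_DB_CLIENT.get call, and the order-preserving merge is a simple placeholder for the real ensemble logic.

```python
from typing import Callable, List, Sequence


def retrieve(
    query: str,
    hybrid_bm25_weight: float,
    vector_search: Callable[[str], List[str]],
    load_all_texts: Callable[[], Sequence[str]],
    bm25_search: Callable[[str, Sequence[str]], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Hypothetical sketch of the short-circuit; all callables are stand-ins."""
    vector_hits = vector_search(query)

    if hybrid_bm25_weight <= 0:
        # Vector-only path: skip the full-collection fetch and BM25 setup,
        # but still rerank the vector search results.
        return rerank(query, vector_hits)

    # Hybrid path: only now pay for loading every text and running BM25.
    texts = load_all_texts()  # stand-in for VECTOR_DB_CLIENT.get
    bm25_hits = bm25_search(query, texts)

    # Placeholder merge: de-duplicate while preserving first-seen order.
    merged = list(dict.fromkeys(vector_hits + bm25_hits))
    return rerank(query, merged)
```

The point of the sketch is that the expensive steps sit entirely behind the weight check, so the vector-only path never loads the full collection, while the hybrid path behaves as before.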
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.