mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[PR #20342] perf: Fix hybrid search performance regression with parallel collection fetching and BM25 bypass #64431
Original Pull Request: https://github.com/open-webui/open-webui/pull/20342
State: closed
Merged: No
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.
This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.
Before submitting, make sure you've checked the following:
- The pull request targets the dev branch. Not targeting the dev branch will lead to immediate closure of the PR.
Changelog Entry
Description
This PR addresses a critical performance regression introduced in version 0.6.26 where hybrid search with BM25 disabled experiences dramatic latency increases. The regression makes retrieval workflows impractical for collections with 10k+ files, with response times increasing from 10-30 seconds (v0.6.25) to 3+ minutes, and becoming completely unusable for larger collections (160k+ files) with timeouts at 15-20 minutes.
Root causes identified:
- With the BM25 weight set to 0, the system still initialized BM25 retrievers and processed enriched texts
Solutions implemented:
- Uses asyncio.gather to fetch multiple collections concurrently, eliminating N-1 sequential waits
Added
- fetch_collection_data(): async helper function to fetch multiple collections in parallel using asyncio.gather
- Early BM25 bypass in query_doc_with_hybrid_search() to skip BM25 processing when not needed
Changed
- query_collection_with_hybrid_search() now uses parallel collection fetching instead of a sequential loop
Fixed
Breaking Changes
Additional Information
Technical Details
Parallel Collection Fetching:
- Uses asyncio.gather to fetch collections concurrently
- Reduces fetch time from N × T to ~T for N collections
Early BM25 Bypass:
- hybrid_bm25_weight <= 0 (BM25 not wanted)
- not enable_enriched_texts (no metadata enrichment wanted)
Testing Note
Code changes are conservative:
- asyncio.gather patterns
Performance testing requested:
@galvanoid (original issue reporter) - Could you test this PR with your 10k and 160k file collections to verify it resolves the latency issues you reported?
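For anyone who wants to sanity-check the idea before testing against a real deployment, here is a minimal, self-contained sketch of the two techniques described under Technical Details. It is not the code from this PR: `fake_fetch`, `fetch_sequential`, `fetch_parallel`, `should_bypass_bm25`, and `timed` are hypothetical names, the vector-store call is simulated with `asyncio.sleep`, and combining the two bypass conditions with `and` is an assumption based on how they are listed above.

```python
import asyncio
import time


async def fake_fetch(collection_name: str, delay: float = 0.05) -> dict:
    # Hypothetical stand-in for a vector-store lookup; the sleep simulates
    # an I/O-bound fetch that takes roughly T seconds.
    await asyncio.sleep(delay)
    return {"collection": collection_name, "documents": [f"{collection_name}-doc"]}


async def fetch_sequential(names: list[str]) -> list[dict]:
    # One await per collection: total time is roughly N x T.
    return [await fake_fetch(name) for name in names]


async def fetch_parallel(names: list[str]) -> list[dict]:
    # asyncio.gather runs all fetches concurrently and returns results in
    # input order, so total time is roughly T for I/O-bound fetches.
    return await asyncio.gather(*(fake_fetch(name) for name in names))


def should_bypass_bm25(hybrid_bm25_weight: float, enable_enriched_texts: bool) -> bool:
    # Assumption: BM25 work is skipped only when both conditions hold.
    return hybrid_bm25_weight <= 0 and not enable_enriched_texts


def timed(coro_fn, names: list[str]):
    # Runs an async function to completion and reports wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro_fn(names))
    return result, time.perf_counter() - start


if __name__ == "__main__":
    names = ["col-a", "col-b", "col-c", "col-d"]
    _, t_seq = timed(fetch_sequential, names)
    _, t_par = timed(fetch_parallel, names)
    print(f"sequential: {t_seq:.3f}s, parallel: {t_par:.3f}s")
    print("bypass BM25:", should_bypass_bm25(0.0, False))
```

The speedup in this sketch only appears because the simulated fetch is I/O-bound. If the real fetch blocks the event loop, it would need to run in a worker thread (for example via `asyncio.to_thread`) for `asyncio.gather` to help.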
Expected improvements:
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.