[PR #11814] [CLOSED] perf: Parallelize search_query in query_collection_with_hybrid_search and vector db data retrieval #45844


GiteaMirror · 2026-04-29T20:25:54-05:00

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/11814
Author: @Phlogi
Created: 3/18/2025
Status: ❌ Closed

Base: dev ← Head: patch-dev-01

📝 Commits (10+)

125a4fa Improve performance of hybrid search
dd74d41 re-add comments
ea5dda7 Manual rebase to dev
d109ee8 Use set() instead of dict and remove unnecessary code.
a969d6e Refactor non-reverse sorting for chroma into one place only
1639283 Remove wrong argument k_reranker
9311971 Remove over-optimized retriever parallism
2b21da7 Remove unsuitable comment
90df6c9 Remove overoptimized initialization
daddd61 Merge branch 'dev' into patch-dev-01

📊 Changes

1 file changed (+87 additions, -58 deletions)

📝 backend/open_webui/retrieval/utils.py (+87 -58)

📄 Description

Before submitting, make sure you've checked the following:

Link to discussion opened before PR: https://github.com/open-webui/open-webui/discussions/11729
Target branch: Please verify that the pull request targets the dev branch.
Description: Provide a concise description of the changes made in this pull request.
Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
Documentation: Have you updated relevant documentation Open WebUI Docs, or other documentation sources?
Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
Testing: Have you written and run sufficient tests for validating the changes?
Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
Prefix: To clearly categorize this pull request, prefix the pull request title, using one of the following

Changelog Entry

Description

I noticed that RAG with hybrid search is unusably slow. Currently, hybrid search runs a two-level nested for loop serially to collect all potentially relevant parts of the document(s) and then reranks them sequentially as well.
The changes fix both bottlenecks. Before the change, only a few cores were utilized on my system; with these changes, all cores are fully saturated. The estimated speed improvement is about 50% on a CPU-only system.

Changed

Function query_doc_with_hybrid_search updates:

Replace the nested for loop with a ThreadPoolExecutor that fetches all collection data in parallel, and only once for all queries.
The collection_data is then passed to the query_doc_with_hybrid_search function.
Run process_query over all potential documents in parallel.
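
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `fetch_collection`, `process_query`, and `search_all` are hypothetical stand-ins, and the data shapes are invented for the example. It fetches each collection once, then runs one task per (collection, query) pair on the same thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def fetch_collection(name):
    # Hypothetical stand-in for a vector DB read that returns stored chunks.
    return {"name": name, "documents": [f"{name}-chunk-{i}" for i in range(3)]}


def process_query(collection_name, collection_data, query):
    # Hypothetical stand-in for the per-(collection, query) hybrid search.
    hits = [doc for doc in collection_data["documents"] if query in doc]
    return collection_name, query, hits


def search_all(collection_names, queries, max_workers=None):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Step 1: fetch every collection exactly once, in parallel.
        fetched = dict(
            zip(collection_names, executor.map(fetch_collection, collection_names))
        )
        # Step 2: submit one task per (collection, query) pair,
        # reusing the data fetched in step 1.
        futures = [
            executor.submit(process_query, name, fetched[name], query)
            for name, query in product(collection_names, queries)
        ]
        # Collect results in submission order.
        return [future.result() for future in futures]


results = search_all(["a", "b"], ["chunk-1", "zzz"])
```

Threads only help here to the extent that the work releases the GIL (I/O, or native code in the embedding or reranking step); that is an assumption about the workload rather than something stated in this PR.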

Refactoring of the Chroma sorting exception

DRY principle: move the reverse-sorting flag, which depends on the VECTOR_DB type (specifically "chroma"), into merge_and_sort_query_results.
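
As a rough sketch of the idea, the sort direction can be decided in one place instead of at every call site. The function name `sort_results`, the `(score, document)` tuple shape, and the assumption that Chroma results sort in ascending order while other backends sort in descending order are illustrative, not taken from the actual diff.

```python
VECTOR_DB = "chroma"  # assumption: a module-level setting naming the backend


def sort_results(scored, k, vector_db=VECTOR_DB):
    # Decide the direction once, here, rather than in each caller.
    # Assumption: for "chroma" a lower score is better, so no reverse sort.
    reverse = vector_db != "chroma"
    return sorted(scored, key=lambda pair: pair[0], reverse=reverse)[:k]


scored = [(0.3, "a"), (0.1, "b"), (0.2, "c")]
chroma_top = sort_results(scored, 2, "chroma")
other_top = sort_results(scored, 2, "other")
```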

Function merge_and_sort_query_results Enhancements:

Optimized memory allocation by estimating capacity and reserving space for combined results.
Implemented batch processing for document hash computations to improve performance.
Added an early return for cases with empty combined results
Refined the sorting mechanism and truncated results to the top k entries.
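
A minimal sketch of a merge step with an early return, hash-based deduplication, and top-k truncation is shown below. It is an illustration under stated assumptions, not the PR's implementation: the name `merge_and_sort` and the flat result shape (`distances`, `documents`, `metadatas` as parallel lists) are hypothetical and chosen for brevity.

```python
import hashlib


def merge_and_sort(query_results, k, reverse=True):
    # Flatten all per-query results into (score, document, metadata) triples.
    combined = [
        (score, doc, meta)
        for result in query_results
        for score, doc, meta in zip(
            result["distances"], result["documents"], result["metadatas"]
        )
    ]
    # Early return when there is nothing to merge.
    if not combined:
        return {"distances": [], "documents": [], "metadatas": []}

    # Sort by score only, so metadata dicts are never compared.
    combined.sort(key=lambda triple: triple[0], reverse=reverse)

    seen = set()  # content hashes of documents already kept
    top = []
    for score, doc, meta in combined:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate content; the better-ranked copy was kept
        seen.add(digest)
        top.append((score, doc, meta))
        if len(top) == k:
            break  # truncate to the top k entries

    return {
        "distances": [t[0] for t in top],
        "documents": [t[1] for t in top],
        "metadatas": [t[2] for t in top],
    }


r1 = {"distances": [0.9, 0.5], "documents": ["a", "b"], "metadatas": [{"s": 1}, {"s": 2}]}
r2 = {"distances": [0.7, 0.8], "documents": ["a", "c"], "metadatas": [{"s": 3}, {"s": 4}]}
merged = merge_and_sort([r1, r2], k=2)
```

Sorting before deduplicating means that when the same text appears in several results, the best-ranked copy is the one that survives.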

Removed

Removed unneeded top-level imports: uuid and asyncio.

Additional Information

I measured the speedup from log entry timestamps. On my CPU-only system with 32 cores, answering questions about a single complex document of 360,000 characters takes about 50% less time.

Testing

The following manual tests were performed:

Querying...

a collection of 5 book-sized documents (multiple megabytes)
a collection with a single document
a single document uploaded in chat
multiple documents uploaded in chat

Queries used, verification:

Specific know-how queries for which I knew the appropriate answer. I manually verified the relevant document passages that were returned.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


GiteaMirror added the pull-request label 2026-04-29 20:25:54 -05:00

GiteaMirror closed this issue

2026-04-29 20:25:56 -05:00


Reference: github-starred/open-webui#45844