[PR #23275] feat: add BM25 hybrid search support for Qdrant (multitenancy mode) #42748

Open
opened 2026-04-25 14:33:47 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/23275
Author: @MiXaiLL76
Created: 3/31/2026
Status: 🔄 Open

Base: dev ← Head: qdrant-bm25


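📝 Commits (10+)

  • fe6783c Merge pull request #19030 from open-webui/dev
  • fc05e0a Merge pull request #19405 from open-webui/dev
  • e3faec6 Merge pull request #19416 from open-webui/dev
  • 9899293 Merge pull request #19448 from open-webui/dev
  • 140605e Merge pull request #19462 from open-webui/dev
  • 6f1486f Merge pull request #19466 from open-webui/dev
  • d95f533 Merge pull request #19729 from open-webui/dev
  • a727153 0.6.43 (#20093)
  • 6adde20 Merge pull request #20394 from open-webui/dev
  • f9b0534 Merge pull request #20522 from open-webui/dev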

📊 Changes

22 files changed (+227 additions, -40 deletions)


📝 backend/open_webui/config.py (+6 -0)
📝 backend/open_webui/retrieval/utils.py (+7 -4)
📝 backend/open_webui/retrieval/vector/dbs/chroma.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/elasticsearch.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/mariadb_vector.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/milvus.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/milvus_multitenancy.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/opengauss.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/opensearch.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/oracle23ai.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/pgvector.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/pinecone.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/qdrant.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/qdrant_multitenancy.py (+195 -36)
📝 backend/open_webui/retrieval/vector/dbs/s3vector.py (+1 -0)
📝 backend/open_webui/retrieval/vector/dbs/weaviate.py (+1 -0)
📝 backend/open_webui/retrieval/vector/main.py (+1 -0)
📝 backend/open_webui/routers/memories.py (+1 -0)
📝 backend/open_webui/routers/retrieval.py (+1 -0)
📝 backend/open_webui/tools/builtin.py (+1 -0)

...and 2 more files

📄 Description

Changelog Entry

Description

Adds hybrid search (dense + sparse BM25) support to the Qdrant multitenancy backend using fastembed for sparse text embeddings and Reciprocal Rank Fusion (RRF) for result merging.

This improves retrieval quality by combining semantic (dense vector) similarity with keyword-based (BM25 sparse) relevance, particularly for queries where exact keyword matching matters alongside semantic similarity.

Hybrid search relies on Qdrant's built-in functions and mechanisms.

To enable hybrid search, the raw query text is now propagated through the entire retrieval pipeline, from routers and tools down to the VectorDBBase.search() interface, via a new optional parameter, query: Optional[str] = None. All other vector DB backends accept this parameter without behavioral change (they simply ignore it), preserving full backward compatibility.
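
As a rough illustration of why the extra parameter is backward compatible, here is a minimal sketch. The class below is a simplified stand-in, not the project's actual VectorDBBase: every parameter name other than query, the return type, and the DenseOnlyBackend class are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Optional


class VectorDBBase(ABC):
    """Simplified stand-in for the real interface; only search() is shown."""

    @abstractmethod
    def search(
        self,
        collection_name: str,
        vectors: list[list[float]],
        limit: int,
        query: Optional[str] = None,  # raw query text, new and optional
    ) -> list[str]:
        ...


class DenseOnlyBackend(VectorDBBase):
    """Hypothetical backend that accepts `query` but never reads it."""

    def search(self, collection_name, vectors, limit, query=None):
        # Placeholder result: a real backend would run a vector search here.
        return [f"{collection_name}:hit{i}" for i in range(limit)]


backend = DenseOnlyBackend()
without_text = backend.search("docs", [[0.1, 0.2]], 2)
with_text = backend.search("docs", [[0.1, 0.2]], 2, query="bm25 scoring")
```

Because the parameter defaults to None, call sites that predate the change keep working, and a backend that ignores it returns the same result either way.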

Added

  • Qdrant hybrid search (BM25 + dense RRF) in multitenancy mode via fastembed SparseTextEmbedding:
    • Sparse vectors are stored alongside dense vectors in collections created with hybrid mode enabled.
    • At query time, dense and sparse results are merged using Qdrant's native FusionQuery(RRF).
    • Automatic detection of whether an existing collection is hybrid or dense-only (backward-compatible upserts and queries).
  • New environment variables for fine-grained control:
    • QDRANT_HYBRID_SEARCH_ENABLED (default: false) — enable/disable hybrid search.
    • QDRANT_SPARSE_EMBEDDING_MODEL (default: Qdrant/bm25) — fastembed sparse model name.
    • QDRANT_DENSE_VECTOR_NAME (default: dense) — named vector key for dense embeddings.
    • QDRANT_SPARSE_VECTOR_NAME (default: sparse) — named vector key for sparse embeddings.
    • QDRANT_HYBRID_SEARCH_FUSION_TYPE (default: rrf) — fusion method used to merge dense and sparse results.
    • QDRANT_SPARSE_ON_DISK (default: false) — store sparse index on disk.
  • New dependency: fastembed==0.8.0 (optional — hybrid search is gracefully disabled if not installed).
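
To make the fusion step concrete, the sketch below implements Reciprocal Rank Fusion in plain Python. It only illustrates the idea; in this PR the fusion is performed by Qdrant itself. The function name rrf_merge, the sample rankings, and the constant k = 60 (the value commonly used in the RRF literature) are assumptions, and the constant Qdrant applies internally may differ.

```python
from collections import defaultdict


def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["a", "b", "c"]   # hypothetical dense-vector ranking
sparse_hits = ["c", "a", "d"]  # hypothetical BM25 ranking
fused = rrf_merge([dense_hits, sparse_hits])
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)  # ['a', 'c', 'b', 'd']
```

Documents that rank well in both lists ("a" and "c") rise above documents that appear in only one, which is the behavior hybrid search relies on.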

Changed

  • VectorDBBase.search() signature extended with query: Optional[str] = None across all backends (Chroma, Elasticsearch, MariaDB, Milvus, Milvus multitenancy, OpenGauss, OpenSearch, Oracle 23ai, pgvector, Pinecone, Qdrant, Qdrant multitenancy, S3Vector, Weaviate).
  • query_doc() and process_query_collection() in retrieval/utils.py now receive and forward query_text so the raw query string reaches the vector DB layer.
  • query_collection() in retrieval/utils.py now zips queries with query_embeddings when dispatching to thread pool workers.
  • routers/retrieval.py — query_doc_handler passes form_data.query as query_text.
  • routers/memories.py — memory search passes form_data.content as query.
  • tools/builtin.py — knowledge-base search passes the raw query string.
  • Qdrant multitenancy collection creation uses named vectors config ({"dense": VectorParams, ...}) when hybrid mode is active.
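
The pairing of queries with their embeddings can be sketched as follows. This is an assumption-laden illustration rather than the code in retrieval/utils.py: search_one, query_collections, and the result shape are hypothetical, and only the zip-then-dispatch pattern is taken from the description above.

```python
from concurrent.futures import ThreadPoolExecutor


def search_one(collection: str, query_text: str, embedding: list[float]) -> dict:
    """Hypothetical worker; a real one would call the vector DB's search()."""
    return {"collection": collection, "query": query_text, "dim": len(embedding)}


def query_collections(collections, queries, query_embeddings):
    # zip() keeps each raw query paired with the embedding computed from it.
    pairs = list(zip(queries, query_embeddings))
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(search_one, name, text, emb)
            for name in collections
            for text, emb in pairs
        ]
        # Collect in submission order so results stay deterministic.
        return [future.result() for future in futures]


results = query_collections(
    ["kb1", "kb2"],
    ["what is rrf", "bm25"],
    [[0.1, 0.2], [0.3, 0.4, 0.5]],
)
```

Each worker receives both the text and the vector, so a hybrid backend can build its sparse query from the same string that produced the dense embedding.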

Deprecated

  • N/A

Removed

  • N/A

Fixed

  • N/A

Security

  • N/A

Breaking Changes

  • BREAKING CHANGE (Qdrant multitenancy only): Collections created with QDRANT_HYBRID_SEARCH_ENABLED=true use a named-vector schema ({"dense": ..., "sparse": ...}) instead of the previous unnamed single-vector schema. Existing collections created before this change will continue to work in dense-only mode (detected automatically via _is_hybrid_collection()). To migrate an existing collection to hybrid, it must be re-created (drop and re-index).
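
One way the automatic detection could work is sketched below. This is not the PR's _is_hybrid_collection() implementation: looks_hybrid is a hypothetical name, and the SimpleNamespace objects are stand-ins assumed to mirror the shape of the collection parameters (a vectors field and a sparse_vectors field) that a Qdrant client reports for a collection.

```python
from types import SimpleNamespace


def looks_hybrid(params, dense_name: str = "dense", sparse_name: str = "sparse") -> bool:
    """Return True when the collection params declare both named vectors."""
    vectors = params.vectors
    sparse = getattr(params, "sparse_vectors", None) or {}
    # Legacy collections hold a single unnamed vector config, not a dict.
    return isinstance(vectors, dict) and dense_name in vectors and sparse_name in sparse


# Stand-in for a collection created before this change (unnamed single vector).
legacy = SimpleNamespace(vectors=SimpleNamespace(size=384), sparse_vectors=None)
# Stand-in for a collection created with hybrid mode enabled.
hybrid = SimpleNamespace(
    vectors={"dense": SimpleNamespace(size=384)},
    sparse_vectors={"sparse": SimpleNamespace()},
)

legacy_is_hybrid = looks_hybrid(legacy)
hybrid_is_hybrid = looks_hybrid(hybrid)
```

Under this assumption a legacy collection is never mistaken for a hybrid one, so upserts and queries can branch per collection instead of per deployment.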

Additional Information

  • Hybrid search is disabled gracefully if fastembed is not installed — a warning is logged and dense-only search is used as fallback.
  • The _is_hybrid_collection() helper inspects the live Qdrant collection config, so mixed environments (some hybrid, some legacy collections) are handled correctly without manual intervention.
  • The query parameter is ignored by all non-Qdrant backends; no behavioral change for Chroma, Milvus, pgvector, etc.
  • fastembed will download the selected sparse model on first use (default: Qdrant/bm25, ~20 MB). Ensure outbound network access or pre-cache the model in air-gapped deployments.
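
The graceful fallback can be sketched as a small gate that checks both the environment flag and whether the optional package is importable. The helper names env_flag and hybrid_search_active, and the rule that only the string "true" (case-insensitive) enables the flag, are assumptions for illustration and are not taken from the PR's code.

```python
import importlib.util
import logging
import os

log = logging.getLogger(__name__)


def env_flag(name: str, default: str = "false", env=os.environ) -> bool:
    """Parse a boolean environment variable ("true"/"false", case-insensitive)."""
    return env.get(name, default).strip().lower() == "true"


def hybrid_search_active(env=os.environ, module: str = "fastembed") -> bool:
    """Hybrid mode needs both the flag and an importable sparse-embedding package."""
    if not env_flag("QDRANT_HYBRID_SEARCH_ENABLED", env=env):
        return False
    if importlib.util.find_spec(module) is None:
        # Optional dependency missing: warn once here and stay dense-only.
        log.warning("%s is not installed; falling back to dense-only search", module)
        return False
    return True
```

Calling hybrid_search_active() once at startup and caching the result would avoid repeating the warning on every query.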

Screenshots or Videos

[Two screenshots are attached to the original PR; the images are not reproduced here.]

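Contributor License Agreement

  • [x] By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.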


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 14:33:47 -05:00
Reference: github-starred/open-webui#42748