[GH-ISSUE #16816] issue: Document embedding doesn't work - active: hybrid search/reranker in v.0.6.23 #33583


GiteaMirror · 2026-04-25T07:29:03-05:00

GiteaMirror commented

2026-04-25 07:29:03 -05:00

Originally created by @eXt73 on GitHub (Aug 22, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16816

Check Existing Issues

I have searched the existing issues and discussions.
I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.23

Ollama Version (if applicable)

0.11.5

Operating System

Linux - Ubuntu 24.04

Browser (if applicable)

Brave v1.81.136

Confirmation

I have read and followed all instructions in README.md.
I am using the latest version of both Open WebUI and Ollama.
I have included the browser console logs.
I have included the Docker container logs.
I have provided every relevant configuration, setting, and environment variable used in my setup.
I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
Start with the initial platform/version/OS and dependencies used,
Specify exact install/launch/configure commands,
List URLs visited, user input (incl. example values/emails/passwords if needed),
Describe all options and toggles enabled or changed,
Include any files or environmental changes,
Identify the expected and actual result at each stage,
Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

The document should be embedded correctly, searched with the Qwen3 embedding model, and reranked by bge-reranker-v2-m3; the resulting fragments should then be sent to the LLM 'on the front end'.

Actual Behavior

In v0.6.23, document embedding stopped working when hybrid search and the reranker are active. The document appears to be embedded, but the LLM never actually receives it. With the reranker disabled, the embedding itself works correctly. I use hf.co/Qwen/Qwen3-Embedding-4B-GGUF:Q8_0 via Ollama for embeddings and BAAI/bge-reranker-v2-m3 as the reranker. Everything worked perfectly up to v0.6.22.

Steps to Reproduce

I'm skipping the obvious parts:

1. Enable the Qwen3 4B embedding model and hybrid search with the bge-reranker-v2-m3 reranker.
2. Post a document in the chat window.
3. Ask the main LLM for information from the document.
4. The model responds (if its system prompt is configured to do so) that there is no such information or file.
5. The response shows that the embedded file was not attached.

Logs & Screenshots

![Image](https://github.com/user-attachments/assets/c3295e8f-7775-47dc-8413-cbcf70ab283c)

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 07:29:03 -05:00

GiteaMirror closed this issue

2026-04-25 07:29:04 -05:00

GiteaMirror commented

2026-04-25 07:29:05 -05:00

@uuuhuuu commented on GitHub (Aug 22, 2025):

Same problem: after upgrading to version 0.6.23, none of my RAGs are working anymore. The problem persists in 0.6.24.


GiteaMirror commented

2026-04-25 07:29:06 -05:00

@alpilotx commented on GitHub (Aug 22, 2025):

Yes, according to the logs it always bails out with an error like this:

```
2025-08-22 11:52:29.328 | ERROR    | open_webui.retrieval.utils:process_query:377 - Error when querying the collection with hybrid_search: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
    self._bootstrap_inner()
    │    └ <function Thread._bootstrap_inner at 0x77f51935c9a0>
    └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
  File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
    │    └ <function Thread.run at 0x77f51935c680>
    └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
  File "/usr/local/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
    │    │        │    │        │    └ {}
    │    │        │    │        └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
    │    │        │    └ (<weakref at 0x77f2efc8f5b0; to 'ThreadPoolExecutor' at 0x77f36b9a8910>, <_queue.SimpleQueue object at 0x77f370820ef0>, None,...
    │    │        └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
    │    └ <function _worker at 0x77f5184349a0>
    └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 83, in _worker
    work_item.run()
    │         └ <function _WorkItem.run at 0x77f518434ae0>
    └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             │    │   │    │       │    └ {}
             │    │   │    │       └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
             │    │   │    └ ('40c6946f-e64a-4ddb-aabf-6c92a68ba612', '..................')
             │    │   └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
             │    └ <function query_collection_with_hybrid_search.<locals>.process_query at 0x77f370d80180>
             └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
> File "/app/backend/open_webui/retrieval/utils.py", line 364, in process_query
    result = query_doc_with_hybrid_search(
             └ <function query_doc_with_hybrid_search at 0x77f44b54a840>
  File "/app/backend/open_webui/retrieval/utils.py", line 194, in query_doc_with_hybrid_search
    raise e
  File "/app/backend/open_webui/retrieval/utils.py", line 167, in query_doc_with_hybrid_search
    result = compression_retriever.invoke(query)
             │                     │      └ '......................'
             │                     └ <function BaseRetriever.invoke at 0x77f44b754ae0>
             └ ContextualCompressionRetriever(base_compressor=RerankCompressor(embedding_function=<function chat_completion_files_handler.<l...
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 261, in invoke
    result = self._get_relevant_documents(
             │    └ <function ContextualCompressionRetriever._get_relevant_documents at 0x77f44b754860>
             └ ContextualCompressionRetriever(base_compressor=RerankCompressor(embedding_function=<function chat_completion_files_handler.<l...
  File "/usr/local/lib/python3.11/site-packages/langchain/retrievers/contextual_compression.py", line 44, in _get_relevant_documents
    compressed_docs = self.base_compressor.compress_documents(
                      │    │               └ <function RerankCompressor.compress_documents at 0x77f44b54b240>
                      │    └ RerankCompressor(embedding_function=<function chat_completion_files_handler.<locals>.<lambda>.<locals>.<lambda> at 0x77f37036...
                      └ ContextualCompressionRetriever(base_compressor=RerankCompressor(embedding_function=<function chat_completion_files_handler.<l...
  File "/app/backend/open_webui/retrieval/utils.py", line 969, in compress_documents
    if scores:
       └ array([3.95550847e-01, 3.36318165e-01, 7.56461471e-02, 5.71876824e-01,
                2.34597996e-01, 1.81326702e-01, 5.10640085e-01,...
```
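For context, the exception in the last frame is what NumPy raises whenever an array with more than one element is used in a boolean context such as `if scores:`. The following is a minimal sketch, assuming the reranker returns a NumPy array as the traceback suggests. The `scores` values, `has_scores`, and `top_k` are hypothetical illustrations, not Open WebUI code, and the explicit length check shown here is only one common way to avoid the error; it is not necessarily what the fix commit does.

```python
import numpy as np

# Hypothetical stand-in for the reranker output seen in the traceback.
scores = np.array([0.3955, 0.3363, 0.0756, 0.5719])


def truthiness_check_fails(values) -> bool:
    """Return True if `if values:` raises ValueError, as in the log."""
    try:
        if values:
            pass
    except ValueError:
        return True
    return False


def has_scores(values) -> bool:
    """Explicit emptiness check that works for lists and NumPy arrays."""
    return values is not None and len(values) > 0


def top_k(docs, values, k):
    """Pair documents with scores and keep the k best (illustrative helper)."""
    if not has_scores(values):
        return []
    ranked = sorted(zip(docs, values), key=lambda pair: pair[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:k]]


docs = ["a", "b", "c", "d"]
failed = truthiness_check_fails(scores)  # the implicit check raises
best = top_k(docs, scores, 2)            # the explicit check does not
```

Checking `len(values) > 0` (or `values.size` for arrays) states the intent directly, whereas `a.any()` or `a.all()` from the error message would test the score values themselves rather than whether any scores exist.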

GiteaMirror commented

2026-04-25 07:29:06 -05:00

@tjbck commented on GitHub (Aug 22, 2025):

Should be addressed in dev with fbff4e19de, testing wanted here!


GiteaMirror commented

2026-04-25 07:29:07 -05:00

@tjbck commented on GitHub (Aug 22, 2025):

@XYKiwi03 milvus error, unrelated to this post.


GiteaMirror commented

2026-04-25 07:29:08 -05:00

@alpilotx commented on GitHub (Aug 22, 2025):

> Should be addressed in dev with [fbff4e1](https://github.com/open-webui/open-webui/commit/fbff4e19de591a440fcc5716e6796a6ed2d512b7), testing wanted here!

I quickly tried this fix by manually "updating" my 0.6.24 install, and it seems to work!


Reference: github-starred/open-webui#33583