[GH-ISSUE #16816] issue: Document embedding doesn't work - active: hybrid search/reranker in v.0.6.23 #33583

Closed
opened 2026-04-25 07:29:03 -05:00 by GiteaMirror · 5 comments

Originally created by @eXt73 on GitHub (Aug 22, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16816

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.23

Ollama Version (if applicable)

0.11.5

Operating System

Linux - Ubuntu 24.04

Browser (if applicable)

Brave v1.81.136

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
      • Start with the initial platform/version/OS and dependencies used,
      • Specify exact install/launch/configure commands,
      • List URLs visited, user input (incl. example values/emails/passwords if needed),
      • Describe all options and toggles enabled or changed,
      • Include any files or environmental changes,
      • Identify the expected and actual result at each stage,
      • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

The document should be embedded, retrieved via hybrid search using the Qwen3 embedding model, and reranked by bge-reranker-v2-m3, and the resulting fragments should then be sent to the LLM 'on the front end'.

Actual Behavior

In v0.6.23, document embedding stopped working when hybrid search and the reranker are active. The document appears to be embedded, but the LLM never receives it. With the reranker disabled, embedding works correctly. I use hf.co/Qwen/Qwen3-Embedding-4B-GGUF:Q8_0 via Ollama for embeddings and BAAI/bge-reranker-v2-m3 as the reranker. Everything worked perfectly up to v0.6.22.

Steps to Reproduce

I'm skipping the obvious parts:

  1. Enable the Qwen3 4B embedding model and hybrid search with the bge-reranker-v2-m3 reranker.
  2. Post the document in the chat window.
  3. Ask the main LLM model for information from the document.
  4. The model will respond [if its system prompt is configured] that there is no information or file.
  5. The response shows that the embedded file is not attached.

Logs & Screenshots

Image: https://github.com/user-attachments/assets/c3295e8f-7775-47dc-8413-cbcf70ab283c

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 07:29:03 -05:00

@uuuhuuu commented on GitHub (Aug 22, 2025):

Same problem: after upgrading to version 0.6.23, none of my RAGs work anymore. Same problem with 0.6.24.


@alpilotx commented on GitHub (Aug 22, 2025):

Yes, according to the logs, it always bails out with an error like this:

```
2025-08-22 11:52:29.328 | ERROR    | open_webui.retrieval.utils:process_query:377 - Error when querying the collection with hybrid_search: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
    self._bootstrap_inner()
    │    └ <function Thread._bootstrap_inner at 0x77f51935c9a0>
    └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
  File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
    │    └ <function Thread.run at 0x77f51935c680>
    └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
  File "/usr/local/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
    │    │        │    │        │    └ {}
    │    │        │    │        └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
    │    │        │    └ (<weakref at 0x77f2efc8f5b0; to 'ThreadPoolExecutor' at 0x77f36b9a8910>, <_queue.SimpleQueue object at 0x77f370820ef0>, None,...
    │    │        └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
    │    └ <function _worker at 0x77f5184349a0>
    └ <Thread(ThreadPoolExecutor-11_2, started 131887359063744)>
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 83, in _worker
    work_item.run()
    │         └ <function _WorkItem.run at 0x77f518434ae0>
    └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             │    │   │    │       │    └ {}
             │    │   │    │       └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
             │    │   │    └ ('40c6946f-e64a-4ddb-aabf-6c92a68ba612', '..................')
             │    │   └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
             │    └ <function query_collection_with_hybrid_search.<locals>.process_query at 0x77f370d80180>
             └ <concurrent.futures.thread._WorkItem object at 0x77f2efc48e90>
> File "/app/backend/open_webui/retrieval/utils.py", line 364, in process_query
    result = query_doc_with_hybrid_search(
             └ <function query_doc_with_hybrid_search at 0x77f44b54a840>
  File "/app/backend/open_webui/retrieval/utils.py", line 194, in query_doc_with_hybrid_search
    raise e
  File "/app/backend/open_webui/retrieval/utils.py", line 167, in query_doc_with_hybrid_search
    result = compression_retriever.invoke(query)
             │                     │      └ '......................'
             │                     └ <function BaseRetriever.invoke at 0x77f44b754ae0>
             └ ContextualCompressionRetriever(base_compressor=RerankCompressor(embedding_function=<function chat_completion_files_handler.<l...
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 261, in invoke
    result = self._get_relevant_documents(
             │    └ <function ContextualCompressionRetriever._get_relevant_documents at 0x77f44b754860>
             └ ContextualCompressionRetriever(base_compressor=RerankCompressor(embedding_function=<function chat_completion_files_handler.<l...
  File "/usr/local/lib/python3.11/site-packages/langchain/retrievers/contextual_compression.py", line 44, in _get_relevant_documents
    compressed_docs = self.base_compressor.compress_documents(
                      │    │               └ <function RerankCompressor.compress_documents at 0x77f44b54b240>
                      │    └ RerankCompressor(embedding_function=<function chat_completion_files_handler.<locals>.<lambda>.<locals>.<lambda> at 0x77f37036...
                      └ ContextualCompressionRetriever(base_compressor=RerankCompressor(embedding_function=<function chat_completion_files_handler.<l...
  File "/app/backend/open_webui/retrieval/utils.py", line 969, in compress_documents
    if scores:
       └ array([3.95550847e-01, 3.36318165e-01, 7.56461471e-02, 5.71876824e-01,
                2.34597996e-01, 1.81326702e-01, 5.10640085e-01,...
```
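
For context, the failure on the last frame (`if scores:` applied to a NumPy array) can be reproduced outside Open WebUI. The snippet below is only an illustrative sketch: `has_scores` is a hypothetical helper, the score values are rounded from the log above, and none of it is taken from the Open WebUI code base or from any fix for this issue. It assumes NumPy is installed.

```python
import numpy as np

# Scores shaped like the reranker output in the log (values rounded, illustrative).
scores = np.array([0.3956, 0.3363, 0.0756, 0.5719])


def has_scores(scores) -> bool:
    """Hypothetical helper: True when at least one score is present.

    Checks for None and length explicitly instead of relying on the
    truth value of the array, which is ambiguous for multi-element arrays.
    """
    return scores is not None and len(scores) > 0


# A bare truthiness check on a multi-element array raises ValueError.
try:
    if scores:
        pass
except ValueError as exc:
    print(f"ValueError: {exc}")

# The explicit check handles populated arrays, empty arrays, and None.
print(has_scores(scores))
print(has_scores(np.array([])))
print(has_scores(None))
```

An explicit length check works the same way for plain Python lists and NumPy arrays, which is why it is a common replacement for a bare truthiness test when a function may return either type.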

@tjbck commented on GitHub (Aug 22, 2025):

Should be addressed in dev with fbff4e19de, testing wanted here!


@tjbck commented on GitHub (Aug 22, 2025):

@XYKiwi03 milvus error, unrelated to this post.


@alpilotx commented on GitHub (Aug 22, 2025):

> Should be addressed in dev with fbff4e1 (https://github.com/open-webui/open-webui/commit/fbff4e19de591a440fcc5716e6796a6ed2d512b7), testing wanted here!

I tried this fix (quickly, by manually "updating" my 0.6.24 install), and it seems to work!

Reference: github-starred/open-webui#33583