[GH-ISSUE #11970] issue: websearch collections causing memory exhaustion in vectordb #31949

Closed
opened 2026-04-25 05:50:25 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @patbernard on GitHub (Mar 23, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11970

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.5.20

Ollama Version (if applicable)

No response

Operating System

Using Docker image

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

The current vectordb pattern for websearch uses a separate collection for each query and indexing operation. This creates many small collections, which carry significant resource overhead. The bug is that this approach leads to excessive memory usage and eventual exhaustion, particularly with Qdrant, and goes against Qdrant's documented best practices.

Reference: https://qdrant.tech/documentation/faq/qdrant-fundamentals/#how-many-collections-can-i-create

> It is highly recommended not to create many small collections, as it will lead to significant resource consumption overhead.
>
> We consider creating a collection for each user/dialog/document as an antipattern.
>
> Please read more about collections, isolation, and multiple users in our Multitenancy tutorial (https://qdrant.tech/documentation/tutorials/multiple-partitions/).

The application should efficiently manage vector database resources, perhaps by using a single collection with additional tags and an index for queries and indexing. Memory usage should remain within reasonable bounds even with a high volume of searches and indexing.
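To make the suggestion concrete, here is a minimal sketch of the "one shared collection, tagged points, tag index" idea. It is a toy in-memory stand-in written with the standard library only, not Qdrant's or Open WebUI's API; the class `SharedCollection`, the `search_id` tag, and the helper `cosine` are hypothetical names used purely for illustration.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SharedCollection:
    """Toy model of ONE collection shared by every web search.

    ASSUMPTION: illustrative only. Each point carries a `search_id` tag, and a
    small index maps each tag to its point ids, so a query touches only the
    points that belong to one search instead of needing its own collection.
    """

    def __init__(self):
        self.points = {}                # point_id -> (vector, payload)
        self.by_tag = defaultdict(set)  # tag index: search_id -> point ids

    def upsert(self, point_id, vector, search_id, text):
        # If the point already exists under another tag, drop the old index entry.
        if point_id in self.points:
            old_tag = self.points[point_id][1]["search_id"]
            self.by_tag[old_tag].discard(point_id)
        self.points[point_id] = (vector, {"search_id": search_id, "text": text})
        self.by_tag[search_id].add(point_id)

    def query(self, vector, search_id, limit=3):
        # Filter by tag first, then rank the remaining points by similarity.
        candidates = self.by_tag.get(search_id, set())
        scored = [
            (cosine(vector, self.points[pid][0]), self.points[pid][1]["text"])
            for pid in candidates
        ]
        scored.sort(reverse=True)
        return [text for _, text in scored[:limit]]

    def delete_search(self, search_id):
        # Removing one search's data deletes points, not a whole collection.
        for pid in self.by_tag.pop(search_id, set()):
            del self.points[pid]


store = SharedCollection()
store.upsert(1, [1.0, 0.0], "search-a", "result from search A")
store.upsert(2, [0.0, 1.0], "search-b", "result from search B")
print(store.query([1.0, 0.0], "search-a"))
```

In a real vector database the tag would presumably be a payload field with an index on it, and the query would pass a filter on that field; the point of the sketch is only that isolation comes from the tag rather than from the number of collections.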

Actual Behavior

The application creates separate collections for each query and indexing operation, leading to high memory consumption and potential memory exhaustion of the vector database, specifically Qdrant.

Steps to Reproduce

  • Configure the application to use Qdrant
    • VECTOR_DB -> qdrant
    • QDRANT_URI -> http://my.qdrant.instance.local:6333
  • Perform a large number of web searches and indexing operations, creating many small collections.
  • Monitor the memory usage of the vector database (e.g., Qdrant).
  • Continue performing searches and indexing
  • Observe the memory continue to grow with each new web-search and each new collection created
  • Continue until memory exhaustion
  • Web searches start to fail with "No results found"

Logs & Screenshots

This is the error I start receiving in the Open WebUI server logs once the vector_db starts running out of memory.

 File "/app/backend/open_webui/utils/middleware.py", line 736, in process_chat_payload 
    form_data = await chat_web_search_handler( 
                      └ <function chat_web_search_handler at 0xffff7b1cad40> 
 
> File "/app/backend/open_webui/utils/middleware.py", line 341, in chat_web_search_handler 
    results = await process_web_search( 
                    └ <function process_web_search at 0xffff7c9e8fe0> 
 
  File "/app/backend/open_webui/routers/retrieval.py", line 1481, in process_web_search 
    raise HTTPException( 
          └ <class 'fastapi.exceptions.HTTPException'> 
 
fastapi.exceptions.HTTPException: 400: [ERROR: timed out] 

Here's a small snippet of the GET collections results from Qdrant:

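[Image: screenshot of the Qdrant collection list, https://github.com/user-attachments/assets/ee1f5255-c50b-4aa7-95c5-00689a73887c]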

Additional Information

The short-term workaround I am using for now is a script that runs periodically and wipes my websearch collections from Qdrant.

I haven't tested whether a similar issue occurs with other vector databases.
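For anyone wanting to try the same stopgap, a periodic cleanup along these lines might look like the sketch below. It is not the script used in this report. It assumes Qdrant's REST endpoints `GET /collections` and `DELETE /collections/{name}`, an instance without an API key, and that web search collections can be recognised by a `web-search` substring in their name; check that marker against your own collection list before running anything destructive.

```python
import json
import urllib.parse
import urllib.request

QDRANT_URL = "http://my.qdrant.instance.local:6333"  # example value from this report
WEBSEARCH_MARKER = "web-search"  # ASSUMPTION: substring that identifies web search collections


def select_websearch(collections_response, marker=WEBSEARCH_MARKER):
    """Return the collection names containing `marker` from a GET /collections body."""
    names = [c["name"] for c in collections_response["result"]["collections"]]
    return [name for name in names if marker in name]


def wipe_websearch_collections(base_url=QDRANT_URL, marker=WEBSEARCH_MARKER):
    """List all collections, delete the ones that match, and return their names."""
    with urllib.request.urlopen(f"{base_url}/collections", timeout=30) as resp:
        body = json.load(resp)
    doomed = select_websearch(body, marker)
    for name in doomed:
        url = f"{base_url}/collections/{urllib.parse.quote(name, safe='')}"
        request = urllib.request.Request(url, method="DELETE")
        # Each delete is a separate HTTP call; close the response once it returns.
        urllib.request.urlopen(request, timeout=30).close()
    return doomed
```

Calling `wipe_websearch_collections()` from a cron job or similar scheduler would reproduce the periodic wipe. It only treats the symptom, since new collections keep being created between runs.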

GiteaMirror added the bug label 2026-04-25 05:50:25 -05:00

Reference: github-starred/open-webui#31949