[GH-ISSUE #10827] Bug and fix for Milvus database settings not being reflected on Milvus #16043

Closed
opened 2026-04-19 22:05:12 -05:00 by GiteaMirror · 1 comment

Originally created by @pablocerdeira on GitHub (Feb 26, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/10827

Installation Method

Docker (using docker-compose)

Environment

  • Open WebUI Version: v0.5.16
  • Operating System: Ubuntu
  • Browser (if applicable): Not applicable (server-side issue)

Confirmation:

  • [x] I have read and followed all the instructions provided in the README.md.
  • [x] I am on the latest version of both Open WebUI and Ollama.
  • [ ] I have included the browser console logs. (Not applicable)
  • [x] I have included the Docker container logs.
  • [x] I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

When configuring MILVUS_DB (e.g., vanilla) in the docker-compose.yml, collections should be created and managed in the specified database (vanilla) instead of the default database (default) when using Milvus as the vector database.

Actual Behavior:

Collections are always created in the default database, even when MILVUS_DB is set to a custom value (e.g., vanilla). The MilvusClient initialization with database=MILVUS_DB does not enforce the specified database, and tools like Attu and Milvus Web show all collections under default.

Description

Bug Summary:
The MilvusClient in open_webui/retrieval/vector/dbs/milvus.py does not correctly use the database specified in MILVUS_DB from the configuration. Collections are created in the default database instead, due to an issue with how pymilvus handles the database parameter in the MilvusClient constructor. This requires explicit use of connections.connect and using_database() to enforce the configured database.

Reproduction Details

Steps to Reproduce:

  1. Set up Open WebUI with Docker using a docker-compose.yml that includes Milvus integration:
    services:
      open-webui:
        image: ghcr.io/open-webui/open-webui:cuda
        environment:
          - MILVUS_URI=tcp://milvus:19530
          - MILVUS_DB=vanilla
    
  2. Deploy a Milvus standalone instance (e.g., milvusdb/milvus:v2.5.4-gpu) using its own docker-compose.yml.
  3. Start both services: docker-compose up -d.
  4. Access Open WebUI, create a new knowledge base, and upload a file.
  5. Check the collections in Milvus using Attu or Milvus Web.
  6. Observe that the collections (e.g., open_webui_<id>) appear in the default database, not vanilla.
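
Steps 5 and 6 can also be checked from a script rather than through Attu. The following is a minimal sketch, not part of the original report: it assumes only the two client methods that the issue's patch already calls on the pymilvus MilvusClient (using_database and list_collections). The helper name locate_prefixed_collections and the FakeClient stand-in are hypothetical, introduced here so the example runs without a Milvus server.

```python
from typing import Dict, Iterable, List, Protocol


class DatabaseAwareClient(Protocol):
    """The two client methods this sketch relies on (both are used in the issue's patch)."""

    def using_database(self, db_name: str) -> None: ...

    def list_collections(self) -> List[str]: ...


def locate_prefixed_collections(
    client: DatabaseAwareClient,
    databases: Iterable[str],
    prefix: str = "open_webui",
) -> Dict[str, List[str]]:
    """Map each database name to the prefixed collections found in it."""
    found: Dict[str, List[str]] = {}
    for db_name in databases:
        client.using_database(db_name)  # switch the client's active database
        found[db_name] = sorted(
            name for name in client.list_collections() if name.startswith(prefix)
        )
    return found


class FakeClient:
    """Hypothetical in-memory stand-in for a Milvus client."""

    def __init__(self, collections_by_db: Dict[str, List[str]]):
        self._collections_by_db = collections_by_db
        self._active = "default"

    def using_database(self, db_name: str) -> None:
        self._active = db_name

    def list_collections(self) -> List[str]:
        return list(self._collections_by_db.get(self._active, []))


# Mimics the reported symptom: the collection sits in "default", none in "vanilla".
fake = FakeClient({"default": ["open_webui_file_c96bc88f", "unrelated"], "vanilla": []})
report = locate_prefixed_collections(fake, ["default", "vanilla"])
print(report)
```

Against a live deployment, a real client would be passed in place of FakeClient; the fake only mimics the symptom described above.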

Logs and Screenshots

Docker Container Logs:
From vanilla-open-webui:

DEBUG [pymilvus.milvus_client.milvus_client] Successfully created collection: open_webui_file_c96bc88f_2397_48d3_827d_2dafbb82d745
DEBUG [pymilvus.milvus_client.milvus_client] Successfully created collection: open_webui_daa1c34d_05a8_47c5_b359_a87baf4c5b5d
  • Despite MILVUS_DB=vanilla, these collections appear in default in Attu and Milvus Web.

Additional Information

  • Root Cause: The pymilvus (v2.5.4) MilvusClient constructor with database=MILVUS_DB does not reliably set the active database. Explicitly using connections.connect followed by using_database(MILVUS_DB) resolves this by ensuring the database context is correctly applied before operations.
  • Tested Fix: Modifying milvus.py to use connections.connect and using_database() ensures that collections are created in the configured database (e.g., vanilla).
  • Versions: Tested with pymilvus==2.5.4 and milvusdb/milvus:v2.5.4-gpu.
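
The database-selection step described under Root Cause can be isolated into one small helper. This is a sketch under stated assumptions, not the project's implementation: ensure_database, DatabaseAdmin, FakeAdmin, and FakeClient are hypothetical names. The admin argument is duck-typed to match the list_database and create_database functions that the patch imports from pymilvus.db, and the client argument only needs using_database.

```python
from typing import List, Protocol


class DatabaseAdmin(Protocol):
    """Shape of the database-management functions the patch uses from pymilvus.db."""

    def list_database(self) -> List[str]: ...

    def create_database(self, db_name: str) -> None: ...


class SupportsUsingDatabase(Protocol):
    def using_database(self, db_name: str) -> None: ...


def ensure_database(
    admin: DatabaseAdmin, client: SupportsUsingDatabase, db_name: str
) -> bool:
    """Create db_name if it is missing, then select it on the client.

    Returns True when the database had to be created.
    """
    created = False
    if db_name not in admin.list_database():
        admin.create_database(db_name)
        created = True
    client.using_database(db_name)  # select the database before any operation
    return created


class FakeAdmin:
    """Hypothetical stand-in for the database-management API."""

    def __init__(self) -> None:
        self.names = ["default"]

    def list_database(self) -> List[str]:
        return list(self.names)

    def create_database(self, db_name: str) -> None:
        self.names.append(db_name)


class FakeClient:
    """Hypothetical stand-in that only remembers the active database."""

    def __init__(self) -> None:
        self.active = "default"

    def using_database(self, db_name: str) -> None:
        self.active = db_name


admin, client = FakeAdmin(), FakeClient()
first = ensure_database(admin, client, "vanilla")   # database is created
second = ensure_database(admin, client, "vanilla")  # database already exists
print(first, second, client.active, admin.names)
```

Returning whether the database was created makes the helper safe to call on every startup while still letting the caller log the first-time creation.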

Diff Patch:
Below is the minimal diff between the original milvus.py and the fixed version:

diff --git a/open_webui/retrieval/vector/dbs/milvus.py b/open_webui/retrieval/vector/dbs/milvus.py
--- a/open_webui/retrieval/vector/dbs/milvus.py
+++ b/open_webui/retrieval/vector/dbs/milvus.py
@@ -1,4 +1,4 @@
-from pymilvus import MilvusClient as Client
+from pymilvus import MilvusClient as Client, connections, db
 from pymilvus import FieldSchema, DataType
 import json
 
@@ -13,9 +13,13 @@
 class MilvusClient:
     def __init__(self):
         self.collection_prefix = "open_webui"
+        connections.connect(uri=MILVUS_URI)
         if MILVUS_TOKEN is None:
-            self.client = Client(uri=MILVUS_URI, database=MILVUS_DB)
+            self.client = Client(uri=MILVUS_URI)
         else:
-            self.client = Client(uri=MILVUS_URI, database=MILVUS_DB, token=MILVUS_TOKEN)
+            self.client = Client(uri=MILVUS_URI, token=MILVUS_TOKEN)
+        if MILVUS_DB not in db.list_database():
+            db.create_database(MILVUS_DB)
+        self.client.using_database(MILVUS_DB)
 
     def _result_to_get_result(self, result) -> GetResult:
         ids = []
@@ -64,6 +68,7 @@
 
     def _create_collection(self, collection_name: str, dimension: int):
+        self.client.using_database(MILVUS_DB)
         schema = self.client.create_schema(
             auto_id=False,
             enable_dynamic_field=True,
         )
@@ -111,6 +116,7 @@
         # Search for the nearest neighbor items based on the vectors and return 'limit' number of results.
         collection_name = collection_name.replace("-", "_")
+        self.client.using_database(MILVUS_DB)
         result = self.client.search(
             collection_name=f"{self.collection_prefix}_{collection_name}",
             data=vectors,
             limit=limit,
@@ -149,6 +155,7 @@
                 current_fetch = min(
                     max_limit, remaining
                 )  # Determine how many items to fetch in this iteration
+                self.client.using_database(MILVUS_DB)
 
                 results = self.client.query(
                     collection_name=f"{self.collection_prefix}_{collection_name}",
@@ -186,6 +193,7 @@
         # Get all the items in the collection.
         collection_name = collection_name.replace("-", "_")
+        self.client.using_database(MILVUS_DB)
         result = self.client.query(
             collection_name=f"{self.collection_prefix}_{collection_name}",
             filter='id != ""',
         )
@@ -198,6 +206,7 @@
         # Insert the items into the collection, if the collection does not exist, it will be created.
         collection_name = collection_name.replace("-", "_")
+        self.client.using_database(MILVUS_DB)
         if not self.client.has_collection(
             collection_name=f"{self.collection_prefix}_{collection_name}"
         ):
             self._create_collection(
@@ -221,6 +230,7 @@
         # Update the items in the collection, if the items are not present, insert them. If the collection does not exist, it will be created.
         collection_name = collection_name.replace("-", "_")
+        self.client.using_database(MILVUS_DB)
         if not self.client.has_collection(
             collection_name=f"{self.collection_prefix}_{collection_name}"
         ):
             self._create_collection(
@@ -244,6 +254,7 @@
         # Delete the items from the collection based on the ids.
         collection_name = collection_name.replace("-", "_")
         if ids:
+            self.client.using_database(MILVUS_DB)
             return self.client.delete(
                 collection_name=f"{self.collection_prefix}_{collection_name}",
                 ids=ids,
@@ -257,6 +268,7 @@
                 for key, value in filter.items()
             ]
         )
+        self.client.using_database(MILVUS_DB)
 
         return self.client.delete(
             collection_name=f"{self.collection_prefix}_{collection_name}",
@@ -267,6 +279,7 @@
     def reset(self):
         # Resets the database. This will delete all collections and item entries.
+        self.client.using_database(MILVUS_DB)
         collection_names = self.client.list_collections()
         for collection_name in collection_names:
             if collection_name.startswith(self.collection_prefix):
                 self.client.drop_collection(collection_name=collection_name)

Note: self.client.using_database(MILVUS_DB) is a standalone statement, not an argument to the call that follows it. It should be invoked directly before each operation (e.g., self.client.using_database(MILVUS_DB) on one line, followed by self.client.create_collection(...) on the next).
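
One way to avoid repeating using_database before every call is a thin proxy that re-selects the database and then delegates. This is an illustrative alternative, not what the patch does: DatabasePinnedClient and RecordingClient are hypothetical names, and the only method assumed on the wrapped client is using_database, as used in the diff.

```python
from typing import Any, List


class DatabasePinnedClient:
    """Delegate every method call to the wrapped client after re-selecting one database."""

    def __init__(self, client: Any, db_name: str) -> None:
        self._client = client
        self._db_name = db_name

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes that are not defined on the proxy itself.
        attr = getattr(self._client, name)
        if not callable(attr):
            return attr

        def pinned(*args: Any, **kwargs: Any) -> Any:
            # Standalone call before the operation, never passed as an argument.
            self._client.using_database(self._db_name)
            return attr(*args, **kwargs)

        return pinned


class RecordingClient:
    """Hypothetical stand-in that records the order of calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def using_database(self, db_name: str) -> None:
        self.calls.append(f"using_database({db_name})")

    def has_collection(self, collection_name: str) -> bool:
        self.calls.append(f"has_collection({collection_name})")
        return False


recorder = RecordingClient()
pinned_client = DatabasePinnedClient(recorder, "vanilla")
exists = pinned_client.has_collection(collection_name="open_webui_demo")
print(recorder.calls)
```

If proxies pinned to different databases share one underlying client across threads, the select-then-call pair would additionally need a lock.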

Note

This fix ensures compatibility with user-configured MILVUS_URI and MILVUS_DB, addressing the issue without requiring changes to the Milvus server configuration. I recommend integrating this into the main codebase or documenting it as a workaround for users relying on custom databases with Milvus.


@tjbck commented on GitHub (Feb 26, 2025):

PR welcome!

Reference: github-starred/open-webui#16043