[GH-ISSUE #17088] issue: milvus error because the limit set to None by default #18165

Closed
opened 2026-04-20 00:22:55 -05:00 by GiteaMirror · 16 comments
Owner

Originally created by @aaronchenlei on GitHub (Sep 1, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/17088

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Git Clone

Open WebUI Version

v0.6.26

Ollama Version (if applicable)

No response

Operating System

ubuntu 22.04

Browser (if applicable)

EDGE

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
      • Start with the initial platform/version/OS and dependencies used,
      • Specify exact install/launch/configure commands,
      • List URLs visited, user input (incl. example values/emails/passwords if needed),
      • Describe all options and toggles enabled or changed,
      • Include any files or environmental changes,
      • Identify the expected and actual result at each stage,
      • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

I use Milvus as the vector database, but I hit the following exception. It seems to happen because the limit is set to None in backend/open_webui/retrieval/vector/dbs/milvus.py, line 252:

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/utils.py", line 647, in get_sources_from_items
query_result = query_collection_with_hybrid_search(
└ <function query_collection_with_hybrid_search at 0x7fcf926245e0>

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/utils.py", line 353, in query_collection_with_hybrid_search
collection_results[collection_name] = VECTOR_DB_CLIENT.get(
│ │ │ └ <function MilvusClient.get at 0x7fcfd7e62480>
│ │ └ <open_webui.retrieval.vector.dbs.milvus.MilvusClient object at 0x7fcfe4d4d150>
│ └ '98d31974-93f5-4992-a3f6-4a41f6f77847'
└ {}

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/vector/dbs/milvus.py", line 252, in get
return self.query(collection_name=collection_name, filter={}, limit=None)
│ │ └ '98d31974_93f5_4992_a3f6_4a41f6f77847'
│ └ <function MilvusClient.query at 0x7fcfd7e623e0>
└ <open_webui.retrieval.vector.dbs.milvus.MilvusClient object at 0x7fcfe4d4d150>

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/vector/dbs/milvus.py", line 229, in query
result = iterator.next()
│ └ <function QueryIterator.next at 0x7fcfd7fffc40>
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>

File "/home/jovyan/envs/py311/lib/python3.11/site-packages/pymilvus/orm/iterator.py", line 342, in next
ret = self.__check_reached_limit(ret)
│ └ [{'id': '0b2af9c1-5528-473a-9ef0-0e7a4c906591', 'data': {'text': " |\n| A-DCH |\n| Associated Dedicated Channel |\n| |\n...
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
File "/home/jovyan/envs/py311/lib/python3.11/site-packages/pymilvus/orm/iterator.py", line 351, in __check_reached_limit
left_count = self._limit - self._returned_count
│ │ │ └ 0
│ │ └ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
│ └ None
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>

TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

However, I couldn't find anywhere to set the limit. Has anybody else run into the same issue?
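
The last frame of the traceback subtracts `self._returned_count` from `self._limit`, which fails when the limit is None. A minimal sketch of that failure mode, using a simplified stand-in class (`LimitedIterator` is hypothetical and is not pymilvus code):

```python
class LimitedIterator:
    """Simplified stand-in for an iterator that enforces a result limit.

    This is NOT pymilvus code; it only mirrors the subtraction shown in the
    traceback (`self._limit - self._returned_count`).
    """

    def __init__(self, limit):
        self._limit = limit
        self._returned_count = 0

    def check_reached_limit(self, batch):
        # Raises TypeError when self._limit is None.
        left_count = self._limit - self._returned_count
        trimmed = batch[:left_count]
        self._returned_count += len(trimmed)
        return trimmed


rows = ["a", "b", "c"]

# An integer limit trims the batch as expected.
print(LimitedIterator(limit=2).check_reached_limit(rows))

# A None limit raises the same error class as in the traceback.
try:
    LimitedIterator(limit=None).check_reached_limit(rows)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

This only illustrates why any integer value avoids the exception; it says nothing about which value the real client should use.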

Actual Behavior

I use Milvus as the vector database, but I hit the following exception. It seems to happen because the limit is set to None in backend/open_webui/retrieval/vector/dbs/milvus.py, line 252.

Steps to Reproduce

I use Milvus as the vector database, but I hit an exception. It seems to happen because the limit is set to None in backend/open_webui/retrieval/vector/dbs/milvus.py, line 252.

Logs & Screenshots

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/utils.py", line 647, in get_sources_from_items
query_result = query_collection_with_hybrid_search(
└ <function query_collection_with_hybrid_search at 0x7fcf926245e0>

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/utils.py", line 353, in query_collection_with_hybrid_search
collection_results[collection_name] = VECTOR_DB_CLIENT.get(
│ │ │ └ <function MilvusClient.get at 0x7fcfd7e62480>
│ │ └ <open_webui.retrieval.vector.dbs.milvus.MilvusClient object at 0x7fcfe4d4d150>
│ └ '98d31974-93f5-4992-a3f6-4a41f6f77847'
└ {}

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/vector/dbs/milvus.py", line 252, in get
return self.query(collection_name=collection_name, filter={}, limit=None)
│ │ └ '98d31974_93f5_4992_a3f6_4a41f6f77847'
│ └ <function MilvusClient.query at 0x7fcfd7e623e0>
└ <open_webui.retrieval.vector.dbs.milvus.MilvusClient object at 0x7fcfe4d4d150>

File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/vector/dbs/milvus.py", line 229, in query
result = iterator.next()
│ └ <function QueryIterator.next at 0x7fcfd7fffc40>
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>

File "/home/jovyan/envs/py311/lib/python3.11/site-packages/pymilvus/orm/iterator.py", line 342, in next
ret = self.__check_reached_limit(ret)
│ └ [{'id': '0b2af9c1-5528-473a-9ef0-0e7a4c906591', 'data': {'text': " |\n| A-DCH |\n| Associated Dedicated Channel |\n| |\n...
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
File "/home/jovyan/envs/py311/lib/python3.11/site-packages/pymilvus/orm/iterator.py", line 351, in __check_reached_limit
left_count = self._limit - self._returned_count
│ │ │ └ 0
│ │ └ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
│ └ None
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>

TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

However, I couldn't find anywhere to set the limit. Has anybody else run into the same issue?

Additional Information

No response

GiteaMirror added the bug label 2026-04-20 00:22:55 -05:00
Author
Owner

@aaronchenlei commented on GitHub (Sep 1, 2025):

After I changed the limit to 16383 (or any other int value), the exception disappeared, so I think it is a bug. It would be better to have an env parameter to control the limit value.
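
One way the env parameter idea could look, as a sketch only. The variable name `MILVUS_QUERY_LIMIT` is hypothetical (nothing in this thread says Open WebUI defines it), and 16383 is simply the value mentioned above:

```python
import os

# Hypothetical variable name; Open WebUI does not necessarily define it.
ENV_VAR = "MILVUS_QUERY_LIMIT"
DEFAULT_LIMIT = 16383  # the integer value mentioned in the comment above


def get_query_limit(environ=os.environ):
    """Return a positive integer query limit, falling back to DEFAULT_LIMIT."""
    raw = environ.get(ENV_VAR)
    if raw is None or raw.strip() == "":
        return DEFAULT_LIMIT
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric input: fall back instead of failing at query time.
        return DEFAULT_LIMIT
    return value if value > 0 else DEFAULT_LIMIT


print(get_query_limit({}))
print(get_query_limit({ENV_VAR: "500"}))
```

The helper always returns an int, so the None case from the traceback cannot reach the iterator.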


@qiaozhi199 commented on GitHub (Sep 11, 2025):

I also encountered the same issue, and the Open WebUI version is v0.6.28. Milvus version is v2.5.4.


@qiaozhi199 commented on GitHub (Sep 11, 2025):

> After I changed the limit to 16383 (or any other int value), the exception disappeared, so I think it is a bug. It would be better to have an env parameter to control the limit value.

How did you modify the limit? Did you change the Open WebUI source code?


@qiaozhi199 commented on GitHub (Sep 12, 2025):

https://github.com/open-webui/open-webui/issues/16954
This issue mentions that upgrading the pymilvus version can resolve the problem.

However, I upgraded pymilvus in Open WebUI to the latest version (2.6.1), and the same error still occurred.

My Milvus server version is v2.5.4. I also upgraded the pymilvus version in Open WebUI to match my server version (2.5.4), but it still reported the same error.
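
When testing different pymilvus versions like this, it can help to confirm which version the running environment actually imports. A small sketch using only the standard library (the helper name `installed_version` is made up for illustration):

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run this with the same interpreter that runs Open WebUI.
print("pymilvus:", installed_version("pymilvus"))
```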


@wzlsdu commented on GitHub (Sep 24, 2025):

Got the same error. My Milvus version is v2.6.2-gpu, and Open WebUI is v0.6.30.


@Classic298 commented on GitHub (Sep 24, 2025):

@wzlsdu what if you manually update the milvus dependency in open webui?


@colabbear commented on GitHub (Sep 24, 2025):

I've noticed a potential inconsistency in the limit parameter handling.

In the query method of [pymilvus/orm/collection.py](https://github.com/milvus-io/pymilvus/blob/v2.5.0/pymilvus/orm/collection.py), the default value for limit is UNLIMITED, which is defined as -1 in [pymilvus/orm/constants.py](https://github.com/milvus-io/pymilvus/blob/v2.5.0/pymilvus/orm/constants.py).

However, in [open-webui/retrieval/vector/dbs/milvus.py](https://github.com/open-webui/open-webui/blob/v0.6.30/backend/open_webui/retrieval/vector/dbs/milvus.py), the query method's signature defines limit as Optional[int] = None, and a comment on line 225 states that None means no limit. This seems to create a contradiction with how pymilvus handles unlimited queries.

To ensure consistency and correct behavior, I suggest the following changes in milvus.py:

Change the query method's signature to def query(self, collection_name: str, filter: dict, limit: Optional[int] = -1):

Change the method call return self.query(collection_name=collection_name, filter={}, limit=None) to return self.query(collection_name=collection_name, filter={}, limit=-1)

This would align the open-webui implementation with the pymilvus library's intended use of -1 for unlimited queries.
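
The None to -1 mapping described above can be sketched in isolation. Everything here is illustrative: `UNLIMITED` is defined locally to mirror the value reported for pymilvus, and `normalize_limit` and `take` are hypothetical helpers, not Open WebUI or pymilvus functions:

```python
from typing import Optional

# Mirrors the value the comment above reports for pymilvus' UNLIMITED
# constant; defined locally so the sketch runs without pymilvus installed.
UNLIMITED = -1


def normalize_limit(limit: Optional[int]) -> int:
    """Map None to the UNLIMITED sentinel; pass integers through unchanged."""
    return UNLIMITED if limit is None else limit


def take(rows: list, limit: Optional[int] = None) -> list:
    """Hypothetical consumer that treats -1 as 'return everything'."""
    limit = normalize_limit(limit)
    return list(rows) if limit == UNLIMITED else list(rows[:limit])


print(take([1, 2, 3]))           # no limit given
print(take([1, 2, 3], limit=2))  # explicit integer limit
```

Normalizing at the boundary is one option alongside changing the default in the signature: callers that still pass None keep working, while the downstream code only ever sees an int.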


@Classic298 commented on GitHub (Sep 24, 2025):

Good catch

@wzlsdu @colabbear

If you have time, please also check this discussion; I think I also found something weird:

https://github.com/open-webui/open-webui/discussions/17687


@wzlsdu commented on GitHub (Sep 26, 2025):

> I've noticed a potential inconsistency in the limit parameter handling.
>
> In the query method of [pymilvus/orm/collection.py](https://github.com/milvus-io/pymilvus/blob/v2.5.0/pymilvus/orm/collection.py), the default value for limit is UNLIMITED, which is defined as -1 in [pymilvus/orm/constants.py](https://github.com/milvus-io/pymilvus/blob/v2.5.0/pymilvus/orm/constants.py).
>
> However, in [open-webui/retrieval/vector/dbs/milvus.py](https://github.com/open-webui/open-webui/blob/v0.6.30/backend/open_webui/retrieval/vector/dbs/milvus.py), the query method's signature defines limit as Optional[int] = None, and a comment on line 225 states that None means no limit. This seems to create a contradiction with how pymilvus handles unlimited queries.
>
> To ensure consistency and correct behavior, I suggest the following changes in milvus.py:
>
> Change the query method's signature to def query(self, collection_name: str, filter: dict, limit: Optional[int] = -1):
>
> Change the method call return self.query(collection_name=collection_name, filter={}, limit=None) to return self.query(collection_name=collection_name, filter={}, limit=-1)
>
> This would align the open-webui implementation with the pymilvus library's intended use of -1 for unlimited queries.

@colabbear I used this solution, it works and currently everything is fine, thank you!


@Classic298 commented on GitHub (Sep 26, 2025):

https://github.com/open-webui/open-webui/pull/17769


@wzlsdu commented on GitHub (Sep 26, 2025):

After changing the limit to -1, I get another error when uploading files, and the uploads fail. When I change the limit back to None, the files can be uploaded again. The error is: Duplicate content detected.
I made sure that the files are different.

2025-09-26 07:55:05.416 | INFO | open_webui.retrieval.vector.dbs.milvus:_create_collection:157 - Successfully created collection 'open_webui_file_86b1004d_8537_49b2_93a6_6093ce10cf51' with index type 'IVF_FLAT' and metric 'COSINE'.
2025-09-26 07:55:05.416 | INFO | open_webui.retrieval.vector.dbs.milvus:insert:274 - Inserting 43 items into collection open_webui_file_86b1004d_8537_49b2_93a6_6093ce10cf51.
2025-09-26 07:55:05.703 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1391 - added 43 items to collection file-86b1004d-8537-49b2-93a6-6093ce10cf51
2025-09-26 07:55:05.704 | INFO | open_webui.routers.retrieval:process_file:1577 - added 9 items to collection file-86b1004d-8537-49b2-93a6-6093ce10cf51
2025-09-26 07:55:06.190 | INFO | open_webui.retrieval.vector.dbs.milvus:query:214 - Querying collection open_webui_file_86b1004d_8537_49b2_93a6_6093ce10cf51 with filter: 'metadata["file_id"] == "86b1004d-8537-49b2-93a6-6093ce10cf51"', limit: -1
2025-09-26 07:55:06.427 | INFO | open_webui.retrieval.vector.dbs.milvus:query:235 - Total results from query: 43
2025-09-26 07:55:06.429 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1221 - save_docs_to_vector_db: document LS 70-02_Business Travel Regulation.pdf e13b9ff6-3687-4e49-b11c-239f21d54b50
2025-09-26 07:55:06.434 | INFO | open_webui.retrieval.vector.dbs.milvus:query:214 - Querying collection open_webui_e13b9ff6_3687_4e49_b11c_239f21d54b50 with filter: 'metadata["hash"] == "885d89ec8c1c6701b15b68dd7dd332b3ec274e23a5e852a6e96c214630983bb7"', limit: -1
2025-09-26 07:55:06.627 | INFO | open_webui.retrieval.vector.dbs.milvus:query:235 - Total results from query: 26
2025-09-26 07:55:06.627 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1235 - Document with hash 885d89ec8c1c6701b15b68dd7dd332b3ec274e23a5e852a6e96c214630983bb7 already exists
2025-09-26 07:55:06.627 | ERROR | open_webui.routers.retrieval:process_file:1604 - Duplicate content detected. Please provide unique content to proceed.
Traceback (most recent call last):

File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
self._bootstrap_inner()
│ └ <function Thread._bootstrap_inner at 0x7ff5a54009a0>
└ <WorkerThread(AnyIO worker thread, started 140689010562752)>
File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
self.run()
│ └ <function WorkerThread.run at 0x7ff4c0d14180>
└ <WorkerThread(AnyIO worker thread, started 140689010562752)>
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
│ │ │ └ ()
│ │ └ functools.partial(<function add_file_to_knowledge_by_id at 0x7ff4d12efce0>, user=UserModel(id='8f5fd69e-772c-46a7-98dd-f48cd4...
│ └ <method 'run' of '_contextvars.Context' objects>
└ <_contextvars.Context object at 0x7ff4d0aaa540>

File "/app/backend/open_webui/routers/knowledge.py", line 398, in add_file_to_knowledge_by_id
process_file(
└ <function process_file at 0x7ff4d1426fc0>

File "/app/backend/open_webui/routers/retrieval.py", line 1601, in process_file
raise e
└ ValueError(<ERROR_MESSAGES.DUPLICATE_CONTENT: 'Duplicate content detected. Please provide unique content to proceed.'>)

File "/app/backend/open_webui/routers/retrieval.py", line 1565, in process_file
result = save_docs_to_vector_db(
└ <function save_docs_to_vector_db at 0x7ff4d15c3f60>

File "/app/backend/open_webui/routers/retrieval.py", line 1236, in save_docs_to_vector_db
raise ValueError(ERROR_MESSAGES.DUPLICATE_CONTENT)
│ └ <ERROR_MESSAGES.DUPLICATE_CONTENT: 'Duplicate content detected. Please provide unique content to proceed.'>
└ <enum 'ERROR_MESSAGES'>

ValueError: Duplicate content detected. Please provide unique content to proceed.
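
The query log lines above show filters of the form `metadata["key"] == "value"`. A minimal sketch of how such a string might be assembled from a dict; `build_filter` is a hypothetical helper, and joining multiple conditions with `and` is an assumption rather than something shown in these logs:

```python
import json


def build_filter(conditions: dict) -> str:
    """Join equality checks on the `metadata` JSON field into one expression."""
    parts = [
        # json.dumps quotes and escapes string values with double quotes.
        f'metadata["{key}"] == {json.dumps(value)}'
        for key, value in conditions.items()
    ]
    return " and ".join(parts)


print(build_filter({"file_id": "86b1004d-8537-49b2-93a6-6093ce10cf51"}))
```

An empty dict yields an empty string, which matches the `filter={}` call in the original traceback only in spirit; how the real client treats an empty filter is not shown in this thread.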


@Classic298 commented on GitHub (Sep 26, 2025):

Where are you trying to upload files?

This error is not unique to milvus.

If you create a knowledge base and upload files with duplicate content, you will always get this error. Even if you remove the file from the knowledge base and then upload it again, the error is still shown, because the file still exists in the backend and therefore counts as duplicated content.
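
The behavior described here can be illustrated with a simplified, in-memory stand-in for a hash-based duplicate check. `InMemoryStore` is hypothetical and is not Open WebUI code; the use of SHA-256 is an assumption (the 64-character hash in the logs is consistent with it, but the thread does not confirm it):

```python
import hashlib


class InMemoryStore:
    """Hypothetical stand-in for the backend store; not Open WebUI code."""

    def __init__(self):
        self._hashes = set()

    def save(self, text: str) -> str:
        # Assumption: content is identified by a SHA-256 digest of its text.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in self._hashes:
            raise ValueError("Duplicate content detected.")
        self._hashes.add(digest)
        return digest


store = InMemoryStore()
store.save("travel regulation, version 1")
store.save("a different document")

# Re-uploading identical content fails for as long as the hash is still stored.
try:
    store.save("travel regulation, version 1")
except ValueError as exc:
    print(exc)
```

In this model, removing a file from a knowledge base without also removing its hash from the store leaves the check armed, which matches the re-upload behavior described above.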


@wzlsdu commented on GitHub (Sep 26, 2025):

I upload files in the knowledge base. Even when I upload a new file with content that has never been uploaded before, I still get this duplicate error; only one file can be uploaded to this knowledge base.


@alvarolopez commented on GitHub (Nov 26, 2025):

@Classic298 the issue is solved with your PR, but the problem still exists, why is the issue closed?


@Classic298 commented on GitHub (Nov 26, 2025):

@alvarolopez no, the issue is not solved with my PR, because my PR was closed.

I need to break the PR down into smaller packages so it can get merged. Will do it when I have time.

Remind me again if it isn't done in a week

And I reopened this issue for now.


@Classic298 commented on GitHub (Dec 14, 2025):

This was fixed a few versions ago.


Reference: github-starred/open-webui#18165