mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 03:18:23 -05:00
[GH-ISSUE #17088] issue: milvus error because the limit set to None by default #56831
Originally created by @aaronchenlei on GitHub (Sep 1, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/17088
Installation Method
Git Clone
Open WebUI Version
v0.6.26
Ollama Version (if applicable)
No response
Operating System
ubuntu 22.04
Browser (if applicable)
EDGE
Expected Behavior
I use Milvus as the vector database and hit the following exception. It seems to occur because the limit is set to None in backend/open_webui/retrieval/vector/dbs/milvus.py, line 252:
File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/utils.py", line 647, in get_sources_from_items
query_result = query_collection_with_hybrid_search(
└ <function query_collection_with_hybrid_search at 0x7fcf926245e0>
File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/utils.py", line 353, in query_collection_with_hybrid_search
collection_results[collection_name] = VECTOR_DB_CLIENT.get(
│ │ │ └ <function MilvusClient.get at 0x7fcfd7e62480>
│ │ └ <open_webui.retrieval.vector.dbs.milvus.MilvusClient object at 0x7fcfe4d4d150>
│ └ '98d31974-93f5-4992-a3f6-4a41f6f77847'
└ {}
File "/home/jovyan/dev/open-webui-0.6.26/backend/open_webui/retrieval/vector/dbs/milvus.py", line 252, in get
return self.query(collection_name=collection_name, filter={}, limit=None)
│ │ └ '98d31974_93f5_4992_a3f6_4a41f6f77847'
│ └ <function MilvusClient.query at 0x7fcfd7e623e0>
└ <open_webui.retrieval.vector.dbs.milvus.MilvusClient object at 0x7fcfe4d4d150>
File "/home/jovyan/envs/py311/lib/python3.11/site-packages/pymilvus/orm/iterator.py", line 342, in next
ret = self.__check_reached_limit(ret)
│ └ [{'id': '0b2af9c1-5528-473a-9ef0-0e7a4c906591', 'data': {'text': " |\n| A-DCH |\n| Associated Dedicated Channel |\n| |\n...
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
File "/home/jovyan/envs/py311/lib/python3.11/site-packages/pymilvus/orm/iterator.py", line 351, in __check_reached_limit
left_count = self._limit - self._returned_count
│ │ │ └ 0
│ │ └ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
│ └ None
└ <pymilvus.orm.iterator.QueryIterator object at 0x7fcf7232a8d0>
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
However, I could not find anywhere to set the limit. Has anybody else run into the same issue?
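A minimal sketch of why the last frame of the traceback fails, assuming only what the traceback shows (a limit of None and a returned count of 0). LimitedIterator and remaining_or_error are hypothetical stand-ins for illustration; they are not pymilvus code and only copy the subtraction from the failing line.

```python
class LimitedIterator:
    """Hypothetical stand-in for an iterator that tracks a result limit."""

    def __init__(self, limit):
        self._limit = limit
        self._returned_count = 0

    def remaining(self):
        # Same arithmetic as the failing line in the traceback.
        return self._limit - self._returned_count


def remaining_or_error(limit):
    """Return the remaining count, or the error text if the subtraction fails."""
    try:
        return LimitedIterator(limit).remaining()
    except TypeError as exc:
        return str(exc)
```

With an integer limit the subtraction works; with None it raises the same TypeError as in the log, which is consistent with the limit never being converted to an integer before it reaches the iterator.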
Actual Behavior
I use Milvus as the vector database and hit the exception shown above. It seems to occur because the limit is set to None in backend/open_webui/retrieval/vector/dbs/milvus.py, line 252.
Steps to Reproduce
Use Milvus as the vector database. The exception is then raised, apparently because the limit is set to None in backend/open_webui/retrieval/vector/dbs/milvus.py, line 252.
Logs & Screenshots
Same traceback as under Expected Behavior above.
Additional Information
No response
@aaronchenlei commented on GitHub (Sep 1, 2025):
After I changed the limit to 16383 (or any other int value), the exception disappeared, so I think this is a bug. It would be better to have an environment variable to control the limit value.
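One way the suggested environment variable could look, as a rough sketch only. The variable name MILVUS_QUERY_LIMIT and the helper get_query_limit are hypothetical; Open WebUI may not define anything like them. The fallback of 16383 is simply the value reported to work above.

```python
import os

# Hypothetical variable name, used for illustration only.
ENV_NAME = "MILVUS_QUERY_LIMIT"
# Fallback taken from the value reported to work in this thread.
DEFAULT_LIMIT = 16383


def get_query_limit(environ=os.environ):
    """Read a positive integer limit from the environment, else fall back."""
    raw = environ.get(ENV_NAME, "").strip()
    if not raw:
        return DEFAULT_LIMIT
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric input: ignore it rather than pass a bad limit along.
        return DEFAULT_LIMIT
    return value if value > 0 else DEFAULT_LIMIT
```

Passing the mapping as an argument keeps the helper easy to test, for example get_query_limit({"MILVUS_QUERY_LIMIT": "500"}).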
@qiaozhi199 commented on GitHub (Sep 11, 2025):
I encountered the same issue with Open WebUI v0.6.28 and Milvus v2.5.4.
@qiaozhi199 commented on GitHub (Sep 11, 2025):
How did you modify the limit? Did you change the Open WebUI source code?
@qiaozhi199 commented on GitHub (Sep 12, 2025):
https://github.com/open-webui/open-webui/issues/16954
This issue mentions that upgrading the pymilvus version can resolve the problem.
However, I upgraded pymilvus in Open WebUI to the latest version (2.6.1) and the same error still occurred.
My Milvus server version is v2.5.4. I also changed the pymilvus version in Open WebUI to match the server version (2.5.4), but it still reported the same error.
@wzlsdu commented on GitHub (Sep 24, 2025):
Got the same error. My Milvus version is v2.6.2-gpu and Open WebUI is v0.6.30.
@Classic298 commented on GitHub (Sep 24, 2025):
@wzlsdu what happens if you manually update the Milvus dependency in Open WebUI?
@colabbear commented on GitHub (Sep 24, 2025):
I've noticed a potential inconsistency in the limit parameter handling.
In the query method of [pymilvus/orm/collection.py](https://github.com/milvus-io/pymilvus/blob/v2.5.0/pymilvus/orm/collection.py), the default value for limit is UNLIMITED, which is defined as -1 in [pymilvus/orm/constants.py](https://github.com/milvus-io/pymilvus/blob/v2.5.0/pymilvus/orm/constants.py).
However, in [open-webui/retrieval/vector/dbs/milvus.py](https://github.com/open-webui/open-webui/blob/v0.6.30/backend/open_webui/retrieval/vector/dbs/milvus.py), the query method's signature defines limit as Optional[int] = None, and a comment on line 225 states that None means no limit. This seems to contradict how pymilvus handles unlimited queries.
To ensure consistency and correct behavior, I suggest the following changes in milvus.py:
Change the query method's signature to def query(self, collection_name: str, filter: dict, limit: Optional[int] = -1):
Change the method call return self.query(collection_name=collection_name, filter={}, limit=None) to return self.query(collection_name=collection_name, filter={}, limit=-1)
This would align the open-webui implementation with the pymilvus library's intended use of -1 for unlimited queries.
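A small sketch of the same idea expressed as a normalization step, assuming the -1 sentinel described above. normalize_limit is a hypothetical helper, not part of Open WebUI or pymilvus, and the UNLIMITED constant here is redefined locally rather than imported.

```python
from typing import Optional

# Local copy of the sentinel the comment above cites from pymilvus v2.5.0.
UNLIMITED = -1


def normalize_limit(limit: Optional[int]) -> int:
    """Map None (and any negative value) to the -1 'unlimited' sentinel."""
    if limit is None or limit < 0:
        return UNLIMITED
    return limit
```

Calling normalize_limit(None) gives -1, while an explicit value such as normalize_limit(100) passes through unchanged. A later comment in this thread reports a side effect after switching to -1, so this is a starting point rather than a confirmed fix.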
@Classic298 commented on GitHub (Sep 24, 2025):
Good catch
@wzlsdu @colabbear
If you have time, please also check this discussion. I think I also found something odd there:
https://github.com/open-webui/open-webui/discussions/17687
@wzlsdu commented on GitHub (Sep 26, 2025):
@colabbear I used this solution, it works and currently everything is fine, thank you!
@Classic298 commented on GitHub (Sep 26, 2025):
https://github.com/open-webui/open-webui/pull/17769
@wzlsdu commented on GitHub (Sep 26, 2025):
After changing the limit to -1, I get another error when uploading files, and the uploads fail. When I change the limit back to None, the files can be uploaded again. The error is: Duplicate content detected
I made sure that the files are different.
2025-09-26 07:55:05.416 | INFO | open_webui.retrieval.vector.dbs.milvus:_create_collection:157 - Successfully created collection 'open_webui_file_86b1004d_8537_49b2_93a6_6093ce10cf51' with index type 'IVF_FLAT' and metric 'COSINE'.
2025-09-26 07:55:05.416 | INFO | open_webui.retrieval.vector.dbs.milvus:insert:274 - Inserting 43 items into collection open_webui_file_86b1004d_8537_49b2_93a6_6093ce10cf51.
2025-09-26 07:55:05.703 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1391 - added 43 items to collection file-86b1004d-8537-49b2-93a6-6093ce10cf51
2025-09-26 07:55:05.704 | INFO | open_webui.routers.retrieval:process_file:1577 - added 9 items to collection file-86b1004d-8537-49b2-93a6-6093ce10cf51
2025-09-26 07:55:06.190 | INFO | open_webui.retrieval.vector.dbs.milvus:query:214 - Querying collection open_webui_file_86b1004d_8537_49b2_93a6_6093ce10cf51 with filter: 'metadata["file_id"] == "86b1004d-8537-49b2-93a6-6093ce10cf51"', limit: -1
2025-09-26 07:55:06.427 | INFO | open_webui.retrieval.vector.dbs.milvus:query:235 - Total results from query: 43
2025-09-26 07:55:06.429 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1221 - save_docs_to_vector_db: document LS 70-02_Business Travel Regulation.pdf e13b9ff6-3687-4e49-b11c-239f21d54b50
2025-09-26 07:55:06.434 | INFO | open_webui.retrieval.vector.dbs.milvus:query:214 - Querying collection open_webui_e13b9ff6_3687_4e49_b11c_239f21d54b50 with filter: 'metadata["hash"] == "885d89ec8c1c6701b15b68dd7dd332b3ec274e23a5e852a6e96c214630983bb7"', limit: -1
2025-09-26 07:55:06.627 | INFO | open_webui.retrieval.vector.dbs.milvus:query:235 - Total results from query: 26
2025-09-26 07:55:06.627 | INFO | open_webui.routers.retrieval:save_docs_to_vector_db:1235 - Document with hash 885d89ec8c1c6701b15b68dd7dd332b3ec274e23a5e852a6e96c214630983bb7 already exists
2025-09-26 07:55:06.627 | ERROR | open_webui.routers.retrieval:process_file:1604 - Duplicate content detected. Please provide unique content to proceed.
Traceback (most recent call last):
File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
self._bootstrap_inner()
│ └ <function Thread._bootstrap_inner at 0x7ff5a54009a0>
└ <WorkerThread(AnyIO worker thread, started 140689010562752)>
File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
self.run()
│ └ <function WorkerThread.run at 0x7ff4c0d14180>
└ <WorkerThread(AnyIO worker thread, started 140689010562752)>
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
│ │ │ └ ()
│ │ └ functools.partial(<function add_file_to_knowledge_by_id at 0x7ff4d12efce0>, user=UserModel(id='8f5fd69e-772c-46a7-98dd-f48cd4...
│ └ <method 'run' of '_contextvars.Context' objects>
└ <_contextvars.Context object at 0x7ff4d0aaa540>
File "/app/backend/open_webui/routers/knowledge.py", line 398, in add_file_to_knowledge_by_id
process_file(
└ <function process_file at 0x7ff4d1426fc0>
File "/app/backend/open_webui/routers/retrieval.py", line 1565, in process_file
result = save_docs_to_vector_db(
└ <function save_docs_to_vector_db at 0x7ff4d15c3f60>
File "/app/backend/open_webui/routers/retrieval.py", line 1236, in save_docs_to_vector_db
raise ValueError(ERROR_MESSAGES.DUPLICATE_CONTENT)
│ └ <ERROR_MESSAGES.DUPLICATE_CONTENT: 'Duplicate content detected. Please provide unique content to proceed.'>
└ <enum 'ERROR_MESSAGES'>
ValueError: Duplicate content detected. Please provide unique content to proceed.
@Classic298 commented on GitHub (Sep 26, 2025):
Where are you trying to upload files?
This error is not unique to milvus.
If you create a knowledge base and upload files with duplicate content, you will always get this error. Even if you remove the file from the knowledge base and then upload it again, the error is still shown, because the file still exists in the backend and therefore counts as duplicate content.
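A rough sketch of a hash-based duplicate check of the kind described here. It is not the Open WebUI implementation: the in-memory dict, the helper names, and the choice of SHA-256 are assumptions (the 64-character hex hash in the log above is merely consistent with SHA-256).

```python
import hashlib


class DuplicateContentError(ValueError):
    """Raised when content with the same hash was already stored."""


def content_hash(text: str) -> str:
    # Assumption: SHA-256 over the UTF-8 encoded text.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def add_document(store: dict, text: str, file_id: str) -> str:
    """Record text under its hash; reject content that was seen before."""
    digest = content_hash(text)
    if digest in store:
        raise DuplicateContentError(
            "Duplicate content detected. Please provide unique content to proceed."
        )
    store[digest] = file_id
    return digest
```

In this sketch the check depends only on the stored hashes, so content stays "duplicate" until its hash is removed from the store, regardless of which file ID it arrives under.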
@wzlsdu commented on GitHub (Sep 26, 2025):
I upload files in the knowledge base. Even when I upload a new file whose content has never been uploaded before, I still get this duplicate error. Only one file can be uploaded to this knowledge base.
@alvarolopez commented on GitHub (Nov 26, 2025):
@Classic298 the issue is supposedly solved by your PR, but the problem still exists. Why is the issue closed?
@Classic298 commented on GitHub (Nov 26, 2025):
@alvarolopez No, the issue is not solved by my PR, because my PR was closed.
I need to break the PR down into smaller pieces so it can get merged. I will do it when I have time.
Remind me again if it isn't done in a week.
I have reopened this issue for now.
@Classic298 commented on GitHub (Dec 14, 2025):
This was fixed a few versions ago.