mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 03:18:23 -05:00
[GH-ISSUE #9608] Vector Dimension Mismatch Error with mxbai-embed-large:335m in Qdrant Integration #15579
Originally created by @MatzeJoerling on GitHub (Feb 7, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/9608
Bug Report
Dear Developers,
First and foremost, I want to commend you on the exceptional software you've developed; it has been instrumental in ensuring my application's GDPR compliance, especially when interfacing with an AI-driven web chat.
I have configured the latest version of Open-WebUI to utilize the Qdrant vector database, employing nomic-embed-text as the Retrieval-Augmented Generation (RAG) embedding model. Given my requirements for German language support and multilingual capabilities, and having the necessary GPU resources, I opted to switch to mxbai-embed-large:335m.
After clearing my Qdrant database and uploading a new PDF to the RAG system, I observed that a new entry was created in the Qdrant documents. However, this entry possesses an embedding length of 1024, in contrast to the 768 dimensions associated with nomic-embed-text.
Installation Method
Portainer Stack with docker running on Nvidia Ubuntu Linux 22.04.
docker compose file with ollama, pipelines, postgres, redis, tika, qdrant and open-webui.
Environment
Confirmation:
Expected Behavior:
RAG / VectorDB delivers the correct data
Data is in VectorDB:
open-webui_file-7ac30036-60e8-4cac-90aa-a612536405c8
Status: green
Points: 892
Segments: 8
Shards: 1
Vectors Configuration (Name, Size, Distance): default, 1024, Cosine
Actual Behavior:
Data is saved in VectorDB:
WARNI [python_multipart.multipart] Skipping data after last boundary
INFO [open_webui.routers.files] file.content_type: application/pdf
INFO [open_webui.routers.retrieval] save_docs_to_vector_db: document EU_AI_Act.pdf file-7ac30036-60e8-4cac-90aa-a612536405c8
INFO [open_webui.routers.retrieval] adding to collection file-7ac30036-60e8-4cac-90aa-a612536405c8
collection open-webui_file-7ac30036-60e8-4cac-90aa-a612536405c8 successfully created!
Retrieval should be OK, but delivers an "Unexpected Response: 400 (Bad Request)".
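To illustrate the behavior described above, here is a minimal sketch of a collection with a fixed vector size that accepts 1024-dimensional vectors and rejects a 768-dimensional query. `FixedSizeCollection` is a hypothetical in-memory stand-in written for this report. It is not Qdrant's or Open WebUI's code, and the wording of the error is an assumption.

```python
import math


class FixedSizeCollection:
    """Hypothetical in-memory stand-in for a collection with a fixed vector size."""

    def __init__(self, size):
        self.size = size      # fixed when the collection is created
        self.points = []      # list of (vector, payload) pairs

    def _check(self, vector):
        # Stored vectors and query vectors must both match the configured size.
        if len(vector) != self.size:
            raise ValueError(
                f"vector dimension error: expected {self.size}, got {len(vector)}"
            )

    def upsert(self, vector, payload):
        self._check(vector)
        self.points.append((vector, payload))

    def search(self, query, limit=3):
        self._check(query)

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        scored = [(cosine(query, vec), payload) for vec, payload in self.points]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:limit]
```

With this sketch, `FixedSizeCollection(1024)` accepts the upload, a 1024-dimensional query returns results, and a 768-dimensional query raises an error. That matches the symptom here: saving works, retrieval fails.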
Description
When querying the RAG I got:
Bug Summary:
The vector size is not determined correctly: vectors are saved in the VectorDB with 1024 dimensions but queried with 768.
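A check along the following lines would surface the mismatch before the query is sent. This is only a sketch: `find_dimension_mismatch` and the `embed` callable are hypothetical names, and `stored_size` is assumed to be the size read from the collection's vector configuration (1024 in this case).

```python
from typing import Callable, Optional, Sequence


def find_dimension_mismatch(
    stored_size: int,
    embed: Callable[[str], Sequence[float]],
    probe: str = "dimension probe",
) -> Optional[str]:
    """Return a description of the mismatch, or None if the sizes agree.

    `embed` is whatever function produces the query embedding (hypothetical
    here); `stored_size` is the vector size configured on the collection.
    """
    got = len(embed(probe))
    if got == stored_size:
        return None
    return (
        f"collection stores {stored_size}-dimensional vectors, "
        f"but the query embedder returns {got}"
    )
```

For example, calling it with `stored_size=1024` and an embedder that still returns 768 values reports the discrepancy instead of ending in a 400 response.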
Reproduction Details
Steps to Reproduce:
No idea:
If it is a bug: install the stack with Qdrant and mxbai-embed-large:335m on Ollama and add a document to RAG.
Otherwise, I am too ... to configure the dimension size used when retrieving RAG data; please tell me where I can configure it.
I appreciate your assistance in resolving this matter.
Warm regards,
Martin