Mirror of https://github.com/open-webui/open-webui.git, synced 2026-05-11 08:22:09 -05:00
Issue: pgvector results are sorted incorrectly, leading to no match in obvious cases #4507
Originally created by @naliwai on GitHub (Mar 20, 2025).
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.5.20
Ollama Version (if applicable)
No response
Operating System
Linux
Browser (if applicable)
No response
Confirmation
Expected Behavior
Queries to pgvector return results already sorted by cosine distance: the smaller the distance, the better the match, so the best results appear first in the list. When choosing the Top K results, the first K results should be taken as-is, without the reversal that utils.py currently performs.
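The claim above can be illustrated with a minimal sketch (the distance values are made up, and this is not the actual utils.py code):

```python
# pgvector's cosine-distance operator (<=>) returns rows in ascending
# distance order when used with ORDER BY: 0.0 means an identical vector.
# The first K rows are therefore already the best matches.
distances = [0.05, 0.12, 0.31, 0.47, 0.83]  # ascending, as pgvector returns them

top_k = 3
best = distances[:top_k]   # correct: take the first K without reversing
print(best)                # [0.05, 0.12, 0.31]
```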
Actual Behavior
Queries to pgvector are sorted in reverse order by utils.py, so the first results in the list are those with the worst match.
Moreover, Open WebUI defines Top K in two places.
It seems that when the query is sent to pgvector, the first Top K (= 40) limits the number of results returned from the database. Later, after reversing the order of the documents, it applies the Top K from the Admin Panel, thereby choosing the worst 3 of the top 40.
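The two-stage behavior described above can be reproduced in a toy sketch (the constants and names are illustrative, not the actual Open WebUI code):

```python
# Stage 1: the database query limits results to the first 40 rows,
# which arrive in ascending distance order (best matches first).
DB_TOP_K = 40
ADMIN_TOP_K = 3

rows = [i / 100 for i in range(100)][:DB_TOP_K]  # distances 0.00 .. 0.39

# Stage 2 (the bug): the list is reversed before the Admin Panel Top K
# slice, so the slice picks the 3 *worst* of the top 40.
reversed_rows = list(reversed(rows))
chosen = reversed_rows[:ADMIN_TOP_K]
print(chosen)  # [0.39, 0.38, 0.37] -- the largest distances, i.e. the worst matches
```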
Steps to Reproduce
Logs & Screenshots
No error logs or screenshots are necessary.
Additional Information
A partial workaround for now: set Top K in the chat settings equal to Top K in the Admin Panel. In this case pgvector.py applies that Top K limit to the results, and utils.py reverses the whole list but still returns all of it, only in the wrong order. Since the model now sees all Top K results, it finds the correct information to answer the question.
@rgaricano commented on GitHub (Mar 20, 2025):
This problem isn't with pgvector, it's with chromadb: https://github.com/open-webui/open-webui/pull/11876
It needs some work: https://github.com/open-webui/open-webui/issues/8478#issuecomment-2739582603
@naliwai commented on GitHub (Mar 20, 2025):
Not chromadb. This seems to be a general problem for all databases that return an actual distance, where 0.0 means the best match.
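The convention clash can be sketched as follows (a minimal illustration assuming cosine distance; the conversion formula is a common convention, not necessarily the one Open WebUI uses):

```python
# Distance-based backends (e.g. pgvector): 0.0 is the best match, sort ascending.
# Similarity-based backends: 1.0 is the best match, sort descending.
def distance_to_similarity(d: float) -> float:
    """Common conversion for cosine distance: similarity = 1 - distance."""
    return 1.0 - d

dists = [0.25, 0.75, 0.5]
by_distance = sorted(dists)  # [0.25, 0.5, 0.75] -- best first
by_similarity = sorted((distance_to_similarity(d) for d in dists), reverse=True)
print(by_similarity)         # [0.75, 0.5, 0.25] -- same ranking, opposite sort order
```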
@almajo commented on GitHub (Mar 25, 2025):
I think the Top-K parameter in the chat settings is model-related, not retrieval-related.
However, I think the main reason it doesn't find the best results is query generation. For each query, the top-k results are fetched from the database and then limited incorrectly because, as you said, pgvector uses distance, not similarity.
So yes, I think the linked issues are closely related, and we should continue the fix from the latest dev branch.
Additionally: when tackling this, we should also make sure to get the "relevancy" score right in the frontend. Otherwise users will be very surprised by a relevancy score of, e.g., 0.2, even though that is a very good cosine distance. Also, the most relevant citations will appear at the end.
@mahenning commented on GitHub (Mar 27, 2025):
Should be fixed in https://github.com/open-webui/open-webui/pull/12050. It's merged into the dev branch, so hopefully it will be in the next version. pgvector especially picked the k worst results if you didn't use hybrid search, and had the relevance score reversed. chromadb at least used the best k results, but still botched the relevance score.
Also note that a retrieval Top K of 3 is quite low and can lead to inaccurate results even with the fix above.
If you mean the Top K from the screenshot above: this chat Top K is NOT for retrieval. It's for generating the tokens of the actual answer, limiting the LLM to picking the next token from the distribution of the k best token candidates. It just (unfortunately) shares the same name.
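To make the distinction concrete, here is a minimal sketch of what the sampling-side Top K does (illustrative only; this is not Open WebUI's or any inference engine's actual implementation):

```python
# Sampling top-k keeps only the k highest-scoring next-token candidates
# and masks out the rest, so the model samples from a truncated distribution.
def top_k_filter(logits: list[float], k: int) -> list[float]:
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]

logits = [2.0, 0.5, 1.0, -1.0]   # one score per candidate token
print(top_k_filter(logits, 2))   # [2.0, -inf, 1.0, -inf]
```

This has nothing to do with how many documents the retriever returns, which is the other Top K discussed in this thread.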
@Phlogi commented on GitHub (Apr 1, 2025):
@naliwai This should be fixed in 0.6.0, did you test?
@naliwai commented on GitHub (Jun 2, 2025):
Sorry for the late response. Yes, it seems to work in the later releases.