issue: pgvector results are sorted incorrectly, leading to no match in obvious cases #4507

GiteaMirror · 2025-11-11T15:55:39-06:00

GiteaMirror commented

2025-11-11 15:55:39 -06:00

Originally created by @naliwai on GitHub (Mar 20, 2025).

Check Existing Issues

I have searched the existing issues and discussions.
I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.5.20

Ollama Version (if applicable)

No response

Operating System

Linux

Browser (if applicable)

No response

Confirmation

I have read and followed all instructions in README.md.
I am using the latest version of both Open WebUI and Ollama.
I have included the browser console logs.
I have included the Docker container logs.
I have listed steps to reproduce the bug in detail.

Expected Behavior

Queries to pgvector return results already sorted by cosine distance. The smaller the distance, the better the match, so the best results come first in the result list. When choosing the Top K results, the first K results should therefore be taken as they are, without reversing the list as utils.py currently does.

Actual Behavior

Results from pgvector queries are reversed in utils.py, so the first entries in the list are the worst matches.
Moreover, there are two places that define Top K in Open WebUI.

One place in Chat Settings, where Top K is 40 by default. (What's the point?)
Another place in Admin Panel -> Documents -> Top K, where it is typically something like 3.
It seems that when the query is sent to pgvector, the first Top K (40) limits the number of results returned from the database. Later, after the order of the documents has been reversed, the Top K from the Admin Panel is applied, which selects the worst 3 of the top 40.
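The failure mode described above can be sketched in a few lines. This is only an illustration of the report, not the actual Open WebUI code: `fetch_candidates`, `buggy_top_k`, and `expected_top_k` are hypothetical names, and the distances are made-up values.

```python
# Hypothetical stand-in for a distance-based vector store: rows come back
# sorted by ascending cosine distance, i.e. best match first.
def fetch_candidates(limit):
    rows = [(f"chunk-{i}", round(0.05 * i, 2)) for i in range(1, 41)]
    return rows[:limit]


def buggy_top_k(candidates, k):
    # Behaviour as reported: the list is reversed before the cut,
    # so the largest distances (worst matches) survive.
    return list(reversed(candidates))[:k]


def expected_top_k(candidates, k):
    # Expected behaviour: keep the first k rows as returned by the store.
    return candidates[:k]


candidates = fetch_candidates(limit=40)
buggy_ids = [chunk_id for chunk_id, _ in buggy_top_k(candidates, 3)]
expected_ids = [chunk_id for chunk_id, _ in expected_top_k(candidates, 3)]
print("buggy:   ", buggy_ids)
print("expected:", expected_ids)
```

With a candidate limit larger than k, the reversed cut never contains the best chunk, which would match the "no match found" symptom in the steps to reproduce.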

Steps to Reproduce

Set up Open WebUI with the pgvector backend
Upload a multi-page/multi-chunk document (e.g. 10 pages)
Search for some clear and obvious text (e.g. the title on the first page)
Observe that no match is found
Increase Top K in Admin Panel -> Documents to the number of chunks
Observe in the citations that the best-matching page has the lowest assigned relevance score

Logs & Screenshots

No error logs or screenshots are necessary.

Additional Information

A workaround for now: set Top K in Chat Settings equal to Top K in the Admin Panel. In this case pgvector.py applies this Top K limit to the results, and utils.py reverses the whole list but still returns all of the results, just in the wrong order. Since the model now sees all Top K results, it finds the correct information to answer the question.


GiteaMirror added the bug label 2025-11-11 15:55:39 -06:00

GiteaMirror closed this issue

2025-11-11 15:55:39 -06:00

GiteaMirror commented

2025-11-11 15:55:40 -06:00

@rgaricano commented on GitHub (Mar 20, 2025):

This problem isn't with pgvector, it's with chromadb: https://github.com/open-webui/open-webui/pull/11876

It needs some work: https://github.com/open-webui/open-webui/issues/8478#issuecomment-2739582603


GiteaMirror commented

2025-11-11 15:55:40 -06:00

@naliwai commented on GitHub (Mar 20, 2025):

Not chromadb. This seems to be a general problem for all databases that return the actual distance where 0.0 means the best match.
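One way to make the ordering independent of the backend is to state the score direction explicitly instead of assuming it. The helper below is a minimal sketch under that assumption; `rank_results` and its `higher_is_better` flag are hypothetical and not part of Open WebUI.

```python
from typing import List, Tuple


def rank_results(
    results: List[Tuple[str, float]], k: int, higher_is_better: bool
) -> List[Tuple[str, float]]:
    """Return the k best (id, score) pairs for the given score direction."""
    # Distances sort ascending (0.0 is best); similarities sort descending.
    ordered = sorted(results, key=lambda item: item[1], reverse=higher_is_better)
    return ordered[:k]


# Made-up scores: the same three chunks expressed as distances and similarities.
by_distance = [("a", 0.80), ("b", 0.05), ("c", 0.30)]
by_similarity = [("a", 0.20), ("b", 0.95), ("c", 0.70)]

best_by_distance = rank_results(by_distance, k=2, higher_is_better=False)
best_by_similarity = rank_results(by_similarity, k=2, higher_is_better=True)
print(best_by_distance)
print(best_by_similarity)
```

Both calls select chunks "b" and "c", because the direction is passed in rather than inferred.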


GiteaMirror commented

2025-11-11 15:55:40 -06:00

@almajo commented on GitHub (Mar 25, 2025):

I think the Top K parameter in the Chat settings is model-related and has nothing to do with retrieval.

However, I think the main reason it doesn't find the best results is query generation. For each query, the top-k results are searched in the database and are then limited incorrectly because, as you've said, pgvector uses distance - not similarity.

So yes, I think the linked issues are closely related and we should continue the fix from the latest dev branch.

Additionally: when tackling this, we should also make sure the "relevancy" score is right in the frontend. Otherwise users will be very surprised by a relevancy score of, e.g., 0.2, even though that is a very good cosine distance. Also, the most relevant citations will end up at the end of the list.
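To illustrate the relevancy point: cosine distance is 1 minus cosine similarity, so it ranges from 0 (identical direction) to 2 (opposite). A minimal sketch of one possible mapping to a "higher is better" score follows. The function name and the choice to normalize into [0, 1] are assumptions for illustration, not necessarily what the fix in Open WebUI does.

```python
def cosine_distance_to_relevance(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a relevance score in [0, 1]."""
    # Clamp first so small floating-point overshoots stay in range.
    clamped = min(max(distance, 0.0), 2.0)
    return 1.0 - clamped / 2.0


# Made-up citations as (source, cosine_distance) pairs.
citations = [("page-7", 1.10), ("page-1", 0.20), ("page-3", 0.55)]

# Convert to relevance and list the most relevant citation first.
scored = sorted(
    ((source, cosine_distance_to_relevance(d)) for source, d in citations),
    key=lambda item: item[1],
    reverse=True,
)
for source, relevance in scored:
    print(f"{source}: {relevance:.2f}")
```

Under this mapping, a distance of 0.2 is displayed as a relevance of about 0.9, and the best citation appears at the top of the list.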


GiteaMirror commented

2025-11-11 15:55:40 -06:00

@mahenning commented on GitHub (Mar 27, 2025):

Should be fixed in https://github.com/open-webui/open-webui/pull/12050. It's merged into the dev branch, so hopefully it's in the next version. For pgvector especially, it picked the k worst results if you didn't use hybrid search, and had the relevance score reversed. For chromadb, it at least used the best k results but also botched the relevance score.
Also note that a (retrieval) top k of 3 is a bit low and can lead to inaccurate results even with the fix above.

![Image](https://github.com/user-attachments/assets/b5973285-5c6d-49ef-af53-638ddf0bd5e4)

If you mean the Top K from the screenshot above, this chat Top K is NOT for retrieval. It is used when generating the tokens of the actual answer: it limits the LLM to picking the next token from the distribution of the best k token candidates. It just (unfortunately) shares the same name.
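For readers unfamiliar with the sampling meaning of Top K, here is a rough sketch of top-k token sampling. It is a toy illustration with made-up logits and a hypothetical `sample_top_k` helper, not how any particular inference backend implements it.

```python
import math
import random
from typing import Dict


def sample_top_k(logits: Dict[str, float], k: int, rng: random.Random) -> str:
    """Sample the next token from the k highest-scoring candidates only."""
    # Keep the k best candidates and discard the rest of the vocabulary.
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Softmax over the survivors (shifted by the max for numerical stability).
    max_logit = max(score for _, score in top)
    weights = [math.exp(score - max_logit) for _, score in top]
    tokens = [token for token, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up next-token logits.
logits = {"cat": 3.2, "dog": 2.9, "car": 0.4, "sky": -1.0}
rng = random.Random(0)
samples = {sample_top_k(logits, k=2, rng=rng) for _ in range(50)}
print(samples)
```

With k=2 only "cat" and "dog" can ever be produced, regardless of how many document chunks retrieval returned. The two settings are unrelated.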


GiteaMirror commented

2025-11-11 15:55:40 -06:00

@Phlogi commented on GitHub (Apr 1, 2025):

@naliwai This should be fixed in 0.6.0, did you test?


GiteaMirror commented

2025-11-11 15:55:41 -06:00

@naliwai commented on GitHub (Jun 2, 2025):

Sorry for the late response. Yes, it seems to work in the later releases.


GiteaMirror referenced this issue

2026-04-19 20:18:22 -05:00