mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 03:18:23 -05:00
[GH-ISSUE #8478] feat: Allow reranker to be accessed via API instead of local model #53806
Originally created by @GrayXu on GitHub (Jan 11, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8478
Is your feature request related to a problem? Please describe.
Currently, the reranker model used in RAG can only be run locally after being pulled. However, there are now many MaaS providers offering rerankers. I would like to use reranker models via API within open-webui, which would make the server much lighter.
Describe the solution you'd like
Although the OpenAI API does not have a reranker API, there is a widely used API pattern for rerankers in MaaS, such as /v1/rerank, used by projects like siliconflow, xinference, and api-for-open-llm. Like this:
Thanks
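For illustration, a minimal sketch of the /v1/rerank pattern mentioned above. It assumes the Cohere/Jina-style schema (a request with `model`, `query`, `documents` and optional `top_n`; a response with a `results` list of `index` and `relevance_score` entries). The field names and both helper functions are assumptions for this sketch, not something confirmed in this thread, so check them against your provider's documentation.

```python
from typing import Any, Dict, List, Optional


def build_rerank_payload(
    model: str, query: str, documents: List[str], top_n: Optional[int] = None
) -> Dict[str, Any]:
    """Build the JSON body for a POST to <base_url>/v1/rerank (assumed schema)."""
    payload: Dict[str, Any] = {"model": model, "query": query, "documents": documents}
    if top_n is not None:
        payload["top_n"] = top_n
    return payload


def scores_in_document_order(response: Dict[str, Any], n_documents: int) -> List[float]:
    """Map the (assumed) response back to one score per input document.

    Providers usually return results sorted by relevance, so each score is
    placed back at the position given by its `index` field. Documents that
    are missing from the response (for example because of `top_n`) get 0.0.
    """
    scores = [0.0] * n_documents
    for item in response.get("results", []):
        scores[item["index"]] = float(item["relevance_score"])
    return scores
```

The payload can be sent with any HTTP client; keeping the payload and parsing logic separate from the transport makes it easy to test without a live endpoint.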
@tjbck commented on GitHub (Jan 13, 2025):
PR Welcome here!
@GrayXu commented on GitHub (Jan 22, 2025):
Thanks for your reply, I'll see what I can do.
@rjmalagon commented on GitHub (Mar 11, 2025):
The llama.cpp server has initial support for reranking jobs.
https://github.com/ggml-org/llama.cpp/blob/master/examples/server/README.md#post-reranking-rerank-documents-according-to-a-given-query
Mixedbread AI released their Qwen2 based rerankers, mxbai-rerank-base-v2 and mxbai-rerank-large-v2.
https://huggingface.co/mixedbread-ai/mxbai-rerank-base-v2
https://huggingface.co/mixedbread-ai/mxbai-rerank-large-v2
I think these high quality rerankers are a good example of models that can be supported by llama.cpp. It would be nice to offload reranking to llama.cpp servers too.
@ArtrixTech commented on GitHub (Mar 13, 2025):
need this!
@niyouzhu commented on GitHub (Mar 15, 2025):
support!
@Jotakak-yu commented on GitHub (Mar 20, 2025):
need this too
@rgaricano commented on GitHub (Mar 20, 2025):
Before that, reranking needs to be improved with these PRs: https://github.com/open-webui/open-webui/pull/11814, https://github.com/open-webui/open-webui/pull/11497 & https://github.com/open-webui/open-webui/pull/11876
Some of us are testing by manually integrating these three (it is not possible to make a direct pull request with all of them together), and it seems to work better. ;)
@bet0x commented on GitHub (Mar 20, 2025):
Implementing this should be fairly trivial.
1. Install any embedding or reranker model using https://michaelfeil.eu/infinity/main/deploy/
Changes to the RerankCompressor in backend/open_webui/retrieval/utils.py:
You will also need to change the query_doc_with_hybrid_search function:
@Ithanil commented on GitHub (Mar 21, 2025):
@bet0x Wonderful! Please make this a PR, although I didn't check how much conflict there will be with the other PRs mentioned above. Maybe it's good to have them merged first.
That said, I think the function rerank_remote is missing.
@bet0x commented on GitHub (Mar 24, 2025):
@Ithanil Hello. I had a busy week. If someone is willing to take on the PR and mention me to check and implement anything missing, that would be a great help!
@Phlogi commented on GitHub (Mar 24, 2025):
I'm willing to prepare a PR for this after the other open PRs get merged. I think we could run the remote reranking in parallel too (as with my https://github.com/open-webui/open-webui/pull/11814), but we should make it configurable because of possible rate limiting.
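A rough sketch of what configurable parallel remote reranking could look like, using a thread pool. This is not the code from any of the PRs mentioned here. The names `rerank_in_parallel`, `score_batch`, `batch_size` and `max_workers` are hypothetical, and `score_batch` stands in for whatever function actually calls the remote reranker.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def rerank_in_parallel(
    query: str,
    documents: Sequence[str],
    score_batch: Callable[[str, Sequence[str]], List[float]],
    batch_size: int = 16,
    max_workers: int = 4,
) -> List[float]:
    """Score documents in batches, running up to `max_workers` batches at once.

    `score_batch(query, batch)` is a hypothetical callable that sends one
    request to the remote reranker and returns one score per document.
    Lower `max_workers` if the provider enforces rate limits.
    """
    batches = [documents[i : i + batch_size] for i in range(0, len(documents), batch_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in submission order, so scores stay aligned
        # with the original document order.
        per_batch = pool.map(lambda batch: score_batch(query, batch), batches)
        return [score for scores in per_batch for score in scores]
```

Setting `max_workers=1` makes the calls effectively sequential, which gives a simple switch for providers with strict rate limits.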
@athoik commented on GitHub (Apr 24, 2025):
Hello,
Is somebody still working on this?
I really support that feature! It's a must have!
Thank you!
@RAPHCVR commented on GitHub (Apr 30, 2025):
Very interesting feature! Thanks for referencing this.
@tjbck commented on GitHub (May 5, 2025):
Related #13261
@jescalada commented on GitHub (May 10, 2025):
Hi folks, I made a proof-of-concept PR based on the suggestions by @bet0x. Let me know if it works for your specific API/workflow! #13745
@athoik commented on GitHub (May 10, 2025):
... And we officially have external reranker via d5fd3b3600. Kudos to everyone involved!
@tjbck commented on GitHub (May 10, 2025):
As mentioned by @athoik, this should be addressed in dev with d5fd3b3600. Testing wanted here!
@athoik commented on GitHub (May 10, 2025):
Testing soon!
@athoik commented on GitHub (May 10, 2025):
PS: a minor issue when building the interface...
The last comma is causing the build to fail.
@athoik commented on GitHub (May 10, 2025):
We have a minor error when using the external reranker...
The following fixes the issue:
I also think we need async or multithreading on the external reranker to speed up execution.
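As one possible shape for the async option, here is a minimal asyncio sketch that caps the number of in-flight requests with a semaphore. It is only an illustration under assumptions: `rerank_async`, `score_batch`, `batch_size` and `max_concurrency` are hypothetical names, and `score_batch` is a placeholder for a coroutine that performs the actual HTTP call to the external reranker.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def rerank_async(
    query: str,
    documents: Sequence[str],
    score_batch: Callable[[str, Sequence[str]], Awaitable[List[float]]],
    batch_size: int = 16,
    max_concurrency: int = 4,
) -> List[float]:
    """Score batches concurrently, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(batch: Sequence[str]) -> List[float]:
        # The semaphore limits how many requests run at the same time.
        async with semaphore:
            return await score_batch(query, batch)

    batches = [documents[i : i + batch_size] for i in range(0, len(documents), batch_size)]
    # asyncio.gather returns results in the order the awaitables were passed,
    # so the flattened scores line up with the input documents.
    per_batch = await asyncio.gather(*(guarded(batch) for batch in batches))
    return [score for scores in per_batch for score in scores]
```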
@tjbck commented on GitHub (May 10, 2025):
Addressed with https://github.com/open-webui/open-webui/pull/13751, other enhancement PRs welcome!