[PR #20228] [MERGED] fix: normalize local CrossEncoder reranking scores for relevance threshold #41143

GiteaMirror commented

2026-04-25 13:27:00 -05:00

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/20228
Author: @Classic298
Created: 12/28/2025
Status: ✅ Merged
Merged: 12/31/2025
Merged by: @tjbck

Base: dev ← Head: patch-1

📝 Commits (6)

d3e4cea Update utils.py
2d9d0a2 Update retrieval.py
2555177 Update utils.py
039e73f Update retrieval.py
fc9620e add env var
d80be8b rename to SENTENCE_TRANSFORMERS_CROSS_ENCODER_SIGMOID_ACTIVATION_FUNCTION

📊 Changes

2 files changed (+14 additions, -0 deletions)

📝 backend/open_webui/env.py (+7 -0)
📝 backend/open_webui/routers/retrieval.py (+7 -0)

📄 Description

Description:

Fix: Normalize local CrossEncoder reranking scores to 0-1 range

The relevance threshold setting (0-1 range in the UI) doesn't work correctly with local CrossEncoder reranking models because MS MARCO models (the most commonly used rerankers) return raw logits (roughly -10 to +10) instead of normalized scores. When a user sets a threshold of 0.5, practically everything passes because even poor matches score above 0.5 in logit space.

Solution: Pass activation_fn=torch.nn.Sigmoid() to the CrossEncoder constructor. This is the approach recommended by the sentence-transformers documentation for MS MARCO models and has the following benefits:

Normalization runs on-device during inference rather than post-hoc in Python
Threshold behavior now works as expected (0.5 = 50% relevance confidence)
External rerankers are unaffected (they already return normalized scores via their APIs)

"Note: You can initialize these models with activation_fn=torch.nn.Sigmoid() to force the model to return scores between 0 and 1. Otherwise, the raw value can reasonably range between -10 and 10.

Example code: model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", activation_fn=torch.nn.Sigmoid())"

Source: https://www.sbert.net/docs/cross_encoder/pretrained_models.html
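As a rough illustration of the normalization described above, the following plain-Python sketch applies the logistic function to a handful of made-up logits and then filters them with a 0-1 threshold. The logit values and the `filter_by_threshold` helper are hypothetical and are not taken from the PR; the actual change passes `activation_fn=torch.nn.Sigmoid()` to `CrossEncoder` so that the model applies the activation itself.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function: maps any real-valued logit into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def filter_by_threshold(scores, threshold):
    """Hypothetical helper: keep (index, score) pairs whose score meets the threshold."""
    return [(i, s) for i, s in enumerate(scores) if s >= threshold]


# Made-up raw logits for four candidate documents (illustrative only).
raw_logits = [8.2, 1.3, -0.4, -7.5]

# After the sigmoid, every score lies strictly between 0 and 1,
# so it is on the same scale as the UI's relevance threshold.
normalized = [sigmoid(x) for x in raw_logits]

# A threshold of 0.5 keeps exactly the documents whose raw logit is >= 0.
kept = filter_by_threshold(normalized, 0.5)
print(kept)
```

Because sigmoid(0) = 0.5, a threshold of 0.5 on the normalized scores corresponds to a raw logit of 0.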

Edge case: Models that already output 0-1 scores (like STS-B) will have sigmoid applied twice, compressing their output range to roughly 0.5-0.73. However, STS-B models are designed for semantic similarity, not reranking, and should not be used for reranking; in practice, reranking pipelines use MS MARCO or BGE models. Even if someone did use an STS-B model, ranking order is preserved (sigmoid is monotonic) and thresholds still function, just with compressed score magnitudes: a threshold of 0.5 or lower would let every result through.
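The compression in this edge case can be checked numerically. The sketch below assumes scores that are already in the 0-1 range and applies a further sigmoid to them; the sample scores are invented for illustration and do not come from any real model.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function: maps any real-valued input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical scores from a model that already outputs values in [0, 1].
already_normalized = [0.05, 0.40, 0.90]

# Applying sigmoid again squeezes them into a much narrower band.
squashed = [sigmoid(s) for s in already_normalized]

# Bounds of that band: inputs in [0, 1] can only map to [sigmoid(0), sigmoid(1)].
low, high = sigmoid(0.0), sigmoid(1.0)  # 0.5 and about 0.731

# Sigmoid is strictly increasing, so the relative order of the scores is unchanged.
order_preserved = squashed == sorted(squashed)
print(low, round(high, 3), order_preserved)
```

Under these assumptions, ranking is unaffected, but any threshold at or below 0.5 no longer filters anything.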

RE: https://github.com/open-webui/open-webui/discussions/19999

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


GiteaMirror added the pull-request label 2026-04-25 13:27:00 -05:00

GiteaMirror closed this issue

2026-04-25 13:27:00 -05:00

Reference: github-starred/open-webui#41143