issue: Qwen3-Reranker not working with Open WebUI #5870

Closed
opened 2025-11-11 16:36:25 -06:00 by GiteaMirror · 3 comments
Owner

Originally created by @YetheSamartaka on GitHub (Jul 25, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.18

Ollama Version (if applicable)

No response

Operating System

Debian

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Qwen3-Reranker working like other reranking models

Actual Behavior

Qwen3-Reranker does not work and produces errors, whereas other rerankers (such as mixedbread-ai/mxbai-rerank-xsmall-v1) do work. This is probably the most relevant part of the log:
ValueError: Cannot handle batch sizes > 1 if no padding token is defined.
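The error fires because scoring a batch of query/passage pairs requires padding all sequences to the same length, which is impossible if the tokenizer has no pad token. A minimal sketch of the failing check (this mirrors the behavior inside Hugging Face transformers, not the actual library code):

```python
# Sketch of why "Cannot handle batch sizes > 1 if no padding token is
# defined" occurs: padding a batch needs a pad token id to fill with.

def pad_batch(token_id_lists, pad_token_id):
    """Pad a batch of token-id sequences to equal length."""
    if len(token_id_lists) > 1 and pad_token_id is None:
        raise ValueError(
            "Cannot handle batch sizes > 1 if no padding token is defined."
        )
    max_len = max(len(seq) for seq in token_id_lists)
    return [seq + [pad_token_id] * (max_len - len(seq))
            for seq in token_id_lists]

# A reranker scores many query/passage pairs at once, so batch size > 1
# is the norm; the token ids below are illustrative only.
batch = [[101, 2009, 102], [101, 2003, 2023, 3291, 102]]
padded = pad_batch(batch, pad_token_id=151645)  # works once a pad id exists
# pad_batch(batch, pad_token_id=None)           # raises the reported error
```

Models whose config defines a pad_token_id (like mxbai-rerank-xsmall-v1) never hit this branch, which matches the observation that other rerankers work.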

Steps to Reproduce

  1. Have a working RAG setup without hybrid search and ask a question you know can be answered from the provided content (I use Postgres as my main DB, Qdrant as the vector DB, and dengcao/Qwen3-Embedding-0.6B:Q8_0 as the embedding model, run via the latest Ollama; Text Splitter is set to Token via Tiktoken).
  2. Use Qwen3-Reranker (in my case Qwen/Qwen3-Reranker-0.6B) with the default Reranking Engine (SentenceTransformers).
  3. Observe that Qwen3-Reranker is not working.
  4. Switch the reranker to, for example, mixedbread-ai/mxbai-rerank-xsmall-v1 and repeat --> the setup works as expected.

Logs & Screenshots

Whole log:
[Qwen3-Reranker-errors.txt](https://github.com/user-attachments/files/21428262/Qwen3-Reranker-errors.txt)

Additional Information

No response

GiteaMirror added the bug label 2025-11-11 16:36:25 -06:00
Author
Owner

@rgaricano commented on GitHub (Jul 25, 2025):

The error seems clear: Cannot handle batch sizes > 1 if no padding token is defined.
As I read in https://github.com/babylm/evaluation-pipeline-2023/issues/5, a workaround could be to edit the model's config.json and add a "pad_token_id": 151645 line (the same value that eos_token_id has).
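The config.json edit described above can be scripted: copy eos_token_id into pad_token_id if the latter is missing. A hedged sketch (the path is illustrative; 151645 is the Qwen3 eos_token_id mentioned above):

```python
# Apply the workaround from the linked issue: give the model a pad token
# by reusing its eos token id in config.json.
import json


def add_pad_token(config_path):
    """Add "pad_token_id" to a model config, reusing "eos_token_id"."""
    with open(config_path) as f:
        cfg = json.load(f)
    if "pad_token_id" not in cfg and "eos_token_id" in cfg:
        cfg["pad_token_id"] = cfg["eos_token_id"]  # e.g. 151645 for Qwen3
        with open(config_path, "w") as f:
            json.dump(cfg, f, indent=2)
    return cfg
```

Note this edits the file in place, so it must be re-applied if the model cache is ever re-downloaded.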

Author
Owner

@YetheSamartaka commented on GitHub (Jul 25, 2025):

If that model would then appear under Models, then I guess yes, but how do I edit the config right now in Open WebUI?

Author
Owner

@rgaricano commented on GitHub (Jul 25, 2025):

Copied from the same exchange on Discord:

you can clone the HF repo, edit it, and use yours

Well, I meant apart from that of course 😄

Probably yes: there is a model cached in backend/data/embedding/models. You can locate the config and edit it. It sits inside the blobs dir, but the files have no names, just long hash IDs (the config is one of the smaller ones). I never tried it, and as I saw there is more than one file with that config, so maybe you can try it this way, but I think it is easier to clone the HF repo (cloned into your HF account; it is not necessary to download it).

xD
Yeah, I will go with the clone option for now and edit it to see how it performs 😄
I wonder why such a parameter is not included there, though 🤔

yea, probably there is a reason!
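For anyone who does want to try the cached-blob route mentioned above, the nameless blob files can be narrowed down by looking for small files that parse as JSON and contain "eos_token_id". A hypothetical helper (the directory layout and filenames are assumptions, not verified against Open WebUI's cache):

```python
# Hypothetical helper: scan a Hugging Face-style blobs directory for the
# small JSON blob(s) that look like a model config.json.
import json
import os


def find_config_blobs(blobs_dir):
    """Return paths of small JSON files containing "eos_token_id"."""
    hits = []
    for name in os.listdir(blobs_dir):
        path = os.path.join(blobs_dir, name)
        # config.json is tiny compared to weight blobs, so skip big files.
        if os.path.isfile(path) and os.path.getsize(path) < 10_000:
            try:
                with open(path) as f:
                    cfg = json.load(f)
            except (ValueError, UnicodeDecodeError):
                continue  # binary weights or non-JSON blobs
            if isinstance(cfg, dict) and "eos_token_id" in cfg:
                hits.append(path)
    return hits
```

Each hit can then be inspected and given a pad_token_id; cloning the HF repo, as suggested above, avoids this guesswork entirely.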

Reference: github-starred/open-webui#5870