[GH-ISSUE #22518] feat: Disable Qwen 3.5 thinking mode from API calls #58396

Closed
opened 2026-05-05 23:05:42 -05:00 by GiteaMirror · 3 comments

Originally created by @kukalikuk on GitHub (Mar 10, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22518

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

I'm on OWUI 0.8.10, using API calls to connect to LM Studio.
Every time I use Qwen 3.5 (35B/27B/9B), it always thinks, even for follow-ups, image prompts, chat titles, etc.
I've tried adding /no_think to the prompt, modifying the system prompt, and making a function/filter in OWUI while consulting Gemini; none of it worked. Qwen 3.5 is a stubborn thinker. It even thinks about the /no_think directive I added.
The only thing that worked is adding {%- set enable_thinking = false -%} to the LM Studio template, which disables thinking for all OWUI access.

Desired Solution you'd like

Add an optional thinking toggle inside OWUI, so we can enable thinking mode only when needed.

Alternatives Considered

Gemini suggested many methods, none of which worked inside OWUI.

Additional Context

No response

@Classic298 commented on GitHub (Mar 10, 2026):

you can use advanced parameters for that in open webui to turn off the thinking as discussed here https://github.com/open-webui/open-webui/issues/21893

duplicate

@kukalikuk commented on GitHub (Mar 10, 2026):

> you can use advanced parameters for that in open webui to turn off the thinking as discussed here #21893
>
> duplicate

Tried most of their methods from that discussion and they failed; some of them still mix up the API-call method and the llama backend. Even the last method they mentioned, adding chat_template_kwargs "enable_thinking": false as an additional parameter, is also not working for API calls to LM Studio.

@cgoudie commented on GitHub (May 6, 2026):

+1, hitting this exact issue with qwen3.5:9b on the Ollama backend through OpenWebUI. Sharing what we found tracing through the source in case it helps the next person searching.

TL;DR: For an Ollama backend, top-level think: false in an OpenAI-compat request body is silently dropped, but options: {"think": false} works.

Reproduction (qwen3.5:9b, Ollama backend, no Modelfile changes):

|                  | think OFF  | think ON               |
|------------------|------------|------------------------|
| Total duration   | 6 s        | 98.7 s                 |
| Generated tokens | 130        | 3,438                  |
| Thinking text    | 0 chars    | 12,663 chars           |
| Output quality   | clean JSON | same JSON, ~16× slower |

Same prompt (4-item description batch with a small system message), only the think flag changes. Output is identical in quality — thinking adds no value for a "summarize each item" task.

Why top-level think is dropped on the Ollama path: convert_payload_openai_to_ollama in backend/open_webui/utils/payload.py (around L277–L325) rebuilds the payload from a hardcoded allowlist. It promotes options.think to top-level think (so Ollama sees it), but only reads think from inside options, never from the root.
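
As a rough illustration of the behavior described above, here is a simplified sketch in Python. It is not the actual Open WebUI source, just a minimal model of an allowlist-style converter that only reads think from inside options:

```python
# Simplified sketch, NOT the actual Open WebUI source: a minimal model of
# why a top-level "think" is dropped by an allowlist-style converter.
def convert_payload_sketch(openai_payload: dict) -> dict:
    ollama_payload = {
        "model": openai_payload.get("model"),
        "messages": openai_payload.get("messages", []),
        "stream": openai_payload.get("stream", False),
    }
    options = dict(openai_payload.get("options", {}))
    # "think" is promoted to the top level only when it sits inside options;
    # openai_payload["think"] at the root is never consulted, so it is lost.
    if "think" in options:
        ollama_payload["think"] = options.pop("think")
    if options:
        ollama_payload["options"] = options
    return ollama_payload
```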

Workarounds that work for the Ollama backend:

  1. Per-request: send {"options": {"think": false}, ...} — the converter lifts it through (see the request sketch after this list).
  2. Per-model: Admin → Models → qwen3.5:9b → Advanced Params → "Think (Ollama)" = Off. Sets params.think in the model row.
  3. Cleaner endpoint: /ollama/v1/chat/completions skips most middleware vs /api/chat/completions while keeping OpenAI shape. Still applies params.think from #2.
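
For workaround 1, a minimal request sketch (assumptions: a local Open WebUI at localhost:3000, a placeholder API key, and the usual OpenAI-compatible response shape):

```python
# Sketch of workaround 1: nest "think" under "options" so the OpenAI->Ollama
# converter lifts it through. URL, API key, and model id are placeholders.
import requests

resp = requests.post(
    "http://localhost:3000/api/chat/completions",
    headers={"Authorization": "Bearer YOUR_OWUI_API_KEY"},
    json={
        "model": "qwen3.5:9b",
        "messages": [{"role": "user", "content": "Summarize each item: ..."}],
        # A top-level "think": false would be silently dropped (see above);
        # inside "options" it survives the conversion and reaches Ollama.
        "options": {"think": False},
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```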

For the LM Studio backend the OP described, the converter path is different and chat_template_kwargs.enable_thinking=false should be the equivalent — but per the OP's testing it doesn't currently propagate. Probably worth a separate issue specifically for the LM Studio code path.
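
One hypothetical way to narrow that down: send the same field to LM Studio directly, bypassing OWUI entirely. The sketch below assumes LM Studio's OpenAI-compatible server on its default localhost:1234; whether a given LM Studio build honors chat_template_kwargs at all is exactly what this would test:

```python
# Hypothetical diagnostic: call LM Studio directly (default port 1234),
# bypassing Open WebUI, to check whether chat_template_kwargs is honored
# at the server itself. Model id is a placeholder; support for this field
# in LM Studio is an assumption, which is what the test probes.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "qwen3.5-9b",  # placeholder model id
        "messages": [{"role": "user", "content": "Say hi."}],
        "chat_template_kwargs": {"enable_thinking": False},
    },
    timeout=120,
)
content = resp.json()["choices"][0]["message"]["content"]
# If <think>...</think> text still appears here, the flag is being ignored
# by the server, not stripped by Open WebUI's converter path.
print(content)
```

If thinking still shows up in a direct call, the fix belongs in the LM Studio template (as the OP found); if it disappears, the field is being stripped somewhere in OWUI's OpenAI-compat path.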
