[GH-ISSUE #22518] feat: Disable Qwen 3.5 thinking mode from API calls #58396
Originally created by @kukalikuk on GitHub (Mar 10, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22518
Problem Description
I'm on OWUI 0.8.10, using API calls to connect to LM Studio.
Every time I use Qwen 3.5 (35B/27B/9B) it always thinks, even for follow-ups, image prompts, chat titles, etc.
I've tried adding /no_think to the prompt, modifying the system prompt, and making a function/filter in OWUI while consulting Gemini; none of it worked. Qwen 3.5 is a stubborn thinker. It even thinks about the /no_think prompt I added.
The only thing that worked is adding {%- set enable_thinking = false -%} to the LM Studio template, which disables thinking for all OWUI access.
Desired Solution you'd like
Add an optional thinking toggle inside OWUI, so we can use thinking mode only when it's needed.
Alternatives Considered
Gemini suggested many methods, none of which worked inside OWUI.
Additional Context
No response
@Classic298 commented on GitHub (Mar 10, 2026):
You can use advanced parameters for that in Open WebUI to turn off thinking, as discussed here: https://github.com/open-webui/open-webui/issues/21893

Duplicate.
@kukalikuk commented on GitHub (Mar 10, 2026):
Tried most of the methods in that discussion and they failed; some of them mix up the API-call method with the llama backend. Even the last method mentioned there, adding chat_template_kwargs `{"enable_thinking": false}` to the additional parameters, is also not working for API calls to LM Studio.
@cgoudie commented on GitHub (May 6, 2026):
+1, hitting this exact issue with qwen3.5:9b on the Ollama backend through OpenWebUI. Sharing what we found tracing through the source in case it helps the next person searching.
TL;DR: For an Ollama backend, a top-level `think: false` in an OpenAI-compat request body is silently dropped, but `options: {"think": false}` works.

Reproduction (qwen3.5:9b, Ollama backend, no Modelfile changes): same prompt (a 4-item description batch with a small system message), with only the `think` flag changed. Output is identical in quality; thinking adds no value for a "summarize each item" task.
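For concreteness, here is a minimal sketch of the two payload shapes. The base URL, port, and API key are placeholders (assumptions, not from the thread); the endpoint is OWUI's OpenAI-compat route mentioned below.

```python
# Repro sketch of the two payload shapes against OWUI's OpenAI-compat
# endpoint. Base URL and API key are placeholders.
import requests

BASE = "http://localhost:3000"  # assumption: default OWUI address
HEADERS = {"Authorization": "Bearer YOUR_OWUI_API_KEY"}
MESSAGES = [
    {"role": "system", "content": "Summarize each item in one line."},
    {"role": "user", "content": "item 1\nitem 2\nitem 3\nitem 4"},
]

# Shape 1: top-level think, silently dropped on the Ollama path.
dropped = {"model": "qwen3.5:9b", "messages": MESSAGES, "think": False}

# Shape 2: think inside options, lifted through by the converter.
works = {"model": "qwen3.5:9b", "messages": MESSAGES, "options": {"think": False}}

for payload in (dropped, works):
    r = requests.post(f"{BASE}/api/chat/completions", json=payload, headers=HEADERS)
    r.raise_for_status()
    # OpenAI-compat response shape.
    print(r.json()["choices"][0]["message"]["content"][:120])
```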
Why top-level `think` is dropped on the Ollama path: `convert_payload_openai_to_ollama` in `backend/open_webui/utils/payload.py` (around L277–L325) rebuilds the payload from a hardcoded allowlist. It promotes `options.think` to top-level `think` (so Ollama sees it), but only reads `think` from inside `options`, never from the root.
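The drop is easy to see in a stripped-down sketch of that allowlist behavior (illustrative only; the real function handles many more fields):

```python
# Sketch of the allowlist behavior described above. Illustrative, not the
# actual convert_payload_openai_to_ollama source.
def convert_payload_openai_to_ollama(openai_payload: dict) -> dict:
    # Rebuild from an allowlist: anything not copied here is lost.
    ollama_payload = {
        "model": openai_payload.get("model"),
        "messages": openai_payload.get("messages"),
        "stream": openai_payload.get("stream", False),
    }
    options = dict(openai_payload.get("options", {}))
    # "think" is promoted from options to the top level so Ollama sees it...
    if "think" in options:
        ollama_payload["think"] = options.pop("think")
    if options:
        ollama_payload["options"] = options
    # ...but a root-level openai_payload["think"] is never read,
    # so it silently disappears here.
    return ollama_payload

# Root-level think is dropped:
assert "think" not in convert_payload_openai_to_ollama(
    {"model": "qwen3.5:9b", "messages": [], "think": False}
)
# options.think is lifted through:
assert convert_payload_openai_to_ollama(
    {"model": "qwen3.5:9b", "messages": [], "options": {"think": False}}
)["think"] is False
```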
{"options": {"think": false}, ...}— the converter lifts it through.params.thinkin the model row./ollama/v1/chat/completionsskips most middleware vs/api/chat/completionswhile keeping OpenAI shape. Still appliesparams.thinkfrom #2.For the LM Studio backend the OP described, the converter path is different and
chat_template_kwargs.enable_thinking=falseshould be the equivalent — but per the OP's testing it doesn't currently propagate. Probably worth a separate issue specifically for the LM Studio code path.