[GH-ISSUE #12610] Can't disable thinking for qwen3:30b on API & CLI #34131

Closed
opened 2026-04-22 17:25:58 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @notdanilo on GitHub (Oct 14, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12610

What is the issue?

Bug description
Disabling thinking both on API and CLI is not working for qwen3:30b.

Environment
Windows 11
ollama version is 0.12.3
qwen3:30b updated @ Oct 10, 2025 8:18 PM UTC

Steps to reproduce

Danilo@DESKTOP-KF9E5B1 MINGW64 /d/dev/sensorial/systems/agents (main)
$ ollama run qwen3:30b
>>> /set nothink
Set 'nothink' mode.
>>> Hello
Okay, the user said "Hello". I need to respond appropriately. Let me think.

First, I should greet them back. Maybe say "Hello! How can I assist you today?" That's friendly and open-ended.

Wait, the user might be testing if I'm working. But since it's a simple greeting, a standard response should be fine. Let me check if there's any context I'm missing. The user's message is just "Hello", so no additional info.

I should keep it simple and polite. Avoid being too long. "Hello! How can I assist you today?" seems good. Maybe add an emoji to make it friendly, like a smiley face. But the user didn't specify, so maybe stick to the standard.

Wait, the previous response in the example had "Hello! How can I assist you today? 😊" So maybe include the emoji. Let me confirm: the user's message is "Hello", so the assistant should respond with a greeting and offer help.

Yes, that's standard. So the response should be "Hello! How can I assist you today? 😊" or similar. Let me make sure there's no typo. "Assist" is spelled correctly. The emoji is a smiley, which is common in such responses.

I think that's it. Just a friendly greeting and an offer to help. No need for more unless the user asks something else.
</think>

Hello! How can I assist you today? 😊

Expected behavior
Stop thinking

Originally created by @notdanilo on GitHub (Oct 14, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/12610 ### What is the issue? **Bug description** Disabling thinking both on API and CLI is not working for `qwen3:30b`. **Environment** Windows 11 ollama version is 0.12.3 qwen3:30b updated @ Oct 10, 2025 8:18 PM UTC **Steps to reproduce** ``` Danilo@DESKTOP-KF9E5B1 MINGW64 /d/dev/sensorial/systems/agents (main) $ ollama run qwen3:30b >>> /set nothink Set 'nothink' mode. >>> Hello Okay, the user said "Hello". I need to respond appropriately. Let me think. First, I should greet them back. Maybe say "Hello! How can I assist you today?" That's friendly and open-ended. Wait, the user might be testing if I'm working. But since it's a simple greeting, a standard response should be fine. Let me check if there's any context I'm missing. The user's message is just "Hello", so no additional info. I should keep it simple and polite. Avoid being too long. "Hello! How can I assist you today?" seems good. Maybe add an emoji to make it friendly, like a smiley face. But the user didn't specify, so maybe stick to the standard. Wait, the previous response in the example had "Hello! How can I assist you today? 😊" So maybe include the emoji. Let me confirm: the user's message is "Hello", so the assistant should respond with a greeting and offer help. Yes, that's standard. So the response should be "Hello! How can I assist you today? 😊" or similar. Let me make sure there's no typo. "Assist" is spelled correctly. The emoji is a smiley, which is common in such responses. I think that's it. Just a friendly greeting and an offer to help. No need for more unless the user asks something else. </think> Hello! How can I assist you today? 😊 ``` **Expected behavior** Stop thinking
GiteaMirror added the bug label 2026-04-22 17:25:58 -05:00
Author
Owner

@notdanilo commented on GitHub (Oct 14, 2025):

Possibly related to #12575

<!-- gh-comment-id:3402044731 --> @notdanilo commented on GitHub (Oct 14, 2025): Possibly related to #12575
Author
Owner

@rick-github commented on GitHub (Oct 14, 2025):

qwen3:30b is an alias for qwen3:30b-a3b-thinking-2507-q4_K_M, a thinking-only model. If you want to use a model that does not think, use qwen3:30b-a3b-instruct-2507-q4_K_M. If you want to be able to turn thinking on and off, use the hybrid model, qwen3:30b-a3b-q4_K_M.

<!-- gh-comment-id:3402631947 --> @rick-github commented on GitHub (Oct 14, 2025): qwen3:30b is an alias for qwen3:30b-a3b-thinking-2507-q4_K_M, a thinking-only model. If you want to use a model that does not think, use qwen3:30b-a3b-instruct-2507-q4_K_M. If you want to be able to turn thinking on and off, use the hybrid model, qwen3:30b-a3b-q4_K_M.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#34131