[GH-ISSUE #12837] qwen3-vl: Can't turn off thinking #55019

Closed
opened 2026-04-29 08:11:53 -05:00 by GiteaMirror · 3 comments

Originally created by @pks on GitHub (Oct 29, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12837

What is the issue?

It seems that the `--think=false` setting has no effect:

```
> echo "test" | ollama run qwen3-vl:8b --think=false
Thinking...
Okay, the user sent "test". Let me figure out what they need.

Hmm, maybe they're checking if the system is working. Or they might be testing the response time. It's a pretty generic message, so I should respond in a friendly way.

I should acknowledge their message and ask how I can help. Keep it open-ended so they can specify their request. Let me make sure the response is polite and helpful. Maybe add an emoji to keep it friendly.

Wait, but they might be new users, so I should be welcoming. Let me check if there's any specific context I'm missing. Since it's just "test", probably just a test message. Alright, I'll respond with a greeting and an offer to assist.
...done thinking.

Hi there! 👋 It looks like you're testing the system—no worries, I'm here to help! 😊 How can I assist you today? Let me know what you need!
```

Also tested with the Python client, setting `think=False` when calling `.generate()`. Tested with both the 8b and 32b variants.
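For reference, a minimal sketch of what that Python call amounts to on the wire (assuming the `ollama` package's `generate()` forwards its `think` argument as the `think` field of a POST to `/api/generate`; the model name and prompt mirror the CLI example above):

```python
import json

# Approximate request body produced by ollama.generate(model=...,
# prompt=..., think=False). With the dedicated qwen3-vl "thinking"
# variants the field is ignored, which is the behavior reported here.
payload = {
    "model": "qwen3-vl:8b",
    "prompt": "test",
    "think": False,   # request that no thinking block be emitted
    "stream": False,
}

body = json.dumps(payload)
print(body)
```

The flag being present in the request but ignored by the model is consistent with the resolution below: the thinking-tagged variants always think, so the field only has an effect on models that support toggling it.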

Relevant log output


OS

Linux

GPU

RTX A4000

CPU

Intel Xeon

Ollama version

ollama version is 0.12.7-rc0

GiteaMirror added the bug label 2026-04-29 08:11:53 -05:00

@pdevine commented on GitHub (Oct 29, 2025):

@pks You need one of the `instruct` models and not one of the `thinking` models. Unfortunately the Qwen team split those models apart. You can find a list of the qwen3-vl models here: https://ollama.com/library/qwen3-vl/tags


@pdevine commented on GitHub (Oct 29, 2025):

Also, sorry this is so confusing!


@pks commented on GitHub (Oct 29, 2025):

Oh, didn't notice these. Thanks for the quick reply @pdevine!


Reference: github-starred/ollama#55019