[GH-ISSUE #10492] Disable Thinking Mode #6901

Closed
opened 2026-04-12 18:46:49 -05:00 by GiteaMirror · 26 comments

Originally created by @ChenDianWzh on GitHub (Apr 30, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10492

With the advent of Qwen3, I feel that Ollama could add a new parameter, usable from Python calls, to control whether the model thinks or not. I hope this can be implemented, thank you.
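
Note: Ollama 0.9.0 and later expose a `think` option for models with thinking control (no such parameter existed when this issue was opened). A minimal Python sketch, assuming an Ollama server on 0.9.0+ and an ollama-python release that forwards the `think` argument:

```python
# Minimal sketch: disable thinking for a Qwen3 chat call.
# Assumes an Ollama server >= 0.9.0 and an ollama-python release that
# supports the `think` argument; the model name is just an example.
from ollama import chat

response = chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "Give me a short introduction to large language models."}],
    think=False,  # skip the <think> block for models that support thinking control
)

print(response.message.content)
```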

GiteaMirror added the feature request label 2026-04-12 18:46:49 -05:00

@crazyi commented on GitHub (Apr 30, 2025):

I also want to know how to disable thinking in ollama, or how to switch between the two freely.

@yjwu-leadstec commented on GitHub (Apr 30, 2025):

Input /nothink after your prompt

@yebanliuying commented on GitHub (Apr 30, 2025):

That's right, similar to: ollama run qwen3:32b -- enable_thinking=false

@myf5 commented on GitHub (Apr 30, 2025):

vLLM provides this switch via chat_template_kwargs in the API call. How can I do something similar in ollama?

```
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "Qwen/Qwen3-8B",
  "messages": [
    {"role": "user", "content": "Give me a short introduction to large language models."}
  ],
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 20,
  "max_tokens": 8192,
  "presence_penalty": 1.5,
  "chat_template_kwargs": {"enable_thinking": false}
}'
```
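
For the Ollama side: since version 0.9.0 the native /api/chat endpoint accepts a top-level `think` boolean, which is the closest analogue to vLLM's `chat_template_kwargs`. A rough Python sketch of the same request, assuming such a server version, with the sampling settings moved under `options`:

```python
# Rough Ollama analogue of the vLLM request above (assumes Ollama >= 0.9.0,
# where /api/chat accepts a top-level "think" flag; older servers lack it).
import requests

payload = {
    "model": "qwen3:8b",
    "messages": [
        {"role": "user", "content": "Give me a short introduction to large language models."}
    ],
    "think": False,           # counterpart of chat_template_kwargs.enable_thinking = false
    "stream": False,
    "options": {              # sampling settings live under "options" in Ollama's native API
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 20,
        "num_predict": 8192,  # rough equivalent of max_tokens
    },
}

r = requests.post("http://localhost:11434/api/chat", json=payload, timeout=300)
r.raise_for_status()
print(r.json()["message"]["content"])
```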

@smileyboy2019 commented on GitHub (Apr 30, 2025):

@yebanliuying How do I set this in the ollama API?

@ChenDianWzh commented on GitHub (Apr 30, 2025):

I want to call it from Python and see whether thinking can be disabled.

@yjwu-leadstec commented on GitHub (Apr 30, 2025):

![Image](https://github.com/user-attachments/assets/4420c220-8b4f-4b00-bfe6-9985785b215f)

Check their blog.

@ChenDianWzh commented on GitHub (Apr 30, 2025):

Using ollama from Python is still different from this; I hope a parameter can be added to control it.

@yjwu-leadstec commented on GitHub (Apr 30, 2025):

https://bailian.console.aliyun.com/?tab=api#/api/?type=model&url=https%3A%2F%2Fhelp.aliyun.com%2Fdocument_detail%2F2712576.html

Or take a look at Alibaba's API documentation.

@ChenDianWzh commented on GitHub (Apr 30, 2025):

thank you bro

@kalustian commented on GitHub (May 10, 2025):

> That's right, similar to: ollama run qwen3:32b -- enable_thinking=false

I agree, I would also like to see a built-in parameter in Ollama to disable the thinking mode. Sometimes it gets annoying to type /no_think during inference.

@hintdesk commented on GitHub (May 12, 2025):

It would be great if we could disable it via parameters. I tried it in the template, but it doesn't work for DeepSeek at all.

![Image](https://github.com/user-attachments/assets/93c45aac-6e89-41f2-9e2b-04adc421a4e1)

@yjwu-leadstec commented on GitHub (May 12, 2025):

> It would be great if we could disable it via parameters. I tried it in the template, but it doesn't work for DeepSeek at all.
>
> Image

Disable DeepSeek thinking? Only the Qwen3 MoE models can disable thinking... DeepSeek is not a Qwen3 MoE model...

@fslongjin commented on GitHub (May 29, 2025):

I've developed a proxy tool that disables the thinking mode of qwen3 on ollama. Once the proxy is up and running, you simply need to set the Ollama endpoint used for qwen3 requests to the proxy.

https://github.com/fslongjin/qwen3-ollama-no-thinking-proxy

@anuramat commented on GitHub (May 29, 2025):

I think it's implemented in 0.9.0: https://github.com/ollama/ollama/releases/tag/v0.9.0-rc0
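
The 0.9.0 release describes per-request thinking control. A short sketch of switching it per call from Python, assuming an Ollama server on 0.9.0+ and an ollama-python release that exposes the `think` argument and a `message.thinking` field:

```python
# Sketch: toggling thinking per request (assumes Ollama >= 0.9.0 and a
# recent ollama-python; the `thinking` field only appears when think=True).
from ollama import chat

messages = [{"role": "user", "content": "What is 17 * 23?"}]

with_thinking = chat(model="qwen3:8b", messages=messages, think=True)
print("reasoning:", with_thinking.message.thinking)  # chain of thought, kept separate from content
print("answer:", with_thinking.message.content)

without_thinking = chat(model="qwen3:8b", messages=messages, think=False)
print("answer:", without_thinking.message.content)   # no <think> block in the output
```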

@1257mp commented on GitHub (Jul 17, 2025):

> Input /nothink after your prompt

Ollama is (at least through my testing) now using the `/set` command to enable/disable modes/settings/etc., so this should now be:

`/set nothink` --> to DISABLE thinking while running the model (I have only really confirmed this behaviour with the deepseek LLMs)

`/set think` --> to ENABLE thinking while running the model (again, only tested on deepseek LLMs)

All of this should be run after the `ollama run <LLM model name>` command from the CLI, unless you are using another method to run this.

@mkozjak commented on GitHub (Aug 7, 2025):

Doesn't work with Ollama 0.9.6.

```
> ollama run qwen3:4b
>>> /set nothink
Set 'nothink' mode.
>>> hello
<think>
Okay, the user sent "hello". I need to respond appropriately. First, I should greet them in a friendly and professional
way. Since they just said "hello", I should acknowledge their greeting and maybe ask how I can assist them today.

Let me think about the best response. In Chinese, "hello" is "你好". So I can start with "你好!有什么我可以帮助你的吗?
" which translates to "Hello! How can I help you today?".

Wait, but sometimes people might want a more casual response. Let me check the standard phrases. In Chinese, common
greetings are "你好" followed by a question to engage them.

I should make sure the response is natural and not too formal. Maybe add a smiley emoji to keep it friendly. Let me
see. The user might be testing if I can handle simple greetings, so the response should be straightforward.

Also, need to avoid any errors in the Chinese. Let me confirm the translation. "你好!有什么我可以帮助你的吗?" is
correct. The phrase "有什么我可以帮助你的吗" is a common way to ask "How can I help you?"

Yes, that's right. So the response would be: 你好!有什么我可以帮助你的吗?

Wait, but the user might be expecting an English response. Wait, the user wrote "hello" in English, but the
instructions say the assistant should respond in Chinese. Wait, the problem says "You are an AI assistant. You must
respond in Chinese. Please write the response."

Wait, the user's message is "hello", and the instruction says to respond in Chinese. So I need to respond in Chinese.

Yes, so the response should be in Chinese. So the correct response is 你好!有什么我可以帮助你的吗?

Let me check if there's a more natural way. Sometimes people use "你好!有什么我可以帮你的吗?" but "帮助你" is more
common than "帮助你的". Wait, "帮助你" is the object, so "有什么我可以帮助你的吗" is correct.

Alternatively, maybe "你好!需要我帮忙吗?" but that's a bit more direct. But the user said "hello", so the standard
response is to ask how they can help.

Hmm, the most natural response in Chinese for "hello" is "你好!有什么我可以帮助你的吗?" So I'll go with that.

Wait, but the user might be using English, but the assistant must respond in Chinese. So the response is in Chinese.
Let me make sure the grammar is correct.

Yes, "有什么我可以帮助你的吗" is correct. The structure is "有什么(what)我可以(I can)帮助你的(help you)吗(
question marker)".

So the response is 你好!有什么我可以帮助你的吗?

Adding a smiley emoji might be good, like 你好!有什么我可以帮助你的吗?😊

But the instructions don't specify emojis, but in Chinese chat, emojis are common. Maybe better to include one to be
friendly.

Wait, the user's message is "hello", so the response should be concise and friendly.

Yes, I think that's the right response.
</think>

你好!有什么我可以帮助你的吗?😊

>>> Send a message (/? for help)
```

@rick-github commented on GitHub (Aug 7, 2025):

```console
$ ollama -v
ollama version is 0.9.6
$ ollama run qwen3:4b
>>> /set nothink
Set 'nothink' mode.
>>> hello
Hello! How can I assist you today? 😊

>>>
```

Try re-pulling the model; think support in ollama requires an updated template.

@mkozjak commented on GitHub (Aug 7, 2025):

> $ ollama -v
> ollama version is 0.9.6
> $ ollama run qwen3:4b
> >>> /set nothink
> Set 'nothink' mode.
> >>> hello
> Hello! How can I assist you today? 😊
>
> Try re-pulling the model; think support in ollama requires an updated template.

```
mkozjak@mbp:~ > ollama rm qwen3:4b
deleted 'qwen3:4b'
mkozjak@mbp:~ > ollama pull qwen3:4b
pulling manifest
pulling 3e4cb1417446: 100% ▕███████████████████████████████████████████████████████████████▏ 2.5 GB
pulling 53e4ea15e8f5: 100% ▕███████████████████████████████████████████████████████████████▏ 1.5 KB
pulling d18a5cc71b84: 100% ▕███████████████████████████████████████████████████████████████▏  11 KB
pulling cff3f395ef37: 100% ▕███████████████████████████████████████████████████████████████▏  120 B
pulling e18a783aae55: 100% ▕███████████████████████████████████████████████████████████████▏  487 B
verifying sha256 digest
writing manifest
success
mkozjak@mbp:~ > ollama -v
ollama version is 0.9.6
Warning: client version is 0.11.3
mkozjak@mbp:~ > ollama run qwen3:4b
>>> /set nothink
Set 'nothink' mode.
>>> hello
<think>
Okay, the user said "hello". I need to respond appropriately. Let me think.

First, "hello" is a greeting, so I should acknowledge it. Maybe start with a friendly response. Since the user is just
saying hello, I don't have much context. I should keep it simple and open-ended^C

>>> Send a message (/? for help)
```

I'm on a Mac.

@rick-github commented on GitHub (Aug 7, 2025):

It looks like the model was updated 11 hours ago to push new weights and remove the thinking control:

```diff
$ diff -u <(ollama show --template qwen3:4b-orig) <(ollama show --template qwen3:4b)
--- /dev/fd/63	2025-08-07 12:28:31.619240649 +0200
+++ /dev/fd/62	2025-08-07 12:28:31.620240717 +0200
@@ -30,14 +30,7 @@
 {{- range $i, $_ := .Messages }}
 {{- $last := eq (len (slice $.Messages $i)) 1 -}}
 {{- if eq .Role "user" }}<|im_start|>user
-{{ .Content }}
-{{- if and $.IsThinkSet (eq $i $lastUserIdx) }}
-   {{- if $.Think -}}
-      {{- " "}}/think
-   {{- else -}}
-      {{- " "}}/no_think
-   {{- end -}}
-{{- end }}<|im_end|>
+{{ .Content }}<|im_end|>
 {{ else if eq .Role "assistant" }}<|im_start|>assistant
 {{ if (and $.IsThinkSet (and .Thinking (or $last (gt $i $lastUserIdx)))) -}}
 <think>{{ .Thinking }}</think>
@@ -54,11 +47,5 @@
 </tool_response><|im_end|>
 {{ end }}
 {{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
-{{ if and $.IsThinkSet (not $.Think) -}}
-<think>
-
-</think>
-
-{{ end -}}
 {{ end }}
 {{- end }}
\ No newline at end of file
```

Perhaps the new model will have a different mechanism for controlling thinking.

@drifkin

@drifkin commented on GitHub (Aug 7, 2025):

So these new Qwen models don't have thinking control; instead, they expect you to use a thinking model vs. a non-thinking model. We'll think about whether we should offer automatic switching in the CLI via these existing commands, but we need to think through the implications of that a bit more; it could get really complicated!

@TranQuyenSinh commented on GitHub (Aug 11, 2025):

I found that the `/set nothink` command or `--think=false` also works with qwen3:8b but doesn't with qwen3:4b.

@MrMuhannadObeidat commented on GitHub (Aug 15, 2025):

--think=false doesn't work with qwen3:4b, making the model completely unusable, for me at least. What a shame!

@rick-github commented on GitHub (Aug 15, 2025):

Use the non-thinking version of the model, [qwen3:4b-instruct-2507-q4_K_M](https://ollama.com/library/qwen3:4b-instruct-2507-q4_K_M).

```console
$ ollama run qwen3:4b-instruct-2507-q4_K_M hello
Hello! How can I assist you today? 😊
```

@user123-source commented on GitHub (Aug 16, 2025):

Please list these models separately from the original Qwen3 to avoid this confusion.

@MrMuhannadObeidat commented on GitHub (Aug 16, 2025):

@rick-github thanks for pointing to that. It works without the thinking piece.

Reference: github-starred/ollama#6901