[GH-ISSUE #10809] Ollama supports the enable_thinking parameter #32858

Closed
opened 2026-04-22 14:44:26 -05:00 by GiteaMirror · 5 comments

Originally created by @yoin528 on GitHub (May 22, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10809

When deploying Qwen3, Ollama could support the `enable_thinking` parameter. The official Qwen documentation states that `enable_thinking=False` disables thinking, but after passing the parameter to an Ollama deployment, the model still enters thinking mode. Only by adding `/no_think` to every inference request can we ensure that no thinking is done. Could Ollama accept additional generic parameters and pass them through transparently to the underlying model, to accommodate model-specific features?
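For context on the fix referenced later in this thread: the change in #10584 (shipped in Ollama v0.9.0) adds a top-level `think` field to `/api/chat`. A minimal sketch of a request that disables thinking, assuming a v0.9.0+ server and a thinking-capable model:

```
POST http://localhost:11434/api/chat
Content-Type: application/json

{
  "model": "qwen3:0.6b",
  "messages": [
    { "role": "user", "content": "1+1?" }
  ],
  "think": false,
  "stream": false
}
```

With `"think": false`, no `/no_think` marker needs to be appended to the prompt on each turn.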


@rick-github commented on GitHub (May 22, 2025):

#10584


@yoin528 commented on GitHub (May 23, 2025):

> #10584

What I am asking is whether a configurable switch could be implemented in the calling interface or model configuration when invoking the Qwen3 model. That way there would be no need to add `/no_think` to every request; repeating `/no_think` across multiple rounds of conversation is very troublesome, especially when subsequent tool calls are involved.


@rick-github commented on GitHub (May 23, 2025):

> The question I am asking is whether it is possible to implement a configurable switch parameter in the calling interface

Yes, as per #10584.
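The same change also surfaces CLI flags. A sketch of how the switch is used from `ollama run`, assuming a v0.9.0+ build (older builds reject the flags):

```
# disable thinking for the session
ollama run qwen3:0.6b --think=false
# or leave thinking enabled but hide it from the output
ollama run qwen3:0.6b --hidethinking "1+1?"
```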


@yoin528 commented on GitHub (May 26, 2025):

After compiling ollama as discussed and starting `ollama serve`, currently only the following request works after a normal startup:

```
POST http://localhost:11434/api/chat
Content-Type: application/json

{
  "model": "qwen3:0.6b",
  "messages": [
    {
      "role": "system",
      "content": "你是个智能助手,请根据提问回答问题。/no_think"
    },
    {
      "role": "user",
      "content": "1+1?"
    }
  ],
  "stream": false
}
```

(The system prompt reads: "You are an intelligent assistant; please answer the question asked.", followed by `/no_think`.)

Response:

```
{
  "model": "qwen3:0.6b",
  "created_at": "2025-05-26T03:04:17.792722554Z",
  "message": {
    "role": "assistant",
    "content": "<think>\n\n</think>\n\n1加1等于2。"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 1164766885,
  "load_duration": 16615755,
  "prompt_eval_count": 31,
  "prompt_eval_duration": 583130577,
  "eval_count": 11,
  "eval_duration": 562205320
}
```

(The reply content is "1 plus 1 equals 2.", preceded by an empty think block.)

But the response still contains an empty think block:

```
<think>\n\n</think>\n\n
```

Has this problem been resolved? Can the empty `<think></think>` block be removed? Additionally, using `/hidethinking` does not work and cannot remove the empty block. Furthermore, the compiled binary does not accept the think flag:

```
root@b5d8fddb60e3:/app# ./ollama run qwen3:0.6b --think=false
Error: unknown flag: --think
```

or

```
root@b5d8fddb60e3:/app# ./ollama run qwen3:0.6b --think=false --hidethinking
Error: unknown flag: --think
```

How should I run the compiled ollama? Thank you very much for your answer.


@rick-github commented on GitHub (May 30, 2025):

https://github.com/ollama/ollama/releases/tag/v0.9.0

Reference: github-starred/ollama#32858