[GH-ISSUE #12493] Add /set silentthinking option #34054

Open
opened 2026-04-22 17:17:11 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @iplayfast on GitHub (Oct 3, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12493

Sometimes you want the model to think, but you don't want to suffer through the verbal diarrhea. An enhancement would be, don't output any text that is thinking, and only output the final resulting text.

Originally created by @iplayfast on GitHub (Oct 3, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/12493 Sometimes you want the model to think, but you don't want to suffer through the verbal diarrhea. An enhancement would be, don't output any text that is thinking, and only output the final resulting text.
GiteaMirror added the feature request label 2026-04-22 17:17:11 -05:00
Author
Owner

@FieldMouse-AI commented on GitHub (Oct 5, 2025):

Sometimes you want the model to think, but you don't want to suffer through the verbal diarrhea. An enhancement would be, don't output any text that is thinking, and only output the final resulting text.

Hello, @iplayfast , am I to understand you correctly, based on your question it would suggest that thinking being on changes how a model produces its response?

That surprises me. I thought that it was just like a kind of verbose option that lets us see how the model produced a response without changing the actual response.

Can you help clarify this for me, please?

<!-- gh-comment-id:3368828273 --> @FieldMouse-AI commented on GitHub (Oct 5, 2025): > Sometimes you want the model to think, but you don't want to suffer through the verbal diarrhea. An enhancement would be, don't output any text that is thinking, and only output the final resulting text. Hello, @iplayfast , am I to understand you correctly, based on your question it would suggest that `thinking` being on changes how a model produces its response? That surprises me. I thought that it was just like a kind of **verbose** option that lets us see how the model produced a response without changing the actual response. Can you help clarify this for me, please?
Author
Owner

@rick-github commented on GitHub (Oct 5, 2025):

The tokens produced during inference influence the following tokens. Thinking is used by models to "reason" about an answer before generating it. Disabling thinking (on models that support it) removes those early tokens and the final result may be different.

<!-- gh-comment-id:3369210608 --> @rick-github commented on GitHub (Oct 5, 2025): The tokens produced during inference influence the following tokens. Thinking is used by models to "reason" about an answer before generating it. Disabling thinking (on models that support it) removes those early tokens and the final result may be different.
Author
Owner

@rick-github commented on GitHub (Oct 5, 2025):

In case OP is unaware, the ollama client has the --hidethinking command line argument, but it's not settable via /set, which is what I think the OP wants.

<!-- gh-comment-id:3369215333 --> @rick-github commented on GitHub (Oct 5, 2025): In case OP is unaware, the ollama client has the `--hidethinking` command line argument, but it's not settable via `/set`, which is what I think the OP wants.
Author
Owner

@iplayfast commented on GitHub (Oct 5, 2025):

I was not actually aware of the --hidethinking command. So I guess /set hidethinking is what I'm looking for.

<!-- gh-comment-id:3369340911 --> @iplayfast commented on GitHub (Oct 5, 2025): I was not actually aware of the --hidethinking command. So I guess /set hidethinking is what I'm looking for.
Author
Owner

@YuenSzeHong commented on GitHub (Jan 31, 2026):

i am on podman (docker) 0.15.2, with 5070ti, for some reason, in gpt-oss and qwen3 /set nothink didnt turn off thinking

<!-- gh-comment-id:3828385980 --> @YuenSzeHong commented on GitHub (Jan 31, 2026): i am on podman (docker) 0.15.2, with 5070ti, for some reason, in gpt-oss and qwen3 /set nothink didnt turn off thinking
Author
Owner

@rick-github commented on GitHub (Jan 31, 2026):

Thinking cannot be disabled for gpt-oss, just set to a different level: high, medium, low. Some qwen3 models come in two variants: instruct and thinking. If you want to disable thinking, use the non-thinking model, eg qwen3:4b-instruct.

<!-- gh-comment-id:3828539188 --> @rick-github commented on GitHub (Jan 31, 2026): Thinking cannot be disabled for gpt-oss, just set to a different level: high, medium, low. Some qwen3 models come in two variants: instruct and thinking. If you want to disable thinking, use the non-thinking model, eg qwen3:4b-instruct.
Author
Owner

@iplayfast commented on GitHub (Feb 3, 2026):

I thought it was basically filtering out the thinking and only showing the results, I don't really care if there is thinking going on under the surface, I just don't want to see it.

<!-- gh-comment-id:3843047175 --> @iplayfast commented on GitHub (Feb 3, 2026): I thought it was basically filtering out the thinking and only showing the results, I don't really care if there is thinking going on under the surface, I just don't want to see it.
Author
Owner

@AstroTheRabbit commented on GitHub (Mar 13, 2026):

Recently installed ollama, and I'd definitely like a hidethinking parameter. Being able to hide the thinking output by passing a command to the CLI itself, but not being able to set it permanently (in a modelfile PARAMETER) or being able to enable it mid-session (via /set ...) seems kinda weird.

<!-- gh-comment-id:4052743991 --> @AstroTheRabbit commented on GitHub (Mar 13, 2026): Recently installed ollama, and I'd definitely like a `hidethinking` parameter. Being able to hide the thinking output by passing a command to the CLI itself, but not being able to set it permanently (in a modelfile `PARAMETER`) or being able to enable it mid-session (via `/set ...`) seems kinda weird.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#34054