[GH-ISSUE #10928] ollama 0.9.0 can support server level think mode #7191

Open
opened 2026-04-12 19:11:22 -05:00 by GiteaMirror · 4 comments

Originally created by @hlstudio on GitHub (May 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10928

Ollama 0.9.0 improved thinking mode, which can now be enabled or disabled via the CLI or the API.
Is there any way to keep a specific model in non-thinking mode by default when it is used through the serve interface? I'd rather not adjust the settings in each application manually.
For example:
1. Add an option (`--nothink`) to `ollama serve` that cannot be overridden
2. Allow an environment variable: `OLLAMA_THINK`

Or is there some other way, such as hacking the model template file?
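
Until a server-level switch exists, one possible workaround is to force `think` off on every request before it reaches the server. The following is a minimal sketch in Python, assuming requests go to Ollama's `/api/chat` endpoint at the default local address; `force_no_think` and `chat_without_thinking` are hypothetical helper names, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/chat"


def force_no_think(payload: dict) -> dict:
    """Return a copy of the request body with thinking disabled,
    overriding whatever the caller asked for."""
    patched = dict(payload)
    patched["think"] = False
    return patched


def chat_without_thinking(payload: dict, url: str = OLLAMA_URL) -> dict:
    """Send a non-streaming chat request with `think` forced to false."""
    body = force_no_think({**payload, "stream": False})
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Supplying `data` makes urllib issue a POST.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

This only helps when you control the code that builds the request. For a closed client, the same patch would have to run in a proxy sitting between the client and `ollama serve`.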

GiteaMirror added the feature request label 2026-04-12 19:11:22 -05:00

@shakti-bhakta commented on GitHub (May 31, 2025):

I also think this would be helpful. I am using Chatbox as the chat interface.
I cannot find any way to disable thinking, since the `/set nothink` command does not work there and I cannot modify the JSON request that is sent.

It seems like this suggestion would solve my issue as well.


@webdev23 commented on GitHub (Jun 3, 2025):

Side note: do not break the API.

The devs added "thinking" capabilities, OK, great.

But:

> In Ollama's API, a model's thinking is now returned as a separate thinking field for easy parsing

Thanks, but that is a deep change to response parsing, so people building their own UIs and tools have to rewrite their response handling.

Please add capabilities but **do not break existing API calls and responses**, so that we can adopt the newest capabilities without the mess.


@bakman2 commented on GitHub (Jun 4, 2025):

> Thanks but that is a deep change in the parsing response, so peoples making their own UI and tools have to rewrite response handling.

If you do not specify `think: true/false` in the request, the response will be the same as before, i.e. it should be non-breaking.
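
For tools that need to cope with both response shapes, a small normalizer may be enough. This is a rough sketch, assuming `message` is the `message` object of an `/api/chat` response and that, when `think` is not set, a reasoning model's thinking arrives inline as a `<think>...</think>` block in `content`; `split_thinking` is a hypothetical helper name.

```python
import re
from typing import Tuple

# Matches the first inline reasoning block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(message: dict) -> Tuple[str, str]:
    """Return (thinking, answer) for either response shape."""
    content = message.get("content", "")
    thinking = message.get("thinking")
    if thinking:
        # Separate-field shape: the server already split out the reasoning.
        return thinking.strip(), content.strip()
    # Inline shape: the reasoning, if any, is embedded in the content.
    match = THINK_BLOCK.search(content)
    if match is None:
        return "", content.strip()
    answer = content[:match.start()] + content[match.end():]
    return match.group(1).strip(), answer.strip()
```

Existing UI code can then call `split_thinking` on every reply and ignore the first element if it does not display reasoning.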


@hlstudio commented on GitHub (Jun 12, 2025):

Try this Modelfile: add `\n<think>\n\n</think>\n\n` after `<|Assistant|>`.

```
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
FROM deepseek-r1:8b

TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<|User|>{{ .Content }}
{{- else if eq .Role "assistant" }}<|Assistant|>\n<think>\n\n</think>\n\n
  {{- if and $.IsThinkSet (and $last .Thinking) -}}
<think>
{{ .Thinking }}
</think>
{{- end }}{{ .Content }}{{- if not $last }}<|end▁of▁sentence|>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<|Assistant|>\n<think>\n\n</think>\n\n
{{- if and $.IsThinkSet (not $.Think) -}}
<think>

</think>

{{ end }}
{{- end -}}
{{- end }}"""
PARAMETER top_p 0.95
PARAMETER stop <|begin▁of▁sentence|>
PARAMETER stop <|end▁of▁sentence|>
PARAMETER stop <|User|>
PARAMETER stop <|Assistant|>
PARAMETER temperature 0.6
```

Reference: github-starred/ollama#7191