[GH-ISSUE #11979] How to enable and disable the thinking function of the model when openai calls ollama #70012

Open
opened 2026-05-04 20:04:12 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @zkj12321 on GitHub (Aug 20, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11979

When using the OpenAI library to call a model deployed locally in Ollama, how can the model's thinking function be enabled or disabled? I tried adding the relevant configuration to the `extra_body` field, but it did not take effect. Have you made any adaptations in this area yet?
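For reference, this is the kind of call the report describes: forwarding Ollama's non-standard `think` option through the OpenAI client's `extra_body`, which the openai-python client merges into the request JSON verbatim. Whether the OpenAI-compatible endpoint honors the field depends on the Ollama version; the model name below is a placeholder.

```python
# Sketch: passing Ollama's non-standard "think" option via extra_body.
# Whether the OpenAI-compatible endpoint honors it depends on the Ollama
# version; "qwen3" is a placeholder model name.

def build_chat_kwargs(prompt: str, think: bool) -> dict:
    """Arguments for client.chat.completions.create, with `think` in extra_body."""
    return {
        "model": "qwen3",  # placeholder reasoning model
        "messages": [{"role": "user", "content": prompt}],
        # extra_body fields are merged into the JSON request body verbatim
        "extra_body": {"think": think},
    }

# Usage (requires `pip install openai` and a running Ollama server):
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
# resp = client.chat.completions.create(**build_chat_kwargs("Hello", think=False))
```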

GiteaMirror added the feature request label 2026-05-04 20:04:12 -05:00

@mbenchwevioo commented on GitHub (Aug 20, 2025):

Also, how can thinking be hidden from the output, displaying only the response?
An env variable would be more suitable, especially for docker-compose, something like OLLAMA_HIDE_THINKING=1

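A docker-compose sketch of what the proposed variable could look like. Note that `OLLAMA_HIDE_THINKING` is only a suggestion in this thread, not an existing Ollama setting:

```yaml
# Hypothetical: OLLAMA_HIDE_THINKING is only proposed in this thread,
# not an existing Ollama environment variable.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_HIDE_THINKING=1  # proposed: strip thinking from API responses
```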

@pdevine commented on GitHub (Aug 20, 2025):

@mbenchwevioo do you mean hide it from the OpenAI API, or hide it from the CLI?


@mbenchwevioo commented on GitHub (Aug 20, 2025):

When I use the chat/completions API with a reasoning model, I get the thinking output along with the response.
It would be better if we could choose when we start Ollama, via an env variable THINK with the values enable / disable / hide:
Enable: enable thinking
Disable: disable thinking
Hide: enable thinking but hide it and show only the response


@zkj12321 commented on GitHub (Aug 21, 2025):

@pdevine Ollama previously added `--think=false` to disable the model's thinking, but what should be done to hide it from the OpenAI API?
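The native API equivalent of the `--think=false` CLI flag is a top-level `think` field on `/api/chat` (this is the native endpoint, not the OpenAI-compatible `/v1/chat/completions`). A minimal sketch of the request body, with a placeholder model name and no network call made:

```python
# Sketch: native-API equivalent of `ollama run <model> --think=false`.
# Ollama's /api/chat accepts a top-level "think" field; the model name
# is a placeholder and no request is actually sent here.
import json

def native_chat_body(prompt: str, think: bool) -> str:
    """JSON body for POST http://localhost:11434/api/chat."""
    return json.dumps({
        "model": "qwen3",  # placeholder model
        "messages": [{"role": "user", "content": prompt}],
        "think": think,    # native-API field controlling thinking per request
        "stream": False,
    })

# Usage: send with curl or urllib, e.g.
#   curl http://localhost:11434/api/chat -d "$(python -c '...')"
```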


Reference: github-starred/ollama#70012