[GH-ISSUE #9670] Context Modification to Stop Extended Thinking Process #68366

Open
opened 2026-05-04 13:35:32 -05:00 by GiteaMirror · 1 comment

Originally created by @123gggwnnggg on GitHub (Mar 12, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9670

I’m writing to propose a feature enhancement that could improve how the model handles extended thinking.
Some reasoning models can spend far too long thinking, so the idea is to implement a context-modification mechanism where a `</think>` marker is automatically inserted after the last period ('.') once the model has spent half of the maximum response length on thinking.
Example (with a limit of 10 words):

```
<think> I am thinking. Keep thinking. Still thinking thinking thinking thinking
```

would be modified to:

```
<think> I am thinking. Keep thinking.</think>
```
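
As a rough illustration of the proposed mechanism, here is a minimal sketch (the `closeThinking` helper is hypothetical, and whitespace-separated words stand in for real token counting):

```go
package main

import (
	"fmt"
	"strings"
)

// closeThinking truncates an overlong thinking segment at its last
// sentence boundary and appends a closing </think> marker.
// maxWords is the response budget; the cut triggers once half of it
// has been spent on thinking.
func closeThinking(thinking string, maxWords int) string {
	if len(strings.Fields(thinking)) <= maxWords/2 {
		return thinking // still within budget, leave untouched
	}
	// Insert the marker after the last period, as proposed.
	if i := strings.LastIndex(thinking, "."); i >= 0 {
		return thinking[:i+1] + "</think>"
	}
	// No sentence boundary yet, so close the block as-is.
	return thinking + "</think>"
}

func main() {
	s := "<think> I am thinking. Keep thinking. Still thinking thinking thinking thinking"
	fmt.Println(closeThinking(s, 10))
	// Prints: <think> I am thinking. Keep thinking.</think>
}
```

With the 10-word limit from the example above, this reproduces the proposed truncation.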
GiteaMirror added the feature request label 2026-05-04 13:35:32 -05:00

@JasonHonKL commented on GitHub (Mar 22, 2025):

I think this may not be suitable, because it's the application-layer user (i.e., the developer) who should handle these responses properly. For instance, you could take a look at how LangChain handles this case with DeepSeek. It's not Ollama's responsibility; Ollama is an application focused on hosting LLMs locally.
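
As a sketch of what that application-layer handling could look like (the `capThinking` helper is hypothetical, and the channel of text chunks stands in for Ollama's streamed response):

```go
package main

import (
	"fmt"
	"strings"
)

// capThinking consumes streamed text chunks and stops once the text
// inside an unclosed <think> block exceeds the word budget, closing
// the block itself. This runs entirely client-side.
func capThinking(chunks <-chan string, budget int) string {
	var b strings.Builder
	words := 0
	closed := false
	for chunk := range chunks {
		if !closed && strings.Contains(chunk, "</think>") {
			closed = true // model closed its own thinking block
		}
		b.WriteString(chunk)
		if closed {
			continue
		}
		words += len(strings.Fields(chunk))
		if words > budget {
			b.WriteString("</think>")
			break // a real client would also cancel the request here
		}
	}
	return b.String()
}

func main() {
	chunks := make(chan string, 4)
	for _, c := range []string{"<think> I am thinking.", " Keep thinking.", " Still thinking", " thinking thinking"} {
		chunks <- c
	}
	close(chunks)
	fmt.Println(capThinking(chunks, 6))
	// Prints: <think> I am thinking. Keep thinking. Still thinking</think>
}
```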

