[GH-ISSUE #10482] Add think field to message in response #32654

Closed
opened 2026-04-22 14:19:10 -05:00 by GiteaMirror · 5 comments

Originally created by @lasseedfast on GitHub (Apr 29, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10482

More and more models are "reasoning", and one doesn't always want to show that process. I've tried updating various scripts to remove the thinking part (often between `<think>` and `</think>`), but it would be a lot easier if the thinking came in a separate field in the message, along with `content` and the other fields.
If this gets implemented, I would also love the option not to stream the thinking when using the streaming endpoint for content.

GiteaMirror added the feature request label 2026-04-22 14:19:10 -05:00

@Abubakkar13 commented on GitHub (Apr 30, 2025):

Hey,
If you are running it from Python code, you can just trim the text up to `</think>`, like:

str(response).split('</think>')[1].strip()

I tried it for my simple case, and it prints only the content without the thinking part (though generation still includes both the thinking and the final answer). Hope this works. This is just string manipulation, but let me know if you find any option to ignore the thinking part when streaming.
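For the streaming case, the same stripping can be done on the fly instead of waiting for the full response. Below is a minimal sketch (a hypothetical `strip_thinking` helper, not part of any library) that filters a streamed sequence of text chunks, assuming at most one `<think>...</think>` block at the start of the reply:

```python
def strip_thinking(chunks):
    # Hypothetical helper: yields only the text that comes after the model's
    # </think> tag, buffering until the tag is complete (it may be split
    # across chunk boundaries). Assumes at most one <think>...</think> block,
    # at the start of the reply; if the stream never opens a think block,
    # text is passed through once the opening tag can be ruled out.
    end_tag = '</think>'
    buffer = ''
    done = False
    for chunk in chunks:
        if done:
            yield chunk
            continue
        buffer += chunk
        idx = buffer.find(end_tag)
        if idx != -1:
            # Closing tag seen: emit everything after it, pass through the rest.
            done = True
            tail = buffer[idx + len(end_tag):].lstrip()
            if tail:
                yield tail
        elif '<think>' not in buffer and len(buffer) > len('<think>'):
            # No opening tag in sight: this reply has no thinking block.
            done = True
            yield buffer
    if not done and '<think>' not in buffer and buffer:
        # Short reply with no thinking block: flush what was buffered.
        yield buffer
```

This is only a heuristic (a chunk ending mid-tag at an unusual position could confuse it), which is part of why a separate field from the server would be nicer.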


@lasseedfast commented on GitHub (Apr 30, 2025):

Thanks,
The thing is that it would be very convenient not to have to do that in every case. As it stands, I'd have to go back and change the code in all my old projects if I want to use a new reasoning model. That way, the new answer format from e.g. Qwen3 almost breaks old scripts.


@lemassykoi commented on GitHub (May 1, 2025):

you can easily use a simple func, here is mine:

import logging

logger = logging.getLogger(__name__)

def extract_thinking(entire_message: str):
    # If entire_message contains `<think> ... </think>`, extract the text
    # within the tags and strip it from the remaining text.
    # Returns a tuple: (thinking_process: str | None, final_answer: str)
    try:
        start_tag = '<think>'
        end_tag = '</think>'

        start_index = entire_message.index(start_tag)
        end_index = entire_message.index(end_tag)

        if start_index < end_index:
            thinking_process = entire_message[start_index + len(start_tag):end_index].strip()
            final_answer = entire_message[end_index + len(end_tag):].strip()
            logger.info('Thinking Process: ' + thinking_process)
            return thinking_process, final_answer
        else:
            return None, entire_message

    except ValueError:
        logger.warning("No `<think>` tags found.")
        return None, entire_message

@lasseedfast commented on GitHub (May 1, 2025):

Thanks lemassykoi,
Again, the problem is not cleaning the thinking part from the response. The problem is that I would have to go back to all my old scripts, in use in different projects, to add this. So it would just be nice to have the option to get the thinking part in a separate field.
I believe the best argument for using Ollama is that it is more convenient (and faster) than other frameworks, and this would be such a convenient thing.


@rick-github commented on GitHub (May 30, 2025):

https://github.com/ollama/ollama/releases/tag/v0.9.0
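(For context: the linked v0.9.0 release added thinking support, which resolves this request. As I understand it from the release notes, a chat request can opt in with a `think` flag, and the reasoning then arrives in a separate field of the response message rather than inline in `content`; check the API docs for the exact shape. Roughly, a request to `/api/chat`:

```json
{
  "model": "deepseek-r1",
  "messages": [{"role": "user", "content": "Why is the sky blue?"}],
  "think": true,
  "stream": false
}
```

and the response message then carries the reasoning separately:

```json
{
  "message": {
    "role": "assistant",
    "thinking": "...model reasoning...",
    "content": "The sky is blue because..."
  }
}
```

With `"think": false`, the thinking is omitted entirely, which also covers the "don't stream the thinking" case.)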

Reference: github-starred/ollama#32654