[PR #10472] Add ability to not send think tokens #13251

Closed
opened 2026-04-13 00:22:05 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/ollama/ollama/pull/10472

State: closed
Merged: No


Implement OLLAMA_DISABLE_TOKEN_TAG for filtering streamed tokens

This PR introduces a new environment variable, OLLAMA_DISABLE_TOKEN_TAG, that lets users name a tag whose content should not be streamed in the model's response. When the variable is set to a tag name (e.g., think), the server detects content within <tag> </tag> blocks and prevents those tokens from being sent to the client.

This is particularly useful for filtering out internal thought processes or other tagged sections from the final output. It can also save context space for clients that cannot filter such content themselves.
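To illustrate the filtering described above, here is a minimal sketch in Go of a streaming filter that drops everything between an opening and closing tag, including the case where a tag is split across two streamed chunks. It is not the code from this PR: the names `tagFilter`, `newTagFilter`, `Write`, `Flush`, and `partialSuffix` are hypothetical, and the sketch assumes tags appear literally as `<name>` and `</name>` in the decoded text.

```go
package main

import (
	"fmt"
	"strings"
)

// tagFilter is a hypothetical helper that removes <tag>...</tag> blocks
// from a stream of text chunks.
type tagFilter struct {
	openTag, closeTag string
	inside            bool   // true while between the opening and closing tag
	pending           string // held-back text that may be the start of a tag
}

func newTagFilter(tag string) *tagFilter {
	return &tagFilter{openTag: "<" + tag + ">", closeTag: "</" + tag + ">"}
}

// Write consumes one chunk and returns the text that is safe to send on.
func (f *tagFilter) Write(chunk string) string {
	f.pending += chunk
	var out strings.Builder
	for {
		marker := f.openTag
		if f.inside {
			marker = f.closeTag
		}
		if i := strings.Index(f.pending, marker); i >= 0 {
			if !f.inside {
				out.WriteString(f.pending[:i]) // text before the opening tag
			}
			f.pending = f.pending[i+len(marker):]
			f.inside = !f.inside
			continue
		}
		// No complete marker yet: hold back only a suffix that could
		// still turn into the marker once the next chunk arrives.
		keep := partialSuffix(f.pending, marker)
		if !f.inside {
			out.WriteString(f.pending[:len(f.pending)-keep])
		}
		f.pending = f.pending[len(f.pending)-keep:]
		return out.String()
	}
}

// Flush returns any held-back visible text at the end of the stream.
func (f *tagFilter) Flush() string {
	rest := f.pending
	f.pending = ""
	if f.inside {
		return "" // an unterminated block stays hidden
	}
	return rest
}

// partialSuffix returns the length of the longest suffix of s that is a
// proper prefix of marker.
func partialSuffix(s, marker string) int {
	limit := len(marker) - 1
	if len(s) < limit {
		limit = len(s)
	}
	for n := limit; n > 0; n-- {
		if strings.HasSuffix(s, marker[:n]) {
			return n
		}
	}
	return 0
}

func main() {
	f := newTagFilter("think")
	var sb strings.Builder
	for _, chunk := range []string{"Hello <th", "ink>secret</th", "ink> world"} {
		sb.WriteString(f.Write(chunk))
	}
	sb.WriteString(f.Flush())
	fmt.Printf("%q\n", sb.String()) // prints "Hello  world"
}
```

Holding back a possible partial tag costs a few characters of latency, but it avoids leaking the first half of a tag when the model emits it across two tokens.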

How to Use:

Set the OLLAMA_DISABLE_TOKEN_TAG environment variable to the name of the tag you want to disable. For example, to disable content within <think> </think> tags, run Ollama with:

OLLAMA_DISABLE_TOKEN_TAG=think go run . serve

If the OLLAMA_DISABLE_TOKEN_TAG environment variable is not set or is empty, all tokens will be streamed as before.
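The unset-or-empty behaviour can be sketched as a small gate around the environment lookup. This is an illustrative assumption about how such a check might look, not the PR's implementation: `disabledTag` is a hypothetical helper, it takes the lookup function as a parameter so it can be exercised without touching the real environment, and trimming surrounding whitespace is an extra assumption not stated in the PR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// disabledTag (hypothetical) returns the configured tag name and whether
// filtering is enabled. An unset or empty variable disables filtering.
func disabledTag(getenv func(string) string) (string, bool) {
	// Trimming whitespace is an assumption made for this sketch.
	tag := strings.TrimSpace(getenv("OLLAMA_DISABLE_TOKEN_TAG"))
	return tag, tag != ""
}

func main() {
	if tag, ok := disabledTag(os.Getenv); ok {
		fmt.Printf("filtering <%s> ... </%s> blocks\n", tag, tag)
	} else {
		fmt.Println("no tag configured; streaming all tokens")
	}
}
```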

GiteaMirror added the pull-request label 2026-04-13 00:22:05 -05:00
Reference: github-starred/ollama#13251