[GH-ISSUE #14622] [Bug] lfm2.5-thinking:latest — thinking mode cannot be disabled; #71535

Closed
opened 2026-05-05 02:04:51 -05:00 by GiteaMirror · 1 comment

Originally created by @Asjad22 on GitHub (Mar 4, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14622

Bug: Thinking mode cannot be disabled

There is no way to suppress the <think> block output in lfm2.5-thinking:latest.

Expected: When Ollama parameter think is set to false, the model should not output <think> blocks.

Actual: The model emits a verbose <think> block on every response. Setting think to false has no effect.

Request: Please add support to disable thinking mode via a model parameter (e.g. think: false in Ollama OPTIONS), or provide a non-thinking variant of the model on Ollama.

Relevant log output

curl http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "lfm2.5-thinking:latest",
    "stream": false,
    "options": { "think": false },
    "messages": [
      { "role": "user", "content": "What is 2 + 2?" }
    ]
  }'
{"model":"lfm2.5-thinking:latest","created_at":"2026-03-04T20:18:52.319383911Z","message":{"role":"assistant","content":"The sum of 2 and 2 is simply $2 + 2 = 4$. \n\n\\boxed{4}","thinking":"Okay, let me see. The user asked \"What is 2 + 2?\" Hmm, that seems straightforward, but maybe they want more than just a number. Wait, maybe they're testing if I know basic math? Or perhaps there's a trick here. Let me think.\n\nFirst, 2 plus 2 is a simple arithmetic problem. The answer should be 4. But maybe they expect some deeper answer? Like, in some contexts, 2+2 could symbolize something else? Or maybe it's a play on words? Like \"two plus two\" could be part of a larger question? Wait, the question is just \"What is 2 + 2?\" so probably just 4. \n\nBut let me make sure. If I consider possible contexts: in math class, adding two 2s gives 4. In everyday language, same thing. Unless there's a pun or something. Wait, maybe the user is expecting a different answer? Like, if it's a riddle? For example, \"2 + 2\" could be part of a sentence where it leads to something else. But the question is straightforward. \n\nAlternatively, maybe they want the explanation. Since the user just asked \"What is 2 + 2?\" perhaps they want the answer boxed as 4. Since the instruction says to put the final answer within \\boxed{}, so I should just give 4 in a box. \n\nWait, but maybe I should confirm. Let me check again. If someone asks \"What is 2 + 2?\" the answer is 4. Unless there's a trick, but I can't think of any. So I'll go with 4. Alright, I'll put \\boxed{4}."},"done":true,"done_reason":"stop","total_duration":19064370371,"load_duration":442441364,"prompt_eval_count":17,"prompt_eval_duration":216339639,"eval_count":401,"eval_duration":18312636992}
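Editor's note: the request above nests think under "options". In the Ollama chat API, think is documented as a top-level request field, and "options" carries model parameters such as temperature, so the nested value may simply have been ignored. A minimal sketch of the same request with think at the top level follows. The helper names build_chat_request and send_chat_request are made up for illustration, and nothing here shows that this particular model honors the flag; the maintainer's comment below suggests its template has no switch for thinking.

```python
import json
import urllib.request

# Same endpoint as the curl call in the report.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str, model: str = "lfm2.5-thinking:latest") -> dict:
    """Build a non-streaming /api/chat body with `think` at the top level.

    Hypothetical helper for illustration only.
    """
    return {
        "model": model,
        "stream": False,
        "think": False,  # top-level request field, not an entry under "options"
        "messages": [{"role": "user", "content": prompt}],
    }


def send_chat_request(body: dict, url: str = OLLAMA_CHAT_URL) -> dict:
    """POST the body to a running Ollama server and return the decoded JSON reply.

    Hypothetical helper; requires a local Ollama server to be listening.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling send_chat_request(build_chat_request("What is 2 + 2?")) against a running server and checking whether the reply's message still carries a "thinking" field would show whether the flag had any effect for this model.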

OS

NAME="Fedora Linux"
VERSION="43 (Workstation Edition)"

GPU

No response

CPU

Model name: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz

Ollama version

ollama version is 0.14.3

GiteaMirror added the bug label 2026-05-05 02:04:51 -05:00

@rick-github commented on GitHub (Mar 4, 2026):

The model doesn't have a non-thinking variant, and the template (https://huggingface.co/LiquidAI/LFM2-24B-A2B/blob/main/chat_template.jinja) supplied by the model authors shows no mechanism for controlling thinking. It could probably be faked by adding a template that pre-fills the output with an empty <think></think> block, but that's not always reliable.
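Editor's note: if the goal is only to keep the reasoning out of what the caller sees, one client-side option is to discard it after the fact. This is not the template pre-fill described in the comment, and it does not disable thinking: the model still generates the reasoning tokens and the latency stays the same. The sketch below assumes a non-streaming /api/chat response shaped like the one in the report, where the reasoning arrives in a separate "thinking" field; the function name visible_answer is made up for illustration.

```python
import re

# Matches an inline <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def visible_answer(response: dict) -> str:
    """Return only the answer text from a non-streaming /api/chat response.

    Hypothetical helper: the separate "thinking" field is ignored, and any
    inline <think> block left in "content" is stripped as a fallback.
    """
    message = response.get("message", {})
    content = message.get("content", "")
    return THINK_BLOCK.sub("", content).strip()
```

Applied to the response in the report, this would return just the "content" string and drop the "thinking" text.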

Reference: github-starred/ollama#71535