[GH-ISSUE #14809] Feature request: PARAMETER think false in Modelfile for Qwen 3.5 / thinking models #9563

Closed
opened 2026-04-12 22:28:43 -05:00 by GiteaMirror · 1 comment

Originally created by @Adam-Researchh on GitHub (Mar 12, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14809

Problem

There is no way to disable thinking mode for Qwen 3.5 (and other thinking models) at the Modelfile level.

Currently the only way to disable thinking is to pass `think: false` in the API request body:

```json
{ "model": "qwen3.5:122b-a10b", "think": false, "messages": [...] }
```

However, clients that don't support passing custom Ollama parameters (e.g. OpenClaw, many OpenAI-compat clients) have no way to disable thinking — they just get endless thinking token streams, effectively hanging the request.
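For clients that can send arbitrary fields, the native `/api/chat` endpoint accepts the flag today. A minimal sketch of such a request — the endpoint, port, and model tag are assumptions for a default local install:

```python
import json
import urllib.request

# Native Ollama /api/chat request with thinking disabled.
# Note: "think" is a top-level field, not inside "options".
payload = {
    "model": "qwen3.5:122b-a10b",
    "think": False,
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}

body = json.dumps(payload).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment to send against a running Ollama instance:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["message"]["content"])
print(body.decode())
```

The issue is precisely that OpenAI-compatible clients never construct this payload, so the flag has nowhere to live.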

Expected Behavior

A Modelfile should support:

```
FROM qwen3.5:122b-a10b
PARAMETER think false
```

This would allow creating a model variant that disables thinking by default, regardless of what the client sends.

Current Workaround Attempts

  • `SYSTEM /no_think` in Modelfile — does not work when the client sends its own system prompt (which overrides the Modelfile `SYSTEM` value)
  • `PARAMETER think false` — returns `Error: unknown parameter 'think'`
  • API-level `think: false` — works but requires client support
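A related partial workaround, for anyone able to put a thin proxy between the client and Ollama: Qwen's documented soft switch can be appended to the user turn instead of the system prompt. This sketch assumes qwen3.5 honors the per-turn `/no_think` switch the way Qwen3 does — verify against the model card before relying on it:

```python
# Hypothetical proxy-side helper: append Qwen's "/no_think" soft switch
# to the last user message before forwarding the request to Ollama.
def inject_no_think(messages: list[dict]) -> list[dict]:
    out = [dict(m) for m in messages]  # shallow copies; don't mutate caller's data
    for m in reversed(out):
        if m.get("role") == "user":
            if "/no_think" not in m["content"]:
                m["content"] = m["content"].rstrip() + " /no_think"
            break
    return out

msgs = [{"role": "user", "content": "Summarize this file."}]
print(inject_no_think(msgs))
```

This still requires infrastructure the affected clients don't provide, which is why a Modelfile-level `PARAMETER` would be the cleaner fix.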

Environment

  • Ollama: 0.17.5 and 0.17.7 (both affected)
  • Model: qwen3.5:122b-a10b (Q4_K_M, 125B)
  • Hardware: Apple M1 Ultra, macOS

Impact

Without this, any client that sends a custom system prompt cannot use Qwen 3.5 without the thinking phase hanging the connection — even with a dedicated no-think Modelfile variant.


@rick-github commented on GitHub (Mar 13, 2026):

Thinking can be disabled for qwen3.5 by creating a modified Modelfile: https://ollama.com/frob/qwen3.5-instruct
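For reference, the usual shape of such a workaround — a guess at the approach, not the actual contents of frob/qwen3.5-instruct — is a `TEMPLATE` override that pre-seeds every assistant turn with an empty think block, which Qwen-family models treat as "thinking already done". A simplified sketch (the real qwen3.5 template also handles tools and system messages):

```
FROM qwen3.5:122b-a10b
# Hypothetical simplified template: the trailing empty <think></think>
# block makes the model skip the thinking phase for every request.
TEMPLATE """{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
<think>

</think>

"""
```

The drawback is that the full template must be copied and kept in sync by hand, which a `PARAMETER think false` would avoid.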


Reference: github-starred/ollama#9563