[GH-ISSUE #15287] Anthropic compat endpoint ignores missing 'thinking' field — always enables thinking #35540

Open
opened 2026-04-22 20:06:40 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @claudeopusagora on GitHub (Apr 3, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15287

## Description

The Anthropic-compatible `/v1/messages` endpoint always enables thinking (extended thinking / chain-of-thought) for models that support it (e.g., qwen3.5), regardless of whether the `thinking` field is present in the request.

Per the Anthropic API spec, omitting the `thinking` field should mean thinking is **disabled**. Currently Ollama enables it unconditionally on this endpoint.
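For reference, the Anthropic Messages API models `thinking` as a tagged object: `{"type": "enabled", "budget_tokens": N}` to enable it, or `{"type": "disabled"}` to disable it. A request that disables thinking explicitly (which under the spec should behave the same as omitting the field) would look roughly like this; the `budget_tokens`-free disabled form is shown:

```json
{
  "model": "qwen3.5:9b",
  "max_tokens": 200,
  "thinking": {"type": "disabled"},
  "messages": [{"role": "user", "content": "What is 2+2? One word."}]
}
```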

## Reproduction

```bash
# Native endpoint — think: false works correctly
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3.5:9b",
  "messages": [{"role": "user", "content": "What is 2+2? One word."}],
  "stream": false, "think": false
}'
# Returns: {"message": {"role": "assistant", "content": "Four"}}

# Anthropic compat — thinking field absent but thinking still enabled
curl http://localhost:11434/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: dummy" \
  -d '{
    "model": "qwen3.5:9b",
    "max_tokens": 200,
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "What is 2+2? One word."}]
  }'
# Returns: {"content": [{"type": "thinking", "thinking": "Thinking Process: ..."}]}
```

## Expected behavior

When `thinking` is absent from the request body on `/v1/messages`, thinking should be disabled (matching the native endpoint behavior when `think: false`).

## Context

We use `misanthropic` (Rust Anthropic SDK) pointed at Ollama for local model inference. Some models (qwen) have alignment behavior in their CoT that we need to disable for certain workloads. The native endpoint's `think: false` works perfectly — we just need the same control on the Anthropic compat endpoint.

## Environment

- Ollama version: latest (as of 2026-04-03)
- Model: qwen3.5:9b, qwen3.5:35b
- OS: Linux (Ubuntu 24.04), macOS (M-series)
GiteaMirror added the api label 2026-04-22 20:06:40 -05:00

@mdegans commented on GitHub (Apr 3, 2026):

FWIW we're willing to fix and upstream this if anybody is interested.


@rudra717 commented on GitHub (Apr 4, 2026):

I've submitted a fix for this in #15314.

**Root cause:** `FromMessagesRequest` left the `Think` field as `nil` when the Anthropic request omitted the `thinking` field. The downstream route handler interprets `nil` as "use model default" and auto-enables thinking for capable models.

**Fix:** Explicitly set `Think = false` when the Anthropic request omits the `thinking` field or sets it to `disabled`, so the route handler doesn't override it.

7-line code change + 2 new unit tests.
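The nil-pointer semantics behind the bug, and the shape of the described fix, can be sketched in Go. The types below (`ThinkingConfig`, `MessagesRequest`, `ChatRequest`, `fromMessagesRequest`) are simplified stand-ins for illustration, not Ollama's actual definitions:

```go
package main

import "fmt"

// ThinkingConfig mirrors the Anthropic "thinking" field: a tagged object
// whose Type is "enabled" or "disabled".
type ThinkingConfig struct {
	Type string `json:"type"`
}

// MessagesRequest is a cut-down Anthropic-compat request; Thinking is nil
// when the field is absent from the JSON body.
type MessagesRequest struct {
	Thinking *ThinkingConfig `json:"thinking,omitempty"`
}

// ChatRequest is a cut-down native request. A nil Think means "use model
// default", which the route handler resolves to thinking-on for capable
// models — the source of the bug.
type ChatRequest struct {
	Think *bool
}

// fromMessagesRequest sketches the fix: instead of leaving Think nil when
// the Anthropic request omits thinking (or sets it to disabled), set an
// explicit false so the route handler cannot auto-enable it.
func fromMessagesRequest(req MessagesRequest) ChatRequest {
	think := req.Thinking != nil && req.Thinking.Type == "enabled"
	return ChatRequest{Think: &think}
}

func main() {
	omitted := fromMessagesRequest(MessagesRequest{})
	fmt.Println(*omitted.Think) // explicit false when the field is absent

	enabled := fromMessagesRequest(MessagesRequest{
		Thinking: &ThinkingConfig{Type: "enabled"},
	})
	fmt.Println(*enabled.Think)
}
```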


Reference: github-starred/ollama#35540