[GH-ISSUE #12561] absence of thinking text of the prompt (created using previous messages) sent to the model qwen3:4b deployed in ollama #54846

Open
opened 2026-04-29 07:36:37 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @RedAch2000 on GitHub (Oct 10, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12561

> Hey @redaachouhad, you called the OpenAI API compatibility endpoint. You'll need to specify "reasoning_effort": "medium" instead of "think": true. The think parameter only works if you call Ollama's /api/chat endpoint.
>
> I did actually turn this on by default for the next version of Ollama just to decrease the confusion. I'm going to go ahead and close the issue as answered, but lmk if you run into problems.

Originally posted by @pdevine in #12551
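To make the distinction in the quoted reply concrete, here is a minimal sketch of the two request bodies. The helper names `native_chat_payload` and `openai_compat_payload` are hypothetical and exist only for illustration; the field names `think` and `reasoning_effort` and the endpoint paths come from the quote above. Whether a given model accepts a particular `reasoning_effort` value depends on the model.

```python
from typing import Any, Dict, Optional


def native_chat_payload(model: str, prompt: str, think: bool = True) -> Dict[str, Any]:
    """Request body for Ollama's native /api/chat endpoint, which reads `think`."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": think,
    }


def openai_compat_payload(
    model: str, prompt: str, reasoning_effort: Optional[str] = None
) -> Dict[str, Any]:
    """Request body for /v1/chat/completions, which reads `reasoning_effort`."""
    payload: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Only include the field when the caller explicitly asks for it.
    if reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort
    return payload
```

Keeping the two builders separate makes it harder to send `think` to the compatibility endpoint by accident.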

Hey @pdevine, I used "reasoning_effort": "medium". Unfortunately, it doesn't work, and I get this error:

curl -X POST http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @request.json

{"error":{"message":"think value \"medium\" is not supported for this model","type":"api_error","param":null,"code":null}}


@pdevine commented on GitHub (Oct 10, 2025):

@redaachouhad Yes, that endpoint will now (i.e. in the next version of Ollama) automatically return the thinking text. You don't need to pass in reasoning_effort, since qwen doesn't support changing it.


@pdevine commented on GitHub (Oct 10, 2025):

Here's what it will look like in 0.12.5:

% curl -s localhost:11434/v1/chat/completions -d '{"model": "qwen3:4b", "messages": [{"role": "user", "content": "hey there"}]}' | jq
{
  "id": "chatcmpl-287",
  "object": "chat.completion",
  "created": 1760111675,
  "model": "qwen3:4b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hey! 👋 How can I assist you today? 😊",
        "reasoning": "Okay, the user said \"hey there\". Let me think about how to respond.\n\nFirst, \"hey there\" is a friendly greeting, so I should respond in a warm and approachable way. Maybe add a smiley to keep it light.\n\nI should probably acknowledge their greeting and offer help. Something like \"Hey! 👋 How can I assist you today?\" That's friendly and opens the door for them to ask questions or get help.\n\nWait, let me check if there's anything else. Sometimes people use \"hey there\" to be casual, so keeping it simple is good. No need to overcomplicate.\n\nAlso, make sure the response is in English since the query is in English. No need for other languages here.\n\nI think that's it. Just a friendly response with a smiley and an offer to help.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 185,
    "total_tokens": 197
  }
}
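For readers consuming this response programmatically, the following is a minimal sketch of pulling the answer and the thinking text out of a body shaped like the one above. The function name `split_answer_and_reasoning` is hypothetical, and the sample body is a shortened stand-in for the real output, with the reasoning text abbreviated.

```python
import json
from typing import Optional, Tuple


def split_answer_and_reasoning(raw_body: str) -> Tuple[str, Optional[str]]:
    """Return (content, reasoning) from a chat.completion JSON body.

    `reasoning` is None when the server did not include that field.
    """
    data = json.loads(raw_body)
    message = data["choices"][0]["message"]
    return message["content"], message.get("reasoning")


# Shortened stand-in for the response shown above (reasoning text abbreviated).
sample_body = json.dumps(
    {
        "object": "chat.completion",
        "model": "qwen3:4b",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "Hey! 👋 How can I assist you today? 😊",
                    "reasoning": "Okay, the user said \"hey there\".",
                },
                "finish_reason": "stop",
            }
        ],
    }
)

answer, reasoning = split_answer_and_reasoning(sample_body)
print(answer)
print(reasoning)
```

Using `message.get("reasoning")` rather than indexing keeps the helper usable against servers or versions that omit the field.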

@zhujunling-nj commented on GitHub (Oct 14, 2025):

For qwen3-series models:
When using the openai Python module, Ollama 0.12.5 returns the reasoning content through the reasoning attribute, while ModelScope returns it through the reasoning_content attribute.
As a result, qwen-agent is unable to retrieve the reasoning content.
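One possible client-side workaround is to check both attribute names. This is only a sketch: `get_reasoning` and `REASONING_FIELDS` are hypothetical names, and it assumes the message object exposes the extra field as a plain attribute, as the comment above describes. `SimpleNamespace` objects stand in for real client responses here, and this is not how qwen-agent itself is implemented.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Attribute names mentioned in this thread, checked in this order.
REASONING_FIELDS = ("reasoning", "reasoning_content")


def get_reasoning(message: Any) -> Optional[str]:
    """Return the first non-empty reasoning attribute found, or None."""
    for name in REASONING_FIELDS:
        value = getattr(message, name, None)
        if value:
            return value
    return None


# Stand-ins for the two response styles described above.
ollama_style = SimpleNamespace(content="Hi", reasoning="thinking...")
modelscope_style = SimpleNamespace(content="Hi", reasoning_content="thinking...")

print(get_reasoning(ollama_style))
print(get_reasoning(modelscope_style))
```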


Reference: github-starred/ollama#54846