[GH-ISSUE #12628] The thinking content of OLLAMA should be placed within key "reasoning_content" #34142

Closed
opened 2026-04-22 17:27:46 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @zhujunling-nj on GitHub (Oct 15, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12628

What is the issue?

The OLLAMA /v1 endpoint returns the model's thinking content under the key "reasoning", while platforms like VLLM return it under the key "reasoning_content". As a result, agent frameworks that only read "reasoning_content" may be unable to retrieve the thinking content from OLLAMA responses.

Relevant log output


OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.12.5

GiteaMirror added the bug label 2026-04-22 17:27:46 -05:00

@itzpingcat commented on GitHub (Oct 18, 2025):

I am pretty sure ollama stores reasoning in <think></think> tags
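
For the case this comment describes, where thinking arrives inline in "content" wrapped in <think></think> tags, a client could separate the two parts itself. The following is only a minimal sketch under that assumption; the response shown later in this thread returns the thinking under a separate key instead, and the helper name `split_inline_thinking` is hypothetical, not part of any library.

```python
import re
from typing import Tuple

# Matches one <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_inline_thinking(content: str) -> Tuple[str, str]:
    """Return (thinking, answer) from text that may embed <think> blocks.

    Hypothetical helper: assumes the thinking is inline in the content.
    """
    # Collect the text of every think block, joined by newlines.
    thinking = "\n".join(part.strip() for part in THINK_BLOCK.findall(content))
    # Whatever remains after removing the blocks is the visible answer.
    answer = THINK_BLOCK.sub("", content).strip()
    return thinking, answer


thinking, answer = split_inline_thinking("<think>Greet the user.</think>Hello!")
print(thinking)
print(answer)
```

Text without any think block comes back unchanged as the answer, with an empty string for the thinking part.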

@zhujunling-nj commented on GitHub (Oct 20, 2025):

In ollama, the reasoning content is stored in the key "reasoning", while other reasoning platforms use the key "reasoning_content".

curl http://127.0.0.1:11434/v1/chat/completions -d '{"messages": [{"role": "user", "content": "Hello"}], "model": "qwen3:30b-thinking"}' | jq .

{
  "id": "chatcmpl-884",
  "object": "chat.completion",
  "created": 1760921673,
  "model": "qwen3:30b-thinking",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today? 😊",
        "reasoning": "Okay, the user said Hello. I need to respond appropriately. Let me see. First, I should greet them back. Maybe say something friendly like \"Hello! How can I assist you today?\" But wait, I should check if there's anything specific they need help with. Let me make sure I keep it open-ended so they can ask whatever they need. Let me not overcomplicate it. Just a simple, friendly response. Yeah, that should work. Let me make sure there are no typos. Alright, done.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 122,
    "total_tokens": 133
  }
}

curl -H "Authorization: Bearer xxxx" https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions -d '{"messages": [{"role": "user", "content": "Hello"}], "model": "qwen3-30b-a3b-thinking-2507"}' | jq .

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! How can I assist you today? 😊",
        "reasoning_content": "Okay, the user said \"Hello\". I need to respond politely. Let me check the guidelines. First, I should be friendly and welcoming. Maybe say something like \"Hello! How can I assist you today?\" But wait, the user might be testing if I'm working. Let me make sure it's a standard greeting response. Also, keep it simple and not too long. Let me confirm the guidelines again. Yes, respond warmly and ask how I can help. So the response should be \"Hello! How can I assist you today?\" Wait, but the user might have a typo. No, \"Hello\" is correct. Alright, go with that.",
        "role": "assistant"
      }
    }
  ],
  "created": 1760922432571,
  "id": "chatcmpl-86dbcd7f-78a8-42cb-b983-8640de03145e",
  "model": "qwen3-30b-a3b-thinking-2507",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 149,
    "completion_tokens_details": {
      "reasoning_tokens": 133
    },
    "prompt_tokens": 9,
    "total_tokens": 158
  }
}
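
Until the key names converge, a client can work around the difference by checking both keys. The following is a minimal sketch, assuming the message is already parsed into a dict shaped like the two responses above; the helper name `extract_reasoning` and the lookup order are illustrative choices, not part of Ollama, vLLM, or any agent framework.

```python
from typing import Any, Mapping, Optional

# Keys seen in the two responses above. The lookup order is an arbitrary
# choice made for this sketch.
REASONING_KEYS = ("reasoning_content", "reasoning")


def extract_reasoning(message: Mapping[str, Any]) -> Optional[str]:
    """Return the thinking text from a chat-completion message, or None.

    Hypothetical helper: tries each known key and returns the first
    non-empty string it finds.
    """
    for key in REASONING_KEYS:
        value = message.get(key)
        if isinstance(value, str) and value:
            return value
    return None


# Trimmed-down versions of the two "message" objects shown above.
ollama_message = {
    "role": "assistant",
    "content": "Hello! How can I assist you today?",
    "reasoning": "Okay, the user said Hello.",
}
dashscope_message = {
    "role": "assistant",
    "content": "Hello! How can I assist you today?",
    "reasoning_content": "Okay, the user said \"Hello\".",
}

print(extract_reasoning(ollama_message))
print(extract_reasoning(dashscope_message))
```

A message with neither key yields None, so callers can tell "no thinking returned" apart from an empty answer.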

@zhujunling-nj commented on GitHub (Oct 30, 2025):

https://github.com/vllm-project/vllm/issues/27755
A proposal was made to replace reasoning_content with reasoning in vllm.


@hmellor commented on GitHub (Oct 30, 2025):

FYI https://github.com/vllm-project/vllm/pull/27752 is the PR which adds reasoning (removing reasoning_content is the next step)


Reference: github-starred/ollama#34142