[GH-ISSUE #9137] Ollama API Mixes Reasoning Process with Final Output in content Field #67999

Closed
opened 2026-05-04 12:13:55 -05:00 by GiteaMirror · 4 comments

Originally created by @wuyanfeiwork on GitHub (Feb 15, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9137

Originally assigned to: @ParthSareen on GitHub.

Issue Description

When using Ollama’s OpenAI-compatible API (http://localhost:11434/v1/chat/completions) with models such as DeepSeek R1, the reasoning process and the final output are both written into the content field instead of being separated. This results in content field contamination, making the API response unpredictable and less compatible with OpenAI’s standard behavior.

Suggested Fix

To maintain compatibility with OpenAI’s API specifications, it is recommended to adjust the implementation so that the reasoning process is delivered through a dedicated field (e.g., reasoning_content), while the content field only contains the final generated output.

Relevant log output


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.5.11

GiteaMirror added the bug, api labels 2026-05-04 12:13:56 -05:00

@wuyanfeiwork commented on GitHub (Feb 17, 2025):

Hello @ParthSareen, could you please handle this as soon as possible? Thank you!


@ParthSareen commented on GitHub (Feb 18, 2025):

Hi, yes we'll probably do this at some point. For now it can be filtered out with a simple replace or regex. Thanks!
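
A minimal sketch of the regex workaround mentioned above, for clients that still receive reasoning inside `content`. It assumes the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek R1 commonly does; the tag name and the helper `split_reasoning` are illustrative assumptions, not part of Ollama.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> tags.
# Adjust the pattern if a model uses a different delimiter.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content: str) -> tuple[str, str]:
    """Split a combined content string into (reasoning, answer)."""
    # Collect the text inside every think block.
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(content))
    # Remove the think blocks and keep whatever remains as the final answer.
    answer = THINK_BLOCK.sub("", content).strip()
    return reasoning, answer


raw = "<think>\nBasic arithmetic: 2 + 2 is 4.\n</think>\n\n2 + 2 equals 4."
reasoning, answer = split_reasoning(raw)
print(answer)
```

Content without any think block passes through unchanged with an empty reasoning string, so the helper should be safe to apply to non-reasoning models as well.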


@phuclh commented on GitHub (Aug 5, 2025):

Is this resolved yet? Thank you.


@shoiam commented on GitHub (Nov 28, 2025):

@wuyanfeiwork, I see that content and reasoning are getting separated already.
Input:
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "What is 2+2?"}]
  }'

Output:
"message":{ "role":"assistant", "content":"2 + 2 equals 4.", "reasoning":"First, the user asked: \"What is 2+2?\" This is a basic arithmetic question. I know that 2 + 2 equals 4. That's straightforward.\n\nBut maybe I should consider if there's any context or trick here. The question seems simple, and there's no indication of any special context like binary or other bases. It's just basic addition.\n\nAs an AI, I should respond helpfully and accurately. So, the answer should be 4.\n\nPossible responses:\n- Directly say \"2 + 2 = 4\"\n- Make it engaging, like \"That's an easy one! 2 + 2 equals 4.\"\n- Since the user might be testing me, keep it simple and correct.\n\nI should avoid overcomplicating it unless the user provides more context. The question is clear and direct.\n\nFinally, format the response: Keep it concise. Start with the answer and maybe a little explanation if appropriate.\n\nResponse structure:\n- Answer the question directly.\n- Optionally, confirm if it's helpful.\n" },

Is this the one you are referring to? I ask because I see the conversion is already happening in the middleware in openai.go, in the ToChatCompletion function:
Message: Message{Role: r.Message.Role, Content: r.Message.Content, ToolCalls: toolCalls, Reasoning: r.Message.Thinking},
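
On the client side, reading the separated fields could look like the sketch below. It assumes the standard OpenAI-style `choices[0].message` envelope and the `reasoning` key shown in the output above; `sample_body` is an abbreviated, hand-written stand-in for a response, not captured server output, and `extract_message` is a hypothetical helper.

```python
import json

# Abbreviated, hand-written sample in the shape shown above (not real output).
sample_body = json.dumps({
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "2 + 2 equals 4.",
            "reasoning": "This is a basic arithmetic question.",
        },
    }]
})


def extract_message(response_body: str) -> tuple[str, str]:
    """Return (content, reasoning) from a chat completion response body."""
    data = json.loads(response_body)
    message = data["choices"][0]["message"]
    # "reasoning" may be absent for models that do not emit thinking output.
    return message.get("content", ""), message.get("reasoning", "")


content, reasoning = extract_message(sample_body)
print(content)
```

Using `.get` with a default keeps the helper working when the server omits the `reasoning` key.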

Reference: github-starred/ollama#67999