[GH-ISSUE #23175] issue: reasoning_content is stripped from assistant tool call messages, breaking multi-turn tool calling with reasoning models (Kimi K2.5, etc.) #35437

Closed
opened 2026-04-25 09:38:49 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @estemit on GitHub (Mar 28, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23175

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.8.12 (latest)

Ollama Version (if applicable)

No response

Operating System

Debian 13

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When using reasoning-enabled models like Kimi K2.5 (Moonshot) with native function calling, the reasoning_content field from the assistant's tool call response should be preserved and included when reconstructing the conversation history for subsequent API calls. This is required by the Moonshot API and similar reasoning-model providers.

According to Moonshot's official documentation:

During multi-step tool calling, you must keep the reasoning_content from the assistant message in the current turn's tool call within the context, otherwise an error will be thrown.
https://platform.moonshot.ai/docs/guide/kimi-k2-5-quickstart#tool-use-compatibility
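
For illustration, the assistant tool-call message that must be replayed on the next turn would need to carry both fields side by side. A minimal sketch of the expected shape (field names follow the OpenAI-compatible chat format; the IDs, tool name, and values are made up):

```python
# Hypothetical example of the assistant message that must be replayed on
# the next turn. Both reasoning_content and tool_calls must survive.
assistant_turn = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "The user wants weather for two cities; call the search tool first.",
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "web_search",
                "arguments": '{"query": "current weather New York"}',
            },
        }
    ],
}

# Stripping reasoning_content (what the bug does) produces the payload
# that the upstream API rejects with a 400.
stripped = {k: v for k, v in assistant_turn.items() if k != "reasoning_content"}
print("reasoning_content" in stripped)  # False -> triggers the 400 upstream
```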

Actual Behavior

OpenWebUI strips the reasoning_content field from assistant messages when they contain tool_calls. This causes the upstream API (Moonshot/OpencodeGO) to return a 400 Bad Request error on the next turn:

{
  "error": {
    "message": "thinking is enabled but reasoning_content is missing in assistant tool call message at index N",
    "type": "invalid_request_error"
  }
}

This breaks multi-turn tool calling with any reasoning-enabled model.

Steps to Reproduce

  1. Configure OpenWebUI with a reasoning-enabled model (e.g., kimi-k2.5 via Moonshot API or OpencodeGO provider).
  2. Enable native function calling (set function_calling: native in model advanced params).
  3. Enable at least one tool (e.g., web search, code interpreter).
  4. Start a new chat and send a prompt that requires multi-turn tool calling:

    "Search for the current weather in New York and Tokyo, then calculate the temperature difference."

  5. The model will:
    • First reasoning block → tool calls (web search) → tool results
    • Expected: Second reasoning block should work
    • Actual: API returns 400 error because reasoning_content was stripped from the assistant message containing tool calls
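
The multi-turn history that must be replayed at step 5 can be sketched as follows (tool names, IDs, and contents are illustrative, not actual logs):

```python
# Illustrative reconstruction of the conversation history after the first
# tool round. Index 2 (the assistant tool-call turn) is the message the
# upstream error points at when reasoning_content is missing.
history = [
    {"role": "system", "content": "You are a helpful assistant."},  # index 0
    {"role": "user", "content": "Search for the current weather in New York "
                                "and Tokyo, then calculate the temperature "
                                "difference."},                      # index 1
    {  # index 2: must retain reasoning_content alongside tool_calls
        "role": "assistant",
        "content": "",
        "reasoning_content": "...first reasoning block...",
        "tool_calls": [{"id": "call_1", "type": "function",
                        "function": {"name": "web_search",
                                     "arguments": '{"query": "weather NYC"}'}}],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "72F in New York"},  # index 3
]

# The reported error names the offending index:
bad_index = next(i for i, m in enumerate(history)
                 if m["role"] == "assistant" and m.get("tool_calls"))
print(bad_index)  # 2, matching "at index 2" in the 400 error body below
```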

Logs & Screenshots

Error from upstream API:

HTTP/1.1 400 Bad Request
{
  "is_bifrost_error": false,
  "status_code": 400,
  "error": {
    "message": "thinking is enabled but reasoning_content is missing in assistant tool call message at index 2",
    "type": "invalid_request_error"
  }
}

Additional Information

Root Cause Analysis

The issue is in how OpenWebUI reconstructs the conversation history when sending requests to the LLM API. When an assistant message contains both tool_calls and reasoning_content, OpenWebUI appears to be dropping the reasoning_content field before sending it to the API.

This is architecturally similar to the issue that LiteLLM faced with Anthropic's thinking_blocks (see reference below). The API is stateless and requires the client to resend reasoning_content in assistant messages, but OpenWebUI strips this field.
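
A minimal sketch of the suspected failure mode (this is not Open WebUI's actual code; the whitelist below is hypothetical and only illustrates how a fixed field list silently loses the field):

```python
# Hypothetical whitelist-style serialization that reproduces the symptom.
ALLOWED_KEYS = {"role", "content", "tool_calls", "name", "tool_call_id"}

def rebuild_message(msg: dict) -> dict:
    """Illustrates the bug: copying only whitelisted keys silently drops
    reasoning_content from assistant tool-call messages."""
    return {k: v for k, v in msg.items() if k in ALLOWED_KEYS}

original = {"role": "assistant", "content": "", "tool_calls": [],
            "reasoning_content": "keep me"}
print("reasoning_content" in rebuild_message(original))  # False: field is lost
```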

Current Workaround

I had to implement a proxy service (kimi-proxy) that:

  1. Caches reasoning_content from assistant tool call responses
  2. Restores it to the conversation history before sending to the API
  3. Routes: OpenWebUI → kimi-proxy → Bifrost → OpencodeGO

This is not ideal and should be handled natively by OpenWebUI.
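
The cache-and-restore idea behind the proxy could be sketched as below. This is not the real kimi-proxy code (which is not shown in this issue); the function names and cache structure are made up for illustration:

```python
# Hypothetical core of a kimi-proxy-style workaround: remember the
# reasoning_content per tool_call_id on the way out, splice it back in
# on the next request.
_reasoning_cache: dict[str, str] = {}

def cache_response(assistant_msg: dict) -> None:
    """Called on the model's response: remember reasoning per tool call."""
    reasoning = assistant_msg.get("reasoning_content")
    if reasoning:
        for call in assistant_msg.get("tool_calls", []):
            _reasoning_cache[call["id"]] = reasoning

def restore_history(messages: list[dict]) -> list[dict]:
    """Called on the next request: re-attach stripped reasoning_content."""
    for msg in messages:
        if msg.get("role") == "assistant" and "reasoning_content" not in msg:
            for call in msg.get("tool_calls") or []:
                if call["id"] in _reasoning_cache:
                    msg["reasoning_content"] = _reasoning_cache[call["id"]]
                    break
    return messages
```

The key design point is that the cache is keyed by `tool_call_id`, which is the only stable identifier that survives Open WebUI's history reconstruction.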

References

  • Similar issue in LiteLLM: https://github.com/BerriAI/litellm/issues/21672
    • Same error message: "thinking is enabled but reasoning_content is missing in assistant tool call message at index N"
    • Same root cause: reasoning_content stripped from the conversation history
    • Fixed in LiteLLM by preserving reasoning_content in assistant tool call messages
  • Moonshot API documentation: https://platform.moonshot.ai/docs/guide/kimi-k2-5-quickstart#tool-use-compatibility
    • Explicitly states that reasoning_content must be preserved during multi-step tool calling
  • Related OpenWebUI issue: https://github.com/open-webui/open-webui/issues/23173
    • Different symptom (truncation vs. complete stripping) but same underlying area of concern

Workaround proxy logs showing the fix:

Restored reasoning_content for tool_call_id=call_xxx (1234 chars)
Proposed Solution

When reconstructing the conversation history for API calls, OpenWebUI should:

  1. Preserve the reasoning_content field in assistant messages that contain tool_calls
  2. Include it in the request payload sent to the LLM API

This may require changes in:

  • Message serialization/handling code
  • The native tool calling pipeline
  • Frontend message store
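
The two steps above could be sketched as a pass-through during payload reconstruction. This is a sketch under assumed names, not Open WebUI's actual serialization code:

```python
def build_api_message(stored_msg: dict) -> dict:
    """Hypothetical reconstruction step: copy the standard fields and keep
    reasoning_content whenever the stored assistant message has tool_calls."""
    out = {"role": stored_msg["role"], "content": stored_msg.get("content", "")}
    if stored_msg.get("tool_calls"):
        out["tool_calls"] = stored_msg["tool_calls"]
        # Steps 1 and 2 of the proposal: preserve and forward reasoning_content
        if "reasoning_content" in stored_msg:
            out["reasoning_content"] = stored_msg["reasoning_content"]
    return out
```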

Additional Information

  • Affects any reasoning/thinking-enabled model that requires reasoning_content to be preserved
  • The issue is specifically with multi-turn tool calling (second and subsequent tool call rounds)
  • First tool call works because there's no prior assistant message with reasoning to preserve
GiteaMirror added the bug label 2026-04-25 09:38:49 -05:00

@huaanhmai28-rgb commented on GitHub (Mar 31, 2026):

same problem


@aayushbaluni commented on GitHub (Apr 15, 2026):

Submitted a fix in #23742. The root cause is that convert_output_to_messages() in misc.py never sets reasoning_content on the emitted assistant message dict — the reasoning text is only folded into content as tagged text. The fix adds a pending_reasoning accumulator so reasoning_content is preserved alongside tool_calls for providers that require it.
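
A hedged sketch of what such a pending_reasoning accumulator might look like. This is not the actual code from the PR; the function shape and event format below are assumptions made for illustration:

```python
def convert_output_to_messages_sketch(events: list[dict]) -> list[dict]:
    """Illustrative accumulator: reasoning deltas are buffered and attached
    to the next assistant message that carries tool_calls, instead of being
    folded into content as tagged text."""
    messages: list[dict] = []
    pending_reasoning: list[str] = []

    for ev in events:
        if ev["type"] == "reasoning":
            pending_reasoning.append(ev["text"])
        elif ev["type"] == "tool_calls":
            msg = {"role": "assistant", "content": "",
                   "tool_calls": ev["tool_calls"]}
            if pending_reasoning:
                msg["reasoning_content"] = "".join(pending_reasoning)
                pending_reasoning.clear()
            messages.append(msg)
        elif ev["type"] == "content":
            messages.append({"role": "assistant", "content": ev["text"]})
    return messages
```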


@tjbck commented on GitHub (Apr 17, 2026):

Likely addressed in dev.


@tjbck commented on GitHub (Apr 21, 2026):

Reverting this change in dev, this change introduces incompatibilities with certain providers. Should be handled externally instead.


@RodolfoCastanheira commented on GitHub (Apr 21, 2026):

> Reverting this change in dev, this change introduces incompatibilities with certain providers. Should be handled externally instead.

How?


@Marutselu commented on GitHub (Apr 23, 2026):

> Reverting this change in dev, this change introduces incompatibilities with certain providers. Should be handled externally instead.

I have been using my own patch for almost 3 months, and it is working perfectly on my end with all providers (OpenAI, Anthropic, Gemini, Deepseek, MoonshotAI, LiteLLM, OpenRouter).

Also, currently it cannot be handled externally because reasoning_content is not kept without patching the code.

Reference: github-starred/open-webui#35437