[GH-ISSUE #18612] feat: Refactor and Improve Tools System #18651

Closed
opened 2026-04-20 00:51:50 -05:00 by GiteaMirror · 1 comment

Originally created by @Davixk on GitHub (Oct 25, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/18612

Problem Statement

Currently, tool calls are embedded inline as <details> XML within assistant message content during streaming. After the tool completes and the model continues generating text, this XML is stripped from the message before it's stored.

What happens:

  1. Model generates text: "sure! let me create that for you"
  2. Model calls tool (represented as XML in the stream)
  3. Tool executes and returns data to model
  4. Model continues: "done! I generated your cottage image"
  5. XML is stripped - final stored message is just the concatenated text

When the user sends the next message, the model receives:

{
	"role": "assistant",
	"content": "sure! let me create that for you\ndone! I generated your cottage image"
}

The model has no context that it called a tool, what arguments it passed, or what the tool returned. It only sees its own output text with no explanation for why it generated those specific words. This is misleading and degrades conversation quality.

Additional issues:

  • Tools cannot dynamically update their execution status (static XML can't be modified after streaming)
  • No event-driven system for tool status updates
  • Tool execution state is lost immediately after completion
  • Frontend cannot persist a proper timeline of what happened

Solution

Elevate Tool Execution to First-Class Messages

Transform tool calls from embedded XML into proper messages. What is currently ONE assistant message becomes THREE separate messages:

Before (1 message):

{
	"role": "assistant",
	"content": "sure! let me create that for you\ndone! I generated your cottage image"
}

After (3 messages):

[
	{
		"role": "assistant",
		"content": "sure! let me create that for you",
		"tool_calls": [
			{
				"id": "call_abc123",
				"function": "generate_image",
				"arguments": { "prompt": "beautiful cottage" }
			}
		]
	},
	{
		"role": "tool",
		"tool_call_id": "call_abc123",
		"tool_output": "{\"status\": \"success\", \"images\": [...]}"
	},
	{
		"role": "assistant",
		"content": "done! I generated your cottage image"
	}
]

Message Flow

Three message types in sequence:

  1. Assistant message - model generates optional text and/or calls tool(s)
  2. Tool message - represents the tool execution itself (NEW concept)
  3. Assistant message - model continues after receiving tool output

Each tool call now has its own message and event timeline, just like regular messages do.

Streaming Flow

Step-by-step:

  1. User sends message → backend forwards to model
  2. Model streams text tokens (optional step) → displayed on frontend
  3. Model makes tool call → assistant message ends with tool_calls field populated
  4. Backend immediately starts streaming a new message with role: "tool" and tool_call_id
  5. Frontend receives tool message → sets up tool execution UI inline after assistant message
  6. Tool executes → emits tool:status events with tool_call_id
  7. Frontend matches events to tool message by tool_call_id → updates UI in real-time
  8. Tool completes → backend populates tool_output field in tool message
  9. Backend sends tool output to model
  10. Model starts streaming new assistant message → frontend renders as separate message
  11. Tool UI remains visible with full event history preserved

Key characteristics:

  • Tool UI renders inline after the first assistant message
  • Status events are attached to tool_call_id, not message content
  • Tool execution state persists with full event timeline
  • All context is preserved in database and visible in UI
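The backend side of steps 3 through 9 can be sketched as follows. This is an illustrative Python sketch, not code from the Open WebUI codebase: the helper names (`run_tool`, `stream_event`) and the message dict shape are assumptions.

```python
# Hypothetical sketch of the backend streaming flow (steps 3-9).
# run_tool and stream_event are assumed callables, not actual Open WebUI APIs.

import json
import uuid


def handle_tool_call(chat_id, tool_call, run_tool, stream_event):
    """Create a tool message, execute the tool, and persist its output."""
    tool_call_id = tool_call["id"]

    # Step 4: start a new message with role "tool" BEFORE the tool runs.
    tool_msg = {
        "id": str(uuid.uuid4()),
        "chat_id": chat_id,
        "role": "tool",
        "tool_call_id": tool_call_id,
        "tool_output": None,
    }
    stream_event({"type": "message", "data": tool_msg})

    # Step 6: the tool emits tool:status events during execution.
    def emit_status(status, message, progress=None):
        stream_event({
            "type": "tool:status",
            "tool_call_id": tool_call_id,
            "data": {"status": status, "message": message, "progress": progress},
        })

    # Step 8: run the tool and store its output on the tool message,
    # ready to be sent back to the model (step 9).
    result = run_tool(tool_call["function"], tool_call["arguments"], emit_status)
    tool_msg["tool_output"] = json.dumps(result)
    return tool_msg
```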

Event-Driven Status Updates

Each tool call gets its own event stream, identified by tool_call_id:

{
  type: "tool:status",
  tool_call_id: "call_abc123",
  data: {
    status: "processing",
    message: "Generating image...",
    progress: 45
  }
}

Frontend matches events to tool messages and maintains a chronological list of status updates per tool call. This list persists across page reloads.
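The matching logic is simple: a map from `tool_call_id` to an ordered list of status updates. A minimal sketch (shown in Python for illustration; the real frontend code would be TypeScript):

```python
# Illustrative sketch: maintain a chronological status timeline per tool_call_id.

from collections import defaultdict


class ToolEventTimeline:
    """Collects tool:status events keyed by tool_call_id, in arrival order."""

    def __init__(self):
        self.timelines = defaultdict(list)

    def handle_event(self, event):
        # Only tool:status events carry a tool_call_id to match on.
        if event.get("type") != "tool:status":
            return
        self.timelines[event["tool_call_id"]].append(event["data"])

    def latest(self, tool_call_id):
        """Most recent status update for a tool call, or None."""
        updates = self.timelines.get(tool_call_id)
        return updates[-1] if updates else None
```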

Timeline Persistence

The frontend now maintains a proper, persistent timeline:

  • Message from assistant (text + tool call indication)
  • Tool execution with live status updates
  • Message from assistant (response after seeing tool output)

This timeline is preserved in the database and reconstructed on reload. Users can see the full history of what happened, when, and why.

Implementation Requirements

Database Schema

Add fields to Message model:

  • role (String, enum: "user" | "assistant" | "tool")
  • tool_calls (JSON, nullable) - for assistant messages that call tools
  • tool_call_id (String, nullable) - for tool messages
  • tool_output (JSON, nullable) - for tool messages

All new fields are nullable for backwards compatibility with existing messages.
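As SQL, the change might look like the DDL below. This is a sketch using SQLite syntax; the actual table definition and migration tooling in Open WebUI may differ.

```python
# Sketch of the proposed Message columns as SQLite DDL.
# Table and column layout are assumptions, not the real Open WebUI schema.

import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS message (
    id           TEXT PRIMARY KEY,
    content      TEXT,
    role         TEXT,   -- "user" | "assistant" | "tool"; nullable
    tool_calls   TEXT,   -- JSON; assistant messages that call tools
    tool_call_id TEXT,   -- tool messages only
    tool_output  TEXT    -- JSON; tool messages only
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
```

Because every new column is nullable, existing rows (plain user/assistant messages) remain valid without a data migration.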

Backend Changes

Message Creation:

  • Create 3 separate Message records instead of 1 when tools are used
  • Set appropriate role for each message type
  • Populate tool_calls field when model calls tools
  • Create tool message immediately with tool_call_id (before tool executes)
  • Populate tool_output after tool returns data

Streaming:

  • Stream tool messages with role: "tool" as separate messages
  • Stream tool:status events with tool_call_id for matching

Context Management:

  • Preserve all 3 message types in conversation history
  • Send full context to model (including tool messages with tool_output)
  • Do NOT strip tool-related fields from messages
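Context assembly can be sketched like this. The pass-through case is the proposal; the fallback branch (inlining tool output as text when a provider API rejects `role: "tool"`) is an assumed degradation strategy, and its exact format is a placeholder.

```python
# Sketch: assemble the conversation history sent to the model, preserving
# tool messages. The fallback format for providers without a tool role
# is an assumption, not a specified behavior.


def build_context(messages, supports_tool_role=True):
    context = []
    for msg in messages:
        if msg["role"] == "tool" and not supports_tool_role:
            # Degrade gracefully: keep the information, drop the role.
            context.append({
                "role": "assistant",
                "content": f"[tool {msg['tool_call_id']} returned: {msg['tool_output']}]",
            })
        else:
            context.append(msg)  # pass through unchanged, tool fields intact
    return context
```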

Frontend Changes

Rendering:

  • Detect role: "tool" messages → render tool execution UI inline (not as separate chat bubble)
  • Match incoming tool:status events to tool messages by tool_call_id
  • Display tool status updates within tool UI in real-time

Persistence:

  • Store tool event timeline in message metadata
  • Reconstruct tool UI state on page reload from stored events
  • Maintain visual timeline: assistant msg → tool UI → assistant msg
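The reload path reduces to a pure function over stored messages. A sketch of the reconstruction (in Python for illustration; the real frontend would implement this in TypeScript, and the `meta.events` location for the stored timeline is an assumption):

```python
# Illustrative sketch: rebuild the visual timeline on page reload from
# persisted messages. The "meta"/"events" keys are assumed storage locations.


def rebuild_timeline(messages):
    """Return an ordered list of UI entries: text bubbles and tool panels."""
    timeline = []
    for msg in messages:
        if msg["role"] == "tool":
            timeline.append({
                "kind": "tool",
                "tool_call_id": msg["tool_call_id"],
                # Stored status events replay into the tool panel.
                "events": msg.get("meta", {}).get("events", []),
                "output": msg.get("tool_output"),
            })
        else:
            timeline.append({
                "kind": "bubble",
                "role": msg["role"],
                "content": msg.get("content", ""),
            })
    return timeline
```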

Why This Works

Complete Context Preservation:

  • Model sees full history: what tools were called, with what arguments, and what they returned
  • No information loss across conversation turns
  • Model can reference previous tool calls and outputs accurately

Event-Driven Architecture:

  • Tools emit status events matched by tool_call_id
  • UI updates dynamically during execution
  • Event timeline persists indefinitely

Clean Separation of Concerns:

  • Tool execution is its own entity, not embedded markup
  • Each message type has a clear, singular purpose
  • Status updates are decoupled from message content

Benefits:

  • Real-time progress feedback improves UX
  • Full context available even if model API doesn't support tool message types
  • Proper foundation for future model API compatibility
  • Tool execution history is queryable and analyzable
  • Timeline-based UI is intuitive and informative

@Davixk commented on GitHub (Oct 25, 2025):

as for how the tool can emit tool events, I can see two ways:

  1. we provide a new __tool_call_id__ argument to the tool call, just like other arguments such as __event_emitter__ are already injected. the tool can then use it as needed with the event emitter to emit tool events
  2. we customize or provide a new __tool_event_emitter__ which is already associated with the current tool call ID, and can easily emit events directly displayed for that tool call

version 2 sounds better to me. it prevents a tool from guessing and hijacking other tool calls' status events, while also streamlining usage.
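Option 2 amounts to pre-binding the tool call ID in a closure. A minimal sketch, assuming the injection machinery mirrors the existing `__event_emitter__` convention (the real emitter in Open WebUI is likely async; a synchronous callable is used here for brevity):

```python
# Sketch of option 2: a __tool_event_emitter__ pre-bound to the current
# tool_call_id, so a tool cannot target another call's status events.
# The wiring into tool invocation is assumed, not actual Open WebUI code.


def make_tool_event_emitter(event_emitter, tool_call_id):
    """Wrap the generic emitter so the tool_call_id is fixed in a closure."""

    def __tool_event_emitter__(data):
        event_emitter({
            "type": "tool:status",
            "tool_call_id": tool_call_id,  # injected; not chosen by the tool
            "data": data,
        })

    return __tool_event_emitter__
```

A tool would then simply call `__tool_event_emitter__({"status": "processing", "message": "Generating image...", "progress": 45})` without ever seeing, or being able to spoof, the ID.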

Reference: github-starred/open-webui#18651