[PR #20480] [CLOSED] fix: prevent system prompt duplication in native function calling #64496

opened 2026-05-06 10:06:26 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/20480
Author: @jvadura
Created: 1/8/2026
Status: Closed

Base: dev ← Head: fix/system-prompt-duplication


📝 Commits (9)

  • fe6783c — Merge pull request #19030 from open-webui/dev
  • fc05e0a — Merge pull request #19405 from open-webui/dev
  • e3faec6 — Merge pull request #19416 from open-webui/dev
  • 9899293 — Merge pull request #19448 from open-webui/dev
  • 140605e — Merge pull request #19462 from open-webui/dev
  • 6f1486f — Merge pull request #19466 from open-webui/dev
  • d95f533 — Merge pull request #19729 from open-webui/dev
  • a727153 — 0.6.43 (#20093)
  • 42b731c — fix: prevent system prompt duplication in native function calling

📊 Changes

1 file changed (+18 additions, -0 deletions)


📝 backend/open_webui/utils/misc.py (+18 -0)

📄 Description

Summary

Prevents system prompt content from being duplicated during native function calling with MCP tools, which was causing quadratic token growth and excessive API costs.

Problem

When using native function calling mode with MCP tools, each tool call iteration triggers update_message_content(), which prepends the system prompt onto the existing system message. As a result, the same prompt is duplicated multiple times:

Tool call 1: "System prompt" (~20k tokens)
Tool call 2: "System prompt\nSystem prompt" (~40k tokens)
Tool call 3: "System prompt\nSystem prompt\nSystem prompt" (~60k tokens)

Impact: A 20k token conversation can balloon to 3M+ tokens over multiple tool call iterations, causing massive unnecessary API costs.
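A quick arithmetic sketch of why the growth is quadratic in cost (the ~20k figure is taken from the example above; the iteration count needed to pass 3M cumulative tokens is illustrative):

```python
# With an ~20k-token system prompt, iteration n sends n copies of the
# prompt, so the cumulative tokens spent on the prompt alone over N
# iterations is 20k * (1 + 2 + ... + N), i.e. quadratic in N.
PROMPT_TOKENS = 20_000

def cumulative_prompt_tokens(iterations: int) -> int:
    """Total prompt tokens sent across all iterations so far."""
    return sum(n * PROMPT_TOKENS for n in range(1, iterations + 1))

assert cumulative_prompt_tokens(3) == 120_000   # 20k + 40k + 60k
# After enough iterations, the cumulative spend passes 3M tokens:
assert cumulative_prompt_tokens(17) > 3_000_000
```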

Root Cause

The bug occurs in the agentic tool call loop:

  1. Initial request applies system prompt via apply_system_prompt_to_body() with replace=True
  2. Each tool call iteration calls generate_chat_completion() again
  3. The router applies model system prompt via apply_system_prompt_to_body() with replace=False (default)
  4. This calls update_message_content() with append=False, which prepends the content
  5. The same system prompt gets prepended on every iteration
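The loop above can be sketched as follows. The function name apply_system_prompt_to_body() comes from the PR description, but its signature and body here are a hypothetical reconstruction, not the actual Open WebUI implementation:

```python
def apply_system_prompt_to_body(prompt: str, body: dict, replace: bool = False) -> dict:
    # Hypothetical sketch of the router-side helper (signature assumed).
    messages = body["messages"]
    if messages and messages[0]["role"] == "system":
        if replace:
            messages[0]["content"] = prompt
        else:
            # replace=False path: the prompt is prepended onto the
            # existing system message on every call.
            messages[0]["content"] = f"{prompt}\n{messages[0]['content']}"
    else:
        messages.insert(0, {"role": "system", "content": prompt})
    return body

body = {"messages": [{"role": "user", "content": "hi"}]}
apply_system_prompt_to_body("System prompt", body, replace=True)  # initial request
apply_system_prompt_to_body("System prompt", body)  # tool call iteration 1
apply_system_prompt_to_body("System prompt", body)  # tool call iteration 2

# The same prompt now appears three times in one system message.
assert body["messages"][0]["content"].count("System prompt") == 3
```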

Fix

Add a check in update_message_content() to skip the update if the content is already present at the start of the existing message. This prevents duplicate prepending while preserving the ability to append genuinely new content.
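A minimal sketch of the guard, assuming a dict-based message and an append flag as described above (the actual helper in backend/open_webui/utils/misc.py may differ in signature and detail):

```python
def update_message_content(message: dict, content: str, append: bool = False) -> dict:
    # Hypothetical sketch of the fixed helper.
    existing = message.get("content") or ""
    if append:
        # Appending genuinely new content is still allowed.
        message["content"] = f"{existing}\n{content}"
    else:
        # New guard: skip the prepend when the content is already at the
        # start of the existing message, preventing duplication.
        if existing.startswith(content):
            return message
        message["content"] = f"{content}\n{existing}"
    return message

msg = {"role": "system", "content": ""}
update_message_content(msg, "System prompt")
update_message_content(msg, "System prompt")  # no-op: already present
assert msg["content"].count("System prompt") == 1
```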

Test Plan

  • Enable native function calling mode
  • Connect MCP tool server
  • Trigger multiple tool calls
  • Verify token count stays stable (not growing by ~system_prompt_size each iteration)

Verified: Token count remains stable at ~12k tokens after 3 tool calls (previously would have grown to ~36k+).

Related: #19656


Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


Reference: github-starred/open-webui#64496