mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-05 18:38:17 -05:00
[GH-ISSUE #23269] issue: Pyodide prompt injection constantly poisons the token cache (native tool calling) #19937
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @arbv on GitHub (Mar 31, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23269
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.8.12
Ollama Version (if applicable)
No response
Operating System
NixOS
Browser (if applicable)
No response
Confirmation
README.md.Expected Behavior
CODE_INTERPRETER_PYODIDE_PROMPTis static content that does not change between turns or sessions. It should be appended to the system prompt once, where static instructional context belongs architecturally. This makes it part of the stable cached prefix that LLM providers can reuse across turns without re-billing.Actual Behavior
CODE_INTERPRETER_PYODIDE_PROMPTis appended ephemerally to the last user message at request assembly time and is not persisted to conversation storage. This happens on every turn whenever the Pyodide engine is active, including when native tool calling mode is enabled, not only during active code execution.Because the injection is ephemeral, the following mismatch occurs on every turn:
[...history, userN + PYODIDE_PROMPT]and written to the provider cache.[...history, userN_clean, assistantN, userN+1 + PYODIDE_PROMPT], whereuserN_cleandoes not match the cacheduserN + PYODIDE_PROMPT.The prefix mismatch starts at
userN, invalidating the entire accumulated conversation history. The full conversation is charged at regular input token price instead of the much cheaper cache read rate on every single turn.Steps to Reproduce
It is especially annoying in long conversations.
Logs & Screenshots
N/A, but here are the links to the relevant parts of the code:
9bd84258d0/backend/open_webui/config.py (L2142)9bd84258d0/backend/open_webui/utils/middleware.py (L2375)Additional Information
This issue is specific to the Pyodide engine. The Jupyter does not exhibit this behaviour. Tool definitions themselves are also correctly placed in a stable position and do not cause this problem.
The fix is to append
CODE_INTERPRETER_PYODIDE_PROMPTto the system prompt when the Pyodide engine is active, consistent with how tool definitions are handled.@arbv commented on GitHub (Mar 31, 2026):
The ability to effectively set the
CODE_INTERPRETER_PYODIDE_PROMPTto""via an env variable would be a solution as well (although, a half-baked one). IMO, the prompt content belongs to the system prompt where is should be added alongside the native tool definitions.Note: for non-native tool calling the current approach (the most recent user message annotation) is understandable and acceptable.
@arbv commented on GitHub (Mar 31, 2026):
Also, on a slightly related note:
The same historical prefix invalidation occurs when chatting with documents, where the RAG template is injected ephemerally into the last user message. Unlike the Pyodide prompt, RAG content is dynamic per query and cannot be relocated to the system prompt. However, the ephemeral injection still invalidates the cached prefix from the previous turn onward, meaning the accumulated conversation history is re-billed from scratch on every turn where RAG is active.
Something to think about, I guess. Chatting with documents should not wrap the user message in native toll calling mode for the same reason. Though - is is a completely different, albeit vaguely related, issue.
@trevorhayes6561-maker commented on GitHub (Mar 31, 2026):
This makes sense — injecting the Pyodide prompt into each user message
breaks prefix caching and unnecessarily increases cost. Moving it to the
system prompt when the engine is active would align with how tool
definitions are handled and avoid cache invalidation. The env override
could help short-term, but fixing the placement seems like the right
long-term solution.
On Tue, Mar 31, 2026, 1:25 PM Artem Boldariev @.***>
wrote:
@tjbck commented on GitHub (Apr 1, 2026):
Refactored to use system prompt in the latest dev.
@arbv commented on GitHub (Apr 1, 2026):
@tjbck Thank you!