[GH-ISSUE #11908] issue: Code interpreter doesn't work most of the time because its prompt doesn't yield the expected response #16401


GiteaMirror · 2026-04-19T22:19:54-05:00

GiteaMirror commented

2026-04-19 22:19:54 -05:00

Originally created by @allo- on GitHub (Mar 20, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11908

Check Existing Issues

- [x] I have searched the existing issues and discussions.
- [x] I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

v0.5.20

Ollama Version (if applicable)

No response

Operating System

Linux

Browser (if applicable)

No response

Confirmation

- [x] I have read and followed all instructions in `README.md`.
- [x] I am using the latest version of Open WebUI and ~~Ollama~~ text-generation-webui.
- [ ] I have included the browser console logs. (not applicable)
- [ ] I have included the Docker container logs. (not applicable)
- [x] I have listed steps to reproduce the bug in detail.

Expected Behavior

Prompt the LLM to create code in a form that can be interpreted.

Actual Behavior

The LLM is prompted in a way that returns a response that cannot be parsed by open-webui.

Steps to Reproduce

1. Use open-webui and Mistral-Small (tested versions: 2501, 2503)
2. Ask a simple question like "Let's start by writing a Python program that adds two numbers."
3. Get a response that obviously does not match the expected format

Logs & Screenshots

Spaces in three backticks inserted by me, so github doesn't parse them.

Let's start by writing a Python program that adds two numbers.

` ``xml

<details type="code_interpreter" done="true" output="{&quot;stdout&quot;: null, &quot;stderr&quot;: &quot;PythonError: Traceback (most recent call last):\n  File \&quot;/lib/python312.zip/_pyodide/_base.py\&quot;, line 597, in eval_code_async\n    await CodeRunner(\n  File \&quot;/lib/python312.zip/_pyodide/_base.py\&quot;, line 411, in run_async\n    coroutine = eval(self.code, globals, locals)\n                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \&quot;&lt;exec&gt;\&quot;, line 2, in &lt;module&gt;\nOSError: [Errno 29] I/O error\n&quot;, &quot;result&quot;: null}">
<summary>Analyzed</summary>
`` `python
# Declare the input variables
num1 = int(input("Enter first number: "))
num2 = int(input("Enter second number: "))

# Calculate the sum
sum_result = num1 + num2

# Print the result
print(f"The sum of {num1} and {num2} is: {sum_result}")
`` `
</details>
Let's start by writing a Python program that adds two numbers.

`` `xml

You can't input data in here.

Let's modify the program to use predefined values.

`` `xml

<details type="code_interpreter" done="false">
<summary>Analyzing...</summary>
`` `python
# Declare the input variables
num1 = 5
num2 = 3

# Calculate the sum
sum_result = num1 + num2

# Print the result
print(f"The sum of {num1} and {num2} is: {sum_result}")
`` `
</details>
`` `output
{&quot;stdout&quot;: &quot;The sum of 5 and 3 is: 8\n&quot;, &quot;stderr&quot;: null, &quot;result&quot;: 8}

The Python program successfully added the two numbers. The sum of 5 and 3 is 8.

Second:

Here's a Python program that adds two numbers. I will use the code interpreter provided to ensure the code runs correctly.

`` `xml

<details type="code_interpreter" done="true" output="{&quot;stdout&quot;: null, &quot;stderr&quot;: &quot;PythonError: Traceback (most recent call last):\n  File \&quot;/lib/python312.zip/_pyodide/_base.py\&quot;, line 597, in eval_code_async\n    await CodeRunner(\n  File \&quot;/lib/python312.zip/_pyodide/_base.py\&quot;, line 411, in run_async\n    coroutine = eval(self.code, globals, locals)\n                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \&quot;&lt;exec&gt;\&quot;, line 6, in &lt;module&gt;\nOSError: [Errno 29] I/O error\n&quot;, &quot;result&quot;: null}">
<summary>Analyzed</summary>
`` `python
# Function to add two numbers
def add_numbers(num1, num2):
    return num1 + num2

# Input two numbers
num1 = float(input("Enter the first number: "))
num2 = float(input("Enter the second number: "))

# Calculate the sum
result = add_numbers(num1, num2)

# Print the result
print("The sum of", num1, "and", num2, "is", result)
`` `
</details>
I'm afraid, that the live interpreter doesn't support ways of getting inputs from users kindly share the two numbers you want added together.

Third:

Below is the Python program that adds two numbers. The tool `
`

(Just a newline between two backticks)

Additional Information

The feature is rather new and I guess the prompt for tool usage should be improved.

Model versions:

Mistral 2501 (Q5_K_M)
Mistral 2503 (Q5_K_M)

I tried some other models that I don't know exactly, but current Mistral would clearly be one of my favorites.
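
For illustration only: both failing runs in the logs above call `input()`, and the traceback ends in `OSError: [Errno 29] I/O error`. Below is a minimal sketch of a check that could flag such code before it is run, using Python's standard `ast` module. The function name `calls_input` is hypothetical and not part of Open WebUI, and this is not how the feature currently works.

```python
import ast


def calls_input(source: str) -> bool:
    """Return True if the source calls the built-in input() by name.

    Hypothetical helper for illustration; not an Open WebUI function.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Code that does not parse has a different problem; not flagged here.
        return False
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "input"
        ):
            return True
    return False


interactive = 'num1 = int(input("Enter first number: "))\nprint(num1)'
predefined = "num1 = 5\nnum2 = 3\nprint(num1 + num2)"

print(calls_input(interactive))  # True
print(calls_input(predefined))   # False
```

A check like this only catches direct calls by name. Code that aliases `input` or reads `sys.stdin` would slip through.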


GiteaMirror added the bug label 2026-04-19 22:19:54 -05:00

GiteaMirror closed this issue

2026-04-19 22:19:55 -05:00

GiteaMirror commented

2026-04-19 22:19:56 -05:00

@rgaricano commented on GitHub (Mar 20, 2025):

It adds a space inside the backticks. Maybe it would help to remind it how to write code, with a prompt insisting that a code block is written with three backticks before and after?


GiteaMirror commented

2026-04-19 22:19:58 -05:00

@allo- commented on GitHub (Mar 20, 2025):

The space is added by me, because otherwise GitHub interprets it as me wanting to end the codeblock in the issue. Remove the spaces in the three backticks to get the original output.
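
To reproduce the original output from the logs in this issue, the spaced fences can be turned back into real ones. A small sketch, assuming the only two spaced forms are the ones that appear above (backtick, space, two backticks; or two backticks, space, backtick). The helper name `restore_fences` is made up for this example.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def restore_fences(text: str) -> str:
    """Replace the spaced backtick runs used in this issue with real fences."""
    return re.sub(r"` ``|`` `", FENCE, text)


spaced = "`` `python\nnum1 = 5\n`` `"
print(restore_fences(spaced))
```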

The problem is rather that it mixes up code blocks, XML, <details> tags, and other things that are probably mentioned in the instructions on how to use the code interpreter. Sometimes it also adds XML inside a code block, but not as a comment, so execution fails with syntax errors.

I guess the tool prompt can still be improved, and output that doesn't match the tool call syntax should be caught as far as possible, so that broken code blocks are not executed when it can be detected that the LLM didn't close all tags or nested them in the wrong order.
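
As a rough sketch of the kind of check meant here (not Open WebUI's actual parser): walk through the response, push every opening code fence and `<details>` tag onto a stack, and report anything that is closed out of order or never closed. The name `structure_problems` and the token rules are assumptions made for this example.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence

# A fence line (optionally followed by a language tag), or a <details> open/close tag.
TOKEN = re.compile(r"^`{3}.*$|<details\b[^>]*>|</details>", re.MULTILINE)


def structure_problems(text: str) -> list[str]:
    """Return a list of nesting problems; an empty list means balanced."""
    stack: list[str] = []
    problems: list[str] = []
    for match in TOKEN.finditer(text):
        token = match.group(0).strip()
        if token.startswith(FENCE):
            # A bare fence closes the innermost open fence; anything else opens one.
            if stack and stack[-1] == "fence" and token == FENCE:
                stack.pop()
            else:
                stack.append("fence")
        elif token == "</details>":
            if stack and stack[-1] == "details":
                stack.pop()
            else:
                problems.append("</details> closes nothing or is nested in the wrong order")
        else:
            stack.append("details")
    problems.extend(f"unclosed {kind}" for kind in stack)
    return problems


well_formed = (
    '<details type="code_interpreter" done="false">\n'
    "<summary>Analyzing...</summary>\n"
    + FENCE + "python\nprint(5 + 3)\n" + FENCE + "\n"
    "</details>\n"
)
# Same block wrapped in an unclosed xml fence and followed by an unclosed output fence,
# similar to the first log in this issue.
broken = FENCE + "xml\n" + well_formed + FENCE + "output\n{}\n"

print(structure_problems(well_formed))  # []
print(structure_problems(broken))       # ['unclosed fence', 'unclosed fence']
```

Code would only be executed when the list is empty. One limitation: a `<details>` tag that appears inside a string in the Python code is counted as a tag too, so this errs on the side of refusing to run.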


GiteaMirror commented

2026-04-19 22:19:59 -05:00

@rgaricano commented on GitHub (Mar 20, 2025):

OK, and what about telling it that you want the output as Markdown?


GiteaMirror commented

2026-04-19 22:20:00 -05:00

@allo- commented on GitHub (Mar 20, 2025):

I only asked the very basic question "Let's start by writing a Python program that adds two numbers." as a test and hoped for the open-webui tool prompt to help the LLM answer in the right format.

Maybe you can tell me what goal you have in mind with the feature? If the goal is that it should work (often) automatically, I think it still needs fine-tuning, and I opened the issue to discuss and give feedback.

If the goal is that users ask the LLM to follow the format themselves, it may need some documentation but would otherwise possibly be enough as it is (I would still test some prompt formats to give feedback on which ones got the feature to work well for me).


GiteaMirror commented

2026-04-19 22:20:04 -05:00

@rgaricano commented on GitHub (Mar 20, 2025):

Each model, each system, each config, and how they interact with one another... it's a learning curve that everyone goes through.
Yes, more docs would be great, especially with new options arriving every so often; this is work that everyone is contributing to.
Thanks for the help too!!


GiteaMirror referenced this issue

2026-04-19 23:46:45 -05:00

[GH-ISSUE #16401] feat: Support memory retrieval reranking for improved context personalization #17888
