refac/enh: refactor the default tools pipeline #1642
Originally created by @michaelpoluektov on GitHub (Jul 30, 2024).
Is your feature request related to a problem? Please describe.
There is a lot of unnecessary complexity associated with the current way tools are handled by default. As of right now, OWUI does the following:
Describe the solution you'd like
I would like to eventually make the following improvements, and I'm wondering if anyone has any opinions/objections to how this should be handled.
`__tools__: list[dict]`. Example:

Tracked here: https://github.com/open-webui/open-webui/issues/3718
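For illustration, here is what a single `__tools__` entry could contain, assuming OpenAI-style function specs; the tool name and parameters below are made up, and the proposal's actual example is not shown above:

```python
# Hypothetical example of one __tools__ entry, following OpenAI-style
# function specs. The tool name and parameters are illustrative only.
example_tool = {
    "name": "get_current_weather",
    "description": "Get the current weather for a given city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "The city name."},
        },
        "required": ["city"],
    },
}
```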
@michaelpoluektov commented on GitHub (Aug 19, 2024):
https://github.com/open-webui/open-webui/pull/4724
@EtiennePerot commented on GitHub (Sep 1, 2024):
I believe this current limitation makes it difficult for models to call tools that need more elaborate inputs, such as code execution, where the argument (the code to run) can be quite long and complicated. Even in simple cases, I sometimes run into cases where the model tries to generate a tool call but adds garbage tokens before the tool call's structured JSON, e.g. `<|python_tag|>` in `<|python_tag|>{"name": "run_bash_command", "parameters": {"bash_command": "date; dmesg"}}`.

Specifically, if I add logging around this line:
f4df49e600/backend/main.py (L415)

like so:
It (sometimes) prints:
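One hypothetical way to cope with such garbage prefixes is to strip everything before the first `{` before parsing. A minimal sketch (illustrative only, not the actual Open WebUI backend code):

```python
# Illustrative workaround for garbage tokens (e.g. <|python_tag|>) emitted
# before the tool call's JSON: drop everything before the first '{'.
import json

def parse_tool_call(raw: str) -> dict:
    start = raw.find("{")
    if start == -1:
        raise ValueError(f"no JSON object in model output: {raw!r}")
    return json.loads(raw[start:])

# parse_tool_call('<|python_tag|>{"name": "run_bash_command", '
#                 '"parameters": {"bash_command": "date; dmesg"}}')
# -> {'name': 'run_bash_command', 'parameters': {'bash_command': 'date; dmesg'}}
```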
@michaelpoluektov commented on GitHub (Sep 2, 2024):
The way I would implement a Python tool would be to define a function that takes no external parameters other than `__messages__`, which calls `eval` on the last snippet of code wrapped in a ```python tag. That should resolve your issue.

@EtiennePerot commented on GitHub (Sep 2, 2024):
@michaelpoluektov I think that's a slightly different use-case from the one I'm trying to solve. The intent of having it as a tool is to let the LLM decide on and query the information it needs (without overloading the system message with too many tools), as a sort of knowledge-gathering step prior to answering the user's prompt, similar to other tools like web searching.
By contrast, the way the function you describe would work would be more like Open WebUI's existing Pyodide integration with the "Run" button showing up on Python code blocks. It would act on visible Python code blocks that the user would explicitly have to ask for.
@michaelpoluektov commented on GitHub (Sep 2, 2024):
The goal is eventually not to have tool outputs only be prepended to each query, but potentially to have tools called midway through the generation, similar to https://openwebui.com/f/michaelpoluektov/openai_react/ (I'm waiting on Ollama to support streaming tool calls to make a PR implementing this natively).

This would allow you to run the code on the backend (through something like Terrarium for security) and pipe the result back into the LLM before the rest of the response is generated. The only difference is that the user would see the code, but that can also easily be solved with Functions magic if it's not desired.
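As a rough sketch of the mid-generation loop described here, using the OpenAI Python SDK (non-streaming for brevity; `execute_tool` and `TOOL_REGISTRY` are hypothetical, not part of Open WebUI):

```python
# Sketch of a tool loop: keep calling the model, executing any requested
# tools and feeding their results back, until it answers without tools.
import json
from openai import OpenAI

client = OpenAI()
TOOL_REGISTRY: dict = {}  # hypothetical: tool name -> Python callable

def execute_tool(name: str, arguments: dict) -> str:
    """Hypothetical dispatcher: look up the tool by name and call it."""
    return str(TOOL_REGISTRY[name](**arguments))

def run_with_tools(messages: list, tools: list, model: str = "gpt-4o-mini") -> str:
    while True:
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # final answer, no further tool use
        messages.append(message)
        for call in message.tool_calls:
            result = execute_tool(
                call.function.name, json.loads(call.function.arguments)
            )
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
```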
@EtiennePerot commented on GitHub (Sep 3, 2024):
Makes sense, thanks. I didn't realize ollama's tool support was still so primitive.
For those interested in tracking progress on the ollama side, I believe this is tracked in https://github.com/ollama/ollama/issues/5796.
In the meantime, I have also implemented your suggestion of a function and uploaded it here.
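For reference, a minimal sketch of the suggested function (hypothetical names; assumes the chat history arrives as `__messages__`, and that real use would sandbox the execution, e.g. via something like Terrarium as mentioned above):

```python
# Hypothetical sketch of the suggested tool: no parameters other than
# __messages__; find the last ```python block in the chat and run it.
# exec() on untrusted code must be sandboxed in any real deployment.
import io
import re
import contextlib

def run_last_python_block(__messages__: list[dict]) -> str:
    """Execute the last ```python code block found in the chat history."""
    blocks: list[str] = []
    for message in __messages__:
        blocks += re.findall(r"```python\n(.*?)```", message.get("content", ""), re.S)
    if not blocks:
        return "No python code block found."
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(blocks[-1])
    return buffer.getvalue()
```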
@bdqfork commented on GitHub (Sep 29, 2024):
Any update on tool message rendering?
@smonux commented on GitHub (Oct 13, 2024):
I have been working on this lately and I have a working version (it surely still has bugs and needs some cleaning). If anyone is interested, it's here:
https://github.com/smonux/open-webui/blob/dev-apitools/backend/open_webui/main.py
I have tested it with 4o-mini, DeepSeek, and Qwen2.5, and they work fine.
It basically peeks into the LLM API responses and, if there are any tool calls, it stops forwarding the data to the client, aggregates the deltas, and executes what the LLM wants to run.
In my opinion, this way of running tools has the enormous advantage that different tools can be used together and the same tool can be used several times, so the LLM can correct its own mistakes (which is crucial for tools that run code).
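A sketch of that delta aggregation (field names follow the OpenAI Python SDK's streaming chunks; the surrounding logic is illustrative, not the code in the linked branch):

```python
# Illustrative aggregation of streamed tool-call deltas: argument JSON
# arrives in fragments, keyed by the tool call's index within the choice.
def collect_tool_calls(stream) -> list[dict]:
    calls: dict[int, dict] = {}
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        if not delta.tool_calls:
            continue  # plain content deltas keep streaming to the client
        for tc in delta.tool_calls:
            entry = calls.setdefault(
                tc.index, {"id": None, "name": None, "arguments": ""}
            )
            entry["id"] = tc.id or entry["id"]
            if tc.function:
                entry["name"] = tc.function.name or entry["name"]
                entry["arguments"] += tc.function.arguments or ""
    return [calls[i] for i in sorted(calls)]
```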
Before working on polishing it, I have a couple of questions/concerns:
I post here a comparison using the same prompt. In the first case, the LLM is able to self-correct and complete the task, while in the other it starts hallucinating a response:
MULTIPLE CALLS VERSION
USER
Explain what you are doing after every step
ASSISTANT
Execution: 2024-10-13T13:26:57.464090
Output:
Execution: 2024-10-13T13:27:02.660632
Output:
Execution: 2024-10-13T13:27:07.640838
Output:
Execution: 2024-10-13T13:27:14.290234
Output:
Execution: 2024-10-13T13:27:19.597312
Output:
Execution: 2024-10-13T13:27:26.580866
Output:
The steps I've taken are as follows:
1. Listed the files in the `test` directory. The files are: `d`, `b`, `apple`, `a`, `f`, `c`, and `e`.
2. Identified `apple` as the one with a name that is different from the others.
3. Read the `apple` file and found the following Python code:
4. Ran the `apple` file, and it printed:

SINGLE CALL VERSION
USER
Explain what you are doing after every step
ASSISTANT
Execution: 2024-10-13T11:35:26.091281
Output:
Sure, I'll go through the steps one by one and explain what I'm doing at each step.
Step 1: List the files in the directory "test"
I will list all the files in the "test" directory to see what we're working with. This will help us identify which file has a unique name.
Let's execute this code to list the files.
Step 2: Identify the file with a unique name
Once we have the list of files, I will inspect the names to find the one that is different from the others. Typically, a unique file might have a different extension, a different naming pattern, or some other distinguishing feature.
....
@michaelpoluektov commented on GitHub (Oct 14, 2024):
Thanks, this was my original plan!
This was discussed with Tim; from my understanding, he wants to keep the default version as well, to be used with models that weren't trained with tool calling in mind.
Ollama is a bit tricky and definitely needs to be handled separately: I would love to review this properly and fix any Ollama issues, but I've been quite busy lately. I will get to it as soon as I'm free, hopefully this week. In the meantime:
@smonux commented on GitHub (Oct 14, 2024):
Thank you for your comments, Michael. My time is also pretty limited, so any progress on my part will be slow.
My current plan is to try to make it work for the most common models (at least one of each family: Haiku, 4o-mini, Qwen2.5, DeepSeek-Chat-2.5, etc.) and polish it a bit (make citations work, etc.). I have a branch in my fork called dev-apitools where I'm doing the changes.
I understand this affects core functionality, so keeping it a flag-enabled feature makes a lot of sense. I will try to add it (there are several options... it could be at the user level, at the model level, or for the whole app). Still, I don't know how well non-function-capable models can actually take advantage of the current implementation. My bet is that they probably don't perform very well.
With regards to llama3.2 1b/3b, I wasn't aware that they supported function calling. I will try to make it work with models hosted on OpenRouter and, if it works, I could try to run them in Ollama on a cheap VPS or the like.
I will post updates here. Best regards.
@ferrislucas commented on GitHub (Oct 27, 2024):
I have a fork where I was playing with Groq's tool-use preview model here: https://github.com/ferrislucas/open-webui/pull/1
I'm only sharing it here in case it's useful for some reason. I wasn't attempting anything beyond just getting it working with Groq. Thanks for your work on this project!
@smonux commented on GitHub (Nov 2, 2024):
I'm closer to having a working solution. Thanks to @michaelpoluektov's suggestion, I have been able to test the Ollama API.
Still, as impressive as it is for its size, llama3.2:3b sometimes hallucinates responses or fails, but I can't reasonably run stronger models. I think the implementation is on par with the OpenAI one, but I can't be sure.
This is how it's going. I'm providing a couple of functions, one that merely shows the current time, and another that retrieves the value of a virtual anemometer (code below).
The code:
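The snippet itself is not reproduced above; as a stand-in, here is a minimal sketch of the two tools described, assuming Open WebUI's `Tools`-class convention and a simulated sensor (all names hypothetical):

```python
# Hypothetical stand-in for the two tools described above, following
# Open WebUI's convention of a Tools class whose methods become tools.
import random
from datetime import datetime

class Tools:
    def get_current_time(self) -> str:
        """Return the current date and time as an ISO-8601 string."""
        return datetime.now().isoformat()

    def get_wind_speed(self) -> str:
        """Return the current reading of the (virtual) anemometer in m/s."""
        # Simulated sensor; a real tool would query hardware or an API.
        return f"{random.uniform(0.0, 25.0):.1f} m/s"
```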
@smonux commented on GitHub (Nov 21, 2024):
I have created this PR, in case anybody is interested:
https://github.com/open-webui/open-webui/pull/7021
Not sure if it will be accepted. Any help/suggestions to increase its odds will be welcomed.