[GH-ISSUE #6541] How to configure max generating tokens when calling TOOLS? #53066

GiteaMirror · 2026-05-05T14:17:41-05:00

Originally created by @ghost on GitHub (Oct 29, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/6541

Hello. When my AI calls a tool, the number of tokens it can generate for the tool request is capped at 400. I'm using the OpenAI-compatible API in KoboldCpp, version 1.76.

I searched all the project files for this 400 limit but found nothing. How can I let the AI send long queries to tools? Regular chat has a Maximum number of tokens (num_predict) parameter, but this advanced setting has no effect on tool calls. Could you add such a setting, or explain how to raise the generation limit for tool requests?
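For context, OpenAI-style chat completion requests cap output length with a `max_tokens` field. The sketch below only shows what setting that field on a request body looks like. Whether Open WebUI forwards such a value on its tool-selection request, and whether the backend honors it, is not established in this issue. The helper name `with_max_tokens` and the value 2048 are illustrative assumptions.

```python
import json


def with_max_tokens(payload: dict, limit: int) -> dict:
    """Return a copy of an OpenAI-style request body with max_tokens set.

    The original payload is left untouched so callers can reuse it.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    updated = dict(payload)
    updated["max_tokens"] = limit
    return updated


# Hypothetical request body, shaped like the tail of the payload in the log
# ("stream": false); the message content is a placeholder.
base_payload = {
    "messages": [{"role": "user", "content": "Query: ..."}],
    "stream": False,
}

if __name__ == "__main__":
    # 2048 is an arbitrary example value, not a recommended setting.
    print(json.dumps(with_max_tokens(base_payload, 2048), indent=2))
```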

[14:49:59] Input Received
. \nAvailable Tools: [{"name": "get_current_date", "description": "Get the current date and time.", "parameters": {"type": "object", "properties": {}, "required": []}}]\nReturn an empty string if no tools match the query. If a function tool matches, construct and return a JSON object in the format {"name": "functionName", "parameters": {"requiredFunctionParamKey": "requiredFunctionParamValue"}} using the appropriate tool and its parameters. Only return the object and limit the response to the JSON object without additional text."}, {"role": "user", "content": "Query: History:\nUSER: """\u0443\u0437\u043d\u0430\u0439 \u0442\u0435\u043a\u0443\u0449\u0443\u044e \u0434\u0430\u0442\u0443 \u0438 \u0432\u0440\u0435\u043c\u044f"""\nSYSTEM: """\n\nUser Context:\n1. [2024-10-29]. \u0415\u0441\u043b\u0438 \u0438\u043d\u0441\u0442\u0440\u0443\u043c\u0435\u043d\u0442 \u043d\u0435 \u0441\u043e\u043e\u0442\u0432\u0435\u0442\u0441\u0442\u0432\u0443\u0435\u0442 \u0437\u0430\u043f\u0440\u043e\u0441\u0443, \u0432\u0435\u0440\u043d\u0438\u0442\u0435 \u043f\u0443\u0441\u0442\u043e\u0439 \u0441\u043f\u0438\u0441\u043e\u043a []. \n"""\nQuery: \u0443\u0437\u043d\u0430\u0439 \u0442\u0435\u043a\u0443\u0449\u0443\u044e \u0434\u0430\u0442\u0443 \u0438 \u0432\u0440\u0435\u043c\u044f"}], "stream": false}

Processing Prompt [BLAS] (190 / 190 tokens)
Generating (16 / 400 tokens)
(Stop sequence triggered: ### Instruction:)
CtxLimit:317/32768, Amt:16/400, Init:0.01s, Process:4.31s (22.7ms/T = 44.10T/s), Generate:7.52s (470.1ms/T = 2.13T/s), Total:11.83s (1.35T/s)
Output: {"name": "get_current_date", "parameters": {}}

Processing Prompt [BLAS] (326 / 326 tokens)
Generating (30 / 4096 tokens)
(EOS token triggered! ID:2)
CtxLimit:467/32768, Amt:30/4096, Init:0.01s, Process:5.74s (17.6ms/T = 56.79T/s), Generate:9.41s (313.8ms/T = 3.19T/s), Total:15.15s (1.98T/s)
Output: Sir, today is Tuesday, October 29, 2024, the time is 14:50:11.
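The two requests in the log run under different generation limits: the tool-selection request shows `Generating (16 / 400 tokens)`, while the follow-up chat request shows `Generating (30 / 4096 tokens)`. A minimal sketch for pulling those pairs out of a KoboldCpp console log, assuming the `Generating (N / M tokens)` line format seen above stays the same; the function name `generation_limits` is hypothetical.

```python
import re

# Matches lines such as "Generating (16 / 400 tokens)".
GENERATING_RE = re.compile(r"Generating \((\d+) / (\d+) tokens\)")


def generation_limits(log_text: str) -> list[tuple[int, int]]:
    """Return (generated, limit) pairs, one per 'Generating' line, in order."""
    return [
        (int(match.group(1)), int(match.group(2)))
        for match in GENERATING_RE.finditer(log_text)
    ]


# Lines copied from the log in this issue.
SAMPLE_LOG = """\
Processing Prompt [BLAS] (190 / 190 tokens)
Generating (16 / 400 tokens)
Processing Prompt [BLAS] (326 / 326 tokens)
Generating (30 / 4096 tokens)
"""

if __name__ == "__main__":
    for generated, limit in generation_limits(SAMPLE_LOG):
        print(f"generated {generated} of {limit} tokens")
```

Running this against a longer log makes it easy to see which requests run under the 400-token cap and which use the larger limit.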

GiteaMirror closed this issue

2026-05-05 14:17:45 -05:00

Reference: github-starred/open-webui#53066