mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[PR #13061] [CLOSED] perf: Asynchronous process_chat_payload in chat completion
#9879
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/13061
Author: @tth37
Created: 4/19/2025
Status: ❌ Closed
Base: dev ← Head: perf_async_payload_processing

📝 Commits (4)

- 89199f2 perf: Asynchronous process_chat_payload in chat completion
- d7f5f20 fix: Error handling on task-cancelled
- 0b5de38 fix: Error handling when chatting with multiple models
- 4c9ff0a fix: Enhanced persistent error messages

📊 Changes
3 files changed (+72 additions, -18 deletions)
- 📝 backend/open_webui/main.py (+55 -12)
- 📝 backend/open_webui/utils/middleware.py (+4 -6)
- 📝 src/lib/components/chat/Chat.svelte (+13 -0)

📄 Description
Problem Description
The complete problem description is detailed in #13027.
In short, when the `/api/chat/completions` endpoint is used asynchronously (typically from the browser Web UI), the server should return a `task_id` immediately so the UI can track progress and allow early cancellation. However, if features like Web Search or Tools are enabled, the server waits for this preprocessing (`process_chat_payload`) to finish before returning the `task_id`, leading to two main issues: the response is delayed for the whole duration of the preprocessing, and the request cannot be cancelled early because the UI has no `task_id` yet.

Solution
This PR changes the behavior for asynchronous requests:

- Detect asynchronous requests (i.e., where `event_emitter` and `event_caller` exist) at the very beginning of the chat completion handler.
- Run `process_chat_payload` and `process_chat_response` as a background task.
- Return the `task_id` to the client right away.

Error handling
I extended the previously unhandled `task-cancelled` event. This event can be triggered either by `asyncio.CancelledError`, or by errors raised while `process_chat_payload_and_response` is running as a background job.

Formerly, when errors were encountered during `process_chat_payload` or `chat_completion_handler`, the `/api/chat/completions` endpoint would return the error message directly. Now that all the work runs as a background task, we must use WebSocket events to tell the browser about error details.

When the browser receives the `task-cancelled` event, it marks `currentMessage.done = true` and re-fetches the current taskIds, so that the messages and the stop button behave properly. This preserves the exact same user experience as the previous error-handling logic, and adds robust error handling when chatting with multiple models.

Additional Messages
I've tried my best to keep the original structure of codebase, make minimal changes, and ensure backward-compatibility.
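As a rough illustration of the dispatch change described under Solution, the sketch below spawns the payload/response pipeline as a background `asyncio` task and returns the `task_id` immediately. The function bodies and the `TASKS` registry are simplified stand-ins, not Open WebUI's actual code.

```python
import asyncio
import uuid

# In-memory registry of running chat tasks (illustrative; the real code
# tracks tasks so the UI can look them up and stop them).
TASKS: dict[str, asyncio.Task] = {}

async def process_chat_payload(payload: dict) -> dict:
    # Stand-in for the slow preprocessing step (web search, tools, ...).
    await asyncio.sleep(0.05)
    return {**payload, "preprocessed": True}

async def process_chat_response(payload: dict, emit) -> None:
    # Stand-in for response generation; reports the result as an event.
    await emit({"type": "chat-completion", "data": payload})

async def process_chat_payload_and_response(payload: dict, emit) -> None:
    processed = await process_chat_payload(payload)
    await process_chat_response(processed, emit)

async def chat_completions(payload: dict, emit) -> dict:
    # Key change: spawn the heavy work as a background task and hand the
    # task_id back right away, instead of awaiting preprocessing first.
    task_id = str(uuid.uuid4())
    TASKS[task_id] = asyncio.create_task(
        process_chat_payload_and_response(payload, emit)
    )
    return {"task_id": task_id}
```

With this shape, the browser receives the `task_id` before the (possibly slow) preprocessing finishes, and can cancel `TASKS[task_id]` at any point.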
Tests
For sync requests, I used curl to test the chat completion endpoints, and they behave the same as in the original version.

For async requests, I tested the endpoint via the browser Web UI. The endpoint returns the `task_id` very quickly, before the web search process has finished. Upon receiving the `task_id`, I am able to cancel the generation task at any time.

Error handling was verified for:

- Errors during the `process_chat_payload` phase
- Errors during the `process_chat_response` phase
- The `task-cancelled` error when the user stops a request early

More test cases / discussions are welcomed!
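The error paths exercised above can be sketched with a small wrapper that converts background-job failures and cancellations into events for the browser. The wrapper name and event payload shapes here are illustrative assumptions, not the PR's exact code.

```python
import asyncio

async def run_with_error_events(job, emit) -> None:
    """Await a background chat job, reporting failures as events.

    Since the HTTP response already went out with the task_id, errors can
    no longer travel in the response body; they are emitted over the
    event channel instead (hypothetical event shapes).
    """
    try:
        await job
    except asyncio.CancelledError:
        # User pressed stop: let the UI mark the current message as done
        # and re-fetch its task list.
        await emit({"type": "task-cancelled", "data": {}})
    except Exception as exc:
        # A failure inside the background job (e.g. during preprocessing)
        # is also surfaced as a task-cancelled event, with details.
        await emit({"type": "task-cancelled", "data": {"error": str(exc)}})
```

Note that `asyncio.CancelledError` is caught separately because, on modern Python, it derives from `BaseException` and would slip past a plain `except Exception`.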
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.