mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #5449] Add an option to display tokens/s in real-time for both side-by-side and single model inference. #13985
Originally created by @zytoh0 on GitHub (Sep 16, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/5449
Is your feature request related to a problem? Please describe.

Open WebUI’s comparison feature is useful for evaluating output quality, but it does not show tokens per second (tokens/s), which matters when comparing model speed. The metric is missing from both side-by-side and single-model inference. Other UIs such as Jan.ai already include it, which makes model performance easier to assess. (See image below.) For an example, see https://www.youtube.com/watch?v=QpMQgJL4AZA at 02:48, where tokens/s is displayed in real time.
Describe the solution you'd like
Add a setting that displays tokens/s in real time for both side-by-side comparisons and single-model inference. This would let users track model speed alongside output quality.
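To make the request concrete, here is a minimal sketch of how a live tokens/s figure could be computed while a response streams in. It is not Open WebUI's code: `TokenRateMeter`, `add`, `rate`, and `format_rate` are hypothetical names, and the sketch assumes the caller knows how many tokens each streamed chunk contains (a chunk is not necessarily one token). The timer starts when the request is sent, so the figure includes time to first token.

```python
import time
from typing import Callable


class TokenRateMeter:
    """Running tokens/s for one streaming response (hypothetical helper)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the meter can be tested deterministically.
        self._clock = clock
        # Assumption: the meter is created at the moment the request is sent.
        self._started_at = clock()
        self.tokens = 0

    def add(self, n_tokens: int = 1) -> None:
        """Record tokens from a newly received chunk."""
        self.tokens += n_tokens

    def rate(self) -> float:
        """Average tokens/s since the request started; 0.0 before any time passes."""
        elapsed = self._clock() - self._started_at
        return self.tokens / elapsed if elapsed > 0 else 0.0


def format_rate(meter: TokenRateMeter) -> str:
    """Render the label a UI could show next to a response."""
    return f"{meter.rate():.1f} tokens/s"
```

For a side-by-side comparison, one meter per model would be enough: each response stream calls `add` on its own meter, and the UI re-renders `format_rate` on every chunk or on a short timer.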
Describe alternatives you've considered
Another option would be to display tokens/s with every response, either below or next to the text. A setting to toggle the display on or off would be more flexible, however, for users who do not always need this information.
Additional context
Adding this feature would greatly improve the usability of Open WebUI by allowing users to evaluate both model output quality and processing speed. This is especially useful when speed is a critical factor.