[GH-ISSUE #5449] Add an option to display tokens/s in real-time for both side-by-side and single model inference. #52651

Closed
opened 2026-05-05 13:45:16 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @zytoh0 on GitHub (Sep 16, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/5449

Is your feature request related to a problem? Please describe.
Open WebUI’s comparison feature is useful for evaluating output quality, but it lacks the ability to show tokens per second (tokens/s), which is important for comparing model speed. This is missing in both side-by-side and single model inference. UIs like Jan.ai already include this feature, making it easier to assess model performance. (See image below.) See https://www.youtube.com/watch?v=QpMQgJL4AZA at 02:48 for an example of tokens/s displayed in real time.
![image](https://github.com/user-attachments/assets/fb3be782-f2ca-4cf1-aacd-2d03fa8e5482)

Describe the solution you'd like
Add a setting that displays tokens/s in real-time for both side-by-side comparisons and single model inferences. This would allow users to track model speed alongside output quality.
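For illustration, a real-time tokens/s readout boils down to counting streamed tokens against elapsed wall-clock time, starting the timer at the first token so time-to-first-token does not skew the rate. A minimal, hypothetical Python sketch (not Open WebUI code; the class name and clock-injection parameter are invented for this example):

```python
import time


class TokensPerSecondMeter:
    """Rolling tokens/s meter for a streamed response (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable clock, so the meter is testable
        self._start = None
        self._tokens = 0

    def on_token(self, n: int = 1) -> None:
        # Start timing at the first token so the model's time-to-first-token
        # does not distort the generation rate.
        if self._start is None:
            self._start = self._clock()
        self._tokens += n

    def rate(self) -> float:
        """Current tokens/s; 0.0 before any tokens have arrived."""
        if self._start is None:
            return 0.0
        elapsed = self._clock() - self._start
        return self._tokens / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    # Simulate a stream with a fake clock advancing 0.1 s per token.
    now = [0.0]
    meter = TokensPerSecondMeter(clock=lambda: now[0])
    for _ in range(20):
        meter.on_token()
        now[0] += 0.1
    print(f"{meter.rate():.1f} tokens/s")  # 20 tokens over 2.0 s
```

In the UI, `on_token` would be called per streamed chunk and `rate()` polled on a short interval to refresh the badge next to each response (or each pane in side-by-side mode).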

Describe alternatives you've considered
Another option would be to display tokens/s within the response itself, either below or next to the text of every response. However, a setting to toggle this feature on or off would be more flexible for users who may not always need this information.

Additional context
Adding this feature would greatly improve the usability of Open WebUI by allowing users to evaluate both model output quality and processing speed. This is especially useful when speed is a critical factor.


Reference: github-starred/open-webui#52651