[GH-ISSUE #8160] v0.5.2 Token Generation Slower Than v0.4.8 #15023

Closed
opened 2026-04-19 21:18:42 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @yanghan-cyber on GitHub (Dec 28, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8160

Performance Regression in Token Generation Speed (v0.5.2 vs v0.4.8)

Environment Details

  • Version: v0.5.2
  • Deployment Method: Docker
  • Backend: Open-WebUI
  • API Management: Self-hosted OneAPI for OpenAI API

Issue Description

I've noticed a significant degradation in token generation speed when comparing v0.5.2 to v0.4.8.

Steps to Reproduce

  1. Deploy Open-WebUI using Docker
  2. Use a self-hosted OneAPI instance to manage access to the OpenAI API
  3. Compare token generation speed between v0.4.8 and v0.5.2
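To make the comparison in step 3 reproducible, a throughput measurement can be scripted against the OpenAI-compatible streaming endpoint. The sketch below is a hypothetical benchmark, not part of the original report: the base URL, API key, model name, and prompt are placeholders, and it counts SSE delta chunks as a rough proxy for tokens. Running it once against a v0.4.8 deployment and once against v0.5.2 would give comparable tokens/second figures.

```python
# Hypothetical benchmark sketch for comparing streaming token throughput
# between two Open-WebUI versions. URL, key, model, and prompt are
# placeholders (assumptions, not values from the issue).
import json
import time
import urllib.request


def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Throughput in tokens/second; guards against a zero duration."""
    return token_count / elapsed_s if elapsed_s > 0 else 0.0


def stream_benchmark(base_url: str, api_key: str, model: str, prompt: str) -> float:
    """Stream one chat completion and return measured tokens/second.

    Counts SSE "data:" chunks as an approximation of token count (each
    streamed delta usually carries roughly one token).
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    chunks = 0
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        for line in resp:
            # Skip the terminal "[DONE]" sentinel; count real delta chunks.
            if line.startswith(b"data: ") and b"[DONE]" not in line:
                chunks += 1
    return tokens_per_second(chunks, time.monotonic() - start)


if __name__ == "__main__":
    # Point this at the Open-WebUI / OneAPI instance under test.
    rate = stream_benchmark("http://localhost:3000", "sk-...", "gpt-4o-mini",
                            "Count from 1 to 50.")
    print(f"{rate:.1f} tokens/s")
```

Running the same prompt several times per version and averaging would reduce noise from the upstream API before attributing the difference to the frontend.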

Observed Behavior

  • In v0.4.8: Token generation was faster and more responsive
  • In v0.5.2: Token generation speed has noticeably decreased

Expected Behavior

Token generation speed should remain consistent or improve between versions.

Additional Information

  • Can provide more detailed performance metrics if needed
  • Willing to assist in debugging or providing more information

Suggested Next Steps

  • Investigate potential performance regressions
  • Review changes between v0.4.8 and v0.5.2 that might impact token generation speed

Reference: github-starred/open-webui#15023