Feature: Embeddings Optimizer (Prompt Rephraser) #851

Closed
opened 2025-11-11 14:32:27 -06:00 by GiteaMirror · 2 comments

Originally created by @spammenotinoz on GitHub (May 8, 2024).

Brilliant project, but I am currently using another project because my API costs are considerably higher with this one.

High API Usage Costs with embeddings.


  • Ability to send embeddings to a REPHRASER (lower-cost model) before sending the relevant tokens to the user's chosen model.
    Typically this can yield a major speed improvement and cost reduction for premium / large models.
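
One way to read the request above, as a minimal sketch rather than how any project actually implements it: the retrieved context is first condensed by a low-cost model, and only the condensed text is forwarded to the user's chosen model. The names `ModelFn`, `rephrase_context`, and `answer_with_rephraser` are hypothetical, and the two model callables stand in for whatever API client is in use.

```python
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns the model's text.
# In practice this would wrap a real API client; none is assumed here.
ModelFn = Callable[[str], str]


def rephrase_context(question: str, chunks: List[str], cheap_model: ModelFn) -> str:
    """Ask the low-cost model to keep only what is relevant to the question."""
    joined = "\n\n".join(chunks)
    prompt = (
        "Condense the context below to only the facts needed to answer the question.\n"
        f"Question: {question}\n"
        f"Context:\n{joined}"
    )
    return cheap_model(prompt)


def answer_with_rephraser(
    question: str,
    chunks: List[str],
    cheap_model: ModelFn,
    premium_model: ModelFn,
) -> str:
    """Send only the condensed context to the premium model, not the raw chunks."""
    condensed = rephrase_context(question, chunks, cheap_model)
    final_prompt = f"Context:\n{condensed}\n\nQuestion: {question}"
    return premium_model(final_prompt)
```

The cost saving in this sketch depends entirely on the cheap model returning noticeably less text than it was given; the premium model is called once, with the shorter prompt.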

Describe alternatives you've considered

  • Using free high-performance services like Groq

Additional context
Example of a Rephraser and Reranker as used on another project. This pull request was not merged into the product, but works extremely well.
I do not use the Reranker, just the Rephraser.
[No modifications to GUI](https://github.com/mckaywrigley/chatbot-ui/pull/1535/files)

@Yanyutin753 commented on GitHub (May 11, 2024):

@spammenotinoz Hey dude, with a free high-performance service like Groq, you can use [one-api](https://github.com/songquanpeng/one-api) to sum it up and then use it in Open WebUI


@spammenotinoz commented on GitHub (May 12, 2024):

> @spammenotinoz Hey dude, with a free high-performance service like Groq, you can use [one-api](https://github.com/songquanpeng/one-api) to sum it up and then use it in open webui

Hi, I already use one-api for another project that already has the "sum it up" feature. I don't use one-api here, as litellm already load-balances multiple API keys.

Reference: github-starred/open-webui#851