[PR #2725] [MERGED] feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings #59928

Closed
opened 2026-05-06 02:24:44 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/2725
Author: @cheahjs
Created: 6/2/2024
Status: Merged
Merged: 6/3/2024
Merged by: @tjbck

Base: dev ← Head: feat/openai-embeddings-batch


📝 Commits (2)

  • 0cb8163 feat: add RAG_EMBEDDING_OPENAI_BATCH_SIZE to batch multiple embeddings
  • 92d9b38 Merge branch 'dev' into feat/openai-embeddings-batch

📊 Changes

39 files changed (+112 additions, -19 deletions)

📝 backend/apps/rag/main.py (+16 -2)
📝 backend/apps/rag/utils.py (+27 -16)
📝 backend/config.py (+6 -0)
📝 src/lib/apis/rag/index.ts (+1 -0)
📝 src/lib/components/documents/Settings/General.svelte (+28 -1)
📝 src/lib/i18n/locales/ar-BH/translation.json (+1 -0)
📝 src/lib/i18n/locales/bg-BG/translation.json (+1 -0)
📝 src/lib/i18n/locales/bn-BD/translation.json (+1 -0)
📝 src/lib/i18n/locales/ca-ES/translation.json (+1 -0)
📝 src/lib/i18n/locales/ceb-PH/translation.json (+1 -0)
📝 src/lib/i18n/locales/de-DE/translation.json (+1 -0)
📝 src/lib/i18n/locales/dg-DG/translation.json (+1 -0)
📝 src/lib/i18n/locales/en-GB/translation.json (+1 -0)
📝 src/lib/i18n/locales/en-US/translation.json (+1 -0)
📝 src/lib/i18n/locales/es-ES/translation.json (+1 -0)
📝 src/lib/i18n/locales/fa-IR/translation.json (+1 -0)
📝 src/lib/i18n/locales/fi-FI/translation.json (+1 -0)
📝 src/lib/i18n/locales/fr-CA/translation.json (+1 -0)
📝 src/lib/i18n/locales/fr-FR/translation.json (+1 -0)
📝 src/lib/i18n/locales/he-IL/translation.json (+1 -0)

...and 19 more files

📄 Description

Pull Request Checklist

Before submitting, make sure you've checked the following:

  • [x] Target branch: Please verify that the pull request targets the dev branch.
  • [x] Description: Provide a concise description of the changes made in this pull request.
  • [ ] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [ ] Documentation: Have you updated relevant documentation Open WebUI Docs, or other documentation sources?
  • [ ] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • [x] Testing: Have you written and run sufficient tests for validating the changes?
  • [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • [x] Label: To clearly categorize this pull request, assign a relevant label to the pull request title, using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Changelog Entry

Description

Add RAG_EMBEDDING_OPENAI_BATCH_SIZE, which controls how many texts are batched together in a single OpenAI embedding call. OpenAI accepts at most 2048 inputs in a single request.
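
The batching described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `embed_in_batches` and the `embed_batch` callable are hypothetical names, and `embed_batch` stands in for whatever function sends one embedding request for a list of texts and returns one vector per text, in order.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int,
) -> List[Vector]:
    """Embed `texts` using one request per `batch_size` texts.

    `embed_batch` is a hypothetical stand-in for a single embedding API
    call; it must return one vector per input text, in input order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    embeddings: List[Vector] = []
    # Walk the input in consecutive slices of at most `batch_size` texts.
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        embeddings.extend(embed_batch(batch))
    return embeddings
```

With a batch size of 1 this degenerates to one request per text, so the setting only changes behaviour when it is raised above 1.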

This reduces the number of calls, and therefore the latency, when embedding a large number of texts. It is helpful when the limiting factor is the number of API calls rather than the number of tokens embedded (for example, Cohere's 5 RPM with 96 texts per embed call, or Gemini's 150 RPM with 100 texts per embed call).
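
To make the rate-limit argument concrete, here is a rough back-of-the-envelope calculation using the RPM and texts-per-call figures quoted above. It assumes requests are paced exactly at the limit and ignores token limits and per-request latency. The function names and the document count of 960 texts are illustrative assumptions, not part of this PR.

```python
def requests_needed(num_texts: int, batch_size: int) -> int:
    """Number of embedding requests needed: ceil(num_texts / batch_size)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return -(-num_texts // batch_size)  # integer ceiling division


def minutes_at_rate_limit(num_requests: int, rpm: int) -> float:
    """Minutes needed to send `num_requests` at `rpm` requests per minute."""
    return num_requests / rpm


if __name__ == "__main__":
    num_texts = 960  # illustrative number of chunks to embed
    # (provider, requests per minute, max texts per embed call) as quoted above
    for name, rpm, per_call in [("Cohere", 5, 96), ("Gemini", 150, 100)]:
        unbatched = requests_needed(num_texts, 1)
        batched = requests_needed(num_texts, per_call)
        print(
            f"{name}: {unbatched} requests "
            f"({minutes_at_rate_limit(unbatched, rpm):.1f} min) unbatched, "
            f"{batched} requests "
            f"({minutes_at_rate_limit(batched, rpm):.1f} min) batched"
        )
```

Under these assumptions, 960 texts at 5 RPM take 192 minutes when sent one per request, but only 2 minutes when sent 96 per request.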

Screenshots or Videos

image (https://github.com/open-webui/open-webui/assets/818368/1351457c-2dc7-436f-95ff-f01c1fe06992)

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-06 02:24:44 -05:00
Reference: github-starred/open-webui#59928