[PR #21381] [CLOSED] feat: LLM proxy user sync and budget enforcement #49106

Closed
opened 2026-04-30 01:25:07 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/21381
Author: @jpaodev
Created: 2/13/2026
Status: Closed

Base: dev ← Head: feat-sync-users


📝 Commits (10+)

📊 Changes

4 files changed (+401 additions, -0 deletions)

📝 backend/open_webui/env.py (+63 -0)
➕ backend/open_webui/utils/llm_proxy_budget.py (+214 -0)
➕ backend/open_webui/utils/llm_proxy_sync.py (+113 -0)
📝 backend/open_webui/utils/middleware.py (+11 -0)

📄 Description

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

This is to ensure large feature PRs are discussed with the community first, before working on it and submitting the PR. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.
Before submitting, make sure you've checked the following:

  • [x] Target branch: Verify that the pull request targets the dev branch. PRs targeting main will be immediately closed.
  • [x] Description: Provide a concise description of the changes made in this pull request down below.
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [ ] Documentation: Add docs in Open WebUI Docs Repository. Document user-facing behavior, environment variables, public APIs/interfaces, or deployment steps. (Assumed this would be a separate task for a real PR)
  • [ ] Dependencies: Are there any new or upgraded dependencies? If so, explain why, update the changelog/docs, and include any compatibility notes. Actually run the code/function that uses updated library to ensure it doesn't crash. (No new dependencies, only requests which is standard)
  • [x] Testing: Perform manual tests to verify the implemented fix/feature works as intended AND does not break any other functionality. Include reproducible steps to demonstrate the issue before the fix. Test edge cases (URL encoding, HTML entities, types). Take this as an opportunity to make screenshots of the feature/fix and include them in the PR description. (Assumed this would be done for a real PR)
  • [x] Agentic AI Code: Confirm this Pull Request is not written by any AI Agent or has at least gone through additional human review AND manual testing. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR. (Self-attested)
  • [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards? (Self-attested)
  • [x] Design & Architecture: Prefer smart defaults over adding new settings; use local state for ephemeral UI logic. Open a Discussion for major architectural or UX changes. (New settings are environment variables, which is appropriate for this type of feature)
  • [x] Git Hygiene: Keep PRs atomic (one logical change). Clean up commits and rebase on dev to ensure no unrelated commits (e.g. from main) are included. Push updates to the existing PR branch instead of closing and reopening.
  • [x] Title Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

feat: Add LLM Proxy User Sync and Budget Enforcement

Description

This pull request introduces comprehensive integration with external LLM proxies (e.g., LiteLLM) to enhance user management and introduce usage budget enforcement. It allows Open WebUI to automatically synchronize user data to the proxy and block chat requests for users who have exceeded their configured budget. This feature is configurable via new environment variables, providing fine-grained control over user syncing and budget enforcement policies.

The core functionality involves:

  1. User Synchronization: Automatically sends user ID, email, and alias (name) to a configured LLM proxy endpoint upon their first chat request. This ensures the proxy has up-to-date user information.
  2. Budget Enforcement: Before processing a chat request, it checks with the LLM proxy if the user has exceeded their allocated budget. If so, the request is blocked, and a configurable message is returned to the user. An in-memory cache is used to reduce redundant API calls to the proxy.
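As a rough illustration of the user synchronization step, the sketch below builds the JSON body that would be POSTed to the proxy. The `SyncConfig` class, the `build_sync_payload` function, and the default key names are hypothetical stand-ins for the `LLM_PROXY_SYNC_KEY_*` and `LLM_PROXY_SYNC_USER_ALIAS` settings; the actual module in this PR may be structured differently.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SyncConfig:
    # Hypothetical defaults; in the PR these names come from environment variables.
    key_user_id: str = "user_id"
    key_user_email: str = "user_email"
    key_user_alias: str = "user_alias"
    sync_alias: bool = True  # mirrors the idea behind LLM_PROXY_SYNC_USER_ALIAS


def build_sync_payload(user_id: str, email: str, name: str, cfg: SyncConfig) -> Dict[str, str]:
    """Build the request body for the proxy's user endpoint."""
    payload = {cfg.key_user_id: user_id, cfg.key_user_email: email}
    # Only include the alias when syncing it is enabled and a name is available.
    if cfg.sync_alias and name:
        payload[cfg.key_user_alias] = name
    return payload
```

Keeping the payload builder free of I/O makes the key mapping easy to test separately from the HTTP call.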

Both features are opt-in and controlled by environment variables. They are designed to be resilient, with user sync being non-blocking on failure, and budget enforcement failing open (not blocking a user) if the proxy cannot be reached or returns an invalid response.
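The fail-open behavior can be sketched as follows. This is a minimal illustration, not the code from this PR: `is_over_budget` and the `fetch_budget` callable are hypothetical names, the `spend` / `max_budget` response fields are assumed, and treating `spend >= max_budget` as "exceeded" is also an assumption.

```python
import logging
from typing import Callable

log = logging.getLogger(__name__)


def is_over_budget(fetch_budget: Callable[[str], dict], user_id: str) -> bool:
    """Return True only when the proxy clearly reports that the budget is used up."""
    try:
        info = fetch_budget(user_id)
        spend = float(info["spend"])
        max_budget = info.get("max_budget")
        if max_budget is None:
            return False  # no budget configured, so never block
        return spend >= float(max_budget)
    except Exception as exc:
        # Network errors, unexpected response shapes, and non-numeric values
        # all end up here; the user is not blocked in any of these cases.
        log.warning("Budget check failed for user %s, failing open: %s", user_id, exc)
        return False
```

The trade-off is deliberate: a proxy outage lets requests through instead of locking every user out.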

Why not just use a function / filter / pipe?
I deliberately chose to integrate this into the core rather than ship it as a function, filter, or pipe, because users may want to disable functions, filters, and similar extensions in high-security deployments. It also introduces no new dependencies, so I think it is reasonable to integrate and useful to users, especially considering this bug: https://github.com/BerriAI/litellm/issues/11083

Changelog Entry

Description

  • This PR introduces robust integration with external LLM proxies (e.g., LiteLLM) to enable automatic user synchronization and usage budget enforcement. This allows Open WebUI to manage user access and consumption more effectively by leveraging external proxy capabilities.

Added

  • LLM Proxy User Synchronization:
    • New environment variables for configuring user synchronization to an external LLM proxy, including LLM_PROXY_SYNC_USERS (enable/disable), LLM_PROXY_API_BASE_URL, LLM_PROXY_API_KEY, LLM_PROXY_SYNC_USER_ALIAS (control syncing user alias/name), LLM_PROXY_SYNC_TIMEOUT, LLM_PROXY_SYNC_ENDPOINT, and custom keys for user payload fields (LLM_PROXY_SYNC_KEY_USER_ID, LLM_PROXY_SYNC_KEY_USER_EMAIL, LLM_PROXY_SYNC_KEY_USER_ALIAS).
    • A new utility module (backend/open_webui/utils/llm_proxy_sync.py) handling user data synchronization to the proxy via HTTP POST.
    • Integration into the process_chat_payload middleware (backend/open_webui/utils/middleware.py) to automatically sync user information on chat requests. This process is fully isolated and non-blocking, logging any failures without affecting the chat flow.
  • LLM Proxy Budget Enforcement:
    • New environment variables for configuring usage budget enforcement via an external LLM proxy, including LLM_PROXY_BUDGET_ENFORCE (enable/disable), LLM_PROXY_BUDGET_ENDPOINT, LLM_PROXY_BUDGET_HTTP_METHOD (GET/POST), LLM_PROXY_BUDGET_TIMEOUT, LLM_PROXY_BUDGET_CACHE_TTL (cache duration for budget checks), LLM_PROXY_BUDGET_JSON_PATH_SPEND, LLM_PROXY_BUDGET_JSON_PATH_MAX_BUDGET, LLM_PROXY_BUDGET_JSON_PATH_BUDGET_RESET_AT (JSON paths for parsing proxy response), LLM_PROXY_BUDGET_QUERY_PARAM, LLM_PROXY_BUDGET_AUTH_HEADER, LLM_PROXY_BUDGET_EXCEEDED_MSG (custom message for exceeding budget), and LLM_PROXY_BUDGET_BLOCK_ADMINS (option to include/exclude admin users).
    • A new utility module (backend/open_webui/utils/llm_proxy_budget.py) to fetch user budget information from the proxy, resolve JSON paths in the response, and determine if the budget is exceeded.
    • An in-memory cache (_budget_cache) for budget checks to reduce the load on the LLM proxy and improve response times.
    • Integration into the process_chat_payload middleware to check user budgets and block requests with an appropriate message if the budget is exceeded. This check is performed before the chat request is processed.
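Two of the pieces listed above, JSON path resolution and the in-memory cache, might look roughly like the sketch below. The PR description does not state the path syntax, so the dotted form (`user_info.spend`) is an assumption, and `resolve_json_path` and `TTLCache` are hypothetical names rather than the contents of `llm_proxy_budget.py`.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


def resolve_json_path(data: Any, path: str) -> Optional[Any]:
    """Walk a dotted path such as 'user_info.spend'; return None if a step is missing."""
    current = data
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            current = current[int(key)]
        else:
            return None
    return current


class TTLCache:
    """Tiny per-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
```

Because a cached "not over budget" result is `False`, callers should test `cache.get(...) is None` to detect a miss. A cache like this is per process, so each worker in a multi-worker deployment would keep its own copy.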

Changed

  • N/A

Deprecated

  • N/A

Removed

  • N/A

Fixed

  • N/A

Security

  • N/A

Breaking Changes

  • BREAKING CHANGE: N/A (All new functionality is opt-in via environment variables and does not alter existing behavior if not enabled).

Additional Information

  • Both user sync and budget enforcement are designed to be fail-safe. If the LLM proxy is unreachable or returns an error, user sync failures are logged and do not block the user's request. Budget enforcement will default to "not over budget" in case of errors, preventing legitimate users from being blocked due to proxy issues.
  • The budget enforcement logic includes consideration for budget_reset_at timestamps to intelligently handle scenarios where a budget might have reset but the spend value from the proxy hasn't updated yet.
  • The blocking HTTP calls for both sync and budget checks are offloaded to a worker thread via asyncio.to_thread so that the main event loop is not stalled.
  • Tested manually: the budget in the LLM proxy was first set to a very low value (request blocked, as expected), then raised to a higher value (request allowed, as expected).
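The reset handling and the thread offloading described above could be combined along these lines. This is one plausible reading, not the PR's actual logic: it assumes `budget_reset_at` is an ISO 8601 timestamp and that a reset time already in the past means the reported spend is stale. `budget_exceeded`, `blocking_budget_lookup`, and `check_user` are hypothetical names, and the lookup is a stub standing in for the real HTTP call.

```python
import asyncio
import time
from datetime import datetime, timezone
from typing import Optional


def budget_exceeded(spend: float, max_budget: Optional[float],
                    budget_reset_at: Optional[str],
                    now: Optional[datetime] = None) -> bool:
    """Decide whether to block, ignoring spend that predates a due reset."""
    if max_budget is None:
        return False
    if budget_reset_at:
        now = now or datetime.now(timezone.utc)
        reset_at = datetime.fromisoformat(budget_reset_at.replace("Z", "+00:00"))
        if reset_at.tzinfo is None:
            reset_at = reset_at.replace(tzinfo=timezone.utc)  # assume UTC if naive
        if reset_at <= now:
            return False  # reset is due, so the reported spend is presumably stale
    return spend >= max_budget


def blocking_budget_lookup(user_id: str) -> dict:
    """Stub for the blocking HTTP request to the proxy."""
    time.sleep(0.05)
    return {"spend": 12.0, "max_budget": 10.0, "budget_reset_at": None}


async def check_user(user_id: str) -> bool:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    info = await asyncio.to_thread(blocking_budget_lookup, user_id)
    return budget_exceeded(info["spend"], info["max_budget"], info["budget_reset_at"])
```

`asyncio.to_thread` requires Python 3.9 or later.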

Screenshots or Videos

Screenshot: owui-pr-usage-exceeded (https://github.com/user-attachments/assets/a8eb6ebb-b4ca-4663-87ed-d820c36b7042)

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

📝 Commits (10+)

  • fe6783c Merge pull request #19030 from open-webui/dev
  • fc05e0a Merge pull request #19405 from open-webui/dev
  • e3faec6 Merge pull request #19416 from open-webui/dev
  • 9899293 Merge pull request #19448 from open-webui/dev
  • 140605e Merge pull request #19462 from open-webui/dev
  • 6f1486f Merge pull request #19466 from open-webui/dev
  • d95f533 Merge pull request #19729 from open-webui/dev
  • a727153 0.6.43 (#20093)
  • 6adde20 Merge pull request #20394 from open-webui/dev
  • f9b0534 Merge pull request #20522 from open-webui/dev
GiteaMirror added the pull-request label 2026-04-30 01:25:07 -05:00
Reference: github-starred/open-webui#49106