[PR #22373] [CLOSED] feat: add utilities and deployment scaffolding for external MedGemma model server #26641
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/22373
Author: @cadeferg
Created: 3/7/2026
Status: ❌ Closed
Base: dev ← Head: medgemma-runpod-setup

📝 Commits (10+)
fe6783c Merge pull request #19030 from open-webui/dev
fc05e0a Merge pull request #19405 from open-webui/dev
e3faec6 Merge pull request #19416 from open-webui/dev
9899293 Merge pull request #19448 from open-webui/dev
140605e Merge pull request #19462 from open-webui/dev
6f1486f Merge pull request #19466 from open-webui/dev
d95f533 Merge pull request #19729 from open-webui/dev
a727153 0.6.43 (#20093)
6adde20 Merge pull request #20394 from open-webui/dev
f9b0534 Merge pull request #20522 from open-webui/dev

📊 Changes
14 files changed (+280 additions, -25 deletions)
📝 .env.example (+27 -1)
📝 Dockerfile (+1 -0)
📝 README.md (+7 -0)
📝 backend/open_webui/config.py (+13 -13)
📝 backend/open_webui/main.py (+5 -0)
➕ backend/open_webui/prompts/medical_simplification.txt (+22 -0)
➕ backend/open_webui/utils/document_chunking.py (+21 -0)
➕ backend/open_webui/utils/runpod_idle_shutdown.py (+126 -0)
➕ deploy/runpod-model/.env.example (+4 -0)
➕ deploy/runpod-model/README.md (+11 -0)
➕ deploy/runpod-model/run_vllm.sh (+12 -0)
➕ desktop.ini (+2 -0)
➕ docker-compose.medgemma-dev.yml (+27 -0)
📝 docker-compose.yaml (+2 -11)

📄 Description
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request.
Before submitting, make sure you've checked the following:
- Target branch: the pull request targets the dev branch
- Conventional commit type: chore

Description
This PR introduces backend utilities and deployment scaffolding to support running Open WebUI with an external HuggingFace-hosted MedGemma model server (e.g., served via vLLM on infrastructure such as RunPod).
The goal is to prepare the repository for integration with external LLM inference services rather than assuming local Ollama-based models.
These changes are non-breaking and primarily add optional utilities and configuration scaffolding that can be used in external deployments.
Key motivations: enabling Open WebUI to talk to external, OpenAI-compatible inference backends (such as a vLLM-served MedGemma) without disrupting the default local setup.
None of these additions modify existing model providers or affect default Open WebUI behavior.
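
For context, here is a minimal sketch of the kind of request a client (Open WebUI included) would send to such a server, assuming vLLM is launched with its OpenAI-compatible API (e.g., via `vllm serve`). The base URL, API key, and model name below are illustrative placeholders, not values taken from this PR:

```python
import requests

# Illustrative placeholders -- in a real deployment these would come from
# environment configuration, not be hard-coded.
BASE_URL = "https://<pod-id>-8000.proxy.runpod.net/v1"  # vLLM's OpenAI-compatible API
API_KEY = "sk-placeholder"                              # key vLLM was started with (--api-key)
MODEL = "google/medgemma-4b-it"                         # example MedGemma checkpoint

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": MODEL,
        "messages": [
            {"role": "user", "content": "Explain this lab result in plain language: ..."}
        ],
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```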
Changelog Entry
Description
Adds backend utilities and deployment scaffolding to support external model serving (e.g., MedGemma via vLLM) and infrastructure-aware deployments such as GPU spot instances.
Added
- runpod_idle_shutdown.py: utility to support optional automatic shutdown of idle GPU pods (see the sketch just below)
- document_chunking.py: helper for splitting long medical documents into manageable chunks (a sketch appears under Additional Information)
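
For reviewers, a rough sketch of the idle-shutdown pattern this utility implements; this is not the PR's actual runpod_idle_shutdown.py. The RUNPOD_IDLE_TIMEOUT and RUNPOD_API_KEY variable names and the RunPod GraphQL podStop mutation are assumptions to verify against RunPod's documentation (RUNPOD_POD_ID, however, is set automatically inside RunPod pods):

```python
import os
import threading
import time

import requests

IDLE_TIMEOUT_S = int(os.environ.get("RUNPOD_IDLE_TIMEOUT", "900"))  # hypothetical var name

_last_activity = time.monotonic()


def touch() -> None:
    """Mark the pod as active; call this from request-handling middleware."""
    global _last_activity
    _last_activity = time.monotonic()


def _stop_pod(pod_id: str, api_key: str) -> None:
    # Assumption: RunPod's public GraphQL API exposes a podStop mutation;
    # check the endpoint and schema against RunPod's docs before relying on it.
    resp = requests.post(
        "https://api.runpod.io/graphql",
        params={"api_key": api_key},
        json={"query": f'mutation {{ podStop(input: {{podId: "{pod_id}"}}) {{ id }} }}'},
        timeout=30,
    )
    resp.raise_for_status()


def watch(poll_s: float = 60.0) -> None:
    """Background loop: stop the pod once it has been idle past the threshold."""
    pod_id = os.environ["RUNPOD_POD_ID"]    # set automatically inside RunPod pods
    api_key = os.environ["RUNPOD_API_KEY"]  # hypothetical variable name
    while True:
        time.sleep(poll_s)
        if time.monotonic() - _last_activity > IDLE_TIMEOUT_S:
            _stop_pod(pod_id, api_key)
            return


# Start the watcher as a daemon thread at application startup:
threading.Thread(target=watch, daemon=True).start()
```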
Changed
- .env.example: optional configuration variables for external model servers and GPU deployment environments

Deprecated
Removed
Fixed
Security
Breaking Changes
Additional Information
These additions are intended for deployments where Open WebUI interacts with an external model inference server (e.g., HuggingFace models served through vLLM).
The utilities are optional and do not affect existing Open WebUI functionality.
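As referenced in the changelog above, here is a minimal sketch of the paragraph-packing strategy a helper like document_chunking.py might use; the actual helper in this PR may differ:

```python
def chunk_document(text: str, max_chars: int = 4000) -> list[str]:
    """Split a long document into chunks of at most max_chars characters,
    preferring paragraph boundaries so medical context stays intact."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Start a new chunk if adding this paragraph would exceed the budget.
        # Note: a single paragraph longer than max_chars still becomes one
        # oversized chunk in this sketch.
        if current and size + len(para) + 2 > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += len(para) + 2
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```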
Future work may include:
Screenshots or Videos
N/A — backend utilities and deployment scaffolding only.
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.