mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-22 06:02:06 -05:00
refac: ollama separate endpoint mode #2186
Originally created by @rampageservices on GitHub (Sep 24, 2024).
Is your feature request related to a problem? Please describe.
I'd like to use the same Open WebUI instance with two different Ollama servers, one remote and one local. The remote server may not always have a connection. If the remote server is inaccessible and both local and remote API endpoint connections are set in the admin settings, Open WebUI becomes unresponsive for some time until it realizes the connection does not exist. I then have to navigate through multiple windows to disable the remote server.
Describe the solution you'd like
I'd like the model selector to show me which endpoints are active and which models are available from each of them.
Describe alternatives you've considered
I'm considering running two Open WebUI instances, one solely for the remote server and one solely for the local one, so that my productivity isn't impeded when I just need to try something local and offline.
Additional context
Open WebUI has the ability to load-balance across endpoints, but if I am using the remote server for larger models, a request may be routed to the local server instead, which also costs me productivity.
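The unresponsiveness described above stems from waiting on a dead endpoint without a short timeout. As a minimal sketch of the workaround, the snippet below probes each configured Ollama server's real `/api/tags` endpoint with a tight timeout before treating it as available; the endpoint URLs are illustrative placeholders, not values from this thread:

```python
# Hedged sketch: probe Ollama endpoints with a short timeout so an
# unreachable remote server cannot stall the client for long.
import json
import urllib.request
from urllib.error import URLError

# Example endpoints only; substitute your own local/remote Ollama URLs.
ENDPOINTS = [
    "http://localhost:11434",       # local Ollama
    "http://remote.example:11434",  # remote Ollama (may be offline)
]

def probe(base_url: str, timeout: float = 2.0):
    """Return the model names served by this endpoint, or None if unreachable."""
    try:
        # GET /api/tags lists the models an Ollama server has available.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (URLError, TimeoutError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or bad JSON: treat as offline.
        return None

if __name__ == "__main__":
    for url in ENDPOINTS:
        models = probe(url)
        status = "offline" if models is None else f"{len(models)} models"
        print(f"{url}: {status}")
```

Running such a check before sending a request (or periodically in the background) lets a client mark an endpoint offline within the timeout window instead of hanging on it.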
@rgaricano commented on GitHub (Oct 6, 2024):
I'm not very familiar with that project, but I think pipelines are designed for these functions:
create a pipeline with a valve for each Ollama connection ( https://github.com/open-webui/pipelines/blob/main/examples/pipelines/providers/ollama_manifold_pipeline.py ) and monitor the server processes (as is done in https://github.com/open-webui/pipelines/blob/main/examples/pipelines/providers/mlx_manifold_pipeline.py ) for online/offline detection, more specific balancing, etc.
@tjbck commented on GitHub (Nov 12, 2024):
You can now disable the default Ollama load balancing by specifying a prefix ID for each endpoint.
Testing wanted here!
@rampageservices commented on GitHub (Nov 12, 2024):
This really made a bad day brighter. Thank you so much. I'll be sure to provide feedback soon.
@bkev commented on GitHub (Nov 12, 2024):
Is this available in the Docker image? I've tried updating it, but I can't see those options.