[GH-ISSUE #6452] Ollama load balancing doesn't support different model names with the same ID. #53038

Closed
opened 2026-05-05 14:15:00 -05:00 by GiteaMirror · 1 comment

Originally created by @haydonryan on GitHub (Oct 26, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/6452

Bug Report

Probably also related to https://github.com/open-webui/open-webui/issues/1081

Installation Method

docker-compose.

Environment

  • Open WebUI Version: v0.3.33

  • Ollama (if applicable): 0.3.6

  • Operating System: Arch Linux

  • Browser (if applicable): Brave

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

I expect different model names to appear as different models in the Open WebUI dropdown.

Actual Behavior:

Open WebUI sees these as the same model (because they are, just aliased). I think this is because the ID digests are the same: Open WebUI ignores the model names and combines the two. However, this results in a 404, because only one model name ends up associated with that ID. When Open WebUI load-balances to the other server, the model can't be found there, since the call to Ollama uses the model name, not the ID.
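The merge-by-digest behaviour described above can be illustrated with a minimal Python sketch (hypothetical, not Open WebUI's actual code): if the aggregator keys models by digest, only one of the two alias names survives, and that single name is then sent to whichever server the balancer picks.

```python
# Hypothetical sketch of deduplicating models by digest across servers.
# Server names and model entries match the reproduction in this issue.

servers = {
    "epyc":    [{"name": "codestral:22b-v0.1-q8_0-cpu-16",  "digest": "8dde0029a91f"}],
    "desktop": [{"name": "codestral:22b-v0.1-q8_0-desktop", "digest": "8dde0029a91f"}],
}

merged = {}
for host, models in servers.items():
    for m in models:
        # Keyed by digest: the first alias seen wins, the second name is dropped,
        # but both hosts are still registered as serving this digest.
        merged.setdefault(m["digest"], {"name": m["name"], "hosts": []})
        merged[m["digest"]]["hosts"].append(host)

entry = merged["8dde0029a91f"]
print(entry["name"])   # only one of the two aliases survives
print(entry["hosts"])  # yet requests are balanced across both hosts

# If the balancer picks "desktop" but sends the surviving name
# "codestral:22b-v0.1-q8_0-cpu-16", that server answers 404 because it
# only knows the model under its own alias.
```

This would explain both symptoms at once: a single dropdown entry, and a 404 whenever the request lands on the server whose alias was dropped.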

Description

Bug Summary:
Custom model names do not work correctly with load balancing in Open WebUI.

Reproduction Details

Steps to Reproduce:

I have two machines running Ollama servers: a 16-core Epyc box that only has a CPU, and a desktop with a 3090 in it. I used `ollama cp` to rename the models so that, if the desktop was on, I could choose the fast option.

Epyc:
NAME                             ID            SIZE
codestral:22b-v0.1-q8_0-cpu-16   8dde0029a91f  23 GB

Desktop:
NAME                             ID            SIZE
codestral:22b-v0.1-q8_0-desktop  8dde0029a91f  23 GB
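The renames above could have been produced like this (a reconstruction; the original source tag is an assumption, only the destination tags are taken from the listing):

```shell
# On the Epyc box (CPU only) — assumed original tag codestral:22b-v0.1-q8_0:
ollama cp codestral:22b-v0.1-q8_0 codestral:22b-v0.1-q8_0-cpu-16

# On the desktop (3090):
ollama cp codestral:22b-v0.1-q8_0 codestral:22b-v0.1-q8_0-desktop

# ollama cp only adds a new tag pointing at the same manifest, which is
# why both names report the same ID (8dde0029a91f) in `ollama list`.
ollama list
```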

Logs and Screenshots

Screenshots/Screen Recordings (if applicable):
Screenshot From 2024-10-26 12-33-55: https://github.com/user-attachments/assets/47a22380-e5c0-4f2b-9355-9c6604eb7bde

Additional Information

The only reason I'm using custom model names is that Ollama server load balancing has no way to prefer one server over another (or to route to the more performant server).

My needs are simple, so I'd be happy with "if the desktop is offline, use the CPU server".
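That simple "prefer desktop, fall back to CPU" policy could be sketched outside Open WebUI as a tiny preference-ordered health check (a hypothetical helper, not an existing Open WebUI or Ollama feature; the hostnames are assumptions):

```python
# Hypothetical fallback picker: prefer the GPU box, fall back to the CPU box
# when the preferred server is unreachable.
import urllib.error
import urllib.request

SERVERS = [
    "http://desktop:11434",  # preferred (3090)
    "http://epyc:11434",     # fallback (CPU only)
]

def is_up(base_url: str, timeout: float = 1.0) -> bool:
    """Treat any 200 response from the server root as 'Ollama is running'."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def pick_server(servers=SERVERS, probe=is_up) -> str:
    """Return the first reachable server, trying them in preference order."""
    for url in servers:
        if probe(url):
            return url
    raise RuntimeError("no Ollama server reachable")
```

Requests would then always carry the alias that the chosen server actually knows, sidestepping the 404 entirely.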


@tjbck commented on GitHub (Oct 26, 2024):

Ollama load balancing will be deprecated in favour of #5680 in the near future! Stay tuned!


Reference: github-starred/open-webui#53038