feat: faster /model route on webui webpage load. #5711

Closed
opened 2025-11-11 16:30:57 -06:00 by GiteaMirror · 1 comment
Owner

Originally created by @eregnier on GitHub (Jul 6, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

Hi folks,

Thank you for this amazing project.

I have an issue where the /api/model query is slow: it takes around 20 s each time it is called.
This makes the web UI hang for that long on webapp startup, because the main chat panel seems to wait for model availability before displaying anything.

All other queries are fast,
my local server is fast (< 100 ms),
and my network (LAN/WAN) connections are fast (1 Gbps, < 100 ms to 1.1.1.1).

This is not a hardware issue. As a hint, though: I have two custom connections created from the admin panel to:

  • an ollama instance on my windows gaming computer
  • an ollama instance on my linux laptop

I do not keep all of them up at all times, because I use my Open WebUI from one of these devices, from my phone, or even from any other device/location.

I suspect the /api/model route is slow because it waits for all AI providers to answer and/or time out (e.g. for my Ollama instances on devices that are shut down).
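If the slowdown really does come from the backend waiting on every configured provider before responding, one common mitigation is to query all providers concurrently with a short per-provider timeout, so an unreachable host only costs the timeout rather than a long connection hang. A minimal sketch of that idea (`fetch_models`, `list_models`, and the simulated delays are illustrative names, not Open WebUI's actual code):

```python
import asyncio

async def fetch_models(provider: str, delay: float) -> list[str]:
    # Simulates a provider's model-list call; a powered-off host would
    # hang until the connection times out (the suspected 20 s stall).
    await asyncio.sleep(delay)
    return [f"{provider}:model"]

async def list_models(providers: dict[str, float], timeout: float = 1.0) -> list[str]:
    # Query every provider concurrently, capping each request with a
    # short timeout so one unreachable Ollama host cannot block the rest.
    async def guarded(name: str, delay: float) -> list[str]:
        try:
            return await asyncio.wait_for(fetch_models(name, delay), timeout)
        except asyncio.TimeoutError:
            return []  # skip providers that do not answer in time

    results = await asyncio.gather(*(guarded(n, d) for n, d in providers.items()))
    return [m for models in results for m in models]

models = asyncio.run(list_models({"openai": 0.01, "ollama-offline": 5.0}, timeout=0.1))
print(models)  # prints ['openai:model'] — the offline provider is skipped
```

With this shape, the route's latency is bounded by the per-provider timeout instead of the slowest unreachable host.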

I wonder if it would be possible to have an option, or a default behavior, to keep the model list in local storage (or something like this) until the real response comes, so the app loads and displays quickly and I can start typing.

I think typical user behavior is to jump into the web UI, type for a few seconds more or less carefully, then submit the AI request (with text and/or media).

Most of the time (at least for me) I want to type something quickly, without waiting, then submit to the default provider, which in my case is OpenAI: an "always up" AI provider that I should not have to wait for.

All of these behaviors (I guess I am not alone in behaving like this) lead to the following thoughts:

  • It should be possible to move the "wait for models" operation to the moment the AI input is submitted, so that in between, users have time to write with no wait while models are being retrieved.
  • Or assume the user will send to the default AI provider, which they know is up when they reach the web UI; this could mean caching things until a true response from /api/model arrives.
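The caching idea above amounts to a stale-while-revalidate pattern: serve whatever model list is already known immediately, and refresh it in the background once the real response arrives. A hypothetical Python sketch of that behavior (`ModelListCache` and its `fetch` callback are made-up names for illustration, not Open WebUI's API):

```python
import threading
import time

class ModelListCache:
    """Serve the last known model list instantly; refresh in the background."""

    def __init__(self, fetch, ttl: float = 60.0):
        self._fetch = fetch          # the slow, provider-polling call
        self._ttl = ttl
        self._models: list[str] = []
        self._fetched_at = 0.0
        self._lock = threading.Lock()

    def _refresh(self) -> None:
        models = self._fetch()
        with self._lock:
            self._models = models
            self._fetched_at = time.monotonic()

    def get(self) -> list[str]:
        with self._lock:
            stale = time.monotonic() - self._fetched_at > self._ttl
            cached = list(self._models)
        if stale:
            # Kick off the refresh without blocking the caller; the UI can
            # render with the cached (possibly empty) list right away.
            threading.Thread(target=self._refresh, daemon=True).start()
        return cached

cache = ModelListCache(lambda: ["gpt-4o"], ttl=60.0)
first = cache.get()       # [] — nothing cached yet, refresh starts in background
time.sleep(0.3)
second = cache.get()      # ["gpt-4o"] — the refreshed list
```

The first page load would show the cached (possibly empty) list instantly instead of blocking for 20 s, and subsequent loads would show the last successful result.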

All of this is just guesses and proposals; I will let you think about how to improve this, in case it can be handled any time soon.

Thank you for any help, support, or suggestions toward a more reactive, ready-to-answer, awesome Open WebUI.

Desired Solution you'd like

A faster / non-blocking Open WebUI homepage load, so I can start typing right away.

Alternatives Considered

Cache the model list, or allow input instantly by assuming the default provider is an "always on" one (or add an option to do so).

Additional Context

No response

Author
Owner

@tjbck commented on GitHub (Jul 7, 2025):

You can either specify model ids directly from the connections or enable model list cache (available in our dev branch).

Reference: github-starred/open-webui#5711