Mirror of https://github.com/open-webui/open-webui.git (synced 2026-03-17)
Issue: The "Keep Alive" setting has no effect (#4428)
Originally created by @FlippingBinary on GitHub (Mar 14, 2025).
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.5.20
Ollama Version (if applicable)
v0.6.0
Operating System
Windows 11
Browser (if applicable)
Vivaldi 7.1.3570.39 (Stable channel) (64-bit)
Confirmation: read and followed the instructions in README.md.

Expected Behavior
Settings -> General -> Advanced Parameters -> Show -> Keep Alive -> Custom -> `60m` should cause open-webui to send a `keep_alive` parameter in requests to the `/api/generate` and `/api/chat` endpoints.

Actual Behavior
The parameter is absent from requests and does not modify Ollama's behavior. This can be verified by examining the request body in the networking tab of the browser's developer tools, and from the CLI with `ollama ps`, which shows the time to live for any loaded models.

Steps to Reproduce
Set Keep Alive (Settings -> General -> Advanced Parameters -> Show -> Keep Alive -> Custom) to `60m` or `-1`.

Logs & Screenshots
Even though Keep Alive is set to `60m` and the response is actively being generated while calling `ollama ps`:

(screenshot omitted)

The network tab shows the request payload:

(screenshot omitted)
Additional Information
Before opening this issue, I searched the repository for the code that sends the parameter to Ollama because I was trying to understand why it's not working. I quickly found pull request #721, which seems to add the capability in its initial commit `7053f2f67d`. It modified `generatePrompt` to take a body object and pass it to Ollama. It also modified `src/lib/components/chat/MessageInput/Models.svelte` to set the parameter in the body when calling `generatePrompt`.

In that same pull request, `3057bfe5a0` reverted those changes, and I can't find any reason why. It looks like that's why the keep alive setting has never changed Ollama's behavior, even though that PR added the setting to open-webui's interface.

After #721 was merged, @jupiterbjy, @LoadingCode233, @SteamNimmersatt, and @tinglion all commented that it wasn't working for them.
Would a PR fixing this be welcome? I feel like I'm missing something like maybe there are bigger changes in the works.
@foraxe commented on GitHub (Mar 14, 2025):
I just created a discussion about this issue before making a pull request:
https://github.com/open-webui/open-webui/discussions/11690
The problem with the current code is that the `keep_alive` config is not correctly sent to Ollama.

I'm now using a uv-installed Open WebUI. A temporary fix is to manually modify the payload code.
@FlippingBinary commented on GitHub (Mar 14, 2025):
Ahh, sorry @foraxe. I didn't notice you created a discussion while I was writing up this issue. Now I see the call to `/api/chat/completions` does send `params.keep_alive` in the request payload, according to the browser. So that should just need to be moved up a level, like you wrote in your discussion, but I'm not sure where it's getting set.
@FlippingBinary commented on GitHub (Mar 14, 2025):
Wait, that's what's getting sent to open-webui, which is okay the way it is. The problem is open-webui isn't passing that parameter on to the Ollama backend. It's not going to be visible in the browser tools.
@foraxe commented on GitHub (Mar 14, 2025):
Hi @FlippingBinary, I am still testing the fix.

For a temporary fix, you can modify `backend/open_webui/routers/ollama.py`. In the function

```python
@router.post("/api/chat")
@router.post("/api/chat/{url_idx}")
async def generate_chat_completion(
```

add `payload["keep_alive"] = -1  # keep alive forever` before the `return await send_post_request(` call, like:

```python
@router.post("/api/chat")
@router.post("/api/chat/{url_idx}")
async def generate_chat_completion(
    request: Request,
    form_data: dict,
    ...
):
    ...
    payload["keep_alive"] = -1  # keep alive forever
    return await send_post_request(
        url=f"{url}/api/chat",
        payload=json.dumps(payload),
        stream=form_data.stream,
        key=get_api_key(url_idx, url, request.app.state.config.OLLAMA_API_CONFIGS),
        content_type="application/x-ndjson",
        user=user,
    )
```
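Rather than hardcoding `-1`, the same spot could forward whatever value the client sent. A minimal sketch, assuming the client's value arrives in the form data — the helper name and field layout here are hypothetical, not open-webui's actual code:

```python
# Hypothetical helper: copy a client-supplied keep_alive into the payload
# forwarded to Ollama, instead of hardcoding it. Field names mirror the
# discussion above, not open-webui's actual code.
def apply_keep_alive(payload: dict, form_data: dict) -> dict:
    keep_alive = form_data.get("keep_alive")
    if keep_alive is not None:
        # Ollama accepts durations like "60m" or -1 (keep loaded forever).
        payload["keep_alive"] = keep_alive
    return payload
```

Called just before `send_post_request`, something like this would let the UI's Keep Alive setting actually reach Ollama, while leaving requests that omit the parameter untouched.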
@FlippingBinary commented on GitHub (Mar 14, 2025):
Setting environment variables for Ollama is another way to set a more useful default than 5 minutes. I imagine most people are using a workaround like this:
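The original comment referenced an example here; presumably something along these lines, using Ollama's documented `OLLAMA_KEEP_ALIVE` variable (the Docker invocation below is a hypothetical illustration, not taken from the thread):

```shell
# Set Ollama's default keep-alive via environment variable.
# Docker (official ollama/ollama image assumed):
docker run -d -e OLLAMA_KEEP_ALIVE=60m -p 11434:11434 ollama/ollama

# Native install: export before starting the server.
export OLLAMA_KEEP_ALIVE=60m
ollama serve
```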
It would be nice if the keep alive parameter could be set on a per-model basis, though, especially if the administrator can set boundaries on upper and lower values. That would meet the need described in #3284.
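For illustration, admin-bounded per-model keep-alive could amount to clamping the requested value between configured limits. A minimal sketch — function names, bounds, and defaults are all hypothetical:

```python
# Hypothetical sketch of admin-bounded keep_alive: parse an Ollama-style
# duration string and clamp it to administrator-configured limits.
import re

UNITS = {"s": 1, "m": 60, "h": 3600}

def parse_duration(value):
    """Return seconds for strings like '60m'; pass ints through (-1 = forever)."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)([smh])", value)
    if not match:
        raise ValueError(f"unrecognized duration: {value!r}")
    return int(match.group(1)) * UNITS[match.group(2)]

def clamp_keep_alive(value, min_s=60, max_s=4 * 3600):
    seconds = parse_duration(value)
    if seconds < 0:  # -1 means "keep loaded forever"; cap it at the admin max
        return max_s
    return max(min_s, min(seconds, max_s))
```

With bounds like these, a user asking for `60m` gets it, while `-1` is capped at the administrator's maximum instead of pinning the model in memory forever.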
@foraxe commented on GitHub (Mar 15, 2025):
In ef378ad673/docs/faq.md (L253): `OLLAMA_KEEP_ALIVE` is overridden by the POST message from owui, so using environment variables as a workaround here is not effective. `OLLAMA_KEEP_ALIVE` can also be overridden by other programs' behavior; in such scenarios, it would add more complexity to owui.

@tjbck commented on GitHub (Mar 15, 2025):
PR merged, Thanks!