[GH-ISSUE #4414] Title auto-generation does not respect keep_alive setting #29128

New Issue

GiteaMirror · 2026-04-25T03:34:56-05:00

GiteaMirror commented

2026-04-25 03:34:56 -05:00

Originally created by @amirhossein-ka on GitHub (Aug 7, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/4414

Bug Report

Description

Bug Summary:
When using keep alive setting to change time that models is loaded on gpu (in my case -1), on the new chats on the first prompt the model reloads and its keep alive time resets to 5 min, leading to performance issues.

Steps to Reproduce:

Change keep alive setting to some other value than 5m.
Start a new chat and give a prompt to model.
Monitor the model load keep alive time using watch -n 0.1 ollama ps or nvtop to check gpu memory usage.

Expected Behavior:
Title auto-generation should respect keep-alive setting and do not cause to models get reloaded.

Actual Behavior:
This feature ignores keep alive setting.

Environment

Open WebUI Version: v0.3.10
Ollama (if applicable): 0.3.0
Operating System: Ubuntu 24.04

Reproduction Details

Confirmation:

I have read and followed all the instructions provided in the README.md.
I am on the latest version of both Open WebUI and Ollama.
I have included the browser console logs.
I have included the Docker container logs.

Logs and Screenshots

Screenshots (if applicable):

Installation Method

open-webui installed using docker and ollama is installed directly on host machine.

Additional Information

[Include any additional details that may help in understanding and reproducing the issue. This could include specific configurations, error messages, or anything else relevant to the bug.]

Originally created by @amirhossein-ka on GitHub (Aug 7, 2024). Original GitHub issue: https://github.com/open-webui/open-webui/issues/4414 # Bug Report ## Description **Bug Summary:** When using keep alive setting to change time that models is loaded on gpu (in my case -1), on the new chats on the first prompt the model reloads and its keep alive time resets to 5 min, leading to performance issues. **Steps to Reproduce:** 1. Change keep alive setting to some other value than 5m. 2. Start a new chat and give a prompt to model. 3. Monitor the model load keep alive time using `watch -n 0.1 ollama ps` or `nvtop` to check gpu memory usage. **Expected Behavior:** Title auto-generation should respect keep-alive setting and do not cause to models get reloaded. **Actual Behavior:** This feature ignores keep alive setting. ## Environment - **Open WebUI Version:** v0.3.10 - **Ollama (if applicable):** 0.3.0 - **Operating System:** Ubuntu 24.04 ## Reproduction Details **Confirmation:** - [x] I have read and followed all the instructions provided in the README.md. - [x] I am on the latest version of both Open WebUI and Ollama. - [ ] I have included the browser console logs. - [ ] I have included the Docker container logs. ## Logs and Screenshots **Screenshots (if applicable):** ![image](https://github.com/user-attachments/assets/1fe32a91-12ef-43c3-865c-5d7a54e921a3) ## Installation Method open-webui installed using docker and ollama is installed directly on host machine. ## Additional Information [Include any additional details that may help in understanding and reproducing the issue. This could include specific configurations, error messages, or anything else relevant to the bug.]

GiteaMirror closed this issue

2026-04-25 03:34:56 -05:00

Sign in to join this conversation.

Branches Tags

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/open-webui#29128