[GH-ISSUE #4414] Title auto-generation does not respect keep_alive setting #29128

Closed
opened 2026-04-25 03:34:56 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @amirhossein-ka on GitHub (Aug 7, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/4414

Bug Report

Description

Bug Summary:
When using keep alive setting to change time that models is loaded on gpu (in my case -1), on the new chats on the first prompt the model reloads and its keep alive time resets to 5 min, leading to performance issues.

Steps to Reproduce:

  1. Change keep alive setting to some other value than 5m.
  2. Start a new chat and give a prompt to model.
  3. Monitor the model load keep alive time using watch -n 0.1 ollama ps or nvtop to check gpu memory usage.

Expected Behavior:
Title auto-generation should respect keep-alive setting and do not cause to models get reloaded.

Actual Behavior:
This feature ignores keep alive setting.

Environment

  • Open WebUI Version: v0.3.10

  • Ollama (if applicable): 0.3.0

  • Operating System: Ubuntu 24.04

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Logs and Screenshots

Screenshots (if applicable):
image

Installation Method

open-webui installed using docker and ollama is installed directly on host machine.

Additional Information

[Include any additional details that may help in understanding and reproducing the issue. This could include specific configurations, error messages, or anything else relevant to the bug.]

Originally created by @amirhossein-ka on GitHub (Aug 7, 2024). Original GitHub issue: https://github.com/open-webui/open-webui/issues/4414 # Bug Report ## Description **Bug Summary:** When using keep alive setting to change time that models is loaded on gpu (in my case -1), on the new chats on the first prompt the model reloads and its keep alive time resets to 5 min, leading to performance issues. **Steps to Reproduce:** 1. Change keep alive setting to some other value than 5m. 2. Start a new chat and give a prompt to model. 3. Monitor the model load keep alive time using `watch -n 0.1 ollama ps` or `nvtop` to check gpu memory usage. **Expected Behavior:** Title auto-generation should respect keep-alive setting and do not cause to models get reloaded. **Actual Behavior:** This feature ignores keep alive setting. ## Environment - **Open WebUI Version:** v0.3.10 - **Ollama (if applicable):** 0.3.0 - **Operating System:** Ubuntu 24.04 ## Reproduction Details **Confirmation:** - [x] I have read and followed all the instructions provided in the README.md. - [x] I am on the latest version of both Open WebUI and Ollama. - [ ] I have included the browser console logs. - [ ] I have included the Docker container logs. ## Logs and Screenshots **Screenshots (if applicable):** ![image](https://github.com/user-attachments/assets/1fe32a91-12ef-43c3-865c-5d7a54e921a3) ## Installation Method open-webui installed using docker and ollama is installed directly on host machine. ## Additional Information [Include any additional details that may help in understanding and reproducing the issue. This could include specific configurations, error messages, or anything else relevant to the bug.]
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#29128