[GH-ISSUE #9174] Add option to ignore 'keep_alive' in request body #5974

Open
opened 2026-04-12 17:19:37 -05:00 by GiteaMirror · 0 comments

Originally created by @sjoerdvandenbos-prodrive on GitHub (Feb 17, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9174

When hosting Ollama in production, we would like to prevent users from unloading the model from the GPU. Currently, whenever users talk to the model through the Continue extension, their requests reset `keep_alive` to 5 minutes.

It would be nice to have an environment variable like `IGNORE_KEEP_ALIVE_REQUESTS=1`, so that the `OLLAMA_KEEP_ALIVE=-1` we set in the container is not overridden by the `keep_alive` value in a request body.
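To make the request concrete, here is a minimal sketch of the behavior being asked for. It is not Ollama's implementation: `IGNORE_KEEP_ALIVE_REQUESTS` is only the name proposed in this issue, and the helpers `effective_keep_alive` and `strip_keep_alive` are hypothetical. The second helper shows how a proxy placed in front of the server could presumably get the same effect by dropping the field before forwarding the request.

```python
import json


def effective_keep_alive(request_value, server_default, ignore_requests):
    """Choose the keep_alive to apply to a request (hypothetical helper).

    request_value:   keep_alive from the request body, or None if absent
    server_default:  value of OLLAMA_KEEP_ALIVE, e.g. "-1"
    ignore_requests: True when the proposed IGNORE_KEEP_ALIVE_REQUESTS=1 is set
    """
    if ignore_requests or request_value is None:
        return server_default
    return request_value


def strip_keep_alive(body: bytes) -> bytes:
    """Remove a top-level keep_alive field from a JSON request body.

    Sketch of a proxy-side workaround; assumes the body is valid JSON.
    """
    payload = json.loads(body)
    if isinstance(payload, dict):
        payload.pop("keep_alive", None)
    return json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    # A client asks for 5 minutes, but the server is configured to ignore it.
    print(effective_keep_alive("5m", "-1", ignore_requests=True))
    print(strip_keep_alive(b'{"model": "example", "keep_alive": "5m"}'))
```

With the flag enabled, the client's `keep_alive` is discarded and the server default always applies, so a client can no longer shorten the time the model stays loaded.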

GiteaMirror added the feature request label 2026-04-12 17:19:37 -05:00

Reference: github-starred/ollama#5974