[GH-ISSUE #3499] OLLAMA_INITIAL_MODEL for use with OLLAMA_KEEP_ALIVE=-1 #2155

Closed
opened 2026-04-12 12:22:58 -05:00 by GiteaMirror · 4 comments

Originally created by @BananaAcid on GitHub (Apr 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3499

What are you trying to do?

It would be nice to be able to load a model at startup via an environment variable like `OLLAMA_INITIAL_MODEL`, in conjunction with the `keep_alive=-1` option, so that Ollama starts up ready to go on slow systems (e.g. a mining rig with RTX cards connected over USB 2.0 risers).

How should we solve this?

No response

What is the impact of not solving this?

No response

Anything else?

No response


@pdevine commented on GitHub (Apr 12, 2024):

@BananaAcid you can do this by sending a call to the `/api/generate` endpoint with an empty prompt.

```
% curl localhost:11434/api/generate -d '{"model": "mistral", "prompt": ""}'
{"model":"mistral","created_at":"2024-04-12T22:03:18.079017Z","response":"","done":true}
```

If you have `OLLAMA_KEEP_ALIVE=-1` set, it will never unload.
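
Worth noting: per the Ollama API docs, the `/api/generate` request body also accepts a `keep_alive` field, so the preload call itself can pin the model without setting the environment variable globally (`mistral` here is just the example model from above):

```
# keep_alive: -1 keeps this model loaded indefinitely after the call.
% curl localhost:11434/api/generate -d '{"model": "mistral", "prompt": "", "keep_alive": -1}'
```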


@BananaAcid commented on GitHub (Apr 13, 2024):

@pdevine any idea how to trigger that workaround the moment the service is ready (or whenever ollama has unloaded the model for any necessary cleanup)?


@pdevine commented on GitHub (Apr 13, 2024):

@BananaAcid there's no real way to trigger it right now, but you can poll `localhost:11434` (the status endpoint) until you get a 200, then immediately call `/api/generate` to load the model. It's not ideal, but it should work as a workaround.
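
A minimal sketch of that polling loop, assuming a POSIX shell with `curl` available (`mistral` is just the example model from earlier in the thread):

```
#!/bin/sh
# Wait until the Ollama server answers its root endpoint with HTTP 200,
# retrying once per second (-f makes curl fail on HTTP errors too).
until curl -sf -o /dev/null localhost:11434; do
  sleep 1
done
# Empty prompt: loads the model without generating any output.
curl -s localhost:11434/api/generate -d '{"model": "mistral", "prompt": ""}'
```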


@pdevine commented on GitHub (May 15, 2024):

With the next version of ollama you will also be able to do `ollama run --keepalive -1s llama3 ""`, which will load the model into memory indefinitely.
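
Spelled out as a sketch (same command as above, with the parts annotated):

```
# Pin llama3 in memory indefinitely: "--keepalive -1s" disables the
# idle-unload timer, and the empty "" prompt loads the model without
# generating any output.
ollama run --keepalive -1s llama3 ""
```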

I'm going to go ahead and close the issue, but feel free to keep commenting.
