[GH-ISSUE #1266] Add a stop/restart command #648

Closed
opened 2026-04-12 10:20:27 -05:00 by GiteaMirror · 14 comments

Originally created by @davlgd on GitHub (Nov 24, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1266

When I set up/launch `ollama` the manual way, I can launch the server with the `serve` command but don't have an easy way to stop/restart it (so I need to kill the process). It would be great to have dedicated commands for these actions.

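For context, a rough sketch of the manual flow and the kill-based workaround described above (it assumes the server was started with `ollama serve`; exact process matching may differ per setup):

```shell
ollama serve &                 # start the server in the background
pgrep -f "ollama serve"        # find its PID
pkill -f "ollama serve"        # send SIGTERM to stop it (today's workaround)
```
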
GiteaMirror added the feature request label 2026-04-12 10:20:27 -05:00

@robbiemu commented on GitHub (Nov 25, 2023):

Would love to see the currently running models, and the URL, in the menu bar icon (on Mac, that icon only offers "Quit" even once a model is running).


@technovangelist commented on GitHub (Dec 4, 2023):

Can you talk about why you want to restart the server?

And @robbiemu, this seems to be an unrelated comment. But there is only one model running at a time: whatever model you name on the command line with `ollama run <model>` is the one currently loaded.


@robbiemu commented on GitHub (Dec 4, 2023):

@technovangelist you're right. Sorry, I thought he meant individual models (I misread the OP). I was suggesting: if you are going to add a CLI command to stop models (that might have been started via the API), you could maybe also add it to the GUI.


@dudiao commented on GitHub (Dec 21, 2023):

I just wanted to stop the API server; for now I can only use the `kill` command.


@mtrin commented on GitHub (Dec 29, 2023):

Just like `docker container stop` exists, there should be some command to gracefully stop the server.
For example, I want it to pick up a new `OLLAMA_MODELS` env variable.

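A graceful stop followed by a restart that picks up a new `OLLAMA_MODELS` location might look roughly like this for a manually started server (the models path is only an example):

```shell
# Stop the manually started server (SIGTERM), then restart it so it
# picks up the new models location. The path below is hypothetical.
pkill -TERM -f "ollama serve"
export OLLAMA_MODELS=/data/ollama/models
ollama serve &
```
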

@technovangelist commented on GitHub (Jan 3, 2024):

On Mac, the way to stop Ollama is to click the menu bar icon and choose `Quit Ollama`. On Linux, run `sudo systemctl stop ollama`.

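For the Linux systemd install, the related stop/start/restart commands are standard `systemctl` usage (service name `ollama`, as in the comment above):

```shell
sudo systemctl stop ollama      # stop the service
sudo systemctl start ollama     # start it again
sudo systemctl restart ollama   # or do both in one step
systemctl status ollama         # check whether it is running
```
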

@adrian-valente commented on GitHub (Feb 9, 2024):

I also think such an addition could be useful to allow non-root users to free up GPU resources when the service is not needed. Please tell me if there's a way to do this that I am missing.


@chenxi1228 commented on GitHub (Feb 10, 2024):

> On Mac, the way to stop Ollama is to click the menu bar icon and choose `Quit Ollama`. On Linux, run `sudo systemctl stop ollama`.

I'm wondering, if I'm not a sudoer, how could I stop Ollama? It always occupies around 500 MB of GPU memory on each GPU (4 in total).


@timiil commented on GitHub (Feb 19, 2024):

Giving us an HTTP endpoint that stops/restarts the ollama service would be good, especially since sometimes ollama just hangs because VRAM is not released correctly.


@jmorganca commented on GitHub (Feb 20, 2024):

Hi all – to unload a model you can use the new `keep_alive` API parameter set to `0`: https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately

Regarding stopping the Ollama service – you can send it a regular signal message with `ctrl+c` or `kill`. Let me know if this doesn't solve the issue though!

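The unload call from the linked FAQ looks roughly like this; the model name is just an example, and the signal line assumes a single manually launched `ollama serve` process:

```shell
# Ask the server to unload the model immediately (no prompt, keep_alive 0).
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "keep_alive": 0
}'

# Stopping a manually launched server with a signal:
kill -TERM "$(pgrep -f 'ollama serve')"
```
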

@chenxi1228 commented on GitHub (Feb 20, 2024):

> Hi all – to unload a model you can use the new `keep_alive` API parameter set to `0`: https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately
>
> Regarding stopping the Ollama service – you can send it a regular signal message with `ctrl+c` or `kill`. Let me know if this doesn't solve the issue though!

This can release the memory used by the model. However, the problem of "it will always occupy around 500 MB of GPU memory on each GPU (4 in total)" still exists. It cannot be stopped completely.


@davlgd commented on GitHub (Feb 21, 2024):

> Hi all – to unload a model you can use the new `keep_alive` API parameter set to `0`: https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately
>
> Regarding stopping the Ollama service – you can send it a regular signal message with `ctrl+c` or `kill`. Let me know if this doesn't solve the issue though!

My initial point was that, if I launch/use ollama as a server, I don't have any way to act on it as I do with the GUI. I can `ollama serve`, but I don't have a way to do the opposite. Right now I find it in the process list and kill it, but it would be great to have a way to just ask with `ollama stop` or even `ollama restart` (as we do with the GUI after an update, for example).

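Pending a built-in command, a thin wrapper along these lines could approximate `ollama stop` / `ollama restart` for a manually launched server (a hypothetical sketch, not something ollama ships):

```shell
#!/bin/sh
# ollama-ctl: hypothetical helper approximating `ollama stop` / `ollama restart`
# for a manually launched server; not part of ollama itself.
case "$1" in
  stop)
    pkill -TERM -f "ollama serve"
    ;;
  restart)
    pkill -TERM -f "ollama serve"
    sleep 1
    nohup ollama serve >/tmp/ollama.log 2>&1 &
    ;;
  *)
    echo "usage: $0 {stop|restart}" >&2
    exit 1
    ;;
esac
```
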

@maxgreco commented on GitHub (Mar 10, 2024):

Hi, maybe for Windows this workaround can be useful -> register this program for restart
https://www.thewindowsclub.com/what-does-register-this-program-for-restart-do
I will try it myself for my forked version of https://github.com/technovangelist/airenamer (BTW thanks a lot!)
I will follow this conversation, bye


@sheecet commented on GitHub (Apr 8, 2024):

this issue is not resolved


Reference: github-starred/ollama#648