[GH-ISSUE #7329] Terminate the current task after the REST request is actively ended #4656

Closed
opened 2026-04-12 15:34:31 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @viosay on GitHub (Oct 23, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7329

Can the currently executing task be terminated after a REST request is actively interrupted?

GiteaMirror added the question label 2026-04-12 15:34:31 -05:00

@rick-github commented on GitHub (Oct 23, 2024):

Seems to do that now. If I start nvtop and then run

```
curl -s localhost:11434/api/generate -d '{"model":"llama3.1","prompt":"write a long story about a unicorn lost in a rainforest"}'
```

I see GPU usage go to 90%, and drop back to 0 immediately after I ^C the curl command. When have you observed that it doesn't do this?


@dhiltgen commented on GitHub (Oct 23, 2024):

In addition to what Rick mentioned, the default timeout for models is 5 minutes, so we will stay loaded on the GPU until that timer expires so we're ready to handle additional requests. You can unload the model immediately with the stop command (https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately).
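Concretely, the FAQ linked above describes two ways to release the model before the 5-minute timer expires. Both require a running Ollama server; `llama3.1` here is just the model name from the earlier example:

```shell
# Unload the model from memory immediately via the CLI:
ollama stop llama3.1

# Or control the timer per-request with keep_alive: 0 unloads as soon as
# the response completes (the default is 5 minutes).
curl -s localhost:11434/api/generate -d '{"model":"llama3.1","keep_alive":0}'
```

`keep_alive` also accepts a duration string such as `"10m"` or a negative value to keep the model loaded indefinitely, per the same FAQ entry.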


Reference: github-starred/ollama#4656