[PR #2422] [MERGED] More robust shutdown #10883

Closed
opened 2026-04-12 23:14:13 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/2422
Author: @dhiltgen
Created: 2/9/2024
Status: Merged
Merged: 2/12/2024
Merged by: @dhiltgen

Base: main ← Head: better_kill


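📝 Commits (1)

6680761 Shutdown faster (https://github.com/ollama/ollama/commit/6680761596cbd832619ba5a295f03b74c6500743)
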
📊 Changes

2 files changed (+44 additions, -1 deletions)


📝 llm/dyn_ext_server.go (+1 -1)
📝 llm/ext_server/ext_server.cpp (+43 -0)

📄 Description

Make sure that when a shutdown signal comes, we shut down quickly instead of waiting for a potentially long exchange to wrap up.

My initial strategy was to use multiple signals to trigger a more aggressive shutdown, but recovering once shutdown had already started turned into a much more invasive change, so I abandoned that approach. This PR takes a simpler approach: stop new requests from coming in, cancel whatever is in flight at the next completion, and shut down once no requests are actively being processed. If we want to refine this in the future with the double-signal strategy, we can add it incrementally: block new requests on the first signal, and on a second signal cancel tasks that are still iterating in completion.
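
To make the three steps above concrete, here is a minimal sketch of the idea. It is not the code from this PR: `ShutdownGate`, `run_completion`, and the `shutdown_at` parameter are hypothetical names invented for illustration, and the sketch assumes that "at the next completion" means the next iteration of a token-generation loop. The signal is simulated by a plain function call so the example stays single-threaded and self-contained.

```cpp
#include <atomic>

// Hypothetical helper (not from the PR): a one-way shutdown flag plus a
// counter of requests that are currently being processed.
class ShutdownGate {
public:
    // Step 1: reject new requests once shutdown has been requested.
    // The counter is bumped before the flag is checked so that a concurrent
    // shutdown never observes "idle" while a request is slipping in.
    bool try_begin_request() {
        active_.fetch_add(1);
        if (shutting_down_.load()) {
            active_.fetch_sub(1);
            return false;
        }
        return true;
    }

    void end_request() { active_.fetch_sub(1); }

    // Called when the shutdown signal arrives.
    void request_shutdown() { shutting_down_.store(true); }

    // Step 2: in-flight work polls this between generation steps.
    bool should_cancel() const { return shutting_down_.load(); }

    // Step 3: the process can exit once this returns true after shutdown.
    bool idle() const { return active_.load() == 0; }

private:
    std::atomic<bool> shutting_down_{false};
    std::atomic<int> active_{0};
};

// Simulated completion loop. Returns the number of tokens produced, or -1 if
// the request was rejected. `shutdown_at` fakes a signal arriving after that
// many tokens; pass 0 for "no signal during this request".
int run_completion(ShutdownGate& gate, int max_tokens, int shutdown_at) {
    if (!gate.try_begin_request()) {
        return -1;
    }
    int produced = 0;
    while (produced < max_tokens) {
        if (gate.should_cancel()) {
            break;  // stop at the next step instead of finishing the exchange
        }
        ++produced;  // stand-in for generating one token
        if (produced == shutdown_at) {
            gate.request_shutdown();  // simulated shutdown signal
        }
    }
    gate.end_request();
    return produced;
}
```

In this sketch a request that would have produced 100 tokens stops right after the step during which the simulated signal arrived, and any later request is rejected immediately. The double-signal refinement mentioned above would presumably split `should_cancel()` off onto a second flag set only by the second signal.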


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:14:13 -05:00
Reference: github-starred/ollama#10883