[PR #10599] [MERGED] sched: fix race leading to orphaned runners #44540

Closed
opened 2026-04-24 23:59:42 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10599
Author: @dhiltgen
Created: 5/6/2025
Status: Merged
Merged: 5/7/2025
Merged by: @dhiltgen

Base: main ← Head: sched_log


📝 Commits (1)

  • 19f7c50 sched: fix race leading to orphaned runners

📊 Changes

2 files changed (+40 additions, -20 deletions)

📝 llm/server.go (+3 -3)
📝 server/sched.go (+37 -17)

📄 Description

If a model is loading and a client closes the connection, the request context is canceled during the load. If another request for the same model is inbound with a different configuration (context size, etc.), and therefore requires a reload, two unload events can be in flight at once. The first shuts down the original model load, but the second caused the reference to the new, reloading runner to be lost, which leaked that runner.
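To make the sequence concrete, here is a minimal, deterministic replay of the event order described above. It is an illustrative sketch, not the code in `server/sched.go`: the `runner` and `naiveSched` types, the `replayRace` helper, and the assumption that the unload handler deletes the map entry by model name alone are all hypothetical.

```go
package main

import "fmt"

// runner is a hypothetical stand-in for a model runner process.
type runner struct {
	model   string
	numCtx  int
	running bool
}

// naiveSched tracks at most one runner per model name (assumed layout).
type naiveSched struct {
	loaded map[string]*runner
}

// load starts a runner and registers it under its model name.
func (s *naiveSched) load(r *runner) {
	r.running = true
	s.loaded[r.model] = r
}

// unload stops the runner named in the event, then deletes the map entry
// by model name without checking which runner that entry points to.
func (s *naiveSched) unload(r *runner) {
	r.running = false
	delete(s.loaded, r.model)
}

// replayRace replays the assumed event order and returns the second runner.
func replayRace(s *naiveSched) *runner {
	first := &runner{model: "m", numCtx: 2048}
	s.load(first)

	// Two unload events for the first runner are now in flight: one from the
	// canceled request and one from the reload the second request requires.
	s.unload(first) // first event: shuts down the original load

	second := &runner{model: "m", numCtx: 8192}
	s.load(second)

	s.unload(first) // duplicate event: deletes the entry that now holds second
	return second
}

func main() {
	s := &naiveSched{loaded: map[string]*runner{}}
	second := replayRace(s)
	_, tracked := s.loaded[second.model]
	fmt.Println("running:", second.running, "tracked:", tracked)
	// prints "running: true tracked: false"
}
```

In this replay the second runner is still running but no longer appears in the map, which is what "orphaned" means here.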

The primary fix detects the duplicate unload and ignores the second instance. The load routine is also hardened: if it would clobber a runner that is already present, it unloads that runner and logs a warning.
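The two guards described above could look roughly like the sketch below. This is an illustration under stated assumptions rather than the merged change: the `runner` and `sched` types, the pointer-identity check, and the log messages are hypothetical, and the real scheduler also deals with locking and process shutdown, which are omitted here.

```go
package main

import (
	"fmt"
	"log/slog"
)

// runner is a hypothetical stand-in for a model runner process.
type runner struct {
	model   string
	numCtx  int
	running bool
}

// sched tracks at most one runner per model name (assumed layout).
type sched struct {
	loaded map[string]*runner
}

// load registers r. If a different runner is already registered for the
// model, it is stopped with a warning instead of being silently overwritten.
func (s *sched) load(r *runner) {
	if old, ok := s.loaded[r.model]; ok && old != r {
		slog.Warn("runner already present, unloading it", "model", r.model)
		old.running = false
	}
	r.running = true
	s.loaded[r.model] = r
}

// unload acts on an unload event only if the registered runner is the one
// the event refers to. It reports whether the event was acted on.
func (s *sched) unload(r *runner) bool {
	if cur, ok := s.loaded[r.model]; !ok || cur != r {
		slog.Debug("ignoring duplicate unload", "model", r.model)
		return false
	}
	r.running = false
	delete(s.loaded, r.model)
	return true
}

func main() {
	s := &sched{loaded: map[string]*runner{}}

	first := &runner{model: "m", numCtx: 2048}
	s.load(first)
	s.unload(first) // first unload event: acted on

	second := &runner{model: "m", numCtx: 8192}
	s.load(second)
	ignored := !s.unload(first) // duplicate event: ignored

	fmt.Println("duplicate ignored:", ignored, "second tracked:", s.loaded["m"] == second)
}
```

Comparing the registered runner against the runner named in the event is one way to recognize a stale event; other checks, such as a generation counter, would serve the same purpose.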

Fixes #10433


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-24 23:59:42 -05:00

Reference: github-starred/ollama#44540