[PR #14403] [MERGED] mlxrunner: Cancel in-flight requests when the client disconnects #19929

Closed
opened 2026-04-16 07:21:08 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14403
Author: @jessegross
Created: 2/25/2026
Status: Merged
Merged: 2/25/2026
Merged by: @jessegross

Base: main ← Head: jessegross/mlx-cancel


📝 Commits (2)

  • a1a6847 mlxrunner: Simplify pipeline memory and cache management
  • f80c4e0 mlxrunner: Cancel in-flight requests when the client disconnects

📊 Changes

4 files changed (+150 additions, -79 deletions)

📝 x/mlxrunner/cache.go (+75 -35)
📝 x/mlxrunner/pipeline.go (+45 -32)
📝 x/mlxrunner/runner.go (+4 -4)
📝 x/mlxrunner/server.go (+26 -8)

📄 Description

Currently, when a request is canceled, its computation can keep running in the background until it completes. Cancellation can also trigger a deadlock: with nobody left to read the output tokens, the pipeline blocks and cannot continue to the next request.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 07:21:08 -05:00

Reference: github-starred/ollama#19929