[PR #15431] mlx: refined model push behavior #77449

Open
opened 2026-05-05 10:06:52 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15431
Author: @dhiltgen
Created: 4/8/2026
Status: 🔄 Open

Base: main ← Head: push_st


📝 Commits (2)

  • ae552c4 mlx: refined model push behavior
  • 92f2b0d review comments, hardening, and performance tuning for slow links

📊 Changes

7 files changed (+1716 additions, -240 deletions)

📝 cmd/cmd.go (+1 -0)
📝 envconfig/config.go (+6 -0)
📝 server/images.go (+20 -18)
📝 x/imagegen/transfer/download.go (+81 -9)
📝 x/imagegen/transfer/transfer.go (+25 -23)
📝 x/imagegen/transfer/transfer_test.go (+1068 -85)
📝 x/imagegen/transfer/upload.go (+515 -105)

📄 Description

Refine the algorithm for parallel push of safetensors-based models to improve reliability and throughput.

This reduces the default parallelism for pull and push of safetensors models to 4, which puts less load on slow networks and routers. A new environment variable adjusts the server default.

% ollama serve --help
Start Ollama
...
      OLLAMA_MAX_TRANSFERS       Maximum number of parallel safetensors pull or push streams (default 4)
...

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 10:06:52 -05:00
Reference: github-starred/ollama#77449