[PR #14247] [MERGED] mlxrunner fixes #19859

Closed
opened 2026-04-16 07:18:46 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14247
Author: @pdevine
Created: 2/14/2026
Status: Merged
Merged: 2/14/2026
Merged by: @pdevine

Base: main ← Head: pdevine/mlxrunner-fixes


📝 Commits (4)

📊 Changes

19 files changed (+750 additions, -267 deletions)


📝 cmd/cmd.go (+14 -0)
📝 server/routes.go (+4 -1)
📝 server/routes_generate_test.go (+1 -0)
📝 server/sched.go (+23 -9)
📝 server/sched_test.go (+4 -4)
📝 x/imagegen/manifest/weights.go (+8 -3)
📝 x/mlxrunner/client.go (+291 -51)
➕ x/mlxrunner/imports.go (+7 -0)
📝 x/mlxrunner/mlx/array.go (+2 -1)
📝 x/mlxrunner/mlx/dynamic.go (+51 -30)
📝 x/mlxrunner/mlx/ops_extra.go (+29 -6)
➕ x/mlxrunner/model/base/base.go (+85 -0)
➕ x/mlxrunner/model/base/base_stub.go (+3 -0)
➕ x/mlxrunner/model/root.go (+97 -0)
➕ x/mlxrunner/model/root_stub.go (+3 -0)
📝 x/mlxrunner/pipeline.go (+5 -2)
📝 x/mlxrunner/runner.go (+68 -33)
📝 x/mlxrunner/sample/sample.go (+1 -1)
📝 x/models/glm4_moe_lite/glm4_moe_lite.go (+54 -126)

📄 Description

This PR:

  • correctly wires up the Safetensors-based GLM 4.7 Flash model to the MLX runner
  • improves the eval performance of GLM 4.7 Flash by around 150% by fixing how scalar dtypes are handled
  • adds a fix so that the older experimental x/flux2-klein and x/z-image-turbo models should work again
  • adds a hidden flag to try loading the Safetensors GLM 4.7 Flash model onto the older imagegen runner instead of the mlxrunner
  • fixes model loading for the mlxrunner to correctly load from a manifest
  • adds back the root / base models for the mlxrunner
  • simplifies model loading in the mlxrunner
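
The description does not say exactly how scalar dtypes were mishandled. One plausible reading, offered here only as an assumption, is that scalar operands were created as float32 and so widened half-precision arrays during arithmetic. The toy sketch below illustrates that effect with a simplified promotion rule; `DType`, `promote`, and `scalarLike` are hypothetical names and are not taken from the mlxrunner code.

```go
package main

import "fmt"

// DType is a toy stand-in for an array element type.
// It is hypothetical and not the type used by the mlxrunner.
type DType int

const (
	BFloat16 DType = iota
	Float16
	Float32
)

// String returns a readable name for the dtype.
func (d DType) String() string {
	return [...]string{"bfloat16", "float16", "float32"}[d]
}

// promote returns the result dtype of a binary op under a simplified rule:
// equal types are kept, any mix of these three types widens to float32.
func promote(a, b DType) DType {
	if a == b {
		return a
	}
	return Float32
}

// scalarLike picks the dtype for a scalar operand so that it matches the
// array it will be combined with (assumed fix, for illustration only).
func scalarLike(array DType) DType {
	return array
}

func main() {
	weights := BFloat16

	// A scalar that is always built as float32 widens the whole result.
	fmt.Println("float32 scalar:", promote(weights, Float32))

	// A scalar built in the array's dtype keeps the result in bfloat16.
	fmt.Println("matched scalar:", promote(weights, scalarLike(weights)))
}
```

If this reading is right, matching the scalar to the array dtype would avoid computing in float32 where half precision was intended, which is consistent with the reported speedup but is not confirmed by the PR text.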

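For the manifest-loading fix, a minimal sketch of what "loading from a manifest" can involve: parse the manifest JSON, select the weight layers, and map each digest to a blob file name. The struct fields follow the general layout of an image manifest; the `tensorMediaType` value and the helper names are assumptions for illustration and are not taken from the changed files.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Layer and Manifest mirror only the fields this sketch needs.
type Layer struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

type Manifest struct {
	Layers []Layer `json:"layers"`
}

// tensorMediaType is an assumed media type for weight layers.
const tensorMediaType = "application/vnd.ollama.image.tensor"

// blobName converts a digest such as "sha256:abc" into the file name "sha256-abc".
func blobName(digest string) string {
	return strings.Replace(digest, ":", "-", 1)
}

// tensorBlobs parses a manifest and returns the blob file names of its tensor layers.
func tensorBlobs(data []byte) ([]string, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("parse manifest: %w", err)
	}
	var names []string
	for _, l := range m.Layers {
		if l.MediaType == tensorMediaType {
			names = append(names, blobName(l.Digest))
		}
	}
	return names, nil
}

func main() {
	data := []byte(`{"layers":[
		{"mediaType":"application/vnd.ollama.image.tensor","digest":"sha256:aaa","size":10},
		{"mediaType":"application/vnd.ollama.image.json","digest":"sha256:bbb","size":2}
	]}`)
	names, err := tensorBlobs(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```
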
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Commits in this PR:

  • 967bedc load glm4_moe_lite from the mlxrunner (https://github.com/ollama/ollama/commit/967bedce3071e46a7be8d80f68b566b706159db6)
  • f354af3 fix loading diffusion models (https://github.com/ollama/ollama/commit/f354af31903d7204b402fb940269bdb01779c945)
  • 8faae6e remove log lines (https://github.com/ollama/ollama/commit/8faae6e44344a9740569b8488dcbf92f776fd2f2)
  • 050b0a0 fix --imagegen flag (https://github.com/ollama/ollama/commit/050b0a03a601c0a086fe1c16ca0613c019a6a5ca)

GiteaMirror added the pull-request label 2026-04-16 07:18:46 -05:00

Reference: github-starred/ollama#19859