[PR #15015] [MERGED] mlx: add mxfp4/mxfp8/nvfp4 importing #77263

Closed
opened 2026-05-05 09:56:05 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15015
Author: @pdevine
Created: 3/22/2026
Status: Merged
Merged: 3/24/2026
Merged by: @pdevine

Base: main ← Head: pdevine/qwen-quantization


📊 Changes

11 files changed (+1345 additions, -124 deletions)

📝 x/create/client/create.go (+2 -2)
📝 x/create/client/quantize.go (+381 -74)
📝 x/create/create.go (+187 -25)
📝 x/create/create_test.go (+509 -1)
📝 x/create/qwen35.go (+28 -0)
📝 x/mlxrunner/mlx/io.go (+116 -20)
📝 x/mlxrunner/mlx/ops_extra.go (+46 -0)
📝 x/mlxrunner/model/quant.go (+3 -1)
📝 x/models/nn/nn_test.go (+41 -0)
📝 x/models/qwen3_5/qwen3_5.go (+10 -1)
📝 x/models/qwen3_5/qwen3_5_test.go (+22 -0)

📄 Description

This change allows importing bf16 weights and converting them to mxfp4, mxfp8, or nvfp4, and also importing fp8 weights and converting them directly to mxfp8.

This depends on MLX being bumped to a newer version, so it requires either #15014 or #14789 to merge first.
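
For readers unfamiliar with the target formats, here is a minimal Go sketch of block quantization in the style of mxfp4 as described by the OCP Microscaling format: each block (32 elements in the spec) shares one power-of-two scale, and every element is stored as a 4-bit E2M1 value. This is not the code from this PR, which appears to rely on MLX for the conversion. The helper names `quantizeBlockMXFP4` and `dequantizeBlockMXFP4` are hypothetical, and the tie-breaking in the rounding step is a simplification.

```go
package main

import (
	"fmt"
	"math"
)

// e2m1Values lists the non-negative magnitudes representable in FP4 (E2M1).
var e2m1Values = [8]float64{0, 0.5, 1, 1.5, 2, 3, 4, 6}

// quantizeBlockMXFP4 is a hypothetical helper (not from the PR). It returns
// the shared power-of-two exponent for the block and one 4-bit code per
// element: bit 3 is the sign, bits 0-2 index into e2m1Values.
func quantizeBlockMXFP4(block []float32) (int, []uint8) {
	maxAbs := 0.0
	for _, v := range block {
		maxAbs = math.Max(maxAbs, math.Abs(float64(v)))
	}

	exp := 0
	if maxAbs > 0 {
		// Frexp gives maxAbs = frac * 2^e with frac in [0.5, 1), so
		// floor(log2(maxAbs)) = e - 1. The largest E2M1 value is
		// 6 = 1.5 * 2^2, so subtract the element exponent maximum of 2.
		_, e := math.Frexp(maxAbs)
		exp = e - 1 - 2
	}
	scale := math.Ldexp(1, exp)

	codes := make([]uint8, len(block))
	for i, v := range block {
		x := math.Abs(float64(v)) / scale
		// Pick the nearest representable magnitude. Values above 6 clamp
		// to 6. Ties go to the smaller magnitude, which is an assumption
		// made for brevity rather than a faithful rounding mode.
		best := 0
		for j, q := range e2m1Values {
			if math.Abs(x-q) < math.Abs(x-e2m1Values[best]) {
				best = j
			}
		}
		code := uint8(best)
		if v < 0 {
			code |= 0x8
		}
		codes[i] = code
	}
	return exp, codes
}

// dequantizeBlockMXFP4 reverses quantizeBlockMXFP4 up to rounding error.
func dequantizeBlockMXFP4(exp int, codes []uint8) []float32 {
	scale := math.Ldexp(1, exp)
	out := make([]float32, len(codes))
	for i, c := range codes {
		v := e2m1Values[c&0x7] * scale
		if c&0x8 != 0 {
			v = -v
		}
		out[i] = float32(v)
	}
	return out
}

func main() {
	// A short block for illustration; the spec uses 32 elements per block.
	block := []float32{12, -5, 1, 0.3}
	exp, codes := quantizeBlockMXFP4(block)
	fmt.Println("shared exponent:", exp)
	fmt.Println("codes:", codes)
	fmt.Println("dequantized:", dequantizeBlockMXFP4(exp, codes))
}
```

nvfp4 uses the same 4-bit E2M1 elements but, per NVIDIA's description of the format, with 16-element blocks and an FP8 (E4M3) scale instead of a power-of-two scale, so the scale computation above would differ for that target.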


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

📝 Commits (2)

- [`e4a6c5c`](https://github.com/ollama/ollama/commit/e4a6c5ce647d97b0387d38a51945fe899ff1c6a0) mlx: add mxfp4/mxfp8/nvfp4 importing
- [`d0ff0cd`](https://github.com/ollama/ollama/commit/d0ff0cd21a97e2e5f537d3888c4be8c3c0000743) linter cleanup

GiteaMirror added the pull-request label 2026-05-05 09:56:05 -05:00

Reference: github-starred/ollama#77263