[PR #14878] [MERGED] mlx: add prequantized tensor packing + changes for qwen35 #14889

Closed
opened 2026-04-13 01:04:57 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14878
Author: @pdevine
Created: 3/16/2026
Status: Merged
Merged: 3/17/2026
Merged by: @pdevine

Base: main ← Head: pdevine/qwen35-create


📝 Commits (2)

  • d3112db mlx: add prequantized tensor packing + changes for qwen35
  • 94fa8cb linting

📊 Changes

5 files changed (+1037 additions, -34 deletions)


📝 x/create/create.go (+258 -33)
📝 x/create/create_test.go (+319 -0)
➕ x/create/dtype.go (+109 -0)
➕ x/create/qwen35.go (+323 -0)
📝 x/imagegen/safetensors/extractor.go (+28 -1)

📄 Description

This change adds a tensorImportTransform interface for model-specific tensor transformations during safetensors import. It lets ollama create directly import, and modify along the way, both the standard HF-based weights and the pre-quantized safetensors repos derived from mlx-community. For now this only works for Qwen3.5 imports, which rename tensors, shift norm weights (adding 1 to each value of the norm vectors), transpose conv1d weights, and cast F32-based vectors to BF16.
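
The numeric transforms named in the description can be illustrated with a minimal Go sketch. This is not the code from the PR: the function names (shiftNormWeights, float32ToBF16, bf16ToFloat32, transpose2D) are hypothetical, the BF16 cast assumes round-to-nearest-even, and the transpose is shown on a plain row-major 2D matrix because the description does not say which conv1d axes are swapped.

```go
package main

import (
	"fmt"
	"math"
)

// shiftNormWeights returns a copy of w with 1 added to every element,
// mirroring the "+1" norm weight shift mentioned in the description.
// Hypothetical helper, not taken from the PR.
func shiftNormWeights(w []float32) []float32 {
	out := make([]float32, len(w))
	for i, v := range w {
		out[i] = v + 1
	}
	return out
}

// float32ToBF16 converts one float32 to its bfloat16 bit pattern.
// Assumption: round-to-nearest-even; the PR may round differently.
func float32ToBF16(f float32) uint16 {
	bits := math.Float32bits(f)
	if math.IsNaN(float64(f)) {
		// Keep NaN a NaN by forcing a mantissa bit that survives truncation.
		return uint16(bits>>16) | 0x0040
	}
	// Add 0x7FFF plus the lowest kept bit so that ties round to even.
	rounding := uint32(0x7FFF) + ((bits >> 16) & 1)
	return uint16((bits + rounding) >> 16)
}

// bf16ToFloat32 widens a bfloat16 bit pattern back to float32.
func bf16ToFloat32(b uint16) float32 {
	return math.Float32frombits(uint32(b) << 16)
}

// transpose2D transposes a row-major rows x cols matrix into cols x rows.
// Assumption: a stand-in for the conv1d transposition, whose real axes
// are not stated in the description.
func transpose2D(data []float32, rows, cols int) []float32 {
	out := make([]float32, len(data))
	for r := 0; r < rows; r++ {
		for c := 0; c < cols; c++ {
			out[c*rows+r] = data[r*cols+c]
		}
	}
	return out
}

func main() {
	norm := shiftNormWeights([]float32{0, 0.5, -1})
	fmt.Println("shifted norm:", norm)

	packed := make([]uint16, len(norm))
	for i, v := range norm {
		packed[i] = float32ToBF16(v)
	}
	fmt.Printf("bf16 bits: %#04x\n", packed)

	fmt.Println("transposed:", transpose2D([]float32{1, 2, 3, 4, 5, 6}, 2, 3))
}
```

Each helper returns a new slice rather than mutating its input, so the same source tensor could feed more than one transform without aliasing surprises.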


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 01:04:57 -05:00
Reference: github-starred/ollama#14889