[PR #11486] [CLOSED] MXFP4 support #18825

opened 2026-04-16 06:48:06 -05:00 by GiteaMirror · 0 comments

## 📋 Pull Request Information

**Original PR:** https://github.com/ollama/ollama/pull/11486
**Author:** [@dhiltgen](https://github.com/dhiltgen)
**Created:** 7/21/2025
**Status:** ❌ Closed

**Base:** `main` ← **Head:** `mx4`

---

### 📝 Commits (2)

- [`5fbc4a5`](https://github.com/ollama/ollama/commit/5fbc4a5cbf790ea8ff643a2b0c03d3d84daa263d) MXFP4 support
- [`85c6f45`](https://github.com/ollama/ollama/commit/85c6f450173115d5d4910336d18a36a938759fa1) Unit tests for MXFP4 support

### 📊 Changes

**26 files changed** (+3291 additions, -22 deletions)

<details>
<summary>View changed files</summary>

- 📝 `fs/ggml/type.go` (+6 -2)
- ➕ `llama/patches/0021-MXFP4.patch` (+1301 -0)
- 📝 `ml/backend.go` (+1 -0)
- 📝 `ml/backend/ggml/ggml.go` (+70 -0)
- 📝 `ml/backend/ggml/ggml/include/ggml.h` (+1 -1)
- 📝 `ml/backend/ggml/ggml/src/ggml-common.h` (+7 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu-quants.h` (+2 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu.c` (+5 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cpu/ops.cpp` (+1 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cpu/vec.cpp` (+69 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cpu/vec.h` (+2 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cuda/convert.cu` (+80 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu` (+14 -2)
- ➕ `ml/backend/ggml/ggml/src/ggml-cuda/mmvmxfp4.cu` (+307 -0)
- ➕ `ml/backend/ggml/ggml/src/ggml-cuda/mmvmxfp4.cuh` (+9 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-cuda/mmvq.cu` (+3 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-embed.metal` (+178 -5)
- 📝 `ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-impl.h` (+3 -0)
- 📝 `ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.m` (+24 -1)
- 📝 `ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.metal` (+168 -5)

_...and 6 more files_

</details>

### 📄 Description

Partial implementation for MXFP4 tensor type

- [x] Metal
- [x] CPU
- [x] CUDA

Draft as it still needs some further cleanup, and the unit tests aren't suitable to merge yet...

---

<sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
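
For readers unfamiliar with the MXFP4 tensor type named in the description, here is a minimal Go sketch of how one MXFP4 block can be dequantized. It follows the format as defined in the OCP Microscaling (MX) specification: 32 elements share one 8-bit E8M0 scale, and each element is a 4-bit E2M1 value. It is not taken from this PR. The `mxfp4Block` type, the function names, and the nibble packing order (low nibble first within each byte) are assumptions made for illustration and may not match the layout used by the ggml kernels in the changed files.

```go
package main

import (
	"fmt"
	"math"
)

// blockSize is the number of elements that share one scale in MXFP4.
const blockSize = 32

// e2m1Values lists the eight magnitudes a 4-bit E2M1 element can encode
// (1 sign bit, 2 exponent bits, 1 mantissa bit).
var e2m1Values = [8]float32{0, 0.5, 1, 1.5, 2, 3, 4, 6}

// mxfp4Block is a hypothetical in-memory layout for one block.
// ASSUMPTION: the real struct in ggml-common.h may order fields or
// pack nibbles differently.
type mxfp4Block struct {
	Scale uint8                // E8M0 shared exponent, bias 127
	Qs    [blockSize / 2]uint8 // two 4-bit elements per byte
}

// decodeE8M0 converts an E8M0 scale byte to 2^(e-127); 0xFF encodes NaN.
func decodeE8M0(e uint8) float32 {
	if e == 0xFF {
		return float32(math.NaN())
	}
	return float32(math.Ldexp(1, int(e)-127))
}

// decodeE2M1 converts a 4-bit E2M1 code (in the low nibble) to float32.
func decodeE2M1(nibble uint8) float32 {
	v := e2m1Values[nibble&0x7]
	if nibble&0x8 != 0 {
		return -v
	}
	return v
}

// dequantize expands one block into 32 float32 values.
// ASSUMPTION: byte i holds element 2i in its low nibble and element 2i+1
// in its high nibble.
func dequantize(b mxfp4Block) [blockSize]float32 {
	var out [blockSize]float32
	scale := decodeE8M0(b.Scale)
	for i, q := range b.Qs {
		out[2*i] = decodeE2M1(q&0x0F) * scale
		out[2*i+1] = decodeE2M1(q>>4) * scale
	}
	return out
}

func main() {
	b := mxfp4Block{Scale: 128} // scale = 2^(128-127) = 2
	b.Qs[0] = 0xF2              // low nibble 0x2 (1.0), high nibble 0xF (-6.0)
	out := dequantize(b)
	fmt.Println(out[0], out[1])
}
```

Because every element in a block shares a single power-of-two scale, storage works out to 17 bytes per 32 values under this layout, or 4.25 bits per weight.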
GiteaMirror added the pull-request label 2026-04-16 06:48:06 -05:00
Reference: github-starred/ollama#18825