[PR #13773] [MERGED] x/imagegen: add FP4 quantization support for image generation models #24923

Closed
opened 2026-04-19 17:53:43 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13773
Author: @jmorganca
Created: 1/19/2026
Status: Merged
Merged: 1/19/2026
Merged by: @jmorganca

Base: main ← Head: jmorganca/fp4-quantization


📝 Commits (1)

  • e37fdcd x/imagegen: add FP4 quantization support for image generation models

📊 Changes

5 files changed (+43 additions, -6 deletions)


📝 x/create/client/quantize.go (+3 -0)
📝 x/create/imagegen.go (+2 -2)
📝 x/imagegen/safetensors/loader.go (+18 -4)
📝 x/imagegen/safetensors/safetensors.go (+5 -0)
📝 x/imagegen/weights.go (+15 -0)

📄 Description

Add --quantize fp4 support to ollama create for image generation models (flux2, z-image-turbo), using MLX's affine 4-bit quantization.

Changes:

  • Add fp4 to validation in CreateImageGenModel
  • Add FP4 case to quantizeTensor (group_size=32, bits=4, affine mode)
  • Add GetQuantization() to WeightSource interface for dynamic params
  • Update LoadLinearLayer to use quantization params from model metadata
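
The last two items can be pictured with a small Go sketch. It only illustrates the pattern of a loader asking its weight source for the quantization type and deriving parameters from it, rather than hard-coding them. The `GetQuantization()` signature shown here (returning a string), the `QuantParams` struct, `paramsFor`, `manifestWeights`, and `describeLinear` are all assumptions made for illustration. Only the FP4 values (group_size=32, bits=4, affine mode) come from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// QuantParams is a hypothetical container for the values a loader needs.
type QuantParams struct {
	GroupSize int
	Bits      int
	Mode      string
}

// WeightSource is a reduced, assumed shape of the interface named in the PR.
// The real interface has other methods, and its GetQuantization signature
// may differ from this one.
type WeightSource interface {
	GetQuantization() string
}

// manifestWeights is a hypothetical source that reads the quantization label
// from model metadata.
type manifestWeights struct {
	quantization string
}

func (m manifestWeights) GetQuantization() string { return m.quantization }

// paramsFor maps a quantization label to parameters. Only the FP4 case is
// filled in, using the values stated in the PR description.
func paramsFor(label string) (QuantParams, bool) {
	switch strings.ToUpper(label) {
	case "FP4":
		return QuantParams{GroupSize: 32, Bits: 4, Mode: "affine"}, true
	default:
		return QuantParams{}, false
	}
}

// describeLinear stands in for a loader such as LoadLinearLayer: it picks
// parameters from the source's metadata instead of using fixed constants.
func describeLinear(ws WeightSource) string {
	p, ok := paramsFor(ws.GetQuantization())
	if !ok {
		return "unquantized linear layer"
	}
	return fmt.Sprintf("quantized linear layer: group_size=%d bits=%d mode=%s",
		p.GroupSize, p.Bits, p.Mode)
}

func main() {
	fmt.Println(describeLinear(manifestWeights{quantization: "fp4"}))
	fmt.Println(describeLinear(manifestWeights{}))
}
```

Keeping the lookup behind the weight source means a new quantization type would only need a new case in one place, which appears to be the motivation for making the parameters dynamic.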

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 17:53:43 -05:00
Reference: github-starred/ollama#24923