[PR #13869] [MERGED] Revert "model: add MLA absorption for glm4moelite" #14426

Closed
opened 2026-04-13 00:53:43 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13869
Author: @jmorganca
Created: 1/24/2026
Status: Merged
Merged: 1/24/2026
Merged by: @jmorganca

Base: main ← Head: revert-13810-glm4moelite-mla-absorption


📝 Commits (1)

  • cd4a65a Revert "model: add MLA absorption for glm4moelite (#13810)"

📊 Changes

16 files changed (+23 additions, -522 deletions)

View changed files

📝 convert/convert_glm4moelite.go (+0 -114)
➖ llama/patches/0032-ggml-enable-MLA-flash-attention-for-GLM-4.7-flash.patch (+0 -248)
📝 ml/backend/ggml/ggml/src/ggml-cuda/fattn-mma-f16.cuh (+3 -12)
📝 ml/backend/ggml/ggml/src/ggml-cuda/fattn-tile.cuh (+0 -16)
📝 ml/backend/ggml/ggml/src/ggml-cuda/fattn.cu (+4 -8)
📝 ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu (+0 -1)
📝 ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu (+0 -1)
📝 ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu (+0 -1)
📝 ml/backend/ggml/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu (+0 -1)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-device.m (+6 -2)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-embed.metal (+0 -1)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-ops.cpp (+1 -1)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.metal (+0 -1)
📝 model/model.go (+0 -14)
📝 model/models/glm4moelite/model.go (+9 -28)
➖ model/models/glm4moelite/model_test.go (+0 -73)

📄 Description

Reverts ollama/ollama#13810 until we fix the CUDA build
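
The single revert commit above (`cd4a65a`) follows git's standard revert workflow: a new commit whose diff is the inverse of the original change. A minimal sketch in a throwaway repository (file names and messages are illustrative, not taken from the PR):

```shell
# Demonstrate reverting a merged change, analogous to reverting #13810.
git init demo && cd demo
git -c user.email=a@b -c user.name=demo commit --allow-empty -m "base"
echo "mla absorption" > feature.txt
git add feature.txt
git -c user.email=a@b -c user.name=demo commit \
    -m 'model: add MLA absorption for glm4moelite (#13810)'
# Produce a new commit that undoes HEAD without rewriting history,
# so the revert can itself be reverted once the CUDA build is fixed.
git -c user.email=a@b -c user.name=demo revert --no-edit HEAD
test ! -f feature.txt && echo "reverted"
```

Because the revert is an ordinary commit rather than a history rewrite, re-landing the change later is a matter of reverting the revert once the CUDA build issue is resolved.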


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:53:43 -05:00

Reference: github-starred/ollama#14426