[PR #12971] [MERGED] vulkan: cherry-pick upstream changes #14018

Closed
opened 2026-04-13 00:42:41 -05:00 by GiteaMirror · 0 comments
📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12971
Author: @dhiltgen
Created: 11/5/2025
Status: Merged
Merged: 11/12/2025
Merged by: @dhiltgen

Base: main ← Head: ggml_bump_with_vulkan


📝 Commits (1)

  • 4eefe9d vulkan: temporary cary of vulkan fixes

📊 Changes

32 files changed (+5813 additions, -642 deletions)

➕ llama/patches/0029-vulkan-Call-ggml_vk_buffer_write_2d-from-ggml_vk_buf.patch (+32 -0)
➕ llama/patches/0030-Vulkan-MMQ-Integer-Dot-Refactor-and-K-Quant-support-.patch (+2140 -0)
➕ llama/patches/0031-vulkan-Update-topk_moe-fusion-to-handle-gpt-s-late-s.patch (+657 -0)
➕ llama/patches/0032-vulkan-Fuse-rope-set_rows-16769.patch (+1242 -0)
➕ llama/patches/0033-vulkan-Handle-argsort-with-a-large-number-of-rows-16.patch (+85 -0)
➕ llama/patches/0034-vulkan-fix-shmem-overrun-in-mmq-id-shader-16873.patch (+77 -0)
➕ llama/patches/0035-vulkan-Fix-crash-when-FP16-mul_mat-accumulation-is-n.patch (+80 -0)
📝 ml/backend/ggml/ggml/src/ggml-impl.h (+16 -0)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-device.cpp (+1 -1)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/ggml-vulkan.cpp (+594 -224)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/argsort.comp (+12 -4)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/dequant_funcs.glsl (+5 -5)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/dequant_funcs_cm2.glsl (+3 -3)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/dequant_mxfp4.comp (+2 -2)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/dequant_q2_k.comp (+2 -2)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/dequant_q4_k.comp (+2 -2)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/dequant_q5_k.comp (+2 -2)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q2_k.comp (+2 -4)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q4_k.comp (+2 -4)
📝 ml/backend/ggml/ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q5_k.comp (+2 -4)

...and 12 more files

📄 Description

This temporary change carries a set of upstream Vulkan changes on top of b6840 so we can fix the mxfp4 crash on older Intel drivers.

This should merge after #12791, and we can revert it once we update GGML beyond b6840.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:42:41 -05:00
Reference: github-starred/ollama#14018