[PR #9203] [MERGED] build: remove backend build for sapphirerapids #18159

Closed
opened 2026-04-16 06:26:42 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/9203
Author: @mxyng
Created: 2/18/2025
Status: Merged
Merged: 2/19/2025
Merged by: @mxyng

Base: main ← Head: mxyng/sapphirerapids


📝 Commits (1)

  • 5f8c031 build: remove backend build for sapphirerapids

📊 Changes

2 files changed (+24 additions, -4 deletions)


llama/patches/0018-remove-amx.patch (+24 -0)
📝 ml/backend/ggml/ggml/src/CMakeLists.txt (+0 -4)

📄 Description

Sapphire Rapids has AMX support, but enabling it ends up having a negative performance impact.

Emerald Rapids also has AMX support, and there the performance impact is positive. However, there's no reasonable way in ggml to differentiate between the two. The impact of disabling AMX on Emerald Rapids is small (~6.5%), so disable AMX entirely for simplicity.
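To illustrate why feature detection alone is a weak discriminator here, the following is a minimal sketch that reports which AMX flags a Linux machine advertises in `/proc/cpuinfo`. It is not part of ollama or ggml: the helper names `cpu_flags` and `amx_flags_present` are hypothetical, and the sketch assumes the Linux flag names `amx_tile`, `amx_int8`, and `amx_bf16`. If both CPU generations report the same set, a check like this cannot tell them apart.

```python
# Illustrative sketch only; not code from ollama or ggml.
# Assumption: Linux exposes AMX support through these /proc/cpuinfo flag names.
AMX_FLAGS = {"amx_tile", "amx_int8", "amx_bf16"}


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def amx_flags_present(cpuinfo_text: str) -> set[str]:
    """Return the subset of AMX flags advertised in the given cpuinfo text."""
    return cpu_flags(cpuinfo_text) & AMX_FLAGS
```

On a Linux host, `amx_flags_present(open("/proc/cpuinfo").read())` returns the advertised AMX flags, or an empty set when the CPU has none.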

Some quick tests (mean tokens/s over n = 3 runs):

sapphire rapids

0.5.7 mean: 7.130 tokens/s stdev: 0.044 n: 3
0.5.11 mean: 6.513 tokens/s stdev: 0.015 n: 3
noamx mean: 7.220 tokens/s stdev: 0.061 n: 3
percent change
0.5.7 --> 0.5.11 -8.649%
0.5.11 --> noamx 10.850%
0.5.7 --> noamx 1.262%

emerald rapids

0.5.7 mean: 4.853 tokens/s stdev: 0.015 n: 3
0.5.11 mean: 5.797 tokens/s stdev: 0.012 n: 3
noamx mean: 5.407 tokens/s stdev: 0.012 n: 3
percent change
0.5.7 --> 0.5.11 19.437%
0.5.11 --> noamx -6.728%
0.5.7 --> noamx 11.401%
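The percent-change rows can be recomputed from the means above. The sketch below is an illustration, not the script used for the PR; `percent_change`, `MEANS`, and `PAIRS` are names chosen here. Because it starts from the rounded means, its results match the reported figures only to within about 0.02 percentage points (the reported values were presumably derived from unrounded means).

```python
# Illustrative sketch: recompute the percent changes from the rounded means.
# Not the script used for the PR; all names here are hypothetical.

def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Mean tokens/s as listed in the tables above.
MEANS = {
    "sapphire rapids": {"0.5.7": 7.130, "0.5.11": 6.513, "noamx": 7.220},
    "emerald rapids": {"0.5.7": 4.853, "0.5.11": 5.797, "noamx": 5.407},
}

# Build pairs compared in the "percent change" rows.
PAIRS = [("0.5.7", "0.5.11"), ("0.5.11", "noamx"), ("0.5.7", "noamx")]


def changes(means: dict[str, float]) -> dict[tuple[str, str], float]:
    """Percent change for every (old, new) build pair of one CPU."""
    return {(a, b): percent_change(means[a], means[b]) for a, b in PAIRS}


for cpu, means in MEANS.items():
    print(cpu)
    for (a, b), pct in changes(means).items():
        print(f"  {a} --> {b} {pct:.3f}%")
```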

resolves #9087


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 06:26:42 -05:00
Reference: github-starred/ollama#18159