[PR #13543] ggml: Fix PowerPC build and enable MMA Optimizations #14263

Open
opened 2026-04-13 00:49:35 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13543
Author: @shalinib-ibm
Created: 12/22/2025
Status: 🔄 Open

Base: main ← Head: ppc_build_fix


📝 Commits (1)

  • 4115e4f ggml: Fix PowerPC build and enable MMA optimizations

📊 Changes

7 files changed (+106 additions, -2 deletions)

View changed files

➕ llama/patches/0032-ggml-fix-vector-macro-collision-on-Power.patch (+44 -0)
➕ llama/patches/0033-ggml-conditionally-enable-POWER11-CPU-backend-based-.patch (+36 -0)
📝 ml/backend/ggml/ggml/src/CMakeLists.txt (+8 -1)
➕ ml/backend/ggml/ggml/src/ggml-cpu/llamafile/llamafile_ppc64le_power10.go (+7 -0)
➕ ml/backend/ggml/ggml/src/ggml-cpu/llamafile/llamafile_ppc64le_power9.go (+7 -0)
📝 ml/backend/ggml/ggml/src/ggml-cpu/llamafile/sgemm.cpp (+3 -0)
📝 ml/backend/ggml/ggml/src/ggml-cpu/simd-mappings.h (+1 -1)

📄 Description

This change resolves PowerPC build breakage and enables Matrix Math
Accelerator (MMA) optimizations on supported hardware.

Key changes:
- Include ggml-cpu PowerPC backend sources in the rsync filter so they are
  propagated from llama/vendor into Ollama’s vendored ggml snapshot.
- Apply required upstream ggml fixes for PowerPC builds, including:
  - vector macro collision fixes
  - conditional POWER11 backend enablement
- Enable Matrix Math Accelerator (MMA) support for Power10.
- Add architecture-specific compiler flags to enable optimized code paths
  (see the build-tag sketch after this list):
  - `-mcpu=power10` when built with the `ppc64le.power10` build tag
    (enables MMA-based kernels, including llamafile_sgemm)
  - `-mcpu=power9` when built with the `ppc64le.power9` build tag
    (enables VSX optimizations)
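
For context, the new llamafile_ppc64le_power*.go files gate the compiler flag
behind a Go build tag via cgo. Below is a minimal sketch of that pattern for
the Power10 case; the exact build constraint and cgo directives are
assumptions, not the verbatim file contents.

```go
//go:build ppc64le && ppc64le.power10

// Hypothetical sketch of llamafile_ppc64le_power10.go: when the
// ppc64le.power10 build tag is set, cgo compiles the C++ sources in this
// package (e.g. sgemm.cpp) with -mcpu=power10 so the MMA code paths are
// built into llamafile_sgemm.
package llamafile

// #cgo CXXFLAGS: -mcpu=power10
import "C"
```

The power9 variant would presumably follow the same pattern with
`-mcpu=power9`; builds without either tag fall back to the generic
PowerPC code paths with no `-mcpu` flag.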

Build instructions:
- Power10:
    go build --tags ppc64le.power10 .
- Power9:
    go build --tags ppc64le.power9 .

Performance impact:
- ~30% inference time reduction on Power10 with MMA enabled.
- Measured using:
    ollama run llama3:8b (Q4_0)
    ~50-word summarization, 512-token prompt
  - With MMA: ~6.05s
  - Without MMA: ~8.45s

Improves performance for Q4_0, Q8_0, FP32, and BF16 models on Power10.

This change also addresses the PowerPC build breakage noted in the review discussion of #13427.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:49:35 -05:00
