[PR #10362] Converter for GraniteMoE architecture #11957

Open
opened 2025-11-12 16:25:37 -06:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10362
Author: @mrutkows
Created: 4/21/2025
Status: 🔄 Open

Base: main ← Head: granitemoe-converter


📝 Commits (4)

  • 498808b Converter for GraniteMoE architecture
  • 3e381d8 Converter for GraniteMoE architecture
  • 869d98e Remove temp. changes awaiting upstream merges
  • 024c358 Remove temp. changes awaiting upstream merges

📊 Changes

3 files changed (+364 additions, -1 deletions)

View changed files

📝 convert/convert.go (+4 -0)
➕ convert/convert_granite.go (+359 -0)
📝 fs/ggml/gguf.go (+1 -1)

📄 Description

This PR builds upon two prerequisite PRs:

  • "Granite new engine" https://github.com/ollama/ollama/pull/9966 (from @gabe-l-hart), which establishes support for the base Granite architecture.
  • "models: llama4 multimodal" https://github.com/ollama/ollama/pull/10141 (from @mxyng), whose Clone() method and Repacker type make the "split" of the original safetensors possible.

This PR adds support for the Granite MoE architecture by adding additional tensor (name) mappings and properly slicing a combined tensor.

Specifically, the key architectural difference in GraniteMoE models in the HF safetensors format is that they use a JetMoE-style implementation of parallel experts, in which the "gate" and "up" tensors are merged into a single tensor; this merged tensor needs to be sliced in half (from 1024 -> 512) when converting to GGUF format.
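The split described above can be sketched as follows. This is an illustrative Go snippet under assumed names and a flat row-major layout, not the PR's actual Repacker implementation: a merged [rows x 2*cols] gate/up tensor is cut into two [rows x cols] halves, with the left half of each row going to "gate" and the right half to "up".

```go
package main

import "fmt"

// splitGateUp splits a merged JetMoE-style gate/up tensor, stored as a
// flat row-major slice of shape [rows x mergedCols], into separate gate
// and up tensors of shape [rows x mergedCols/2] each. The function name
// and layout are illustrative assumptions, not the converter's real API.
func splitGateUp(merged []float32, rows, mergedCols int) (gate, up []float32) {
	half := mergedCols / 2
	gate = make([]float32, 0, rows*half)
	up = make([]float32, 0, rows*half)
	for r := 0; r < rows; r++ {
		row := merged[r*mergedCols : (r+1)*mergedCols]
		gate = append(gate, row[:half]...) // left half of each row -> gate
		up = append(up, row[half:]...)     // right half of each row -> up
	}
	return gate, up
}

func main() {
	// Toy example: 2 rows, merged width 4, so each half has width 2
	// (the real models slice 1024 -> 512 per the description above).
	merged := []float32{1, 2, 3, 4, 5, 6, 7, 8}
	gate, up := splitGateUp(merged, 2, 4)
	fmt.Println(gate) // [1 2 5 6]
	fmt.Println(up)   // [3 4 7 8]
}
```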


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-12 16:25:37 -06:00

Reference: github-starred/ollama-ollama#11957