[PR #9115] [CLOSED] Row Order model definitions #38738

opened 2026-04-22 23:24:13 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/9115
Author: @dhiltgen
Created: 2/14/2025
Status: Closed

Base: main ← Head: row_order


📝 Commits (1)

  • 72abcaa Row Order model definitions

📊 Changes

20 files changed (+3546 additions, -291 deletions)


📝 .gitignore (+2 -0)
📝 kvcache/causal.go (+53 -46)
📝 kvcache/causal_test.go (+60 -31)
➕ llama/patches/0022-fix-crash-in-memcpy-with-quantized-types.patch (+43 -0)
📝 ml/backend.go (+25 -7)
📝 ml/backend/ggml/ggml.go (+517 -111)
📝 ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu.c (+10 -8)
➕ ml/backend/ggml/ggml_test.go (+2676 -0)
📝 ml/nn/attention.go (+13 -13)
📝 ml/nn/linear.go (+1 -1)
➕ model/README.md (+62 -0)
📝 model/model.go (+11 -0)
📝 model/models/gemma2/model.go (+11 -10)
📝 model/models/gemma3/model.go (+7 -7)
📝 model/models/gemma3/model_text.go (+7 -7)
📝 model/models/gemma3/model_vision.go (+5 -7)
📝 model/models/llama/model.go (+5 -5)
📝 model/models/mllama/model.go (+2 -2)
📝 model/models/mllama/model_text.go (+16 -16)
📝 model/models/mllama/model_vision.go (+20 -20)

📄 Description

Replaces #8731 on main.

This change switches the model API (and backend) to be row-order to make it easier to port model definitions from other frameworks that use row-order patterns. I've made the following changes to the Backend API interface definitions:

  • All APIs assume row-order instead of column-order
  • View no longer interleaves shape and stride; they are passed as two separate int arrays

This requires a number of notable changes in the GGML backend:

  • Dimensions and shapes exposed in the API are reversed in the underlying GGML tensor to ensure operations work properly
  • Number of dimensions is tracked in the wrapper tensor type. GGML treats trailing dimensions of 1 as no-ops, so when the leading dimension has a size of 1 (and is therefore reversed into the trailing position), the number of dimensions GGML reports would be too low. The tracked value is used instead of the GGML-reported one to retain the correct number of dimensions.
  • Permute revamped to be consistent with other row-order APIs (PyTorch, etc.). GGML treats the axis list as the "destination": where each source dimension moves to. Other APIs (and ours with this change) treat it as the "source": where each output dimension gets its data from.
  • Reshape updated to support a -1 as a dimension, consistent with other APIs; the value is calculated and filled in automatically.

Other potential refinements that aren't currently included but which may make sense:

  • Soften the "must have 4 dimensions" parameters to routines to be more consistent with other APIs, requiring only that they match the actual number of dimensions in the tensor
  • Switch to Matmul pattern

The cache also required some adjustments based on these changes.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 23:24:13 -05:00

Reference: github-starred/ollama#38738