[PR #5704] [CLOSED] Add TensorSplit option to runners and API #11891

Closed
opened 2026-04-12 23:41:57 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5704
Author: @NormalFishDev
Created: 7/15/2024
Status: Closed

Base: main ← Head: rework-tensor-split


📝 Commits (5)

  • d841d88 Add TensorSplit option
  • 9f45b69 Fix TensorSplit Option
  • 4fb8cc1 Format changed files
  • 070ac1d Rewrite tensor split logic
  • 3b100a9 Format changed files with gofmt

📊 Changes

2 files changed (+28 additions, -24 deletions)


📝 api/types.go (+25 -23)
📝 llm/server.go (+3 -1)

📄 Description

This pull request adds non-breaking functionality to Ollama's NewLlamaServer function and a TensorSplit field to the Runner struct in api/types.go.

  • Add an option to pass tensor_split in the "options" object of the generate API to manually define how tensors are split across GPUs by llama.cpp.
  • Add a conditional that checks for a manual tensor_split value in the Runner options and uses it to set the tensor_split parameter; it defaults to estimate.TensorSplit when no manual value is passed, so existing applications behave exactly as before (a sketch follows this list).
  • Add TensorSplit to the Runner struct in api/types.go so the tensor_split value can be accessed.
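
A minimal runnable sketch of the change described above (not the exact diff; the string field type, the JSON tag, and the comma-separated format modeled on llama.cpp's --tensor-split flag are all assumptions):

```go
package main

import "fmt"

// Sketch only: per the description, the Runner struct in api/types.go gains
// a TensorSplit field. The string type and JSON tag here are assumptions.
type Runner struct {
	NumGPU      int    `json:"num_gpu,omitempty"`
	TensorSplit string `json:"tensor_split,omitempty"`
}

// pickTensorSplit mirrors the conditional described for NewLlamaServer in
// llm/server.go: a manual tensor_split from the request options overrides
// the automatic estimate; otherwise behavior is unchanged.
func pickTensorSplit(opts Runner, estimated string) string {
	if opts.TensorSplit != "" {
		return opts.TensorSplit // manual override from the "options" object
	}
	return estimated // estimate.TensorSplit, computed by Ollama as before
}

func main() {
	fmt.Println(pickTensorSplit(Runner{}, "17,15"))                   // automatic: 17,15
	fmt.Println(pickTensorSplit(Runner{TensorSplit: "1,1"}, "17,15")) // manual: 1,1
}
```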

I have added this functionality because the team I work with runs Ollama on a server with 8 GPUs. We have run into issues where the automatic tensor splitting crashes the server due to unbalanced splits across the GPUs (the splitting does not account for buffer VRAM usage after the model is loaded). Being able to specify the tensor splits manually has made it much easier to deploy Ollama on that server.
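
With this change, a client could pin the split per request. A sketch using Ollama's Go API client (the model name and the tensor_split value are placeholders, and the option only takes effect with this PR applied):

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/ollama/ollama/api"
)

func main() {
	client, err := api.ClientFromEnvironment()
	if err != nil {
		log.Fatal(err)
	}

	req := &api.GenerateRequest{
		Model:  "llama3", // placeholder model name
		Prompt: "Why is the sky blue?",
		Options: map[string]any{
			// Hypothetical value: an even split across 8 GPUs, honored
			// only when this PR's tensor_split support is present.
			"tensor_split": "1,1,1,1,1,1,1,1",
		},
	}

	err = client.Generate(context.Background(), req, func(resp api.GenerateResponse) error {
		fmt.Print(resp.Response)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```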


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:41:57 -05:00