[PR #3923] [MERGED] precalculate output tensor memory for metal and mmap #11320

Opened 2026-04-12 23:27:43 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/3923
Author: @mxyng
Created: 4/25/2024
Status: Merged
Merged: 4/25/2024
Merged by: @mxyng

Base: main ← Head: mxyng/mem


📝 Commits (1)

  • 7bb7cb8 only count output tensors

📊 Changes

1 file changed (+18 additions, -9 deletions)


📝 llm/memory.go (+18 -9)

📄 Description

On Metal with mmap, the output tensors are always allocated, even when the number of offloaded layers is less than the total layer count + 1. Other backends are unaffected.
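The change can be sketched as follows. This is a hedged illustration of the accounting described above, not the actual `llm/memory.go` code: the function name, parameters, and sizes are hypothetical. The idea is that the output tensor's size normally counts toward VRAM only on a full offload (offloaded layers >= total layers + 1), but on Metal with mmap it is resident regardless, so a partial offload must count it too.

```go
package main

import "fmt"

// estimateVRAM is a hypothetical sketch of the memory-estimation change:
// the output tensor is counted on full offload, and additionally always
// counted on Metal with mmap, where it is allocated unconditionally.
// Sizes are in arbitrary units.
func estimateVRAM(layerSize, outputSize uint64, totalLayers, offloadLayers int, metalWithMmap bool) uint64 {
	mem := uint64(offloadLayers) * layerSize
	if offloadLayers >= totalLayers+1 {
		// Full offload: the output "layer" fits on the GPU.
		mem += outputSize
	} else if metalWithMmap {
		// Partial offload on Metal with mmap: the output tensor is
		// allocated anyway, so count it. Other backends skip this.
		mem += outputSize
	}
	return mem
}

func main() {
	// Partial offload: 10 of 32 layers, layer size 100, output tensor 500.
	fmt.Println(estimateVRAM(100, 500, 32, 10, true))  // Metal+mmap: 1500
	fmt.Println(estimateVRAM(100, 500, 32, 10, false)) // other backends: 1000
}
```

Before this change, a sketch like the one above would have omitted the `metalWithMmap` branch, underestimating memory use on Metal for partial offloads.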


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:27:44 -05:00

Reference: github-starred/ollama#11320