[PR #10773] [MERGED] llm: Use first layer as memory buffer in estimation #39231

opened 2026-04-22 23:53:42 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10773
Author: @jessegross
Created: 5/19/2025
Status: Merged
Merged: 5/19/2025
Merged by: @jessegross

Base: `main` ← Head: `jessegross/layers`


📝 Commits (1)

  • 1fcfeab llm: Use first layer as memory buffer in estimation

📊 Changes

1 file changed (+6 additions, -7 deletions)

View changed files

📝 llm/memory.go (+6 -7)

📄 Description

This is a partial revert of 0478d44 "Fixed over vram allocation due to small initial layer sizes."

Previously we used the size of the first layer as an extra reserved amount of space to buffer our memory estimates. The above commit changed this to use the largest layer. However, this had performance impacts on more models than the original commit was trying to fix.

This is just a heuristic without an ideal solution, so this change goes back to the historic behavior.
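The two heuristics can be sketched as follows. This is a minimal illustration, not the actual `llm/memory.go` code: the function names and the example layer sizes are hypothetical, and in ollama the real per-layer sizes come from the model's tensor layout.

```go
package main

import "fmt"

// reserveBufferFirst is the historic behavior restored by this PR:
// reserve the size of the first layer as extra headroom on top of
// the VRAM estimate.
func reserveBufferFirst(layerSizes []uint64) uint64 {
	if len(layerSizes) == 0 {
		return 0
	}
	return layerSizes[0]
}

// reserveBufferLargest is the behavior introduced by 0478d44 and
// partially reverted here: reserve the largest layer instead.
func reserveBufferLargest(layerSizes []uint64) uint64 {
	var max uint64
	for _, s := range layerSizes {
		if s > max {
			max = s
		}
	}
	return max
}

func main() {
	// Hypothetical model whose first layer is small but whose later
	// layers are large: the largest-layer rule reserves far more VRAM,
	// which can push layers off the GPU and hurt performance.
	sizes := []uint64{32 << 20, 512 << 20, 512 << 20, 480 << 20}
	fmt.Println(reserveBufferFirst(sizes))   // 33554432  (32 MiB)
	fmt.Println(reserveBufferLargest(sizes)) // 536870912 (512 MiB)
}
```

With a small first layer, the first-layer rule leaves more of the estimate available for offloading layers, which is why the largest-layer rule regressed performance on more models than it fixed.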

Fixes: #10765, #10756, #10752, #10726


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 23:53:42 -05:00

Reference: github-starred/ollama#39231