[PR #2057] [CLOSED] Improve scratch buffer estimates #42021

Closed
opened 2026-04-24 21:48:46 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/2057
Author: @jmorganca
Created: 1/18/2024
Status: Closed

Base: main ← Head: scratch


📝 Commits (1)

  • 2789ed3 (https://github.com/ollama/ollama/commit/2789ed31a7fb3930dd47d0e1aa5aa50fc0f044f2) improve scratch buffer estimates

📊 Changes

2 files changed (+8 additions, -14 deletions)


📝 gpu/gpu.go (+1 -7)
📝 llm/llm.go (+7 -7)

📄 Description

This tweaks the scratch buffer estimates to account for batch size and reserves a larger amount of overhead. This is a temporary fix; in the long term, we want to inspect the model weights for proper tensor-by-tensor estimates.
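To make the idea concrete, here is a minimal sketch of a batch-aware scratch estimate. It is not the code from this PR: the function names (`estimateScratch`, `layersThatFit`), the 256 MiB overhead constant, and the formula itself are all assumptions chosen for illustration, and the actual change in `gpu/gpu.go` and `llm/llm.go` may differ.

```go
package main

import "fmt"

// All names and constants below are hypothetical; they illustrate the idea
// of a batch-aware estimate and are not taken from the PR.
const (
	bytesPerFloat32 = 4
	fixedOverhead   = 256 << 20 // assumed flat safety margin: 256 MiB
)

// estimateScratch returns a rough scratch-buffer size in bytes.
// The activation term grows linearly with the batch size, and a fixed
// overhead is added on top.
func estimateScratch(batchSize, embeddingDim int64) int64 {
	activations := batchSize * embeddingDim * bytesPerFloat32
	return activations + fixedOverhead
}

// layersThatFit returns how many layers fit in the memory left over after
// the scratch buffer is reserved, capped at totalLayers.
func layersThatFit(freeVRAM, scratch, bytesPerLayer, totalLayers int64) int64 {
	available := freeVRAM - scratch
	if available <= 0 || bytesPerLayer <= 0 {
		return 0
	}
	n := available / bytesPerLayer
	if n > totalLayers {
		n = totalLayers
	}
	return n
}

func main() {
	scratch := estimateScratch(512, 4096)
	layers := layersThatFit(2<<30, scratch, 128<<20, 32)
	fmt.Println("scratch bytes:", scratch)
	fmt.Println("layers offloaded:", layers)
}
```

A tensor-by-tensor estimate, as the description proposes for the long term, would replace the single activation term with sizes read from the model file.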


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-24 21:48:46 -05:00
Reference: github-starred/ollama#42021