[PR #10770] [CLOSED] llm: exclude input layers from VRAM layer size estimation #39229

Closed
opened 2026-04-22 23:53:32 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10770
Author: @rick-github
Created: 5/19/2025
Status: Closed

Base: main ← Head: estimation


📝 Commits (1)

  • 4b063a8 (https://github.com/ollama/ollama/commit/4b063a85cbfec6fed52f96cd38fdb5643042b8a1) llm: exclude input layers from VRAM estimation

📊 Changes

1 file changed (+13 additions, -1 deletion)

View changed files

📝 llm/memory.go (+13 -1)

📄 Description

Input layers only run on the CPU (see https://github.com/ollama/ollama/blob/7edfdd2f5f48a7be035cec23b4acd12f7c112e1c/ml/backend/ggml/ggml.go#L248), so don't include them when finding the maximum size of a layer to fit in VRAM.

This doesn't completely eliminate the increased memory usage reported in the issues below, but it reduces it somewhat:

Fixes: #10765, #10756, #10752, #10726
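The change can be sketched as follows. This is a minimal, hypothetical Go illustration of the idea (not the actual llm/memory.go diff): when scanning layer sizes to find the largest one that must fit in VRAM, skip input (token-embedding) tensors, since those stay on the CPU. The `token_embd` prefix and the map-based layer table are assumptions for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// maxVRAMLayerSize returns the size of the largest layer that would be
// placed in VRAM. Input (token-embedding) layers are excluded because
// they run on the CPU and should not inflate the per-layer estimate.
func maxVRAMLayerSize(layers map[string]uint64) uint64 {
	var max uint64
	for name, size := range layers {
		// Hypothetical naming convention: input layers are prefixed
		// with "token_embd" (as in GGUF tensor names).
		if strings.HasPrefix(name, "token_embd") {
			continue
		}
		if size > max {
			max = size
		}
	}
	return max
}

func main() {
	layers := map[string]uint64{
		"token_embd.weight": 900, // CPU-only input layer, excluded
		"blk.0":             500,
		"blk.1":             650,
		"output.weight":     700,
	}
	fmt.Println(maxVRAMLayerSize(layers)) // 700, not 900
}
```

Without the exclusion, the 900-byte embedding tensor would dominate the estimate even though it never occupies VRAM, which is the over-reporting the linked issues describe.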


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 23:53:32 -05:00
Reference: github-starred/ollama#39229