[PR #10787] [MERGED] Memory usage reporting #23905

Closed
opened 2026-04-19 17:16:54 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10787
Author: @jessegross
Created: 5/20/2025
Status: Merged
Merged: 5/22/2025
Merged by: @jessegross

Base: main ← Head: jessegross/mem_usage


📝 Commits (3)

  • 73bd0de ggml: Report graph memory for failed allocations
  • 037bc67 ollamarunner: Memory usage reporting
  • f4d9f27 ml: Panic rather than return error on tensor allocation failure
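The third commit switches tensor allocation failures from returned errors to panics. A minimal sketch of that pattern in Go (all names here are hypothetical illustrations, not the PR's actual API): allocation helpers panic instead of threading `(T, error)` through every call, and a single `recover` at the graph-building boundary converts the panic back into an ordinary error.

```go
package main

import "fmt"

// allocError is a hypothetical error type carrying the failed request size.
type allocError struct{ bytes uint64 }

func (e allocError) Error() string {
	return fmt.Sprintf("failed to allocate %d bytes", e.bytes)
}

// newTensor panics on failure rather than returning (Tensor, error),
// keeping the many intermediate call sites free of error plumbing.
func newTensor(bytes, budget uint64) uint64 {
	if bytes > budget {
		panic(allocError{bytes})
	}
	return bytes
}

// buildGraph recovers an allocation panic into a single error for the caller.
func buildGraph(budget uint64) (err error) {
	defer func() {
		if r := recover(); r != nil {
			if ae, ok := r.(allocError); ok {
				err = ae
				return
			}
			panic(r) // unrelated panics propagate unchanged
		}
	}()
	newTensor(1<<30, budget) // request 1 GiB; panics if over budget
	return nil
}

func main() {
	fmt.Println(buildGraph(1 << 20)) // budget far too small → error
}
```

The benefit is that only the top-level entry point handles failure, while the per-tensor code stays linear; the cost is that every panic site must be reachable only under a recovering caller.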

📊 Changes

25 files changed (+499 additions, -286 deletions)

View changed files

📝 kvcache/causal.go (+7 -19)
📝 kvcache/causal_test.go (+10 -10)
➕ llama/patches/0016-graph-memory-reporting-on-failure.patch (+156 -0)
📝 ml/backend.go (+85 -3)
📝 ml/backend/ggml/ggml.go (+117 -69)
📝 ml/backend/ggml/ggml/include/ggml-alloc.h (+6 -0)
📝 ml/backend/ggml/ggml/include/ggml-backend.h (+6 -0)
📝 ml/backend/ggml/ggml/src/ggml-alloc.c (+34 -4)
📝 ml/backend/ggml/ggml/src/ggml-backend.cpp (+10 -0)
📝 model/model.go (+1 -5)
📝 model/models/gemma2/model.go (+2 -9)
📝 model/models/gemma3/model.go (+3 -13)
📝 model/models/llama/model.go (+2 -8)
📝 model/models/llama4/model.go (+4 -18)
📝 model/models/llama4/model_text.go (+1 -5)
📝 model/models/llama4/model_vision.go (+1 -4)
📝 model/models/mistral3/model.go (+3 -13)
📝 model/models/mistral3/model_vision.go (+3 -13)
📝 model/models/mllama/model.go (+4 -18)
📝 model/models/qwen2/model.go (+2 -8)

...and 5 more files

📄 Description

This provides granular information about the backend memory allocations required by the runner:

  • Per backend
  • Per layer
  • Weights, cache and graph
  • Allocation status

This can be used for debugging and validating memory estimates. The usage information will be printed to the log if there is a memory allocation failure or if debug is enabled.
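The reporting described above breaks usage down per backend and per layer, across the weights, cache, and graph categories. A hedged sketch of how such a breakdown might be modeled and summed (struct and field names are assumptions for illustration, not the PR's actual types):

```go
package main

import "fmt"

// LayerMemory records one layer's allocations by category, in bytes.
type LayerMemory struct {
	Weights, Cache, Graph uint64
}

// BackendMemory aggregates per-layer records for one backend (e.g. a GPU).
type BackendMemory struct {
	Name   string
	Layers []LayerMemory
}

// Total sums all categories across all layers for this backend.
func (b BackendMemory) Total() uint64 {
	var t uint64
	for _, l := range b.Layers {
		t += l.Weights + l.Cache + l.Graph
	}
	return t
}

func main() {
	gpu := BackendMemory{
		Name: "CUDA0",
		Layers: []LayerMemory{
			{Weights: 512 << 20, Cache: 64 << 20, Graph: 16 << 20},
			{Weights: 512 << 20, Cache: 64 << 20, Graph: 16 << 20},
		},
	}
	// Each layer: 512 + 64 + 16 = 592 MiB; two layers = 1184 MiB.
	fmt.Printf("%s: %d MiB total\n", gpu.Name, gpu.Total()>>20)
}
```

A layout like this makes it straightforward to log the breakdown on allocation failure and to compare actual usage against the scheduler's memory estimates.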


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 17:16:54 -05:00

Reference: github-starred/ollama#23905