[PR #14525] Vulkan Use Mmap (disable Vulkan MMap opt out) #25252

Open
opened 2026-04-19 18:06:10 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14525
Author: @inforithmics
Created: 3/1/2026
Status: 🔄 Open

Base: main ← Head: VulkanUseMMap


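📝 Commits (1)

da8659a Vulkan Use Mmap (https://github.com/ollama/ollama/commit/da8659afbb01d6707ccc44d02e79caccb0fdd5f1)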

📊 Changes

1 file changed (+0 additions, -1 deletions)


📝 llm/server.go (+0 -1)

📄 Description

Vulkan: use mmap (remove the Vulkan mmap opt-out)

Draft until the next vendor sync, because in newer llama.cpp versions mmap works without problems on Vulkan. Monitoring shows higher memory usage for both the process and the GPU, but the real usage is not higher; in effect there is more free memory.

The memory only appears to be duplicated: the shared memory used by the GPU is also reported as used by Ollama, but it is the same memory. For example, when 6 GB of shared memory is in use by the GPU, Ollama also "uses" 6 GB of memory. This happens on iGPUs, where shared memory is used for GPU offload.
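The double counting described above can be illustrated with a small sketch. It is not code from Ollama or llama.cpp: the `memoryReport` type, its fields, and both helper functions are hypothetical names chosen for this example, and the numbers are the 6 GB case from the description.

```go
package main

import "fmt"

// memoryReport is a hypothetical summary of what monitoring tools show.
// It is not a type from Ollama or llama.cpp.
type memoryReport struct {
	ProcessGB   float64 // memory attributed to the ollama process
	GPUSharedGB float64 // shared system memory attributed to the iGPU
	OverlapGB   float64 // part of both figures that is the same physical memory
}

// apparentGB naively sums both counters, so the overlap is counted twice.
func apparentGB(r memoryReport) float64 {
	return r.ProcessGB + r.GPUSharedGB
}

// actualGB subtracts the overlap so shared memory is counted only once.
func actualGB(r memoryReport) float64 {
	return r.ProcessGB + r.GPUSharedGB - r.OverlapGB
}

func main() {
	// Assumed scenario from the description: 6 GB of shared memory is
	// reported for the GPU and the same 6 GB is reported for the process.
	r := memoryReport{ProcessGB: 6, GPUSharedGB: 6, OverlapGB: 6}
	fmt.Printf("apparent: %.0f GB, actual: %.0f GB\n", apparentGB(r), actualGB(r))
}
```

Under these assumptions the two counters add up to 12 GB while only 6 GB is physically in use, which matches the observation that more memory stays free than the reported figures suggest.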


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 18:06:10 -05:00
Reference: github-starred/ollama#25252