[PR #8895] Don't perform memory check if client sets use_mmap true. #44056

Open
opened 2026-04-24 23:35:48 -05:00 by GiteaMirror · 0 comments
📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/8895
Author: @rick-github
Created: 2/6/2025
Status: 🔄 Open

Base: main ← Head: mmap


📝 Commits (10+)

📊 Changes

3 files changed (+44 additions, -18 deletions)

📝 api/types.go (+1 -1)
📝 envconfig/config.go (+27 -1)
📝 llm/server.go (+16 -16)

📄 Description

If the client overrides use_mmap, don't prevent the model from loading due to apparent over-commit.

On Linux, an mmap'd file doesn't use swap backing store unless modified, so there's no need for the check. Windows has dynamic swap and so falls into the same bucket as darwin. Inference on deepseek-r1:671b-1.5b runs at ~0.15 t/s when the model requires swap on an SSD, ~0.3 t/s with mmap instead of swap on the same SSD, and ~1.4 t/s when the model is mapped on an NVMe drive.
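The behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `llm/server.go`: the function name, the parameters, and the reading that the check now applies only on Linux and only when the client has not set `use_mmap` to true are all assumptions drawn from the description.

```go
package main

import "fmt"

// checkSystemMemory is a hypothetical helper. It returns an error when the
// model appears to need more system memory than free RAM plus free swap,
// unless the check is skipped for one of the reasons in the description.
func checkSystemMemory(goos string, useMmap *bool, required, freeRAM, freeSwap uint64) error {
	// The client explicitly asked for mmap: file-backed pages need no swap
	// unless modified, so apparent over-commit is not a reason to refuse.
	if useMmap != nil && *useMmap {
		return nil
	}
	// Assumption: windows and darwin have dynamic swap, so only linux is checked.
	if goos != "linux" {
		return nil
	}
	if available := freeRAM + freeSwap; required > available {
		return fmt.Errorf("model requires more system memory (%d bytes) than is available (%d bytes)",
			required, available)
	}
	return nil
}

func main() {
	enabled := true
	// Same over-sized model, with and without the client override.
	fmt.Println(checkSystemMemory("linux", nil, 400<<30, 64<<30, 8<<30))
	fmt.Println(checkSystemMemory("linux", &enabled, 400<<30, 64<<30, 8<<30))
}
```

A pointer is used for `useMmap` so that "not set by the client" can be told apart from an explicit `false`.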

Also add OLLAMA_USE_MMAP for global configuration.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Commit list:

- [`e5e1ded`](https://github.com/ollama/ollama/commit/e5e1dedccd11e253d24b22c60a78e589551f60e3) Don't perform memory check of client sets use_mmap true.
- [`ec0ef40`](https://github.com/ollama/ollama/commit/ec0ef4058dee00d9dc4745b567a80dff2e93a605) Merge branch 'ollama:main' into mmap
- [`836153d`](https://github.com/ollama/ollama/commit/836153d7e9da6392a7c63188ba49272e74f31f46) Merge branch 'ollama:main' into mmap
- [`a374fbd`](https://github.com/ollama/ollama/commit/a374fbde4dd4bf7525d9fe70e51527520963faf0) Merge branch 'ollama:main' into mmap
- [`93113a7`](https://github.com/ollama/ollama/commit/93113a7b9fce3f7cecad5283c97ad5254d946aa9) Merge branch 'ollama:main' into mmap
- [`eba78f9`](https://github.com/ollama/ollama/commit/eba78f9f1852752c466b73959328f376bd2c2347) Add environment variable OLLAMA_USE_MMAP.
- [`4d5ba52`](https://github.com/ollama/ollama/commit/4d5ba525584b7de46ed0d4d46979a129cf3712a8) Merge branch 'ollama:main' into mmap
- [`22c7219`](https://github.com/ollama/ollama/commit/22c72191c5c857464eb327c49de016a518f1806e) Merge branch 'mmap' of https://github.com/rick-github/ollama into mmap
- [`1c8af56`](https://github.com/ollama/ollama/commit/1c8af566bfc58d773a41b5f128fbfc6e4deda459) Merge branch 'ollama:main' into mmap
- [`583a8ac`](https://github.com/ollama/ollama/commit/583a8ac302a41d0af0bf9ff8526c79e1fe2e8296) Merge branch 'ollama:main' into mmap
GiteaMirror added the pull-request label 2026-04-24 23:35:48 -05:00
Reference: github-starred/ollama#44056