[PR #15701] [CLOSED] discover: account for reclaimable page cache in cgroup v2 memory detection #41141

Closed
opened 2026-04-23 01:52:07 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15701
Author: @mverrilli
Created: 4/19/2026
Status: Closed

Base: main ← Head: fix/issue-15474-cgroup-memory


📝 Commits (1)

  • 82b46fb discover: account for reclaimable page cache in cgroup v2 memory detection

📊 Changes

2 files changed (+184 additions, -1 deletions)


➕ discover/cgroup_mem_test.go (+153 -0)
📝 discover/cpu_linux.go (+31 -1)

📄 Description

Problem

When Ollama runs in a container with a cgroup v2 memory limit, it calculates available memory as:

FreeMemory = memory.max - memory.current

memory.current includes reclaimable page cache (reported as inactive_file in memory.stat). Loading a model fills the Linux page cache, and the cache stays populated after the model is unloaded. Because that cache is counted in memory.current, Ollama incorrectly concludes that memory is exhausted and refuses to load models that would actually fit.

On bare metal, Ollama correctly reads MemAvailable from /proc/meminfo, which already accounts for reclaimable cache. The cgroup v2 path needed the same treatment.

Root cause

discover/cpu_linux.go getCPUMemByCgroups() used a naive subtraction:

mem.FreeMemory = mem.TotalMemory - used  // used = memory.current

memory.current counts all memory charged to the cgroup, including page cache that the kernel can reclaim on demand. This makes the free-memory estimate in containers overly pessimistic.

Fix

Read inactive_file from /sys/fs/cgroup/memory.stat (reclaimable page cache) and subtract it from memory.current before computing free memory:

FreeMemory = memory.max - (memory.current - inactive_file)

This brings the cgroup v2 path in line with the kernel's MemAvailable estimate, which also treats reclaimable page cache as available. If memory.stat cannot be read, the calculation falls back to the original behavior.
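
The corrected formula and its fallback can be sketched as a small pure function. This is a minimal illustration, not the code from the PR: the name cgroupFreeMemory, the haveStat flag, and the underflow guards are all assumptions made for the example.

```go
package main

// cgroupFreeMemory is a hypothetical helper (not taken from the PR).
// limit is memory.max, current is memory.current, and inactiveFile is the
// inactive_file value from memory.stat, all in bytes. haveStat reports
// whether memory.stat could be read.
func cgroupFreeMemory(limit, current, inactiveFile uint64, haveStat bool) uint64 {
	used := current
	// Discount reclaimable page cache only when the stat value is available
	// and plausible; otherwise keep the original memory.current figure.
	if haveStat && inactiveFile <= used {
		used -= inactiveFile
	}
	// Clamp at zero so unsigned subtraction cannot wrap around.
	if used >= limit {
		return 0
	}
	return limit - used
}
```

With haveStat set to false the function reduces to the original memory.max - memory.current calculation, which mirrors the fallback described above.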

Changes

  • discover/cpu_linux.go — update getCPUMemByCgroups to subtract inactive_file; add getNamedUint64FromStatFile and getNamedUint64FromStat(io.Reader) helpers
  • discover/cgroup_mem_test.go — unit tests for the stat parser and the corrected formula

Test plan

  • go test ./discover/... — all 23 tests pass
  • Logic verified: with an 8 GiB limit and 6 GiB memory.current (4 GiB of it reclaimable), corrected free = 8 - (6 - 4) = 6 GiB vs. the old, incorrect 8 - 6 = 2 GiB

Fixes #15474


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-23 01:52:07 -05:00

Reference: github-starred/ollama#41141