[GH-ISSUE #6020] not utilizing ram after vram #3766

Closed
opened 2026-04-12 14:35:30 -05:00 by GiteaMirror · 0 comments

Originally created by @uploadsjuicers on GitHub (Jul 27, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6020

What is the issue?

I am running Ollama in Docker with an NVIDIA GPU. When I load a model that is larger than my GPU's 8 GB of VRAM, my RAM usage doesn't increase, though the model does respond. I assume it is using mmap instead of RAM. Is this intended, or is there a way to configure it to use RAM?

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.3.0

GiteaMirror added the bug label 2026-04-12 14:35:30 -05:00

Reference: github-starred/ollama#3766