[GH-ISSUE #3415] Ollama does not use my ram memory #2105

Closed
opened 2026-04-12 12:20:43 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @faugustdev on GitHub (Mar 30, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3415

What is the issue?

Ollama Resource Utilization: Potential Optimization Opportunity

I'm deploying a model within Ollama and noticed that while I've allocated 24GB of RAM to the Docker container, it's currently only utilizing 117MB.

This efficient resource usage is commendable, but it might also indicate room for optimization. To ensure optimal performance, it would be beneficial if the model could leverage at least the minimum required resources.

What did you expect to see?

Model Response Speed and Resource Usage

While I allocated 24GB of RAM to the Docker container running the model, it's currently utilizing only 117MB. Given this limited resource usage, achieving an acceptable response speed for the model is impossible.

Steps to reproduce

Run this command to start the Docker container:

docker run -d -v rocama:/root/.ollama -p 11434:11434 --name LLMsDazlabs --memory=24g --memory-reservation=24g rocama/ollama

Are there any recent changes that introduced the issue?

No response

OS

Linux

Architecture

arm64

Platform

Docker

Ollama version

ollama version is 0.1.30

GPU

No response

GPU info

I am not using GPU, I am running Ollama with only CPU
Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

CPU

Intel

Other software

No response

GiteaMirror added the bug label 2026-04-12 12:20:43 -05:00
Author
Owner

@easp commented on GitHub (Mar 30, 2024):

How/where are you measuring resource utilization? Ollama memory-maps model weights. Linux utilities like free account for this under buffer/cache, which is considered "available."

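The accounting @easp describes can be illustrated with a small Python sketch (hypothetical file and sizes, not from this issue): pages read through an mmap'ed file are charged to the kernel page cache, the buff/cache column of free, rather than to the process's private heap, so a process can read a large model file while its reported memory usage stays small.

```python
import mmap
import os
import tempfile

# Create a 64 MiB sparse file and memory-map it read-only, loosely
# analogous to how Ollama maps model weights instead of read()-ing
# them into heap memory.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(64 * 1024 * 1024)
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Touch one byte per 4 KiB page. The kernel faults these pages in
    # as file-backed page cache, not as the process's anonymous memory,
    # which is why tools that track private/anonymous usage stay low.
    total = sum(mm[i] for i in range(0, len(mm), 4096))
    mm.close()

os.unlink(path)
print(total)  # sparse file reads back as zeros -> 0
```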
Author
Owner

@faugustdev commented on GitHub (Mar 31, 2024):

Since I am using Docker, I only need to check with the 'docker stats' command.

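This may also explain why 'docker stats' reads so low: on cgroup v2 hosts, Docker reports memory usage roughly as the cgroup's current usage minus inactive file-cache pages, so memory-mapped model weights largely drop out of the figure. A hedged sketch with made-up numbers (the memory.stat sample below is illustrative only):

```python
# Illustrative cgroup v2 memory.stat fragment (invented values):
# ~112 MiB anonymous memory, ~18 GiB file-backed (mmap'ed) pages,
# ~17 GiB of which the kernel classifies as inactive file cache.
sample_memory_stat = """\
anon 117440512
file 19327352832
inactive_file 18253611008
"""

def parse_memory_stat(text):
    """Parse 'key value' lines from a cgroup memory.stat file."""
    return {k: int(v) for k, v in (line.split() for line in text.splitlines())}

stats = parse_memory_stat(sample_memory_stat)

# Approximate raw usage as anon + file pages (the real memory.current
# also counts kernel memory, slab, etc.).
raw_usage = stats["anon"] + stats["file"]

# docker stats on cgroup v2 subtracts inactive_file before reporting,
# so most of the mmap'ed weights vanish from the displayed number.
reported = raw_usage - stats["inactive_file"]
print(reported)  # 1191182336 -- far smaller than raw_usage
```

So a small 'docker stats' figure does not necessarily mean the model weights are not resident; they may simply be sitting in reclaimable file cache.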
Author
Owner

@jmorganca commented on GitHub (Apr 15, 2024):

Hi there, which model are you running? I would definitely expect more memory utilization than this; however, as @easp mentioned, it's possible the memory isn't reflected because the file is memory-mapped.

Author
Owner

@tahitimoon commented on GitHub (Apr 19, 2024):

How do you solve this problem?

Reference: github-starred/ollama#2105