[GH-ISSUE #10948] "model thrashing" even though OLLAMA_KEEP_ALIVE=-1 is set #53722

Closed
opened 2026-04-29 04:35:50 -05:00 by GiteaMirror · 1 comment

Originally created by @FieldMouse-AI on GitHub (Jun 2, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10948

What is the issue?

Similar to

  1. https://github.com/ollama/ollama/issues/8903
  2. https://github.com/ollama/ollama/pull/10003

Problem Description

I created two models using Modelfiles, both based on `llama3.2:1b`.
We will call them models `A` and `B`.
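For context, the models were created roughly as follows (a sketch: the actual Modelfile contents beyond the FROM line are not in this report, so the SYSTEM lines are placeholders):

```shell
# Hypothetical Modelfiles -- both derive from the same base weights.
cat > Modelfile.A <<'EOF'
FROM llama3.2:1b
SYSTEM "You are assistant A."
EOF

cat > Modelfile.B <<'EOF'
FROM llama3.2:1b
SYSTEM "You are assistant B."
EOF

ollama create A -f Modelfile.A
ollama create B -f Modelfile.B
```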

I run Ollama in a Docker container; the environment variable `OLLAMA_KEEP_ALIVE=-1` is already set and is visible to all processes.
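A minimal sketch of such a setup (the image tag, port, and volume are assumptions, not taken from this report):

```shell
# Run the official image with model unloading disabled:
# OLLAMA_KEEP_ALIVE=-1 keeps loaded models in memory indefinitely.
docker run -d --name ollama \
  -e OLLAMA_KEEP_ALIVE=-1 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama

# Confirm the variable is visible inside the container.
docker exec ollama env | grep OLLAMA_KEEP_ALIVE
```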

I then open three `bash` sessions to the Ollama container.

In the first I run `watch ollama ps` so I can observe changes. At first the list is empty.

In the second I run `ollama run A`, and the first terminal lists `A` as being in memory `UNTIL` `Forever`.

In the third I run `ollama run B`.

The expected behavior is that both model `A` and model `B` are listed as `UNTIL` `Forever`, but instead model `A` vanishes from the list and is replaced by `B`.

This is unexpected behavior. I expected both models to remain listed, with performance improving since neither would need to be reloaded. Instead, if I repeatedly switch models, the currently visible model gets bumped.

For the record, each model only takes up about 4 GB of RAM, so the 32 GB is quite ample.
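The whole sequence can be condensed into a single script run from the host (a sketch, assuming the container is named `ollama`):

```shell
# Reproduction sketch, assuming a container named "ollama".
docker exec ollama ollama ps              # initially empty
docker exec ollama ollama run A "hello"   # load model A
docker exec ollama ollama ps              # A listed, UNTIL Forever
docker exec ollama ollama run B "hello"   # load model B
docker exec ollama ollama ps              # expected: A and B; observed: only B
```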

Personal View

My problem may be an edge case of the following:

  1. https://github.com/ollama/ollama/issues/8903
  2. https://github.com/ollama/ollama/pull/10003

Relevant log output

(none provided)

OS

Ubuntu Linux 22.04 LTS

GPU

None

CPU

AMD Ryzen 7 8-core/16-thread

RAM

32GB

Ollama version

v0.9.0

GiteaMirror added the bug label 2026-04-29 04:35:50 -05:00

@rick-github commented on GitHub (Jun 2, 2025):

ollama does not support loading a GGUF file multiple times. #3902
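This is consistent with the setup above: `A` and `B` are both built `FROM llama3.2:1b`, so they resolve to the same underlying GGUF blob, and loading one replaces the other's runner. A quick way to confirm the shared base (a sketch using the model names from the report):

```shell
# Both derived models should print the same FROM/base layer.
ollama show A --modelfile
ollama show B --modelfile
```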
