[GH-ISSUE #12001] Only one model loaded into memory at once #7967

Closed
opened 2026-04-12 20:09:02 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @hteeyeoh on GitHub (Aug 21, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12001

What is the issue?

I have a pipeline that loads 1 embedding model and 1 LLM into memory for my RAG application. Both models were shown by the 'ollama ps' command on ollama version 0.11.4, but on version 0.11.5 it only shows the most recently triggered model, which is loaded into memory — totally different behaviour now. Is this an expected change? And how can I keep both models loaded in memory at the same time?

I'm using a pure CPU host with no GPU.
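(Editor's note, not from the thread: for keeping multiple models resident, Ollama documents the `OLLAMA_MAX_LOADED_MODELS` and `OLLAMA_KEEP_ALIVE` environment variables. A sketch of a systemd drop-in for a standard Linux install follows — the values and file path are illustrative assumptions, and this tunes concurrency generally rather than fixing the 0.11.5 regression discussed below:)

```ini
# /etc/systemd/system/ollama.service.d/override.conf
# Hypothetical drop-in; adjust values and path for your setup.
[Service]
# Allow two models (e.g. an embedding model and an LLM) to stay loaded together
Environment="OLLAMA_MAX_LOADED_MODELS=2"
# Keep idle models in memory for 30 minutes instead of the default
Environment="OLLAMA_KEEP_ALIVE=30m"
```

After `systemctl daemon-reload && systemctl restart ollama`, `ollama ps` should list each model once it has been used, subject to available memory.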

Relevant log output


OS

Linux

GPU

No response

CPU

Intel

Ollama version

0.11.5

GiteaMirror added the bug label 2026-04-12 20:09:02 -05:00
Author
Owner

@jmorganca commented on GitHub (Aug 21, 2025):

Hi @hteeyeoh sorry for the issue. This was fixed in 073fa31df5 and will be available in an update soon.

For now, using 0.11.4 will work. To install it on Linux you can run:

curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.11.4 sh
Author
Owner

@hteeyeoh commented on GitHub (Aug 21, 2025):

> Hi @hteeyeoh sorry for the issue. This was fixed in 073fa31 and will be available in an update soon.
>
> For now using 0.11.4 will work. to install it on Linux you can run:
>
> curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.11.4 sh

No issue. Would just like to understand whether this is a bug or this would be the expected behaviour going forward.

Thanks for your quick reply.

Author
Owner

@FieldMouse-AI commented on GitHub (Aug 21, 2025):

@jmorganca ,

@hteeyeoh 's issue actually looks like a continuation of the issue that I had opened recently: https://github.com/ollama/ollama/issues/11974

I tested v0.11.6 and found that the problem appears to persist, but looking at https://github.com/ollama/ollama/issues/11974, I was under the impression that it was fixed. Am I mistaken? 🤔

Thanks for your help and all. 🤗

Author
Owner

@rick-github commented on GitHub (Aug 21, 2025):

0.11.6 doesn't include 073fa31df5.

Author
Owner

@FieldMouse-AI commented on GitHub (Aug 21, 2025):

> 0.11.6 doesn't include 073fa31.

Ah, I see, 073fa31df5 is not in 0.11.6, but will arrive in a future release.

I was quite worried, as the constant model reloads cause around a 3x slowdown of some inferences.

I will remain on 0.11.4 until 073fa31df5 is integrated into a release.

If you would like to request that I retest it (like during a release candidate or something like that), please let me know. 🤗

Thanks! 🤗


Reference: github-starred/ollama#7967