[GH-ISSUE #11092] Loading time of mistral-small3.1 is too long #11087 #53829

Closed
opened 2026-04-29 04:50:16 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @JitaekJo on GitHub (Jun 17, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11092

What is the issue?

![Image](https://github.com/user-attachments/assets/6944ebaa-fc4b-45d3-a65c-57e981082021)

A related problem is that the model isn't loaded. As you can see, even though the memory is allocated and ollama reports it is serving after receiving a prompt, the load stays at "0%" for a long time.
Is it related to the memory issue? I doubt it.

######################################################################
I've upgraded ollama recently yes.

Mystery solved.

There have been recent changes to the estimation logic to reduce the chance of an OOM. You can force ollama to load more layers into the GPU by setting num_gpu as described in https://github.com/ollama/ollama/issues/6950#issuecomment-2373663650. This may increase OOMs or cause the problem described in https://github.com/ollama/ollama/issues/7584#issuecomment-2466715900.
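As a sketch of the workaround above: num_gpu can be pinned in a Modelfile so a derived model always tries to offload that many layers to the GPU. The model name and layer count here are illustrative, not taken from this issue.

```
# Modelfile — assumes the mistral-small3.1 tag is already pulled;
# 999 means "offload as many layers as possible" (ollama caps it at the model's layer count)
FROM mistral-small3.1
PARAMETER num_gpu 999
```

Build and run it with `ollama create mistral-small-gpu -f Modelfile` followed by `ollama run mistral-small-gpu`. Forcing more layers than VRAM can hold is exactly what risks the OOMs mentioned above.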

And this doesn't seem like a resolution, because it amounts to loading the model into RAM and running it on the CPU.
My problem is that the model isn't loaded at all.

Relevant log output


OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.9.0

GiteaMirror added the bug label 2026-04-29 04:50:16 -05:00
Author
Owner

@JitaekJo commented on GitHub (Jun 17, 2025):

I found the cause of the problem. When I feed an IMAGE, MISTRAL isn't loaded. Please fix this bug.


Reference: github-starred/ollama#53829