[GH-ISSUE #5508] Ollama running 2 instances #65480

Closed
opened 2026-05-03 21:27:07 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @electro199 on GitHub (Jul 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5508

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

When starting Ollama from the Start menu tray, it starts the Ollama service, which immediately loads the model under the process name ollama.exe, using around 4.5 GB of RAM. When the API is then used, it starts another program called ollama_llama_server.exe, which uses around 4 GB of RAM and almost 3 GB of VRAM.

The main issue is that when Ollama starts, it should be ollama_llama_server.exe loading the model, not ollama.exe.

In this screenshot I was using Ollama through the API; when I clicked on Ollama, it loaded the model alongside, ignoring the already running server.
![Screenshot 2024-07-06 014823](https://github.com/ollama/ollama/assets/109358640/359c1c28-9cf8-4df9-a964-149658f1532f)

I am not sure whether this is intended behavior. Also, how do I stop Ollama from loading the model at startup without turning off the service?
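For anyone hitting the same symptom, a quick way to check whether an Ollama server is already listening on the default port (11434) before launching a second copy is to query the version endpoint. This is a minimal sketch, assuming the default host/port and the standard `/api/version` endpoint:

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama host/port (assumed unchanged)


def server_running(url: str = OLLAMA_URL) -> bool:
    """Return True if an Ollama server already answers on this address."""
    try:
        with urllib.request.urlopen(f"{url}/api/version", timeout=2) as resp:
            info = json.load(resp)
            print(f"Ollama {info.get('version', '?')} is already running at {url}")
            return True
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    if not server_running():
        print("No server detected on the default port.")
```

If the check reports a running server, pointing clients (or the CLI, via the `OLLAMA_HOST` environment variable) at that address should avoid spawning a duplicate instance.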

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.1.48

GiteaMirror added the nvidia, needs more info labels 2026-05-03 21:27:07 -05:00
Author
Owner

@dhiltgen commented on GitHub (Jul 24, 2024):

How much VRAM do you have, and which model are you loading? What does `ollama ps` show after you load the model?
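For readers following along, the information `ollama ps` prints (model name, total size, and how much of it sits in VRAM) can also be read from the server's `/api/ps` endpoint. A minimal sketch, assuming the default address and the `size`/`size_vram` fields the endpoint currently returns:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama host/port (assumed unchanged)


def list_loaded_models(url: str = OLLAMA_URL) -> None:
    """Print the models the running server has loaded, mirroring `ollama ps`."""
    with urllib.request.urlopen(f"{url}/api/ps", timeout=5) as resp:
        data = json.load(resp)
    for m in data.get("models", []):
        size = m.get("size", 0)            # total model size in bytes
        size_vram = m.get("size_vram", 0)  # portion resident in GPU memory
        print(f"{m.get('name')}: {size / 1e9:.1f} GB total, "
              f"{size_vram / 1e9:.1f} GB in VRAM")


if __name__ == "__main__":
    list_loaded_models()
```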

Author
Owner

@electro199 commented on GitHub (Aug 6, 2024):

I am using an Nvidia 970 with 4 GB of VRAM and 32 GB of RAM.

I will provide more info later today.

Author
Owner

@electro199 commented on GitHub (Aug 6, 2024):

I am not able to reproduce the issue in v0.3.2. I will report back if I encounter the problem again.
