[GH-ISSUE #3867] Ctrl+D to exit is not stopping service #2395

Closed
opened 2026-04-12 12:42:37 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @nishithshowri006 on GitHub (Apr 24, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3867

What is the issue?

I have observed that pressing Ctrl+D or otherwise exiting the chat interface after running a model does not stop the ollama process. This in turn keeps RAM and VRAM tied up and unavailable for other tasks. I observed this behavior in both the WSL and Windows versions of Ollama.
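(A quick way to confirm what is holding memory, not part of the original report; exact process names vary by Ollama version:)

```sh
# Check whether an Ollama process is still running after exiting the chat
ps aux | grep -i '[o]llama'

# On an NVIDIA GPU, list the processes currently holding VRAM; a leftover
# ollama entry here means the model weights are still resident
nvidia-smi
```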

OS

Windows, WSL2

GPU

Nvidia

CPU

Intel

Ollama version

0.1.32

GiteaMirror added the question label 2026-04-12 12:42:37 -05:00
Author
Owner

@dhiltgen commented on GitHub (May 4, 2024):

Ollama is a client-server architecture, and the server continues to run after the client exits. By default, models are retained in memory for 5 minutes, but this is [configurable](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately). On Windows, the server is managed by the System Tray application. You can quit from the Tray, and that will unload the server and free up system and GPU memory immediately. On WSL2, it runs as a system service in Linux, which you can stop with `systemctl stop ollama`.
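For anyone landing here, the FAQ linked above describes how to release memory sooner rather than waiting out the 5-minute default. A minimal sketch (the model name `llama3` is just a placeholder; the `keep_alive` semantics are taken from that FAQ):

```sh
# Unload a model immediately by sending a request with keep_alive set to 0
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": 0}'

# Or change the server-wide default before starting the server manually,
# e.g. unload after 10 minutes; for the Linux service, set this in the
# systemd unit rather than the shell
export OLLAMA_KEEP_ALIVE=10m

# On WSL2/Linux, stop the background service entirely
sudo systemctl stop ollama
```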

Reference: github-starred/ollama#2395