[GH-ISSUE #4013] API Endpoint for Listing Loaded Running Models #64525

Closed
opened 2026-05-03 17:58:13 -05:00 by GiteaMirror · 3 comments

Originally created by @strikeoncmputrz on GitHub (Apr 29, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4013

It would be excellent to be able to interrogate the API to determine which models are running at any given time, rather than just seeing which checkpoints were pulled.

I use a variety of clients to interact with Ollama's API. I sometimes run models with a long `keep_alive` and assume others have similar use cases.

The only way I know of to identify a running model is through processes: `ps aux | grep -- '--model' | grep -v grep | grep -Po '(?<=--model\s).*' | cut -d ' ' -f1`. This will give you the full path to the model's blob. From there, you can compare that with the output of `ollama show --modelfile` (or the `/api/show` endpoint).

I checked the open issues and Reddit and didn't see any similar RFIs or requests.

I wrote a [bash script](https://github.com/strikeoncmputrz/LLM_Scripts/blob/main/show_loaded_models.sh) (depends on `jq`) that implements this as a POC.
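
The linked script isn't reproduced here, but a minimal sketch of the same approach (assuming a default local install at `http://localhost:11434` and `jq` on the PATH; the runner's `--model` argument is an implementation detail and may change between versions):

```bash
#!/usr/bin/env bash
# Sketch: map the running runner process back to a model name.
# Assumes a default local Ollama install and jq; not robust to
# multiple models being loaded at once.

# 1. Pull the blob path out of the runner's --model argument.
blob=$(ps aux | grep -- '--model' | grep -v grep | grep -Po '(?<=--model\s)\S+')

# 2. For each pulled model, fetch its modelfile via /api/show and
#    match its "FROM <blob path>" line against the running blob.
curl -s http://localhost:11434/api/tags | jq -r '.models[].name' |
while read -r name; do
  from=$(curl -s http://localhost:11434/api/show -d "{\"name\": \"$name\"}" \
           | jq -r '.modelfile' | grep -Po '(?<=^FROM ).*')
  [ "$from" = "$blob" ] && echo "loaded: $name ($blob)"
done
```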

GiteaMirror added the feature request label 2026-05-03 17:58:13 -05:00

@pdevine commented on GitHub (Apr 29, 2024):

I think this would be great, along with an `ollama ps` which shows which models are currently loaded in memory. It should include when the model's TTL is going to expire as well.
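
For illustration, a client-side check against such an endpoint could be a one-liner (the `/api/ps` path and `expires_at` field below are assumptions modeled on the existing `/api/*` routes, not a confirmed API):

```bash
# Hypothetical: list loaded models and when their keep_alive expires.
curl -s http://localhost:11434/api/ps \
  | jq -r '.models[] | "\(.name)\texpires_at=\(.expires_at)"'
```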


@strikeoncmputrz commented on GitHub (May 3, 2024):

TTL is a great idea!


@unmotivatedgene commented on GitHub (May 10, 2024):

Yes, please add this, especially with the new concurrency options. I want to know which models are sticking around and taking up all my VRAM.
