[PR #4906] [CLOSED] llm/server.go: Fix ollama ps show 100%GPU even use CPU as runner #42862

Closed
opened 2026-04-24 22:35:05 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/4906
Author: @coolljt0725
Created: 6/7/2024
Status: Closed

Base: main ← Head: fix_ollama_ps


📝 Commits (1)

  • c553a9d llm/server.go: Fix ollama ps show 100%GPU even use CPU as runner

📊 Changes

1 file changed (+6 additions, -0 deletions)

View changed files

📝 llm/server.go (+6 -0)

📄 Description

Even when the machine has a GPU, ollama can still fall back to using the CPU as the runner. But estimatedVRAM is not cleared, which makes ollama ps report the model as using 100% GPU even when it is running entirely on the CPU.

 ./ollama ps                                                                                                                                
NAME            ID              SIZE    PROCESSOR       UNTIL  
llama3:latest   365c0bd3c000    5.4 GB  100% GPU        4 minutes from now                                                                                                                            

But the server log shows that the CPU runner was used:

time=2024-06-07T20:36:06.301+08:00 level=INFO source=server.go:344 msg="starting llama server" cmd="/tmp/ollama3547827739/runners/cpu_avx2/ollama_llama_server --model /home/lei/.ollama/models/blobs/sha256-6a0746a1ec1aef3e7ec53868f220ff6e389f6f8ef87a01d77c96807de94ca2aa --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --verbose --parallel 1 --port 43665" 

This patch fixes the display by clearing estimatedVRAM, because the client calculates the GPU usage percentage from estimatedVRAM.
The patch also logs a warning when the CPU runner is used instead of the GPU.
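The shape of the fix can be sketched as follows. This is a minimal illustration, not the actual diff: the type and the finalizeRunner helper are hypothetical stand-ins for the relevant code in llm/server.go, which keeps a VRAM estimate that ollama ps later turns into a GPU percentage.

```go
package main

import (
	"fmt"
	"log"
)

// llmServer is a hypothetical stand-in for the server state in llm/server.go.
type llmServer struct {
	estimatedVRAM  uint64 // bytes the scheduler believes are resident on the GPU
	estimatedTotal uint64 // total estimated memory for the model
}

// finalizeRunner clears estimatedVRAM when a CPU runner was selected, so the
// client's percentage calculation (estimatedVRAM / estimatedTotal) reports
// 100% CPU instead of 100% GPU. It also logs a warning about the fallback.
func finalizeRunner(s *llmServer, runnerName string, isCPU bool) {
	if isCPU && s.estimatedVRAM > 0 {
		log.Printf("warning: falling back to CPU runner %q, clearing estimatedVRAM", runnerName)
		s.estimatedVRAM = 0
	}
}

func main() {
	s := &llmServer{estimatedVRAM: 5 << 30, estimatedTotal: 5 << 30}
	finalizeRunner(s, "cpu_avx2", true)
	fmt.Println(s.estimatedVRAM) // prints 0, so ps would show 100% CPU
}
```

With estimatedVRAM zeroed, the same percentage math the client already uses yields a CPU-only report without any client-side changes.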


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-24 22:35:05 -05:00

Reference: github-starred/ollama#42862