[GH-ISSUE #11888] Ollama consistently using CPU instead of Metal GPU on M2 Pro Mac (v0.11.4) #7891

Closed
opened 2026-04-12 20:02:39 -05:00 by GiteaMirror · 4 comments

Originally created by @boraozkum on GitHub (Aug 13, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11888

What is the issue?

I am experiencing an issue where Ollama (version 0.11.4) consistently uses 100% CPU for model inference on my M2 Pro Mac, despite the system having Metal 3 support. I have tried various troubleshooting steps, but the issue persists.

System Information:

  • Mac Model: M2 Pro
  • macOS Version: 14.6.1
  • Ollama Version: 0.11.4 (installed from official .dmg)

Hardware Information:

  • Processor: Apple M2 Pro
  • Memory: 16 GB
  • Graphics: Apple M2 Pro (Integrated)

Steps Taken & Observations:

  1. Initial Observation: `ollama ps` consistently reports "100% CPU" for all models (e.g., `deepseek-r1:14b`).
  2. Debug Logs: Debug logs from Ollama server startup consistently show "library=cpu" even after attempting to force Metal via environment variables:
    • `OLLAMA_LLM_LIBRARY=metal` (logged in `ollama_debug_metal.log`)
    • `OLLAMA_GPU=metal` (logged in `ollama_debug_gpu_metal.log`)
      The output in both logs remained "library=cpu".
  3. Installation Method: Initially, I had a Homebrew installation, which was uninstalled. The current installation is from the official `.dmg` file.
  4. Model Re-pull: I removed and re-pulled the `deepseek-r1:14b` model to rule out corruption. The issue persisted.
  5. Explicit `num_gpu` in Modelfile: Based on a web search, I created a custom Modelfile for `deepseek-r1:14b` (named `deepseek-r1:14b-gpu`) and explicitly set `PARAMETER num_gpu 61` (as DeepSeek-R1:14B has 61 hidden layers). After creating and running this new model, `ollama ps` still reported "100% CPU".
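The checks in steps 1–2 can be reproduced from a terminal. A minimal sketch, assuming the default macOS server log location (`~/.ollama/logs/server.log`, per Ollama's troubleshooting guide):

```shell
# Show what the scheduler reports for currently loaded models
# (the PROCESSOR column reads "100% GPU", "100% CPU", or a split).
ollama ps

# Check which compute library the server selected at startup;
# a Metal-capable build on Apple Silicon should not report library=cpu.
grep 'library=' ~/.ollama/logs/server.log | tail -n 5
```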

Expected Behavior:
Ollama should utilize the Metal GPU for model inference on my M2 Pro Mac, leading to improved performance.

Actual Behavior:
Ollama consistently uses 100% CPU for model inference, resulting in slower performance.

Additional Context:

  • No Docker installation is present.
  • Ollama executable path: /Applications/Ollama.app/Contents/Resources/ollama

I am happy to provide any further logs or information if needed.

Relevant log output


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.11.4

GiteaMirror added the bug label 2026-04-12 20:02:39 -05:00

@rick-github commented on GitHub (Aug 13, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will help in debugging.
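A hedged sketch of how to collect them on macOS (log path and `OLLAMA_DEBUG` variable from Ollama's troubleshooting docs):

```shell
# The macOS app writes server logs under ~/.ollama/logs.
cat ~/.ollama/logs/server.log

# For more detail, quit the app and run the server with debug logging:
OLLAMA_DEBUG=1 ollama serve
```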


@pdevine commented on GitHub (Aug 13, 2025):

Can you paste in the output of ollama -v? I'm wondering if there are some remnants of the old brew installed version floating around.
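One way to check for brew leftovers, sketched under the assumption of the standard Homebrew prefixes:

```shell
# Version of whichever binary is first on PATH:
ollama -v

# Every ollama on PATH, in resolution order; a Homebrew copy in
# /opt/homebrew/bin or /usr/local/bin ahead of the app's symlink
# would shadow the .dmg install.
which -a ollama

# Does Homebrew still track an ollama formula?
brew list --versions ollama 2>/dev/null || echo "no brew-installed ollama"
```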


@anicolao commented on GitHub (Aug 14, 2025):

I've seen this problem on a friend's MacBook where macOS was using Rosetta to run the Intel binary instead of a native arm64 (Darwin) build. I think we used `otool` to verify that the `ollama` binary was the wrong build.
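A hedged sketch of that check, using the executable path from the issue (`file` and `otool` are standard macOS tools; `file` is the simpler of the two):

```shell
# Path reported in the issue above.
BIN=/Applications/Ollama.app/Contents/Resources/ollama

# Architecture of the installed binary: an Apple Silicon build reports
# "arm64"; "x86_64" alone means the Intel build, which Rosetta runs
# without Metal acceleration.
file "$BIN"

# Equivalent check via the Mach-O header, as mentioned in the comment:
otool -hv "$BIN" | head -n 4
```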


@pdevine commented on GitHub (Aug 14, 2025):

cc @dhiltgen

Reference: github-starred/ollama#7891