[GH-ISSUE #3849] Ollama super slow on macOS M1 in Docker #64422

Closed
opened 2026-05-03 17:36:07 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @rb81 on GitHub (Apr 23, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3849

What is the issue?

Ollama running natively on macOS is excellent.
Ollama running on Docker is about 50% slower.
(Unsure if this is a bug or config issue, but I am running default settings.)

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.1.32

GiteaMirror added the question label 2026-05-03 17:36:07 -05:00
Author
Owner

@i-yoyocat commented on GitHub (Apr 24, 2024):

You need an RTX 4090, haha.

Author
Owner

@dhiltgen commented on GitHub (Apr 24, 2024):

When you run Ollama as a native Mac application on M1 (or newer) hardware, we run the LLM on the GPU.

Docker Desktop on Mac does NOT expose the Apple GPU to the container runtime; it only exposes an ARM CPU (or a virtual x86 CPU via Rosetta emulation). So when you run Ollama inside that container, it runs purely on the CPU, not utilizing your GPU hardware.

On PCs, NVIDIA and AMD support GPU pass-through into containers, so it is possible for Ollama in a container to access the GPU, but this is not possible on Apple hardware.
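The difference can be sketched as two ways of running the same model on an Apple Silicon Mac — one native (GPU via Metal) and one containerized (CPU-only). The `docker run` invocation below matches the official Ollama image's documented usage; the model name `llama3` is only illustrative:

```shell
# Native macOS app: the server runs directly on the host and can use
# the Apple GPU through Metal.
ollama serve &
ollama run llama3 "Hello"

# Docker on an Apple Silicon Mac: the container only sees the
# virtualized ARM CPU, so inference runs CPU-only. Note there is no
# --gpus flag that works here, unlike NVIDIA pass-through on Linux.
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama3 "Hello"
```

Both commands produce a response; only the native one is GPU-accelerated, which accounts for the roughly 50% slowdown reported above.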

Author
Owner

@rb81 commented on GitHub (Apr 25, 2024):

@dhiltgen Thanks for the clarification!

Author
Owner

@paolocattani commented on GitHub (Jul 22, 2024):

I was about to post the same question for M3.
Thanks @dhiltgen!

Author
Owner

@adnankaya commented on GitHub (Nov 12, 2024):

> Docker Desktop on Mac, does NOT expose the Apple GPU to the container runtime, it only exposes an ARM CPU (or virtual x86 CPU via Rosetta emulation) so when you run Ollama inside that container, it is running purely on CPU, not utilizing your GPU hardware.

I think the following sentence on https://ollama.com/blog/ollama-is-now-available-as-an-official-docker-image is **confusing**, @dhiltgen:

> We recommend running Ollama alongside Docker Desktop for macOS in order for Ollama to enable GPU acceleration for models.

An example of the Ollama Docker image using the macOS (Apple Silicon M1, M2, M3, …) GPU would be great.

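While no GPU-in-Docker example exists for Apple hardware, the native-vs-container gap can be measured directly. A sketch, assuming the container from the official image is already running as `ollama` and the illustrative model `llama3` is pulled in both environments; `--verbose` makes the CLI print timing statistics such as the eval rate:

```shell
# Native: prints stats including "eval rate" in tokens/s at the end.
ollama run llama3 --verbose "Explain containers in one sentence."

# Same prompt inside the container: expect a noticeably lower eval
# rate, since the container is CPU-only on Apple Silicon.
docker exec -it ollama ollama run llama3 --verbose "Explain containers in one sentence."
```

Comparing the two eval rates quantifies the slowdown described in this issue.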
Author
Owner

@dhiltgen commented on GitHub (Nov 12, 2024):

@adnankaya thanks for pointing that out. We'll get the blog cleaned up to make it clearer that running in Docker on ARM Macs won't support the GPU.

Reference: github-starred/ollama#64422