[GH-ISSUE #15770] Issue 0.21.x mlx runner error 500 #56561

Closed
opened 2026-04-29 11:01:25 -05:00 by GiteaMirror · 10 comments
Owner

Originally created by @Sko85ch on GitHub (Apr 23, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15770

What is the issue?

Error: 500 Internal Server Error: mlx runner failed: golang.org/x/sync@v0.17.0/errgroup/errgroup.go:78 +0x90

Relevant log output


OS

macOS

GPU

No response

CPU

Apple

Ollama version

0.21.1

GiteaMirror added the bug label 2026-04-29 11:01:25 -05:00

@albihasani94 commented on GitHub (Apr 23, 2026):

I am running into the same issue trying to run either qwen3.6:35b-a3b-coding-nvfp4 or qwen3.6:27b-coding-nvfp4 on an Apple M2 Pro with a 16-core GPU. In case it helps maintainers, the runner panics with:

panic: mlx: There is no Stream(gpu, 1) in current thread. at .../mlx-c-0.6.0/mlx/c/transforms.cpp:73
        panic: mlx: There is no Stream(gpu, 1) in current thread. at .../mlx-c-0.6.0/mlx/c/transforms.cpp:15

Before the panic, the log showed:

source=client.go:420 msg="starting mlx runner subprocess" model=qwen3.6:27b-coding-nvfp4
source=sched.go:561 msg="loaded runners" count=1
source=server.go:31 msg="MLX engine initialized" "MLX version"=0.31.2 device=gpu
...
source=client.go:147 msg="mlx runner is ready"
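The panic message ("There is no Stream(gpu, 1) in current thread") suggests the MLX GPU stream is bound to a specific OS thread, while Go goroutines can migrate between OS threads between calls. A common workaround pattern for thread-affine native libraries — a generic sketch only, not necessarily how the runner actually fixes this — is to lock one goroutine to an OS thread with runtime.LockOSThread and serialize all native calls through it:

```go
package main

import (
	"fmt"
	"runtime"
)

// call is one unit of work that must execute on the dedicated OS thread.
type call struct {
	fn   func() string
	done chan string
}

// startWorker spawns a goroutine pinned to a single OS thread and serves
// requests from the returned channel. Any thread-local native state (such
// as a GPU stream) created on that thread stays valid for every call.
func startWorker() chan<- call {
	calls := make(chan call)
	go func() {
		runtime.LockOSThread() // stay on this OS thread for all calls
		defer runtime.UnlockOSThread()
		for c := range calls {
			c.done <- c.fn()
		}
	}()
	return calls
}

func main() {
	worker := startWorker()
	c := call{
		fn:   func() string { return "ran on pinned thread" },
		done: make(chan string),
	}
	worker <- c
	fmt.Println(<-c.done)
}
```

Without pinning, a call that creates the stream and a later call that uses it may land on different OS threads, which would trigger exactly this kind of "not in current thread" panic; the errgroup frame in the 500 error is just where the runner surfaces the crash.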

@ohjakobsen commented on GitHub (Apr 23, 2026):

I'm seeing the same as well on an M4 Mac with gemma4:26b-nvfp4:

panic: mlx: There is no Stream(gpu, 2) in current thread. at /private/tmp/mlx-c-20260422-18059-3ipqfu/mlx-c-0.6.0/mlx/c/transforms.cpp:15

Same MLX version:

source=server.go:31 msg="MLX engine initialized" "MLX version"=0.31.2 device=gpu

@Sko85ch commented on GitHub (Apr 24, 2026):

The issue is also present in 0.21.2.


@StefVanG commented on GitHub (Apr 24, 2026):

Experienced the exact same error on Ollama v0.21.1 and v0.21.2 with qwen3.6:27b-mlx-bf16, qwen3.6:27b-coding-bf16, and qwen3.6:35b-a3b-coding-bf16. By contrast, qwen3-coder-next:q4_K_M, devstral-small-2:24b-instruct-2512-fp16, and devstral-2:123b-instruct-2512-q4_K_M run without problems. Hardware: M5 Max with a 40-core GPU and 128 GB RAM.


@antoyavin commented on GitHub (Apr 24, 2026):

Same issue here on ollama 0.21.2:

ollama run qwen3.5:35b-a3b-coding-nvfp4
pulling manifest
pulling model: 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  21 GB
writing manifest
success
>>> hello
Error: 500 Internal Server Error: mlx runner failed: golang.org/x/sync@v0.17.0/errgroup/errgroup.go:78 +0x90

M5 with a 10-core GPU and 32 GB RAM


@oldfarth commented on GitHub (Apr 24, 2026):

Same error as @antoyavin.
M1 Max, 64 GB RAM

==> ollama: stable 0.21.2

==> mlx-c: stable 0.6.0

==> mlx: stable 0.31.2
==> Dependencies
Required (1): python@3.14
Recursive Runtime (9)
Dependents: 2
==> Requirements
Build: Xcode >= 15.0 (on macOS)
Required: arm64 architecture, macOS >= 14 (or Linux), macOS

Thank you @antoyavin. https://github.com/ollama/ollama/pull/15793 solved the problem.


@IAmBrendanL commented on GitHub (Apr 24, 2026):

I'm also running into the same issue on an M5 Pro with 20 GPU cores when trying to use qwen3.6:27b-coding-mxfp8:
Error: error running model: 500 Internal Server Error: mlx runner failed: golang.org/x/sync@v0.17.0/errgroup/errgroup.go:78 +0x90


@andreinknv commented on GitHub (Apr 25, 2026):

Here is a fix for the issue: https://github.com/ollama/ollama/pull/15793


@2pl commented on GitHub (Apr 25, 2026):

Same issue here.

  • Apple M4 / Tahoe 26.2 / 32GB
  • ollama installed from brew:
    • ollama version is 0.21.2
    • mlx 0.31.2
ollama run qwen3.6:27b-coding-nvfp4 --verbose <<< "Hi"

Gives

Error: 500 Internal Server Error: mlx runner failed: golang.org/x/sync@v0.17.0/errgroup/errgroup.go:78 +0x90

In ollama logs I have

panic: mlx: There is no Stream(gpu, 1) in current thread. at /private/tmp/mlx-c-20260422-18059-3ipqfu/mlx-c-0.6.0/mlx/c/transforms.cpp:73

I have not tried the fix pointed to by @andreinknv yet.


@Sko85ch commented on GitHub (Apr 25, 2026):

Solved. thx!


Reference: github-starred/ollama#56561