[GH-ISSUE #13833] Server error when attempting image generation (M5 mac) #55570

Closed
opened 2026-04-29 09:26:03 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @jeastman on GitHub (Jan 22, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/13833

What is the issue?

Attempted to generate an image using the x/z-image-turbo model and received a server error.

I am running Ollama 0.14.3 on an M5 MacBook Pro (latest macOS).

❯ ollama --version
ollama version is 0.14.3

When I attempt the image generation, the API returns HTTP 500.

❯ ollama run x/z-image-turbo "A cat with a hello sign"
Error: 500 Internal Server Error: Post "http://127.0.0.1:57107/completion": EOF
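The same failure can be reproduced against the REST API directly, which also surfaces the server's error body rather than the bare EOF from the CLI. Below is a minimal Python sketch, assuming the default Ollama endpoint at 127.0.0.1:11434; `build_generate_payload` is just an illustrative helper, not part of any Ollama client library:

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # default Ollama endpoint


def build_generate_payload(model: str, prompt: str) -> bytes:
    """Encode a non-streaming /api/generate request body."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()


def generate(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_generate_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        # On a 500 the response body often carries the underlying runner error.
        return f"HTTP {err.code}: {err.read().decode()}"


if __name__ == "__main__":
    print(generate("x/z-image-turbo", "A cat with a hello sign"))
```

On an affected machine this should print an HTTP 500 body corresponding to the MLX kernel failure shown in the logs below.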

The logs show the following error:

MLX error: [metal::Device] Unable to load kernel affine_qmm_t_nax_bfloat16_t_gs_32_b_8_bm64_bn64_bk64_wm2_wn2_alN_true_batch_0

A discussion at lmstudio-ai (https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues/1356) leads me to believe this could be an M5-specific issue.

Relevant log output

time=2026-01-21T20:15:51.918-06:00 level=INFO source=routes.go:1629 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/jeastman/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2026-01-21T20:15:51.957-06:00 level=INFO source=images.go:501 msg="total blobs: 3374"
time=2026-01-21T20:15:51.960-06:00 level=INFO source=images.go:508 msg="total unused blobs removed: 0"
time=2026-01-21T20:15:51.961-06:00 level=INFO source=routes.go:1682 msg="Listening on 127.0.0.1:11434 (version 0.14.3)"
time=2026-01-21T20:15:51.961-06:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-01-21T20:15:51.961-06:00 level=INFO source=server.go:429 msg="starting runner" cmd="/Applications/Ollama.app/Contents/Resources/ollama runner --ollama-engine --port 56914"
time=2026-01-21T20:15:52.083-06:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=Metal compute=0.0 name=Metal description="Apple M5" libdirs="" driver=0.0 pci_id="" type=discrete total="25.0 GiB" available="25.0 GiB"
[GIN] 2026/01/21 - 20:15:52 | 200 |      27.041µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2026/01/21 - 20:15:52 | 200 |          50µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/01/21 - 20:15:52 | 200 |   66.450334ms |       127.0.0.1 | POST     "/api/show"
time=2026-01-21T20:15:52.426-06:00 level=INFO source=server.go:136 msg="starting image runner subprocess" exe=/Applications/Ollama.app/Contents/Resources/ollama model=x/z-image-turbo:latest port=56921
time=2026-01-21T20:15:52.456-06:00 level=WARN source=server.go:129 msg=image-runner msg="2026/01/21 20:15:52 runner.go:78: INFO MLX library initialized"
time=2026-01-21T20:15:52.456-06:00 level=WARN source=server.go:129 msg=image-runner msg="2026/01/21 20:15:52 runner.go:79: INFO starting image runner model=x/z-image-turbo:latest port=56921"
time=2026-01-21T20:15:52.460-06:00 level=INFO source=server.go:122 msg=image-runner msg="Loading Z-Image model from manifest: x/z-image-turbo:latest..."
time=2026-01-21T20:15:52.460-06:00 level=WARN source=server.go:129 msg=image-runner msg="2026/01/21 20:15:52 runner.go:91: INFO detected model type type=ZImagePipeline"
time=2026-01-21T20:15:52.601-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading tokenizer... ✓"
time=2026-01-21T20:15:53.530-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading text encoder... ✓"
time=2026-01-21T20:15:53.531-06:00 level=INFO source=server.go:122 msg=image-runner msg="  (4.5 GB, peak 4.5 GB)"
time=2026-01-21T20:15:54.796-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading transformer... ✓"
time=2026-01-21T20:15:54.796-06:00 level=INFO source=server.go:122 msg=image-runner msg="  (11.7 GB, peak 11.7 GB)"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading conv_in... ✓"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading mid block... ✓"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading up blocks... ✓ [4 blocks]"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading conv_norm_out... ✓"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loading conv_out... ✓"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  (11.9 GB, peak 11.9 GB)"
time=2026-01-21T20:15:54.869-06:00 level=INFO source=server.go:122 msg=image-runner msg="  Loaded in 2.41s (11.9 GB VRAM)"
time=2026-01-21T20:15:54.869-06:00 level=WARN source=server.go:129 msg=image-runner msg="2026/01/21 20:15:54 runner.go:138: INFO image runner listening addr=127.0.0.1:56921"
time=2026-01-21T20:15:54.928-06:00 level=INFO source=server.go:207 msg="image runner is ready" port=56921
time=2026-01-21T20:15:54.946-06:00 level=INFO source=server.go:122 msg=image-runner msg="MLX error: [metal::Device] Unable to load kernel affine_qmm_t_nax_bfloat16_t_gs_32_b_8_bm64_bn64_bk64_wm2_wn2_alN_true_batch_0"
time=2026-01-21T20:15:54.946-06:00 level=INFO source=server.go:122 msg=image-runner msg=" at /Users/runner/work/ollama/ollama/build/_deps/mlx-c-src/mlx/c/transforms.cpp:73"
[GIN] 2026/01/21 - 20:15:55 | 500 |  2.722190167s |       127.0.0.1 | POST     "/api/generate"
time=2026-01-21T20:20:55.100-06:00 level=INFO source=server.go:305 msg="stopping image runner subprocess" pid=51208

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.14.3

GiteaMirror added the macos, bug labels 2026-04-29 09:26:03 -05:00
Author
Owner

@kolargol commented on GitHub (Jan 24, 2026):

The same happens on ollama 0.15:

ollama --version
ollama version is 0.15.0

❯ ollama run x/flux2-klein:9b "/Users/zbyszek/man-flying-rocket-with-a-dog-rocket-is-made-of-che-20260115-202651.png replace dog with the shark"
Added image '/Users/zbyszek/man-flying-rocket-with-a-dog-rocket-is-made-of-che-20260115-202651.png'
Error: 500 Internal Server Error: model does not support image editing

I am pulling the latest x/flux2-klein:9b.

Author
Owner

@ridvan70 commented on GitHub (Jan 25, 2026):

A different but still an error on ollama 0.15.1:

@Mac-M2 ~ % ollama --version
ollama version is 0.15.1
@Mac-M2 ~ % ollama run x/flux-klein "a cat holding a sign that says hello world"
pulling manifest
Error: pull model manifest: file does not exist
@Mac-M2 ~ % ollama run x/flux-klein:9b "a cat holding a sign that says hello world"
pulling manifest
Error: pull model manifest: file does not exist

#################
NAME                              ID              SIZE      MODIFIED
x/flux2-klein:9b                  5fd79ad76b03    11 GB     3 seconds ago
x/flux2-klein:latest              8c7f37810489    5.7 GB    9 seconds ago
Author
Owner

@ed-norris commented on GitHub (Jan 27, 2026):

> Different but still error on llama 0.15.1
>
> @Mac-M2 ~ % ollama --version
> ollama version is 0.15.1
> @Mac-M2 ~ % ollama run x/flux-klein "a cat holding a sign that says hello world"
> pulling manifest
> Error: pull model manifest: file does not exist
> @Mac-M2 ~ % ollama run x/flux-klein:9b "a cat holding a sign that says hello world"
> pulling manifest
> Error: pull model manifest: file does not exist
>
> #################
> NAME                              ID              SIZE      MODIFIED
> x/flux2-klein:9b                  5fd79ad76b03    11 GB     3 seconds ago
> x/flux2-klein:latest              8c7f37810489    5.7 GB    9 seconds ago

Is that a typo - flux-klein instead of flux2-klein ?

Author
Owner

@ridvan70 commented on GitHub (Jan 27, 2026):

> > Different but still error on llama 0.15.1
> >
> > @Mac-M2 ~ % ollama --version
> > ollama version is 0.15.1
> > @Mac-M2 ~ % ollama run x/flux-klein "a cat holding a sign that says hello world"
> > pulling manifest
> > Error: pull model manifest: file does not exist
> > @Mac-M2 ~ % ollama run x/flux-klein:9b "a cat holding a sign that says hello world"
> > pulling manifest
> > Error: pull model manifest: file does not exist
> >
> > #################
> > NAME                              ID              SIZE      MODIFIED
> > x/flux2-klein:9b                  5fd79ad76b03    11 GB     3 seconds ago
> > x/flux2-klein:latest              8c7f37810489    5.7 GB    9 seconds ago
>
> Is that a typo - flux-klein instead of flux2-klein ?

Fine observation, thanks. I copy/pasted it from the Ollama page and did not verify it. I will correct and retry. Done, it works.

Author
Owner

@jeastman commented on GitHub (Mar 3, 2026):

I tested this again today and it appears to now be working.

ollama --version
ollama version is 0.17.5

Thank you!

Closing the ticket.
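Based on the report above that 0.17.5 works, a minimal sketch for checking programmatically whether an installed Ollama is at or above that version. Note the threshold is taken only from this thread, not from a changelog, so treat it as an assumption:

```python
def parse_version(output: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from `ollama --version` output,
    e.g. 'ollama version is 0.17.5' -> (0, 17, 5)."""
    return tuple(int(part) for part in output.strip().rsplit(" ", 1)[-1].split("."))


def has_image_gen_fix(output: str, fixed: tuple[int, ...] = (0, 17, 5)) -> bool:
    """True if the reported version is at or above the version this thread
    reports as working (an assumption drawn from this issue, not a changelog)."""
    return parse_version(output) >= fixed
```

Usage: feed it the output of `ollama --version`; tuple comparison handles multi-digit components (e.g. 0.14.3 vs 0.17.5) correctly, which naive string comparison would not.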


Reference: github-starred/ollama#55570