[GH-ISSUE #13746] new: Image generation models (experimental) - not working #9010

Open
opened 2026-04-12 21:50:06 -05:00 by GiteaMirror · 15 comments
Owner

Originally created by @leder11011 on GitHub (Jan 16, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/13746

Originally assigned to: @jmorganca on GitHub.

What is the issue?

Hello all,
thank you for Ollama v0.14.1!
Unfortunately I get an error on Linux with an NVIDIA GPU or on CPU:

OLLAMA_NUM_CTX=2048 ollama run x/z-image-turbo                                                             
>>> cat in the rain

Error: 500 Internal Server Error: image runner failed: 2026/01/16 09:00:39 runner.go:65: INFO starting image runner model=x/z-image-turbo port=34187 (exit: exit status 255)
>>> /bye


  Model
    architecture    ZImagePipeline
    parameters      10.3B
    quantization    FP8
    requires        0.14.0

  Capabilities
    image

Relevant log output

Jan 16 09:02:35 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:35.999+01:00 level=INFO source=server.go:149 msg="starting ollama-mlx image runner subprocess" exe=/usr/local/bin/ollama-mlx model=x/z-image-turbo port=36725
Jan 16 09:02:36 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:36.171+01:00 level=WARN source=server.go:141 msg=image-runner msg="2026/01/16 09:02:36 runner.go:65: INFO starting image runner model=x/z-image-turbo port=36725"
Jan 16 09:02:36 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:36.174+01:00 level=INFO source=server.go:134 msg=image-runner msg="Loading Z-Image model from manifest: x/z-image-turbo..."
Jan 16 09:02:36 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:36.439+01:00 level=INFO source=server.go:134 msg=image-runner msg="  Loading tokenizer... ✓"
Jan 16 09:02:37 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:37.756+01:00 level=INFO source=server.go:134 msg=image-runner msg="  Loading text encoder... ✓"
Jan 16 09:02:38 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:38.722+01:00 level=INFO source=server.go:134 msg=image-runner msg="  (11.3 GB, peak 11.3 GB)"
Jan 16 09:02:40 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:02:40.936+01:00 level=INFO source=server.go:134 msg=image-runner msg="  Loading transformer... ✓"
Jan 16 09:04:36 gerrit-systemproductname ollama[17756]: time=2026-01-16T09:04:36.000+01:00 level=INFO source=server.go:320 msg="stopping image runner subprocess" pid=27349
Jan 16 09:04:36 gerrit-systemproductname ollama[17756]: [GIN] 2026/01/16 - 09:04:36 | 500 |          2m0s |       127.0.0.1 | POST     "/api/generate"
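
The runner's own stderr is truncated here; on a systemd-based install, the full subprocess output can usually be recovered with journalctl. A minimal sketch, assuming the default ollama service unit name:

journalctl -u ollama --no-pager | tail -n 100    # last lines around the crash
journalctl -u ollama -f                          # follow live while reproducing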

OS

No response

GPU

No response

CPU

No response

Ollama version

No response
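
The failure can also be reproduced outside the interactive CLI via the standard generate endpoint, which helps separate CLI problems from runner crashes. A minimal sketch, assuming the default local port:

curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "cat in the rain"
}'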

GiteaMirror added the bug label 2026-04-12 21:50:06 -05:00
Author
Owner

@1101728133 commented on GitHub (Jan 16, 2026):

Me too.

Author
Owner

@Sekousuke commented on GitHub (Jan 16, 2026):

The computer's memory is insufficient.

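If insufficient memory is the suspicion, a quick way to check headroom on Linux before the model loads (a minimal sketch; nvidia-smi applies to NVIDIA GPUs only):

free -h       # system RAM; the text encoder alone peaks at ~11.3 GB in the logs above
nvidia-smi    # GPU VRAM and current usage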
Author
Owner

@leder11011 commented on GitHub (Jan 16, 2026):

I got a different error when omitting OLLAMA_NUM_CTX=2048

Author
Owner

@Sekousuke commented on GitHub (Jan 17, 2026):

OLLAMA_NUM_CTX is used for the text model, not for the image model. Moreover, it was terminated by the macOS system, not because you restricted it.

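If that is right, the variable can simply be dropped for image models. A minimal sketch, using the same model as above:

# Per the comment above, OLLAMA_NUM_CTX only tunes the text-model context window
ollama run x/z-image-turbo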
Author
Owner

@ghmer commented on GitHub (Jan 18, 2026):

I can confirm that image generation works with both models (fp8/bf16) if the appropriate system resources are available.

Author
Owner

@vansatchen commented on GitHub (Jan 18, 2026):

> I can confirm that image generation works with both models (fp8/bf16) if the appropriate system resources are available.

ollama run x/z-image-turbo
a cat

Error: 500 Internal Server Error: image runner exited unexpectedly: exit status 255

ollama -v
ollama version is 0.14.1

free -h
       total  used   free   shared  buff/cache  available
Mem:   1.0Ti  31Gi   8.6Gi  182Mi   974Gi       976Gi
Swap:  8.0Gi  1.8Mi  8.0Gi

Is this memory enough?

Author
Owner

@ghmer commented on GitHub (Jan 18, 2026):

> > I can confirm that image generation works with both models (fp8/bf16) if the appropriate system resources are available.
>
> ollama run x/z-image-turbo
> a cat
> Error: 500 Internal Server Error: image runner exited unexpectedly: exit status 255
>
> ollama -v
> ollama version is 0.14.1
>
> free -h
>        total  used   free   shared  buff/cache  available
> Mem:   1.0Ti  31Gi   8.6Gi  182Mi   974Gi       976Gi
> Swap:  8.0Gi  1.8Mi  8.0Gi
>
> Is this memory enough?

I mean, this is plenty, but is it a macOS system?

Author
Owner

@vansatchen commented on GitHub (Jan 19, 2026):

> I mean, this is plenty, but is it a macOS system?

It's Linux (Ubuntu 24.04), but the release notes of v0.14.1 (https://github.com/ollama/ollama/releases/tag/v0.14.1) say:

> Image generation models (experimental)
>
> Experimental image generation models are available for macOS and Linux (CUDA) in Ollama:
>
> Available models
>
> Z-Image-Turbo (https://ollama.com/x/z-image-turbo)

Author
Owner

@ghmer commented on GitHub (Jan 19, 2026):

And according to the README of the model you linked to:

> Note: image generation models only work on macOS.

I guess one of these docs is not right, but given that I can run both models on my MacBook, I'd further guess that this model, at least, only works on macOS. Aren't they even relying on the MLX framework?

Author
Owner

@BOTO145 commented on GitHub (Jan 23, 2026):

Image generation will come to Windows later, if your OS is Windows (even the latest).

Author
Owner

@rick-github commented on GitHub (Feb 8, 2026):

Linux docker: https://github.com/ollama/ollama/issues/14016#issuecomment-3831904450
Linux native: https://github.com/ollama/ollama/issues/14046#issuecomment-3846867534

Only bf16 quants are supported for flux; minimum memory 16G: https://github.com/ollama/ollama/issues/14046#issuecomment-3847105340
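
For context, the Docker route referenced above starts from the standard CUDA-enabled Ollama container. A sketch of the documented invocation (not the specific fix in the linked comments; requires the NVIDIA Container Toolkit):

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama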
Author
Owner

@BOTO145 commented on GitHub (Feb 9, 2026):

[screenshot attached]

Either this is the reason, or, if you downloaded it from Hugging Face, you don't have enough VRAM (that's what I think).

Author
Owner

@rick-github commented on GitHub (Feb 9, 2026):

Image generation works on Linux using the fixes shown in https://github.com/ollama/ollama/issues/13746#issuecomment-3868256335

Author
Owner

@ckuethe commented on GitHub (Feb 9, 2026):

For the benefit of other Linux/Docker/ROCm users: that looks like it depends on CUDA.

It looks like MLX doesn't work with ROCm yet, based on https://github.com/ml-explore/mlx/issues/2556 and https://github.com/ml-explore/mlx/pull/2300

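One way to check what the bundled image runner was built against is to inspect its linked libraries. A minimal sketch, using the binary path from the log output above (the path may differ per install):

ldd /usr/local/bin/ollama-mlx | grep -Ei 'cuda|cublas|hip|rocm'    # CUDA hits with no ROCm hits would be consistent with the above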
Author
Owner

@Geramy commented on GitHub (Apr 5, 2026):

> For the benefit of other Linux/Docker/ROCm users: that looks like it depends on CUDA.
>
> It looks like MLX doesn't work with ROCm yet, based on ml-explore/mlx#2556 and ml-explore/mlx#2300

There is a ROCm pull request waiting for some advice from the maintainers at MLX.

Reference: github-starred/ollama#9010