[GH-ISSUE #14234] Image generation fails in ollama 0.16.1 #9268

Closed
opened 2026-04-12 22:08:25 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @thomassresearch on GitHub (Feb 13, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14234

What is the issue?

Hi, when I run `ollama run x/z-image-turbo:latest` (the same happens with flux2), I get the following on macOS (Apple Silicon):

ollama run x/z-image-turbo:latest
Error: failed to load model: 500 Internal Server Error: mlx runner failed: model.norm.weight (exit: exit status 1)

The same happens with ollama 0.16.0. It worked a few versions ago; I don't remember exactly which one, but it was not long ago.
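For anyone triaging, a sketch of the steps to reproduce and to rule out a corrupted download (assumes a standard install; `OLLAMA_DEBUG=1` enables verbose logging in stock ollama, and re-pulling forces a fresh, checksum-verified download of the model blobs):

```shell
# Confirm the affected version
ollama --version

# Remove and re-pull the model to rule out a corrupted or partial download
ollama rm x/z-image-turbo:latest
ollama pull x/z-image-turbo:latest

# Reproduce with verbose logging from the mlx runner subprocess
OLLAMA_DEBUG=1 ollama run x/z-image-turbo:latest
```

No `<test>` is attached because these commands require a running ollama daemon and the pulled model.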

Relevant log output

time=2026-02-13T12:18:35.204+01:00 level=INFO source=server.go:148 msg="starting mlx runner subprocess" exe=/Applications/Ollama.app/Contents/Resources/ollama model=x/z-image-turbo:latest port=63469 mode=imagegen
time=2026-02-13T12:18:35.240+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="time=2026-02-13T12:18:35.240+01:00 level=INFO msg=\"MLX library initialized\""
time=2026-02-13T12:18:35.243+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="time=2026-02-13T12:18:35.243+01:00 level=INFO msg=\"starting mlx runner\" model=x/z-image-turbo:latest port=63469 mode=imagegen"
time=2026-02-13T12:18:35.248+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="time=2026-02-13T12:18:35.248+01:00 level=INFO msg=\"detected image model type\" type=ZImagePipeline"
time=2026-02-13T12:18:35.248+01:00 level=INFO source=server.go:134 msg=mlx-runner msg="Loading Z-Image model from manifest: x/z-image-turbo:latest..."
time=2026-02-13T12:18:35.455+01:00 level=INFO source=server.go:134 msg=mlx-runner msg="  Loading tokenizer... ✓"
time=2026-02-13T12:18:35.728+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="Error: failed to create server: failed to load image model: failed to load zimage model: text encoder: load module: LoadModule: missing weights:"
time=2026-02-13T12:18:35.728+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.embed_tokens.weight"
time=2026-02-13T12:18:35.728+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.0.self_attn.q_proj: failed to load quantized weight model.layers.0.self_attn.q_proj: tensor \"model.layers.0.self_attn.q_proj.weight\" not found"
time=2026-02-13T12:18:35.728+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.0.self_attn.k_proj: failed to load quantized weight model.layers.0.self_attn.k_proj: tensor \"model.layers.0.self_attn.k_proj.weight\" not found"
time=2026-02-13T12:18:35.728+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.0.self_attn.v_proj: failed to load quantized weight model.layers.0.self_attn.v_proj: tensor \"model.layers.0.self_attn.v_proj.weight\" not found"
time=2026-02-13T12:18:35.728+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.0.self_attn.o_proj: failed to load quantized weight model.layers.0.self_attn.o_proj: tensor \"model.layers.0.self_attn.o_proj.weight\" not found"

...

time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.k_proj: failed to load quantized weight model.layers.35.self_attn.k_proj: tensor \"model.layers.35.self_attn.k_proj.weight\" not found"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.v_proj: failed to load quantized weight model.layers.35.self_attn.v_proj: tensor \"model.layers.35.self_attn.v_proj.weight\" not found"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.o_proj: failed to load quantized weight model.layers.35.self_attn.o_proj: tensor \"model.layers.35.self_attn.o_proj.weight\" not found"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.q_norm.weight"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.k_norm.weight"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.mlp.gate_proj: failed to load quantized weight model.layers.35.mlp.gate_proj: tensor \"model.layers.35.mlp.gate_proj.weight\" not found"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.mlp.up_proj: failed to load quantized weight model.layers.35.mlp.up_proj: tensor \"model.layers.35.mlp.up_proj.weight\" not found"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.mlp.down_proj: failed to load quantized weight model.layers.35.mlp.down_proj: tensor \"model.layers.35.mlp.down_proj.weight\" not found"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.input_layernorm.weight"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.post_attention_layernorm.weight"
time=2026-02-13T12:18:35.730+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.norm.weight"
time=2026-02-13T12:18:35.811+01:00 level=INFO source=server.go:134 msg=mlx-runner msg="  Loading text encoder... "
time=2026-02-13T12:18:35.811+01:00 level=INFO source=server.go:363 msg="stopping mlx runner subprocess" pid=81758

OS

macOS Sequoia 15.7.3

GPU

Apple M2 Max

CPU

Apple M2 Max

Ollama version

0.16.1

GiteaMirror added the bug label 2026-04-12 22:08:25 -05:00

Reference: github-starred/ollama#9268