[GH-ISSUE #14249] x/z-image-turbo FP8 model fails to load on v0.16.0/v0.16.1 - MLX runner regression #9276

Closed
opened 2026-04-12 22:08:45 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @NerdSnipe on GitHub (Feb 14, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14249

Description

The x/z-image-turbo (FP8) model fails to load on Ollama v0.16.0 and v0.16.1 with an MLX runner error. The same model works correctly
on v0.15.6 with no changes to the model or system.

Error

Error: failed to load model: 500 Internal Server Error: mlx runner failed: model.norm.weight (exit: exit status 1)

Server logs show the text encoder failing to map weight tensors: every layer (0-35) reports tensor
"model.layers.X.self_attn.q_proj.weight" not found, even though all 2,211 blob files are present and intact on disk.

Full error from server log:
Error: failed to create server: failed to load image model: failed to load zimage model: text encoder: load module: LoadModule:
missing weights:
model.embed_tokens.weight
model.layers.0.self_attn.q_proj: failed to load quantized weight model.layers.0.self_attn.q_proj: tensor
"model.layers.0.self_attn.q_proj.weight" not found
... (all layers 0-35, all projections)
model.norm.weight

Steps to Reproduce

  1. Install Ollama v0.16.1 on macOS (Apple Silicon)
  2. ollama pull x/z-image-turbo
  3. ollama run x/z-image-turbo "a red cat"
  4. Observe 500 error

Workaround

Downgrading to Ollama v0.15.6 resolves the issue. The model loads and generates images without any changes.

Environment

  • Ollama version (broken): v0.16.0 and v0.16.1
  • Ollama version (working): v0.15.6
  • OS: macOS (Darwin 25.2.0)
  • Hardware: Apple M3 Max, 36 GB RAM
  • Model: x/z-image-turbo:latest (FP8, 10.3B params, ~12GB)

Analysis

The v0.16.0 release notes mention "Improvements to Ollama's MLX runner to support GLM-4.7-Flash." This MLX runner change appears to
have broken weight tensor name mapping for Z-Image pipeline models: the tokenizer loads successfully, but the text encoder
module cannot locate any of its weight tensors despite the blobs existing on disk.
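To make the hypothesis concrete, here is a minimal, hypothetical sketch of this failure mode: a loader that looks up tensors by fully-qualified name finds nothing when the stored weights use a different naming scheme. The function names and the remapped `text_encoder.` prefix are illustrative assumptions, not Ollama's actual internals.

```python
# Hypothetical sketch of a name-mapping regression: the loader expects one
# naming scheme, the stored weights use another, so every lookup misses.
# All names here are illustrative, not Ollama's real tensor layout.

def missing_text_encoder_weights(stored_tensors, num_layers=36):
    """Return expected tensor names that are absent from stored_tensors."""
    expected = ["model.embed_tokens.weight", "model.norm.weight"]
    for i in range(num_layers):
        for proj in ("q_proj", "k_proj", "v_proj", "o_proj"):
            expected.append(f"model.layers.{i}.self_attn.{proj}.weight")
    return [name for name in expected if name not in stored_tensors]

# Suppose a runner update stored weights under a remapped prefix:
stored = {
    f"text_encoder.layers.{i}.self_attn.{proj}.weight": b"..."
    for i in range(36)
    for proj in ("q_proj", "k_proj", "v_proj", "o_proj")
}

missing = missing_text_encoder_weights(stored)
# Every expected name misses, mirroring the "all layers 0-35, all
# projections" plus embed_tokens and norm entries in the server log.
```

This would explain why the blobs are intact on disk yet the LoadModule step reports every weight as missing, and why downgrading the runner (not re-pulling the model) fixes it.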


Reference: github-starred/ollama#9276