[GH-ISSUE #14231] Image generation issue on Tahoe 26.3 (mlx runner failed) #55778

Closed
opened 2026-04-29 09:43:26 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @slyapustin on GitHub (Feb 13, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14231

Originally assigned to: @pdevine on GitHub.

What is the issue?

I'm not sure whether this is related to the recent Ollama update or the macOS update, but I noticed it after both were installed.
It was working fine before either update.

sergey@Ultra ~ % ollama run x/z-image-turbo "a simple photo of an orange on a table"

Error: 500 Internal Server Error: mlx runner failed:   model.norm.weight (exit: exit status 1)

Relevant log output

time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.mlp.gate_proj: failed to load quantized weight model.layers.18.mlp.gate_proj: tensor \"model.layers.18.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.mlp.up_proj: failed to load quantized weight model.layers.18.mlp.up_proj: tensor \"model.layers.18.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.mlp.down_proj: failed to load quantized weight model.layers.18.mlp.down_proj: tensor \"model.layers.18.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.18.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.self_attn.q_proj: failed to load quantized weight model.layers.19.self_attn.q_proj: tensor \"model.layers.19.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.self_attn.k_proj: failed to load quantized weight model.layers.19.self_attn.k_proj: tensor \"model.layers.19.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.self_attn.v_proj: failed to load quantized weight model.layers.19.self_attn.v_proj: tensor \"model.layers.19.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.self_attn.o_proj: failed to load quantized weight model.layers.19.self_attn.o_proj: tensor \"model.layers.19.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.mlp.gate_proj: failed to load quantized weight model.layers.19.mlp.gate_proj: tensor \"model.layers.19.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.mlp.up_proj: failed to load quantized weight model.layers.19.mlp.up_proj: tensor \"model.layers.19.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.mlp.down_proj: failed to load quantized weight model.layers.19.mlp.down_proj: tensor \"model.layers.19.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.19.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.self_attn.q_proj: failed to load quantized weight model.layers.20.self_attn.q_proj: tensor \"model.layers.20.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.self_attn.k_proj: failed to load quantized weight model.layers.20.self_attn.k_proj: tensor \"model.layers.20.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.self_attn.v_proj: failed to load quantized weight model.layers.20.self_attn.v_proj: tensor \"model.layers.20.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.self_attn.o_proj: failed to load quantized weight model.layers.20.self_attn.o_proj: tensor \"model.layers.20.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.mlp.gate_proj: failed to load quantized weight model.layers.20.mlp.gate_proj: tensor \"model.layers.20.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.mlp.up_proj: failed to load quantized weight model.layers.20.mlp.up_proj: tensor \"model.layers.20.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.mlp.down_proj: failed to load quantized weight model.layers.20.mlp.down_proj: tensor \"model.layers.20.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.20.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.self_attn.q_proj: failed to load quantized weight model.layers.21.self_attn.q_proj: tensor \"model.layers.21.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.self_attn.k_proj: failed to load quantized weight model.layers.21.self_attn.k_proj: tensor \"model.layers.21.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.self_attn.v_proj: failed to load quantized weight model.layers.21.self_attn.v_proj: tensor \"model.layers.21.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.self_attn.o_proj: failed to load quantized weight model.layers.21.self_attn.o_proj: tensor \"model.layers.21.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.mlp.gate_proj: failed to load quantized weight model.layers.21.mlp.gate_proj: tensor \"model.layers.21.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.mlp.up_proj: failed to load quantized weight model.layers.21.mlp.up_proj: tensor \"model.layers.21.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.mlp.down_proj: failed to load quantized weight model.layers.21.mlp.down_proj: tensor \"model.layers.21.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.21.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.self_attn.q_proj: failed to load quantized weight model.layers.22.self_attn.q_proj: tensor \"model.layers.22.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.self_attn.k_proj: failed to load quantized weight model.layers.22.self_attn.k_proj: tensor \"model.layers.22.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.self_attn.v_proj: failed to load quantized weight model.layers.22.self_attn.v_proj: tensor \"model.layers.22.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.self_attn.o_proj: failed to load quantized weight model.layers.22.self_attn.o_proj: tensor \"model.layers.22.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.mlp.gate_proj: failed to load quantized weight model.layers.22.mlp.gate_proj: tensor \"model.layers.22.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.mlp.up_proj: failed to load quantized weight model.layers.22.mlp.up_proj: tensor \"model.layers.22.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.mlp.down_proj: failed to load quantized weight model.layers.22.mlp.down_proj: tensor \"model.layers.22.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.22.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.self_attn.q_proj: failed to load quantized weight model.layers.23.self_attn.q_proj: tensor \"model.layers.23.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.self_attn.k_proj: failed to load quantized weight model.layers.23.self_attn.k_proj: tensor \"model.layers.23.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.self_attn.v_proj: failed to load quantized weight model.layers.23.self_attn.v_proj: tensor \"model.layers.23.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.self_attn.o_proj: failed to load quantized weight model.layers.23.self_attn.o_proj: tensor \"model.layers.23.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.mlp.gate_proj: failed to load quantized weight model.layers.23.mlp.gate_proj: tensor \"model.layers.23.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.mlp.up_proj: failed to load quantized weight model.layers.23.mlp.up_proj: tensor \"model.layers.23.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.mlp.down_proj: failed to load quantized weight model.layers.23.mlp.down_proj: tensor \"model.layers.23.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.23.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.self_attn.q_proj: failed to load quantized weight model.layers.24.self_attn.q_proj: tensor \"model.layers.24.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.self_attn.k_proj: failed to load quantized weight model.layers.24.self_attn.k_proj: tensor \"model.layers.24.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.self_attn.v_proj: failed to load quantized weight model.layers.24.self_attn.v_proj: tensor \"model.layers.24.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.self_attn.o_proj: failed to load quantized weight model.layers.24.self_attn.o_proj: tensor \"model.layers.24.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.mlp.gate_proj: failed to load quantized weight model.layers.24.mlp.gate_proj: tensor \"model.layers.24.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.mlp.up_proj: failed to load quantized weight model.layers.24.mlp.up_proj: tensor \"model.layers.24.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.mlp.down_proj: failed to load quantized weight model.layers.24.mlp.down_proj: tensor \"model.layers.24.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.24.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.self_attn.q_proj: failed to load quantized weight model.layers.25.self_attn.q_proj: tensor \"model.layers.25.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.self_attn.k_proj: failed to load quantized weight model.layers.25.self_attn.k_proj: tensor \"model.layers.25.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.self_attn.v_proj: failed to load quantized weight model.layers.25.self_attn.v_proj: tensor \"model.layers.25.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.self_attn.o_proj: failed to load quantized weight model.layers.25.self_attn.o_proj: tensor \"model.layers.25.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.mlp.gate_proj: failed to load quantized weight model.layers.25.mlp.gate_proj: tensor \"model.layers.25.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.mlp.up_proj: failed to load quantized weight model.layers.25.mlp.up_proj: tensor \"model.layers.25.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.mlp.down_proj: failed to load quantized weight model.layers.25.mlp.down_proj: tensor \"model.layers.25.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.25.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.self_attn.q_proj: failed to load quantized weight model.layers.26.self_attn.q_proj: tensor \"model.layers.26.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.self_attn.k_proj: failed to load quantized weight model.layers.26.self_attn.k_proj: tensor \"model.layers.26.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.self_attn.v_proj: failed to load quantized weight model.layers.26.self_attn.v_proj: tensor \"model.layers.26.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.self_attn.o_proj: failed to load quantized weight model.layers.26.self_attn.o_proj: tensor \"model.layers.26.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.mlp.gate_proj: failed to load quantized weight model.layers.26.mlp.gate_proj: tensor \"model.layers.26.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.mlp.up_proj: failed to load quantized weight model.layers.26.mlp.up_proj: tensor \"model.layers.26.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.mlp.down_proj: failed to load quantized weight model.layers.26.mlp.down_proj: tensor \"model.layers.26.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.26.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.self_attn.q_proj: failed to load quantized weight model.layers.27.self_attn.q_proj: tensor \"model.layers.27.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.self_attn.k_proj: failed to load quantized weight model.layers.27.self_attn.k_proj: tensor \"model.layers.27.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.self_attn.v_proj: failed to load quantized weight model.layers.27.self_attn.v_proj: tensor \"model.layers.27.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.self_attn.o_proj: failed to load quantized weight model.layers.27.self_attn.o_proj: tensor \"model.layers.27.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.mlp.gate_proj: failed to load quantized weight model.layers.27.mlp.gate_proj: tensor \"model.layers.27.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.mlp.up_proj: failed to load quantized weight model.layers.27.mlp.up_proj: tensor \"model.layers.27.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.mlp.down_proj: failed to load quantized weight model.layers.27.mlp.down_proj: tensor \"model.layers.27.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.27.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.self_attn.q_proj: failed to load quantized weight model.layers.28.self_attn.q_proj: tensor \"model.layers.28.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.self_attn.k_proj: failed to load quantized weight model.layers.28.self_attn.k_proj: tensor \"model.layers.28.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.self_attn.v_proj: failed to load quantized weight model.layers.28.self_attn.v_proj: tensor \"model.layers.28.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.self_attn.o_proj: failed to load quantized weight model.layers.28.self_attn.o_proj: tensor \"model.layers.28.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.mlp.gate_proj: failed to load quantized weight model.layers.28.mlp.gate_proj: tensor \"model.layers.28.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.mlp.up_proj: failed to load quantized weight model.layers.28.mlp.up_proj: tensor \"model.layers.28.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.mlp.down_proj: failed to load quantized weight model.layers.28.mlp.down_proj: tensor \"model.layers.28.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.28.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.self_attn.q_proj: failed to load quantized weight model.layers.29.self_attn.q_proj: tensor \"model.layers.29.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.self_attn.k_proj: failed to load quantized weight model.layers.29.self_attn.k_proj: tensor \"model.layers.29.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.self_attn.v_proj: failed to load quantized weight model.layers.29.self_attn.v_proj: tensor \"model.layers.29.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.self_attn.o_proj: failed to load quantized weight model.layers.29.self_attn.o_proj: tensor \"model.layers.29.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.mlp.gate_proj: failed to load quantized weight model.layers.29.mlp.gate_proj: tensor \"model.layers.29.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.mlp.up_proj: failed to load quantized weight model.layers.29.mlp.up_proj: tensor \"model.layers.29.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.mlp.down_proj: failed to load quantized weight model.layers.29.mlp.down_proj: tensor \"model.layers.29.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.29.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.self_attn.q_proj: failed to load quantized weight model.layers.30.self_attn.q_proj: tensor \"model.layers.30.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.self_attn.k_proj: failed to load quantized weight model.layers.30.self_attn.k_proj: tensor \"model.layers.30.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.self_attn.v_proj: failed to load quantized weight model.layers.30.self_attn.v_proj: tensor \"model.layers.30.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.self_attn.o_proj: failed to load quantized weight model.layers.30.self_attn.o_proj: tensor \"model.layers.30.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.mlp.gate_proj: failed to load quantized weight model.layers.30.mlp.gate_proj: tensor \"model.layers.30.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.mlp.up_proj: failed to load quantized weight model.layers.30.mlp.up_proj: tensor \"model.layers.30.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.mlp.down_proj: failed to load quantized weight model.layers.30.mlp.down_proj: tensor \"model.layers.30.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.input_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.30.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.self_attn.q_proj: failed to load quantized weight model.layers.31.self_attn.q_proj: tensor \"model.layers.31.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.self_attn.k_proj: failed to load quantized weight model.layers.31.self_attn.k_proj: tensor \"model.layers.31.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.self_attn.v_proj: failed to load quantized weight model.layers.31.self_attn.v_proj: tensor \"model.layers.31.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.self_attn.o_proj: failed to load quantized weight model.layers.31.self_attn.o_proj: tensor \"model.layers.31.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.mlp.gate_proj: failed to load quantized weight model.layers.31.mlp.gate_proj: tensor \"model.layers.31.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.mlp.up_proj: failed to load quantized weight model.layers.31.mlp.up_proj: tensor \"model.layers.31.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.mlp.down_proj: failed to load quantized weight model.layers.31.mlp.down_proj: tensor \"model.layers.31.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.input_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.31.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.self_attn.q_proj: failed to load quantized weight model.layers.32.self_attn.q_proj: tensor \"model.layers.32.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.self_attn.k_proj: failed to load quantized weight model.layers.32.self_attn.k_proj: tensor \"model.layers.32.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.self_attn.v_proj: failed to load quantized weight model.layers.32.self_attn.v_proj: tensor \"model.layers.32.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.self_attn.o_proj: failed to load quantized weight model.layers.32.self_attn.o_proj: tensor \"model.layers.32.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.mlp.gate_proj: failed to load quantized weight model.layers.32.mlp.gate_proj: tensor \"model.layers.32.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.mlp.up_proj: failed to load quantized weight model.layers.32.mlp.up_proj: tensor \"model.layers.32.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.mlp.down_proj: failed to load quantized weight model.layers.32.mlp.down_proj: tensor \"model.layers.32.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.input_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.32.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.self_attn.q_proj: failed to load quantized weight model.layers.33.self_attn.q_proj: tensor \"model.layers.33.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.self_attn.k_proj: failed to load quantized weight model.layers.33.self_attn.k_proj: tensor \"model.layers.33.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.self_attn.v_proj: failed to load quantized weight model.layers.33.self_attn.v_proj: tensor \"model.layers.33.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.self_attn.o_proj: failed to load quantized weight model.layers.33.self_attn.o_proj: tensor \"model.layers.33.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.mlp.gate_proj: failed to load quantized weight model.layers.33.mlp.gate_proj: tensor \"model.layers.33.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.mlp.up_proj: failed to load quantized weight model.layers.33.mlp.up_proj: tensor \"model.layers.33.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.mlp.down_proj: failed to load quantized weight model.layers.33.mlp.down_proj: tensor \"model.layers.33.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.input_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.33.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.self_attn.q_proj: failed to load quantized weight model.layers.34.self_attn.q_proj: tensor \"model.layers.34.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.self_attn.k_proj: failed to load quantized weight model.layers.34.self_attn.k_proj: tensor \"model.layers.34.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.self_attn.v_proj: failed to load quantized weight model.layers.34.self_attn.v_proj: tensor \"model.layers.34.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.self_attn.o_proj: failed to load quantized weight model.layers.34.self_attn.o_proj: tensor \"model.layers.34.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.mlp.gate_proj: failed to load quantized weight model.layers.34.mlp.gate_proj: tensor \"model.layers.34.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.mlp.up_proj: failed to load quantized weight model.layers.34.mlp.up_proj: tensor \"model.layers.34.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.mlp.down_proj: failed to load quantized weight model.layers.34.mlp.down_proj: tensor \"model.layers.34.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.input_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.34.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.q_proj: failed to load quantized weight model.layers.35.self_attn.q_proj: tensor \"model.layers.35.self_attn.q_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.k_proj: failed to load quantized weight model.layers.35.self_attn.k_proj: tensor \"model.layers.35.self_attn.k_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.v_proj: failed to load quantized weight model.layers.35.self_attn.v_proj: tensor \"model.layers.35.self_attn.v_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.o_proj: failed to load quantized weight model.layers.35.self_attn.o_proj: tensor \"model.layers.35.self_attn.o_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.q_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.self_attn.k_norm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.mlp.gate_proj: failed to load quantized weight model.layers.35.mlp.gate_proj: tensor \"model.layers.35.mlp.gate_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.mlp.up_proj: failed to load quantized weight model.layers.35.mlp.up_proj: tensor \"model.layers.35.mlp.up_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.mlp.down_proj: failed to load quantized weight model.layers.35.mlp.down_proj: tensor \"model.layers.35.mlp.down_proj.weight\" not found"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.input_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.layers.35.post_attention_layernorm.weight"
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg="  model.norm.weight"
time=2026-02-13T10:27:29.637+01:00 level=INFO source=server.go:134 msg=mlx-runner msg="  Loading text encoder... "
time=2026-02-13T10:27:29.637+01:00 level=INFO source=server.go:363 msg="stopping mlx runner subprocess" pid=72545
[GIN] 2026/02/13 - 10:27:34 | 500 |  5.958062833s |       127.0.0.1 | POST     "/api/generate"

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.16.1

quantized weight model.layers.30.mlp.gate_proj: tensor \"model.layers.30.mlp.gate_proj.weight\" not found" time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.30.mlp.up_proj: failed to load quantized weight model.layers.30.mlp.up_proj: tensor \"model.layers.30.mlp.up_proj.weight\" not found" time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.30.mlp.down_proj: failed to load quantized weight model.layers.30.mlp.down_proj: tensor \"model.layers.30.mlp.down_proj.weight\" not found" time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.30.input_layernorm.weight" time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.30.post_attention_layernorm.weight" time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.self_attn.q_proj: failed to load quantized weight model.layers.31.self_attn.q_proj: tensor \"model.layers.31.self_attn.q_proj.weight\" not found" time=2026-02-13T10:27:29.553+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.self_attn.k_proj: failed to load quantized weight model.layers.31.self_attn.k_proj: tensor \"model.layers.31.self_attn.k_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.self_attn.v_proj: failed to load quantized weight model.layers.31.self_attn.v_proj: tensor \"model.layers.31.self_attn.v_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.self_attn.o_proj: failed to load quantized weight model.layers.31.self_attn.o_proj: tensor \"model.layers.31.self_attn.o_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.self_attn.q_norm.weight" 
time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.self_attn.k_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.mlp.gate_proj: failed to load quantized weight model.layers.31.mlp.gate_proj: tensor \"model.layers.31.mlp.gate_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.mlp.up_proj: failed to load quantized weight model.layers.31.mlp.up_proj: tensor \"model.layers.31.mlp.up_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.mlp.down_proj: failed to load quantized weight model.layers.31.mlp.down_proj: tensor \"model.layers.31.mlp.down_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.input_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.31.post_attention_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.self_attn.q_proj: failed to load quantized weight model.layers.32.self_attn.q_proj: tensor \"model.layers.32.self_attn.q_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.self_attn.k_proj: failed to load quantized weight model.layers.32.self_attn.k_proj: tensor \"model.layers.32.self_attn.k_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.self_attn.v_proj: failed to load quantized weight model.layers.32.self_attn.v_proj: tensor \"model.layers.32.self_attn.v_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.self_attn.o_proj: failed to load 
quantized weight model.layers.32.self_attn.o_proj: tensor \"model.layers.32.self_attn.o_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.self_attn.q_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.self_attn.k_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.mlp.gate_proj: failed to load quantized weight model.layers.32.mlp.gate_proj: tensor \"model.layers.32.mlp.gate_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.mlp.up_proj: failed to load quantized weight model.layers.32.mlp.up_proj: tensor \"model.layers.32.mlp.up_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.mlp.down_proj: failed to load quantized weight model.layers.32.mlp.down_proj: tensor \"model.layers.32.mlp.down_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.input_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.32.post_attention_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.self_attn.q_proj: failed to load quantized weight model.layers.33.self_attn.q_proj: tensor \"model.layers.33.self_attn.q_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.self_attn.k_proj: failed to load quantized weight model.layers.33.self_attn.k_proj: tensor \"model.layers.33.self_attn.k_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.self_attn.v_proj: failed to load quantized 
weight model.layers.33.self_attn.v_proj: tensor \"model.layers.33.self_attn.v_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.self_attn.o_proj: failed to load quantized weight model.layers.33.self_attn.o_proj: tensor \"model.layers.33.self_attn.o_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.self_attn.q_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.self_attn.k_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.mlp.gate_proj: failed to load quantized weight model.layers.33.mlp.gate_proj: tensor \"model.layers.33.mlp.gate_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.mlp.up_proj: failed to load quantized weight model.layers.33.mlp.up_proj: tensor \"model.layers.33.mlp.up_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.mlp.down_proj: failed to load quantized weight model.layers.33.mlp.down_proj: tensor \"model.layers.33.mlp.down_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.input_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.33.post_attention_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.self_attn.q_proj: failed to load quantized weight model.layers.34.self_attn.q_proj: tensor \"model.layers.34.self_attn.q_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.self_attn.k_proj: failed to load quantized weight 
model.layers.34.self_attn.k_proj: tensor \"model.layers.34.self_attn.k_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.self_attn.v_proj: failed to load quantized weight model.layers.34.self_attn.v_proj: tensor \"model.layers.34.self_attn.v_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.self_attn.o_proj: failed to load quantized weight model.layers.34.self_attn.o_proj: tensor \"model.layers.34.self_attn.o_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.self_attn.q_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.self_attn.k_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.mlp.gate_proj: failed to load quantized weight model.layers.34.mlp.gate_proj: tensor \"model.layers.34.mlp.gate_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.mlp.up_proj: failed to load quantized weight model.layers.34.mlp.up_proj: tensor \"model.layers.34.mlp.up_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.mlp.down_proj: failed to load quantized weight model.layers.34.mlp.down_proj: tensor \"model.layers.34.mlp.down_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.input_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.34.post_attention_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.self_attn.q_proj: failed to load quantized weight 
model.layers.35.self_attn.q_proj: tensor \"model.layers.35.self_attn.q_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.self_attn.k_proj: failed to load quantized weight model.layers.35.self_attn.k_proj: tensor \"model.layers.35.self_attn.k_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.self_attn.v_proj: failed to load quantized weight model.layers.35.self_attn.v_proj: tensor \"model.layers.35.self_attn.v_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.self_attn.o_proj: failed to load quantized weight model.layers.35.self_attn.o_proj: tensor \"model.layers.35.self_attn.o_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.self_attn.q_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.self_attn.k_norm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.mlp.gate_proj: failed to load quantized weight model.layers.35.mlp.gate_proj: tensor \"model.layers.35.mlp.gate_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.mlp.up_proj: failed to load quantized weight model.layers.35.mlp.up_proj: tensor \"model.layers.35.mlp.up_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.mlp.down_proj: failed to load quantized weight model.layers.35.mlp.down_proj: tensor \"model.layers.35.mlp.down_proj.weight\" not found" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.layers.35.input_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN 
source=server.go:141 msg=mlx-runner msg=" model.layers.35.post_attention_layernorm.weight" time=2026-02-13T10:27:29.554+01:00 level=WARN source=server.go:141 msg=mlx-runner msg=" model.norm.weight" time=2026-02-13T10:27:29.637+01:00 level=INFO source=server.go:134 msg=mlx-runner msg=" Loading text encoder... " time=2026-02-13T10:27:29.637+01:00 level=INFO source=server.go:363 msg="stopping mlx runner subprocess" pid=72545 [GIN] 2026/02/13 - 10:27:34 | 500 | 5.958062833s | 127.0.0.1 | POST "/api/generate" ``` ### OS macOS ### GPU Apple ### CPU Apple ### Ollama version 0.16.1
GiteaMirror added the bug label 2026-04-29 09:43:26 -05:00
@slyapustin commented on GitHub (Feb 13, 2026):

Downgrading back to 0.15.1 seems to have fixed the issue.

<!-- gh-comment-id:3896041018 -->
@dryoo commented on GitHub (Feb 13, 2026):

> It was working fine before.

Can confirm.

<!-- gh-comment-id:3897560522 -->
@pdevine commented on GitHub (Feb 14, 2026):

Sorry about this guys. The fix is in for 0.16.2 if you want to try the pre-release.

<!-- gh-comment-id:3902183455 -->
Reference: github-starred/ollama#55778