ollama

mirror of https://github.com/ollama/ollama.git synced 2026-03-12 01:45:29 -05:00

Files

Patrick Devine e9f6ea232f Add qwen3.5-next-moe support to MLX runner and models (#14417)

This change adds support for qwen3.5-next-moe models (qwen3-next/qwen3.5-next/qwen3-coder) to the MLX runner. It also:

* introduces recurrent cache support and related MLX ops
* updates pipeline/runner integration and adds tests
* properly quantizes stacked expert tensors
* adds a Gated Delta Metal kernel for fast SSM inference
* adds new MLX calls for Conv1d, DepthwiseConv1d, Contiguous, Exp, Log, SoftmaxAxis

2026-03-03 16:39:22 -08:00

client

model: add qwen3 support to mlxrunner (#14293)

2026-02-17 13:58:49 -08:00

create_test.go

Add qwen3.5-next-moe support to MLX runner and models (#14417)

2026-03-03 16:39:22 -08:00

create.go

Add qwen3.5-next-moe support to MLX runner and models (#14417)

2026-03-03 16:39:22 -08:00

imagegen.go

safetensors quantization for mlx (#14184)

2026-02-10 11:29:17 -08:00