[GH-ISSUE #15700] MacbookPro M5 can't run qwen3.6:35b-a3b-q4_K_M #56526

Open
opened 2026-04-29 10:57:32 -05:00 by GiteaMirror · 0 comments

Originally created by @yesl0107 on GitHub (Apr 19, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15700

Originally assigned to: @jmorganca on GitHub.

What is the issue?

Running the following command on a MacBook Pro M5 fails:
ollama run qwen3.6:35b-a3b-q4_K_M
Result:
Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details

Relevant log output

ggml_metal_device_init: recommendedMaxWorkingSetSize  = 26800.60 MB
load_backend: loaded CPU backend from /Applications/Ollama.app/Contents/Resources/libggml-cpu.so
time=2026-04-19T18:28:09.809+08:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 CPU.0.NEON=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 CPU.1.NEON=1 CPU.1.ARM_FMA=1 CPU.1.FP16_VA=1 CPU.1.DOTPROD=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang)
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M5
ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "In file included from program_source:2837:
In file included from /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MetalPerformancePrimitives.h:10:
In file included from /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MPPTensorOpsMatMul2d.h:368:
/System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3266:5: error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<bfloat, half>' "Input types must match cooperative tensor types"
    static_assert(__tensor_ops_detail::__is_same_v<_leftType, leftValueType>, "Input types must match cooperative tensor types");
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MPPTensorOpsMatMul2d.h:394:24: note: in instantiation of function template specialization 'mpp::tensor_ops::__mutmul2d_detail::__run<{32, 64, 32, false, true, false, 1}, metal::execution_simdgroups<4>, metal::tensor<threadgroup half, metal::extents<int, 18446744073709551615, 18446744073709551615>, metal::tensor_inline>, metal::tensor<threadgroup bfloat, metal::extents<int, 18446744073709551615, 18446744073709551615>, metal::tensor_inline>, metal::cooperative_tensor<float, metal::extents<int, 18446744073709551615, 18446744073709551615>, mpp::tensor_ops::__mutmul2d_detail::__operand_layout<{32, 64, 32, false, true, false, 1}, mpp::tensor_ops::__mutmul2d_detail::__matmul2d_cooperative_operand_index::destination, metal::execution_simdgroups<4>, bfloat, half, float, int>>>' requested here
    __mutmul2d_detail::__run<Descriptor, Scope, LeftOperandType,
                       ^
program_source:12147:12: note: in instantiation of function template specialization 'mpp::tensor_ops::matmul2d<{32, 64, 32, false, true, false, 1}, metal::execution_simdgroups<4>>::run<metal::tensor<threadgroup half, metal::extents<int, 18446744073709551615, 18446744073709551615>, metal::tensor_inline>, metal::tensor<threadgroup bfloat, metal::extents<int, 18446744073709551615, 18446744073709551615>, metal::tensor_inline>, metal::cooperative_tensor<float, metal::extents<int, 18446744073709551615, 18446744073709551615>, mpp::tensor_ops::__mutmul2d_detail::__operand_layout<{32, 64, 32, false, true, false, 1}, mpp::tensor_ops::__mutmul2d_detail::__matmul2d_cooperative_operand_index::destination, metal::execution_simdgroups<4>, bfloat, half, float, int>>, void>' requested here
        mm.run(sB, sA, cT);
           ^
In file included from program_source:2837:
In file included from /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MetalPerformancePrimitives.h:10:
In file included from /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MPPTensorOpsMatMul2d.h:368:
/System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3267:5: error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<half, bfloat>' "Input types must match cooperative tensor types"
    static_assert(__tensor_ops_detail::__is_same_v<_rightType, rightValueType>, "Input types must match cooperative tensor types");
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
" UserInfo={NSLocalizedDescription=In file included from program_source:2837:
In file included from /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MetalPerformancePrimitives.h:10:
In file included from /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/MPPTensorOpsMatMul2d.h:368:
/System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3266:5: error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<bfloat, half>' "Input types must match cooperative tensor types"
    static_assert(__tensor_ops_detail::__is_same_v<_leftType, leftValueType>, "Input types must match cooperative tensor types");
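For triage: the two static_assert failures in the log differ only in which operand is bfloat and which is half (line 3266 checks the left operand, line 3267 the right). The sketch below pulls those type pairs out of a pasted log so they can be compared across reports. It is not part of Ollama or ggml; the helper name mismatched_pairs and the shortened sample log are assumptions made for illustration.

```python
import re

# Matches the requirement text printed by the failing static_assert, e.g.
#   '__tensor_ops_detail::__is_same_v<bfloat, half>'
ASSERT_RE = re.compile(
    r"static_assert failed due to requirement "
    r"'__tensor_ops_detail::__is_same_v<(\w+), (\w+)>'"
)


def mismatched_pairs(log_text: str) -> list[tuple[str, str]]:
    """Return the unique type pairs named in static_assert failures,
    in order of first appearance. (Hypothetical helper for this report.)"""
    seen: list[tuple[str, str]] = []
    for pair in ASSERT_RE.findall(log_text):
        if pair not in seen:
            seen.append(pair)
    return seen


# Shortened sample built from the two error lines in the log above.
sample = """MPPTensorOpsMatMul2dImpl.h:3266:5: error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<bfloat, half>' "Input types must match cooperative tensor types"
MPPTensorOpsMatMul2dImpl.h:3267:5: error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<half, bfloat>' "Input types must match cooperative tensor types"
"""

for first, second in mismatched_pairs(sample):
    print(f"type mismatch: {first} vs {second}")
```

Duplicate failures, such as the copy repeated inside UserInfo, collapse to a single entry, so the output lists each distinct mismatch once.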

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.21.0

GiteaMirror added the bug label 2026-04-29 10:57:32 -05:00

Reference: github-starred/ollama#56526