[GH-ISSUE #10093] Error creating Ollama model from Hugging Face microsoft/phi-4 #53129

Closed
opened 2026-04-29 02:02:05 -05:00 by GiteaMirror · 1 comment

Originally created by @nishtahir on GitHub (Apr 2, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10093

What is the issue?

An Ollama model created from the microsoft/phi-4 Hugging Face model fails to run with:

Error: llama runner process has terminated: error loading model: check_tensor_dims: tensor 'rope_factors_long.weight' has wrong shape; expected    64, got     0,     1,     1,     1
llama_model_load_from_file_impl: failed to load model
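
The malformed tensor can be inspected directly in the converted model blob with the gguf Python package. This is a minimal sketch, assuming Ollama's default store under ~/.ollama; the digest below is a placeholder (the real sha256 of the model layer is listed in the model's manifest), not a value from this report:

import os
from gguf import GGUFReader  # pip install gguf

# Placeholder path: substitute the actual sha256 digest of the model layer.
blob = os.path.expanduser("~/.ollama/models/blobs/sha256-<model-layer-digest>")

reader = GGUFReader(blob)
for t in reader.tensors:
    # A correct phi3-family conversion should report a 64-element tensor here,
    # matching the "expected 64" in the loader error above.
    if "rope_factors" in t.name:
        print(t.name, tuple(t.shape))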

Steps to reproduce

  1. Save the Hugging Face model locally
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download microsoft/phi-4 from the Hugging Face Hub, then write the
# weights and tokenizer files to ./model_output
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-4")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")

base_model.save_pretrained("model_output")
tokenizer.save_pretrained("model_output")
  2. Create a Modelfile in the model_output directory
FROM .
  3. Create an Ollama model using the Modelfile
ollama create -f model_output/Modelfile microsoft/phi-4

The model is created successfully:

...
converting model 
using existing layer sha256:540d5dbc018433acaadee51346944cb9b79f6dcb50c653a784a5c621061226c6 
using autodetected template chatml 
using existing layer sha256:f02dd72bb2423204352eabc5637b44d79d17f109fdb510a7c51455892aa2d216 
writing manifest 
success 
  4. Attempt to run the model
ollama run microsoft/phi-4

This fails with the same error:

Error: llama runner process has terminated: error loading model: check_tensor_dims: tensor 'rope_factors_long.weight' has wrong shape; expected 64, got 0, 1, 1, 1
llama_model_load_from_file_impl: failed to load model
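
One place to look (not confirmed here): phi-4 reports the phi3 architecture, but unlike the long-context phi-3 checkpoints it may not ship LongRoPE rope_scaling factors in its config, which could leave Ollama's phi3 converter writing an empty rope_factors tensor. A minimal check against the snapshot saved in step 1, using only the stdlib and field names as used by transformers phi3 configs:

import json

# Inspect the config the converter sees; paths assume the step-1 layout.
with open("model_output/config.json") as f:
    cfg = json.load(f)

print(cfg.get("architectures"))                     # phi3 family per ollama show below
print(cfg.get("rope_scaling"))                      # LongRoPE long/short factors, or None
print(cfg.get("max_position_embeddings"))           # 16384 per ollama show below
print(cfg.get("original_max_position_embeddings"))  # may be absent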

Relevant log output

ollama show microsoft/phi-4

  Model
    architecture        phi3     
    parameters          14.7B    
    context length      16384    
    embedding length    5120     
    quantization        F16      

  Parameters
    stop    "<|im_start|>"    
    stop    "<|im_end|>"

OS

Linux

GPU

Nvidia

CPU

AMD, Intel

Ollama version

0.6.2

GiteaMirror added the bug label 2026-04-29 02:02:05 -05:00

@nishtahir commented on GitHub (Apr 2, 2025):

A workaround appears to be converting the model to GGUF with llama.cpp's convert_hf_to_gguf.py script and creating the Ollama model from that output.

python llama.cpp/convert_hf_to_gguf.py ./model_output
ollama create -f model_output/Modelfile microsoft/phi-4
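
One detail the two commands above leave implicit is which file FROM . resolves to once both the safetensors and a GGUF sit in model_output. A fuller sketch of the workaround follows; the llama.cpp checkout path, output filename, and rewritten Modelfile are illustrative assumptions, not taken from the report:

import subprocess

# Convert the saved snapshot to GGUF with llama.cpp's converter script.
# Assumes llama.cpp is cloned at ./llama.cpp with its Python deps installed.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", "./model_output",
     "--outfile", "model_output/phi-4-f16.gguf", "--outtype", "f16"],
    check=True,
)

# Point the Modelfile at the generated GGUF (paths in a Modelfile resolve
# relative to the Modelfile's directory) so that ollama create imports the
# llama.cpp conversion instead of the safetensors.
with open("model_output/Modelfile", "w") as f:
    f.write("FROM ./phi-4-f16.gguf\n")

subprocess.run(
    ["ollama", "create", "-f", "model_output/Modelfile", "microsoft/phi-4"],
    check=True,
)
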
Reference: github-starred/ollama#53129