[GH-ISSUE #6440] Model architecture Gemma2ForCausalLm #50561

Closed
opened 2026-04-28 16:23:41 -05:00 by GiteaMirror · 2 comments

Originally created by @luisgg98 on GitHub (Aug 20, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6440

What is the issue?

Good afternoon. I would like to start by saying that I am not 100% sure whether this is an issue or whether I am misunderstanding the concept of architecture.
I tried to create a model in ollama from a Modelfile, using version 0.3.0.

[image]
I got the error shown above, so I decided to upgrade ollama to version 0.3.6,
which reports the error "unsupported architecture":
[image]

Configuration file of the model I have fine-tuned:

{
  "_name_or_path": "google/gemma-2-9b",
  "architectures": [
    "Gemma2ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "attn_logit_softcapping": 50.0,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": 1,
  "final_logit_softcapping": 30.0,
  "head_dim": 256,
  "hidden_act": "gelu_pytorch_tanh",
  "hidden_activation": "gelu_pytorch_tanh",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 8192,
  "model_type": "gemma2",
  "num_attention_heads": 16,
  "num_hidden_layers": 42,
  "num_key_value_heads": 8,
  "pad_token_id": 0,
  "query_pre_attn_scalar": 256,
  "rms_norm_eps": 1e-06,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "sliding_window_size": 4096,
  "torch_dtype": "float16",
  "transformers_version": "4.43.3",
  "use_cache": false,
  "vocab_size": 256000
}
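For anyone debugging a similar "unsupported architecture" message, here is a minimal sketch for checking what a model directory declares before running `ollama create`. It assumes the importer decides based on the first entry of `architectures` in `config.json`; that is my reading of the error, not something confirmed in this thread. The helper names and the `supported` set are hypothetical, and you would need to fill in the set for your own ollama version.

```python
import json
from pathlib import Path


def read_architectures(model_dir):
    """Return the "architectures" list from <model_dir>/config.json (empty if absent)."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return config.get("architectures", [])


def is_convertible(model_dir, supported):
    """Check the first declared architecture against a caller-supplied set.

    `supported` is an assumption provided by the caller; this sketch does not
    know which architectures a given ollama release can actually convert.
    """
    archs = read_architectures(model_dir)
    return bool(archs) and archs[0] in supported
```

With the configuration above, `read_architectures("/path/model/")` would return `["Gemma2ForCausalLM"]`, which is the name to look for in the importer's list of supported architectures.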

It's a model based on Gemma2 9b. Gemma2 itself works smoothly.
[image]

Modelfile:

FROM /path/model/
TEMPLATE """<start_of_turn>user/
{{ if .System }}{{ .System }} {{ end }}{{ .Prompt }}<end_of_turn>
<start_of_turn>model
{{ .Response }}<end_of_turn>"""
PARAMETER stop "<start_of_turn>"
PARAMETER stop "<end_of_turn>"
SYSTEM """Below is an instruction that describes a task. Write a response that appropriately completes the request.
Generate a concise summary in Spanish the following input. Here is the input"""

I don't understand why Gemma2 is supported but models based on it are not.
Thank you for reading this far. If you have any clue about what the issue is, please write it down below.

OS

Linux

GPU

Nvidia

CPU

No response

Ollama version

0.3.6

GiteaMirror added the bug label 2026-04-28 16:23:41 -05:00

@rick-github commented on GitHub (Aug 20, 2024):

The `create` function of ollama primarily works with GGUF files. The ability to import safetensors (I'm assuming that's what's in /path/model) is a convenience function, but it's limited in the architectures it can process. If you want to import a model whose architecture is not supported, you need to use the `convert_hf_to_gguf.py` script in llama.cpp (https://github.com/ggerganov/llama.cpp).
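A rough sketch of that workaround, assuming a local llama.cpp checkout: the helpers below only build the conversion command and a Modelfile that points at the resulting GGUF file. The function names, the output file name, and the default `f16` output type are illustrative choices, not something prescribed in this thread. `--outfile` and `--outtype` are options of `convert_hf_to_gguf.py`, but check `--help` on your checkout, since the script changes between llama.cpp versions.

```python
def build_convert_command(model_dir, outfile, outtype="f16",
                          script="convert_hf_to_gguf.py"):
    """Build the argv list for llama.cpp's HF-to-GGUF conversion script.

    `script` is assumed to be a path to convert_hf_to_gguf.py inside a
    llama.cpp checkout; nothing is executed here.
    """
    return ["python", script, str(model_dir),
            "--outfile", str(outfile), "--outtype", outtype]


def build_modelfile(gguf_path, template=None, stops=()):
    """Return Modelfile text whose FROM line points at a GGUF file."""
    lines = [f"FROM {gguf_path}"]
    if template:
        lines.append(f'TEMPLATE """{template}"""')
    for stop in stops:
        lines.append(f'PARAMETER stop "{stop}"')
    return "\n".join(lines) + "\n"
```

The command list can be passed to `subprocess.run(cmd, check=True)` from the llama.cpp directory. After writing the generated text to a file named `Modelfile`, `ollama create <name> -f Modelfile` should import the GGUF directly, without going through the safetensors converter.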


@mxyng commented on GitHub (Aug 21, 2024):

Gemma2 conversion is implemented by https://github.com/ollama/ollama/pull/5365, which has been merged and will be in the next release.

Reference: github-starred/ollama#50561