[GH-ISSUE #12984] llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'pangu-embedded' #8602

Closed
opened 2026-04-12 21:20:24 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @adaaaaaa on GitHub (Nov 6, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12984

What is the issue?

```
ollama | llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'pangu-embedded'
ollama | llama_model_load_from_file_impl: failed to load model
ollama | time=2025-11-06T06:23:22.759Z level=INFO source=sched.go:418 msg="NewLlamaServer failed" model=/root/.ollama/models/blobs/sha256-8fc8cf9f6bbb9c3d10260c75260ce8c630449d3cd14a61084e6a99d77477ee77 error="unable to load model: /root/.ollama/models/blobs/sha256-8fc8cf9f6bbb9c3d10260c75260ce8c630449d3cd14a61084e6a99d77477ee77"
ollama | [GIN] 2025/11/06 - 06:23:22 | 500 | 2.859719514s | 172.18.0.3 | POST "/api/chat"
```

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-04-12 21:20:24 -05:00
Author
Owner

@rick-github commented on GitHub (Nov 6, 2025):

OpenPangu was only [added](https://github.com/ggml-org/llama.cpp/pull/16941) to llama.cpp recently, so a vendor sync will be required to add support to ollama.
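The "unknown model architecture" error means the `general.architecture` string stored in the GGUF metadata is not one the llama.cpp build vendored by ollama recognizes. As a quick way to see what architecture a downloaded blob declares, the GGUF header can be parsed directly. A minimal sketch, assuming a little-endian GGUF v3 file whose leading metadata values are plain strings; `read_architecture` and `demo_blob` are illustrative helpers, not part of any existing tool:

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_STRING = 8  # string value type in the GGUF metadata enum


def read_architecture(buf: bytes) -> str:
    """Return the general.architecture value from a GGUF byte buffer.

    Sketch only: it bails on non-string metadata values, so it works
    when general.architecture appears before any other value type.
    """
    off = 0
    if buf[off:off + 4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    off += 4
    version, = struct.unpack_from("<I", buf, off); off += 4
    tensor_count, kv_count = struct.unpack_from("<QQ", buf, off); off += 16
    for _ in range(kv_count):
        klen, = struct.unpack_from("<Q", buf, off); off += 8
        key = buf[off:off + klen].decode("utf-8"); off += klen
        vtype, = struct.unpack_from("<I", buf, off); off += 4
        if vtype != GGUF_TYPE_STRING:
            raise NotImplementedError("sketch only parses string values")
        vlen, = struct.unpack_from("<Q", buf, off); off += 8
        val = buf[off:off + vlen].decode("utf-8"); off += vlen
        if key == "general.architecture":
            return val
    raise KeyError("general.architecture not found")


def demo_blob(arch: str) -> bytes:
    """Build a tiny synthetic GGUF v3 header with one metadata pair."""
    key = b"general.architecture"
    val = arch.encode("utf-8")
    return (GGUF_MAGIC
            + struct.pack("<I", 3)       # version
            + struct.pack("<QQ", 0, 1)   # tensor count, metadata kv count
            + struct.pack("<Q", len(key)) + key
            + struct.pack("<I", GGUF_TYPE_STRING)
            + struct.pack("<Q", len(val)) + val)


print(read_architecture(demo_blob("pangu-embedded")))  # pangu-embedded
```

Running this against the blob under `~/.ollama/models/blobs/` (instead of the synthetic demo bytes) would print the architecture string that llama.cpp failed to match.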

Author
Owner

@rick-github commented on GitHub (Nov 7, 2025):

Next vendor sync in https://github.com/ollama/ollama/pull/12992

Author
Owner

@rick-github commented on GitHub (Dec 4, 2025):

Supported in 0.13.2 (currently in [pre-release](https://github.com/ollama/ollama/releases/tag/v0.13.2-rc0)).

```console
$ ollama -v
ollama version is 0.13.2-rc0
$ ollama run hf.co/Lpzhan/openPangu-embedded-gguf:latest
>>> hello
你好!😊 是什么可以帮你?无论是解答问题、提供信息还是闲聊,我都在这里呢~有什么需要的小帮助吗?

>>> /set system respond in english
Set system message.
>>> hello
Hello! 😊 What can I help you with? Whether it's a question, a task, or just chatting—I'm here to assist. How about we start with something specific? (I'll also respond in English if needed!)
```

Note that the [7b model](https://huggingface.co/Lpzhan/openPangu-embedded-gguf/resolve/main/openpangu_embedded_7B.gguf) emits thinking traces, and a modified TEMPLATE can be used to process the trace separately:

```
<s>[unused9]系统:{{ .System }}[unused10]
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "assistant" }}[unused9]助手:{{ .Content }}[unused10]
{{- if and $.IsThinkSet .Thinking -}}
[unused16]{{ .Thinking }}[unused17]
{{- end }}
{{- end }}
{{- if eq .Role "tool" }}[unused9]工具:{{ .Content }}[unused10]{{ end }}
{{- if eq .Role "function" }}[unused9]方法:{{ .Content }}[unused10]{{ end }}
{{- if eq .Role "user" }}[unused9]用户:{{ .Content }}[unused10]{{ end }}
{{- if and (ne .Role "assistant") $last }}[unused9]助手:{{ end }}
{{- end -}}
```
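On the client side, the trace can be separated from the final answer by splitting on the `[unused16]`/`[unused17]` markers that the TEMPLATE wraps around `.Thinking`. A minimal sketch in Python, assuming the markers appear literally in the raw text; `split_thinking` is a hypothetical helper, not part of ollama:

```python
def split_thinking(raw: str) -> tuple[str, str]:
    """Split raw model output into (thinking trace, answer).

    Returns an empty trace when the [unused16]/[unused17] markers
    are absent, so plain responses pass through unchanged.
    """
    start, end = "[unused16]", "[unused17]"
    if start in raw and end in raw:
        before, _, rest = raw.partition(start)
        thinking, _, after = rest.partition(end)
        return thinking.strip(), (before + after).strip()
    return "", raw.strip()


raw = "[unused16]The user said hello; reply warmly.[unused17]Hello! How can I help?"
thinking, answer = split_thinking(raw)
print(thinking)  # The user said hello; reply warmly.
print(answer)    # Hello! How can I help?
```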
```console
$ ollama run openpangu:7b-fp16
>>> hello
Thinking...
Okay, the user just said "hello". I need to respond in a friendly and welcoming way. Let me start with a cheerful greeting. Maybe add an emoji to make it more friendly. Then ask how I can help them today. Keep it simple and open-ended so they feel comfortable responding. Make sure the tone is casual and approachable. Avoid any complicated sentences. Alright, let's put that together.
...done thinking.

Hello! 😊 How can I assist you today?

>>> who are you?
Thinking...
Hmm, the user asked "who are you?" after I greeted them with a cheerful hello. Interesting shift from the initial friendly exchange—they might be testing me or just seeking clarity about my identity.

Looking back at our conversation history, I had introduced myself as "Pangu" in the system prompt, but perhaps they missed it or want more details. The user seems neutral here—no urgency or frustration detected.

Since my core identity is fixed as Huawei's Pangu model without multimodal capabilities, I should reaffirm that clearly but keep it concise. The response should maintain warmth (hence the 😊 emoji) while being precise about my developer and purpose.

Adding a question at the end ("How can I assist you today?") keeps the interaction open-ended and redirects focus to their needs—this avoids over-explaining myself, which could seem self-centered. The tone stays casual with phrases like "just a language model" to emphasize simplicity.

Noting that they didn't specify any particular role or context for this chat—could the user be new to interacting with AI? Might need to adjust future responses based on their follow-ups. For now, this reply balances clarity and approachability well.
...done thinking.

Hello! I'm Pangu, an AI assistant developed by Huawei. Just a language model designed to help answer questions, provide explanations, or assist with tasks. How can I help you today? 😊
```

Reference: github-starred/ollama#8602