[GH-ISSUE #6272] Ollama create: manual deployment fails with "Error: invalid file magic" #3929

Closed
opened 2026-04-12 14:48:21 -05:00 by GiteaMirror · 22 comments

Originally created by @JaminYan on GitHub (Aug 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6272

What is the issue?

I put together a modelfile following the Feishu documentation, but `ollama create` fails with
`Error: invalid file magic`, so the model cannot be deployed in ollama.

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.3.4

GiteaMirror added the bug label 2026-04-12 14:48:21 -05:00

@rick-github commented on GitHub (Aug 9, 2024):

Contents of modelfile? Which model are you trying to import?


@JaminYan commented on GitHub (Aug 9, 2024):

```
FROM ./MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf
FROM ./MiniCPM-V-2_6/mmproj-model-f16.gguf

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>{{ end }}
{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>{{ end }}
<|im_start|>assistant<|im_end|>
{{ .Response }}<|im_end|>"""

PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_end|>"
PARAMETER num_ctx 2048
```


@JaminYan commented on GitHub (Aug 9, 2024):

ggml-model-Q4_K_M.gguf and ggml-model-f16.gguf are both from Hugging Face.


@rick-github commented on GitHub (Aug 9, 2024):

MiniCPM-V-2_6 is not supported yet. Once https://github.com/ggerganov/llama.cpp/pull/7599 is merged, it will work.


@JaminYan commented on GitHub (Aug 9, 2024):

Thanks


@jmorganca commented on GitHub (Aug 25, 2024):

Closing, as MiniCPM support is being tracked separately! Thanks for the issue, and thanks @rick-github


@arkohut commented on GitHub (Sep 29, 2024):

The official minicpm-v model is already available, so I think it is supported now? But I still get the error "Error: invalid file magic" when I try to create a Q4_K_M model using ollama.

BTW, ollama can create an fp16 model successfully.


@yjflike1 commented on GitHub (Oct 3, 2024):

I'm running into the same issue...


@yzyhyt commented on GitHub (Nov 4, 2024):

I hit the same problem when creating the minicpm-v 2.6 Q4_K_M model with ollama, even though llama.cpp runs the same model without any trouble.


@rick-github commented on GitHub (Nov 4, 2024):

What commands are you running to create the model?


@yzyhyt commented on GitHub (Nov 4, 2024):

> What commands are you running to create the model?

I've used `ollama create minicpm2.6 -f minicpmv2_6.Modelfile` to create the model.

minicpmv2_6.Modelfile looks like this:

```
FROM ./MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf
FROM ./MiniCPM-V-2_6/mmproj-model-f16.gguf

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>{{ end }}
{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>{{ end }}
<|im_start|>assistant<|im_end|>
{{ .Response }}<|im_end|>"""

PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_end|>"
PARAMETER num_ctx 2048
```

@rick-github commented on GitHub (Nov 4, 2024):

Where are you getting the GGUF files from, or how are you creating them?


@yzyhyt commented on GitHub (Nov 4, 2024):

> Where are you getting the GGUF files from, or how are you creating them?

I downloaded the GGUF files directly from https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf/tree/main instead of manually converting from PyTorch weights.


@rick-github commented on GitHub (Nov 4, 2024):

The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes at the end, which confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On Linux, you can remove these bytes with:

```
truncate --size=4681089336 ggml-model-Q4_K_M.gguf
```

On Windows:

```
FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336
```
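For files with a different amount of trailing padding, the same fix can be applied generically by counting the null bytes at the end before truncating. Below is a minimal, hypothetical Go sketch of that idea; it is not part of ollama, and the usual caveat applies: valid tensor data can legitimately end in zeros, so back the file up and only truncate when ollama reports `invalid file magic` and the file is larger than expected.

```go
// striptail.go: count trailing null bytes in a model file and truncate them.
// Hypothetical helper sketch, not part of ollama. Back up the file first.
package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	if len(os.Args) != 2 {
		log.Fatal("usage: striptail <file.gguf>")
	}
	f, err := os.OpenFile(os.Args[1], os.O_RDWR, 0)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}
	size := info.Size()

	// Only inspect a small window at the end of the file; the padding
	// seen in this thread was 8 bytes, so 64 is plenty.
	window := int64(64)
	if size < window {
		window = size
	}
	buf := make([]byte, window)
	n, err := f.ReadAt(buf, size-window)
	if err != nil && err != io.EOF {
		log.Fatal(err)
	}
	buf = buf[:n]

	// Count zeros backwards from the end of the file.
	var zeros int64
	for i := len(buf) - 1; i >= 0 && buf[i] == 0; i-- {
		zeros++
	}
	if zeros == 0 {
		fmt.Println("no trailing null bytes")
		return
	}
	fmt.Printf("truncating %d trailing null byte(s)\n", zeros)
	if err := f.Truncate(size - zeros); err != nil {
		log.Fatal(err)
	}
}
```

On the file discussed above, this would report 8 trailing null bytes and truncate the file to 4681089336 bytes, the same size the `truncate` and `FSUTIL` commands produce.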

@yzyhyt commented on GitHub (Nov 4, 2024):

The issue has been fixed. Thank you so much.
By the way, could you tell me how to quickly check whether a file has extra null bytes at the end? I used the xxd command but it's too slow.


@rick-github commented on GitHub (Nov 4, 2024):

```console
$ xxd -R never -s -32 ggml-model-Q4_K_M.gguf
11703c120: aac7 5652 a521 b9a5 65b4 65b8 aca0 5554  ..VR.!..e.e...UT
11703c130: 80ad a564 9499 8900 0000 0000 0000 0000  ...d............
```
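The `-s -32` argument is what makes this fast: xxd seeks straight to 32 bytes before the end of the file instead of hex-dumping all of it. The same seek-then-read idea, as a small hedged Go sketch (hypothetical, and it assumes the file is at least 32 bytes long):

```go
// tailhex.go: hex-dump the last 32 bytes of a file by seeking, not scanning.
// Hypothetical sketch; assumes the file is at least 32 bytes long.
package main

import (
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	f, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Seek to 32 bytes before the end, just like xxd -s -32.
	if _, err := f.Seek(-32, io.SeekEnd); err != nil {
		log.Fatal(err)
	}
	buf := make([]byte, 32)
	if _, err := io.ReadFull(f, buf); err != nil {
		log.Fatal(err)
	}
	fmt.Print(hex.Dump(buf))
}
```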

@rick-github commented on GitHub (Nov 4, 2024):

I have a modified ollama that I used to detect this problem.

```diff
diff --git a/llm/ggml.go b/llm/ggml.go
index e857d4b8..0f92965a 100644
--- a/llm/ggml.go
+++ b/llm/ggml.go
@@ -324,6 +324,7 @@ func DecodeGGML(rs io.ReadSeeker, maxArraySize int) (*GGML, int64, error) {
 
        rs = bufioutil.NewBufferedSeeker(rs, 32<<10)
 
+       xx, _ := rs.Seek(0, io.SeekCurrent)
        var magic uint32
        if err := binary.Read(rs, binary.LittleEndian, &magic); err != nil {
                return nil, 0, err
@@ -340,7 +341,7 @@ func DecodeGGML(rs io.ReadSeeker, maxArraySize int) (*GGML, int64, error) {
        case FILE_MAGIC_GGUF_BE:
                c = &containerGGUF{ByteOrder: binary.BigEndian, maxArraySize: maxArraySize}
        default:
-               return nil, 0, errors.New("invalid file magic")
+               return nil, 0, errors.New(fmt.Sprintf("invalid file magic at %d", xx))
        }
 
        model, err := c.Decode(rs)
```
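For context on what this patch reports: the "magic" is the first four bytes of the file, which for GGUF must be the ASCII bytes `GGUF`. A minimal standalone sketch of the same check in Go follows; the constant values are my reading of the GGUF format and of ollama's `llm/ggml.go`, so treat them as assumptions rather than a definitive implementation.

```go
// magiccheck.go: verify that a file starts with the 4-byte GGUF magic.
// Hypothetical sketch; constants assumed from the GGUF spec.
package main

import (
	"encoding/binary"
	"fmt"
	"log"
	"os"
)

// The ASCII bytes 'G' 'G' 'U' 'F' read as a little-endian uint32, plus the
// byte-swapped value used for big-endian files.
const (
	ggufMagicLE uint32 = 0x46554747
	ggufMagicBE uint32 = 0x47475546
)

func main() {
	f, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	var magic uint32
	if err := binary.Read(f, binary.LittleEndian, &magic); err != nil {
		log.Fatal(err)
	}
	switch magic {
	case ggufMagicLE:
		fmt.Println("little-endian GGUF")
	case ggufMagicBE:
		fmt.Println("big-endian GGUF")
	default:
		fmt.Printf("invalid file magic: 0x%08x\n", magic)
	}
}
```

In this thread's case, the decoder presumably reaches the 8 padding bytes and tries to read them as the start of another GGUF, which is why the patched error message with the failing offset is useful.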

@yzyhyt commented on GitHub (Nov 5, 2024):

I appreciate your help and instruction. If your modification is merged, it will be easier to figure out this kind of problem.


@fengqiliang93 commented on GitHub (Nov 5, 2024):

> The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes at the end, which confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On Linux, you can remove these bytes with:
>
> ```
> truncate --size=4681089336 ggml-model-Q4_K_M.gguf
> ```
>
> On Windows:
>
> ```
> FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336
> ```

My ggml-model-Q4_K_M.gguf file was quantized by myself, and its size is 4681089440. After I followed this command, `ollama create ...` worked, but `ollama run ...` reported:

**Error: llama runner process has terminated: error loading model: tensor 'output.weight' data is not within the file bounds, model is corrupted or incomplete**


@rick-github commented on GitHub (Nov 5, 2024):

My advice was based on that specific file. A different file will require different corrective measures.


@arkohut commented on GitHub (Nov 6, 2024):

BTW, performing the quantization with ollama itself, via `ollama create xxxx -f Modelfile -q Q4_K_M`, does not run into this issue.


@hao7Chen commented on GitHub (Nov 11, 2024):

> The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes at the end, which confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On Linux, you can remove these bytes with:
>
> ```
> truncate --size=4681089336 ggml-model-Q4_K_M.gguf
> ```
>
> On Windows:
>
> ```
> FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336
> ```

This worked for me, thanks. Mine is minicpm Q4_0; I think this approach is useful at least for the minicpm gguf files.

Reference: github-starred/ollama#3929