[GH-ISSUE #6272] Ollama Creat 手动部署报错 Error: invalid file magic #3929

New Issue

@rick-github commented on GitHub (Aug 9, 2024):

MiniCPM-V-2_6 is not supported yet. When https://github.com/ggerganov/llama.cpp/pull/7599 is merged then it will work.

@rick-github commented on GitHub (Aug 9, 2024): MiniCPM-V-2_6 is not supported yet. When https://github.com/ggerganov/llama.cpp/pull/7599 is merged then it will work.

GiteaMirror commented

@JaminYan commented on GitHub (Aug 9, 2024):

Thanks

@JaminYan commented on GitHub (Aug 9, 2024): Thanks

GiteaMirror commented

@jmorganca commented on GitHub (Aug 25, 2024):

Closing as we are tracking if MiniCPM is supported! Thanks for the issue, and thanks @rick-github

@jmorganca commented on GitHub (Aug 25, 2024): Closing as we are tracking if MiniCPM is supported! Thanks for the issue, and thanks @rick-github

GiteaMirror commented

2026-04-12 14:48:24 -05:00

@arkohut commented on GitHub (Sep 29, 2024):

The official model minicpm-v is already there. So I think right now it is supported? But I still get error message "Error: invalid file magic" when I try to create a Q4_K_M model using ollama.

BTW ollama can create a fp16 model successfully.

@arkohut commented on GitHub (Sep 29, 2024): The official model minicpm-v is already there. So I think right now it is supported? But I still get error message "Error: invalid file magic" when I try to create a Q4_K_M model using ollama. BTW ollama can create a fp16 model successfully.

GiteaMirror commented

@yjflike1 commented on GitHub (Oct 3, 2024):

I meet the same issue...

@yjflike1 commented on GitHub (Oct 3, 2024): I meet the same issue...

GiteaMirror commented

2026-04-12 14:48:24 -05:00

@yzyhyt commented on GitHub (Nov 4, 2024):

I have met the same problem when creating minicpm-v 2.6 Q4_K_M model using ollama, while llama.cpp can smoothly run the model.

@yzyhyt commented on GitHub (Nov 4, 2024): I have met the same problem when creating minicpm-v 2.6 Q4_K_M model using ollama, while llama.cpp can smoothly run the model.

GiteaMirror commented

2026-04-12 14:48:24 -05:00

@rick-github commented on GitHub (Nov 4, 2024):

What commands are you running to create the model?

@rick-github commented on GitHub (Nov 4, 2024): What commands are you running to create the model?

GiteaMirror commented

2026-04-12 14:48:25 -05:00

@yzyhyt commented on GitHub (Nov 4, 2024):

What commands are you running to create the model?

I've used ollama create minicpm2.6 -f minicpmv2_6.Modelfile to create the model.

minicpmv2_6.Modelfile is like this:

FROM ./MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf
FROM ./MiniCPM-V-2_6/mmproj-model-f16.gguf

TEMPLATE """{{ if .System }}<|im_start|>system

{{ .System }}<|im_end|>{{ end }}

{{ if .Prompt }}<|im_start|>user

{{ .Prompt }}<|im_end|>{{ end }}

<|im_start|>assistant<|im_end|>

{{ .Response }}<|im_end|>"""

PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_end|>"
PARAMETER num_ctx 2048

@yzyhyt commented on GitHub (Nov 4, 2024): > What commands are you running to create the model? I've used `ollama create minicpm2.6 -f minicpmv2_6.Modelfile` to create the model. minicpmv2_6.Modelfile is like this: ``` FROM ./MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf FROM ./MiniCPM-V-2_6/mmproj-model-f16.gguf TEMPLATE """{{ if .System }}<|im_start|>system {{ .System }}<|im_end|>{{ end }} {{ if .Prompt }}<|im_start|>user {{ .Prompt }}<|im_end|>{{ end }} <|im_start|>assistant<|im_end|> {{ .Response }}<|im_end|>""" PARAMETER stop "<|endoftext|>" PARAMETER stop "<|im_end|>" PARAMETER num_ctx 2048 ```

GiteaMirror commented

2026-04-12 14:48:25 -05:00

@rick-github commented on GitHub (Nov 4, 2024):

Where are you getting the GGUF files from, or how are you creating them?

@rick-github commented on GitHub (Nov 4, 2024): Where are you getting the GGUF files from, or how are you creating them?

GiteaMirror commented

@yzyhyt commented on GitHub (Nov 4, 2024):

Where are you getting the GGUF files from, or how are you creating them?

I directly download the GGUF files from https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf/tree/main instead of manually converting from pytorch weights.

@yzyhyt commented on GitHub (Nov 4, 2024): > Where are you getting the GGUF files from, or how are you creating them? I directly download the GGUF files from https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf/tree/main instead of manually converting from pytorch weights.

GiteaMirror commented

@rick-github commented on GitHub (Nov 4, 2024):

The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes on the end, these confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On linux, you can remove these bytes with:

truncate --size=4681089336 ggml-model-Q4_K_M.gguf

On windows:

FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336

@rick-github commented on GitHub (Nov 4, 2024): The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes on the end, these confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On linux, you can remove these bytes with: ``` truncate --size=4681089336 ggml-model-Q4_K_M.gguf ``` On windows: ``` FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336 ```

GiteaMirror commented

@yzyhyt commented on GitHub (Nov 4, 2024):

The issue has been fixed. Thank you so much.
By the way, could you tell me how to swiftly judge if a file has extra null bytes on its end? I used xxd command but it's too slow.

@yzyhyt commented on GitHub (Nov 4, 2024): The issue has been fixed. Thank you so much. By the way, could you tell me how to swiftly judge if a file has extra null bytes on its end? I used xxd command but it's too slow.

GiteaMirror commented

@rick-github commented on GitHub (Nov 4, 2024):

$ xxd -R never -s -32 ggml-model-Q4_K_M.gguf
11703c120: aac7 5652 a521 b9a5 65b4 65b8 aca0 5554  ..VR.!..e.e...UT
11703c130: 80ad a564 9499 8900 0000 0000 0000 0000  ...d............

@rick-github commented on GitHub (Nov 4, 2024): ```console $ xxd -R never -s -32 ggml-model-Q4_K_M.gguf 11703c120: aac7 5652 a521 b9a5 65b4 65b8 aca0 5554 ..VR.!..e.e...UT 11703c130: 80ad a564 9499 8900 0000 0000 0000 0000 ...d............ ```

GiteaMirror commented

@rick-github commented on GitHub (Nov 4, 2024):

I have a modified ollama that I used to detect this problem.

diff --git a/llm/ggml.go b/llm/ggml.go
index e857d4b8..0f92965a 100644
--- a/llm/ggml.go
+++ b/llm/ggml.go
@@ -324,6 +324,7 @@ func DecodeGGML(rs io.ReadSeeker, maxArraySize int) (*GGML, int64, error) {
 
        rs = bufioutil.NewBufferedSeeker(rs, 32<<10)
 
+        xx, _ := rs.Seek(0, io.SeekCurrent)
        var magic uint32
        if err := binary.Read(rs, binary.LittleEndian, &magic); err != nil {
                return nil, 0, err
@@ -340,7 +341,7 @@ func DecodeGGML(rs io.ReadSeeker, maxArraySize int) (*GGML, int64, error) {
        case FILE_MAGIC_GGUF_BE:
                c = &containerGGUF{ByteOrder: binary.BigEndian, maxArraySize: maxArraySize}
        default:
-               return nil, 0, errors.New("invalid file magic")
+               return nil, 0, errors.New(fmt.Sprintf("invalid file magic at %d", xx))
        }
 
        model, err := c.Decode(rs)

@rick-github commented on GitHub (Nov 4, 2024): I have a modified ollama that I used to detect this problem. ```diff diff --git a/llm/ggml.go b/llm/ggml.go index e857d4b8..0f92965a 100644 --- a/llm/ggml.go +++ b/llm/ggml.go @@ -324,6 +324,7 @@ func DecodeGGML(rs io.ReadSeeker, maxArraySize int) (*GGML, int64, error) { rs = bufioutil.NewBufferedSeeker(rs, 32<<10) + xx, _ := rs.Seek(0, io.SeekCurrent) var magic uint32 if err := binary.Read(rs, binary.LittleEndian, &magic); err != nil { return nil, 0, err @@ -340,7 +341,7 @@ func DecodeGGML(rs io.ReadSeeker, maxArraySize int) (*GGML, int64, error) { case FILE_MAGIC_GGUF_BE: c = &containerGGUF{ByteOrder: binary.BigEndian, maxArraySize: maxArraySize} default: - return nil, 0, errors.New("invalid file magic") + return nil, 0, errors.New(fmt.Sprintf("invalid file magic at %d", xx)) } model, err := c.Decode(rs) ```

GiteaMirror commented

@yzyhyt commented on GitHub (Nov 5, 2024):

I appreciate your help and instruction. if your modification is merged, it will be easier to figure out the problem.

@yzyhyt commented on GitHub (Nov 5, 2024): I appreciate your help and instruction. if your modification is merged, it will be easier to figure out the problem.

GiteaMirror commented

@fengqiliang93 commented on GitHub (Nov 5, 2024):

The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes on the end, these confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On linux, you can remove these bytes with:
truncate --size=4681089336 ggml-model-Q4_K_M.gguf
On windows:
FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336

My ggml-model-Q4_K_M.gguf file is quantified by myself, and the size is 4681089440. After I follow this command, ollama create... worked, but ollama run... reports error was :

Error: llama runner process has terminated: error loading model: tensor 'output.weight' data is not within the file bounds, model is corrupted or incomplete

@fengqiliang93 commented on GitHub (Nov 5, 2024): > The ggml-model-Q4_K_M.gguf that I downloaded has 8 extra null bytes on the end, these confuse ollama. If your copy of the file is 4681089344 bytes long, this is the problem. On linux, you can remove these bytes with: > > ``` > truncate --size=4681089336 ggml-model-Q4_K_M.gguf > ``` > > On windows: > > ``` > FSUTIL file seteof ggml-model-Q4_K_M.gguf 4681089336 > ``` My ggml-model-Q4_K_M.gguf file is quantified by myself, and the size is 4681089440. After I follow this command, `ollama create...` worked, but `ollama run...` reports error was : **Error: llama runner process has terminated: error loading model: tensor 'output.weight' data is not within the file bounds, model is corrupted or incomplete**

GiteaMirror commented