[GH-ISSUE #4900] MiniCPM-Llama3-V-2_5 #3094

Open
opened 2026-04-12 13:32:00 -05:00 by GiteaMirror · 19 comments

Originally created by @kotaxyz on GitHub (Jun 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4900

This is the best open-source vision model I have ever tried. We need support for it in ollama.

GiteaMirror added the model label 2026-04-12 13:32:00 -05:00

@GingerNg commented on GitHub (Jun 7, 2024):

https://ollama.com/hhao/openbmb-minicpm-llama3-v-2_5 — I tried to pull the model, but it failed many times:
![image](https://github.com/ollama/ollama/assets/13898738/aa2d21c3-2251-413b-a9e4-a0e5ed53ed03)


@Greatz08 commented on GitHub (Jun 8, 2024):

@kotaxyz did you try downloading the GGUF files, creating a Modelfile with them, and running that? I tried a smaller variant but got a core-dump error even though I have enough VRAM, while LLaVA 1.6 ran without issues despite being comparatively bigger; I recently opened #4925 about this. I thought multimodal models were supported, so I downloaded the GGUF files, created a Modelfile, and tried to run it, but hit errors whose logs I posted on my issue. My question: do multimodal models need explicit support from ollama, so that only specific variants can run? Pretty confused right now :-))


@kotaxyz commented on GitHub (Jun 8, 2024):

> @kotaxyz did you try downloading the GGUF files, creating a Modelfile with them, and running that? […]

Hi @HakaishinShwet, yes, I downloaded the files and created the Modelfile twice. The first time the model hallucinated on every image I gave it; the second time it worked as a language model but had no vision capabilities.


@Greatz08 commented on GitHub (Jun 8, 2024):

@kotaxyz did you also use the mmproj.F16.gguf file while building the model from the Modelfile, or did you forget it? Without it even I can run the standalone MiniCPM GGUF, but it acts as a chat-only model, not a multimodal one.
When I built without mmproj.gguf it worked as a chat model, but when I included mmproj.gguf to build the multimodal model, it would not run and showed the errors I mentioned in issue #4925.
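
For reference, here is a minimal sketch of the two-file Modelfile pattern that community repackagings of this model pair with the vision projector. The filenames and the exact template are assumptions (the template follows the Llama 3 chat format the base model uses), and whether a stock ollama build accepts the projector for this architecture is exactly what this issue tracks:

```
# Hypothetical Modelfile sketch -- filenames are placeholders.
# The first FROM is the language model, the second the vision projector.
FROM ./MiniCPM-Llama3-V-2_5.Q4_K_M.gguf
FROM ./mmproj.F16.gguf

# Llama 3 style chat template (assumed from the base model)
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}"""
PARAMETER stop "<|eot_id|>"
```

Such a file would be built with `ollama create minicpm-v -f Modelfile`.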


@kotaxyz commented on GitHub (Jun 8, 2024):

> @kotaxyz did you also use the mmproj.F16.gguf file while building the model from the Modelfile, or did you forget it? […]

Ah yes, I was using only one model file, but I noticed some people saying both files are required in the Modelfile, and they have also been getting a different kind of error. That's why I was asking whether there will be an official release from the ollama maintainers; that would at least reduce the hassle and guarantee it works for everyone. I don't want to go down the rabbit hole of rebuilding the binaries myself.


@Greatz08 commented on GitHub (Jun 9, 2024):

> > @kotaxyz did you also use the mmproj.F16.gguf file while building the model from the Modelfile, or did you forget it? […]
>
> Ah yes, I was using only one model file. […] I don't want to go down the rabbit hole of rebuilding the binaries myself.

@kotaxyz yeah, that's what I created issue #4925 for. Check the reply there and you'll understand why we're getting the error.


@kotaxyz commented on GitHub (Jun 9, 2024):

> https://ollama.com/hhao/openbmb-minicpm-llama3-v-2_5 — I tried to pull the model, but it failed many times […]

@GingerNg yup, same here. Tried four times; it always gets stuck.


@Greatz08 commented on GitHub (Jun 12, 2024):

@jmorganca any progress on this? :-))


@kotaxyz commented on GitHub (Jun 19, 2024):

Hi @jmorganca, are there any plans to add this model?


@Milor123 commented on GitHub (Jul 4, 2024):

I get the error `Error: llama runner process has terminated: signal: aborted (core dumped)` on an RTX 4070 with 12 GB VRAM:

```
ollama run hhao/openbmb-minicpm-llama3-v-2_5
pulling manifest
pulling 391d11736c3c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.0 GB
pulling 010ec3ba94cb... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 4.9 GB
pulling 8ab4849b038c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏  254 B
pulling 2c527a8fcba5... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏  124 B
pulling ada64ec88682... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏  493 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Error: llama runner process has terminated: signal: aborted (core dumped)
```

What should I do?


@lstep commented on GitHub (Jul 8, 2024):

```
$ ollama -v
ollama version is 0.1.48
```

Trying to run `hhao/openbmb-minicpm-llama3-v-2_5:q8_0`, I get the following error: `GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/examples/llava/clip.cpp:1024: new_clip->has_llava_projector`


@Forevery1 commented on GitHub (Jul 9, 2024):

> Trying to run `hhao/openbmb-minicpm-llama3-v-2_5:q8_0`, I get the following error: `GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/examples/llava/clip.cpp:1024: new_clip->has_llava_projector`

At present, this model is not supported, but llama.cpp has a PR about this model

https://github.com/ggerganov/llama.cpp/pull/7599


@Milor123 commented on GitHub (Jul 9, 2024):

> At present, this model is not supported, but llama.cpp has a PR about this model: ggerganov/llama.cpp#7599

What is it? Can it be used with ollama for vision?


@lstep commented on GitHub (Jul 9, 2024):

> > At present, this model is not supported, but llama.cpp has a PR about this model: ggerganov/llama.cpp#7599
>
> What is it? Can it be used with ollama for vision?

Well, once llama.cpp supports it, yes. Vision is already supported in ollama for supported LLMs, like moondream for example.
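
As a side note, vision-capable models in ollama can also be exercised straight from the CLI by including an image path in the prompt. A minimal sketch, assuming a local file `./dog.png` (the model tag is the one in the library):

```
# Hypothetical session: ollama picks up the image path in the prompt
# and passes it to the multimodal model alongside the text.
ollama run moondream "Describe this image: ./dog.png"
```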


@cmp-nct commented on GitHub (Jul 11, 2024):

> Well, once llama.cpp supports it, yes. Vision is already supported in ollama for supported LLMs, like moondream for example.

moondream is currently not supported; the SigLIP CLIP encoder it uses does not seem to work correctly.


@lstep commented on GitHub (Jul 11, 2024):

> moondream is currently not supported; the SigLIP CLIP encoder it uses does not seem to work correctly.

What are you talking about??? moondream is officially supported; it is even in the official library (https://ollama.com/library/moondream). I've been using it for months:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"os"

	"github.com/ollama/ollama/api"
)

func main() {
	if len(os.Args) <= 1 {
		log.Fatal("usage: <image name>")
	}

	// Read the image file passed on the command line.
	imgData, err := os.ReadFile(os.Args[1])
	if err != nil {
		log.Fatalf("error reading image file: %v", err)
	}

	clientURL, err := url.Parse("http://127.0.0.1:11434")
	if err != nil {
		log.Fatalf("error parsing client URL: %v", err)
	}

	// Send a generate request with the image attached.
	client := api.NewClient(clientURL, http.DefaultClient)
	req := &api.GenerateRequest{
		Model:  "moondream:1.8b-v2-fp16",
		Prompt: "describe this image",
		Images: []api.ImageData{imgData},
	}

	// Stream the response tokens to stdout as they arrive.
	ctx := context.Background()
	respFunc := func(resp api.GenerateResponse) error {
		fmt.Print(resp.Response)
		return nil
	}

	err = client.Generate(ctx, req, respFunc)
	if err != nil {
		log.Fatalf("error generating response: %v", err)
	}

	fmt.Println("A comment about life and its mysteries.")
}
```

![dog](https://github.com/user-attachments/assets/a8a281d5-5fd9-4814-9828-e9aa707518dc)

```
$ ./gollama dog.png

In the image, a large brown dog with shaggy fur is the main focus. The dog's tongue is out and its mouth appears slightly open, giving off an impression of relaxation or playfulness. The dog stands on all fours, facing to the right side of the image, creating a sense of curiosity about what it might be seeing or doing in that direction. The background features a grassy area dotted with trees, adding a touch of nature to the scene and providing a tranquil setting for this furry companion.A comment about life and its mysteries.
```

@cmp-nct commented on GitHub (Jul 12, 2024):

> What are you talking about??? moondream is officially supported; it is even in the official library (https://ollama.com/library/moondream). I've been using it for months: […]

I'm sorry, I thought you meant moondream2, which runs but doesn't work properly (very poor quality).
I am not sure about moondream1; I recall I tested it and it did not work well on llama.cpp, but that was many months ago.


@lingyezhixing commented on GitHub (Aug 1, 2024):

Is support for this model progressing? Or are there plans to adapt it?


@wlrnet commented on GitHub (Aug 13, 2024):

The distributed macOS app version still does not support this model; you still need to rebuild the ./ollama binary.
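
For anyone going that route, here is a rough sketch of a from-source build, following ollama's development docs of the time. The llama.cpp patching step is an assumption, since the projector support lived in the then-unmerged ggerganov/llama.cpp#7599:

```
# Hypothetical sketch: rebuild ollama so the vendored llama.cpp
# includes the MiniCPM-V projector changes.
git clone https://github.com/ollama/ollama.git
cd ollama
# (apply or vendor the changes from ggerganov/llama.cpp#7599 here)
go generate ./...   # builds the vendored llama.cpp runners
go build .          # produces the ./ollama binary
```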


Reference: github-starred/ollama#3094