[GH-ISSUE #6267] add openbmb MiniCPM-V-2_6 #50435

Closed
opened 2026-04-28 15:51:27 -05:00 by GiteaMirror · 20 comments

Originally created by @insinfo on GitHub (Aug 8, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6267

add openbmb MiniCPM-V-2_6
https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf/tree/main

GiteaMirror added the feature request label 2026-04-28 15:51:27 -05:00

@rick-github commented on GitHub (Aug 8, 2024):

Pending support in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/7599


@calonye commented on GitHub (Aug 9, 2024):

I hope Ollama will support this model out of the box, without needing to compile it myself, so that like other popular models it can be used directly after `ollama run`.
https://github.com/OpenBMB/MiniCPM-V/issues/422


@itsPreto commented on GitHub (Aug 9, 2024):

https://github.com/ggerganov/llama.cpp/pull/7599 has been merged. Can we get the ball rolling here? :)


@rick-github commented on GitHub (Aug 9, 2024):

Almost there: https://github.com/ggerganov/llama.cpp/pull/7599 was for MiniCPM-V-2_5. MiniCPM-V-2_6 requires another PR, but that will build on the existing work and shouldn't take long.


@xuhongming251 commented on GitHub (Aug 11, 2024):

https://ollama.com/xuxx/minicpm2.6

There is an installation package for Ollama on Windows, and the MiniCPM 2.6 model is available.


@rick-github commented on GitHub (Aug 11, 2024):

https://github.com/ggerganov/llama.cpp/pull/8967


@ChristianWeyer commented on GitHub (Aug 16, 2024):

> Pending support in llama.cpp: ggerganov/llama.cpp#7599

This and https://github.com/ggerganov/llama.cpp/pull/8967 got merged 🥳


@ttamoud commented on GitHub (Aug 17, 2024):

https://github.com/ggerganov/llama.cpp/pull/8967 has been merged. Can we get the ball rolling here, please? 💯


@tong-zeng commented on GitHub (Aug 21, 2024):

I've seen a lot of requests to add MiniCPM-V-2.6, but haven't seen any response or progress lately. Are there any problems or concerns with the MiniCPM-V-2.6 PR? Thanks.


@pdevine commented on GitHub (Sep 1, 2024):

Dupe of #6417


@itsPreto commented on GitHub (Sep 10, 2024):

I see this ticket is marked as closed, but I don't see any information on whether the video capabilities of this model are also implemented.


@rick-github commented on GitHub (Sep 10, 2024):

llama.cpp doesn't support video at the moment.


@colin4k commented on GitHub (Sep 11, 2024):

> I see this ticket is marked as closed but I don't see any information on whether the video capabilities of this model are also implemented?

https://ollama.com/library/minicpm-v


@ChristianWeyer commented on GitHub (Sep 11, 2024):

Is there an example of how to query this model with images?
Via CLI and via API @rick-github?


@rick-github commented on GitHub (Sep 13, 2024):

CLI (note: the image to be processed must be given as a full path (/xx/yy/xx.jpg) or a relative path (./xx.jpg); a bare filename (xx.jpg) won't be recognized as a file to attach to the prompt):

```console
$ ollama run minicpm-v:8b-2.6-q4_0
>>> describe this image: ./puppy.jpg
Added image './puppy.jpg'
The image depicts a small white puppy with soft, fluffy fur sitting on what 
appears to be an outdoor step or platform. The background is blurred but seems 
to consist of neutral colors that do not distract from the subject. Notably, the 
puppy has a red collar around its neck adorned with a single gold bell and two 
red decorative pieces resembling flowers. Its expression is curious yet alert 
as it gazes off into the distance.

>>> 
```

Raw API:

```console
$ curl -s http://localhost:11434/api/chat -d '{
  "model":"minicpm-v:8b-2.6-q4_0",
  "messages":[{
    "role":"user",
    "content":"describe the image",
    "images":["'$(base64 -w0 puppy.jpg)'"]}],
  "stream":false}' | jq -r .message.content

The image depicts a small, white puppy sitting on what appears to be a stone or 
concrete surface. The puppy has fluffy fur and bright eyes that convey curiosity 
or alertness. Around its neck is a red collar with a bell attached, which 
suggests the dog might have an outdoor environment where it interacts with 
people who like to hear their whereabouts through sound. There's no discernible 
context beyond this; there are no other animals, humans, or objects in the 
immediate vicinity that give further clues about the setting or story behind the 
image.
```

Python API:

```python
#!/usr/bin/env python3

import base64

import ollama

# Read the image and base64-encode it for the `images` field.
with open("puppy.jpg", "rb") as f:
    b64_image = base64.b64encode(f.read()).decode("utf-8")

client = ollama.Client()
response = client.chat(
    model="minicpm-v:8b-2.6-q4_0",
    messages=[{
        "role": "user",
        "content": "describe this image",
        "images": [b64_image],
    }],
)
print(response["message"]["content"])

"""
This image shows a small white puppy sitting on what appears to be a concrete 
surface, possibly steps. The puppy has soft, fluffy fur and bright eyes that 
are looking off into the distance with an innocent expression. It is wearing a 
red collar adorned with a single gold bell, which adds a touch of color 
contrast against its pure white coat. In the background, there's a blurred 
urban environment suggesting this photo might have been taken in a city or town 
setting. The puppy’s posture and gaze give it a curious yet gentle demeanor, 
making for an endearing image that captures the viewer's attention with its 
simplicity and charm.
"""
```

![puppy](https://github.com/user-attachments/assets/98bef0b5-cf0b-4558-a863-4a571b7f42cd)
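All three examples above pass the image as a base64 string in the `images` field (the CLI converts the file path for you). As a minimal sketch, that payload construction can be factored out on its own; `build_chat_payload` below is an illustrative helper, not part of the ollama library, and only builds the request body for `/api/chat`. Sending it still requires a running Ollama server.

```python
import base64
import json


def build_chat_payload(model: str, prompt: str, image_path: str) -> str:
    """Return the JSON body for POST /api/chat with one attached image.

    The image is read from disk and base64-encoded, which is the format
    the `images` field expects in the curl and Python examples above.
    """
    with open(image_path, "rb") as f:
        b64_image = base64.b64encode(f.read()).decode("utf-8")
    return json.dumps({
        "model": model,
        "messages": [{
            "role": "user",
            "content": prompt,
            "images": [b64_image],
        }],
        "stream": False,
    })
```

The resulting string can be POSTed to `http://localhost:11434/api/chat` with any HTTP client.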


@ChristianWeyer commented on GitHub (Sep 13, 2024):

Perfect, thank you @rick-github


@colin4k commented on GitHub (Sep 13, 2024):

> Is there an example of how to query this model with images? Via CLI and via API @rick-github?

ollama run minicpm-v "extract all java code from the image: ./1.png"

But I compared minicpm-v and qwen2-vl; qwen2-vl performed much better.


@ChristianWeyer commented on GitHub (Sep 13, 2024):

> > Is there an example of how to query this model with images? Via CLI and via API @rick-github?
>
> ollama run minicpm-v "extract all java code from the image: ./1.png"
>
> But I compared minicpm-v and qwen2-vl, qwen2-vl performed much better.

Interesting - is qwen2-vl in Ollama registry? Could not find it.


@rick-github commented on GitHub (Sep 13, 2024):

Not supported in llama.cpp yet: https://github.com/ggerganov/llama.cpp/issues/9246


@colin4k commented on GitHub (Sep 14, 2024):

> > > Is there an example of how to query this model with images? Via CLI and via API @rick-github?
> >
> > ollama run minicpm-v "extract all java code from the image: ./1.png"
> >
> > But I compared minicpm-v and qwen2-vl, qwen2-vl performed much better.
>
> Interesting - is qwen2-vl in Ollama registry? Could not find it.

Ollama doesn't support qwen2-vl yet; you have to run qwen2-vl from its source code.

Reference: github-starred/ollama#50435