[GH-ISSUE #2290] LLaVA 1.6 now available #47832

Closed
opened 2026-04-28 05:26:40 -05:00 by GiteaMirror · 13 comments
Owner

Originally created by @coder543 on GitHub (Jan 31, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2290

https://llava-vl.github.io/blog/2024-01-30-llava-1-6/

Supposedly a big improvement

GiteaMirror added the model label 2026-04-28 05:26:40 -05:00

@Donno191 commented on GitHub (Feb 1, 2024):

I tested it a few hours ago

llava web site :
![WhatsApp Image 2024-02-01 at 05 59 33](https://github.com/ollama/ollama/assets/10705947/48c9ffdd-afdb-41a5-8302-f9c34ee4ed90)

ollama :
![WhatsApp Image 2024-02-01 at 05 59 36](https://github.com/ollama/ollama/assets/10705947/ef32e8ce-58ea-4e57-8706-abd577b15dc4)
![WhatsApp Image 2024-02-01 at 05 59 39](https://github.com/ollama/ollama/assets/10705947/cbf0f636-50ad-44ed-90b9-3b0ba4454a18)

34b_Q4KM
and 7b_fp16

Not that great of a result, to be honest! Is there anyone who can test llava-34B_fp16? I just don't have enough RAM :/


@Donno191 commented on GitHub (Feb 1, 2024):

I would like to see if llava-1.6v-34B_fp16 from the ollama models will give the same results as the llava website; image attached below:
![Table](https://github.com/ollama/ollama/assets/10705947/910a29f2-3be6-470a-a92c-85fb6636e589)


@hskalin commented on GitHub (Feb 1, 2024):

How can I get embeddings for an image using llava? I know about the API endpoint, but what prompt exactly should I give it?


@henryclw commented on GitHub (Feb 1, 2024):

@Donno191 Hi, great to see such test results, thanks a lot. May I ask where you got the quantized model weights?


@Donno191 commented on GitHub (Feb 1, 2024):

@henryclw I used ollama's models:
ollama run llava:7b-v1.6-mistral-fp16
ollama run llava:34b-v1.6-q4_K_M
https://llava.hliu.cc/
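For anyone wanting to reproduce these tests programmatically rather than through `ollama run`, here is a minimal sketch of the request body for Ollama's `/api/generate` endpoint. The model name is taken from the commands above; the payload shape (base64-encoded images in an `"images"` list) follows Ollama's documented API, but verify it against your installed version:

```python
import base64
import json

def build_llava_request(model: str, prompt: str, image_bytes: bytes) -> str:
    """Build the JSON body for a POST to Ollama's /api/generate endpoint.

    Images are passed as a list of base64-encoded strings in the
    "images" field alongside the text prompt.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return json.dumps(payload)

# Example: describe a local image with the 7b model mentioned above.
# (POST this body to http://localhost:11434/api/generate with a server running.)
body = build_llava_request(
    "llava:7b-v1.6-mistral-fp16",
    "Describe this image.",
    b"\x89PNG...",  # placeholder bytes; read a real image file in practice
)
```

Comparing the output against https://llava.hliu.cc/ this way makes the preprocessing difference discussed below easier to isolate, since the prompt and image are held fixed.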


@henryclw commented on GitHub (Feb 1, 2024):

> @Donno191

Thank you very much. Have a great day!


@adriens commented on GitHub (Feb 1, 2024):

The results are just amazing:

![image](https://github.com/ollama/ollama/assets/5235127/e8f4dde6-3341-4e41-8730-d387f39ba4d4)
![image](https://github.com/ollama/ollama/assets/5235127/fa030b06-2e40-463a-85ea-e3279c0f1530)


@Donno191 commented on GitHub (Feb 2, 2024):

@adriens thank you!


@adriens commented on GitHub (Feb 2, 2024):

If you want, I can make the notebook public... I'll do the storytelling later, just let me know @Donno191 💭


@Donno191 commented on GitHub (Feb 2, 2024):

@adriens No, it is fine. No worries, have a great day :)


@chigkim commented on GitHub (Feb 3, 2024):

I believe Llama.cpp does not support Llava v1.6 completely yet. There's a [PR](https://github.com/ggerganov/llama.cpp/pull/5267) for partial support.

@cmp-nct, the author of the PR above, said:

> With these tools you can convert llava-1.6 into a llama.cpp GGUF file and it will work for inferencing.
> But as long as the image preprocessing is not integrated, it will not provide the same quality in results.
> Right now llama.cpp will create the usual 14 patches of a rectangular padded 336 pixel image.
> But the big change in llava-1.6 was the preprocessing in how patches are split up into image regions of much higher resolutions, it does not need the padding/cropping anymore.

Did the Ollama folks fork llama.cpp and complete the llava v1.6 architecture, including the image preprocessing?
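For context on the numbers in the quote: llava's vision tower (CLIP ViT-L/14 at 336px) cuts a 336×336 input into 14-pixel patches, so a single padded image always yields the same token budget no matter how large the source image is. A quick sketch of that arithmetic, and of why llava 1.6's multi-region preprocessing raises the effective resolution (the 2×2 grid here is illustrative, not the exact AnyRes configuration):

```python
def vit_patch_tokens(image_size: int = 336, patch_size: int = 14) -> int:
    """Number of patch tokens a ViT produces for a square image."""
    assert image_size % patch_size == 0
    return (image_size // patch_size) ** 2

# llava 1.5 style: one padded 336x336 image -> 24*24 = 576 patch tokens,
# regardless of the original image resolution.
single = vit_patch_tokens()

# llava 1.6 style (illustrative): the image is split into a grid of
# 336x336 regions plus a downscaled overview, so e.g. a 672x672 input
# is seen at full resolution instead of being shrunk/padded to 336x336.
grid = 2 * 2                              # four high-resolution regions
multi = (grid + 1) * vit_patch_tokens()   # regions + overview
```

That gap in patch tokens is why converting the weights alone, without the new preprocessing, gives lower-quality results than the llava demo site.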


@pdevine commented on GitHub (Feb 5, 2024):

Going to close this, as it's now supported.


@chigkim commented on GitHub (Feb 10, 2024):

It's not completed yet. Can you guys mark Llava 1.6 as partial support? It's not fully supported in Llama.cpp.

People assume it behaves the same as the reference Llava 1.6 implementation, and it's not there yet.

https://github.com/ggerganov/llama.cpp/pull/5267

The dev from Llava is also chiming in there to complete the PR.


Reference: github-starred/ollama#47832