[GH-ISSUE #2290] LLaVA 1.6 now available #47832

Closed
opened 2026-04-28 05:26:40 -05:00 by GiteaMirror · 13 comments
Owner

Originally created by @coder543 on GitHub (Jan 31, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2290

https://llava-vl.github.io/blog/2024-01-30-llava-1-6/

Supposedly a big improvement

GiteaMirror added the model label 2026-04-28 05:26:40 -05:00

@Donno191 commented on GitHub (Feb 1, 2024):

I tested it a few hours ago

llava web site :
![WhatsApp Image 2024-02-01 at 05 59 33](https://github.com/ollama/ollama/assets/10705947/48c9ffdd-afdb-41a5-8302-f9c34ee4ed90)

ollama :
![WhatsApp Image 2024-02-01 at 05 59 36](https://github.com/ollama/ollama/assets/10705947/ef32e8ce-58ea-4e57-8706-abd577b15dc4)
![WhatsApp Image 2024-02-01 at 05 59 39](https://github.com/ollama/ollama/assets/10705947/cbf0f636-50ad-44ed-90b9-3b0ba4454a18)

34b_Q4KM
and 7b_fp16

Not that great of a result, to be honest! Is there anyone who can test llava-34B_fp16? I just don't have enough RAM :/


@Donno191 commented on GitHub (Feb 1, 2024):

I would like to see if llava-1.6v-34B_fp16 from the ollama models will give the same results as the llava website; image attached below:
![Table](https://github.com/ollama/ollama/assets/10705947/910a29f2-3be6-470a-a92c-85fb6636e589)


@hskalin commented on GitHub (Feb 1, 2024):

How can I get embeddings for an image using llava? I know about the API endpoint, but what prompt exactly should I give it?


@henryclw commented on GitHub (Feb 1, 2024):

@Donno191 Hi, great to see such test results, thanks a lot. May I ask where you got the quantized model weights?


@Donno191 commented on GitHub (Feb 1, 2024):

@henryclw I used ollama's models:
ollama run llava:7b-v1.6-mistral-fp16
ollama run llava:34b-v1.6-q4_K_M
https://llava.hliu.cc/
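For anyone wanting to reproduce these tests programmatically rather than through `ollama run`, here is a minimal sketch of the request body for Ollama's `/api/generate` endpoint. The model name is taken from the commands above; the payload shape (base64-encoded images in an `"images"` list) follows Ollama's documented API, but verify it against your installed version:

```python
import base64
import json

def build_llava_request(model: str, prompt: str, image_bytes: bytes) -> str:
    """Build the JSON body for a POST to Ollama's /api/generate endpoint.

    Images are passed as a list of base64-encoded strings in the
    "images" field alongside the text prompt.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return json.dumps(payload)

# Example: describe a local image with the 7b model mentioned above.
# (POST this body to http://localhost:11434/api/generate with a server running.)
body = build_llava_request(
    "llava:7b-v1.6-mistral-fp16",
    "Describe this image.",
    b"\x89PNG...",  # placeholder bytes; read a real image file in practice
)
```

Comparing the output against https://llava.hliu.cc/ this way makes the preprocessing difference discussed below easier to isolate, since the prompt and image are held fixed.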


@henryclw commented on GitHub (Feb 1, 2024):

> @Donno191

Thank you very much. Have a great day!


@adriens commented on GitHub (Feb 1, 2024):

The results are just amazing:

![image](https://github.com/ollama/ollama/assets/5235127/e8f4dde6-3341-4e41-8730-d387f39ba4d4)
![image](https://github.com/ollama/ollama/assets/5235127/fa030b06-2e40-463a-85ea-e3279c0f1530)


@Donno191 commented on GitHub (Feb 2, 2024):

@adriens thank you!


@adriens commented on GitHub (Feb 2, 2024):

If you want, I can make the notebook public... I'll do the storytelling later, just let me know @Donno191 💭


@Donno191 commented on GitHub (Feb 2, 2024):

@adriens No, it is fine. No worries, have a great day :)


@chigkim commented on GitHub (Feb 3, 2024):

I believe Llama.cpp does not support Llava v1.6 completely yet. There's a [PR](https://github.com/ggerganov/llama.cpp/pull/5267) for partial support.

@cmp-nct, the author of the PR above, said:

> With these tools you can convert llava-1.6 into a llama.cpp GGUF file and it will work for inferencing.
> But as long as the image preprocessing is not integrated, it will not provide the same quality in results.
> Right now llama.cpp will create the usual 14 patches of a rectangular padded 336 pixel image.
> But the big change in llava-1.6 was the preprocessing in how patches are split up into image regions of much higher resolutions, it does not need the padding/cropping anymore.

Did the Ollama folks fork llama.cpp and complete the llava v1.6 architecture, including the image preprocessing?
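For context on the numbers in the quote: llava's vision tower (CLIP ViT-L/14 at 336px) cuts a 336×336 input into 14-pixel patches, so a single padded image always yields the same token budget no matter how large the source image is. A quick sketch of that arithmetic, and of why llava 1.6's multi-region preprocessing raises the effective resolution (the 2×2 grid here is illustrative, not the exact AnyRes configuration):

```python
def vit_patch_tokens(image_size: int = 336, patch_size: int = 14) -> int:
    """Number of patch tokens a ViT produces for a square image."""
    assert image_size % patch_size == 0
    return (image_size // patch_size) ** 2

# llava 1.5 style: one padded 336x336 image -> 24*24 = 576 patch tokens,
# regardless of the original image resolution.
single = vit_patch_tokens()

# llava 1.6 style (illustrative): the image is split into a grid of
# 336x336 regions plus a downscaled overview, so e.g. a 672x672 input
# is seen at full resolution instead of being shrunk/padded to 336x336.
grid = 2 * 2                              # four high-resolution regions
multi = (grid + 1) * vit_patch_tokens()   # regions + overview
```

That gap in patch tokens is why converting the weights alone, without the new preprocessing, gives lower-quality results than the llava demo site.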


@pdevine commented on GitHub (Feb 5, 2024):

Going to close this, as it's now supported.


@chigkim commented on GitHub (Feb 10, 2024):

It's not completed yet. Can you guys mark Llava 1.6 as partial support? It's not fully supported in Llama.cpp.

People assume it behaves the same as the reference Llava 1.6 implementation, and it's not there yet.

https://github.com/ggerganov/llama.cpp/pull/5267

The dev from Llava is also chiming in there to complete the PR.


Reference: github-starred/ollama#47832