[GH-ISSUE #1655] Ollama - Bakllava not working #62964

Closed
opened 2026-05-03 11:02:23 -05:00 by GiteaMirror · 4 comments

Originally created by @donnadulcinea on GitHub (Dec 21, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1655

I am using Ollama in Docker.
I pulled bakllava:latest (7B). Its results usually look pretty good in demos,
but I can't get it to work in Ollama.

Steps to reproduce:

root@3f5b2487f983:~/.ollama# ollama run bakllava
>>> what is in this image? /root/.ollama/mylab/dog.jpg
4-legged canine companion.

>>> what is in this image? /root/.ollama/mylab/many_things.png
4 legged canine companion.

>>> what is in this image? /root/.ollama/mylab/sldjfaòsldjfalskjdf
4 legged canine companion.

All of these answers are wrong; it is obviously repeating the same sentence:

  • The first image is actually of a little cat. I put "dog" in the file name so as not to hint at the content.
  • The second image shows airplanes.
  • The third image doesn't exist; the path is just random text.

What I have tried

I also tried the API (`http://localhost:11434/api/generate`) with the base64-encoded image. Random responses.
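For reference, the API expects the image to go base64-encoded into the `images` array of the request body, not as a file path in the prompt. A minimal sketch of building such a request (the helper name is my own, not part of Ollama):

```python
import base64
import json

def build_generate_payload(model: str, prompt: str, image_path: str) -> str:
    """Build a JSON body for Ollama's /api/generate endpoint.

    The image is read from disk and base64-encoded into the "images"
    array (raw base64, no "data:image/..." prefix).
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "images": [image_b64],
    })
```

The resulting payload can then be POSTed with any HTTP client, e.g. `curl -d @payload.json http://localhost:11434/api/generate`.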

I also rebuilt the container, because as suggested in [this issue](https://github.com/jmorganca/ollama/issues/1586), something might be corrupting the memory.

What can I try to solve this?


@igorschlum commented on GitHub (Dec 21, 2023):

hi @donnadulcinea

On a Mac Studio with plenty of memory, it works well. What is your computer? How much memory do you have?

(base) igor@MacStudiodeIgor ~ % ollama run bakllava
pulling manifest
pulling deb26e54cceb... 100% ▕██████████████████████▏ 4.1 GB
pulling addb9fdda3a5... 100% ▕██████████████████████▏ 624 MB
pulling d5ca8c59f62d... 100% ▕██████████████████████▏ 46 B
pulling 17b7e63fbe77... 100% ▕██████████████████████▏ 51 B
pulling b15ee2b77419... 100% ▕██████████████████████▏ 490 B
verifying sha256 digest
writing manifest
removing any unused layers
success

>>> what is in this image? /Users/igor/Desktop/st.png
Added image '/Users/igor/Desktop/st.png'
3D printed bust of a man

>>> what is in this image? /Users/igor/Desktop/bo.png
Added image '/Users/igor/Desktop/bo.png'
urn

>>> describe precisely all you can see in this image? /Users/igor/Downloads/c.jpg
Added image '/Users/igor/Downloads/c.jpg'
3 adorable orange tabby cats sitting together on a blue couch, looking at the
camera.


@donnadulcinea commented on GitHub (Dec 22, 2023):

Thank you @igorschlum. Yes, I am sure it is something to do with my setup,
but I don't think it is a hardware problem.
My computer is a laptop with 32 GB RAM and 16 GB VRAM on an RTX 3050 Ti GPU.

It is a pure Docker container setup running a few models (mistral, llama2, ...).
I should mention something important: the other models seem to work accurately.

Is there some way I can check / debug what's going wrong?


@donnadulcinea commented on GitHub (Dec 23, 2023):

I also tested with llava:13b. Same problem.
Some more information: could the model be corrupted in some way? I'm not sure. But from the answers it seems like Vicuna is more or less working, even if it gives weird / random answers in random languages (I also got Chinese in previous tests).

root@3f5b2487f983:/# ollama run llava:13b
>>> Tell me what is in this picture? /root/.ollama/mylab/many_things.png

This is a picture of a collection of random items. Some of the notable objects include a clock on the wall, two vases, one located closer to the left side and the 
other towards the right, a bowl near the center-left, a book on the lower-left side, and a cell phone placed more towards the center. Additionally, there is a cup 
situated in the middle-right area of the scene. The arrangement of these items creates an eclectic and cluttered feel to the picture.

NOTE: Not true; it is a picture of an airport with many airplanes.
So I tried asking chat questions:

>>> how are you doing?
I'm an AI language model, so I don't have feelings like humans do. However, I am functioning properly and ready to assist you with any questions or tasks you may 
have. How can I help you today?

>>> what kind of AI are you? What's your name? What are your capabilities?
저는 딥러닝 기술을 사용하여 모델링된 AI 언어 모델입니다. 저의 이름은 Vicuna이고, 나는 자연어 처리, 문장 생성, 쿼리 및 질문 답변을 포함하여 다양한 작업을 수행할 수 있습니다.

나는 인터넷에서 사용자들이 제공하는 검색 엔진, 온라인 커뮤니티, 블로그 및 미디어 콘텐츠를 기반으로 구성되며, 이들을 학습하여 유사한 언어 처리 작업을 수행할 수 있도록 구축됩니다. 나는 인공지능 기술과 관련된 다양한 작업을 수행할 수 있으므로, 이를 활용하여 문제의 해결과 새로운 기회 만드는 데 도움을 드릴 수 있습니다.

[Translation: I am an AI language model built using deep learning. My name is Vicuna, and I can perform a variety of tasks including natural language processing, sentence generation, queries, and question answering. I am built on search engines, online communities, blogs, and media content provided by users on the internet, and trained on these to perform similar language-processing tasks. Since I can perform many tasks related to AI technology, I can help solve problems and create new opportunities.]

NOTE: More or less it says it is an AI. It identifies itself as a Vicuna-based model, so it is llava. Earlier I supposed that maybe, for some reason, other models were answering, like llama2 or mistral, but this seems not to be the case.

If somebody has an idea of how I can fix this so I can use llava models inside ollama, please give me a hint on which path to follow.


@donnadulcinea commented on GitHub (Dec 23, 2023):

OK, I solved it.
It was an Ollama version problem: multimodal models are only supported from 0.1.15 onward!

If I run `ollama --version` in the container, it returns `0.0.0.0`,

so I didn't catch it quickly. I had to remove and rebuild the container with the latest version!
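For anyone hitting the same symptom: a container built from source may report `0.0.0.0` as its version, which defeats a naive "new enough" check. A small client-side guard could look like this (helper name is my own; assumes multimodal support starts at 0.1.15, per the fix above):

```python
def supports_multimodal(version: str) -> bool:
    """Return True if an Ollama version string is at least 0.1.15,
    the first release with multimodal (image) support.

    Dev builds report "0.0.0.0", which correctly fails this check.
    """
    try:
        parts = tuple(int(p) for p in version.strip().split("."))
    except ValueError:
        return False  # unparsable version string: treat as unsupported
    return parts >= (0, 1, 15)
```

In the reporter's case the actual fix was simply pulling the latest `ollama/ollama` image and recreating the container.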

Reference: github-starred/ollama#62964