[GH-ISSUE #10733] Ollama 0.7.0 decreased the quality of Llama-vision dramatically. #53562

Closed
opened 2026-04-29 03:47:57 -05:00 by GiteaMirror · 5 comments

Originally created by @chigkim on GitHub (May 16, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10733

What is the issue?

Ollama 0.7.0 asked me to redownload llama3.2-vision:11b.
It gets the general overview of an image, but it hallucinates the details dramatically.
For example, even when there are obviously two people in the image, it says there is only one person.
It also often says they're holding a cell phone or a book, although there are no cell phones or books.
It used to be fairly accurate. What happened?
Gemma3, Mistral, Qwen2.5-vl all seem to work with the latest multimodal engine.
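
For anyone who wants to reproduce this, here is a minimal sketch of sending a single image to the model through Ollama's local REST API (`/api/generate` with a base64-encoded `images` list). The default address `http://localhost:11434`, the prompt text, and the helper names `build_payload` and `describe_image` are assumptions for illustration, not part of the original report.

```python
import base64
import json
import urllib.request

# Assumed default local Ollama address; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes,
                  model: str = "llama3.2-vision:11b",
                  prompt: str = "Describe this image. How many people are in it?") -> dict:
    """Build the JSON body for a non-streaming image prompt (hypothetical helper)."""
    return {
        "model": model,
        "prompt": prompt,
        # The API expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str, model: str = "llama3.2-vision:11b") -> str:
    """Read an image file, send it to the local server, and return the description."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), model=model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `describe_image("photo.jpg")` with the same image before and after an upgrade, or with a different `model` such as one of those listed above, gives a rough side-by-side comparison.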

Relevant log output


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.7.0

GiteaMirror added the bug label 2026-04-29 03:47:58 -05:00

@igorschlum commented on GitHub (May 17, 2025):

Hi @chigkim, could you provide a sample picture so I can try it on my Mac and see whether I get the same results as you?


@chigkim commented on GitHub (May 18, 2025):

Basically you can use any image, but here's one.

![Image](https://github.com/user-attachments/assets/3862d150-ef10-4f66-8e84-632130c7b189)

@rick-github commented on GitHub (May 24, 2025):

The model has been updated.


@igorschlum commented on GitHub (May 24, 2025):

Hi @rick-github, @chigkim

Thank you for informing us about the new build of the vision LLM.

Ollama 0.7.1 significantly improves the results!

With the updated vision LLM and Ollama 0.7.0, I was getting a poor description that repeated itself endlessly.

After upgrading to Ollama 0.7.1, I now get an accurate and complete description. Here's the one for the picture proposed last week.

The image depicts a small, fluffy dog standing in the snow. The dog is white with brown ears and has large, round eyes that are black. Its mouth is slightly open, and its tongue is sticking out. The dog's fur is long and fluffy, and it has a small nose and floppy ears. The dog is standing in the snow, which is white and powdery. There are no other objects or people in the image, just the dog and the snow. The background is dark and blurry, but it appears to be a forest or wooded area. The overall atmosphere of the image is one of cuteness and innocence, as the dog looks up at the camera with its big, round eyes.

@chigkim, I believe you can test this LLM again. If you agree that the issue is resolved, feel free to close it.
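
Before retesting, it may help to confirm that the running server really is on 0.7.1 or later. The sketch below queries Ollama's `/api/version` endpoint and compares version strings numerically. The default address and the helper names `parse_version`, `is_at_least`, and `server_version` are assumptions for illustration.

```python
import json
import urllib.request


def parse_version(text: str) -> tuple:
    """Turn '0.7.1' into (0, 7, 1); drops a leading 'v' and any '-suffix'."""
    core = text.strip().lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def is_at_least(current: str, minimum: str = "0.7.1") -> bool:
    """Compare versions numerically, so '0.10.0' counts as newer than '0.7.1'."""
    return parse_version(current) >= parse_version(minimum)


def server_version(base_url: str = "http://localhost:11434") -> str:
    """Ask the local server for its version (assumed default address)."""
    with urllib.request.urlopen(base_url + "/api/version") as resp:
        return json.loads(resp.read())["version"]
```

With a server running, `is_at_least(server_version())` returns `True` once the upgrade has taken effect. Comparing tuples of integers rather than raw strings avoids mis-ordering versions with multi-digit components.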


@chigkim commented on GitHub (May 24, 2025):

Awesome! It's much better now! Closing it now. Thanks!


Reference: github-starred/ollama#53562