[GH-ISSUE #7563] Llama3.2 Vision refuses to analyse image #4816

Closed
opened 2026-04-12 15:47:53 -05:00 by GiteaMirror · 16 comments
Owner

Originally created by @I-I-IT on GitHub (Nov 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7563

What is the issue?

![screenshot-ollama](https://github.com/user-attachments/assets/bc208fb6-34b7-4ac3-a19f-b7adfacdf269)
Disclaimer: I have no GPU (Integrated Graphics)

OS

Linux

GPU

Other

CPU

Intel

Ollama version

0.4

GiteaMirror added the question label 2026-04-12 15:47:53 -05:00

@rick-github commented on GitHub (Nov 7, 2024):

It told you how to look at it. If you want the model to describe it, then say that. You will need to include the path to the image.

```
Describe this image: /path/to/image.jpg
```
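For scripting, the same request can go through Ollama's HTTP API, where the image travels as a base64 string in the `images` field rather than as a path in the prompt. A minimal sketch of building the request body for `POST /api/generate` (the model name, prompt, and placeholder bytes are illustrative, not from this thread):

```python
import base64
import json

def build_generate_request(model: str, prompt: str, image_bytes: bytes) -> str:
    """Build a JSON body for Ollama's POST /api/generate endpoint.

    Vision models receive images as base64 strings in "images",
    so the prompt text itself does not need to contain a file path.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return json.dumps(payload)

# In practice you would read the file and POST the body to
# http://localhost:11434/api/generate; raw bytes stand in here.
body = build_generate_request("llama3.2-vision", "Describe this image.", b"<raw image bytes>")
```

Because the image is attached explicitly, the model cannot mistake the request for a question about a literal path string.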

@0lsn commented on GitHub (Nov 7, 2024):

Same for me. The first picture is always handled correctly; the other pictures that get added are simply ignored, and I always get an answer about the first picture that was added. Only restarting the server fixes it:

OS: Windows 11 (latest)

GPU: 4080 (latest driver)

CPU: 11700k Intel

Ollama version: 0.4


@rick-github commented on GitHub (Nov 7, 2024):

Type `/clear` between prompts.


@I-I-IT commented on GitHub (Nov 7, 2024):

> It told you how to look at it. If you want the model to describe it, then say that. You will need to include the path to the image.
>
> ```
> Describe this image: /path/to/image.jpg
> ```

Even then, it basically refuses to tell me anything about the content. See screenshot:

![ollama-screenshot-2](https://github.com/user-attachments/assets/7de432f9-24ed-4768-b32a-79139edbb558)


@rick-github commented on GitHub (Nov 7, 2024):

What's the quality of the image? Can you add it here?

```console
$ ollama run llama3.2-vision:latest
>>> What do you see here ./puppy.png ?
Added image './puppy.png'
You are looking at a small, white puppy with a red collar. The puppy appears to be a
Maltese or similar breed and is likely around two months old based on its size and fur
texture. It seems healthy and alert, but it may have some minor issues like thin ears
or slightly long nails that could indicate improper care or handling.
```

@I-I-IT commented on GitHub (Nov 7, 2024):

> What's the quality of the image? Can you add it here?
>
> ```console
> $ ollama run llama3.2-vision:latest
> >>> What do you see here ./puppy.png ?
> Added image './puppy.png'
> You are looking at a small, white puppy with a red collar. The puppy appears to be a
> Maltese or similar breed and is likely around two months old based on its size and fur
> texture. It seems healthy and alert, but it may have some minor issues like thin ears
> or slightly long nails that could indicate improper care or handling.
> ```

The quality is good, but it's just voting numbers for the US election. I would expect it to at least get that.

![us-elections-stats](https://github.com/user-attachments/assets/033af40c-a242-4648-ad6e-bd89f604a2a3)


@edmosRovi commented on GitHub (Nov 8, 2024):

Hello! Asking about the image the way described above doesn't seem to work, at least not for me. Does anyone know what might be happening?


@I-I-IT commented on GitHub (Nov 8, 2024):

> Does anyone know what might be happening?

It's a common problem in LLMs that even if they have vision or web search capabilities, they will often say they can't. This is probably due to their RL. However, Ollama could improve this by having a general prompt that says the model can and should analyse the images it is given.


@gongzhang commented on GitHub (Nov 8, 2024):

![image](https://github.com/user-attachments/assets/4d5afad6-39a9-44db-b58b-e75ae35a477e)

Same issue here.


@I-I-IT commented on GitHub (Nov 8, 2024):

> ![image](https://github.com/user-attachments/assets/4d5afad6-39a9-44db-b58b-e75ae35a477e)
>
> Same issue here.

It's not, though. It correctly describes the image. (Llama3.2 Vision can't speak Chinese; see https://ollama.com/library/llama3.2-vision)


@rick-github commented on GitHub (Nov 8, 2024):

```console
$ ollama run llama3.2-vision:latest
>>> What do you see here ./us-election-stats.png ?
Added image './us-election-stats.png'
It appears that this image is a screenshot of an Excel spreadsheet containing data on US election turnout and voting statistics, with the years ranging from 1968 to 2024. The table includes
several columns providing various statistical information. Specifically:

*   **Turnout**: This column displays the total number of votes cast in each year.
*   **Population**: This column shows the estimated population for each year.
*   **Voting Age Population (VAP)**: This column indicates the number of eligible voters for each year.
*   **Registered Voters**: This column lists the number of registered voters for each year.
*   **Turnout Rate**: This column calculates the percentage of registered voters who actually voted in each year.

The data is presented in a clear and organized manner, making it easy to analyze and compare voting trends across different years.

>>> convert this image into CSV format
To convert the image into CSV format, we need to extract the data from the table and separate it with commas. Here's how you can do it:

1. Identify the columns: The columns in the table are "Year", "Turnout", "Population", "Voting Age Population (VAP)", "Registered Voters", and "Turnout Rate".
2. Extract the data: Copy the data from each column into a text editor or spreadsheet software.
3. Separate the data with commas: Place a comma after each value in each row to separate them.

Here is the converted CSV file:

Year,Turnout,Population,Voting Age Population (VAP),Registered Voters,Turnout Rate
1968,73026831,195743427,,120285000,60.7%
1972,77625152,205238390,,140777000,55.1%
1976,81603346,213270022,,152308000,53.6%
1980,86496851,223140018,,163945000,52.8%
1984,92654861,232766280,,173995000,53.3%
1988,91586725,242287814,,181956000,50.3%
1992,104600366,255175339,,189493000,55.2%
1996,96389818,268984347,,196789000,49.0%
2000,105594024,282398554,,209130000,50.5%
2004,122349480,293947885,,219508000,55.7%
2008,131406895,305694910,,229989000,57.1%
2012,129139997,316651321,,240177000,53.8%
2016,136787187,327210198,,24942000,54.8%
2020,158481688,335942003,,252274000,62.8%

Note: The data is truncated to fit the response format.
```

It missed some of the data, maybe confused by the incomplete VEP column, but it was able to process it. Try running `/clear` in the client and try again. The previous conversation history may be confusing the model.


@I-I-IT commented on GitHub (Nov 8, 2024):

> It missed some of the data, maybe confused by the incomplete VEP column, but it was able to process it. Try running `/clear` in the client and try again. The previous conversation history may be confusing the model.

I will try; the issue I have is that it can take 5 minutes (literally) to answer. I am running a Microsoft Surface Laptop 3. Yes, I don't have a GPU, but still, this is very bad performance.

Edit: It worked!

![ollama-screenshot3](https://github.com/user-attachments/assets/99876adf-12a7-4627-9cb9-c13a60fed3fe)


@loosac commented on GitHub (Nov 10, 2024):

![ollama1](https://github.com/user-attachments/assets/d95b2a53-8b67-4213-bb13-9f3c2ffb0bba)

Also, after that I created a new discussion where I showed this screenshot to llama and asked what happened:

![obraz](https://github.com/user-attachments/assets/03b43198-4c7c-4da0-acbc-03b95de10d66)

Does llama3.2-vision have hardcoded periods? ;)

![obraz](https://github.com/user-attachments/assets/56be50e3-53ac-4e6b-821c-3415983fb901)


@I-I-IT commented on GitHub (Nov 10, 2024):

> ![ollama1](https://github.com/user-attachments/assets/d95b2a53-8b67-4213-bb13-9f3c2ffb0bba)
> Also, after that I created a new discussion where I showed this screenshot to llama and asked what happened:
> ![obraz](https://github.com/user-attachments/assets/03b43198-4c7c-4da0-acbc-03b95de10d66)
> Does llama3.2-vision have hardcoded periods? ;)
>
> ![obraz](https://github.com/user-attachments/assets/56be50e3-53ac-4e6b-821c-3415983fb901)

That's a very strange issue. Are you sure nothing/no one messed with the context window, model settings, etc.?


@loosac commented on GitHub (Nov 10, 2024):

> That's a very strange issue. Are you sure nothing/no one messed with the context window, model settings, etc.?

No: Docker, `ollama pull`, nothing extraordinary.

Single 3090; the 90B will not fit in VRAM, so maybe it's unstable on CPU?


@rick-github commented on GitHub (Nov 10, 2024):

I couldn't find the original image so I screen-capped it; the difference in resolution may make a difference.

```console
$ ollama run llama3.2-vision:90b-instruct-q4_K_M
>>> What's on this photo?  /tmp/airshow.png
Added image '/tmp/airshow.png'
The image shows a group of people watching a formation of jets flying overhead. The scene is set against a blue sky with white clouds, and the people are standing in an open area, possibly at an air show
or military base.

**Key Elements:**

* A group of 5-6 people, including children and adults
* They are all looking up at the sky
* Some of them have their arms raised or hands on their heads as if they are shielding their eyes from the sun
* In the background, there is a formation of 7 jets flying in a V-shape

**Relevant Details:**

* The people appear to be enjoying themselves and taking photos/videos with their phones
* Some of them are wearing hats or sunglasses to protect themselves from the sun
* The jets are flying low enough that they can be seen clearly, but not so low that they pose a threat to the people on the ground

**Overall Impression:**

The image conveys a sense of excitement and wonder as the people watch the jets fly by. It's likely that this is a special event or occasion, such as an air show or military demonstration. The presence of
children suggests that it may be a family-friendly event.
```

Note that when you use an interface like open-webui, you are sending the conversation history with the request. If the model has previously denied a request, it's more likely to refuse subsequent ones if you keep reminding it of its earlier responses. Other interfaces may also add extra context behind the scenes, which affects the response.
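When calling the API directly, the caller owns that history: whatever goes into the `messages` list is the entire context the model sees, so the API-side equivalent of typing `/clear` is simply dropping the accumulated turns before the next request. A small illustrative helper (the function name and sample turns are mine, not part of any API):

```python
def clear_history(messages: list[dict]) -> list[dict]:
    """Equivalent of /clear for API callers: drop accumulated turns,
    keeping only system messages so standing instructions survive."""
    return [m for m in messages if m.get("role") == "system"]

history = [
    {"role": "system", "content": "You are a vision assistant."},
    {"role": "user", "content": "What do you see?"},
    {"role": "assistant", "content": "I cannot view images."},  # a refusal worth forgetting
]
history = clear_history(history)
# The next request now carries no earlier refusal for the model to echo.
```

This is why restarting the server (or the chat) "fixes" the problem: it resets the history, not the model.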

Reference: github-starred/ollama#4816