[GH-ISSUE #9664] Llama3.2-vision can't see images and believes it's a text only model #83990

Closed
opened 2026-05-09 19:35:11 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @50-scratch-tabs on GitHub (Mar 11, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9664

What is the issue?

I sent an image to llama 3.2 vision, and it said that it added an image, and yet the model keeps telling me that it's a text based language model and that it can't see images.

Relevant log output


OS

Windows

GPU

None

CPU

AMD

Ollama version

0.5.13

Originally created by @50-scratch-tabs on GitHub (Mar 11, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/9664 ### What is the issue? I sent an image to llama 3.2 vision, and it said that it added an image, and yet the model keeps telling me that it's a text based language model and that it can't see images. ### Relevant log output ```shell ``` ### OS Windows ### GPU None ### CPU AMD ### Ollama version 0.5.13
GiteaMirror added the bug label 2026-05-09 19:35:11 -05:00
Author
Owner

@rick-github commented on GitHub (Mar 11, 2025):

Server log will aid in debugging. What client are you using?

<!-- gh-comment-id:2715994290 --> @rick-github commented on GitHub (Mar 11, 2025): Server log will aid in debugging. What client are you using?
Author
Owner

@50-scratch-tabs commented on GitHub (Mar 12, 2025):

Server log will aid in debugging. What client are you using?

Ollama cli. I'll attach my logs in maybe 25 minutes.

Oh, and, after looking at similar issues, I did the command to show the model file and I didn't see a projector line.

<!-- gh-comment-id:2716007924 --> @50-scratch-tabs commented on GitHub (Mar 12, 2025): > Server log will aid in debugging. What client are you using? Ollama cli. I'll attach my logs in maybe 25 minutes. Oh, and, after looking at similar issues, I did the command to show the model file and I didn't see a projector line.
Author
Owner

@50-scratch-tabs commented on GitHub (Mar 12, 2025):

>>> Can I upload pictures to you?
Unfortunately, I'm a large language model, I don't have the capability to directly receive or
display images. However, there are a few workarounds:

1. **Describe the image**: You can describe the picture in detail, and I'll do my best to
understand and respond accordingly.
2. **Use text-based alternatives**: If you want to share an image with me, you could try creating
a text-based representation of it using ASCII art or emojis.
3. **Share a link**: If the image is publicly available online (e.g., on a website, social media
platform), you can share the link with me, and I might be able to help with it.

Keep in mind that these workarounds have limitations, and my understanding of the image will
depend on your description or interpretation.

How can I assist you today?

>>> \..\..\..\Pictures\Icons_logos\debian_logo.jpg How about now?
Unfortunately, even with a path to an image file, I'm still not able to directly access or
display the image. The path `\..\..\..\Pictures\Icons_logos\debian_logo.jpg` is specific to your
local file system, and I don't have direct access to it.

However, if you'd like to describe the Debian logo, I can try to help with that! What would you
like to know or discuss about the Debian logo?

>>> Whoops, that's supposed to upload an image. I might have the syntax wrong, so please bear with me.
You're trying to share an image by providing a file path. Unfortunately, this format isn't
supported in our conversation.

However, if you want to share the image, you can try copying and pasting the contents of the
image into the chat window. Some images can be shared as text using various encoding formats like
Base64 or ASCII art, but it's not a straightforward process.

If you're unsure about how to proceed, I can offer some guidance on how to encode an image as
text or provide alternative ways to share the image with me.

>>> \..\..\..\Pictures\Icons_logos\debian_logo.jpg
It seems like that syntax isn't working for us.

Let's try a different approach. If you want to share the Debian logo with me, you can:

1. Save the image as a file (e.g., "debian_logo.jpg") and then upload it to a public hosting
service like imgur or postimage.
2. Share the URL of the hosted image with me, and I'll do my best to help.

Alternatively, if you'd rather not share an external link, we can try describing the logo
together! What's distinctive about the Debian logo?

>>> I'm trying. Ollama is supposed to allow me to upload images. Please ignore me if you see that syntax. Whether you believe it or not, you're also an image seeing AI, and eventually it should go through. Let me know if you get my image.
It seems like we have a bit of a misunderstanding here.

I'm not actually an "image seeing" AI in the classical sense. I'm a large language model designed
to understand and respond to natural language inputs, but I don't have direct access to visual
information or images.

That being said, I can try to help you with your goal of uploading images using Ollama (I assume
that's an image upload platform). However, I should let you know that I won't be able to directly
"see" the image even if we manage to upload it.

If you're still having trouble getting the image uploaded, feel free to try again or provide more
details about what's not working. I'll do my best to help troubleshoot the issue!

>>> Ollama is the AI app that I'm using to communicate with you. And I know you don't think so, but you are an AI that can see images.
I think there might be some confusion here. Ollama is a text-based AI model, and it doesn't have
the capability to upload or display images. It's designed to understand and respond to natural
language inputs, but it doesn't have visual processing capabilities.

I'm also not an "image seeing" AI. I'm a large language model that processes and responds to
text-based input, but I don't have the ability to perceive or interpret visual information.

While Ollama might be a fun and creative way to interact with me, we're still limited to
text-based communication. If you want to share an image or describe something visually, feel free
to do so using words, and I'll do my best to help you out.

>>> You are llama3.2-vision. Please describe this image:
I'm afraid I'm not actually "llama3.2-vision" or any other type of image-seeing AI. I'm still
just a large language model, and I don't have the capability to perceive or interpret visual
information.

However, based on your previous descriptions, I can try to make an educated guess about what this
image might be. Since you mentioned the Debian

(At this point, I stopped generating, saved the conversation, and changed to the directory of the image.)

>>> Now try: ./Debian_logo.jpg
Added image './Debian_logo.jpg'
I think we're getting stuck in a loop here!

As I mentioned earlier, I'm not capable of seeing or interpreting images, so I won't be able to
describe anything new based on your attempts to upload an image.

If you'd like to continue our conversation, feel free to type out a description of the image or
ask me a question about something else. I'll do my best to help and provide information on a wide
range of topics.

I'm going to check my logs to make sure I don't dox myself, and then I'll upload them.

<!-- gh-comment-id:2716073478 --> @50-scratch-tabs commented on GitHub (Mar 12, 2025): ``` >>> Can I upload pictures to you? Unfortunately, I'm a large language model, I don't have the capability to directly receive or display images. However, there are a few workarounds: 1. **Describe the image**: You can describe the picture in detail, and I'll do my best to understand and respond accordingly. 2. **Use text-based alternatives**: If you want to share an image with me, you could try creating a text-based representation of it using ASCII art or emojis. 3. **Share a link**: If the image is publicly available online (e.g., on a website, social media platform), you can share the link with me, and I might be able to help with it. Keep in mind that these workarounds have limitations, and my understanding of the image will depend on your description or interpretation. How can I assist you today? >>> \..\..\..\Pictures\Icons_logos\debian_logo.jpg How about now? Unfortunately, even with a path to an image file, I'm still not able to directly access or display the image. The path `\..\..\..\Pictures\Icons_logos\debian_logo.jpg` is specific to your local file system, and I don't have direct access to it. However, if you'd like to describe the Debian logo, I can try to help with that! What would you like to know or discuss about the Debian logo? >>> Whoops, that's supposed to upload an image. I might have the syntax wrong, so please bear with me. You're trying to share an image by providing a file path. Unfortunately, this format isn't supported in our conversation. However, if you want to share the image, you can try copying and pasting the contents of the image into the chat window. Some images can be shared as text using various encoding formats like Base64 or ASCII art, but it's not a straightforward process. If you're unsure about how to proceed, I can offer some guidance on how to encode an image as text or provide alternative ways to share the image with me. >>> \..\..\..\Pictures\Icons_logos\debian_logo.jpg It seems like that syntax isn't working for us. Let's try a different approach. If you want to share the Debian logo with me, you can: 1. Save the image as a file (e.g., "debian_logo.jpg") and then upload it to a public hosting service like imgur or postimage. 2. Share the URL of the hosted image with me, and I'll do my best to help. Alternatively, if you'd rather not share an external link, we can try describing the logo together! What's distinctive about the Debian logo? >>> I'm trying. Ollama is supposed to allow me to upload images. Please ignore me if you see that syntax. Whether you believe it or not, you're also an image seeing AI, and eventually it should go through. Let me know if you get my image. It seems like we have a bit of a misunderstanding here. I'm not actually an "image seeing" AI in the classical sense. I'm a large language model designed to understand and respond to natural language inputs, but I don't have direct access to visual information or images. That being said, I can try to help you with your goal of uploading images using Ollama (I assume that's an image upload platform). However, I should let you know that I won't be able to directly "see" the image even if we manage to upload it. If you're still having trouble getting the image uploaded, feel free to try again or provide more details about what's not working. I'll do my best to help troubleshoot the issue! >>> Ollama is the AI app that I'm using to communicate with you. And I know you don't think so, but you are an AI that can see images. I think there might be some confusion here. Ollama is a text-based AI model, and it doesn't have the capability to upload or display images. It's designed to understand and respond to natural language inputs, but it doesn't have visual processing capabilities. I'm also not an "image seeing" AI. I'm a large language model that processes and responds to text-based input, but I don't have the ability to perceive or interpret visual information. While Ollama might be a fun and creative way to interact with me, we're still limited to text-based communication. If you want to share an image or describe something visually, feel free to do so using words, and I'll do my best to help you out. >>> You are llama3.2-vision. Please describe this image: I'm afraid I'm not actually "llama3.2-vision" or any other type of image-seeing AI. I'm still just a large language model, and I don't have the capability to perceive or interpret visual information. However, based on your previous descriptions, I can try to make an educated guess about what this image might be. Since you mentioned the Debian ``` (At this point, I stopped generating, saved the conversation, and changed to the directory of the image.) ``` >>> Now try: ./Debian_logo.jpg Added image './Debian_logo.jpg' I think we're getting stuck in a loop here! As I mentioned earlier, I'm not capable of seeing or interpreting images, so I won't be able to describe anything new based on your attempts to upload an image. If you'd like to continue our conversation, feel free to type out a description of the image or ask me a question about something else. I'll do my best to help and provide information on a wide range of topics. ``` I'm going to check my logs to make sure I don't dox myself, and then I'll upload them.
Author
Owner

@pdevine commented on GitHub (Mar 12, 2025):

Can you try to clear the context (with >>> /clear) and then type in >>> Describe this image: ./Debian_logo.jpg?

If the CLI doesn't say Added image 'image.name' it means it's not going to pass in the image to the model. It looks like you were able to send the image on the last prompt, but the context was already contaminated and the model thought it wasn't actually a vision model.

<!-- gh-comment-id:2716080959 --> @pdevine commented on GitHub (Mar 12, 2025): Can you try to clear the context (with `>>> /clear`) and then type in `>>> Describe this image: ./Debian_logo.jpg`? If the CLI doesn't say `Added image 'image.name'` it means it's not going to pass in the image to the model. It looks like you were able to send the image on the last prompt, but the context was already contaminated and the model thought it wasn't actually a vision model.
Author
Owner

@50-scratch-tabs commented on GitHub (Mar 12, 2025):

Can you try to clear the context (with >>> /clear) and then type in >>> Describe this image: ./Debian_logo.jpg?

If the CLI doesn't say Added image 'image.name' it means it's not going to pass in the image to the model. It looks like you were able to send the image on the last prompt, but the context was already contaminated and the model thought it wasn't actually a vision model.

Got it. I'll try after clearing.

Also I got my log files:

server.log.txt
server-1.log.txt
server-2.log.txt

Edit: My computer is now whirring away. My hardware is anything but set up for AI so this might take a few minutes.

(Now that I think of it, the first time I successfully added the image, it took a while)

<!-- gh-comment-id:2716082921 --> @50-scratch-tabs commented on GitHub (Mar 12, 2025): > Can you try to clear the context (with `>>> /clear`) and then type in `>>> Describe this image: ./Debian_logo.jpg`? > > If the CLI doesn't say `Added image 'image.name'` it means it's not going to pass in the image to the model. It looks like you were able to send the image on the last prompt, but the context was already contaminated and the model thought it wasn't actually a vision model. Got it. I'll try after clearing. Also I got my log files: [server.log.txt](https://github.com/user-attachments/files/19200148/server.log.txt) [server-1.log.txt](https://github.com/user-attachments/files/19200149/server-1.log.txt) [server-2.log.txt](https://github.com/user-attachments/files/19200150/server-2.log.txt) Edit: My computer is now whirring away. My hardware is anything but set up for AI so this might take a few minutes. (Now that I think of it, the first time I successfully added the image, it took a while)
Author
Owner

@50-scratch-tabs commented on GitHub (Mar 12, 2025):

Can you try to clear the context (with >>> /clear) and then type in >>> Describe this image: ./Debian_logo.jpg?

If the CLI doesn't say Added image 'image.name' it means it's not going to pass in the image to the model. It looks like you were able to send the image on the last prompt, but the context was already contaminated and the model thought it wasn't actually a vision model.

Thank you so much! This worked like a charm. I'll just have to remember to remind it in it's system prompt when I use this in a project.

<!-- gh-comment-id:2716108347 --> @50-scratch-tabs commented on GitHub (Mar 12, 2025): > Can you try to clear the context (with `>>> /clear`) and then type in `>>> Describe this image: ./Debian_logo.jpg`? > > If the CLI doesn't say `Added image 'image.name'` it means it's not going to pass in the image to the model. It looks like you were able to send the image on the last prompt, but the context was already contaminated and the model thought it wasn't actually a vision model. Thank you so much! This worked like a charm. I'll just have to remember to remind it in it's system prompt when I use this in a project.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#83990