[GH-ISSUE #8322] llama3.2-vision // llava - not receiving images via chat completions API #67386

Closed
opened 2026-05-04 10:11:25 -05:00 by GiteaMirror · 2 comments

Originally created by @TheFoot on GitHub (Jan 6, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8322

What is the issue?

When using either the llama3.2-vision:11b or llava:13b models, the LLM reports it does not receive an image. Llama is clearer than Llava (which just makes up something).

The CLI works (with absolute paths, not relative):

% ollama run llama3.2-vision
>>> Describe this image: /Users/thefoot/Pictures/thefeet.png
Added image '/Users/thefoot/Pictures/thefeet.png'
This image features a digital illustration of nine feet, with the top three pairs [...]

Example API chat completions call:

curl --location 'http://localhost:11434/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "llama3.2-vision:11b",
    "messages": [
        {
            "role": "user",
            "content": "Describe this image",
            "images": [
                "iVBORw0KGgoAAAANSUhEUgAABAAAAAQACAYAAA [...]"
            ]
        }
    ],
    "temperature": 0.7,
    "max_tokens": 500,
    "stream": false
}'

Response message:

"message": {
    "role": "assistant",
    "content": "There is no image provided. Can you please describe the image or provide more context, so I can give a description?"
}

Any idea if this is a bug, or have I done something wrong with the API call?
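For comparison, a message-level `images` array of bare base64 strings like the one in the curl call above matches the shape of Ollama's native `/api/chat` endpoint, not the OpenAI-compatible `/v1/chat/completions` route. A minimal sketch of the native-endpoint payload (model name and host taken from the report; the image bytes here are an illustrative placeholder):

```python
import base64
import json

def native_chat_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build a request body for Ollama's native /api/chat endpoint,
    which accepts raw base64 strings in a message-level "images" list."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": prompt,
                "images": [base64.b64encode(image_bytes).decode("ascii")],
            }
        ],
        "stream": False,
    }

payload = native_chat_payload("llama3.2-vision:11b", "Describe this image",
                              b"\x89PNG placeholder bytes")
print(json.dumps(payload, indent=2))
# POST this body to http://localhost:11434/api/chat (not /v1/chat/completions)
```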

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.5.4

GiteaMirror added the bug label 2026-05-04 10:11:25 -05:00

@rick-github commented on GitHub (Jan 7, 2025):

The OpenAI API [requires](https://platform.openai.com/docs/guides/vision#uploading-base64-encoded-images) that each image be an object and that the base64 blob carry a data-URI prefix:

            "images": {
                "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABAAAAAQACAYAAA [...]"
            }

@TheFoot commented on GitHub (Jan 7, 2025):

Ahh thank you! The below worked as per OpenAI spec:

"content": [
                {
                    "type": "text",
                    "text": "What is in this image?",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
                },
            ],
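Put together, a full OpenAI-compatible request body can be sketched like this (model name, question, and MIME type are illustrative; the data-URI shape follows the snippet above):

```python
import base64
import json

def openai_vision_payload(model: str, question: str, image_bytes: bytes,
                          mime: str = "image/png") -> dict:
    """Build an OpenAI-compatible chat body: the image travels inside the
    message's "content" array as an image_url part holding a data URI."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url",
                     "image_url": {"url": f"data:{mime};base64,{b64}"}},
                ],
            }
        ],
        "max_tokens": 500,
    }

body = openai_vision_payload("llama3.2-vision:11b", "What is in this image?",
                             b"\x89PNG placeholder bytes")
print(json.dumps(body)[:120])
# POST this body to http://localhost:11434/v1/chat/completions
```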
Reference: github-starred/ollama#67386