[GH-ISSUE #6753] image_url support for vision models #30017

Open
opened 2026-04-22 09:25:44 -05:00 by GiteaMirror · 3 comments

Originally created by @madroidmaq on GitHub (Sep 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6753

What is the issue?

curl:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer OPENAI_API_KEY" \
  -d '{
    "model": "minicpm-v:8b-2.6-fp16",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What’s in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "http://images.cocodataset.org/val2017/000000039769.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'

response:

{
	"error": {
		"message": "invalid image input",
		"type": "invalid_request_error",
		"param": null,
		"code": null
	}
}

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.3.10

GiteaMirror added the feature request, api labels 2026-04-22 09:25:44 -05:00

@ZaMeR12 commented on GitHub (Sep 23, 2024):

You just need to read the docs on OpenAI compatibility: https://github.com/ollama/ollama/blob/main/docs/openai.md

The docs say that, for now, the OpenAI compatibility layer does not handle the URL image format. Until the Ollama team adds it, you will need to convert your image to base64 yourself. Also, I recommend using the regular Ollama API if you can avoid the OpenAI compatibility layer, which is experimental. That layer is meant more for applications that already use the OpenAI API and don't want to deal with the Ollama API. For example, if an app was built for OpenAI before Ollama existed, customers can now use Ollama with it even if the team that built the app changes nothing.
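As a rough illustration of the base64 workaround described above, the sketch below builds the same request body as in the original report but inlines the image as a base64 data URI instead of a remote URL. The helper names `to_data_uri` and `build_openai_payload` are made up for this example, the `image/jpeg` MIME type is an assumption about the input, and whether a given Ollama version accepts this exact shape should be checked against the linked docs.

```python
import base64
import json


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI (MIME type is assumed)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_openai_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build a /v1/chat/completions body with the image inlined as base64."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        # A data URI replaces the http:// URL that was rejected.
                        "image_url": {"url": to_data_uri(image_bytes)},
                    },
                ],
            }
        ],
        "max_tokens": 300,
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real file read with open(path, "rb").
    payload = build_openai_payload(
        "minicpm-v:8b-2.6-fp16", "What's in this image?", b"\xff\xd8\xff"
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON can be saved to a file and sent with the same curl command as in the report by passing `-d @payload.json`.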

Classic Ollama API: https://github.com/ollama/ollama/blob/main/docs/api.md

Here is a screenshot so you can quickly see what I mean:
[image: https://github.com/user-attachments/assets/fbe39945-532c-4af3-acc4-369f33e4b499]



@maxi1134 commented on GitHub (Nov 11, 2024):

I admit that it would be great to be able to simply pipe URLs to the LLMs with any API.


@ZaMeR12 commented on GitHub (Nov 25, 2024):

> I admit that it would be great to be able to simply pipe URLs to the LLMs with any API.

Technically, you can fetch the image from any API, convert it to base64, and send it to Ollama. All you need to add to your app is a column in your DB that indicates whether the source is a URL or a local image. Converting an image to base64 is the same in both cases; the column only tells you whether to fetch the image from the web instead of reading it from the local machine. If I managed to convert user-uploaded images in Electron for Ollama's API, which was not easy because of Electron, you can technically do it from scratch on your side.
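A minimal sketch of that fetch-then-encode approach, using only the Python standard library and Ollama's native `/api/chat` endpoint, might look like the following. The function names are hypothetical, the default host assumes a local Ollama on port 11434 as in the report, and instead of a DB column this version simply checks whether the source string starts with `http://` or `https://`.

```python
import base64
import json
import urllib.request
from pathlib import Path


def load_image_bytes(source: str) -> bytes:
    """Return raw image bytes from an http(s) URL or a local file path."""
    if source.startswith(("http://", "https://")):
        with urllib.request.urlopen(source, timeout=30) as resp:
            return resp.read()
    return Path(source).read_bytes()


def build_chat_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build a body for the native /api/chat endpoint.

    The native API takes images as plain base64 strings in an
    "images" list on the message, with no data URI prefix.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": prompt,
                "images": [base64.b64encode(image_bytes).decode("ascii")],
            }
        ],
        "stream": False,  # ask for one JSON response instead of a stream
    }


def chat(payload: dict, host: str = "http://localhost:11434") -> dict:
    """POST the payload to a running Ollama server and return the parsed reply."""
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.loads(resp.read())
```

With a server running, something like `chat(build_chat_payload("minicpm-v:8b-2.6-fp16", "What's in this image?", load_image_bytes("http://images.cocodataset.org/val2017/000000039769.jpg")))["message"]["content"]` should return the model's description. The same `load_image_bytes` call works for a local path, which is the point made above: only the loading step differs between the two sources.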

Reference: github-starred/ollama#30017