[PR #5208] [MERGED] Support image input for OpenAI chat compatibility #37588

Closed
opened 2026-04-22 22:16:11 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5208
Author: @royjhan
Created: 6/22/2024
Status: Merged
Merged: 7/14/2024
Merged by: @royjhan

Base: main ← Head: royh-vision


📝 Commits (10+)

📊 Changes

2 files changed (+119 additions, -6 deletions)


📝 openai/openai.go (+70 -6)
📝 openai/openai_test.go (+49 -0)

📄 Description

Adds support for passing a base64-encoded image in the image_url field of a chat message sent to the OpenAI-compatible /v1/chat/completions endpoint.

For example, with the shell variable $image holding the base64-encoded image:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llava",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
               "url": "'$image'"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }' | jq

Example response:

{
  "id": "chatcmpl-659",
  "object": "chat.completion",
  "created": 1719016156,
  "model": "llava",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": " The image shows a cute cartoon of an animal. It appears to be a dog or similar creature, styled with exaggerated features typical in internet memes. The character has big eyes, a round face, and its arms are raised in the air, as if waving or giving a thumbs-up gesture. There's also some motion blur that gives the impression of movement, suggesting the animal might be jumping or dancing. This kind of image is often used in digital communication to convey emotions or add a playful element to text messages. "
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 112,
    "total_tokens": 113
  }
}

Resolves #3690, #5304
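
On the receiving side, a base64 image_url has to be turned back into raw image bytes before it can be handed to the model. The sketch below illustrates one way that step could look. It is not the code from `openai/openai.go`: the helper name `decodeImageURL`, the accepted image types, and the error message are all assumptions made for illustration.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// decodeImageURL extracts the raw bytes from a URL of the form
// "data:image/<type>;base64,<payload>".
// Hypothetical helper; the list of accepted types is an assumption.
func decodeImageURL(url string) ([]byte, error) {
	for _, t := range []string{"jpeg", "jpg", "png"} {
		prefix := "data:image/" + t + ";base64,"
		if strings.HasPrefix(url, prefix) {
			// Everything after the prefix is the base64 payload.
			return base64.StdEncoding.DecodeString(url[len(prefix):])
		}
	}
	return nil, errors.New("invalid image input: expected a base64 data URI")
}

func main() {
	// "aGVsbG8=" is the base64 encoding of "hello", standing in for image bytes.
	img, err := decodeImageURL("data:image/png;base64,aGVsbG8=")
	fmt.Println(string(img), err) // prints "hello <nil>"
}
```

In this sketch, rejecting anything that is not a recognized data URI means a plain http(s) image link returns an error and is never fetched.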


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

📝 Commits (10+)

- [`cb6d5b0`](https://github.com/ollama/ollama/commit/cb6d5b0310ae2c245c7e4f0cf3f85676789b9c7e) OpenAI v1 models
- [`1114d96`](https://github.com/ollama/ollama/commit/1114d9661bc0cee5aaa1e9f7f6e24aee12e6c6fe) Refactor Writers
- [`e57f0d1`](https://github.com/ollama/ollama/commit/e57f0d19f389d46e88441466ed6e1fc3870f9020) Add Test
- [`310940d`](https://github.com/ollama/ollama/commit/310940d11e56232d7c6220dc809fef894b0c0618) Credit Co-Author
- [`691e448`](https://github.com/ollama/ollama/commit/691e44869e3f463a79be6137b11e8780fde3d24d) Merge branch 'main' into royh-openai
- [`65c0850`](https://github.com/ollama/ollama/commit/65c08507cb55839709da2db0f7dad65bb2133485) Empty List Testing
- [`c826421`](https://github.com/ollama/ollama/commit/c826421f17ea2a3b351a9da18e36a993b0856a33) Use Namespace for Ownedby
- [`fe1a625`](https://github.com/ollama/ollama/commit/fe1a625fee3c2b663c7814042392135fc9bdd973) Update Test
- [`3cf19d9`](https://github.com/ollama/ollama/commit/3cf19d95880cbd16073733622c9fe505050baf02) Add back envconfig
- [`c037616`](https://github.com/ollama/ollama/commit/c037616d6bbecf91ec87a84336bac45f07846fe5) v1/models docs
GiteaMirror added the pull-request label 2026-04-22 22:16:11 -05:00
Reference: github-starred/ollama#37588