[PR #10509] [MERGED] image: add vision capability tag for projector-based models #11997

Closed
opened 2025-11-12 16:26:27 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10509
Author: @rick-github
Created: 5/1/2025
Status: Merged
Merged: 5/1/2025
Merged by: @mxyng

Base: main ← Head: capabilities


📝 Commits (1)

  • 8efe78f image: add vision capability for projector-based models

📊 Changes

1 file changed (+5 additions, -0 deletions)

📝 server/images.go (+5 -0)

📄 Description

Most vision models in the ollama library use projectors, and /api/show does not report the "vision" tag in their capabilities list. In the output below, only gemma3 and mistral-small3.1 report it.

$ for i in gemma3 mistral-small3.1 llava llama3.2-vision minicpm-v llava-llama3 moondream bakllava llava-phi3 granite3.2-vision ; do printf "%-20s " $i ; curl -s localhost:11434/api/show -d '{"model":"'$i'"}' | jq -c .capabilities ; done
gemma3               ["completion","vision"]
mistral-small3.1     ["completion","vision","tools"]
llava                ["completion"]
llama3.2-vision      ["completion"]
minicpm-v            ["completion"]
llava-llama3         ["completion"]
moondream            ["completion"]
bakllava             ["completion"]
llava-phi3           ["completion"]
granite3.2-vision    ["completion","tools"]

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-12 16:26:27 -06:00
Reference: github-starred/ollama-ollama#11997