[GH-ISSUE #2338] Very nice to have: capabilities info for multimodal models #27113

Closed
opened 2026-04-22 04:05:18 -05:00 by GiteaMirror · 2 comments

Originally created by @da-z on GitHub (Feb 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2338

I'm not sure whether this already exists: I checked the llava model info and it does not mention capabilities anywhere. It would be nice to be able to detect, via ollama show or the API's model info, that a model supports vision.

API Example

GET /api/tags

{
  //...
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": [
      "llama",
      "clip"
    ],
    "capabilities": ["vision"]
    //...
  }
}

@chigkim commented on GitHub (Feb 3, 2024):

As far as I know, every multimodal model Ollama supports has clip in families, while regular text-only language models do not.
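
The check described above could be sketched roughly as follows. This is only a minimal illustration, not an official Ollama client: it assumes the GET /api/tags response has a top-level "models" list whose entries carry "name" and "details.families", that the server listens on the default local address, and that "clip" in families is a reliable enough signal for vision support (the heuristic from this comment, not a documented guarantee). The helper names fetch_tags and vision_models, and the SAMPLE data, are made up for the example.

```python
import json
from urllib.request import urlopen

# Assumption: default local Ollama address; adjust for your setup.
OLLAMA_URL = "http://localhost:11434"


def fetch_tags(base_url=OLLAMA_URL):
    """Return the parsed JSON body of GET /api/tags (needs a running server)."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return json.load(resp)


def vision_models(tags):
    """Return names of models whose details.families contains "clip".

    Heuristic from this thread, not an official capability flag.
    """
    names = []
    for model in tags.get("models", []):
        # "details" or "families" may be missing or null for some models.
        families = (model.get("details") or {}).get("families") or []
        if "clip" in families:
            names.append(model.get("name"))
    return names


# Hand-written sample shaped like the response in this issue (not real output).
SAMPLE = {
    "models": [
        {"name": "llava:latest",
         "details": {"family": "llama", "families": ["llama", "clip"]}},
        {"name": "mistral:latest",
         "details": {"family": "llama", "families": ["llama"]}},
        {"name": "old-model:latest",
         "details": {"family": "llama", "families": None}},
    ]
}
```

Against a live server, vision_models(fetch_tags()) would list the installed models that pass this check; with the SAMPLE data above only the llava entry qualifies.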


@da-z commented on GitHub (Feb 3, 2024):

You are right. I just read about it: https://openai.com/research/clip


Reference: github-starred/ollama#27113