[GH-ISSUE #1589] Support multimodal models like vision #51224

Closed
opened 2026-05-05 12:07:05 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @fire on GitHub (Apr 17, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1589

Is your feature request related to a problem? Please describe.

I wish to integrate multimodal models.

Describe the solution you'd like

Support models such as NousResearch/Obsidian-3B-V0.5, which combine vision with a large language model.

Describe alternatives you've considered

Use ChatGPT with image upload.

Additional context

I had a hard time finding any duplicate issues. Apologies if one already exists.

Author
Owner

@fire commented on GitHub (Apr 17, 2024):

> 🔄 Multi-Modal Support: Seamlessly engage with models that support multimodal interactions, including images (e.g., LLava).

It was a bit hard finding this.


Reference: github-starred/open-webui#51224