[PR #7414] [MERGED] runner.go: Better abstract vision model integration #59110

Closed
opened 2026-04-29 13:59:25 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7414
Author: @jessegross
Created: 10/30/2024
Status: Merged
Merged: 10/30/2024
Merged by: @jessegross

Base: main ← Head: jessegross/mllama_batch


📝 Commits (1)

  • 7e2ed84 runner.go: Better abstract vision model integration

📊 Changes

13 files changed (+533 additions, -453 deletions)


📝 llama/llama.cpp (+44 -61)
📝 llama/llama.go (+71 -78)
📝 llama/llama.h (+2 -1)
📝 llama/llava.cpp (+1 -1)
📝 llama/patches/0010-add-mllama-support.patch (+143 -101)
📝 llama/runner/cache.go (+0 -58)
📝 llama/runner/cache_test.go (+0 -75)
➕ llama/runner/image.go (+145 -0)
➕ llama/runner/image_test.go (+80 -0)
📝 llama/runner/runner.go (+13 -42)
📝 server/prompt.go (+25 -27)
📝 server/prompt_test.go (+5 -5)
📝 server/routes.go (+4 -4)

📄 Description

- Update mllama to take the cross-attention state as embeddings in a batch, more similar to how Llava handles it. This improves integration with the input cache.
- Pass locations in a prompt for embeddings using tags, similar to Llava.
- Abstract the interface to vision models so that the main runner accesses Clip and Mllama similarly.
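The second point above, passing embedding locations via in-prompt tags, can be sketched as a splitter that breaks a prompt into alternating text and image segments. The `[img-N]` tag syntax and the `splitPrompt` helper are illustrative assumptions, not the PR's actual code:

```go
package main

import (
	"fmt"
	"regexp"
)

// imgTag matches a hypothetical image placeholder such as "[img-0]".
var imgTag = regexp.MustCompile(`\[img-(\d+)\]`)

// splitPrompt breaks a prompt into alternating text and image-tag segments,
// so a runner could interleave token embeddings with image embeddings at
// the tagged positions.
func splitPrompt(prompt string) []string {
	var segs []string
	last := 0
	for _, m := range imgTag.FindAllStringIndex(prompt, -1) {
		if m[0] > last {
			segs = append(segs, prompt[last:m[0]]) // text before the tag
		}
		segs = append(segs, prompt[m[0]:m[1]]) // the tag itself
		last = m[1]
	}
	if last < len(prompt) {
		segs = append(segs, prompt[last:]) // trailing text
	}
	return segs
}

func main() {
	fmt.Printf("%q\n", splitPrompt("describe [img-0] and [img-1] please"))
}
```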

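The third point, a common abstraction over the vision backends, might look roughly like the following. This is a hypothetical sketch, not the code in `llama/runner/image.go`; the interface name, method signature, and placeholder implementations are all assumptions for illustration:

```go
package main

import "fmt"

// ImageContext sketches an abstraction like the one this PR introduces:
// the runner asks one interface for image embeddings instead of branching
// on Clip vs. Mllama.
type ImageContext interface {
	// Embed converts raw image bytes into embedding rows that can be
	// inserted into a batch at the position of the image tag.
	Embed(data []byte) ([][]float32, error)
}

// clipContext and mllamaContext are placeholders; the real code wraps CGo
// handles into llama.cpp's projectors.
type clipContext struct{}

func (clipContext) Embed(data []byte) ([][]float32, error) {
	// Placeholder: one embedding row standing in for encoded image patches.
	return [][]float32{{0.1, 0.2}}, nil
}

type mllamaContext struct{}

func (mllamaContext) Embed(data []byte) ([][]float32, error) {
	// Placeholder: cross-attention state flattened into embedding rows so
	// it flows through the same batch path as Llava embeddings.
	return [][]float32{{0.3, 0.4}, {0.5, 0.6}}, nil
}

// embedAll shows the runner-side benefit: one code path for any backend.
func embedAll(ctx ImageContext, images [][]byte) (int, error) {
	rows := 0
	for _, img := range images {
		e, err := ctx.Embed(img)
		if err != nil {
			return 0, err
		}
		rows += len(e)
	}
	return rows, nil
}

func main() {
	for _, ctx := range []ImageContext{clipContext{}, mllamaContext{}} {
		n, _ := embedAll(ctx, [][]byte{{0x00}})
		fmt.Printf("%T produced %d embedding rows\n", ctx, n)
	}
}
```

The design payoff is that `runner.go` shrinks (note the +13 -42 diff above): model-specific handling moves behind the interface.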

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 13:59:25 -05:00

Reference: github-starred/ollama#59110