[PR #6971] [CLOSED] draft: mllama vision encoder #74573

Closed
opened 2026-05-05 06:43:59 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6971
Author: @mxyng
Created: 9/26/2024
Status: Closed

Base: pdevine/imageproc ← Head: mxyng/mllama


📝 Commits (10+)

  • f8ed545 image processing for llama3.2
  • 5da1043 feed the linter
  • c48e2cf more fixes for mllama
  • 96a8b2f fix prompt for non-mllama multimodal
  • a2d33ee linter feeding
  • 5486c57 fix template / imageproc issues
  • 7d5e0ff add server.cpp and patches
  • 71e76f8 server.cpp: cleanup cross attention state
  • 03cf762 change resize algorithm
  • 3a1c8da only allow a single image to be passed

📊 Changes

19 files changed (+2650 additions, -153 deletions)

📝 cmd/cmd.go (+1 -2)
📝 cmd/interactive.go (+20 -27)
📝 go.mod (+1 -0)
📝 go.sum (+2 -0)
📝 llm/ext_server/CMakeLists.txt (+1 -1)
llm/ext_server/mllama.cpp (+906 -0)
llm/ext_server/mllama.h (+61 -0)
📝 llm/ext_server/server.cpp (+85 -21)
llm/patches/0009-mllama.patch (+693 -0)
📝 llm/server.go (+3 -2)
server/imageproc/images.go (+255 -0)
server/imageproc/images_test.go (+363 -0)
📝 server/model.go (+3 -1)
📝 server/prompt.go (+81 -14)
📝 server/prompt_test.go (+144 -14)
📝 server/routes.go (+25 -13)
📝 server/routes_generate_test.go (+6 -6)
📝 template/template.go (+0 -13)
📝 template/template_test.go (+0 -39)

📄 Description

This change implements the mllama vision encoder in the existing clip.cpp example.

It's still very much a work in progress.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 06:43:59 -05:00

Reference: github-starred/ollama#74573