[PR #7537] [MERGED] imageproc mllama refactor #17728

Closed
opened 2026-04-16 06:12:12 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7537
Author: @pdevine
Created: 11/7/2024
Status: Merged
Merged: 12/15/2024
Merged by: @pdevine

Base: main ← Head: pdevine/imageproc-redux


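📝 Commits (8)

  • ccf7891 imageproc mllama refactor
  • cf2f09b add pixtral
  • 08a49cc gci stuff
  • cf75a7f more changes
  • f022d1d add qwen2.5 image processing
  • ba4dca0 fix path
  • 4581ef3 feed the linter
  • 343e53c more linter feeding
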
📊 Changes

10 files changed (+820 additions, -117 deletions)

➕ model/imageproc/images.go (+111 -0)
➕ model/imageproc/images_test.go (+177 -0)
📝 model/mllama/imageproc.go (+63 -102)
📝 model/mllama/imageproc_test.go (+13 -9)
➕ model/pixtral/imageproc.go (+68 -0)
➕ model/pixtral/imageproc_test.go (+219 -0)
➕ model/qwen2vl/imageproc.go (+74 -0)
➕ model/qwen2vl/imageproc_test.go (+78 -0)
📝 server/prompt.go (+8 -3)
📝 server/routes.go (+9 -3)

📄 Description

This change breaks out the image processing routines into a generic package called model/imageproc and also creates a new model/mllama package which is specific to the mllama vision processing. There are a few other minor changes:

  • Preprocess() now takes an io.Reader instead of a byte slice
  • Preprocess() now returns a map[string]any containing any model-specific options to pass back to the caller
  • The mean and standard deviation constants are broken out into package variables
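
The three changes above can be pictured with a minimal, self-contained sketch. It is not the code from this PR: the exact signature, the placeholder `mean`/`std` values, the option keys `"width"` and `"height"`, and the `demo` helper are all assumptions made for illustration. It only shows the general shape of a `Preprocess` that reads from an `io.Reader`, normalizes with package-level statistics, and hands back options in a `map[string]any`.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png" // importing this also registers the PNG decoder with image.Decode
	"io"
)

// Hypothetical per-channel normalization statistics, exposed as package
// variables. The values are placeholders, not those of any real model.
var (
	mean = [3]float32{0.5, 0.5, 0.5}
	std  = [3]float32{0.5, 0.5, 0.5}
)

// Preprocess decodes an image from r, normalizes it into channel-first
// float32 data, and returns model-specific options in a map.
// The signature is an assumption based on the PR description.
func Preprocess(r io.Reader) ([]float32, map[string]any, error) {
	img, _, err := image.Decode(r)
	if err != nil {
		return nil, nil, fmt.Errorf("decode image: %w", err)
	}

	b := img.Bounds()
	w, h := b.Dx(), b.Dy()
	pix := make([]float32, 3*w*h)

	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			// RGBA returns 16-bit channel values; scale them to [0, 1].
			r16, g16, b16, _ := img.At(b.Min.X+x, b.Min.Y+y).RGBA()
			rgb := [3]float32{
				float32(r16) / 65535,
				float32(g16) / 65535,
				float32(b16) / 65535,
			}
			for c := 0; c < 3; c++ {
				pix[c*w*h+y*w+x] = (rgb[c] - mean[c]) / std[c]
			}
		}
	}

	// Hypothetical option keys; a real model would return whatever its
	// forward pass needs.
	opts := map[string]any{"width": w, "height": h}
	return pix, opts, nil
}

// demo encodes a 2x2 white PNG in memory and runs it through Preprocess.
func demo() ([]float32, map[string]any, error) {
	src := image.NewRGBA(image.Rect(0, 0, 2, 2))
	for y := 0; y < 2; y++ {
		for x := 0; x < 2; x++ {
			src.Set(x, y, color.RGBA{R: 255, G: 255, B: 255, A: 255})
		}
	}
	var buf bytes.Buffer
	if err := png.Encode(&buf, src); err != nil {
		return nil, nil, err
	}
	return Preprocess(&buf)
}

func main() {
	pix, opts, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(pix), opts["width"], opts["height"])
}
```

Accepting an `io.Reader` lets a caller pass a file, an HTTP body, or an in-memory buffer without first copying everything into a byte slice, and returning a map keeps model-specific values out of the shared function signature.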

I haven't added an interface for the model, but that should go along with the forward pass and can come in a different PR. We also need to determine what the actual directory structure should look like.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 06:12:12 -05:00
Reference: github-starred/ollama#17728