[PR #15974] [CLOSED] feat: Gemma 4 visual token budgets (image_min_tokens / image_max_tokens) #77674

Closed
opened 2026-05-05 10:20:58 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15974
Author: @glennneuber
Created: 5/5/2026
Status: Closed

Base: main ← Head: feat/gemma4-visual-token-budgets


📝 Commits (9)

  • 2d48cd8 feat(gemma4vision): add Gemma 4 visual token ladder helpers
  • 6c5e815 feat(api): add image_min_tokens and image_max_tokens to Options
  • d6b1fb5 feat(model): add MultimodalBudgetEncoder interface
  • 63f0d7a feat(gemma4): apply per-request vision token budgets
  • 82d5d8f feat(ollamarunner): plumb vision token budgets through completion
  • 41e382d fix(scheduler): reload GGUF runner when image token options change
  • 0c4a611 chore(mlxrunner): debug log when image token budgets are set
  • e563a3b docs(design): add Gemma 4 vision token budget design note
  • e90d7d9 docs(design): add PR body template for GitHub

📊 Changes

14 files changed (+713 additions, -34 deletions)


📝 api/types.go (+4 -0)
➕ docs/design/PR_BODY.md (+17 -0)
➕ docs/design/README.md (+3 -0)
➕ docs/design/gemma4-vision-token-budgets.md (+422 -0)
➕ internal/gemma4vision/budget.go (+67 -0)
➕ internal/gemma4vision/budget_test.go (+41 -0)
📝 model/model.go (+6 -0)
📝 model/models/gemma4/model.go (+28 -6)
📝 model/models/gemma4/process_image.go (+35 -22)
➕ model/models/gemma4/process_image_test.go (+45 -0)
📝 runner/ollamarunner/runner.go (+27 -5)
📝 server/sched.go (+5 -1)
📝 server/sched_test.go (+7 -0)
📝 x/mlxrunner/server.go (+6 -0)

📄 Description

Summary

Adds the Gemma 4 visual token budget options image_min_tokens and image_max_tokens to api.Options. Requested values snap to the ladder {70, 140, 280, 560, 1120}; the defaults are 70 (min) and 560 (max). The budgets are wired per completion in ollamarunner, and on non-MLX engines the scheduler reloads the runner when those options change (alongside the existing Runner comparison).
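For readers who want a feel for the ladder snap, here is a minimal sketch. It is not the code in internal/gemma4vision/budget.go: the names `snap` and `resolve`, the nearest-rung policy (ties going to the lower rung), the treatment of zero as "use the default", and the min/max clamp are all assumptions made for illustration. Only the ladder values and the 70 / 560 defaults come from the summary above.

```go
package main

import "fmt"

// ladder holds the rungs quoted in the PR summary.
var ladder = []int{70, 140, 280, 560, 1120}

// Defaults quoted in the PR summary.
const (
	defaultMin = 70
	defaultMax = 560
)

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// snap returns the ladder rung closest to n.
// Assumption: nearest-rung policy, ties resolved toward the lower rung.
func snap(n int) int {
	best := ladder[0]
	for _, r := range ladder[1:] {
		if abs(r-n) < abs(best-n) {
			best = r
		}
	}
	return best
}

// resolve fills in defaults for unset (non-positive) values, snaps both
// budgets to the ladder, and keeps min <= max.
// Assumption: these rules are hypothetical, not taken from the PR.
func resolve(minTok, maxTok int) (int, int) {
	if minTok <= 0 {
		minTok = defaultMin
	}
	if maxTok <= 0 {
		maxTok = defaultMax
	}
	lo, hi := snap(minTok), snap(maxTok)
	if lo > hi {
		lo = hi
	}
	return lo, hi
}

func main() {
	fmt.Println(resolve(0, 0))     // 70 560
	fmt.Println(resolve(100, 900)) // 70 1120
}
```

The actual helper may round up or down rather than to the nearest rung; the design note referenced below is the authority on that.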

Design

See docs/design/gemma4-vision-token-budgets.md (full plan preserved in-repo).

Testing

  • go test ./internal/gemma4vision/... ./model/models/gemma4/...
  • go test ./runner/ollamarunner/... ./server/... (or full go test ./... in CI)

Notes

  • MLX engine: the options are decoded, but slog.Debug is emitted only when they are non-zero (vision does not run on MLX).
  • OLLAMA_DEBUG=1: the Gemma 4 vision path emits structured slog.Debug records for the budgets and the token count.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 10:20:59 -05:00
Reference: github-starred/ollama#77674