[PR #9025] [CLOSED] Include unified vision layers in memory prediction #18108

Closed
opened 2026-04-16 06:24:42 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/9025
Author: @dhiltgen
Created: 2/12/2025
Status: Closed

Base: jessegross/new_runner ← Head: new_runner_vision_mem_predict


📝 Commits (10+)

  • 44b3974 next
  • 5838289 fix linter
  • 3749883 refactor prcess text tests
  • f46a4b0 model: benchmark bpe split
  • c4f127e remove unused file
  • 95eb87a ml: update Dump to handle precision
  • 9a504e2 backend: Don't return an error on Close
  • e0d8f38 backend: Consistently use int (vs. int64) for tensor shapes
  • fdca546 backend: Support graph computation that does not return an output
  • a701c94 ggml-backend: Let GGML allocate context memory

📊 Changes

83 files changed (+478091 additions, -527 deletions)

View changed files

📝 cmd/cmd.go (+9 -2)
📝 cmd/runner/main.go (+1 -1)
📝 convert/convert.go (+16 -16)
📝 convert/convert_bert.go (+5 -5)
📝 convert/convert_commandr.go (+5 -5)
📝 convert/convert_gemma.go (+5 -5)
📝 convert/convert_gemma2.go (+2 -4)
📝 convert/convert_gemma2_adapter.go (+5 -5)
📝 convert/convert_llama.go (+6 -6)
📝 convert/convert_llama_adapter.go (+5 -5)
📝 convert/convert_mixtral.go (+5 -5)
📝 convert/convert_phi3.go (+7 -7)
📝 convert/convert_qwen2.go (+5 -5)
📝 convert/convert_test.go (+6 -6)
📝 envconfig/config.go (+3 -0)
📝 fs/ggml/ggml.go (+148 -95)
📝 fs/ggml/gguf.go (+6 -7)
📝 fs/ggml/type.go (+2 -7)
📝 fs/util/bufioutil/buffer_seeker.go (+0 -0)
📝 fs/util/bufioutil/buffer_seeker_test.go (+0 -0)

...and 63 more files

📄 Description

Replaced by #9113

For newer vision models packaged as a single GGUF, include the projection estimates in the memory prediction.

Note: I debated DRYing this out with projectorMemoryRequirements in memory.go (which this is derived from), but the two are different now and may continue to evolve independently as we nail down the metadata formats, so having a distinct function felt like less friction.
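
As an illustration only (not the PR's code), here is a minimal, self-contained Go sketch of what such a distinct estimator could look like, assuming vision encoder and projector tensors in a unified GGUF are identifiable by a name prefix such as "v." or "mm."; the types, names, and prefixes below are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// tensorInfo stands in for the per-tensor metadata parsed from a GGUF file;
// the real types in fs/ggml differ.
type tensorInfo struct {
	name  string
	bytes uint64
}

// visionWeightsEstimate sums the sizes of the vision encoder / projector
// tensors bundled into a unified GGUF. It is kept as its own function rather
// than reusing projectorMemoryRequirements, mirroring the reasoning above.
// The "v." and "mm." name prefixes are assumed conventions, not taken from
// the PR.
func visionWeightsEstimate(tensors []tensorInfo) uint64 {
	var total uint64
	for _, t := range tensors {
		if strings.HasPrefix(t.name, "v.") || strings.HasPrefix(t.name, "mm.") {
			total += t.bytes
		}
	}
	return total
}

func main() {
	tensors := []tensorInfo{
		{"blk.0.attn_q.weight", 16 << 20},  // text-model tensor, ignored
		{"v.blk.0.attn_q.weight", 4 << 20}, // vision encoder tensor
		{"mm.0.weight", 1 << 20},           // projector tensor
	}
	fmt.Printf("vision weights estimate: %d MiB\n", visionWeightsEstimate(tensors)>>20)
}
```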

This also adjusts the CLI so it can detect both styles of vision model.
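
For illustration, a hedged sketch of how a CLI check might recognize both styles: an older model that ships a separate projector (mmproj) file, and a newer unified GGUF that declares vision layers in its metadata. The field names and the "<arch>.vision.block_count" key are assumptions, not confirmed by the PR.

```go
package main

import "fmt"

// modelInfo stands in for the parsed model metadata the CLI can see;
// field names here are illustrative, not the PR's actual types.
type modelInfo struct {
	projectorPaths []string       // separate mmproj GGUF files, if any
	kv             map[string]any // key/value metadata of the main GGUF
}

// isVisionModel covers both styles: a split model with a separate projector
// file, and a unified GGUF whose metadata declares vision layers.
func isVisionModel(m modelInfo) bool {
	if len(m.projectorPaths) > 0 {
		return true // classic split model: main GGUF + mmproj
	}
	arch, _ := m.kv["general.architecture"].(string)
	_, ok := m.kv[arch+".vision.block_count"]
	return ok // unified model: vision layers declared in the single GGUF
}

func main() {
	unified := modelInfo{kv: map[string]any{
		"general.architecture":     "llama",
		"llama.vision.block_count": uint32(32),
	}}
	split := modelInfo{projectorPaths: []string{"mmproj.gguf"}}
	fmt.Println(isVisionModel(unified), isVisionModel(split)) // true true
}
```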


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 06:24:43 -05:00

Reference: github-starred/ollama#18108