[PR #13511] [CLOSED] server: support split mmproj GGUF layers for existing models #14246

opened 2026-04-13 00:49:13 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13511
Author: @iosub
Created: 12/17/2025
Status: Closed

Base: main ← Head: qwen3vl-split-clean-pr


📝 Commits (10+)

  • 0b6f0ef feat: add support for split GGUF models (separate vision encoder)
  • f458abc fix: M-RoPE position encoding and address PR review feedback
  • f4bcc34 fix: add [img] tokens for split GGUF models without renderer
  • c29a2f0 model/qwen3vl: fix split vision deepstack outputs
  • 314f7fd fix: keep embedding API and GGUF writer compatible with main
  • 6c6fee4 fix: address Copilot review
  • 7183fae server: support split mmproj gguf layers
  • 7ebb0ca chore: trigger PR refresh
  • 9021dc0 chore: trigger PR refresh
  • 607249e new file: z_iosu_2/1/plan_mmproj_split_qwen3vl.md

📊 Changes

24 files changed (+2729 additions, -227 deletions)


📝 .gitignore (+2 -0)
📝 fs/ggml/gguf.go (+2 -2)
📝 llama/llama.go (+232 -20)
📝 llm/server.go (+37 -17)
📝 llm/server_test.go (+3 -0)
📝 ml/backend.go (+14 -0)
📝 ml/backend/ggml/ggml.go (+620 -10)
➕ ml/nn/fast/rope.go (+21 -0)
📝 model/model.go (+18 -0)
📝 model/models/qwen3vl/imageprocessor.go (+66 -13)
📝 model/models/qwen3vl/model.go (+434 -9)
📝 model/models/qwen3vl/model_text.go (+5 -4)
📝 model/models/qwen3vl/model_vision.go (+533 -59)
➕ model/vision_bridge.go (+150 -0)
📝 runner/llamarunner/cache.go (+5 -0)
📝 runner/llamarunner/image.go (+54 -3)
📝 runner/llamarunner/runner.go (+283 -25)
📝 runner/ollamarunner/cache.go (+19 -0)
📝 runner/ollamarunner/runner.go (+26 -8)
📝 server/create.go (+3 -2)

...and 4 more files

📄 Description

This PR updates Go-side handling so users can run already-published Hugging Face split multimodal GGUF bundles (main text model + separate vision encoder + separate mmproj.gguf) without repacking.

Go changes:

  • server/create.go: classify GGUF files with general.type=mmproj as projector layers (same as general.type=projector) so they get media type application/vnd.ollama.image.projector during create/import.
  • server/images.go: add load-time GGUF metadata inspection to reclassify mislabeled layers in existing manifests:
    • if a layer is typed as application/vnd.ollama.image.model but general.type is mmproj/projector, route it into ProjectorPaths and avoid overwriting ModelPath.
    • if block_count == 0 and vision.block_count > 0, set VisionPath.
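The reclassification rules above can be sketched as a small decision function. This is a hypothetical illustration, not the actual server/images.go code: the function name classifyLayer and its string results are made up for clarity, while the media type and GGUF metadata keys (general.type, block_count, vision.block_count) come from the PR description.

```go
package main

import "fmt"

// mediaModel mirrors the manifest media type mentioned in the PR.
const mediaModel = "application/vnd.ollama.image.model"

// classifyLayer is a hypothetical sketch of the load-time reclassification
// rule: given a layer's declared media type and GGUF metadata, decide whether
// the layer is really a projector, a standalone vision encoder, or the model.
func classifyLayer(mediaType, generalType string, blockCount, visionBlockCount uint64) string {
	// A layer typed as a model whose GGUF metadata says mmproj/projector is
	// routed into ProjectorPaths instead of overwriting ModelPath.
	if mediaType == mediaModel && (generalType == "mmproj" || generalType == "projector") {
		return "projector"
	}
	// No text blocks but vision blocks present: a standalone vision encoder,
	// so VisionPath is set.
	if blockCount == 0 && visionBlockCount > 0 {
		return "vision"
	}
	return "model"
}

func main() {
	fmt.Println(classifyLayer(mediaModel, "mmproj", 0, 0))
	fmt.Println(classifyLayer(mediaModel, "model", 0, 32))
	fmt.Println(classifyLayer(mediaModel, "model", 36, 0))
}
```

In a split Qwen3-VL bundle, all three cases can occur across the downloaded GGUF files, which is why both the general.type check and the block-count check are needed.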

Validation:

  • Verified with hf.co/unsloth/Qwen3-VL-8B-Instruct-GGUF:Q4_K_M and hf.co/ggml-org/Qwen3-VL-2B-Instruct-GGUF:Q8_0; logs show ensureVisionReady hasProjector=true, mm.0.*/mm.2.* projector tensors found, and "Copied mm.0/mm.2 projectors to VisionModel".

Note: This supersedes the now-closed PR #13456 (same branch; GitHub won't update diffs for closed PRs).


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:49:13 -05:00

Reference: github-starred/ollama#14246