[GH-ISSUE #15203] Imported Chandra OCR 2 GGUF + mmproj fails for image inference on macOS MLX build (split vision / qwen35 compatibility) #71791

Closed
opened 2026-05-05 02:30:14 -05:00 by GiteaMirror · 2 comments

Originally created by @kwoh80 on GitHub (Apr 2, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15203

Summary

I’m trying to import and run Chandra OCR 2 locally with Ollama on Apple Silicon macOS.

The model imports successfully, but image inference fails at runtime.
This reproduces on:

  • Ollama 0.19.0
  • OLLAMA_NEW_ENGINE=true
  • current main built from source

Environment

  • macOS Apple Silicon
  • Metal / MLX path
  • Ollama 0.19.0
  • also reproduced on current main source build
  • model files:
    • chandra-ocr-2.Q8_0.gguf
    • chandra-ocr-2.mmproj-q8_0.gguf

Reproduction

  1. Import base model only:

FROM /tmp/chandra_gguf/chandra-ocr-2.Q8_0.gguf

  2. Import base + mmproj:

FROM /tmp/chandra_gguf/chandra-ocr-2.Q8_0.gguf
FROM /tmp/chandra_gguf/chandra-ocr-2.mmproj-q8_0.gguf

  3. Create the model:

ollama create chandraq8mmproj -f Modelfile.mmproj

  4. Send a generation request through /api/generate with an image + prompt.
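For step 4, a minimal request-building sketch (standard library only; the helper name and the use of a local file are illustrative, not part of Ollama — the /api/generate endpoint itself expects raw base64 strings in an "images" array):

```python
import base64
import json

def build_generate_payload(model: str, prompt: str, image_path: str) -> str:
    """Build a /api/generate request body with one base64-encoded image."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "images": [image_b64],  # raw base64, no data: URI prefix
        "stream": False,
    })
```

The resulting body can then be POSTed to http://localhost:11434/api/generate (the default endpoint), e.g. with `curl -d @payload.json`.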

Expected behavior

The imported Chandra multimodal model should accept image input and run OCR/reconstruction.

Actual behavior

Base-only import:

  • image inference fails with:
    • this model is missing data required for image input

Base + mmproj import:

  • /api/tags shows the model and reports families like:
    • ["qwen35", "clip"]
  • but runtime inference fails with:
    • unable to load model
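To double-check what the registry reports before attempting inference, a small hypothetical helper (standard library only) can pull the families list out of an already-fetched and parsed /api/tags response body:

```python
import json

def families_from_tags(tags: dict, model_name: str):
    """Extract details.families for a model from a parsed /api/tags response."""
    for m in tags.get("models", []):
        if m.get("name", "").startswith(model_name):
            return (m.get("details") or {}).get("families")
    return None
```

Feed it the parsed output of `curl http://localhost:11434/api/tags`; for the import described above it returns `["qwen35", "clip"]`, matching what the issue reports.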

Relevant logs

model not yet supported by Ollama engine, switching to compatibility mode
split vision models aren't supported
llama_model_load: error loading model architecture: unknown model architecture: 'qwen35'

API response:
{"error":"unable to load model: /Users/.../.ollama/models/blobs/sha256-..."}

Notes

This suggests the problem is not just missing mmproj data.
The model imports and registers correctly, but runtime loading fails for an imported split-vision model using qwen35 + clip.

Questions

  • Is imported split-vision multimodal GGUF currently unsupported on Ollama for this architecture?
  • Is qwen35 multimodal import expected to work yet?
  • Is there a recommended import format or model packaging workaround for this kind of model?

Minimal log block:
```text
model not yet supported by Ollama engine, switching to compatibility mode
split vision models aren't supported
llama_model_load: error loading model architecture: unknown model architecture: 'qwen35'
{"error":"unable to load model: /Users/.../.ollama/models/blobs/sha256-..."}
```

Related references:

  • https://github.com/ollama/ollama/issues/9967
  • https://github.com/ollama/ollama/issues/11254
  • https://github.com/ollama/ollama/issues/9727
  • https://github.com/ollama/ollama/issues/5245
  • https://github.com/ollama/ollama/releases

@rick-github commented on GitHub (Apr 2, 2026):

Is imported split-vision multimodal GGUF currently unsupported on Ollama for this architecture?

Correct, see #14575.

Try importing from safetensors: https://github.com/ollama/ollama/blob/main/docs/import.mdx#Importing-a-model-from-Safetensors-weights
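Following the linked import docs, a safetensors-based import would look roughly like this (the directory path and model name are illustrative; whether the split vision components then load at runtime is exactly what needs verifying):

```text
# Modelfile.st — point FROM at the directory containing the
# safetensors weights plus config/tokenizer files (path illustrative)
FROM /path/to/chandra-ocr-2-safetensors

# then create the model:
#   ollama create chandra-st -f Modelfile.st
```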


@kwoh80 commented on GitHub (Apr 3, 2026):

Thanks, that helps.

We’ll try importing from safetensors rather than GGUF and report back.

Our case is still a split-vision multimodal one (qwen35 + clip / mmproj-style layout), so the main thing we want to verify is whether the safetensors import path changes the runtime outcome for image inference, not just model registration.

If we can get a clean repro on the safetensors path, we’ll follow up with:

  • exact model source
  • exact Modelfile / import command
  • whether image inference succeeds or still fails
  • the corresponding runtime logs
Reference: github-starred/ollama#71791