[GH-ISSUE #15541] Metal backend crash on Apple Silicon M5: MTLLibrary bfloat/half mismatch causes llama runner termination (500) #35690

Closed
opened 2026-04-22 20:22:32 -05:00 by GiteaMirror · 3 comments

Originally created by @rahul238xaviers on GitHub (Apr 13, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15541

Summary

Ollama consistently fails to run any model due to a Metal backend initialization failure. The runner process terminates and the API returns 500.

Environment

  • Ollama version: 0.20.6
  • Platform: macOS 26.3.1 (build 25D771280a)
  • Hardware: Apple Silicon M5
  • Install type: Ollama desktop app + CLI
  • Reproducibility: 100% (all tested models)
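The environment details above can be gathered with a few standard commands; `sw_vers` exists only on macOS, so it is guarded here to keep the snippet safe to paste into any shell:

```shell
# Gather the environment details listed above.
command -v sw_vers >/dev/null 2>&1 && sw_vers        # macOS version and build
uname -m                                             # arm64 on Apple Silicon
command -v ollama >/dev/null 2>&1 && ollama --version || true
```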

Steps to Reproduce

  1. Start Ollama.
  2. Run any model prompt, for example:
    • ollama run gemma2:2b "Hello"
  3. Observe server-side runner crash and client-side 500 error.
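The steps above can be run from a terminal; the log path below is the default location the macOS desktop app writes its server log to (it may differ on a non-default install):

```shell
# Reproduce the failure and capture the server-side error.
ollama run gemma2:2b "Hello" 2>&1 || echo "run failed (exit $?)"
# The Metal init errors appear in the server log:
tail -n 50 ~/.ollama/logs/server.log 2>/dev/null || true
```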

Expected Behavior

Model loads successfully and returns a response.

Actual Behavior

Model load fails during Metal initialization; llama runner exits; client receives:

  • 500 Internal Server Error: llama runner process has terminated: %!w(<nil>)

Error Signature (sanitized)

  • static_assert failed: Input types must match cooperative tensor types
  • bfloat/half mismatch in MetalPerformancePrimitives matmul path
  • ggml_metal_init: error: failed to initialize the Metal library
  • llama_init_from_model: failed to initialize the context: failed to initialize Metal backend
  • panic: unable to create llama context
  • llama runner terminated: exit status 2

Representative framework references:

  • /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3266
  • /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3267
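To check whether a local failure matches this signature, the error lines above can be filtered out of the server log (path assumed from a default desktop install):

```shell
# Search the server log for the Metal-backend failure signature.
grep -E "ggml_metal_init|failed to initialize Metal|static_assert" \
  ~/.ollama/logs/server.log 2>/dev/null || echo "signature not found"
```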

Additional Observations

  • Failure occurs across multiple models, not model-specific.
  • Model blobs download and verification succeed.
  • Runtime fails at backend initialization stage.

Workarounds Tried

  • Reinstall Ollama
  • Install/update Xcode and Metal toolchain components
  • Restart app/processes
  • Alternate host/port setup
  • Disable flash attention
  • KV cache type adjustments

Result: same failure path in Metal backend.

Impact

Ollama is unusable for inference in this environment because all model runs terminate before serving.

Privacy Note

All user-identifying information and local absolute paths have been redacted.


@tjtoed commented on GitHub (Apr 13, 2026):

Same issue here on a new MacBook Pro M5 running macOS 26.2 (build 25C56) and Ollama 0.20.6


@rahul238xaviers commented on GitHub (Apr 13, 2026):

> Same issue here on a new Macbook Pro M5 running 26.2 25C56 and Ollama 0.20.6

I have just finished updating macOS, and it has started working. It seems the compatibility issue was with the Metal library. I will keep this issue open until I verify with a larger model, and then I will close it.

[Screenshot attached]

@rahul238xaviers commented on GitHub (Apr 13, 2026):

Confirmed, the macOS update resolved the issue.

[Screenshot attached]
Reference: github-starred/ollama#35690