[PR #13649] [CLOSED] mlx: implement L2Norm for embedding model support #45562

Closed
opened 2026-04-25 01:14:38 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13649
Author: @iamadalek
Created: 1/8/2026
Status: Closed

Base: mlx-engine ← Head: mlx-engine


📝 Commits (1)

  • 92d6603 mlx: implement L2Norm operation for BERT/Qwen3 embedding support

📊 Changes

1 file changed (+30 additions, -2 deletions)

📝 x/ml/backend/mlx/mlx.go (+30 -2)

📄 Description

Summary

Implements the L2Norm operation for the MLX backend, enabling BERT, NomicBERT, Qwen3, and Gemma3 embedding models.

Implementation

L2 normalization using MLX-C primitives:

x * rsqrt(sum(x², axis=-1, keepdims=true) + eps)

The implementation follows the same pattern as the existing RMSNorm and LayerNorm ops and releases each intermediate result with mlx_array_free().

Models Unblocked

This fixes the NOT YET IMPLEMENTED panic for:

  • model/models/bert/embed.go
  • model/models/nomicbert/model.go
  • model/models/qwen3/embed.go
  • x/model/models/gemma3/embed.go

Testing

The code passes gofmt, which confirms that it parses but does not type-check or compile it. A full build requires Xcode with the Metal compiler, which I don't have installed, so the change has not been built or run. The implementation is straightforward and uses verified MLX-C v0.4.1 functions.

Related to #13648


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 01:14:38 -05:00

Reference: github-starred/ollama#45562