[GH-ISSUE #4128] Normalization of output from embedding model #2565

Open
opened 2026-04-12 12:53:30 -05:00 by GiteaMirror · 2 comments

Originally created by @hagemon on GitHub (May 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4128

When I use Ollama Embedding together with LangChain Retriever's `get_relevant_documents`, I always get a score of around 200. However, when I use HuggingFaceEmbedding, this value is between 0 and 1.

So I continued to explore the reason and, following the official documentation, used OllamaEmbedding to vectorize both the query and the documents. I found that their dot products still exceed 100:

```python
from langchain_community.embeddings import OllamaEmbeddings
import numpy as np

ollama_emb = OllamaEmbeddings(
    model="mxbai-embed-large:latest",
)
r1 = ollama_emb.embed_documents(
    [
        "Alpha is the first letter of Greek alphabet",
        "Beta is the second letter of Greek alphabet",
    ]
)
r2 = ollama_emb.embed_query(
    "What is the second letter of Greek alphabet"
)
print(np.dot(r1, r2))
# Output: array([196.91232687, 198.68434774])
```

Therefore, I assume that they are not normalized.

Some vector databases, such as Milvus, suggest normalizing vectors before inserting them into the database. So I wonder if OllamaEmbedding plans to (or already does) support an option like HuggingFaceEmbedding's `encode_kwargs = {"normalize_embeddings": True}`, which normalizes the output vectors without my having to implement that step manually.
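In the meantime, normalization is straightforward to apply by hand. A minimal sketch (the `normalize` helper below is mine, not part of any library), reusing `r1` and `r2` from the snippet above and dividing each vector by its L2 norm:

```python
import numpy as np

def normalize(vectors):
    """L2-normalize a single vector or a batch of row vectors."""
    arr = np.asarray(vectors, dtype=float)
    # keepdims lets the division broadcast for both 1-D and 2-D inputs
    return arr / np.linalg.norm(arr, axis=-1, keepdims=True)

print(np.dot(normalize(r1), normalize(r2)))
# The scores are now cosine similarities, bounded by [-1, 1]
```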

GiteaMirror added the feature request label 2026-04-12 12:53:30 -05:00
@cristoslc commented on GitHub (Jun 30, 2024):

That's an issue for me as well. I'm trying to use Ollama as a low-effort local embedding server to pair with FileMaker 2024, but FileMaker's built-in functions require normalized embeddings (as of 2024-06-30).
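One workaround for clients that can't post-process responses is to put a thin normalizing proxy in front of Ollama. A minimal sketch, assuming Ollama's `/api/embeddings` endpoint at its default `localhost:11434`; the Flask app and route here are illustrative, not an existing project:

```python
import numpy as np
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default Ollama address

@app.post("/api/embeddings")
def normalized_embeddings():
    # Forward the request body ({"model": ..., "prompt": ...}) unchanged
    resp = requests.post(OLLAMA_URL, json=request.get_json())
    resp.raise_for_status()
    data = resp.json()
    emb = np.asarray(data["embedding"])
    # Replace the raw embedding with its unit-length version
    data["embedding"] = (emb / np.linalg.norm(emb)).tolist()
    return jsonify(data)
```

Pointing the client at this proxy instead of Ollama directly would yield unit-length vectors.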

@Susensio commented on GitHub (Jul 2, 2024):

I ran into the same problem with Ollama not normalizing embeddings. I ended up extending the `OllamaEmbeddings` class:

```python
import numpy as np
from langchain_community.embeddings import OllamaEmbeddings

class OllamaEmbeddingsNormalized(OllamaEmbeddings):
    def _process_emb_response(self, input: str) -> list[float]:
        # L2-normalize each embedding as it comes back from Ollama
        emb = super()._process_emb_response(input)
        return (np.array(emb) / np.linalg.norm(emb)).tolist()

# The rest of your code
ollama_emb = OllamaEmbeddingsNormalized(
    model="mxbai-embed-large:latest",
)
r1 = ollama_emb.embed_documents(
    [
        "Alpha is the first letter of Greek alphabet",
        "Beta is the second letter of Greek alphabet",
    ]
)
r2 = ollama_emb.embed_query(
    "What is the second letter of Greek alphabet"
)
print(np.dot(r1, r2))
# Output: array([0.67319929 0.75340838])
```
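Worth noting: `_process_emb_response` is a private method of the LangChain wrapper, so this override relies on an implementation detail that may change between releases. Once the vectors are unit length, the dot product is exactly cosine similarity, which is why the scores now fall within [-1, 1].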

Reference: github-starred/ollama#2565