[GH-ISSUE #6187] Embeddings produce different results when sent as a list as opposed to individually #50374

Closed
opened 2026-04-28 15:30:12 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @jorgetrejo36 on GitHub (Aug 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6187

Originally assigned to: @royjhan on GitHub.

What is the issue?

I am using the ollama.embed function from the Python library and getting interesting results whenever I send a list of inputs to the function. The embedding response varies significantly when a list of inputs is sent rather than a single string (either by itself or wrapped in a list). To showcase the disparity, see the code below, which creates embeddings for a list of sample sentences. When the embeddings are created one by one through a loop (as either a string or a list with a single string), the embeddings are equal. However, if the same strings are sent together as a list to ollama.embed, the embeddings all vary.

I have no idea what the cause of this is, and it is pretty annoying, as I am trying to use these embeddings for a RAG app and it's crucial that each piece of input produces accurate embeddings to reference.

I have a suspicion that it may not be caused by ollama itself but by llama.cpp; before digging further, I was curious whether anyone else had come across this issue.

import ollama
import numpy as np
import os
from typing import List

EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL")
EPS=1e-4

test_sentences = [
    "The Act of Union (1707) united England and Scotland under a single government.",
    "Queen Anne died in 1714 without an heir, leading to the succession crisis that resulted in the Hanoverian dynasty taking the throne.",
    "The War of the Spanish Succession (1701-1714) saw England allied with Austria against Spain and France.",
    "The Treaty of Utrecht (1713) ended the war and granted England significant territorial gains.",
    "The South Sea Company was founded in 1711, leading to a speculative bubble that burst in 1720, causing widespread financial ruin.",
    "The Jacobite Risings (1689-1746) were a series of rebellions aimed at restoring the Stuart dynasty to the British throne.",
    "The Glorious Revolution (1688) saw William III and Mary II take the throne from James II, establishing constitutional monarchy in England.",
    "The Great Fire of London (1702) destroyed much of the city, leading to significant rebuilding efforts.",
    "The Gin Act (1729) was passed to curb excessive gin consumption, which had become a major social problem.",
    "The Industrial Revolution began to take hold in England during this period, with innovations like the spinning jenny and power looms transforming manufacturing"
]


def embed_string(s: str) -> np.ndarray:
    return np.array(ollama.embed(
        input=s,
        model=EMBEDDING_MODEL,
        options={},
        truncate=False
    )["embeddings"])[0]

def embed_list(s: List[str]) -> np.ndarray:
    return np.array(ollama.embed(
        input=s,
        model=EMBEDDING_MODEL,
        options={},
        truncate=False
    )["embeddings"])

def embed_list_single(s: List[str]) -> np.ndarray:
    return np.array(ollama.embed(
        input=[s],
        model=EMBEDDING_MODEL,
        options={},
        truncate=False
    )["embeddings"][0])

def test(list_of_string: List[str]) -> None:
    singles = np.array([embed_string(s) for s in list_of_string])
    as_list = embed_list(list_of_string)
    as_list_singles = np.array([embed_list_single(s) for s in list_of_string])

    print(f"singles.shape: {singles.shape}")
    print(f"as_list.shape: {as_list.shape}")
    print(f"as_list_singles.shape: {as_list_singles.shape}")
    
    print("distance between singles and batch list:")
    for i, s in enumerate(list_of_string):
        dist = np.sqrt(((singles[i] - as_list[i]) ** 2).sum())
        print(f"{i}: {dist:.9f}")

    print("distance between singles and single-element-list:")
    for i, s in enumerate(list_of_string):
        dist = np.sqrt(((singles[i] - as_list_singles[i]) ** 2).sum())
        print(f"{i}: {dist:.9f}")

    print("distance between single-element-list and batch list:")
    for i, s in enumerate(list_of_string):
        dist = np.sqrt(((as_list[i] - as_list_singles[i]) ** 2).sum())
        print(f"{i}: {dist:.9f}")

test(test_sentences)

Output:

singles.shape: (10, 384)
as_list.shape: (10, 384)
as_list_singles.shape: (10, 384)
distance between singles and batch list:
0: 0.001783381
1: 0.218350668
2: 0.243072520
3: 0.219616556
4: 0.382090694
5: 0.278717576
6: 0.291609303
7: 0.270641616
8: 0.243079911
9: 0.204011001
distance between singles and single-element-list:
0: 0.000000230
1: 0.000000124
2: 0.000000000
3: 0.000000000
4: 0.000000000
5: 0.000000000
6: 0.000000097
7: 0.000000128
8: 0.000000146
9: 0.000000000
distance between single-element-list and batch list:
0: 0.001783382
1: 0.218350656
2: 0.243072520
3: 0.219616556
4: 0.382090694
5: 0.278717576
6: 0.291609299
7: 0.270641636
8: 0.243079900
9: 0.204011001
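The near-identical distances in the first and third tables above suggest the batch vectors may be the per-string vectors in a different order. Here is a minimal NumPy sketch of that check; the arrays are synthetic stand-ins for `singles` and `as_list` (the real arrays come from the script above), with a deterministic cyclic shift playing the role of the suspected shuffle:

```python
import numpy as np

# Synthetic stand-ins for the arrays produced by the script above:
# "singles" plays the role of the per-string embeddings, and "as_list"
# is the same set of rows in a shuffled order (a deterministic cyclic
# shift, so the example is reproducible).
rng = np.random.default_rng(0)
singles = rng.normal(size=(10, 384))
perm = np.roll(np.arange(10), 3)
as_list = singles[perm]

# Pairwise L2 distances between every batch row and every single row.
dists = np.linalg.norm(as_list[:, None, :] - singles[None, :, :], axis=2)

# For each batch row, the index of the closest single-string embedding.
match = dists.argmin(axis=1)

# The batch output is a pure permutation if every batch row matches a
# distinct single row at (near-)zero distance.
is_permutation = (
    len(set(match)) == len(match)
    and np.allclose(dists[np.arange(len(match)), match], 0.0)
)
print("permutation:", is_permutation)
print("order:", match)
```

Run against the real `singles` and `as_list`, `match` would reveal whether each batch row is an exact copy of some single-string embedding, merely at the wrong index.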

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.3.2

GiteaMirror added the bug label 2026-04-28 15:30:12 -05:00
Author
Owner

@royjhan commented on GitHub (Aug 5, 2024):

Thanks for reporting this. From an initial diagnosis, it looks like all of the embeddings are "correct" but are being shuffled around in the batch case, which isn't good. Here's the code I ran to verify this:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

batch_response = ollama.embed(
    input=test_sentences,
    model=EMBEDDING_MODEL,
    truncate=False
)["embeddings"]

constructed_singles = []
for s in test_sentences:
    constructed_singles.append(ollama.embed(
        input=s,
        model=EMBEDDING_MODEL,
        truncate=False
    )["embeddings"][0])

constructed_single_list = []
for s in test_sentences:
    constructed_single_list.append(ollama.embed(
        input=[s],
        model=EMBEDDING_MODEL,
        truncate=False
    )["embeddings"][0])

similarities = model.similarity(batch_response, constructed_singles)
print(similarities)

print("\n")

similarities = model.similarity(batch_response, constructed_single_list)
print(similarities)

print("\n")

similarities = model.similarity(constructed_singles, constructed_single_list)
print(similarities)

print("\n")

We should get a 1.0 diagonal in the cosine similarity matrix, but the 1.0 lands at the wrong index in some rows. Will work towards a fix ASAP.
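The diagonal check described above can be sketched without a running server. This is a NumPy-only illustration, not the actual diagnostic run: unit-norm synthetic vectors stand in for the single-input embeddings, and the "batch" response is the same vectors cyclically shifted, mimicking the shuffle:

```python
import numpy as np

# Unit-norm synthetic vectors standing in for the single-input
# embeddings; the "batch" response is the same vectors cyclically
# shifted, mimicking the shuffle described above.
rng = np.random.default_rng(1)
constructed_singles = rng.normal(size=(10, 384))
constructed_singles /= np.linalg.norm(constructed_singles, axis=1, keepdims=True)
perm = np.roll(np.arange(10), 1)
batch_response = constructed_singles[perm]

# Cosine similarity matrix: rows index the batch, columns the singles.
sim = batch_response @ constructed_singles.T

# A correct batch path would put ~1.0 on the diagonal; after a shuffle
# the 1.0s sit at the permuted positions, and argmax per row recovers
# which single each batch row actually corresponds to.
diagonal_ok = np.allclose(sim.diagonal(), 1.0)
recovered = sim.argmax(axis=1)
print("diagonal is all 1.0:", diagonal_ok)
print("recovered order:", recovered)
```

The argmax per row recovers the permutation exactly, which matches the observation that the embeddings themselves are correct and only their order in the batch response is wrong.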

Author
Owner

@Pobalinsky commented on GitHub (Apr 17, 2025):

I had a similar issue. I don't know if this is still relevant, but I was only getting these kinds of weird results when using contextual LLMs; basically, the surrounding context changed the embeddings. When I used non-contextual models to embed, the issue was gone.

Reference: github-starred/ollama#50374