[GH-ISSUE #3777] Embedding results have changed in v0.1.32 #2333

Closed
opened 2026-04-12 12:39:27 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @SunmeetOberoi on GitHub (Apr 20, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3777

Originally assigned to: @jmorganca on GitHub.

What is the issue?

Embedding values have changed in the 0.1.32 release.

I used an older version of Ollama to complete a POC for categorizing some data, and it all went fine. When I later tried to implement the solution, the search results were way off: almost all categories had a 50-60% similarity with every input value. After trying to fix my script for hours, I downgraded Ollama, and that fixed it.

This only happens in 0.1.32; I tested the same code on 0.1.29, 0.1.30, and 0.1.31, and it works consistently and accurately.

Attaching a sample Python script to reproduce the observation:

import subprocess

import numpy as np
import ollama
import pandas as pd

models = [
    'nomic-embed-text',
    'mxbai-embed-large',
    'snowflake-arctic-embed'
]


def calculate_similarity(product_embedding):
    # Cosine similarity between each stored category embedding (in the
    # module-level df) and the query embedding, ranked descending.
    df['similarity'] = df['embeddings'].apply(
        lambda x: np.dot(x, product_embedding) / (np.linalg.norm(x) * np.linalg.norm(product_embedding)))
    df_sorted = df.sort_values(by='similarity', ascending=False)
    return df_sorted[['Name', 'similarity']].head()


print(subprocess.check_output(["ollama", "--version"]).decode())

for model in models:
    # Embed each category name with the current model.
    df = pd.DataFrame({
        'Name': ['Fruits', 'Vegetables', 'Yellow', 'Brown']
    })
    df["embeddings"] = df.apply(
        lambda row: np.array(ollama.embeddings(model=model, prompt=row['Name']).get('embedding')), axis=1)

    # Embed the query and rank the categories against it.
    arg = "Veggies"
    embedding = np.array(ollama.embeddings(model=model, prompt=arg).get('embedding'))
    result = calculate_similarity(embedding)
    print(f"========={model}==============")
    print(result)
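The cosine-similarity step in the script above can be checked in isolation with toy vectors (a minimal sketch; the helper name is illustrative and not part of the original script):

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Parallel vectors score 1.0 regardless of magnitude; orthogonal ones score 0.0.
print(cosine_similarity(np.array([1.0, 0.0]), np.array([2.0, 0.0])))  # 1.0
print(cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 3.0])))  # 0.0
```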

Results:

Output for v0.1.29
ollama version is 0.1.29

=========nomic-embed-text==============
         Name  similarity
1  Vegetables    0.663238
0      Fruits    0.449560
2      Yellow    0.307696
3       Brown    0.281496
=========mxbai-embed-large==============
         Name  similarity
1  Vegetables    0.558774
2      Yellow    0.551288
3       Brown    0.549997
0      Fruits    0.548751
=========snowflake-arctic-embed==============
         Name  similarity
0      Fruits    0.825177
2      Yellow    0.825156
3       Brown    0.824968
1  Vegetables    0.824860
Output for v0.1.30
ollama version is 0.1.30

=========nomic-embed-text==============
         Name  similarity
1  Vegetables    0.663238
0      Fruits    0.449560
2      Yellow    0.307696
3       Brown    0.281496
=========mxbai-embed-large==============
         Name  similarity
1  Vegetables    0.558774
2      Yellow    0.551288
3       Brown    0.549997
0      Fruits    0.548751
=========snowflake-arctic-embed==============
         Name  similarity
0      Fruits    0.825177
2      Yellow    0.825156
3       Brown    0.824968
1  Vegetables    0.824860
Output for v0.1.31
ollama version is 0.1.31

=========nomic-embed-text==============
         Name  similarity
1  Vegetables    0.663238
0      Fruits    0.449560
2      Yellow    0.307696
3       Brown    0.281496
=========mxbai-embed-large==============
         Name  similarity
1  Vegetables    0.558774
2      Yellow    0.551288
3       Brown    0.549997
0      Fruits    0.548751
=========snowflake-arctic-embed==============
         Name  similarity
0      Fruits    0.825177
2      Yellow    0.825156
3       Brown    0.824968
1  Vegetables    0.824860
Output for v0.1.32
ollama version is 0.1.32

=========nomic-embed-text==============
         Name  similarity
2      Yellow    0.670598
3       Brown    0.636525
1  Vegetables    0.629145
0      Fruits    0.607114
=========mxbai-embed-large==============
         Name  similarity
1  Vegetables    0.619664
3       Brown    0.573989
2      Yellow    0.549525
0      Fruits    0.480530
=========snowflake-arctic-embed==============
         Name  similarity
0      Fruits    0.787907
3       Brown    0.787868
2      Yellow    0.787841
1  Vegetables    0.787761

  • The difference is clearly visible in the output. While the similarity for the correct answer does not change much in this sample, the impact on my dataset is far bigger.
  • Although I was using only nomic and mxbai, I kept snowflake in the sample as well because on my full dataset snowflake performed a little better than the other two in 0.1.32.
  • I understand I should use a vector database for such a problem, but this is just a lightweight sample.
  • I ran this test on WSL2, but it's the same on Windows as well.
  • While gathering this data, I noticed that 0.1.32 takes significantly longer to run the script than the older versions.
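The drift described above can be quantified by embedding the same prompts under each version, saving the vectors, and comparing them row by row. A sketch with synthetic arrays standing in for saved embeddings (the function name and data are illustrative, not from the original report):

```python
import numpy as np


def embedding_drift(old, new):
    """Per-row cosine similarity between two arrays of embeddings
    (one row per prompt). Values near 1.0 mean the vectors barely moved."""
    old = old / np.linalg.norm(old, axis=1, keepdims=True)
    new = new / np.linalg.norm(new, axis=1, keepdims=True)
    return np.sum(old * new, axis=1)


# Synthetic stand-ins for embeddings saved under two Ollama versions.
rng = np.random.default_rng(0)
v_old = rng.normal(size=(4, 8))
v_new = v_old + 0.01 * rng.normal(size=(4, 8))  # small perturbation

print(embedding_drift(v_old, v_new))  # all values close to 1.0
```

If the per-prompt similarities between two versions drop well below 1.0, the versions are producing genuinely different vectors rather than a small numerical wobble.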

Python libraries version

  • Ollama Version: 0.1.8
  • Pandas Version: 2.2.2
  • Numpy Version: 1.26.4

OS

Windows, WSL2

GPU

Nvidia

CPU

Intel

Ollama version

0.1.32

GiteaMirror added the bug label 2026-04-12 12:39:27 -05:00
Author
Owner

@Kanishk-Kumar commented on GitHub (Apr 22, 2024):

I also have this issue. In addition (even in 0.1.30), it never loads the full n_ctx = 8192 and shows this warning:

Apr 18 11:21:22 xyz-MS-7D91 ollama[38865]: time=2024-04-18T11:21:22.071+05:30 level=WARN source=server.go:51 msg="requested context length is greater than model max context length" requested=8192 model=2048

@SunmeetOberoi can you please also cross-check the Ollama logs using sudo journalctl -xeu ollama.service -f and check whether n_ctx is accurate?

https://github.com/ollama/ollama/issues/3727#issuecomment-2065758174

Author
Owner

@jimscard commented on GitHub (Apr 22, 2024):

@jmorganca Looks like 0.1.32 is using the wrong model config parameter to determine max context length. Details here: https://github.com/ollama/ollama/issues/3727#issuecomment-2070006251

Also see the Readme.md in the nomic GGUF file repository for this model: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/README.md

Author
Owner

@SunmeetOberoi commented on GitHub (Apr 22, 2024):

Hi @Kanishk-Kumar, I tried it out, and yes, I am also seeing that log message, with n_ctx as 2048:

  • v0.1.31 - WSL2
    time=2024-04-22T23:43:22.474+05:30 level=WARN source=llm.go:44 msg="requested context length is greater than model's max context length (8192 > 2048), using 2048 instead"

  • v0.1.32 - Windows
    time=2024-04-22T23:51:48.208+05:30 level=WARN source=server.go:51 msg="requested context length is greater than model max context length" requested=8192 model=2048
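Both warnings describe the server silently clamping the requested context length to the model's maximum. A minimal sketch of that behavior (an assumption inferred from the log text, not the actual server code, which is written in Go):

```python
def effective_ctx(requested: int, model_max: int) -> int:
    """Clamp the requested context length to the model maximum,
    mirroring the WARN messages above (assumed behavior)."""
    if requested > model_max:
        print(f"WARN: requested context length is greater than "
              f"model max context length ({requested} > {model_max})")
        return model_max
    return requested


print(effective_ctx(8192, 2048))  # 2048
```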

Also, the embeddings are not the same amongst these versions.

As @jimscard correctly pointed out, the nomic GGUF readme does mention something related to this as well, which might help:

llama.cpp will default to 2048 tokens of context with these files. To use the full 8192 tokens that Nomic Embed is benchmarked on, you will have to choose a context extension method. The original model uses Dynamic NTK-Aware RoPE scaling, but that is not currently available in llama.cpp. A combination of YaRN and linear scaling is an acceptable substitute.

Since this context length issue exists in v0.1.31 as well and is model-specific, I think it's not related to the different-embedding-values problem reported here and can be tracked separately in #3727.

Author
Owner

@youkefan18 commented on GitHub (May 6, 2024):

Guys, any luck here? I just bumped to v0.1.33 and this issue still exists.
Vectors embedded with 'mxbai-embed-large' in v0.1.26 are very different from those in v0.1.33.

Author
Owner

@deadbeef84 commented on GitHub (May 13, 2024):

I've pinpointed the issue to this commit: 5ec12cec6c097a4d3706edb0fa0e51f02dfc1b4c

Reference: github-starred/ollama#2333