[GH-ISSUE #7242] api/embed endpoint is not working exactly like api/embeddings endpoint #30361

Closed
opened 2026-04-22 09:56:09 -05:00 by GiteaMirror · 2 comments

Originally created by @hdnh2006 on GitHub (Oct 17, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7242

What is the issue?

Hi guys!

Thanks for this fantastic tool. This is the first time I've used embeddings with Ollama; until now I had only tried LLM inference, and I realized there is a big difference between the /embed and /embeddings endpoints.

I understand that /embeddings is deprecated, but it is a common endpoint used by OpenAI-compatible applications, and I am getting these results:


```python
import requests

url = 'http://192.168.100.60:11434/api/embeddings'
data = {
    "model": "bge-large",
    "input": ["Why is the sky blue?", "Why is the grass green?"]
}

response = requests.post(url, json=data)
print(response.json())
# {'embedding': []}
```

I expected to get the same results as from the /embed endpoint:


```python
import requests

url = 'http://192.168.100.60:11434/api/embed'
data = {
    "model": "bge-large",
    "input": ["Why is the sky blue?", "Why is the grass green?"]
}

response = requests.post(url, json=data)
print(response.json())
# {'model': 'bge-large', 'embeddings': [[0.0013820849, -0.0021153516, 0.026703326, -0.019793263, -0.028357966, -0.0038778095, -0.0074022487, 0.026467843, 0.0390734, 0.064318694, -0.029173901, -0.016984569, 0.02871944, ...], [-0.009015707, -0.007080707, 0.012702042, 0.026064986, 0.0032204844, 0.03592011, 0.025629202, -3.0089186e-05, 0.030164663, 0.06953627, 0.02197145, -0.019186007, 0.031450707, 0.004626421, -0.01037141, 0.0134581905, ...]], 'total_duration': 351441147, 'load_duration': 1014270, 'prompt_eval_count': 12}
```

Am I doing something wrong? I need to call the /embeddings endpoint because my OpenAI-compatible application requires it.

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

ollama version is 0.3.12

GiteaMirror added the bug label 2026-04-22 09:56:09 -05:00

@rick-github commented on GitHub (Oct 18, 2024):

The /api/embeddings endpoint uses prompt, not input. But it's deprecated and returns different values. If you want OpenAI compatibility, use the [OpenAI compatibility endpoint](https://github.com/ollama/ollama/blob/main/docs/openai.md#v1embeddings), /v1/embeddings.
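A minimal sketch of the payload differences described above; the host and model name are taken from the issue, not defaults, and the exact field semantics should be confirmed against the Ollama API docs:

```python
# The deprecated /api/embeddings endpoint expects a single string under
# "prompt". Sending "input" instead is not recognized, which is why the
# issue's first snippet returns {'embedding': []}.
legacy_payload = {
    "model": "bge-large",
    "prompt": "Why is the sky blue?",
}

# The newer /api/embed endpoint expects "input", which may be a single
# string or a list of strings (batched embedding).
embed_payload = {
    "model": "bge-large",
    "input": ["Why is the sky blue?", "Why is the grass green?"],
}

# The OpenAI-compatible endpoint /v1/embeddings also takes "input", so
# an unmodified OpenAI-style client can be pointed at it directly, e.g.:
#   requests.post('http://192.168.100.60:11434/v1/embeddings', json=openai_payload)
openai_payload = {
    "model": "bge-large",
    "input": ["Why is the sky blue?", "Why is the grass green?"],
}
```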


@hdnh2006 commented on GitHub (Oct 18, 2024):

Thank you!!! @rick-github

Reference: github-starred/ollama#30361