[GH-ISSUE #3553] Embedding endpoint not available on windows. #48705

Closed
opened 2026-04-28 09:07:01 -05:00 by GiteaMirror · 3 comments

Originally created by @elblogbruno on GitHub (Apr 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3553

What is the issue?

I installed the latest Windows version of Ollama (v0.1.31) and I can't use the new embedding functionality.
For example, the URL http://localhost:11434/api/embeddings gives me a 404 Not Found.
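One thing worth noting about the 404: Ollama's /api/embeddings endpoint accepts only POST requests, so opening the URL in a browser (which issues a GET) returns 404 even when the server is running. A minimal stdlib-only sketch of how to exercise the endpoint correctly (`build_embeddings_request` and `post_embeddings` are hypothetical helper names, and a local server with the model pulled is assumed):

```python
import json
import urllib.request


def build_embeddings_request(model: str, prompt: str) -> dict:
    """Build the JSON body expected by POST /api/embeddings."""
    return {"model": model, "prompt": prompt}


def post_embeddings(host: str, model: str, prompt: str) -> dict:
    """Send a POST (not GET) to /api/embeddings and decode the JSON reply."""
    body = json.dumps(build_embeddings_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Usage (requires a running Ollama server):
#   result = post_embeddings("http://localhost:11434", "mxbai-embed-large", "hello")
#   vector = result["embedding"]
```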

The above exception was the direct cause of the following exception:


Traceback (most recent call last):
  File "d:\Desktop\Proyectos\OllamaPi\langchain-python-rag-websummary\play_store_scrapping\embeddings.py", line 19, in <module>
    response = ollama.embeddings(model="mxbai-embed-large", prompt=d)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\ollama\_client.py", line 183, in embeddings
    return self._request(
           ^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\ollama\_client.py", line 53, in _request
    response = self._client.request(method, url, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_client.py", line 814, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_client.py", line 901, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_transports\default.py", line 227, in handle_request
    with map_httpcore_exceptions():
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "C:\Users\elblo\AppData\Local\Programs\Python\Python312\Lib\site-packages\httpx\_transports\default.py", line 83, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [WinError 10049] La dirección solicitada no es válida en este contexto
(Translation: "The requested address is not valid in this context")

What did you expect to see?

No response

Steps to reproduce

Access http://localhost:11434/api/embeddings

or run sample code:

import ollama
import chromadb # ChromaDB is a vector embedding database

documents = [
  "Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels",
  "Llamas were first domesticated and used as pack animals 4,000 to 5,000 years ago in the Peruvian highlands",
  "Llamas can grow as much as 6 feet tall though the average llama between 5 feet 6 inches and 5 feet 9 inches tall",
  "Llamas weigh between 280 and 450 pounds and can carry 25 to 30 percent of their body weight",
  "Llamas are vegetarians and have very efficient digestive systems",
  "Llamas live to be about 20 years old, though some only live for 15 years and others live to be 30 years old",
]

client = chromadb.Client()
collection = client.create_collection(name="docs")

 
# store each document in a vector embedding database
for i, d in enumerate(documents):
  response = ollama.embeddings(model="mxbai-embed-large", prompt=d)
  embedding = response["embedding"]
  collection.add(
    ids=[str(i)],
    embeddings=[embedding],
    documents=[d]
  )

# an example prompt
prompt = "What animals are llamas related to?"

# generate an embedding for the prompt and retrieve the most relevant doc
response = ollama.embeddings(
  prompt=prompt,
  model="mxbai-embed-large"
)
results = collection.query(
  query_embeddings=[response["embedding"]],
  n_results=1
)
data = results['documents'][0][0]

# generate a response combining the prompt and data we retrieved in step 2
output = ollama.generate(
  model="llama2",
  prompt=f"Using this data: {data}. Respond to this prompt: {prompt}"
)

print(output['response'])

Are there any recent changes that introduced the issue?

No response

OS

Windows

Architecture

x86

Platform

No response

Ollama version

0.1.31

GPU

Nvidia

GPU info

NVIDIA GeForce GTX 1060 6GB

CPU

Intel

Other software

No response

GiteaMirror added the bug label 2026-04-28 09:07:01 -05:00

@elblogbruno commented on GitHub (Apr 9, 2024):

I had previously set OLLAMA_HOST to 0.0.0.0, so I needed to create a client with my PC's actual IP address to make it work.

client_ollama = Client(host='http://192.168.1.xx:11434')
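This workaround makes sense because the ollama Python client also reads OLLAMA_HOST: with it set to 0.0.0.0 (a bind address for the server, not a connectable destination), the client tries to connect to 0.0.0.0 and Windows rejects that with WinError 10049. A minimal sketch of picking a connectable host (`client_host_from_env` is a hypothetical helper, not part of the ollama package):

```python
import os


def client_host_from_env(default: str = "http://127.0.0.1:11434") -> str:
    """Pick a host the client can actually connect to.

    0.0.0.0 is valid for the server to bind to, but not to connect to,
    so fall back to loopback in that case.
    """
    host = os.environ.get("OLLAMA_HOST", "").strip()
    if not host or "0.0.0.0" in host:
        return default
    if not host.startswith("http"):
        host = "http://" + host
    return host


# Usage (assumes the ollama package is installed):
#   from ollama import Client
#   client_ollama = Client(host=client_host_from_env())
#   client_ollama.embeddings(model="mxbai-embed-large", prompt="hello")
```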


@Dr-Corgi commented on GitHub (Aug 15, 2024):

Same problem


@jclauzel commented on GitHub (Feb 11, 2025):

That's because this line:

query_embeddings=[response["embedding"]],

should instead be:

query_embeddings=response["embeddings"],
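For context on that suggestion: the older `ollama.embeddings()` call returns a single vector under the `"embedding"` key, while the newer `ollama.embed()` returns a list of vectors under `"embeddings"`, and ChromaDB's `query_embeddings` expects a list of vectors in both cases. A hedged sketch of handling either response shape (`as_query_embeddings` is a hypothetical helper, not part of either library):

```python
def as_query_embeddings(response: dict) -> list:
    """Return a list of vectors regardless of which response shape we got."""
    if "embeddings" in response:        # newer embed() shape: list of vectors
        return list(response["embeddings"])
    if "embedding" in response:         # older embeddings() shape: one vector
        return [response["embedding"]]
    raise KeyError("no embedding(s) key in response")


# Usage with ChromaDB:
#   results = collection.query(
#       query_embeddings=as_query_embeddings(response),
#       n_results=1,
#   )
```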


Reference: github-starred/ollama#48705