[GH-ISSUE #3613] langchain embedding from remote server #64265

Closed
opened 2026-05-03 16:50:23 -05:00 by GiteaMirror · 5 comments

Originally created by @Ana0112 on GitHub (Apr 12, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3613

What is the issue?

I am using this [langchain](https://github.com/ollama/ollama/blob/main/docs/tutorials/langchainpy.md) tutorial to get embeddings.

Code:

```
from langchain.document_loaders import PyPDFDirectoryLoader  # import was missing in the original snippet
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import Chroma

loader = PyPDFDirectoryLoader("data")
data = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

oembed = OllamaEmbeddings(base_url="https://11aa-11-111-111-111.ngrok-free.app/:11434", model="nomic-embed-text")
```

Up to this point, the code works fine.

The following line throws an error:

```
vectorstore = Chroma.from_documents(documents=all_splits, embedding=oembed)
```

`ValueError: Error raised by inference API HTTP code: 404, 404 page not found`

I want to use these embeddings for RAG with Groq:

```
rag_template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
rag_prompt = ChatPromptTemplate.from_template(rag_template)
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
```

I am able to generate a collection using ChromaDB standalone (i.e. not using langchain) by following the [embedding-models](https://ollama.com/blog/embedding-models) blog post, but then I don't know how to use the generated embeddings with the Groq LLM for RAG.
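For reference, here is a minimal sketch (not from the original issue) of one way to wire the Chroma vectorstore into the chain above with a Groq model; `ChatGroq` lives in the separate `langchain-groq` package, and the model name is an assumption:

```python
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain_groq import ChatGroq  # pip install langchain-groq; needs GROQ_API_KEY set

# Turn the vectorstore built by Chroma.from_documents(...) into a retriever
retriever = vectorstore.as_retriever()

# Hypothetical model name; any chat model available on Groq works here
llm = ChatGroq(model_name="llama3-8b-8192")

rag_template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
rag_prompt = ChatPromptTemplate.from_template(rag_template)

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("What are the documents about?"))
```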

What did you expect to see?

Chroma DB should have generated the vectorstore.

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

No response

OS

Linux

Architecture

No response

Platform

No response

Ollama version

No response

GPU

No response

GPU info

No response

CPU

No response

Other software

No response

GiteaMirror added the bug label 2026-05-03 16:50:23 -05:00

@andrewnguonly commented on GitHub (Apr 14, 2024):

There might be a bug in the `base_url`. The port number `:11434` comes after a slash `/`, so the port isn't actually specified; it's interpreted as a path (hence the `404 page not found` error).

Try removing the slash `/`: `ngrok-free.app:11434`
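For illustration (not part of the original comment), the suggested call would be:

```python
# Port attached directly to the host, with no slash in between
oembed = OllamaEmbeddings(
    base_url="https://11aa-11-111-111-111.ngrok-free.app:11434",
    model="nomic-embed-text",
)
```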


@Ana0112 commented on GitHub (Apr 14, 2024):

ok, thanks for your response.

I tried without the slash, but I'm still getting an error:

```
oembed = OllamaEmbeddings(base_url="https://xxxx-xx-xxx.ngrok-free.app:11434", model="nomic-embed-text")
vectorstore = Chroma.from_documents(documents=all_splits, embedding=oembed)
```

`ValueError: Error raised by inference endpoint: HTTPSConnectionPool(host='xxx-xx-xxx.ngrok-free.app', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fcc71d09c30>: Failed to establish a new connection: [Errno 111] Connection refused'))`


@andrewnguonly commented on GitHub (Apr 14, 2024):

@Ana0112, take a look at [this comment](https://github.com/ollama/ollama/issues/2132#issuecomment-1904515289) and the discussion in that issue. You may need to set the `OLLAMA_HOST` environment variable appropriately.


@Ana0112 commented on GitHub (Apr 14, 2024):

I am doing

```
os.environ["OLLAMA_HOST"] = "https://xx.ngrok-free.app"
```

which gives the error above.

I couldn't work out what the public IP should be in my case for setting `OLLAMA_ORIGINS=http://public_ip:11434`.

I can see `ollama is running` in the browser at both of these URLs:

```
https://xx.ngrok-free.app/
http://0.0.0.0:11434/
```
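As an aside (not from the original thread): `OLLAMA_HOST` and `OLLAMA_ORIGINS` configure the Ollama *server*, so exporting them inside the Python client process does not change where the legacy `OllamaEmbeddings` sends requests; that is controlled solely by its `base_url` argument (default `http://localhost:11434`). A quick way to sanity-check the endpoint the client will hit, assuming the tunnel host above:

```python
import requests

# Hit the embeddings endpoint directly; a 404 or connection refusal here
# means the URL is wrong regardless of any OLLAMA_* variables set locally.
r = requests.post(
    "https://xx.ngrok-free.app/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello"},
)
print(r.status_code, r.text[:200])
```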

@jmorganca commented on GitHub (Apr 15, 2024):

Hi @Ana0112, for using Ollama with ngrok, see https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-use-ollama-with-ngrok

Hope this helps!
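For context, a sketch of the FAQ's approach (an illustration, not part of the original comment): start the tunnel with `ngrok http 11434 --host-header="localhost:11434"` so the Host header matches what Ollama expects, then point the client at the bare tunnel URL, since ngrok serves it on the standard HTTPS port rather than 11434:

```python
# Assumes the tunnel was started with:
#   ngrok http 11434 --host-header="localhost:11434"
# The client targets the tunnel URL itself, with no :11434 suffix.
oembed = OllamaEmbeddings(
    base_url="https://xxxx-xx-xxx.ngrok-free.app",
    model="nomic-embed-text",
)
```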

Reference: github-starred/ollama#64265