[GH-ISSUE #2053] Request -> Remote server deployment tutorial w/ API access for AI apps #1188

Closed
opened 2026-04-12 10:58:10 -05:00 by GiteaMirror · 15 comments

Originally created by @squatchydev9000 on GitHub (Jan 18, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2053

Hey Ollama team, thanks for all that you guys are doing.

Question/Request: can you please demonstrate how we can deploy Ollama to a remote server? I have it installed over SSH, but I cannot, for the life of me, figure out how to expose it as an API I can use with autogen/crewai/superagi/etc...

I bet many are also stuck here. Sure we can get things going locally, but almost no one actually owns an M3 Mac to run things locally... so local dev is tough... and for production AI apps we need an API solution for a remote Ollama install...

I believe the world needs Ollama and open-source options more than ever as the big corporations are pushing us towards the abyss... an API/Deployment tutorial or package would be the keystone in protecting humanity from the big corps...


@easp commented on GitHub (Jan 18, 2024):

> but almost no one actually owns an m3 mac to run things locally

You don't need an M3, or a Mac, to run things locally. Lots of people run Ollama locally on PCs.

If you want to expose the ollama service beyond localhost you can [refer to the FAQ](https://github.com/jmorganca/ollama/blob/main/docs/faq.md#how-can-i-expose-ollama-on-my-network). You should be aware that the ollama API doesn't have any authentication or encryption, so you'll either want to run it behind a reverse proxy that implements those things or use a VPN (Tailscale is easy to set up).
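
A minimal sketch of the VPN route mentioned above, assuming Tailscale on both the server and your local machine (the tailnet IP below is a placeholder, and Ollama still has to bind to a non-loopback address for this to work):

```bash
# Install and bring up Tailscale on BOTH machines:
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# From your local machine, reach Ollama via the server's tailnet IP
# (find it with `tailscale ip -4`; 100.64.0.1 here is a placeholder):
curl http://100.64.0.1:11434/api/tags
```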


@mxyng commented on GitHub (Jan 18, 2024):

Here's a concrete [example](https://github.com/jmorganca/ollama/tree/main/examples/kubernetes) of running Ollama in Kubernetes. There are example configurations for both CPU and GPU inference.
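
A rough sketch of applying the linked example, assuming a working cluster and kubectl; the file names and the `ollama` namespace/service are taken from that example directory at the time and may have changed since:

```bash
git clone https://github.com/jmorganca/ollama.git
cd ollama/examples/kubernetes
kubectl apply -f cpu.yaml            # or gpu.yaml for GPU inference
# Tunnel the in-cluster service to your workstation for testing:
kubectl -n ollama port-forward service/ollama 11434:80
```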


@squatchydev9000 commented on GitHub (Jan 18, 2024):

Thanks guys, I appreciate it. I also posted to Matt's YouTube channel... he replied that this is a good idea, so here's hoping for a tutorial direct from the Ollama team. :)
Cheers.


@squatchydev9000 commented on GitHub (Jan 19, 2024):

Need to add for clarity: I am struggling to access my remote Ubuntu Linux Ollama install from anything other than SSH.

Need guidance on connecting to my remote Linux/Ubuntu server... all I have is a public IP... requests time out no matter what URL string I try...


@squatchydev9000 commented on GitHub (Jan 19, 2024):

I am still unable to find a clear set of instructions or a tutorial for connecting to the static public IP of my hosted Ubuntu/Linux Ollama install with anything other than SSH in my terminal...

Anyone have a way to get past the 'Request Timed Out' error... or connection advice?
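
For anyone hitting the same wall: timeouts against a bare public IP are usually a firewall or a cloud provider's security group rather than Ollama itself. A quick first check, assuming Ubuntu's stock ufw (cloud firewalls are configured separately in the provider's console):

```bash
sudo ufw status verbose
sudo ufw allow 11434/tcp   # only if you genuinely intend to expose the port
```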


@easp commented on GitHub (Jan 19, 2024):

I'd suggest familiarizing yourself more with firewalls and SSH tunneling / port forwarding. Your hosting provider may have some introductory resources.

You really should NOT be exposing services to the internet at large without some understanding of network and system security. SSH tunneling is probably the simplest way to make a remote Ollama install accessible to something running on your local machine.
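
A minimal sketch of that tunnel, assuming Ollama listens on its default port 11434 and `user@203.0.113.5` is a placeholder for your real login:

```bash
# Forward local port 11434 to the remote server's loopback interface:
ssh -N -L 11434:localhost:11434 user@203.0.113.5

# While the tunnel is up, the remote Ollama answers as if it were local:
curl http://localhost:11434/api/tags
```

Because the tunnel terminates on the server's loopback, this works even with Ollama still bound to 127.0.0.1, and nothing is exposed to the internet.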


@squatchydev9000 commented on GitHub (Jan 19, 2024):

I followed the binding info from the faq.md file for Linux to a 'T'...

After a few hours, lots of ChatGPT, and lots of editing the environment variables... it still binds to 127.0.0.1 when it restarts (it shows :: or 0.0.0.0 when I check the status right after the changes and a daemon reload, etc.).

I simply cannot get it to stay on 0.0.0.0...

Is Ollama not suitable as a production-ready LLM runner for my apps? Is it strictly a tool for running models locally and/or remotely direct to your machine via SSH tunneling?
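
For reference, the FAQ's Linux instructions boil down to a systemd override, which is likely why shell `export`s kept reverting here: the variable must live in the service's environment, not your login shell. A minimal sketch, assuming the installer's default unit name `ollama.service`:

```bash
sudo systemctl edit ollama.service
# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl daemon-reload
sudo systemctl restart ollama
```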


@squatchydev9000 commented on GitHub (Jan 19, 2024):

The people need an alternative to OpenAI... the only way I see is Ollama, so we can utilize open-source models for various tasks within our AI apps... but ALL attempts to deploy have been a giant waste of time so far...

Has anyone out there successfully deployed Ollama to a remote server for production, scalable apps?


@easp commented on GitHub (Jan 19, 2024):

Please share the output of `sudo netstat -ltnp`. This will run netstat as superuser and tell netstat to show listening sockets (l) on TCP (t), using numeric representations of IP and port addresses (n), and to list the processes behind those listening sockets (p). You can obfuscate any IP addresses you don't want to be public.
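
On newer minimal Ubuntu images netstat may be absent; `ss` accepts the same flags:

```bash
sudo ss -ltnp
# A Local Address of 127.0.0.1:11434 means Ollama is loopback-only;
# 0.0.0.0:11434 or [::]:11434 means it is listening on all interfaces.
```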


@remy415 commented on GitHub (Jan 31, 2024):

@squatchydev9000 The ollama-python repo has a tutorial for interacting with the API using Python, and there's one for JS on their JS repo:

https://github.com/ollama/ollama-python
https://github.com/ollama/ollama-js

They also have REST API documentation:
https://github.com/ollama/ollama/blob/main/docs/api.md

FAQ section covers exposing the interface to remote machines:
https://github.com/ollama/ollama/blob/main/docs/faq.md

Set an env variable to tell Ollama which interface to bind to:

`OLLAMA_HOST="0.0.0.0"`

You can also update the allowed origins:

`OLLAMA_ORIGINS="172.16.4.20"`

This should allow you to remotely access ollama serve via the API. There are a lot of tutorials out there for deploying apps via Docker, Kubernetes, or through API packages such as Flask, FastAPI, Django, etc. Without knowing your current experience level, it would be difficult to point you to an appropriate tutorial/guide. Feel free to reach out if you need help with anything.
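
Once the variable is in place, a minimal end-to-end test from another machine might look like the following (assuming `203.0.113.5` is a placeholder for the server's address, port 11434/tcp is reachable, and `gemma:2b` has been pulled on the server):

```bash
curl http://203.0.113.5:11434/api/generate -d '{
  "model": "gemma:2b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```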


@pdevine commented on GitHub (Mar 11, 2024):

As @remy415 mentioned, you should be able to start the ollama server with the `OLLAMA_HOST=0.0.0.0` environment variable. I think the question has been answered, so I'm going to go ahead and close the issue. Please feel free to reopen or just keep commenting.
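
Concretely, when running the server by hand rather than via systemd, that looks like the one-liner below; note the variable must be set in the server process's environment, not the client's shell:

```bash
OLLAMA_HOST=0.0.0.0 ollama serve
```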


@danielbisca commented on GitHub (Apr 7, 2024):

Hello, I also had the same problem. I installed Ollama on a dedicated VM (CPU only), then was struggling with API requests (crewai etc.) from the other machines on the internal network.

The solution is pretty simple though: log in to the host server and, once the models are running, type:

```
export OPENAI_API_BASE=http://localhost:11434/v1
```

I also use SerperDev, since I had no luck with DuckDuckGo, and did the same (note the variable name crewai's SerperDevTool reads is SERPER_API_KEY):

```
export SERPER_API_KEY=your_serperkey
```

On the code side I have:

```python
import os

from langchain_openai import ChatOpenAI
from crewai_tools import SerperDevTool

os.environ["SERPER_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxc"

search_tool = SerperDevTool()

gemma2b = ChatOpenAI(
    openai_api_base='http://192.168.0.15:11434/v1',  # IP of the Ollama server
    openai_api_key='NA',
    model_name='gemma:2b',
)
```



@scattter commented on GitHub (Apr 9, 2024):

> @squatchydev9000 The ollama-python repo has a tutorial for interacting with the API using Python, and there's one for JS on their JS repo:
>
> https://github.com/ollama/ollama-python https://github.com/ollama/ollama-js
>
> They also have REST API documentation: https://github.com/ollama/ollama/blob/main/docs/api.md
>
> FAQ section covers exposing the interface to remote machines: https://github.com/ollama/ollama/blob/main/docs/faq.md
>
> Set an env variable to tell Ollama which interface to bind to:
>
> `OLLAMA_HOST="0.0.0.0"`
>
> You can also update the allowed origins: `OLLAMA_ORIGINS="172.16.4.20"`
>
> This should allow you to remotely access ollama serve via the API. There are a lot of tutorials out there for deploying apps via Docker, Kubernetes, or through API packages such as Flask, FastAPI, Django, etc. Without knowing your current experience level, it would be difficult to point you to an appropriate tutorial/guide. Feel free to reach out if you need help with anything.

Very useful!


@DOMINION-JOHN1 commented on GitHub (May 29, 2024):

> Hey Ollama team, thanks for all that you guys are doing.
>
> Question/Request: can you please demonstrate how we can deploy Ollama to a remote server? I have it installed over SSH, but I cannot, for the life of me, figure out how to expose it as an API I can use with autogen/crewai/superagi/etc...
>
> I bet many are also stuck here. Sure we can get things going locally, but almost no one actually owns an M3 Mac to run things locally... so local dev is tough... and for production AI apps we need an API solution for a remote Ollama install...
>
> I believe the world needs Ollama and open-source options more than ever as the big corporations are pushing us towards the abyss... an API/Deployment tutorial or package would be the keystone in protecting humanity from the big corps...

Very true, I have the same issue here. For any PC to run Ollama locally, the system needs to be at least somewhat powerful.


@DOMINION-JOHN1 commented on GitHub (May 29, 2024):

I love how easy it is to use Ollama locally. I wish the same were available for production deployment, just as OpenAI has it by simply calling their API.
