[GH-ISSUE #465] deployment on ubuntu server with nginx causes 504 timeout #216

Closed
opened 2026-04-12 09:44:17 -05:00 by GiteaMirror · 1 comment

Originally created by @helxsz on GitHub (Sep 3, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/465

I am running an Ollama server with llama2 on an Ubuntu cloud server.

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'

The response starts streaming within 3 seconds:

{"model":"llama2","created_at":"2023-09-01T22:35:43.271181437Z","response":" The","done":false}
....
....
However, public traffic goes through an nginx server in front of this Ollama server, and every request to the service via the cloud IP address ends in a 504 timeout.

Raising the nginx timeouts does not help:

server {
        listen 9003;
        server_name xx.xx.xx.xx;

        location / {
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header   Host      $http_host;
                proxy_set_header X-Forwarded-Proto $scheme;

                proxy_read_timeout 150000;
                proxy_connect_timeout 150000;
                proxy_send_timeout 150000;
        }
}

A timeout of 150000 should be more than enough, since Ollama starts responding within 3 seconds.

Any ideas on what is wrong with this deployment?


@helxsz commented on GitHub (Sep 4, 2023):

I investigated further. The response is sent as an 'event-stream':

 header:{
  'Content-Type': 'text/event-stream'
}

Therefore, nginx needs the following settings.

proxy_http_version 1.1; (HTTP/1.1 supports long-lived connections)

       proxy_set_header Connection 'keep-alive';
       proxy_set_header Cache-Control 'no-cache';
       proxy_set_header Content-Type 'text/event-stream';

proxy_buffering off; (disables buffering so the response is streamed to the client in real time)
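
For anyone landing here later, the directives above can be put together with the original server block roughly as follows. This is only a sketch, not a verified config: the proxy_pass line is an assumption (the config in the original post does not show one), and its upstream address is taken from the curl test against localhost:11434.

```nginx
server {
    listen 9003;
    server_name xx.xx.xx.xx;

    location / {
        # Assumption: Ollama listens on localhost:11434, as in the curl test.
        # The config in the original post does not include a proxy_pass line.
        proxy_pass http://localhost:11434;

        # Headers from the original config.
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Streaming-related settings from the follow-up comment.
        proxy_http_version 1.1;
        proxy_set_header Connection 'keep-alive';
        proxy_set_header Cache-Control 'no-cache';
        proxy_set_header Content-Type 'text/event-stream';
        proxy_buffering off;

        # Timeouts from the original config (nginx reads bare numbers as seconds).
        proxy_read_timeout 150000;
        proxy_connect_timeout 150000;
        proxy_send_timeout 150000;
    }
}
```

After reloading nginx, repeating the curl request against port 9003 with the -N (no buffering) option should show whether tokens now arrive incrementally through the proxy.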


Reference: github-starred/ollama#216