[GH-ISSUE #12270] inside tmux session: ollama serve broken proxy support to pull models ( HTTPS_PROXY=http://myproxy:3128 ollama serve ). works in normal terminal outside of tmux #33919

Closed
opened 2026-04-22 17:06:05 -05:00 by GiteaMirror · 9 comments
Owner

Originally created by @neurostream on GitHub (Sep 12, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12270

What is the issue?

INSIDE OF A TMUX SESSION:

curl uses the proxy correctly, but "ollama serve" returns "no route to host" when an "ollama pull" is requested:

worker@msm3u1 Downloads % ollama pull gpt-oss:120b  
pulling manifest 
Error: pull model manifest: Get "https://registry.ollama.ai/v2/library/gpt-oss/manifests/120b": proxyconnect tcp: dial tcp myproxy:3128: connect: no route to host

Relevant log output

curl uses the same HTTPS_PROXY env var set for ollama serve and fully proxies (this example pulls the same ollama.ai registry target that "ollama serve" started failing to reach a couple of updates ago):
---
worker@msm3u1 Downloads % curl -v https://registry.ollama.ai/v2/library/gpt-oss/manifests/120b 
* Uses proxy env variable https_proxy == 'http://myproxy:3128'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying myproxy:3128...
* Connected to myproxy (myproxy) port 3128
* CONNECT tunnel: HTTP/1.1 negotiated
* allocate connect buffer
* Establish HTTP proxy tunnel to registry.ollama.ai:443
> CONNECT registry.ollama.ai:443 HTTP/1.1
> Host: registry.ollama.ai:443
> User-Agent: curl/8.7.1
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 200 Connection established
< 
* CONNECT phase completed
* CONNECT tunnel established, response 200
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [323 bytes data]
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [19 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [2524 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [79 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [36 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [36 bytes data]
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=ollama.ai
*  start date: Aug  3 12:44:37 2025 GMT
*  expire date: Nov  1 13:41:51 2025 GMT
*  subjectAltName: host "registry.ollama.ai" matched cert's "*.ollama.ai"
*  issuer: C=US; O=Google Trust Services; CN=WE1
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://registry.ollama.ai/v2/library/gpt-oss/manifests/120b
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: registry.ollama.ai]
* [HTTP/2] [1] [:path: /v2/library/gpt-oss/manifests/120b]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> GET /v2/library/gpt-oss/manifests/120b HTTP/2
> Host: registry.ollama.ai
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 200 
< date: Fri, 12 Sep 2025 18:26:19 GMT
< content-type: text/plain; charset=utf-8
< content-length: 903
< via: 1.1 google
< alt-svc: h3=":443"; ma=86400
< cf-cache-status: DYNAMIC
< nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
< report-to: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=xg3xqi1Wa2np4UPYlHVnUT78pP5h1CMImTrPaNaotzCu6YQHN0yVw83TN2Cvt7HdATaWzJJLJTelLiE%2FAVrIv1QGOihrhmlTEUlt%2F8Pb26LAzQ%3D%3D"}]}
< server: cloudflare
< cf-ray: 97e173d63c8d1032-LAX
< 
{ [903 bytes data]
100   903  100   903    0     0   3904      0 --:--:-- --:--:-- --:--:--  3909
* Connection #0 to host myproxy left intact
{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json","config":{"mediaType":"application/vnd.docker.container.image.v1+json","digest":"sha256:af6975b0c34cc6df988617dcbcbfc159e10aa5ea888f20ec162f5ee94f355a4a","size":490},"layers":[{"mediaType":"application/vnd.ollama.image.model","digest":"sha256:90a618fe6ff21b09ca968df959104eb650658b0bef0faef785c18c2795d993e3","size":65290172992,"from":"gpt-oss:120b"},{"mediaType":"application/vnd.ollama.image.template","digest":"sha256:51468a0fd901ba85884effc057a361f6dd33e4b3c99ead45f2673d2fd79a8943","size":7355,"from":"gpt-oss:120b"},{"mediaType":"application/vnd.ollama.image.license","digest":"sha256:f60356777647e927149cbd4c0ec1314a90caba9400ad205ddc4ce47ed001c2d6","size":11353},{"mediaType":"application/vnd.ollama.image.params","digest":"sha256:d8ba2f9a17b3bbdeb5690efaa409b3fcb0b56296a777c7a69c78aa33bbddf182","size":18}]}%                                                                                                                                                                                                         worker@msm3u1 Downloads %

If I run ollama serve the same way, but NOT IN A TMUX SESSION, then the ollama pull works.
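One quick check (just a sketch; "myproxy" is the proxy host from above) is whether the proxy-related environment the tmux pane sees differs from what a plain shell sees:

```sh
# compare proxy-related environment between tmux and the current shell
tmux show-environment -g | grep -i proxy   # tmux's global session environment
env | grep -i proxy                        # what this shell actually exports
```

If the two disagree, tmux's update-environment handling would be a first suspect.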

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.11.10

GiteaMirror added the bug label 2026-04-22 17:06:05 -05:00
Author
Owner

@rick-github commented on GitHub (Sep 12, 2025):

Seems to work fine here.

$ export HTTPS_PROXY=http://proxy-au:8080
$ export OLLAMA_HOST=:11444
$ ollama serve &
[1] 3037317
time=2025-09-12T21:28:35.904+02:00 level=INFO source=routes.go:1331 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY:http://proxy-au:8080 HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://:11444 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/rick/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NEW_ESTIMATES:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-09-12T21:28:35.909+02:00 level=INFO source=images.go:477 msg="total blobs: 13"
time=2025-09-12T21:28:35.909+02:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-09-12T21:28:35.909+02:00 level=INFO source=routes.go:1384 msg="Listening on [::]:11444 (version 0.11.10)"
time=2025-09-12T21:28:35.909+02:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-09-12T21:28:35.995+02:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-b5d7e56c-4491-8eeb-cb2d-e8d8424e5bb7 library=cuda variant=v12 compute=8.9 driver=12.4 name="NVIDIA GeForce RTX 4070" total="11.7 GiB" available="11.6 GiB"
time=2025-09-12T21:28:35.995+02:00 level=INFO source=routes.go:1425 msg="entering low vram mode" "total vram"="11.7 GiB" threshold="20.0 GiB"

$ ollama -v
[GIN] 2025/09/12 - 21:29:53 | 200 |      68.444µs |       127.0.0.1 | GET      "/api/version"
ollama version is 0.11.10

$ ollama pull qwen2.5:0.5b
[GIN] 2025/09/12 - 21:29:08 | 200 |      80.007µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/09/12 - 21:29:11 | 200 |  2.402010546s |       127.0.0.1 | POST     "/api/pull"

pulling manifest ⠸
pulling manifest
pulling c5396e06af29: 100% ▕███████████████████████████████████████████████████████████████████▏ 397 MB
pulling 66b9ea09bd5b: 100% ▕███████████████████████████████████████████████████████████████████▏   68 B
pulling eb4402837c78: 100% ▕███████████████████████████████████████████████████████████████████▏ 1.5 KB
pulling 832dd9e00a68: 100% ▕███████████████████████████████████████████████████████████████████▏  11 KB
pulling 005f95c74751: 100% ▕███████████████████████████████████████████████████████████████████▏  490 B
verifying sha256 digest
writing manifest
success

$ ollama list
[GIN] 2025/09/12 - 21:32:53 | 200 |      47.142µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/09/12 - 21:32:53 | 200 |    2.796996ms |       127.0.0.1 | GET      "/api/tags"
NAME               ID              SIZE      MODIFIED
qwen2.5:0.5b       a8b0c5157701    397 MB    3 minutes ago
The error from the original report:

Error: pull model manifest: Get "https://registry.ollama.ai/v2/library/gpt-oss/manifests/120b": proxyconnect tcp: dial tcp myproxy:3128: connect: no route to host

I think this is complaining about not being able to connect to the proxy, not about connecting to ollama.ai.
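That dial can be tested on its own, without ollama, from the failing tmux pane (a sketch, using the host/port from the report):

```sh
# reproduce just the TCP dial that "proxyconnect" performs
nc -vz myproxy 3128                   # raw TCP connect to the proxy
dscacheutil -q host -a name myproxy   # how the macOS resolver sees "myproxy"
```

If nc fails the same way there while succeeding in a plain terminal, the problem is the dial itself rather than anything ollama-specific.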

Author
Owner

@rick-github commented on GitHub (Sep 12, 2025):

My example above was run in a tmux session.

Author
Owner

@neurostream commented on GitHub (Sep 12, 2025):

Using the same startup script (which sets my proxy env vars and all my OLLAMA_ env vars), which I don't launch into the background (&), so the tty stays attached and shows the "ollama serve" stdout and stderr:

  • if the ollama serve start script is run inside a tmux-wrapped terminal, an "ollama pull" request fails
  • if the ollama serve start script is run in a plain terminal, an "ollama pull" request succeeds

Local inference over the LAN works in both startup scenarios.

Author
Owner

@rick-github commented on GitHub (Sep 12, 2025):

What's the startup script?

Author
Owner

@neurostream commented on GitHub (Sep 12, 2025):

worker@msm3u1 .ollama % cat startup.sh


sudo sysctl iogpu.wired_limit_mb=$(echo "496*1024" | bc)

OLLAMA_FLASH_ATTENTION=1 OLLAMA_SCHED_SPREAD=1 OLLAMA_CONTEXT_LENGTH=$( echo "192*1024" | bc ) OLLAMA_NOPRUNE=0 OLLAMA_NEW_ENGINE=1 OLLAMA_NOHISTORY=1 OLLAMA_LOAD_TIMEOUT=3600 OLLAMA_KEEP_ALIVE="-1m" OLLAMA_MODELS=${HOME}/.ollama/models OLLAMA_HOST=0.0.0.0:11434 OLLAMA_NUM_PARALLEL=4 OLLAMA_KV_CACHE_TYPE=q8_0 HTTPS_PROXY=http://myproxy:3128 HTTP_PROXY=http://myproxy:3128 https_proxy=${HTTPS_PROXY} http_proxy=${HTTP_PROXY} NO_PROXY="192.168.0.0/16,127.0.0.0/8,10.0.0.0/8" no_proxy=${NO_PROXY} ALL_PROXY=http://myproxy:3128 all_proxy=${ALL_PROXY} ollama serve 2>&1 

Author
Owner

@neurostream commented on GitHub (Sep 12, 2025):

I'll try it on my Linux ollama host to see if it's specific to macOS.

Author
Owner

@rick-github commented on GitHub (Sep 12, 2025):

tmux session with horizontal split, top pane:

$ ./12270.sh
time=2025-09-12T22:35:38.904+02:00 level=INFO source=routes.go:1331 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY:http://proxy-au:8080 HTTP_PROXY:http://proxy-au:8080 NO_PROXY:192.168.0.0/16,127.0.0.0/8,10.0.0.0/8 OLLAMA_CONTEXT_LENGTH:196608 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:2562047h47m16.854775807s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:1h0m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/rick/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:true OLLAMA_NEW_ESTIMATES:false OLLAMA_NOHISTORY:true OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:4 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES: http_proxy:http://proxy-au:8080 https_proxy:http://proxy-au:8080 no_proxy:192.168.0.0/16,127.0.0.0/8,10.0.0.0/8]"
time=2025-09-12T22:35:38.907+02:00 level=INFO source=images.go:477 msg="total blobs: 10"
time=2025-09-12T22:35:38.907+02:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-09-12T22:35:38.907+02:00 level=INFO source=routes.go:1384 msg="Listening on [::]:11434 (version 0.11.10)"
time=2025-09-12T22:35:38.908+02:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-09-12T22:35:39.439+02:00 level=INFO source=gpu.go:613 msg="no nvidia devices detected by library /usr/lib/x86_64-linux-gnu/libcuda.so.575.57.08"
time=2025-09-12T22:35:39.912+02:00 level=INFO source=amd_linux.go:390 msg="amdgpu is supported" gpu=0 gpu_type=gfx1151
time=2025-09-12T22:35:39.917+02:00 level=INFO source=types.go:131 msg="inference compute" id=0 library=rocm variant="" compute=gfx1151 driver=6.12 name=1002:1586 total="96.0 GiB" available="95.8 GiB"
[GIN] 2025/09/12 - 22:36:16 | 200 |   11.994387ms |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/09/12 - 22:36:21 | 200 |      59.058µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/09/12 - 22:36:21 | 200 |   13.416084ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/09/12 - 22:36:52 | 200 |      25.429µs |       127.0.0.1 | HEAD     "/"
time=2025-09-12T22:36:55.815+02:00 level=INFO source=download.go:177 msg="downloading c5396e06af29 in 4 100 MB part(s)"
time=2025-09-12T22:37:29.871+02:00 level=INFO source=download.go:177 msg="downloading 66b9ea09bd5b in 1 68 B part(s)"
time=2025-09-12T22:37:31.905+02:00 level=INFO source=download.go:177 msg="downloading eb4402837c78 in 1 1.5 KB part(s)"
time=2025-09-12T22:37:33.939+02:00 level=INFO source=download.go:177 msg="downloading 832dd9e00a68 in 1 11 KB part(s)"
time=2025-09-12T22:37:35.971+02:00 level=INFO source=download.go:177 msg="downloading 005f95c74751 in 1 490 B part(s)"
[GIN] 2025/09/12 - 22:37:37 | 200 | 44.863944786s |       127.0.0.1 | POST     "/api/pull"
[GIN] 2025/09/12 - 22:37:42 | 200 |      44.379µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/09/12 - 22:37:42 | 200 |    1.016652ms |       127.0.0.1 | GET      "/api/tags"

Bottom pane:

$ ollama -v
ollama version is 0.11.10
$ ollama list
NAME                ID              SIZE      MODIFIED
deepseek-r1:14b     c333b7232bdb    9.0 GB    2 months ago
qwen3:14b-q4_K_M    bdbd181c33f2    9.3 GB    2 months ago
$ ollama pull qwen2.5:0.5b
pulling manifest
pulling c5396e06af29: 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 397 MB
pulling 66b9ea09bd5b: 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏   68 B
pulling eb4402837c78: 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.5 KB
pulling 832dd9e00a68: 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  11 KB
pulling 005f95c74751: 100% ▕█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏  490 B
verifying sha256 digest
writing manifest
success
$ ollama list
NAME                ID              SIZE      MODIFIED
qwen2.5:0.5b        a8b0c5157701    397 MB    4 seconds ago
deepseek-r1:14b     c333b7232bdb    9.0 GB    2 months ago
qwen3:14b-q4_K_M    bdbd181c33f2    9.3 GB    2 months ago
$

Script. Reformatted for clarity but the only changes were the name of the proxy and the removal of the wired limit:

#!/bin/bash

OLLAMA_FLASH_ATTENTION=1 \
  OLLAMA_SCHED_SPREAD=1 \
  OLLAMA_CONTEXT_LENGTH=$( echo "192*1024" | bc ) \
  OLLAMA_NOPRUNE=0 \
  OLLAMA_NEW_ENGINE=1 \
  OLLAMA_NOHISTORY=1 \
  OLLAMA_LOAD_TIMEOUT=3600 \
  OLLAMA_KEEP_ALIVE="-1m" \
  OLLAMA_MODELS=${HOME}/.ollama/models \
  OLLAMA_HOST=0.0.0.0:11434 \
  OLLAMA_NUM_PARALLEL=4 \
  OLLAMA_KV_CACHE_TYPE=q8_0 \
  HTTPS_PROXY=http://proxy-au:8080 \
  HTTP_PROXY=http://proxy-au:8080 \
  https_proxy=${HTTPS_PROXY} \
  http_proxy=${HTTP_PROXY} \
  NO_PROXY="192.168.0.0/16,127.0.0.0/8,10.0.0.0/8" \
  no_proxy=${NO_PROXY} \
  ALL_PROXY=http://proxy-au:8080 \
  all_proxy=${ALL_PROXY} \
  ollama serve 2>&1 

If you post a log from the mac system it might help.
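For example (just a sketch; OLLAMA_DEBUG=1 raises the server's log level):

```sh
# capture a debug-level server log from the tmux pane to attach here
OLLAMA_DEBUG=1 ./startup.sh 2>&1 | tee ~/ollama-serve-tmux.log
```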

Author
Owner

@neurostream commented on GitHub (Sep 13, 2025):

Update:

if the tmux PID starts on:

  • Linux, and I ssh to the mac ollama host and start ollama serve: it works (ollama serve uses the HTTPS_PROXY when an ollama pull is requested)
  • macOS (brew-installed tmux), and I open a pane to a shell on the mac (the same mac in this case) to start ollama serve: I hit the proxyconnect issue when an ollama pull is requested (but curl and other HTTPS_PROXY-using executables on macOS work); see the check sketched below
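To confirm which tmux server a given pane actually descends from (a sketch; #{pid} is tmux's server-PID format variable):

```sh
# identify the tmux server for this pane, and which binary it is
tmux display-message -p '#{pid}'                              # server PID
ps -o pid,ppid,comm -p "$(tmux display-message -p '#{pid}')"  # server binary
```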
Author
Owner

@neurostream commented on GitHub (Sep 15, 2025):

Perhaps this issue can be re-opened or referenced if others encounter it. I can work around it by not starting ollama serve as a descendant of a brew-installed aarch64 tmux PID, e.g. as sketched below.
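Something like (a sketch; the script name and log path are illustrative):

```sh
# workaround: start the server from a plain login shell, outside any tmux pane
nohup ./startup.sh > ~/ollama-serve.log 2>&1 &
```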

Reference: github-starred/ollama#33919