[PR #15342] server: add phase-specific timeouts for cloud proxy #15119

opened 2026-04-13 01:10:45 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15342
Author: @Akshatkasera
Created: 4/5/2026
Status: 🔄 Open

Base: main ← Head: fix/cloud-proxy-phase-timeouts


📝 Commits (1)

  • cabd650 server: add phase-specific timeouts for cloud proxy

📊 Changes

1 file changed (+13 additions, -4 deletions)

📝 server/cloud_proxy.go (+13 -4)

📄 Description

Summary

  • Replaced http.DefaultClient with a dedicated cloudProxyHTTPClient that has phase-specific timeouts
  • Connect and TLS handshake: 10s each (fail fast on unreachable servers)
  • Response header (time-to-first-byte): 30s (detect unresponsive backends)
  • No overall Client.Timeout, so long-lived streaming responses are not cut off by a client-side deadline
  • Idle connection timeout: 90s (clean up stale keep-alive connections)

Resolves the TODO at server/cloud_proxy.go:216 (added by @drifkin)

Context

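The cloud proxy previously used http.DefaultClient, which sets no overall Client.Timeout and whose default transport has no response-header timeout (it only bounds dialing at 30s and the TLS handshake at 10s). This meant:

  • A backend that accepted the connection but never sent response headers could block a request indefinitely (TTFB), and an unreachable host could take up to 30s to fail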
  • Setting a single Client.Timeout would risk killing long-running streaming inference responses

The fix uses Go's http.Transport fields to set per-phase timeouts while leaving the overall request duration unbounded.

Test plan

  • Existing TestCopyProxyRequestHeaders and TestCopyProxyResponseHeaders still pass (verified locally; the build failures are pre-existing macOS-only mlx/imagegen issues unrelated to this change)
  • Manual test: cloud model requests should connect within 10s or fail fast
  • Manual test: long streaming responses should complete without timeout
  • Manual test: requests to unreachable servers should fail after ~10s instead of hanging

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 01:10:45 -05:00
Reference: github-starred/ollama#15119