[GH-ISSUE #2764] Suggestion: Add a timeout parameter to Chat and Generation calls. #27426

Open
opened 2026-04-22 04:46:29 -05:00 by GiteaMirror · 0 comments

Originally created by @dezoito on GitHub (Feb 26, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2764

First of all, thanks for the hard work you guys are putting into this!

I don't think there's an easy way to do this directly... please correct me if I'm wrong.

(It looks like Ollama-py implements this for sync calls, but the timeout is passed to the `httpx` client and not to the Ollama host.)

The motivation is to allow production apps to programmatically drop requests that are taking too long, freeing up resources, and to allow client libs (like Ollama-rs and Ollama-js) to "pass through" this parameter to the Ollama host, simplifying their implementation.
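
To make the client-side vs. host-side distinction concrete, here is a minimal sketch using only the Python standard library. `SlowHandler`, `call_with_client_timeout`, and the `/slow` path are hypothetical names made up for illustration; the server is a stand-in that just sleeps, not the real Ollama host, so this says nothing about how Ollama itself reacts to a dropped connection. It only shows that a timeout enforced in the HTTP client makes the caller give up, while a server that is never told about the deadline can keep working.

```python
import json
import socket
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set by the stand-in server once its (simulated) work is complete.
finished = threading.Event()


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical slow endpoint; a stand-in, not the real Ollama API."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        time.sleep(1.0)  # simulate a slow generation
        finished.set()   # the work completed, whether or not the client waited
        try:
            body = b'{"response": "done"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client may already have disconnected

    def log_message(self, *args):
        pass  # keep the demo quiet


def call_with_client_timeout(url, payload, timeout_s):
    """POST JSON and return the decoded reply, or None if the client timed out.

    The timeout lives entirely in the client; the server never sees it.
    """
    data = json.dumps(payload).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            return json.loads(resp.read())
    except (socket.timeout, urllib.error.URLError):
        return None


if __name__ == "__main__":
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/slow"

    result = call_with_client_timeout(url, {"prompt": "hi"}, timeout_s=0.2)
    print("client result:", result)
    print("server finished anyway:", finished.wait(timeout=5))
    server.shutdown()
```

In this sketch the client returns after roughly 0.2 seconds, yet the handler still runs to completion. A timeout parameter understood by the host itself would let the server stop the work at the deadline instead of relying on each client library to enforce it.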

Thoughts?


Reference: github-starred/ollama#27426