[GH-ISSUE #13089] New and FRUSTRATING very limiting max tokens on the cloud models to only 16,384 #8663

Closed
opened 2026-04-12 21:25:35 -05:00 by GiteaMirror · 2 comments

Originally created by @Akamel01 on GitHub (Nov 14, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13089

Originally assigned to: @jmorganca on GitHub.

The cloud models now have a new and FRUSTRATING limit on max output tokens of only 16,384. This is extremely limiting for agent work. We pay per token and we want to keep the model as is. This max output token limit is about 10% of what other providers offer.

GiteaMirror added the cloud, bug labels 2026-04-12 21:25:35 -05:00

@jmorganca commented on GitHub (Nov 14, 2025):

Hi @Akamel01, this should be fixed in the next 5 minutes or so. Let me know if you're still seeing issues.


@tippyentertainment commented on GitHub (Mar 1, 2026):

Holy shit rate limiting sux on Ollama Cloud

Reference: github-starred/ollama#8663