[GH-ISSUE #281] Consider a non-streaming API for /api/generate #123

Closed
opened 2026-04-12 09:39:22 -05:00 by GiteaMirror · 4 comments

Originally created by @jmorganca on GitHub (Aug 4, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/281

Originally assigned to: @BruceMacD on GitHub.

If `Content-Type: application/json` is set, we should consider returning a single large JSON object instead of an event stream. This would be an elegant design, as it introduces no new flags.
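For illustration, under this proposal a client would opt out of streaming purely through content negotiation. This is a sketch of the proposed (not shipped) behavior:

```
# Hypothetical: the header alone would select a single JSON response
curl -X POST -H "Content-Type: application/json" -d '{
  "model": "llama2",
  "prompt": "why is the sky blue"
}' http://localhost:11434/api/generate
```

Requests without the header would keep receiving the existing event stream, so no current clients would break.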

GiteaMirror added the feature request label 2026-04-12 09:39:22 -05:00

@priamai commented on GitHub (Aug 10, 2023):

Please please please! Also I would like to see at least temperature and max tokens available in the API request and as settings in the model.

![image](https://github.com/jmorganca/ollama/assets/57333254/da558a11-cd48-4c66-b6f9-14db08436459)


@BruceMacD commented on GitHub (Aug 10, 2023):

Hi @priamai, you can actually set temperature through an API option right now. I'll have to make a separate issue for max tokens; I don't think we take that yet.

Here's an example of setting temperature in the API:

```
curl -X POST -H "Content-Type: application/json" -d '{
    "model": "llama2",
    "prompt": "why is the sky blue",
    "options": {
      "temperature": 1
    }
}' 'localhost:11434/api/generate'
```

Here is an example of setting temperature in a Modelfile:

```
FROM llama2
PARAMETER temperature 1
```
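For the token cap @priamai mentioned: current Ollama exposes this as the `num_predict` parameter (per the Modelfile parameter docs), which can be passed through the same `options` map. The value 128 below is just an illustrative limit:

```
curl -X POST -H "Content-Type: application/json" -d '{
    "model": "llama2",
    "prompt": "why is the sky blue",
    "options": {
      "num_predict": 128
    }
}' 'localhost:11434/api/generate'
```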

@samilao101 commented on GitHub (Oct 3, 2023):

So is the API able to handle non-streaming requests right now? If so, how? Thank you.


@SabareeshGC commented on GitHub (Oct 7, 2023):

@samilao101 here is an example:

```
curl --location 'http://localhost:11434/api/generate' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
  "model": "mistral",
  "prompt": "Why is the sky blue?"
}'
```
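Note that in the shipped API the headers alone do not disable streaming; per the Ollama API docs, it is passing `"stream": false` in the request body that makes `/api/generate` return a single JSON object:

```
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```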