[GH-ISSUE #581] How to use num_predict? #261

Closed
opened 2026-04-12 09:47:22 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @jamesbraza on GitHub (Sep 23, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/581

From https://github.com/jmorganca/ollama/issues/318#issuecomment-1710181439, I see `num_predict` exists, and I am trying to figure out how to use it.

Where are the docs on parameters like this?

More specifically, I am trying to figure out how to pass `num_predict` (and similar parameters) to the Ollama server process and/or `/generate` API calls.


@willowell commented on GitHub (Sep 25, 2023):

Hello @jamesbraza!

Had a look around - it looks like the docs for these parameters are in llama.cpp. For instance, here's the doc for `n_predict`: https://github.com/ggerganov/llama.cpp/tree/master/examples/main#number-of-tokens-to-predict.

I found the types for these parameters at https://github.com/jmorganca/ollama/blob/main/llm/llama.go#L378 - it looks like they are out of sync with the ones in https://github.com/jmorganca/ollama/blob/main/api/types.go#L161?

I don't see `num_predict` used outside `types.go`.

Does this help?


@BruceMacD commented on GitHub (Sep 25, 2023):

This parameter sets the maximum number of tokens the LLM is allowed to generate.

It's not exposed in the CLI at the moment, but you can set it directly in the body of requests made to the API's `/generate` endpoint. Here is an example of that, setting `num_predict` to 1.

## Request

```
curl --request POST \
     --url http://localhost:11434/api/generate \
     --header "Content-Type: application/json" \
     --data '{
         "prompt": "hi",
         "model": "llama2",
         "options": {
             "num_predict": 1
         }
     }'
```

## Response Stream

```
{
    "model": "llama2",
    "created_at": "2023-09-25T14:32:13.093801Z",
    "response": " Hello",
    "done": false
}
{
    "model": "llama2",
    "created_at": "2023-09-25T14:32:13.095352Z",
    "done": true,
    "context": [29961,25580,29962,7251,518,29914,25580,29962,29871,15043],
    "total_duration": 383724416,
    "load_duration": 1833458,
    "prompt_eval_count": 5,
    "prompt_eval_duration": 373471000,
    "eval_count": 1
}
```
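The same request can be sketched in Python using only the standard library. This is a minimal sketch assuming the default local server address from the curl example above; the helper name `generate_request` is hypothetical, not part of any Ollama client.

```python
import json
import urllib.request


def generate_request(prompt: str, model: str, num_predict: int) -> urllib.request.Request:
    """Build a POST to /api/generate that caps generation at num_predict tokens."""
    body = json.dumps({
        "prompt": prompt,
        "model": model,
        "options": {"num_predict": num_predict},
    }).encode()
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )


# With a local Ollama server running, the endpoint streams one JSON
# object per line, ending with a chunk whose "done" field is true:
#
# with urllib.request.urlopen(generate_request("hi", "llama2", 1)) as resp:
#     for line in resp:
#         chunk = json.loads(line)
#         print(chunk.get("response", ""), end="")
#         if chunk.get("done"):
#             break
```

Because `num_predict` lives inside `options`, any of the sibling parameters from `api/types.go` could be passed the same way.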

@jamesbraza commented on GitHub (Sep 25, 2023):

Thank you both @willowell and @BruceMacD! Is there a relevant portion of the Ollama docs for this somewhere? Otherwise, I would gladly add this somewhere, if you could point me to the right place.


@BruceMacD commented on GitHub (Sep 26, 2023):

The closest documentation would be this table of parameter options in the modelfile docs. It looks like it is missing a few options now, though; `num_predict` isn't there:
https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
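For setting the limit persistently on a model rather than per request, the `PARAMETER` instruction in a Modelfile should take the same name - a hedged sketch, assuming `num_predict` is accepted like the parameters already listed in that table:

```
FROM llama2
# Cap the number of tokens the model may generate per response
PARAMETER num_predict 42
```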


@jamesbraza commented on GitHub (Sep 26, 2023):

As long as a search for `num_predict` has some matches in `docs/`, I will call it a win.

Looks like that table comes from [`Options`](https://github.com/jmorganca/ollama/blob/v0.0.21/api/types.go#L161); I will add a few entries to it shortly.

@jamesbraza commented on GitHub (Sep 27, 2023):

[PR opened](https://github.com/jmorganca/ollama/pull/614), thanks all!

@vusiernestmthuk commented on GitHub (Nov 5, 2025):

Ind lunchtime and teatime for today


@vusiernestmthuk commented on GitHub (Nov 5, 2025):

Lunchtime and teatime

Reference: github-starred/ollama#261