[GH-ISSUE #10857] num_predict parameter does not work ? #53644

Closed
opened 2026-04-29 04:21:54 -05:00 by GiteaMirror · 2 comments

Originally created by @zswodegit on GitHub (May 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10857

What is the issue?

Hi,
I'm trying to measure Ollama performance (TTFT, latency, etc.) with a Qwen model. I couldn't find any related instructions on the repository's main page, so I tried changing a parameter at inference time: setting num_predict=1 to measure TTFT. It has no effect at all. Do you know what the problem is?

Image: https://github.com/user-attachments/assets/6c3cde16-68a9-46dd-b53c-432ae4684ee5

Relevant log output

import requests

url = "http://localhost:11434/api/generate"
headers = {"Content-Type": "application/json"}
data = {"model":"qwenvl", "prompt":"say something", "num_predict": 1, "stream": True}
res = requests.post(url, headers=headers, json=data)

OS

Linux

GPU

Nvidia

CPU

No response

Ollama version

ollama=0.7.0

GiteaMirror added the bug label 2026-04-29 04:21:54 -05:00

@rick-github commented on GitHub (May 26, 2025):

num_predict goes in the options field.

$ curl localhost:11434/api/generate -d '{"model":"qwen2.5:0.5b","prompt":"say something","options":{"num_predict":1}}'
{"model":"qwen2.5:0.5b","created_at":"2025-05-26T07:36:26.888672145Z","response":"Hello","done":false}
{"model":"qwen2.5:0.5b","created_at":"2025-05-26T07:36:26.888676576Z","response":"","done":true,"done_reason":"length","context":[151644,8948,198,2610,525,1207,16948,11,3465,553,54364,14817,13,1446,525,264,10950,17847,13,151645,198,151644,872,198,36790,2494,151645,198,151644,77091,198,9707],"total_duration":353061979,"load_duration":336888958,"prompt_eval_count":31,"prompt_eval_duration":11179534,"eval_count":1,"eval_duration":3387738}
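
For anyone arriving from the Python snippet in the report, here is a rough sketch of the same request with num_predict moved under options, plus a client-side TTFT measurement. It is only an illustration: the helper names (build_payload, summarize, measure_ttft) are made up for this example, it uses the standard-library urllib instead of requests to stay self-contained, and it assumes the duration fields in the final chunk are in nanoseconds and that the server streams one JSON object per line, as in the curl output above.

```python
import json
import time
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # endpoint from the report


def build_payload(model, prompt, num_predict=1):
    """Build a /api/generate body with num_predict nested under "options"."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": True,
        "options": {"num_predict": num_predict},
    }


def summarize(final_chunk):
    """Convert timings of the final (done=true) chunk to milliseconds.

    Assumes the *_duration fields are reported in nanoseconds.
    """
    ns_to_ms = 1e-6
    return {
        "done_reason": final_chunk.get("done_reason"),
        "eval_count": final_chunk.get("eval_count"),
        "load_ms": final_chunk.get("load_duration", 0) * ns_to_ms,
        "prompt_eval_ms": final_chunk.get("prompt_eval_duration", 0) * ns_to_ms,
        "total_ms": final_chunk.get("total_duration", 0) * ns_to_ms,
    }


def measure_ttft(model, prompt, url=OLLAMA_URL):
    """Return (client-side TTFT in seconds or None, server-side timing summary)."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    ttft = None
    final = {}
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # streamed response: one JSON object per line
            if not line.strip():
                continue
            chunk = json.loads(line)
            # The first chunk carrying non-empty text marks the first token.
            if ttft is None and chunk.get("response"):
                ttft = time.perf_counter() - start
            if chunk.get("done"):
                final = chunk
    return ttft, summarize(final)


# Example call (needs a running Ollama server with the model pulled):
# print(measure_ttft("qwen2.5:0.5b", "say something"))
```

In the sample output above, load_duration (about 337 ms) accounts for most of total_duration (about 353 ms), so it is probably worth sending one warm-up request before timing; otherwise the measured TTFT mostly reflects model loading.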


@zswodegit commented on GitHub (May 27, 2025):

> num_predict goes in the options field.
>
> $ curl localhost:11434/api/generate -d '{"model":"qwen2.5:0.5b","prompt":"say something","options":{"num_predict":1}}'
> {"model":"qwen2.5:0.5b","created_at":"2025-05-26T07:36:26.888672145Z","response":"Hello","done":false}
> {"model":"qwen2.5:0.5b","created_at":"2025-05-26T07:36:26.888676576Z","response":"","done":true,"done_reason":"length","context":[151644,8948,198,2610,525,1207,16948,11,3465,553,54364,14817,13,1446,525,264,10950,17847,13,151645,198,151644,872,198,36790,2494,151645,198,151644,77091,198,9707],"total_duration":353061979,"load_duration":336888958,"prompt_eval_count":31,"prompt_eval_duration":11179534,"eval_count":1,"eval_duration":3387738}

It works, thanks!

Reference: github-starred/ollama#53644