[GH-ISSUE #6561] Inconsistent API Behavior #50640

Open
opened 2026-04-28 16:42:07 -05:00 by GiteaMirror · 2 comments

Originally created by @negaralizadeh on GitHub (Aug 29, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6561

What is the issue?

I'm calling the generate API as follows:

```
import requests

url = 'http://localhost:11434/api/generate'
data = {
    "model": model_name,
    "stream": False,
    "options": {
        "temperature": 0.2,
        "top_p": 0.8,
        "seed": 42,
        "num_predict": 300,
    },
    "system": set_role()  # set_role() returns the system prompt string
}

response = requests.post(url, json=data).json()
```

Although I set the stream flag to false, I sometimes don't receive the whole response in a single payload (the first payload arrives with done = false).
The other problem is that sometimes, even when the payload is final (done = true), not all of the expected additional information is present; for example, prompt_eval_count is missing.

This problem also persists with the Python library. I’ve carefully checked the documentation, and I believe it might be some sort of bug.
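A minimal defensive check can at least surface both failure modes instead of silently consuming a partial result. A sketch (the model name and prompt below are placeholders):

```
import requests

url = 'http://localhost:11434/api/generate'
data = {
    "model": "codellama:7b-instruct-q5_K_M",  # placeholder model
    "prompt": "Say hello.",                   # placeholder prompt
    "stream": False,
}

resp = requests.post(url, json=data).json()

# Guard against the two behaviors described above: a non-final payload
# (done == False) and a final payload missing the eval statistics.
if not resp.get("done"):
    raise RuntimeError("partial response received despite stream=False")
if "prompt_eval_count" not in resp:
    print("warning: prompt_eval_count missing from final response")
```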

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.3.5

GiteaMirror added the bug label 2026-04-28 16:42:07 -05:00

@rick-github commented on GitHub (Aug 29, 2024):

What is the `system` field supposed to be? What model are you seeing issues with?


@negaralizadeh commented on GitHub (Aug 30, 2024):

> What is the `system` field supposed to be? What model are you seeing issues with?

According to the [documentation](https://github.com/ollama/ollama/blob/56346ccfa3e51eec51fc26ae8e91fc88cb74a9b8/docs/api.md?plain=1#L50), the `system` field overwrites the system message in the Modelfile.

I mostly encounter this issue with "codellama:7b-instruct-q5_K_M", but it has also occurred with "granite-code:8b-instruct-q4_K_M" and "llama3:8b-instruct-q5_K_M".

To be more specific, I'm doing experiments with the HumanEval (Python) dataset.

```
data = {
    "model": "codellama:7b-instruct-q5_K_M",
    "prompt": "Generate only the Check Function in Python with at most 10 assertions according to the Instruction: " + prompt,
    "stream": False,
    "options": {
        "temperature": 0.2,
        "top_p": 0.8,
        "seed": 42,
        "num_predict": 300,
    },
    "system": "You are an exceptionally intelligent AI assistant that consistently delivers accurate and reliable Python test cases for a function. Without any explanations in natural language."
}
response = requests.post(url, json=data).json()
```

And this is what I got as a response for task Python/84:
```
{'model': 'codellama:7b-instruct-q5_K_M', 'created_at': '2024-08-30T08:24:41.524783507Z', 'response': '```\ndef check(solve):\n assert solve(1000) == "1"\n assert solve(150) == "110"\n assert solve(147) == "1100"\n assert solve(123456789) == "1001001001001001001001001001001"\n assert solve(1000000000) == "1111111111111111111111111111111', 'done': False}
```

However, it's flaky; I cannot always reproduce this bug.
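While the non-streaming path is unreliable, one possible workaround is to request a stream explicitly and assemble the chunks client-side, since the streaming endpoint emits one JSON object per line and the last object carries done = true along with the eval statistics. A sketch (model, prompt, and options mirror the request above):

```
import json
import requests

url = 'http://localhost:11434/api/generate'
data = {
    "model": "codellama:7b-instruct-q5_K_M",
    "prompt": "Generate only the Check Function in Python with at most 10 assertions according to the Instruction: " + prompt,
    "stream": True,  # stream explicitly and assemble the chunks ourselves
    "options": {"temperature": 0.2, "top_p": 0.8, "seed": 42, "num_predict": 300},
}

full_response = []
final_chunk = None
with requests.post(url, json=data, stream=True) as r:
    r.raise_for_status()
    # Each non-empty line is one JSON object; accumulate the text until
    # a chunk arrives with done == True.
    for line in r.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        full_response.append(chunk.get("response", ""))
        if chunk.get("done"):
            final_chunk = chunk

text = "".join(full_response)
```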
