[GH-ISSUE #1582] ollama crashes when calling /api/generate with invalid duration message #871

Closed
opened 2026-04-12 10:31:59 -05:00 by GiteaMirror · 6 comments

Originally created by @michaelgloeckner on GitHub (Dec 18, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1582

Originally assigned to: @BruceMacD on GitHub.

Hi,

I run Ollama in a k8s cluster and upgraded from 0.1.9 to 0.1.16 to get the Mixtral fix.

The error first occurred with version 0.1.14.

But when I call /api/generate, Ollama stops.
Looking into the Ollama logs, I see the following message:
panic: time: invalid duration "-6414107897391086.000000ms"

More logs are attached:
[error_ollama_0.1.16.txt](https://github.com/jmorganca/ollama/files/13704366/error_ollama_0.1.16.txt)
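
Note: the panic message matches the error Go's time.ParseDuration returns when a duration string overflows the int64 nanosecond range (about -6.4e15 ms is roughly -6.4e21 ns). A minimal sketch that reproduces the same message, assuming the server treats this parse error as fatal:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Duration string taken from the panic message above: about -6.4e15 ms,
	// i.e. roughly -6.4e21 ns, which overflows int64 nanoseconds, so
	// time.ParseDuration rejects it instead of returning a value.
	s := "-6414107897391086.000000ms"

	if _, err := time.ParseDuration(s); err != nil {
		// Prints: time: invalid duration "-6414107897391086.000000ms"
		// A caller that panics on this error would crash exactly as in the log.
		fmt.Println(err)
	}
}
```

The size and sign of the value suggest a miscomputed timing figure somewhere upstream, though the exact code path is not confirmed here.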

GiteaMirror added the bug label 2026-04-12 10:31:59 -05:00

@BruceMacD commented on GitHub (Dec 18, 2023):

Hi @michaelgloeckner, thanks for opening the issue. Which distro is running on the nodes in your kube cluster? I'm wondering if this has something to do with the library used to get the timestamps.


@michaelgloeckner commented on GitHub (Dec 19, 2023):

Hi,

It is running on an Amazon node with GPU support (g5.8xlarge):

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
SUPPORT_END="2025-06-30"
Amazon Linux release 2 (Karoo)

If you have an idea how to check the library, let me know and I can test it on the system.


@michaelgloeckner commented on GitHub (Dec 19, 2023):

I did some more checks and figured out that it is related to the way we send the prompt:
If we use "template" to send the full prompt, it fails since version 0.1.14.
/api/generate JSON:

{
    "model": "llama2",
    "template": "<|im_start|>system\nYou are an assistant for question-answering tasks.<|im_end|>\n<|im_start|>user\nQuestion: What are the advantages for Aladin??\nContext: Some context here<|im_end|>\n<|im_start|>assistant Answer:",
    "stream": false,
    "options": {
        "stop": ["<|im_start|>", "<|im_end|>"]
    }
}

If we use "prompt" instead, the request finishes successfully.

/api/generate JSON:

{
    "model": "llama2",
    "prompt": "<|im_start|>system\nYou are an assistant for question-answering tasks.<|im_end|>\n<|im_start|>user\nQuestion: What are the advantages for Aladin??\nContext: Some context here<|im_end|>\n<|im_start|>assistant Answer:",
    "stream": false,
    "options": {
        "stop": ["<|im_start|>", "<|im_end|>"]
    }
}


@BruceMacD commented on GitHub (Dec 19, 2023):

Thanks for the details @michaelgloeckner, still looking at this. In the meantime, the way you are using template seems a bit off from my expectations. Normally it would be used along with a prompt parameter; I'd suggest this request body instead:

{
    "model": "llama2",
    "raw": true,
    "prompt": "<|im_start|>system\nYou are an assistant for question-answering tasks.<|im_end|>\n<|im_start|>user\nQuestion: What are the advantages for Aladin??\nContext: Some context here<|im_end|>\n<|im_start|>assistant Answer:",
    "stream": false,
    "options": {
        "stop": [
            "<|im_start|>",
            "<|im_end|>"
        ]
    }
}
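
For reference, a minimal Go sketch that sends the suggested request body. The http://localhost:11434 address is Ollama's default for a local install and is an assumption here; in a k8s setup it would be the in-cluster service address. Error handling is reduced to panics for brevity.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Request body mirroring the suggested raw-mode call above.
	body := map[string]any{
		"model":  "llama2",
		"raw":    true,
		"prompt": "<|im_start|>system\nYou are an assistant for question-answering tasks.<|im_end|>\n<|im_start|>user\nQuestion: What are the advantages for Aladin??\nContext: Some context here<|im_end|>\n<|im_start|>assistant Answer:",
		"stream": false,
		"options": map[string]any{
			"stop": []string{"<|im_start|>", "<|im_end|>"},
		},
	}

	payload, err := json.Marshal(body)
	if err != nil {
		panic(err)
	}

	// Default local Ollama address; replace with the cluster service address as needed.
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```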

@BruceMacD commented on GitHub (Dec 19, 2023):

I still haven't been able to reproduce this one in an Amazon Linux container for some reason. I'd also suggest trying the latest version of the llama2 model if it's been a while since you pulled: ollama pull llama2


@michaelgloeckner commented on GitHub (Dec 20, 2023):

I also could not reproduce it locally. So I switched to "prompt" now and it looks like it is working. Nevertheless, the Ollama documentation says the following, which is kind of misleading to me:

"template: the full prompt or prompt template (overrides what is defined in the Modelfile)"

Reference: github-starred/ollama#871