[GH-ISSUE #6853] Setting temperature on any llava model makes the Ollama server hang on REST calls #66362

Closed
opened 2026-05-04 03:07:11 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @jluisreymejias on GitHub (Sep 18, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6853

What is the issue?

When calling llava models from a REST client, setting the temperature causes the Ollama server to hang until the process is killed.

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.3.10

GiteaMirror added the bug, needs more info labels 2026-05-04 03:07:16 -05:00
Author
Owner

@rick-github commented on GitHub (Sep 18, 2024):

```console
$ time curl -s localhost:11434/api/chat -d '{"model":"llava:7b","options":{"temperature":2},"messages":[{"role":"user","content":"describe this image","images":["'$(base64 -w0 puppy.jpg)'"]}],"stream":false}' | jq
{
  "model": "llava:7b",
  "created_at": "2024-09-18T11:58:28.138705728Z",
  "message": {
    "role": "assistant",
    "content": " The image is a photograph of an adorable puppy sitting on a stone surface. The puppy has a fluffy white coat and appears to be young with a playful expression. It is wearing a red collar, suggesting it may be trained or ready for a walk. Behind the puppy is what seems to be an exterior wall with a patterned design and possibly some plants or trees can be inferred from the vegetation patterns on the wall. The photograph is taken in natural light and has a shallow depth of field, which slightly blurs the background. "
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 1589083329,
  "load_duration": 6159553,
  "prompt_eval_count": 1,
  "prompt_eval_duration": 229572000,
  "eval_count": 115,
  "eval_duration": 1301140000
}

real	0m1.600s
user	0m0.027s
sys	0m0.003s
```
Author
Owner

@jluisreymejias commented on GitHub (Sep 18, 2024):

The issue is random: sometimes it hangs on the first call, sometimes it takes 200 requests, so changing the temperature just makes it more frequent. I ran a ton of tests, and the only way to avoid the issue when processing big batches of images (thousands) is to set the keep_alive value to 0m, meaning a fresh copy of the model is loaded for each new request.

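The keep_alive workaround above can be set per-request in the `/api/chat` body, alongside the `options.temperature` field from the original report. A minimal sketch in Python of building such a payload; the helper name is illustrative, and the base64 string stands in for a real encoded image:

```python
import json


def build_chat_payload(model, prompt, image_b64,
                       temperature=None, keep_alive=None):
    """Build a JSON body for Ollama's /api/chat endpoint.

    keep_alive="0m" asks the server to unload the model right after the
    call completes -- the workaround described above. (Helper name is
    illustrative, not part of any Ollama client library.)
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "user", "content": prompt, "images": [image_b64]}
        ],
        "stream": False,
    }
    if temperature is not None:
        payload["options"] = {"temperature": temperature}
    if keep_alive is not None:
        payload["keep_alive"] = keep_alive  # e.g. "0m" to evict immediately
    return json.dumps(payload)
```

The resulting string can be POSTed to `localhost:11434/api/chat` with any HTTP client, mirroring the curl command in the first comment but with the model evicted after each image.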
Author
Owner

@rick-github commented on GitHub (Sep 19, 2024):

I did 200 requests without seeing any problems. [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

Author
Owner

@dhiltgen commented on GitHub (Nov 6, 2024):

@jluisreymejias please give the new 0.4.0 release a try and see if that resolves the sporadic hang problem you were seeing.

Reference: github-starred/ollama#66362