[GH-ISSUE #3361] Ollama hangs with multi-modal models #48577

Closed
opened 2026-04-28 08:52:03 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @jmorganca on GitHub (Mar 26, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3361

What is the issue?

```
Apr 04 05:15:04 gpu.us-central1-a.c.ollama.internal ollama[5042]: {"function":"launch_slot_with_data","level":"INFO","line":804,"msg":"slot is processing task","slot_id":0,"task_id":29930,"tid":"140079034640064","timestamp":1712207704}
Apr 04 05:15:04 gpu.us-central1-a.c.ollama.internal ollama[5042]: {"function":"update_slots","level":"INFO","line":1808,"msg":"kv cache rm [p0, end)","p0":0,"slot_id":0,"task_id":29930,"tid":"140079034640064","timestamp":1712207704}
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 256
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 128
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 64
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 32
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 16
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 8
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 4
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 2
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 1
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to decode the batch, n_batch = 1, ret = 1
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 256
```
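The log shows the server repeatedly halving `n_batch` (256 → 1) when it cannot find free space in the KV cache, failing at `n_batch = 1`, and then starting over at 256, which is the hang. A minimal sketch of that halving-retry pattern (hypothetical illustration, not Ollama's or llama.cpp's actual code; `fits` stands in for the KV-cache space check):

```go
package main

import "fmt"

// decodeWithRetry sketches the retry loop visible in the log: if a batch
// of size n does not fit in the KV cache, halve n and try again. It
// returns the first batch size that fits, or (0, false) once n_batch = 1
// also fails. The buggy behavior in the log is that, instead of giving
// up, the server looped back to n_batch = 256 and never freed the slot.
func decodeWithRetry(fits func(n int) bool, nBatch int) (int, bool) {
	for n := nBatch; n >= 1; n /= 2 {
		if fits(n) {
			return n, true
		}
		if n == 1 {
			fmt.Println("failed to decode the batch, n_batch = 1")
			break
		}
		fmt.Printf("retrying with smaller n_batch = %d\n", n/2)
	}
	return 0, false
}

func main() {
	// Simulate a KV cache with room for at most 100 tokens per batch.
	fits := func(n int) bool { return n <= 100 }
	n, ok := decodeWithRetry(fits, 512)
	fmt.Println(n, ok) // prints "64 true"
}
```

With a multi-modal prompt whose image embeddings exceed the remaining cache space, even `n_batch = 1` fails, and the loop restarts instead of erroring out, matching the repeating log lines above.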
GiteaMirror added the bug label 2026-04-28 08:52:03 -05:00

@jmorganca commented on GitHub (Apr 15, 2024):

This should be fixed as of 0.1.31.


Reference: github-starred/ollama#48577