[GH-ISSUE #3483] Ollama hangs on CUDA devices when running multi-modal models #64182

Closed
opened 2026-05-03 16:28:58 -05:00 by GiteaMirror · 1 comment

Originally created by @jmorganca on GitHub (Apr 4, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3483

What is the issue?

```
Apr 04 05:15:04 gpu.us-central1-a.c.ollama.internal ollama[5042]: {"function":"launch_slot_with_data","level":"INFO","line":804,"msg":"slot is processing task","slot_id":0,"task_id":29930,"tid":"140079034640064","timestamp":1712207704}
Apr 04 05:15:04 gpu.us-central1-a.c.ollama.internal ollama[5042]: {"function":"update_slots","level":"INFO","line":1808,"msg":"kv cache rm [p0, end)","p0":0,"slot_id":0,"task_id":29930,"tid":"140079034640064","timestamp":1712207704}
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 256
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 128
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 64
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 32
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 16
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 8
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 4
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 2
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 1
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to decode the batch, n_batch = 1, ret = 1
Apr 04 05:15:44 gpu.us-central1-a.c.ollama.internal ollama[5042]: [1712207744] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 256
```
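The log shows the server repeatedly halving `n_batch` (256 → 128 → … → 1) when it cannot find free space in the KV cache, failing the decode even at `n_batch = 1`, and then starting over at 256. The sketch below is a minimal Python simulation of that retry pattern, not the actual llama.cpp `update_slots` implementation; the function name and `fits` predicate are illustrative. If the batch never fits, the outer caller restarting at 256 produces the apparent hang:

```python
def halving_retry(n_batch=256, fits=lambda n: False):
    """Halve n_batch until the batch fits in the KV cache or n_batch
    reaches 1. Returns the sequence of attempted sizes and whether the
    decode ultimately succeeded. When even n_batch == 1 fails (as in
    the log, ret = 1), the server retries from 256 again, so the slot
    never makes progress."""
    tried = []
    n = n_batch
    while n >= 1:
        tried.append(n)
        if fits(n):          # space found in the KV cache at this size
            return tried, True
        n //= 2              # retry with smaller n_batch
    return tried, False      # failed to decode even at n_batch = 1

# Matches the log above: every size from 256 down to 1 is attempted.
attempts, ok = halving_retry()
```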

What did you expect to see?

No response

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

No response

OS

No response

Architecture

No response

Platform

No response

Ollama version

No response

GPU

No response

GPU info

No response

CPU

No response

Other software

No response

GiteaMirror added the bug label 2026-05-03 16:28:58 -05:00

@jmorganca commented on GitHub (Apr 4, 2024):

Closing for https://github.com/ollama/ollama/issues/3361


Reference: github-starred/ollama#64182