[GH-ISSUE #7930] failed to decode batch: could not find a kv cache slot #30836

Closed
opened 2026-04-22 10:46:41 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @wangpf09 on GitHub (Dec 4, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7930

What is the issue?

I run Ollama 0.4.6, 0.4.7, and a build from source, and all of them hit this error.
I'm on an Apple M2.

```
time=2024-12-04T18:45:31.343+08:00 level=WARN source=runner.go:129 msg="truncating input prompt" limit=2048 prompt=2052 keep=5 new=2048
panic: failed to decode batch: could not find a kv cache slot

goroutine 36 [running]:
main.(*Server).run(0x1400014a120, {0x1025f6d88, 0x140001000a0})
	github.com/ollama/ollama/llama/runner/runner.go:344 +0x1e0
created by main.main in goroutine 1
	github.com/ollama/ollama/llama/runner/runner.go:978 +0xb30
```
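As a side note, the WARN line shows the prompt (2052 tokens) already exceeds the 2048-token context window before the panic. Two server-side knobs are commonly adjusted in this situation; this is a sketch only, and neither setting is confirmed as the root cause of this issue:

```shell
# Sketch, not a confirmed fix. Limit the runner to one parallel decode
# slot so a single request gets the whole KV cache / context window:
export OLLAMA_NUM_PARALLEL=1
ollama serve

# Or raise the context window per request via the API options
# (model name here is a placeholder):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "hello",
  "options": { "num_ctx": 4096 }
}'
```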

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.4.6, 0.4.7

GiteaMirror added the bug label 2026-04-22 10:46:41 -05:00
Author
Owner

@123jddjb commented on GitHub (Dec 5, 2024):

Hello, I have the same problem. Can you tell me how you finally solved it?

Author
Owner

@wangpf09 commented on GitHub (Dec 5, 2024):

> Hello, I have the same problem. Can you tell me how you finally solved it?

I use Ollama for GraphRAG. I just changed the config to `concurrent_requests: 1`, and then it works.
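For reference, a minimal sketch of where this setting lives in a GraphRAG `settings.yaml`. Only `concurrent_requests: 1` comes from this thread; the model name and endpoint are placeholders:

```yaml
# Hypothetical GraphRAG settings.yaml excerpt; only concurrent_requests
# is from this issue, the other values are placeholders.
llm:
  type: openai_chat
  api_base: http://localhost:11434/v1  # Ollama's OpenAI-compatible endpoint
  model: llama3                        # placeholder model name
  concurrent_requests: 1               # serialize requests so the runner's KV cache is not oversubscribed
```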


Reference: github-starred/ollama#30836