[GH-ISSUE #4990] First value different on CUDA/ROCM when setting seed #65192

Open
opened 2026-05-03 19:58:23 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @jmorganca on GitHub (Jun 12, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4990

What is the issue?

This seems to be an issue with the kv cache on Nvidia/AMD GPUs. See https://github.com/ggerganov/llama.cpp/issues/2838

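To make the report concrete, here is a minimal reproduction sketch (not from the original issue; it assumes a local Ollama server on the default port 11434 and uses a placeholder model name): it sends the same prompt twice with a fixed `seed` and `temperature: 0` through the `/api/generate` endpoint and reports whether the two completions match. The reported behaviour is that on CUDA/ROCm they can differ.

```python
# Sketch: check whether a fixed seed yields identical output across
# repeated requests. Assumes Ollama is running locally on the default
# port; "llama3" is a placeholder for any installed model.
import json
import urllib.request

def generate(prompt: str) -> str:
    payload = {
        "model": "llama3",  # placeholder model name
        "prompt": prompt,
        "stream": False,
        "options": {"seed": 42, "temperature": 0},
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

first = generate("Why is the sky blue?")
second = generate("Why is the sky blue?")
print("identical" if first == second else "different")
```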
GiteaMirror added the bug, amd, nvidia labels 2026-05-03 19:58:23 -05:00
Author
Owner

@ScreamingHawk commented on GitHub (Jun 16, 2024):

Worth noting, setting `temperature: 0` didn't seem to give consistent results either, despite what is said in the linked ticket. Not sure if that is a related issue or something separate. See #5012 for a reproducible error.

Author
Owner

@pulinagrawal commented on GitHub (Feb 13, 2025):

I teach a class of 40 students who are using `seed` with Ollama. I am fairly certain that the issue is with prompt caching: `seed` stops working when they run their code multiple times. If Ollama is restarted, they get the same result they got the first time they ran it.

https://github.com/ggerganov/llama.cpp/issues/2838#issuecomment-1817601850 also references this issue with `cache_prompt`.

In Ollama, `cache_prompt` seems to be permanently set, based on https://github.com/ollama/ollama/blob/8cf16063a52deb416e16039c73264e26f7e9a43a/llm/server.go#L701

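For anyone who wants to probe the prompt-cache hypothesis directly, a hedged sketch is to bypass Ollama and query a llama.cpp server's `/completion` endpoint, which exposes `cache_prompt` as a request field, with the cache enabled and then disabled. The port (8080), prompt, and repetition count below are assumptions on my part, not details from this thread.

```python
# Sketch: query a llama.cpp server (not Ollama) a few times with
# cache_prompt on and off, to see whether the prompt cache changes the
# output for a fixed seed. Port 8080 and the request fields follow
# llama.cpp's /completion API; adjust for your setup.
import json
import urllib.request

def completion(cache_prompt: bool) -> str:
    payload = {
        "prompt": "Why is the sky blue?",
        "seed": 42,
        "temperature": 0,
        "n_predict": 64,
        "cache_prompt": cache_prompt,
    }
    req = urllib.request.Request(
        "http://localhost:8080/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

runs_cached = [completion(True) for _ in range(3)]
runs_uncached = [completion(False) for _ in range(3)]
print("cached runs identical:", len(set(runs_cached)) == 1)
print("uncached runs identical:", len(set(runs_uncached)) == 1)
```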
