[GH-ISSUE #4673] BUG: PHI-3 #2938

Open
opened 2026-04-12 13:18:35 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @MichaelFomenko on GitHub (May 28, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4673

What is the issue?

When I start a conversation in German, Phi-3 Mini and Medium work fine. But after some exchanges, the models gradually start producing gibberish and nonsense, repeating phrases, words, and tokens, and no longer answer my questions. When I start a new conversation, it works fine again.

Open Web UI Version: 1.1.125

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.1.39

GiteaMirror added the bug label 2026-04-12 13:18:35 -05:00

@jaasperhelin commented on GitHub (May 29, 2024):

I have the same issue, especially with RAG, even after upgrading to the newest version of Ollama today. The issue appears with both Phi-3 Mini and Medium; the context window is set to 4000 tokens.
I use the prompt templates suggested by Ollama. Does RAG require a different template?


@MichaelFomenko commented on GitHub (May 30, 2024):

Is there anyone here who cares about bugs? Maybe the developers should focus less on new features and more on making the existing code stable, and implement new features only when absolutely necessary. Every new feature needs testing time, and over time the complexity of the project will increase; if the base code is not tested, the whole project will fail because of the bugs.


@nils-se commented on GitHub (Jun 25, 2024):

Having the same issue. phi3 3B was very fast on my 4-core CPU; now it's painfully slow. What changed is the Ollama version and the phi3 version. I can't get the old version back, as I deleted it before the update, but it's probably 10x slower. Gemma still gives solid performance.
OS: Kubuntu 22.04
ollama version: 0.1.45
CPU: Intel
no GPU


Reference: github-starred/ollama#2938