[GH-ISSUE #7075] Hallucination fix? #4491

Closed
opened 2026-04-12 15:25:02 -05:00 by GiteaMirror · 4 comments

Originally created by @Lu-Yi-Fan on GitHub (Oct 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7075

Hi, when I use the models (llama3:70b / llama3:latest) via Ollama, it seems to keep track of all conversations and queries. This causes hallucinations, and information appears across different channels when it shouldn't. What could be a possible remedy for this? Would it be possible to instantiate the model without keeping track of the history? Thank you in advance.

GiteaMirror added the feature request label 2026-04-12 15:25:02 -05:00

@rick-github commented on GitHub (Oct 2, 2024):

How are you querying the model? `ollama run` or some other client?

@Lu-Yi-Fan commented on GitHub (Oct 2, 2024):

Hi, I am currently running the Ollama model server-side via API calls, using `from langchain_community.llms import Ollama`.
To share a bit more context: I am working on a RAG system that ingests PDFs, with a separate conversation chain and a separate vectorstore for each PDF, to "talk to the PDF". However, after many conversations with, say, PDF A and PDF B, information from PDF B gets leaked into the conversation with PDF A. I have checked my code and made sure the conversation chains have no spillover and that the issue is not due to wrong assignment of variables. When I tested with `ollama run` to check my hypothesis that conversations are stored, the model was indeed able to repeat what I had said earlier, hence my suspicion about the hallucination. Thank you.
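
For reference, here is a simplified sketch of the per-PDF setup I described (paths, model names, and chain wiring are illustrative rather than my exact code):

```python
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

llm = Ollama(model="llama3")
embeddings = OllamaEmbeddings(model="llama3")

def build_chain(pdf_path: str) -> ConversationalRetrievalChain:
    # Each PDF gets its own vectorstore and its own memory object,
    # so no history or retrieved context is shared between PDFs.
    docs = PyPDFLoader(pdf_path).load_and_split()
    store = FAISS.from_documents(docs, embeddings)
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    return ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=store.as_retriever(),
        memory=memory,
    )

chain_a = build_chain("pdf_a.pdf")  # "Talk to PDF A"
chain_b = build_chain("pdf_b.pdf")  # "Talk to PDF B"
```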

@rick-github commented on GitHub (Oct 2, 2024):

The ollama server doesn't store any state. If you are finding that the server responds as if it does, that would seem to be a problem with the client. See https://github.com/ollama/ollama/issues/6992 for the exact opposite of your problem, where a user was wondering why the server wasn't using earlier conversations in responses.

`ollama run` is a client that does preserve conversational history. You can reset the history with the `/clear` command.
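
If you want to verify this yourself, a minimal sketch against the HTTP API (assuming a local server on the default port and `llama3` as an example model) shows that each `/api/chat` request only sees the messages you send with it:

```python
import requests

URL = "http://localhost:11434/api/chat"

def chat(messages):
    # Each request is independent; the server keeps no conversation state.
    r = requests.post(URL, json={"model": "llama3", "messages": messages, "stream": False})
    r.raise_for_status()
    return r.json()["message"]["content"]

# First request: tell the model something.
print(chat([{"role": "user", "content": "My favourite colour is teal."}]))

# Second request: no history is passed, so the model cannot know the colour
# unless the client re-sends the earlier messages itself.
print(chat([{"role": "user", "content": "What is my favourite colour?"}]))
```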

@Lu-Yi-Fan commented on GitHub (Oct 2, 2024):

OK, thank you for the findings. I will check and test again, and get back to you. Thank you!
