[GH-ISSUE #1889] Phi2/dolphin-phi Disobedient on system prompt Biblical topics: #1082

Closed
opened 2026-04-12 10:50:05 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @oliverbob on GitHub (Jan 10, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1889

Steps to reproduce:

Download a new Bible dataset from KJV Markdown (https://github.com/arleym/kjv-markdown/tree/master)

```bash
#!/bin/bash
sudo rm -f joined.md

# Prepend content to the joined.md file
echo "FROM dolphin-phi" >> ./joined.md
echo "# set the temperature to 1 [higher is more creative, lower is more coherent]" >> ./joined.md
echo "PARAMETER temperature 1" >> ./joined.md
echo 'SYSTEM """' >> ./joined.md
echo 'Instruction: Modelfile Structure Understanding' >> ./joined.md
echo 'The Modelfile follows a structure similar to the Bible, with books, chapters, and verses.' >> ./joined.md
echo 'For example, here are excerpts from the first and second chapters of Genesis:' >> ./joined.md
echo '' >> ./joined.md
echo 'Genesis' >> ./joined.md
echo 'Genesis Chapter 1' >> ./joined.md
echo 'Genesis 1:1 "In the beginning God created the heaven and the earth."' >> ./joined.md
echo 'Genesis 1:2 "And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters."' >> ./joined.md
echo 'Genesis 1:3 "And God said, Let there be light: and there was light."' >> ./joined.md
echo 'Genesis 1:4 "And God saw the light, that it was good: and God divided the light from the darkness."' >> ./joined.md
echo 'Genesis 1:5 "And God called the light Day, and the darkness he called Night. And the evening and the morning were the first day."' >> ./joined.md
echo '...' >> ./joined.md
echo 'Genesis Chapter 2' >> ./joined.md
echo 'Genesis 2:1 "Thus the heavens and the earth were finished, and all the host of them."' >> ./joined.md
echo 'Genesis 2:2 "And on the seventh day God ended his work which he had made; and he rested on the seventh day from all his work which he had made."' >> ./joined.md
echo '...' >> ./joined.md
echo 'Revelation Chapter 22' >> ./joined.md
echo 'Revelation 22:1 "And he shewed me a pure river of water of life, clear as crystal, proceeding out of the throne of God and of the Lamb."' >> ./joined.md
echo 'Revelation 22:2 "In the midst of the street of it, and on either side of the river, was there the tree of life, which bare twelve manner of fruits, and yielded her fruit every month: and the leaves of the tree were for the healing of the nations."' >> ./joined.md
echo '...' >> ./joined.md
echo 'eof' >> ./joined.md
echo "(John 1:1 In the beginning was the Word, and the Word was with God, and the Word was God.) is not (Genesis 1:1: In the beginning God created the heaven and the earth.)" >> ./joined.md
echo 'End of Modelfile Structure Understanding' >> ./joined.md

# Add few-shot learning examples and introduction
echo 'Introduction: "Tell me about the Bible."' >> ./joined.md
echo 'You: "The Bible is a collection of religious texts or scriptures sacred to Christians, Jews, Samaritans, and others. It is divided into two main sections: the Old Testament and the New Testament."' >> ./joined.md
echo '' >> ./joined.md
echo 'Introduction: "What is the significance of Genesis in the Bible?"' >> ./joined.md
echo 'You: "Genesis is the first book of the Bible and is highly significant as it contains the account of the creation of the world, the origin of humanity, and key events such as the stories of Adam and Eve, Noah, and the Tower of Babel."' >> ./joined.md
echo '' >> ./joined.md
echo 'Instruction: "When asked about a verse like Genesis 1:1, your response should be:"' >> ./joined.md
echo 'You: "In the beginning God created the heaven and the earth."' >> ./joined.md
echo 'Instruction: "When asked about a verse like Proverbs 3:5-6, your response should be:"' >> ./joined.md
echo 'You: "Trust in the LORD with all thine heart; and lean not unto thine own understanding. In all thy ways acknowledge him, and he shall direct thy paths."' >> ./joined.md
echo 'Instruction: "When asked about a verse like John 3:16, your response should be:"' >> ./joined.md
echo 'Instruction: "For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life."' >> ./joined.md

# Concatenate all .md files into joined.md, arranged by numeric order
find ./kjv-markdown -name "*.md" -print0 | sort -zV | xargs -0 cat >> ./joined.md

# Strip markdown '#' heading markers (note: this also strips the '#' from the
# comment line written near the top of the Modelfile)
sed -i 's/#//g' ./joined.md

# Append content to the end of the joined.md file
echo '"""' >> ./joined.md

# Display the head of the joined.md file
echo "=== Head of joined.md ==="
head ./joined.md

# Display the tail of the joined.md file
echo "=== Tail of joined.md ==="
tail ./joined.md
```
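Since the concatenated file can get very large, a quick size check before running `ollama create` may help. A minimal sketch: `estimate_tokens` is a hypothetical helper, and the ~1.3 tokens-per-word ratio is a rough assumption, not the model's actual tokenizer.

```shell
# estimate_tokens FILE: print a rough token estimate for FILE,
# assuming ~1.3 tokens per word (integer arithmetic: words * 13 / 10).
estimate_tokens() {
  local words
  words=$(wc -w < "$1")
  echo $(( words * 13 / 10 ))
}

# Example: estimate_tokens ./joined.md
```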

To add more context (for others who might be asking how this problem relates to Ollama or dolphin-phi), here's the quick answer:

`ollama create kjv -f ./joined.md`

`ollama run kjv`

Ask questions:

  1. How many chapters are there in Genesis?
  2. What is the first verse in Genesis?
  3. Genesis 1:1.
  4. What is John 3:15?
  5. What is the first verse in Revelation?
  6. Who were the first people in Genesis?
  7. How many chapters are there in Revelation?

This makes me wonder how Phi was developed by the Microsoft team/community. Trying it on other topics, though, the model is extremely accurate.

Question:

  • How do I make the Phi Model obedient to Christian text in a system prompt?
  • Must I retrain the model from scratch?
  • What is the quickest way to retrain this model from a custom dataset?

Thanks, all, for creating such a powerful AI library.

GiteaMirror added the question label 2026-04-12 10:50:05 -05:00

@BruceMacD commented on GitHub (Jan 10, 2024):

Hi @oliverbob, this seems like a good case for fine-tuning or a different model. Before going with the fine-tuning approach I'd encourage you to try `dolphin-mixtral` or something similar.

Addressing your questions:

  • How do I make the Phi Model obedient to Christian text in a system prompt?
    In this case, the behavior you're seeing is probably a result of how the model was trained, and of its not being trained for this specific case.

  • Must I retrain the model from scratch?

  • What is the quickest way to retrain this model from a custom dataset?
    Training a model from scratch is really difficult; I think what you may be looking for here is fine-tuning. It lets you train new behavior on top of an existing model. Here is a good guide on fine-tuning:
    https://brev.dev/blog/fine-tuning-mistral


@easp commented on GitHub (Jan 11, 2024):

What are you actually trying to do? It seems that you are building a markdown file that starts with some excerpts from the bible and concatenating the entire King James Version to it.

What are you doing with it then? Where, exactly, do Ollama and phi/dolphin-phi come into it? I'm going to assume that you are feeding the file to Ollama somehow.

I downloaded the dataset and, knowing that the Christian Bible is a large book, I tried to put that in terms relevant to use with an LLM.

```
% cat kjv-markdown-master/* | wc -w
  826288
```

826,288 words. For our purposes let's say that one word equals one token. The phi-2 and dolphin-phi models in the Ollama library don't specify a context size, so it's using the Ollama default of 2048 tokens. I don't think they work with anything larger than that.

Disobedience? You've crushed a donkey under a pile of rocks and now you are making insinuations about its character.
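Following up on the context-size point: the Ollama Modelfile format does let you request a larger window via the `num_ctx` parameter, though how well a model actually uses the extra space still depends on its training. A minimal sketch (the 4096 value is just an illustration):

```
FROM dolphin-phi
# Ask Ollama for a 4096-token context window instead of the 2048 default.
# The model's training still limits how much of this it can really use.
PARAMETER num_ctx 4096
```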


@oliverbob commented on GitHub (Jan 13, 2024):

> What are you actually trying to do? It seems that you are building a markdown file that starts with some excerpts from the bible and concatenating the entire King James Version to it.
>
> What are you doing with it then? Where, exactly, do Ollama and phi/dolphin-phi come into it? I'm going to assume that you are feeding the file to Ollama somehow.
>
> I downloaded the dataset and, knowing that the Christian Bible is a large book, I tried to put that in terms relevant to use with an LLM.
>
> ```
> % cat kjv-markdown-master/* | wc -w
>   826288
> ```
>
> 826,288 words. For our purposes let's say that one word equals one token. The phi-2 and dolphin-phi models in the Ollama library don't specify a context size, so it's using the Ollama default of 2048 tokens. I don't think they work with anything larger than that.
>
> Disobedience? You've crushed a donkey under a pile of rocks and now you are making insinuations about its character.

Yes, essentially that's what I did, to demonstrate Modelfile creation as the quickest way to talk to a document. For instance:

If you `git clone https://github.com/jmorganca/ollama` and take `*.*`, that would parse all the content of the repo (very impractical), but if you do `*.md`, it will parse all the docs for you so that you can ask Ollama models directly about the repo. But that's just one scenario. Another use case is when you have lots of .pdf, .txt, or text-only datasets; it is the quickest way to simulate a fine-tuning mechanism. I like Phi because it runs smoothly and very fast on a 4 GB GPU. I have had no success doing that with mistral, dolphin-mixtral, mixtral (or any model larger than orca-mini), since they consume my GPU resources before I can even ask questions.

I'm not sure if this is the best way to do it, but since we can make a new model out of a Modelfile, this was just a test. The thing I had in mind was to be able to talk to any document.

In this experiment, however, I was able to give the model new content, but only as long as the context is NOT the Bible. It is extremely good: Phi creates new storylines outside Biblical topics. Anything you instruct it to do outside the Bible is fine, but if the text is scriptural in nature (even the entire Bible), it just disobeys.

I was able to do this on large texts (not just the Bible) and it works as expected, and I can talk to the document without problems using Ollama Web-UI (https://github.com/ollama-webui/ollama-webui), which is currently attracting a large community. But it keeps me wondering why it won't do the same with Biblical text.
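One possible workaround for the context-size problem (my own suggestion, not something Ollama does for you): instead of putting the whole text into the system prompt, retrieve only the lines relevant to each question and pass just those as context, RAG-style. The `retrieve` helper below is hypothetical:

```shell
# retrieve PATTERN FILE...: print up to 5 lines matching PATTERN from
# the given files, to use as a small context snippet for the model.
retrieve() {
  grep -h "$1" "${@:2}" | head -5
}

# Then ask the model with only that snippet, e.g.:
#   ollama run dolphin-phi "Context: $(retrieve 'Genesis 1:1' ./kjv-markdown/*.md)  Question: What does Genesis 1:1 say?"
```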

Where Ollama comes into the picture is when you run `ollama create kjv -f ./joined.md`, for instance. Sorry if I was not very clear in my presentation; I will add this line to the original question.

But like I said, the model is created on top of Ollama's dolphin-phi. You can talk to the model, but it does not respond with anything from the text included in that Modelfile. I tried it with just the first chapter, and with smaller Modelfiles about the Bible, and it disobeys.

That's why I'm asking if anyone here has had success doing this with Christian text.


@jmorganca commented on GitHub (May 10, 2024):

Hi @oliverbob, thanks for the issue. It might be a bit hard for the maintainer team to fix this, as it's very dependent on the model weights. I'd recommend trying a larger set of models, including Llama 3 (https://ollama.com/library/llama3), which has a much lower false-refusal rate than previous models.


Reference: github-starred/ollama#1082